cmake-spdx's Introduction

cmake-spdx

What is it?

cmake-spdx is a tool to automatically generate SPDX documents as software bill-of-materials (SBOM) manifests corresponding to the sources and build artifacts from a CMake build process.

It was created with a particular focus on Zephyr and its west build tool. (Zephyr / west is the only context I've actually tested it in so far.) However, nothing in here is Zephyr- or west-specific, so there's no reason it shouldn't work for any other project that uses CMake for builds.

Note that cmake-spdx is still a very early-stage tool and should be treated as a proof of concept, not as something production-ready.

What does it do?

cmake-spdx leverages the CMake file-based API to observe and parse data about a CMake build process. It then translates that data, together with a scan of the relevant code directories, to create two SPDX files:

  • sources.spdx, describing the source files; and
  • build.spdx, describing the built artifacts.

It uses the CMake API metadata to determine which source files are built into which binary artifacts, and creates SPDX relationships to document them in a machine-readable and human-readable manner.
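As a rough illustration of the file-based API flow described above, here is a minimal Python sketch. It is not cmake-spdx's actual code: the function names `request_codemodel` and `sources_by_artifact` are hypothetical. It assumes a `codemodel-v2` query, and that the query file is created before CMake is run on the build directory.

```python
import json
from pathlib import Path
from typing import Dict, List


def request_codemodel(build_dir: Path) -> Path:
    """Create the empty query file that asks CMake for a codemodel-v2 reply."""
    query = build_dir / ".cmake" / "api" / "v1" / "query" / "codemodel-v2"
    query.parent.mkdir(parents=True, exist_ok=True)
    query.touch()
    return query


def sources_by_artifact(build_dir: Path) -> Dict[str, List[str]]:
    """Map each built artifact path to the source paths of its target."""
    reply_dir = build_dir / ".cmake" / "api" / "v1" / "reply"
    # CMake writes index-*.json; the name that sorts last is the current index.
    index_path = sorted(reply_dir.glob("index-*.json"))[-1]
    index = json.loads(index_path.read_text())
    codemodel_file = index["reply"]["codemodel-v2"]["jsonFile"]
    codemodel = json.loads((reply_dir / codemodel_file).read_text())

    result: Dict[str, List[str]] = {}
    for config in codemodel["configurations"]:
        for target_ref in config["targets"]:
            # Each target has its own JSON file listing sources and artifacts.
            target = json.loads((reply_dir / target_ref["jsonFile"]).read_text())
            sources = [s["path"] for s in target.get("sources", [])]
            for artifact in target.get("artifacts", []):
                result[artifact["path"]] = sources
    return result
```

Each key/value pair in the returned mapping corresponds to the kind of "built from" relationship that the SPDX documents record.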

The scanning process also looks for SPDX short-form identifiers as license information in the code, and records any that are found.
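A minimal sketch of how such a scan might look in Python. The regular expression and the `find_license_ids` helper are illustrative assumptions, not the tool's actual implementation.

```python
import re
from typing import List

# Matches the tag in any comment style, e.g. "// SPDX-License-Identifier: ISC".
SPDX_ID_RE = re.compile(r"SPDX-License-Identifier:\s*([^\r\n]+)")


def find_license_ids(text: str) -> List[str]:
    """Return the distinct license expressions tagged in a file's text."""
    found: List[str] = []
    for match in SPDX_ID_RE.finditer(text):
        expr = match.group(1).strip()
        # Drop a trailing comment closer such as "*/" or "-->".
        expr = re.sub(r"\s*(\*/|-->)\s*$", "", expr)
        if expr and expr not in found:
            found.append(expr)
    return found
```

For example, `find_license_ids("/* SPDX-License-Identifier: Apache-2.0 */")` returns `["Apache-2.0"]`, and a file without the tag yields an empty list.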

Examples of the generated files for a sample run can be found at example/sources.spdx and example/build.spdx. A description of the process that was used for generating these files can be found in process.md.

More details

See the following documents for more details:

License

Apache-2.0

cmake-spdx's People

Contributors

goneall, swinslow


cmake-spdx's Issues

Support output in YAML / JSON SPDX formats

It seems that currently only the tag / value format is supported as an output format. Could we get native support (i.e. without relying on external conversion tools) for the arguably more readable YAML / JSON representations of SPDX?

External document refs should have a SHA1 instead of a SHA256

It is admittedly a bit hard to follow in the spec, but my reading is that external document references must use SHA1.

This is based on section 2.6 subsection 2.6.4:

[Checksum] is a checksum of the external document following the checksum format defined in section 4.4.

In section 4.4 subsection 4.4.3:

4.4.3 Cardinality: Mandatory, one SHA1, others may be optionally provided.
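To illustrate what the fix might look like, here is a small Python sketch that builds a tag / value `ExternalDocumentRef` line with a SHA1 checksum. The function name and parameters are hypothetical and do not reflect cmake-spdx's internals.

```python
import hashlib


def external_document_ref(doc_ref_id: str, namespace: str, content: bytes) -> str:
    """Format an ExternalDocumentRef line using a SHA1 of the referenced document."""
    sha1 = hashlib.sha1(content).hexdigest()
    return f"ExternalDocumentRef: {doc_ref_id} {namespace} SHA1: {sha1}"
```

Here `content` would be the bytes of the referenced SPDX document (for instance sources.spdx when referenced from build.spdx), and `namespace` its document namespace.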

Conclude binary license based on sources' licenses

Currently, cmake-spdx can determine:

  • a source code file's license (if it has an SPDX-License-Identifier: tag); and
  • which source code file(s) are built into a binary file (to the extent that the CMake file API response contains that info).

It would be interesting to explore whether this can be used to auto-conclude the binary file's license -- in other words, to fill in LicenseConcluded with some value other than NOASSERTION.

The most obvious approach would be to simply AND together the licenses of all the source files and/or libraries that are used as inputs. So, for instance, if a binary has four source file inputs, three of which are tagged Apache-2.0 and the fourth ISC, then cmake-spdx could follow the relationships to determine that those are the four files used to create the binary, and then conclude the binary's license as Apache-2.0 AND ISC.
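A rough Python sketch of that AND approach follows. The `conclude_license` helper is hypothetical. It also makes one assumption that goes beyond the proposal above: if any input has no known license, it falls back to NOASSERTION instead of guessing.

```python
from typing import Iterable


def conclude_license(source_licenses: Iterable[str]) -> str:
    """AND together the distinct licenses of a binary's inputs."""
    unique = set()
    for lic in source_licenses:
        # Assumption: one unknown input makes the whole conclusion unknown.
        if not lic or lic in ("NOASSERTION", "NONE"):
            return "NOASSERTION"
        unique.add(lic)
    if not unique:
        return "NOASSERTION"
    parts = sorted(unique)
    if len(parts) == 1:
        return parts[0]
    # Parenthesize compound expressions so "A OR B" keeps its meaning.
    return " AND ".join(f"({p})" if " " in p else p for p in parts)
```

With the four-file example above, `conclude_license(["Apache-2.0", "Apache-2.0", "Apache-2.0", "ISC"])` gives `Apache-2.0 AND ISC`.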

Update README and add writeup for SPDX building

After #1 is merged, and before sharing the links back with the Zephyr Slack channel, update the README and add a better writeup explaining:

  • process for building with CMake file API / running cmake-spdx
  • outputs created (also add sample sources.spdx and build.spdx to examples directory)
  • what is included now
  • what still needs to be improved
  • consider also adding graphviz sample image to examples directory
  • interesting results (e.g. invalid SPDX doc b/c module identifiers, with links to incorrect ones)

Handle source packages more appropriately

Currently, cmake-spdx makes some incorrect assumptions about how the source files will be structured:

  • it assumes that all relevant source files are contained within one top-level directory
  • it requires that the user specify what that top-level directory is
  • it assumes that relative paths within the CMake file API responses will be relative to that top-level directory

The CMake file API response does contain an initial "source" directory response in the codemodel object, see for example:

"paths" :
{
    "build" : "/home/steve/programming/zephyr/zephyrproject/zephyr/build",
    "source" : "/home/steve/programming/zephyr/zephyrproject/zephyr/samples/basic/blinky"
},

The problem is that this points to the sources of the specific application being built, not to the main zephyr directory itself. So this can't be used as the top-level directory for treating all the relevant sources as a single Package in SPDX terminology.

Most likely, the more correct way to do this will be something like the following:

  • use the CMake codemodel's directories, projects and targets data to represent the sources more correctly as a collection of multiple Packages, rather than as a single Package
  • when walking through the source files, appropriately include them in the Package (or even multiple Packages) that contain them, taking into account the potential for different relative paths in the CMake file API responses depending on the context

This will lead to a more flexible process and more correct SPDX document, which will also require less user guidance (as it should be driven entirely from the CMake file API responses). However, it will likely add significant complexity to the process of scanning the sources and creating the sources SPDX document, so I haven't started on an approach to this for the initial proof of concept.
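One possible building block for the multi-Package approach is to assign each source file to the Package whose root directory is its deepest ancestor. The Python sketch below assumes that the Package roots have already been derived from the codemodel data and that all paths have been resolved to absolute POSIX paths. Neither of those steps is shown, and `owning_package` is a hypothetical name.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def owning_package(path: str, package_roots: Dict[str, str]) -> Optional[str]:
    """Return the package whose root is the deepest ancestor of path, if any."""
    file_path = PurePosixPath(path)
    best_name: Optional[str] = None
    best_depth = -1
    for name, root in package_roots.items():
        root_path = PurePosixPath(root)
        # A deeper matching root is more specific, so it wins.
        if root_path in file_path.parents and len(root_path.parts) > best_depth:
            best_name, best_depth = name, len(root_path.parts)
    return best_name
```

Files that fall outside every known root return `None`, which a caller could treat as belonging to a separate, unnamed Package.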
