sw360 / capycli
CaPyCLI - Python scripts for software license compliance automation with SW360
License: Other
I tried exporting an SW360 project containing a release with this externalId set:
However, this resulted in a BOM with this entry:
{
"type": "library",
"bom-ref": "pkg:generic/[email protected]",
"name": "anacron",
"version": "2.3-24.debian",
"purl": "pkg:generic/[email protected]",
I will send a PR for this in a minute.
Ideally, there should be only one source attachment in SW360 for each release, but in practice people tend to add more, so we should also reflect this in "project CreateBom".
Currently, this is not possible because create_project_bom() first builds up a BOM in legacy format, which supports only one SourceFile per release. As this is immediately converted to cdx format afterwards, I suggest refactoring the method to build up a cdx BOM directly, where I can easily store multiple sources as externalReferences.
Any objections, @tngraf?
I used "bom downloadsources" to get the sources for my project. This reported:
Miglayout-swing, 5.3
URL = https://artifactory-internal.ct.daai.siemens.cloud/artifactory/siemens-virtual/com/miglayout/miglayout-swing/5.3/miglayout-swing-5.3-sources.jar
Downloading file miglayout-swing-5.3-sources.jar
jlfgr, 1.0
No URL specified!
javax.mail, 1.6.2
URL = https://artifactory-internal.ct.daai.siemens.cloud/artifactory/siemens-virtual/com/sun/mail/javax.mail/1.6.2/javax.mail-1.6.2-sources.jar
Downloading file javax.mail-1.6.2-sources.jar
It is ok to have no sources for jlfgr because this is only an archive of graphical look&feel-resources.
The BOM update for the downloaded sources fails to handle the missing sources and inserts the previously seen sources instead:
{
"type": "library",
"bom-ref": "pkg:maven/com.oracle/jlfgr@1.0?type=jar",
"group": "com.oracle",
"name": "jlfgr",
"version": "1.0",
"description": "POM was created by Sonatype Nexus",
"purl": "pkg:maven/com.oracle/jlfgr@1.0?type=jar",
"externalReferences": [
{
"url": "./miglayout-swing-5.3-sources.jar",
"comment": "source archive (local copy)",
"type": "distribution"
}
],
This is then wrongly uploaded to SW360 in a createcomponents/createreleases call.
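A minimal sketch of how the BOM update could avoid carrying over the previous download, assuming the BOM is handled as plain dictionaries. The function add_local_sources and the download_file callable are hypothetical and do not reflect CaPyCLI's actual internals.

```python
from typing import Callable, List, Optional


def add_local_sources(components: List[dict],
                      download_file: Callable[[dict], Optional[str]]) -> None:
    """Attach a 'local copy' reference only to components that were downloaded."""
    for component in components:
        # The path is determined fresh for every component, so a result
        # from the previous iteration can never leak into this one.
        local_path = download_file(component)
        if local_path is None:
            continue  # e.g. "No URL specified!": leave the component untouched
        component.setdefault("externalReferences", []).append({
            "url": local_path,
            "comment": "source archive (local copy)",
            "type": "distribution",
        })
```

With this structure, a component without a source URL (such as jlfgr above) simply keeps its existing references.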
When creating the project with a BOM file that has no SW360ID mappings, the process fails with TypeError: bad operand type for unary +: 'str'.
Tested on Python 3.10.
It seems that in the current documentation there is no detailed explanation about SBOM filtering:
Commands like project prerequisites display warnings and errors, but they do not break a CI job.
This new feature adds a flag force error to project prerequisites to exit the application with an error code.
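The intended behaviour could be sketched as follows. The function name exit_code and its parameters are hypothetical; how CaPyCLI counts problems and names the flag internally is not shown in this issue.

```python
def exit_code(problem_count: int, force_error: bool) -> int:
    """Return the process exit code for a finished prerequisites check."""
    if force_error and problem_count > 0:
        return 1  # a non-zero exit code makes the CI job fail
    return 0  # default behaviour: report only, never break the job
```

The command's entry point would pass this value to sys.exit(), so existing pipelines keep working unless the flag is set.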
Have answers for...
To be discussed: do we need some kind of -dryrun option for all commands that write to SW360?
Would this prevent inexperienced users from creating large numbers of moderation requests?
I want to use CaPyCLI inside an automated pipeline. Personal tokens are not recommended there because they are coupled directly to a personal GitHub user account.
Requests with this authentication look a bit different.
Source: Authenticating to the REST API
Here is an example of searching repositories.
curl --request GET \
--url 'https://api.github.com/search/repositories?q=Sowas' \
--header 'Accept: application/vnd.github+json' \
--header 'Authorization: Bearer <jwt-token>' \
--header 'X-GitHub-Api-Version: 2022-11-28' \
--cookie logged_in=no
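For reference, the same request could be built in Python with the standard library only. This is a sketch: build_search_request is a hypothetical helper, and the token is whatever JWT or installation token the pipeline provides.

```python
import urllib.request
from urllib.parse import urlencode


def build_search_request(query: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GitHub repository search request."""
    url = "https://api.github.com/search/repositories?" + urlencode({"q": query})
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

The request can then be sent with urllib.request.urlopen(build_search_request("Sowas", token)).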
My experience when scanning JavaScript projects is that many SBOM entries don't have correctly configured external references, specifically the repository and project site URLs. This results in 'bom findsources' failing to determine many source code URLs.
Obviously, the issue isn't in the findsources command but in the suboptimally configured SBOM entries.
When investigating many SBOM files, I noticed that many entries do in fact contain links to the GitHub repositories, but with the externalReference type "distribution" (without any comment).
I would recommend extending the findsources logic to also parse external references of type "distribution" where the URL is a GitHub URL but not an archive link (e.g. a URL that ends with .git).
In my experience, this would drastically increase the number of source code URLs found.
After reading the Standard BOM v2, I found nothing suggesting that this extension would violate a design principle.
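The proposed lookup could be sketched like this, assuming components are plain dictionaries in CycloneDX JSON layout. The helper name and the list of archive suffixes are assumptions for illustration, not existing findsources code.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed list of suffixes that mark a URL as an archive download
ARCHIVE_SUFFIXES = (".zip", ".tar.gz", ".tgz", ".tar.bz2", ".tar.xz")


def github_repo_from_distribution(component: dict) -> Optional[str]:
    """Return a GitHub repository URL found among 'distribution' references."""
    for ref in component.get("externalReferences", []):
        if ref.get("type") != "distribution":
            continue
        url = ref.get("url", "")
        host = urlparse(url).netloc.lower()
        if host not in ("github.com", "www.github.com"):
            continue
        if url.lower().endswith(ARCHIVE_SUFFIXES):
            continue  # already an archive link, not a repository
        return url
    return None
```

The returned repository URL could then be fed into the existing logic that resolves a tag and builds the archive link.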
When we run the unit tests on Python 3.8 and 3.9, there are error messages...
After fixing #26 in main, we still lack correct handling of multiple purls. CaPyCLI silently takes the JSON-encoded string containing the array, so we get a BOM like this:
{
"type": "library",
"bom-ref": "[\"pkg:deb/debian/[email protected]\",\"pkg:deb/debian/[email protected]?arch=source\"]",
"name": "acl",
"version": "2.2.52-3.debian",
"purl": "[\"pkg:deb/debian/[email protected]\",\"pkg:deb/debian/[email protected]?arch=source\"]",
I also think there's no perfect solution, as CycloneDX allows only one purl per component, but we should at least warn the user and probably make it easy to select the right purl, e.g. by listing all of them separated by spaces?
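A sketch of the warning idea, assuming the external id arrives either as a plain purl string or as a JSON-encoded array of purls. parse_purl_external_id is a hypothetical helper, not an existing CaPyCLI function.

```python
import json
import warnings
from typing import List


def parse_purl_external_id(value: str) -> List[str]:
    """Return all purls stored in an external id value."""
    value = value.strip()
    if value.startswith("["):
        try:
            parsed = json.loads(value)
        except json.JSONDecodeError:
            return [value]  # not valid JSON after all, keep it unchanged
        if isinstance(parsed, list):
            purls = [str(p) for p in parsed]
            if len(purls) > 1:
                # CycloneDX allows one purl per component: tell the user
                warnings.warn("multiple purls found: " + " ".join(purls))
            return purls
    return [value]
```

The caller still has to decide which entry ends up in the component's purl field; the sketch only makes the ambiguity visible.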
I have started a clearing run for a Java/Maven project. Maven generated the initial SBOM, this was converted and filtered with capycli. I can provide intermediate BOMs on request. I run the latest capycli version from GitHub. For the result, capycli was called as:
python3 -m capycli bom map --nocache -i ./clearing_results/bom.json -o ./clearing_results/updated_bom.json -ov ./clearing_results/overview.json -mr ./clearing_results/mappingresult.json
CaPyCli, 2.0.1 - Map a given SBOM to data on SW360
Loading SBOM file ./clearing_results/bom.json
Analyzing token...
Token will expire on 2024-01-21 13:47:13
Checking access to SW360...
No cached releases available!
Do mapping...
Retrieving package-url ids, filter: {'mav'}
Traceback (most recent call last):
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\__main__.py", line 13, in <module>
cli.main()
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\main\cli.py", line 27, in main
app.run(argv)
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\main\application.py", line 157, in run
self._run(argv)
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\main\application.py", line 138, in _run
handle_bom.run_bom_command(self.options)
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\bom\handle_bom.py", line 82, in run_bom_command
app.run(args)
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\bom\map_bom.py", line 980, in run
result = self.map_bom_to_releases(
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\bom\map_bom.py", line 524, in map_bom_to_releases
self.external_id_svc.build_purl_cache(purl_types, self.verbosity <= 1)
File "C:\Users\me.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\common\purl_service.py", line 49, in build_purl_cache
all_ids = all_ids + self.client.get_releases_by_external_id("package-url")
TypeError: can only concatenate list (not "dict") to list
This looks... unexpected.
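The traceback shows a list being concatenated with a dict returned by get_releases_by_external_id. What that dict contains is not visible here, so the following is only a sketch of a guard that would turn the crash into a readable error; as_release_list is a hypothetical helper.

```python
from typing import Any, List


def as_release_list(response: Any) -> List[dict]:
    """Return the response if it is a list, otherwise raise a clear error."""
    if isinstance(response, list):
        return response
    # Truncate the payload so a large error document does not flood the log
    raise ValueError(
        "expected a list of releases, got "
        f"{type(response).__name__}: {repr(response)[:200]}"
    )
```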
Not everyone outside Siemens might agree to our granularity definitions.
When we convert a BOM from legacy to capycli [ capycli bom convert -i <input.json> -if legacy -o <output.json> -of capycli ], "SourceUrl" from the legacy BOM is converted to "source archive (local copy)" in the capycli BOM.
What should be the actual conversion?
It should be converted to "source archive (download location)".
How to replicate the issue?
Take a simple capycli BOM, convert it to a legacy BOM, and then convert it back to a capycli BOM. The two capycli BOMs will differ. A simple capycli BOM to try is attached.
Resolution of the issue?
PR: #63
{
"$schema": "http://cyclonedx.org/schema/bom-1.4.schema.json",
"bomFormat": "CycloneDX",
"specVersion": "1.4",
"serialNumber": "urn:uuid:12432b27-1088-4a8d-a85c-948c1e1812bc",
"version": 1,
"metadata": {
"timestamp": "2024-04-13T06:17:33.177283+00:00",
"tools": [
{
"vendor": "Siemens AG",
"name": "CaPyCLI",
"version": "2.3.0",
"externalReferences": [
{
"url": "https://github.com/sw360/capycli",
"type": "website"
}
]
},
{
"vendor": "Siemens AG",
"name": "standard-bom",
"version": "2.0.0",
"externalReferences": [
{
"url": "https://code.siemens.com/sbom/standard-bom",
"type": "website"
}
]
}
],
"licenses": [
{
"license": {
"id": "CC0-1.0"
}
}
],
"properties": [
{
"name": "siemens:profile",
"value": "capycli"
}
]
},
"components": [
{
"type": "library",
"bom-ref": "pkg:pypi/click@8.1.7",
"name": "click",
"version": "8.1.7",
"description": "Composable command line interface toolkit",
"licenses": [
{
"license": {
"name": "BSD-3-Clause"
}
}
],
"purl": "pkg:pypi/click@8.1.7",
"externalReferences": [
{
"url": "click-8.1.7-py3-none-any.whl",
"comment": "relativePath",
"type": "distribution"
},
{
"url": "click-8.1.7.tar.gz",
"comment": "source archive (local copy)",
"type": "distribution"
},
{
"url": "https://files.pythonhosted.org/packages/00/2e/d53fa4befbf2cfa713304affc7ca780ce4fc1fd8710527771b58311a3229/click-8.1.7-py3-none-any.whl",
"comment": "binary (download location)",
"type": "distribution"
},
{
"url": "https://files.pythonhosted.org/packages/96/d3/f04c7bfcf5c1862a2a5b845c6b2b360488cf47af55dfa79c98f6a6bf98b5/click-8.1.7.tar.gz",
"comment": "source archive (download location)",
"type": "distribution"
},
{
"url": "https://pypi.org/project/click/",
"comment": "PyPi URL",
"type": "distribution"
},
{
"url": "https://palletsprojects.com/p/click/",
"type": "website"
}
],
"properties": [
{
"name": "siemens:primaryLanguage",
"value": "Python"
}
]
}
],
"dependencies": [
{
"ref": "pkg:pypi/click@8.1.7",
"dependsOn": []
}
]
}
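The expected mapping for the legacy conversion reported above could be sketched as follows. The legacy key SourceUrl is taken from the report; treating SourceFile as the key for the local file is an assumption, and legacy_sources_to_references is a hypothetical helper, not the converter's real code.

```python
from typing import List

COMMENT_SOURCE_URL = "source archive (download location)"
COMMENT_SOURCE_FILE = "source archive (local copy)"


def legacy_sources_to_references(legacy_item: dict) -> List[dict]:
    """Translate legacy source fields into CycloneDX external references."""
    refs = []
    if legacy_item.get("SourceUrl"):
        # A URL is a download location, never a local copy
        refs.append({"url": legacy_item["SourceUrl"],
                     "comment": COMMENT_SOURCE_URL,
                     "type": "distribution"})
    if legacy_item.get("SourceFile"):
        # Assumed legacy key for the locally stored archive
        refs.append({"url": legacy_item["SourceFile"],
                     "comment": COMMENT_SOURCE_FILE,
                     "type": "distribution"})
    return refs
```

With this mapping, a capycli to legacy to capycli round trip keeps both comments unchanged.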
We should check whether we can add more try/except statements when accessing an SW360 instance.
See for example line 519 in create_components.py - a network error crashes the whole component creation.
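One way to keep a single network error from aborting the whole run is a small retry wrapper around each SW360 call. This is a sketch under assumptions: call_with_retry is a hypothetical helper, and catching OSError is a guess at the relevant error class (it covers the built-in ConnectionError and TimeoutError, and the exceptions of the requests library derive from it as well).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(action: Callable[[], T], attempts: int = 3,
                    delay: float = 1.0) -> T:
    """Run action, retrying on network-type errors before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as err:
            if attempt == attempts:
                raise  # out of retries: let the caller decide what to do
            print(f"Network error ({err}), retrying ({attempt}/{attempts - 1})")
            time.sleep(delay)
    raise ValueError("attempts must be at least 1")
```

The caller can wrap the final failure in its own try/except to log the affected component and continue with the next one.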
I currently have the task to synchronize projects' attachments between two SW360 instances.
In my specific case, certain projects including source attachments are uploaded into two different instances, but clearing results (e.g. reports and CLI files) are uploaded to only one of them. So the remaining bit for me would be to download those from instance A and upload them to instance B.
So my idea would be to add the following two commands:
bom UploadAttachments: upload all attachments described in a BOM, e.g. one created by "project DownloadAttachments"
project DownloadAttachments: download all attachments (of certain types) and create a BOM listing them
This would require extending the BOM formats by some fields, probably something like
"externalReferences" : [
{
"type": "other",
"url": "file:///attachments/CLIXML_keyutils_1.6-6-debian-combined.tar.bz2_2019-10-10_10_02_52.xml",
"comment": "component license information (XML)"
}
]
Alternatively, I could also implement "project uploadattachments", but I think with the suggestion above, it could be used flexibly for multiple use cases.
Together with "project createbom", "bom map" and "bom createcomponents", I think we could even support complete project export/import use cases in the future like:
What do you think?
Running capycli on a new and empty project resulted in this exception:
../capycli/project/create_project.py", line 97, in update_project
print(" " + str(len(project["_embedded"]["sw360:releases"])) + " releases in project after update")
This was on capycli 1.91, but the code in 2.0 will suffer from the same issue here:
https://github.com/sw360/capycli/blob/main/capycli/project/create_project.py#L77
If no release is attached to a project yet, it will also fail.
-> add a check if "sw360:releases" in project["_embedded"] as in l.65
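The suggested check could be sketched like this, assuming the project is the dictionary returned by the SW360 REST API. count_linked_releases is a hypothetical helper name.

```python
def count_linked_releases(project: dict) -> int:
    """Count the releases of a project, tolerating projects without any."""
    # Empty projects may lack "sw360:releases" (or "_embedded") entirely
    embedded = project.get("_embedded") or {}
    return len(embedded.get("sw360:releases", []))
```

The print statement from the traceback could then use str(count_linked_releases(project)) instead of indexing the dictionary directly.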
When using --search-meta-data on an npm version 2 lock file, and the discovered repository URL is a GitHub one, the URL in the BOM item isn't updated correctly. The URL written to the BOM file is the original URL from the found metadata and not the actual URL of the source code archive file.
Example of the found metadata:
{'type': 'git', 'url': 'git+https://github.com/fastify/accept-negotiator.git'}
Observed external resource:
{
"url": "https://github.com/fastify/accept-negotiator.git",
"comment": "source archive (download location)",
"type": "distribution"
},
Expected external resource:
{
"url": "https://github.com/fastify/accept-negotiator/archive/refs/tags/v1.1.0.zip",
"comment": "source archive (download location)",
"type": "distribution"
},
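The expected transformation could be sketched as follows. The helper github_archive_url is hypothetical, and the tag name (here "v1.1.0") is an assumption: repositories differ in whether tags carry a "v" prefix, so real code would have to look the tag up.

```python
import re
from typing import Optional

# Accepts "git+https://github.com/owner/repo.git" and plain https variants
_GITHUB_REPO = re.compile(
    r"^(?:git\+)?https?://github\.com/([^/]+)/([^/]+?)(?:\.git)?/?$"
)


def github_archive_url(repo_url: str, tag: str) -> Optional[str]:
    """Build the source archive URL for a GitHub repository and tag."""
    match = _GITHUB_REPO.match(repo_url.strip())
    if match is None:
        return None  # not a GitHub repository URL
    owner, repo = match.groups()
    return f"https://github.com/{owner}/{repo}/archive/refs/tags/{tag}.zip"
```

For the metadata shown above, github_archive_url("git+https://github.com/fastify/accept-negotiator.git", "v1.1.0") yields the expected URL.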
The current documentation leads to confusion between "capycli project create" and "capycli project update".
Looking at the code, both do the same thing; the major difference is onlyUpdateProject==True, set in https://github.com/sw360/capycli/blob/main/capycli/project/create_project.py.
Looking further, this sets the 'add' parameter of update_project_releases(self, releases, project_id, add=False) in https://github.com/sw360/sw360python/blob/master/sw360/project.py, whose documentation says:
If add is True, given releases are added to the project, otherwise, the existing releases will be replaced.
And the code indeed creates the list of releases to be linked to the project by getting the existing ones and adding the new ones.
Therefore, using 'capycli project create' instead of 'capycli project update' on an existing project would update it but replace all releases, i.e. remove the existing ones.
'create' is actually the only option if we want to update an existing project and remove releases that no longer belong to it.
Please clarify the documentation to make this major difference visible.
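The difference can be modelled in a few lines. This is only an illustration of the semantics quoted above, not the sw360python implementation; releases_after_update is a hypothetical name.

```python
from typing import Iterable, List


def releases_after_update(existing: Iterable[str], given: Iterable[str],
                          add: bool) -> List[str]:
    """Model which release ids a project links to after the call."""
    if add:
        # 'project update': keep existing links and append the new ones,
        # dropping duplicates while preserving order
        return list(dict.fromkeys([*existing, *given]))
    # 'project create' on an existing project: the given list replaces all
    return list(dict.fromkeys(given))
```

For existing releases r1 and r2 and given releases r2 and r3, add=True results in r1, r2, r3, while add=False leaves only r2 and r3.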
Hello,
Starting with Poetry 1.5.0 (ref. #7637), the "category" field, which capycli uses to skip dev dependencies, is no longer part of the poetry.lock file.
This causes capycli to list all dependencies, including dev ones, and in our projects to load those into SW360 as well.
I'm not very familiar with Poetry internals, but I guess a solution would require reading the main dependencies from the pyproject.toml file and resolving their transitive dependencies from poetry.lock, as looking only at the lock file isn't enough to exclude dev dependencies.
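The resolution step could be sketched as a graph traversal. The sketch assumes the main dependency names have already been read from pyproject.toml and that the lock file has been turned into a mapping from lower-case package name to the names it depends on; both inputs and the function name are hypothetical, and extras, markers and name normalization are ignored.

```python
from typing import Dict, Iterable, List, Set


def runtime_closure(main_dependencies: Iterable[str],
                    lock_graph: Dict[str, List[str]]) -> Set[str]:
    """Return all packages reachable from the main dependencies."""
    seen: Set[str] = set()
    stack = list(main_dependencies)
    while stack:
        name = stack.pop().lower()
        if name in seen:
            continue  # already visited, also protects against cycles
        seen.add(name)
        stack.extend(lock_graph.get(name, []))
    return seen
```

Every package in the lock file that is not part of the returned set would then be treated as a dev dependency and skipped.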
When calling capycli bom map in a job, I got this exception:
File "/usr/local/lib/python3.10/site-packages/capycli/bom/map_bom.py", line 511, in map_bom_to_releases
if len(component.purl) > 8 and component.purl.startswith("pkg:"):
TypeError: object of type 'NoneType' has no len()
Quick fix for line 511:
if component.purl and len(component.purl) > 8 and component.purl.startswith("pkg:"):
I wanted to push a branch and create a PR but I get a 403...
We already use pytest, but still derive our test classes from traditional unittest.
This is perfectly valid, but we cannot use some nice pytest features that way. As an example, pytest has built-in support for capturing stdout/stderr, so we could drop our own implementation. I also like pytest's more intuitive assert syntax.
The concrete reason I ask is that I recently extended our own capture_stdout implementation to support an arbitrary number of arguments, and now I noticed that with our own implementation, I can't capture stdout and check the return value at the same time. All of that would already be available from pytest.
It would however mean a larger rework, but I think it would be possible to migrate file by file where needed.
If you're interested, @tngraf, I could try adding my new tests for #33 in pytest-style or also rewrite the "project createbom" tests in #37 to pytest style.
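For illustration, a pytest-style test that captures stdout and checks the return value at the same time could look like this. run_command is a made-up stand-in for a CaPyCLI command; capsys is pytest's built-in capture fixture.

```python
def run_command(name: str) -> int:
    """Stand-in for a command that prints output and returns a status."""
    print(f"Running {name}")
    return 0


def test_run_command(capsys):
    # capsys is injected by pytest; no base class or helper is needed
    result = run_command("createbom")
    captured = capsys.readouterr()
    assert result == 0
    assert "Running createbom" in captured.out
```

Such functions can live next to existing unittest-based classes, which pytest also collects, so a file-by-file migration remains possible.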
https://github.com/sw360/capycli/blob/main/capycli/bom/findsources.py#L524
The findsources.py code will break as the binary_url is not converted to str. Please add str(binary_url) in the above line.
Read all dependencies from the poetry.lock file.
mypy is a static type checker for Python.
It is also used for all CycloneDX Python projects.
Useful, but it will cost many hours of effort and may affect nearly every file (we need correct typing information!).
At the moment we can only handle package-lock.json files in version 2.
We should also be able to handle version 3.
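A sketch of a reader that works for both versions, based on the npm documentation's description that lockfileVersion 2 contains both the "packages" and the legacy "dependencies" section while version 3 keeps only "packages". The function name is hypothetical, and the sketch ignores details such as workspace links.

```python
from typing import List, Tuple


def packages_from_lock(lock: dict) -> List[Tuple[str, str]]:
    """List (name, version) pairs from the 'packages' section (v2 and v3)."""
    version = lock.get("lockfileVersion")
    if version not in (2, 3):
        raise ValueError(f"unsupported lockfileVersion: {version!r}")
    result = []
    for path, entry in lock.get("packages", {}).items():
        if not path:
            continue  # the empty key describes the root project itself
        # "node_modules/a/node_modules/@scope/b" -> "@scope/b"
        name = path.rsplit("node_modules/", 1)[-1]
        result.append((name, entry.get("version", "")))
    return result
```

Because only the shared "packages" section is read, the same code path serves both lock file versions.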
When exporting an SW360 project using "project CreateBom", I get the following header:
"metadata": {
"timestamp": "2023-07-12T20:23:44.589297+00:00",
"tools": [
{
"vendor": "Siemens AG",
"name": "CaPyCLI",
"version": "2.0.1",
"externalReferences": [
{
"url": "https://github.com/sw360/capycli",
"type": "website"
}
]
},
{
"vendor": "Siemens AG",
"name": "standard-bom",
"version": "2.0.0",
"externalReferences": [
{
"url": "https://code.siemens.com/sbom/standard-bom",
"type": "website"
}
]
}
],
"licenses": [
{
"license": {
"id": "CC0-1.0"
}
}
],
"properties": [
{
"name": "siemens:profile",
"value": "capycli"
}
]
},
According to https://cyclonedx.org/docs/1.4/json/#metadata, this means the BOM itself is licensed under CC0, right? But even in this case, I wonder if Joe Defaultuser (from the industry) expects his BOMs to be flagged as freeware? Is this intended?
@Garbald reported another crash which I can also reproduce:
poetry run python3 -m capycli bom downloadsources --nocache -i ./bom-christoph-tietz.json -url https://sw360.siemens.com -oa -t ...
leads to:
CaPyCli, 2.0.1 - Download source files from the URL specified in the SBOM
Loading SBOM file ./bom-christoph-tietz.json
Downloading source files to folder ./ ...
logback-classic, 1.3.11
URL = https://repo1.maven.org/maven2/ch/qos/logback/logback-classic/1.3.11/logback-classic-1.3.11-sources.jar
Downloading file logback-classic-1.3.11-sources.jar
Traceback (most recent call last):
File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/gernot/checkout/capycli/capycli/__main__.py", line 13, in <module>
cli.main()
File "/home/gernot/checkout/capycli/capycli/main/cli.py", line 27, in main
app.run(argv)
File "/home/gernot/checkout/capycli/capycli/main/application.py", line 157, in run
self._run(argv)
File "/home/gernot/checkout/capycli/capycli/main/application.py", line 138, in _run
handle_bom.run_bom_command(self.options)
File "/home/gernot/checkout/capycli/capycli/bom/handle_bom.py", line 102, in run_bom_command
app.run(args)
File "/home/gernot/checkout/capycli/capycli/bom/download_sources.py", line 196, in run
self.download_sources(bom, source_folder)
File "/home/gernot/checkout/capycli/capycli/bom/download_sources.py", line 113, in download_sources
ext_ref = ExternalReference(
TypeError: __init__() missing 1 required keyword-only argument: 'url'
It seems
capycli/capycli/bom/download_sources.py
Line 117 in ad9f60d
CaPyCLI reports an error when generating an SBOM from a project on SW360 when this project contains a component with more than one package-url. Actually, it is the packageurl component that reports the error.