capycli's People

Contributors

gernot-h, maxhbr, nzupan, t-graf, tngraf

capycli's Issues

"project CreateBom" only supports one source attachment per release

Ideally, there should only be one source attachment in SW360 for each release, but in practice, people tend to add more, so we should also reflect this in "project CreateBom".

Currently, this is not possible because create_project_bom() first builds up a BOM in legacy format, which only supports one SourceFile per release. As this is immediately converted to cdx format afterwards, I suggest refactoring the method to directly build up a cdx BOM, where multiple sources can easily be stored as externalReferences.

Any objections, @tngraf?

fails to run after clean install in various containers

Hi,

I tried to run the capycli command in fresh container images

  • apt update
  • apt install -y pip
  • pip install capycli

The results are the same in both debian:stable and buildpack-deps:stable:

[screenshot of the error output]

I can't even run with --help; I get the same error. Am I doing something wrong?

Bad handling of missing sources.

I used "bom downloadsources" to get the sources for my project. This reported:

Miglayout-swing, 5.3
URL = https://artifactory-internal.ct.daai.siemens.cloud/artifactory/siemens-virtual/com/miglayout/miglayout-swing/5.3/miglayout-swing-5.3-sources.jar
Downloading file miglayout-swing-5.3-sources.jar
jlfgr, 1.0
No URL specified!
javax.mail, 1.6.2
URL = https://artifactory-internal.ct.daai.siemens.cloud/artifactory/siemens-virtual/com/sun/mail/javax.mail/1.6.2/javax.mail-1.6.2-sources.jar
Downloading file javax.mail-1.6.2-sources.jar

It is ok to have no sources for jlfgr because this is only an archive of graphical look&feel-resources.
The BOM update for the downloaded sources fails to handle the missing sources and inserts the previously seen sources instead:

{
"type": "library",
"bom-ref": "pkg:maven/com.oracle/jlfgr@1.0?type=jar",
"group": "com.oracle",
"name": "jlfgr",
"version": "1.0",
"description": "POM was created by Sonatype Nexus",
"purl": "pkg:maven/com.oracle/jlfgr@1.0?type=jar",
"externalReferences": [
{
"url": "./miglayout-swing-5.3-sources.jar",
"comment": "source archive (local copy)",
"type": "distribution",
}
],

This is then wrongly uploaded to SW360 in a createcomponents/createreleases call.
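
The symptom above (a component without a URL inherits the previous component's local file) is what one would expect when a per-component variable survives across loop iterations. The following is a minimal sketch of the defensive pattern, not capycli's actual code: `update_local_sources`, the `download` callable and the dict keys `source_url` and `external_refs` are hypothetical names chosen for illustration.

```python
from typing import Callable, Optional


def update_local_sources(
    components: list[dict],
    download: Callable[[str], str],
) -> None:
    """Attach a local source path only to components that were downloaded.

    `components` are plain dicts with the hypothetical keys "source_url" and
    "external_refs"; `download` is a hypothetical callable that fetches a URL
    and returns the local path of the downloaded file.
    """
    for component in components:
        # Reset per component so nothing leaks from the previous iteration.
        local_path: Optional[str] = None

        url = component.get("source_url")
        if url:
            local_path = download(url)

        if local_path is None:
            # No URL (or nothing downloaded): leave the component untouched.
            continue

        component.setdefault("external_refs", []).append(
            {
                "url": local_path,
                "comment": "source archive (local copy)",
                "type": "distribution",
            }
        )
```

With this structure, an entry like jlfgr simply ends up without a "source archive (local copy)" reference.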

Explain SBOM filtering

It seems that in the current documentation there is no detailed explanation about SBOM filtering:

  • why it is needed/recommended
  • how it is done
  • format specification of the filter file

Add FAQ

Have answers for...

  • Can I create an SBOM for a SW360 project?
  • I have an SBOM for the third party software components of my project. Is there any automated way to find the source code for these components?
    ...

Have a -dryrun option

To be discussed: do we need some kind of -dryrun option for all commands that write to SW360?
Would this prevent inexperienced users from creating large numbers of moderation requests?

feature: use app tokens for authentication

I want to use CaPyCLI inside an automated pipeline. Personal tokens are not recommended there because they are tied directly to a personal GitHub user account.

Idea

  1. I create a GitHub App with the necessary rights, which can be shared between multiple people.
  2. With the ClientId, ClientSecret and ClientCertificate I generate a short-lived JWT each time the pipeline runs.
    src: Generating a JSON Web Token (JWT) for a GitHub App

Expected Change

Requests with this authentication look a bit different.
src: Authenticating to the REST API

Here is an example of searching repositories.

curl --request GET \
  --url 'https://api.github.com/search/repositories?q=Sowas' \
  --header 'Accept: application/vnd.github+json' \
  --header 'Authorization: Bearer <jwt-token>' \
  --header 'X-GitHub-Api-Version: 2022-11-28' \
  --cookie logged_in=no
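
For comparison, the same request can be sketched in Python using only the standard library. This is an illustration under assumptions, not part of CaPyCLI: `build_search_request` is a hypothetical helper, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_search_request(jwt_token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a repository search request with a bearer token."""
    url = "https://api.github.com/search/repositories?" + urllib.parse.urlencode(
        {"q": query}
    )
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {jwt_token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the token itself would come from the pipeline step that generates the JWT.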

feature proposal: broaden scope of the findsources command

Current behavior

My experience when scanning JavaScript projects is that many SBOM entries don't have correctly configured external references, specifically the repository and project site URLs. As a result, 'bom findsources' fails to determine many source code URLs.

Obviously, the issue isn't in the findsources command but in the suboptimally configured SBOM entries.

When investigating many SBOM files I noticed that many entries in fact do contain links to the GitHub repositories, but with the externalReference type "distribution" (without any comment).

Proposed extension

I would recommend extending the findsources logic to also parse external references of type "distribution" where the URL is a GitHub URL but not an archive link (e.g. ends with .git).
This would in my experience drastically increase the number of found source code URLs.

After reading the Standard BOM v2 I haven't found that this extension would be violating some design principle.
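
The proposed check could look roughly like the sketch below. It is an assumption-laden illustration, not the findsources implementation: `is_github_repo_reference` and `ARCHIVE_SUFFIXES` are hypothetical names, and the list of archive suffixes is not exhaustive.

```python
from urllib.parse import urlparse

# File suffixes that indicate an archive download rather than a repository.
# (Illustrative, not exhaustive.)
ARCHIVE_SUFFIXES = (".zip", ".tar.gz", ".tgz", ".tar.bz2", ".tar.xz", ".jar", ".whl")


def is_github_repo_reference(ext_ref: dict) -> bool:
    """Return True if a "distribution" reference looks like a GitHub repo URL."""
    if ext_ref.get("type") != "distribution":
        return False
    url = ext_ref.get("url") or ""
    # Package metadata sometimes prefixes the scheme, e.g. "git+https://...".
    if url.startswith("git+"):
        url = url[4:]
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return False
    path = parsed.path.lower()
    if path.endswith(ARCHIVE_SUFFIXES):
        return False
    if "/archive/" in path or "/releases/download/" in path:
        return False
    # A repository path has at least "owner/repo".
    return len([part for part in path.split("/") if part]) >= 2
```

References that pass this check could then be fed into the same repository lookup that findsources already uses for repository URLs.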

"project createBom" doesn't handle multiple purls

After fixing #26 in main, we still lack correct handling of multiple purls. CaPyCli silently takes the JSON-encoded string containing the array, so we get such a BOM:

    { 
      "type": "library",
      "bom-ref": "[\"pkg:deb/debian/[email protected]\",\"pkg:deb/debian/[email protected]?arch=source\"]",
      "name": "acl",
      "version": "2.2.52-3.debian",
      "purl": "[\"pkg:deb/debian/[email protected]\",\"pkg:deb/debian/[email protected]?arch=source\"]",

I also think there's no perfect solution as CycloneDX allows only one purl per component, but we should at least warn the user and probably make it easy for him to select the right purl, e.g. by adding them separated by space?
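
One possible shape for the "warn the user" part is sketched below. The function names `split_purls` and `choose_purl` are hypothetical, and picking the first purl is only a placeholder policy, not an agreed design.

```python
import json
import warnings


def split_purls(raw: str) -> list[str]:
    """Return every purl contained in an external-id value.

    The value is either a plain purl string or a JSON-encoded array of purls.
    """
    text = raw.strip()
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            return [text]
        if isinstance(parsed, list):
            return [str(item) for item in parsed]
    return [text]


def choose_purl(raw: str) -> str:
    """Pick the first purl and warn when more than one was present."""
    purls = split_purls(raw)
    if len(purls) > 1:
        warnings.warn(
            "Multiple purls found, using the first one: " + " ".join(purls)
        )
    return purls[0] if purls else ""
```

The warning lists all candidates separated by spaces, so the user can pick a different one manually.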

Crash in mapping Java/Maven project against SW360

I have started a clearing run for a Java/Maven project. Maven generated the initial SBOM, this was converted and filtered with capycli. I can provide intermediate BOMs on request. I run the latest capycli version from GitHub. For the result, capycli was called as:

python3 -m capycli bom map --nocache -i ./clearing_results/bom.json -o ./clearing_results/updated_bom.json -ov ./clearing_results/overview.json -mr ./clearing_results/mappingresult.json

CaPyCli, 2.0.1 - Map a given SBOM to data on SW360

Loading SBOM file ./clearing_results/bom.json
Analyzing token...
Token will expire on 2024-01-21 13:47:13
Checking access to SW360...
No cached releases available!

Do mapping...
Retrieving package-url ids, filter: {'mav'}
Traceback (most recent call last):
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\__main__.py", line 13, in <module>
    cli.main()
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\main\cli.py", line 27, in main
    app.run(argv)
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\main\application.py", line 157, in run
    self._run(argv)
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\main\application.py", line 138, in _run
    handle_bom.run_bom_command(self.options)
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\bom\handle_bom.py", line 82, in run_bom_command
    app.run(args)
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\bom\map_bom.py", line 980, in run
    result = self.map_bom_to_releases(
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\bom\map_bom.py", line 524, in map_bom_to_releases
    self.external_id_svc.build_purl_cache(purl_types, self.verbosity <= 1)
  File "C:\Users\me\.pyenv\pyenv-win\versions\3.10.10\lib\site-packages\capycli\common\purl_service.py", line 49, in build_purl_cache
    all_ids = all_ids + self.client.get_releases_by_external_id("package-url")
TypeError: can only concatenate list (not "dict") to list

This looks... unexpected.
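
The traceback shows that the call returned a dict where a list was expected. A defensive normalisation step could look like the sketch below. `as_release_list` is a hypothetical helper, and the assumption that a dict response carries its entries under "_embedded" / "sw360:releases" is not confirmed by this report.

```python
from typing import Any


def as_release_list(response: Any) -> list:
    """Coerce a REST response into a list of release entries.

    Assumption (not confirmed by the report): when the call does not return
    a list, the entries may sit under response["_embedded"]["sw360:releases"].
    """
    if response is None:
        return []
    if isinstance(response, list):
        return response
    if isinstance(response, dict):
        embedded = response.get("_embedded") or {}
        releases = embedded.get("sw360:releases") or []
        return releases if isinstance(releases, list) else []
    raise TypeError(f"Unexpected response type: {type(response).__name__}")
```

The failing statement would then concatenate `as_release_list(...)` instead of the raw response.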

BOM conversion from legacy to capycli not correct

When we convert BOM from legacy to capycli [ capycli bom convert -i <input.json> -if legacy -o <output.json> -of capycli ] "SourceUrl" from legacy BOM is getting converted to "source archive (local copy)" in capycli BOM.

What should be the actual conversion?
It should be converted to "source archive (download location)"

How to replicate the issue?
Take a simple capycli BOM, convert it to a legacy BOM and then convert that back to a capycli BOM. The two capycli BOMs will differ. A simple capycli BOM to try this with is attached below.

Resolution of the issue?
PR: #63

{
"$schema": "http://cyclonedx.org/schema/bom-1.4.schema.json",
"bomFormat": "CycloneDX",
"specVersion": "1.4",
"serialNumber": "urn:uuid:12432b27-1088-4a8d-a85c-948c1e1812bc",
"version": 1,
"metadata": {
"timestamp": "2024-04-13T06:17:33.177283+00:00",
"tools": [
{
"vendor": "Siemens AG",
"name": "CaPyCLI",
"version": "2.3.0",
"externalReferences": [
{
"url": "https://github.com/sw360/capycli",
"type": "website"
}
]
},
{
"vendor": "Siemens AG",
"name": "standard-bom",
"version": "2.0.0",
"externalReferences": [
{
"url": "https://code.siemens.com/sbom/standard-bom",
"type": "website"
}
]
}
],
"licenses": [
{
"license": {
"id": "CC0-1.0"
}
}
],
"properties": [
{
"name": "siemens:profile",
"value": "capycli"
}
]
},
"components": [
{
"type": "library",
"bom-ref": "pkg:pypi/click@8.1.7",
"name": "click",
"version": "8.1.7",
"description": "Composable command line interface toolkit",
"licenses": [
{
"license": {
"name": "BSD-3-Clause"
}
}
],
"purl": "pkg:pypi/click@8.1.7",
"externalReferences": [
{
"url": "click-8.1.7-py3-none-any.whl",
"comment": "relativePath",
"type": "distribution"
},
{
"url": "click-8.1.7.tar.gz",
"comment": "source archive (local copy)",
"type": "distribution"
},
{
"url": "https://files.pythonhosted.org/packages/00/2e/d53fa4befbf2cfa713304affc7ca780ce4fc1fd8710527771b58311a3229/click-8.1.7-py3-none-any.whl",
"comment": "binary (download location)",
"type": "distribution"
},
{
"url": "https://files.pythonhosted.org/packages/96/d3/f04c7bfcf5c1862a2a5b845c6b2b360488cf47af55dfa79c98f6a6bf98b5/click-8.1.7.tar.gz",
"comment": "source archive (download location)",
"type": "distribution"
},
{
"url": "https://pypi.org/project/click/",
"comment": "PyPi URL",
"type": "distribution"
},
{
"url": "https://palletsprojects.com/p/click/",
"type": "website"
}
],
"properties": [
{
"name": "siemens:primaryLanguage",
"value": "Python"
}
]
}
],
"dependencies": [
{
"ref": "pkg:pypi/click@8.1.7",
"dependsOn": []
}
]
}

Be more resilient when accessing SW360

We should check whether we can add more try/except statements when accessing an SW360 instance.
See for example line 519 in create_components.py - a network error crashes the whole component creation.
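
One way to make such calls more tolerant is a small retry wrapper, sketched below. `call_with_retry` is a hypothetical helper, not existing capycli code, and the default `retry_on` tuple uses Python's built-in exceptions; in practice it would have to name whatever exceptions the HTTP client actually raises.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    action: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    retry_on: tuple = (ConnectionError, TimeoutError),
) -> Optional[T]:
    """Run `action`; retry on the given errors, return None if all attempts fail."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on as ex:
            print(f"  Attempt {attempt}/{attempts} failed: {ex!r}")
            if attempt < attempts:
                time.sleep(delay_seconds)
    return None
```

A caller can treat a `None` result as "skip this component and continue", so a single network error no longer aborts the whole run.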

feature: "project downloadattachments" / "bom uploadattachments"

I currently have the task to synchronize projects' attachments between two SW360 instances.

In my specific case, certain projects including source attachments are uploaded into two different instances, but clearing results (e.g. reports and CLI files) are uploaded in only one of them. So the remaining bit for me would be to download those from instance A and upload it to instance B.

So my idea would be to add following two commands:

    bom
        UploadAttachments    upload all attachments described in a BOM e.g. created by "project DownloadAttachments"
   
    project
        DownloadAttachments    download all attachments (of certain types) and create a BOM listing them

This would require extending the BOM formats by some fields, probably something like

 "externalReferences" : [
   {
    "type": "other",
    "url": "file:///attachments/CLIXML_keyutils_1.6-6-debian-combined.tar.bz2_2019-10-10_10_02_52.xml",
    "comment": "component license information (XML)"
    }
  ]

Alternatively, I could also implement "project uploadattachments", but I think with the suggestion above, it could be used flexibly for multiple use cases.

Together with "project createbom", "bom map" and "bom createcomponents", I think we could even support complete project export/import use cases in the future like:

  • sync projects between different SW360 instances
  • archive projects from SW360 to other systems
  • migrate projects between SW360 and other platforms

What do you think?

create project fails if empty

running the capycli on a new and empty project resulted in this exception:

../capycli/project/create_project.py", line 97, in update_project
    print("  " + str(len(project["_embedded"]["sw360:releases"])) + " releases in project after update")

This was on capycli 1.91,

but the code in 2.0 will suffer from the same issue here:
https://github.com/sw360/capycli/blob/main/capycli/project/create_project.py#L77

If no release is attached to the project yet, it will also fail.
-> Add a check whether "sw360:releases" is in project["_embedded"], as in line 65.
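
A minimal sketch of such a check, written as a standalone helper (`count_releases` is a hypothetical name, not a function in capycli):

```python
def count_releases(project: dict) -> int:
    """Return the number of releases linked to a project, 0 if there are none."""
    embedded = project.get("_embedded") or {}
    return len(embedded.get("sw360:releases") or [])
```

The print statement from the traceback could then use `str(count_releases(project))` and would report 0 releases for an empty project instead of raising.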

GetDependencies JavaScript --search-meta-data - found source archive URL is not written back to the bom item

When using --search-meta-data on the npm version 2 lock file, and the discovered repository URL is a github one, the URL in the BOM item isn't updated correctly. The URL written to BOM file is the original URL from the found metadata and not the actual URL of the source code archive file.

Example of the found metadata:
{'type': 'git', 'url': 'git+https://github.com/fastify/accept-negotiator.git'}

Observed external resource:

{
  "url": "https://github.com/fastify/accept-negotiator.git",
  "comment": "source archive (download location)",
  "type": "distribution"
},

Expected external resource:

{
  "url": "https://github.com/fastify/accept-negotiator/archive/refs/tags/v1.1.0.zip",
  "comment": "source archive (download location)",
  "type": "distribution"
},
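
The expected transformation can be sketched as follows. `github_archive_url` is a hypothetical helper, and the tag name is passed in by the caller because the mapping from package version to tag (here "v1.1.0") is an assumption that differs between repositories.

```python
import re
from typing import Optional


def github_archive_url(repo_url: str, tag: str) -> Optional[str]:
    """Build a GitHub tag archive URL from npm-style repository metadata.

    Returns None when `repo_url` does not point to github.com.
    """
    match = re.match(
        r"^(?:git\+)?(?:https?|git|ssh)://(?:git@)?github\.com/"
        r"([^/]+)/([^/]+?)(?:\.git)?/?$",
        repo_url.strip(),
    )
    if not match:
        return None
    owner, repo = match.groups()
    return f"https://github.com/{owner}/{repo}/archive/refs/tags/{tag}.zip"
```

For the metadata shown above, `github_archive_url("git+https://github.com/fastify/accept-negotiator.git", "v1.1.0")` yields the expected archive URL.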

Clarify the doc between project create vs update.

Current documentation leads to confusion between "capycli project create" and "capycli project update".
Looking at the code, both do the same thing; the major difference is that onlyUpdateProject==True is set in https://github.com/sw360/capycli/blob/main/capycli/project/create_project.py.
Looking further, this sets the 'add' parameter of update_project_releases(self, releases, project_id, add=False) in https://github.com/sw360/sw360python/blob/master/sw360/project.py,
whose documentation says:
If 'add' is True, given 'releases' are added to the project, otherwise, the existing releases will be replaced.
And the code indeed creates a list of releases to be linked to the project by getting the existing ones and adding the new ones.

Therefore, using 'capycli project create' instead of 'capycli project update' would update the project but replace all releases, i.e. remove existing ones.
'create' is actually the only option if we want to update an existing project but remove some releases that do not belong to it anymore.

Please clarify the doc to reveal this major difference.
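
The difference can be modelled in a few lines. This is only a model of the semantics described above, not the sw360python implementation; `releases_after_call` is a hypothetical function and the release ids are made up.

```python
def releases_after_call(existing: list[str], given: list[str], add: bool) -> list[str]:
    """Model which release ids are linked to a project after the call.

    add=True  -> existing releases are kept, new ones are appended
                 (the behaviour described for "project update").
    add=False -> the given releases replace the existing ones
                 (the behaviour described for "project create").
    """
    if not add:
        return list(given)
    merged = list(existing)
    for release_id in given:
        if release_id not in merged:
            merged.append(release_id)
    return merged
```

With existing releases r1 and r2 and a BOM containing r2 and r3, the "update" case ends with r1, r2, r3 while the "create" case ends with r2, r3 only.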

Python scanner for poetry.lock doesn't exclude dev dependencies for Poetry >= 1.5.0

Hello,

Starting with Poetry 1.5.0, the "category" field, which capycli uses to skip dev dependencies, is no longer part of the poetry.lock file (ref: #7637).

This causes capycli to list all dependencies, including dev ones, so in our projects those are loaded into SW360 as well.

I'm not very familiar with Poetry internals, but I guess a solution would require reading the main dependencies from the pyproject.toml file and resolving their transitive dependencies from poetry.lock, because the lock file alone isn't enough to exclude dev dependencies.
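
The resolution step suggested above could be sketched like this. It works on already parsed data and is an illustration only: `runtime_packages` is a hypothetical function, the name normalisation is simplified, and extras, markers and optional dependencies are ignored.

```python
def runtime_packages(main_deps: set[str], lock_packages: list[dict]) -> set[str]:
    """Return the names of all packages reachable from the main dependencies.

    `main_deps` are the direct dependency names from pyproject.toml;
    `lock_packages` are the parsed [[package]] tables of poetry.lock, each
    with a "name" and an optional "dependencies" mapping.
    """

    def norm(name: str) -> str:
        # Simplified name normalisation: lowercase, "_" and "." become "-".
        return name.lower().replace("_", "-").replace(".", "-")

    graph = {
        norm(pkg["name"]): {norm(dep) for dep in pkg.get("dependencies", {})}
        for pkg in lock_packages
    }
    seen: set[str] = set()
    stack = [norm(name) for name in main_deps]
    while stack:
        name = stack.pop()
        # Skip names already visited or not locked (e.g. "python" itself).
        if name in seen or name not in graph:
            continue
        seen.add(name)
        stack.extend(graph[name] - seen)
    return seen
```

Every locked package that is not in the returned set would be treated as a dev dependency and skipped.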

Fix bom map purl not set.

calling capycli bom map in a job, I got the exception

File "/usr/local/lib/python3.10/site-packages/capycli/bom/map_bom.py", line 511, in map_bom_to_releases
if len(component.purl) > 8 and component.purl.startswith("pkg:"):
TypeError: object of type 'NoneType' has no len()

quick fix on line 511
if component.purl and len(component.purl) > 8 and component.purl.startswith("pkg:"):

I wanted to push a branch and create a PR but I get a 403...

refactor testcases to directly use pytest

We already use pytest, but still derive our test classes from traditional unittest.

This is perfectly valid, but we cannot use some nice pytest features that way. As an example, pytest has built-in support for capturing stdout/stderr, so we could drop our own implementation. I also like pytest's more intuitive assert syntax.

The concrete reason I ask is that I recently extended our own capture_stdout implementation to support an arbitrary number of arguments, and now I noticed that with our own implementation, I can't capture stdout and check the return value at the same time. All of that would already be available from pytest.

It would however mean a larger rework, but I think it would be possible to migrate file by file where needed.

If you're interested, @tngraf, I could try adding my new tests for #33 in pytest-style or also rewrite the "project createbom" tests in #37 to pytest style.
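
For illustration, a pytest-style test that checks both the return value and the captured output might look like the sketch below. `print_summary` is a made-up function standing in for real capycli code; `capsys` is pytest's built-in capture fixture.

```python
def print_summary(items: list[str]) -> int:
    """Print one line per item and return the number of items printed."""
    for item in items:
        print("  " + item)
    return len(items)


def test_print_summary(capsys) -> None:
    # capsys is pytest's built-in fixture for capturing stdout/stderr.
    result = print_summary(["click, 8.1.7", "jlfgr, 1.0"])
    captured = capsys.readouterr()

    # Return value and captured output can be checked in the same test.
    assert result == 2
    assert "click, 8.1.7" in captured.out
    assert captured.err == ""
```

No base class and no custom capture helper are needed; running `pytest` on the file picks up the test function by its name.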

Introduce mypy

mypy is a static type checker for Python.
It is also used for all CycloneDX Python projects.
Useful, but it will cost many hours of effort and may affect nearly every file (we need correct typing information!).

"project CreateBom" metadata: license "CC0-1.0" and missing project name/version

When exporting an SW360 project using "project CreateBom", I get the following header:

  "metadata": {
    "timestamp": "2023-07-12T20:23:44.589297+00:00",
    "tools": [
      {
        "vendor": "Siemens AG",
        "name": "CaPyCLI",
        "version": "2.0.1",
        "externalReferences": [
          {
            "url": "https://github.com/sw360/capycli",
            "type": "website"
          }
        ]
      },  
      {   
        "vendor": "Siemens AG",
        "name": "standard-bom",
        "version": "2.0.0",
        "externalReferences": [
          {
            "url": "https://code.siemens.com/sbom/standard-bom",
            "type": "website"
          }
        ]
      } 
    ],    
    "licenses": [
      { 
        "license": {
          "id": "CC0-1.0"
        }
      }
    ],
    "properties": [
      {
        "name": "siemens:profile",
        "value": "capycli"
      } 
    ]     
  },      

According to https://cyclonedx.org/docs/1.4/json/#metadata, this means the BOM itself is licensed under CC0, right? But even in this case, I wonder if Joe Defaultuser (from the industry) expects his BOMs to be flagged as freeware? Is this intended?

crash in "bom downloadsources"

@Garbald reported another crash which I can also reproduce:

poetry run python3 -m capycli bom downloadsources --nocache -i ./bom-christoph-tietz.json -url https://sw360.siemens.com -oa -t ... leads to:

CaPyCli, 2.0.1 - Download source files from the URL specified in the SBOM

Loading SBOM file ./bom-christoph-tietz.json
Downloading source files to folder ./ ...
  logback-classic, 1.3.11
    URL = https://repo1.maven.org/maven2/ch/qos/logback/logback-classic/1.3.11/logback-classic-1.3.11-sources.jar
    Downloading file logback-classic-1.3.11-sources.jar
Traceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/gernot/checkout/capycli/capycli/__main__.py", line 13, in <module>
    cli.main()
  File "/home/gernot/checkout/capycli/capycli/main/cli.py", line 27, in main
    app.run(argv)
  File "/home/gernot/checkout/capycli/capycli/main/application.py", line 157, in run
    self._run(argv)
  File "/home/gernot/checkout/capycli/capycli/main/application.py", line 138, in _run
    handle_bom.run_bom_command(self.options)
  File "/home/gernot/checkout/capycli/capycli/bom/handle_bom.py", line 102, in run_bom_command
    app.run(args)
  File "/home/gernot/checkout/capycli/capycli/bom/download_sources.py", line 196, in run
    self.download_sources(bom, source_folder)
  File "/home/gernot/checkout/capycli/capycli/bom/download_sources.py", line 113, in download_sources
    ext_ref = ExternalReference(
TypeError: __init__() missing 1 required keyword-only argument: 'url'

It seems

ext_ref.url = path
should be moved to the constructor call in line 113.
