parsers's Issues

Conformance Tests

We should write a couple of conformance tests to ensure the parsers are following the design we want to apply to all of them. I will start dumping some thoughts about what I think should be important to capture in the conformance suite:

Tests Across All Ecosystems:

  • Ensure Uniform Package Representation
    We should create test repositories consisting of a simple project with a fixed set of dependencies. Maybe two direct dependencies, one of them with a transitive one. Once we replicate
  • Ensure Uniform License and Copyright Detection
    License data is often found in code comments. We need to make sure all ecosystems are extracting the same data from their own language.
  • Ensure Consistent Hashing
    One of the common problems in SBOMs is the wrong hashing of files. We should ensure all ecosystems produce the same hashes when looking at the same file while expressing their own ecosystem hashes correctly.
  • Common Errors for Repeatable Failures
    While I'm not a fan of predefining errors, I think it is useful when dealing with plugin-like projects to factor out common errors. Errors emitted when the build environment is not ready or complete, execution errors when shelling out, and similar cases are good candidates to unify.

Fix failing tests

It looks like the GitHub workflow needs every package manager installed, and the test applications need to be built with those package managers. Add installation of the package managers (e.g. yarn) and a build step for each test package (yarn install in the right location).

Consider using kubernetes command package

I think we should replace the command execution library with the command package in kubernetes-sigs/release-utils. That library is under constant maintenance and has more features and control.

error creating SBOM, err: writing serialized document: json: error calling MarshalJSON for type *common.Supplier

➜  lsif-node git:(main) ✗ ./sbomgen -o . -f JSON
INFO[2023-08-27T14:53:17+08:00] Starting to generate SPDX ...                
INFO[2023-08-27T14:53:18+08:00] Using npm, current Language Version 6.14.11  
INFO[2023-08-27T14:53:18+08:00] Global Setting File path                     
INFO[2023-08-27T14:53:18+08:00] Parsing . for packages                       
FATA[2023-08-27T14:53:29+08:00] error creating SBOM, err: writing serialized document: json: error calling MarshalJSON for type *common.Supplier: failed to marshal invalid Supplier: {Supplier: SupplierType:Organization} 

The process breaks if either the SupplierType or the Supplier field is empty.
Maybe we need default values for both the name and the type to avoid this, instead of only assigning a name. Like:

mod.Supplier = meta.Supplier{
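
A fuller sketch of the defaulting idea, independent of the snippet above: the Supplier struct below is a hypothetical stand-in that only mirrors the two field names shown in the log output, and the default values (NOASSERTION and Organization) are assumptions, not something the project has agreed on.

```go
package main

import "fmt"

// Supplier is a hypothetical stand-in type. It mirrors the two field names
// printed in the error message, not the real library definition.
type Supplier struct {
	Supplier     string
	SupplierType string
}

// Assumed defaults, for illustration only.
const (
	defaultSupplierName = "NOASSERTION"
	defaultSupplierType = "Organization"
)

// withDefaults fills in whichever field is empty so that neither the name
// nor the type is left blank before serialization.
func withDefaults(s Supplier) Supplier {
	if s.Supplier == "" {
		s.Supplier = defaultSupplierName
	}
	if s.SupplierType == "" {
		s.SupplierType = defaultSupplierType
	}
	return s
}

func main() {
	// Same shape as the failing value in the log: type set, name empty.
	fmt.Printf("%+v\n", withDefaults(Supplier{SupplierType: "Organization"}))
}
```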

Shelling out external network call dependency

Right now, the parsers try to make external network calls for fetching package data.

nuget

https://github.com/opensbom-generator/parsers/blob/main/nuget/helpers.go#L14

Used to fetch:

  • nuget spec
  • checksum

pip

https://github.com/opensbom-generator/parsers/blob/main/pip/worker/pypi.go#L85

Used to fetch:

  • author details
    • There's no need to make an external network call here since this information is available with pip's metadata.
  • checksum
  • download url

Bonus:

For all dependency managers, pip offers the pip inspect command, which returns metadata
for the current environment: package metadata, platform information (this can be used for #25), and a lot more.
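
To make the pip inspect idea concrete, here is a minimal sketch that decodes a small part of such a report without any network call. The struct covers only a handful of fields, the sample JSON is hand-written for the example, and the exact report layout should be checked against the pip documentation before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inspectReport models a small, assumed subset of the JSON report.
// Field selection is illustrative; the real report contains much more.
type inspectReport struct {
	Installed []struct {
		Metadata struct {
			Name    string `json:"name"`
			Version string `json:"version"`
			Author  string `json:"author"`
		} `json:"metadata"`
	} `json:"installed"`
	Environment map[string]string `json:"environment"`
}

// parseInspect decodes the report bytes into the subset defined above.
func parseInspect(data []byte) (*inspectReport, error) {
	var r inspectReport
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, fmt.Errorf("decoding inspect report: %w", err)
	}
	return &r, nil
}

// sample is a hand-written example, not captured tool output.
const sample = `{
  "installed": [
    {"metadata": {"name": "requests", "version": "2.31.0", "author": "Kenneth Reitz"}}
  ],
  "environment": {"sys_platform": "linux", "platform_machine": "x86_64"}
}`

func main() {
	r, err := parseInspect([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, p := range r.Installed {
		fmt.Println(p.Metadata.Name, p.Metadata.Version, p.Metadata.Author)
	}
	fmt.Println("platform:", r.Environment["sys_platform"])
}
```

In the parser, the input bytes would come from running pip inspect in the project environment rather than from a constant.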

Platform Awareness / Option

When analyzing the dependencies of projects, the effective dependencies can change based on what platform the build is targeting. In order to cover those cases, we need to be able to do two things from any of the parsers:

  1. Give the parsers a unified way of knowing which platform they are running on, and
  2. Create an option to specify the target platform so that each parser can pass a platform slug to the underlying tooling.
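
The two points above could be sketched roughly like this. Everything here is an assumption made for illustration: the Platform and Options types, the os/arch slug format, and the fallback to the host platform are not an agreed design.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// Platform is a hypothetical description of a build target.
type Platform struct {
	OS   string
	Arch string
}

// Slug renders the platform as "os/arch" (an assumed format).
func (p Platform) Slug() string { return p.OS + "/" + p.Arch }

// HostPlatform reports the platform the parser is running on, using the
// values the Go runtime was compiled for.
func HostPlatform() Platform {
	return Platform{OS: runtime.GOOS, Arch: runtime.GOARCH}
}

// Options is a hypothetical option set. An empty TargetPlatform means
// "use the host platform".
type Options struct {
	TargetPlatform string
}

// EffectivePlatform resolves the platform a parser should pass on to the
// underlying tooling.
func (o Options) EffectivePlatform() (Platform, error) {
	if o.TargetPlatform == "" {
		return HostPlatform(), nil
	}
	parts := strings.SplitN(o.TargetPlatform, "/", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return Platform{}, fmt.Errorf("invalid platform slug %q, want os/arch", o.TargetPlatform)
	}
	return Platform{OS: parts[0], Arch: parts[1]}, nil
}

func main() {
	p, err := Options{TargetPlatform: "linux/arm64"}.EffectivePlatform()
	if err != nil {
		panic(err)
	}
	fmt.Println("target:", p.Slug())
	fmt.Println("host:", HostPlatform().Slug())
}
```

Each parser would then translate the slug into whatever flag or environment variable its own package manager understands.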

Umbrella: Code Migration

This umbrella issue tracks the progress of the code migration from the generator repository to
the new standalone repository according to the plan agreed upon in the Oct 19th community meeting.

  • Import relevant ecosystem files from the original codebase preserving their history
  • Initialize new go module
  • Update parsers to new import path
  • Import rest of required files
  • Update packages to use the newly migrated general modules
