Comments (14)

pombredanne commented on May 25, 2024

@sschuberth btw, this mapping is not right: https://github.com/heremaps/oss-review-toolkit/blob/518e17c0b1385cc403960cfdfdff69e76240cb27/spdx-utils/src/main/kotlin/SpdxDeclaredLicenseMapping.kt#L120
"CDDL v1.0 / GPL v2 dual license" to (CDDL_1_0 and GPL_2_0_ONLY) should instead be an OR and not an AND, IMHO: a dual license lets the recipient choose one of the two.

from tools-python.

sschuberth commented on May 25, 2024

@nishakm how about creating language-agnostic license mappings in JSON / YAML format that are initially populated with the existing mappings from ORT / the ones that nexB has, and put them in a "neutral" place like probably a new repository at https://github.com/spdx?

nishakm commented on May 25, 2024

> @nishakm how about creating language-agnostic license mappings in JSON / YAML format that are initially populated with the existing mappings from ORT / the ones that nexB has, and put them in a "neutral" place like probably a new repository at https://github.com/spdx?

This sounds good to me if @kestewart and @goneall are OK with creating an independent repo under the spdx github namespace. I am not a license geek but I already feel like a 1:1 mapping is not going to get all the way there. There needs to be some kind of string formatting or downstream processing from there. But a 1:1 mapping to start off would be great!

pombredanne commented on May 25, 2024

@sschuberth re:

> how about creating language-agnostic license mappings in JSON / YAML format that are initially populated with the existing mappings from ORT / the ones that nexB has, and put them in a "neutral" place like probably a new repository at https://github.com/spdx?

I was talking about exactly this yesterday, regarding this very ticket from @nishakm.

@nishakm re:

> I am not a license geek but I already feel like a 1:1 mapping is not going to get all the way there. There needs to be some kind of string formatting or downstream processing from there. But a 1:1 mapping to start off would be great!

The right approach would indeed be not a 1:1 mapping but, IMHO, something like this.

Given these:

  • a license string and/or structured data snippet (to account for old npm styles and Maven structures) as found in a package manifest
  • a package manager type (e.g. a Package URL type)

we then map to:

  • a license expression
  • some indication of confidence (say between 0 and 100) in the accuracy of this mapping
  • some optional notes
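
A minimal sketch of the lookup described above, in Python. The names `LicenseMapping`, `MAPPINGS`, and `map_declared_license` are hypothetical, the confidence values are made up for illustration, and the package types `maven` and `npm` are just sample keys; nothing here reflects the layout of an existing repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenseMapping:
    """Result of mapping one raw declaration (hypothetical structure)."""
    expression: str   # SPDX license expression
    confidence: int   # 0..100, how much the mapping is trusted
    notes: str = ""   # optional free-form notes


# Keyed by (package type, raw declared license string).
# All entries and scores are illustrative assumptions.
MAPPINGS = {
    ("maven", "CDDL v1.0 / GPL v2 dual license"): LicenseMapping(
        "CDDL-1.0 OR GPL-2.0-only", 90, "dual license, so OR rather than AND"
    ),
    ("npm", "Apache 2"): LicenseMapping("Apache-2.0", 80),
}


def map_declared_license(package_type: str, declared: str) -> Optional[LicenseMapping]:
    """Return the mapping for a raw declaration, or None if it is unknown."""
    return MAPPINGS.get((package_type, declared.strip()))


result = map_declared_license("maven", "CDDL v1.0 / GPL v2 dual license")
print(result.expression if result else "no mapping")
```

Keying on the package type as well as the raw string lets the same declared text map differently per ecosystem, and returning `None` for unknown input leaves the fallback (for example, license detection) to the caller.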

Let me create the repo :)

pombredanne commented on May 25, 2024

@nishakm one of the reasons a separate repo is better is that it could be reused in many places beyond this Python tool repo.

pombredanne commented on May 25, 2024

@nishakm Thank you! This is indeed useful, but such a mapping would be most useful in the context of a specific package manifest and type and its actual raw license declaration data and, IMHO, in the context of license detection. As such, the mappings from ORT are generic and would be better if they were package-type specific (e.g. capturing the conventions of npm, Maven, etc.).

But my main point is that I am not sure there is a place in this library where I could make use of this data. When you craft SPDX documents, you already need to have the properly normalized license ids and license expressions, so there would be no place to plug this in. The mapping would have to be applied before that step.

Therefore I think this is something that could be best used in two places:

  1. as a list of aliases when parsing license expressions with the license-expression (https://github.com/nexB/license-expression/) library. This is explicitly supported there, and it would be great to have a list of aliases. A list of generic aliases would be fine here.

  2. as mappings used when you parse the declared licenses of package manifests in the scancode-toolkit. This probably would be another good place.
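
As a rough illustration of the first option, the sketch below replaces generic aliases inside a loosely written expression with SPDX identifiers before the result is handed to a real parser. It uses only the standard library and is not the license-expression API; `ALIASES` and `replace_aliases` are hypothetical names, and the alias entries are sample data.

```python
import re

# Hypothetical generic aliases mapped to SPDX identifiers (sample data).
ALIASES = {
    "GPL v2": "GPL-2.0-only",
    "CDDL v1.0": "CDDL-1.0",
    "Apache 2": "Apache-2.0",
}

# Case-insensitive lookup table for the matched text.
_LOOKUP = {alias.lower(): spdx_id for alias, spdx_id in ALIASES.items()}

# Longest aliases first so a short alias never wins over a longer one.
# The lookarounds keep an alias from matching inside a longer token,
# e.g. "Apache 2" inside "Apache 2.0".
_PATTERN = re.compile(
    r"(?<![\w.-])(?:"
    + "|".join(re.escape(a) for a in sorted(ALIASES, key=len, reverse=True))
    + r")(?![\w.-])",
    flags=re.IGNORECASE,
)


def replace_aliases(text: str) -> str:
    """Replace every known alias in text with its SPDX identifier."""
    return _PATTERN.sub(lambda match: _LOOKUP[match.group(0).lower()], text)


print(replace_aliases("cddl v1.0 OR GPL v2"))
```

Text that contains no known alias passes through unchanged, so the output still needs to be parsed and validated afterwards.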

Feedback welcomed!

sschuberth commented on May 25, 2024

Thanks @pombredanne for the hint, @mnonnenmacher any comments?

nishakm commented on May 25, 2024

> @nishakm Thank you! This is indeed useful, but such a mapping would be most useful in the context of a specific package manifest and type and its actual raw license declaration data and, IMHO, in the context of license detection. As such, the mappings from ORT are generic and would be better if they were package-type specific (e.g. capturing the conventions of npm, Maven, etc.).

Agreed. It would be a much larger undertaking then :)

> But my main point is that I am not sure there is a place in this library where I could make use of this data. When you craft SPDX documents, you already need to have the properly normalized license ids and license expressions, so there would be no place to plug this in. The mapping would have to be applied before that step.

I was going to go off and create an independent Python module for this purpose, but I was told by the SPDX folks that this repo would be a good central location for such a module.

> Therefore I think this is something that could be best used in two places:
>
>   1. as a list of aliases when parsing license expressions with the license-expression (https://github.com/nexB/license-expression/) library. This is explicitly supported there, and it would be great to have a list of aliases. A list of generic aliases would be fine here.

The use case is basically translating what looks like the appropriate license declared somewhere in the artifact into a license expression. So I'm not sure how a license expression parser would help here.

>   2. as mappings used when you parse the declared licenses of package manifests in the scancode-toolkit. This probably would be another good place.

I'd like it to be independent of scancode-toolkit so that other projects that want the same thing can use it as well. But if there are already mappings in there, where in the project might I find them?

> Feedback welcomed!

pombredanne commented on May 25, 2024

@nishakm @sschuberth there it is: https://github.com/spdx/package-licenses-mapping
@nishakm I invited you there as a committer too.

pombredanne commented on May 25, 2024

See spdx/package-licenses-mapping#1 which is the continuation for this ticket

nishakm commented on May 25, 2024

It's what I asked for initially. It was suggested that I file an issue here :)

pombredanne commented on May 25, 2024

@nishakm that's fine, you could have made it clear in the ticket

pombredanne commented on May 25, 2024

@nishakm re:

> The use case is basically translating what looks like the appropriate license declared somewhere in the artifact into the license expression. So I'm not sure how a license expression parser would help here.

A parser would be important for validating that the mapped license expressions are correct and in a canonical form. That's something to add as a test of sorts.
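
A rough sketch of such a test, using only the standard library. It is a simplified stand-in for a real expression parser such as license-expression: it only checks that identifiers and operators alternate, that operators are upper-case AND/OR/WITH (which this sketch assumes to be the canonical form), and that every identifier is in a known set. `KNOWN_IDS` and `is_canonical` are hypothetical names, and parentheses are not checked for balance.

```python
import re

# Hypothetical subset of SPDX identifiers; a real test would load the full list.
KNOWN_IDS = {"Apache-2.0", "CDDL-1.0", "GPL-2.0-only", "Classpath-exception-2.0"}
OPERATORS = {"AND", "OR", "WITH"}


def is_canonical(expression: str) -> bool:
    """Simplified check: known ids separated by upper-case operators."""
    # Parentheses are dropped; nesting is not verified in this sketch.
    tokens = re.sub(r"[()]", " ", expression).split()
    # A valid sequence is id (operator id)*, so the token count must be odd.
    if len(tokens) % 2 == 0:
        return False
    for position, token in enumerate(tokens):
        if position % 2 == 0:
            if token not in KNOWN_IDS:
                return False
        elif token not in OPERATORS:
            return False
    return True


# Every mapped expression in the data set should pass the check.
for mapped in ["CDDL-1.0 OR GPL-2.0-only", "GPL-2.0-only WITH Classpath-exception-2.0"]:
    assert is_canonical(mapped), mapped
```

Running a check like this over all mapping targets would catch raw strings such as "CDDL v1.0 / GPL v2" that slipped through unmapped.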

goneall commented on May 25, 2024

Just catching up on the issue.

I recall discussing this on one of the SPDX calls and I do recall talking about adding the issue to the tools.

I also agree with the above comments that we should have a separate data mapping repo along with tools implementations that support the mapping. I don't recall the discussion precisely, but I don't think anyone had a concern with a neutral mapping repo - just a concern about adding it to the spec since the mapping may be updated quite frequently.

I like the idea of the mapping repo and can use this in the SPDX Java Tools as well.
