spdx / spdx-spec
The SPDX specification in MarkDown and HTML formats.
Home Page: https://spdx.github.io/spdx-spec/
License: Other
The 2.1 spec has:
"Person:" person name and optional "("email")"
and:
PackageSupplier: Person: Jane Doe ([email protected])
However, it would seem more usable if we lean on RFC 2822 and use <> to set off the address. Borrowing from RFC 2822's rules for display-name and angle-addr, the ABNF for PackageSupplier would be:
package-supplier = "PackageSupplier:" 1*(person / organization)
person = "Person:" name-optional-addr
organization = "Organization:" name-optional-addr
name-optional-addr = display-name [angle-addr]
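As a rough illustration of how a tool might consume the proposed syntax, here is a minimal Python sketch. It is not a full RFC 2822 parser: the regular expressions are a simplified stand-in for display-name and angle-addr, it handles a single Person or Organization entry per line, and the function name parse_supplier and the example.com address are hypothetical.

```python
import re

# Simplified stand-in for RFC 2822 "display-name [angle-addr]" (assumption:
# no quoted strings or comments, which the real RFC grammar allows).
NAME_OPTIONAL_ADDR = re.compile(r"^(?P<name>[^<>]+?)\s*(?:<(?P<addr>[^<>\s]+)>)?$")
SUPPLIER = re.compile(
    r"^PackageSupplier:\s*(?P<kind>Person|Organization):\s*(?P<rest>.+)$"
)


def parse_supplier(line):
    """Return (kind, display_name, address_or_None) for one supplier line."""
    outer = SUPPLIER.match(line.strip())
    if outer is None:
        raise ValueError("not a PackageSupplier line")
    inner = NAME_OPTIONAL_ADDR.match(outer.group("rest").strip())
    if inner is None:
        raise ValueError("malformed name or address")
    return outer.group("kind"), inner.group("name"), inner.group("addr")


print(parse_supplier("PackageSupplier: Person: Jane Doe <jane@example.com>"))
print(parse_supplier("PackageSupplier: Organization: Example Corp"))
```

The address is optional, matching the [angle-addr] part of the proposed rule, so the second call returns None in the address slot.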
https://github.com/package-url is an emerging grassroots effort to design a simple and mostly universal package id and locator. This would be a much improved alternative to the partially defined https://github.com/spdx/spdx-spec/blob/cfa1b9d08903befdf03e669da6472707b7b60cb9/chapters/appendix-VI-external-repository-identifiers.md
I help out with the Open Container Initiative, which does this. There's no effect on the rendered content, and putting each sentence on its own line helps with git blame-based workflows. For example, links like this can target a single sentence. And it's easier to see the last commit that touched the content with sentence granularity (instead of paragraph granularity):
$ git blame chapters/appendix-IV-SPDX-license-expressions.md | grep ' 90)'
f902b619 (Thomas Steenbergen 2017-05-02 22:46:48 +0200 90) Sometimes a set of license terms apply except under special circumstances. In this case, use the binary "WITH" operator to construct a new license expression to represent the special exception situation. A valid \<license-expression> is where the left operand is a \<simple-expression> value and the right operand is a \<license-exception-id> that represents the special exception terms.
Spun out from today's legal meeting, I think we should add NONE to license expressions, because external tools like npm's package.json are currently defining UNLICENSED as an extra for the “all rights reserved” case. I'm happy to work up a pull request if/when #37 lands.
There are lots of best-practices documents created by the SPDX tech/outreach teams, but none of them are easy to find. Propose adding an appendix to the spec that links to the various best-practices resources.
In Appendix V, it is not clear whether multiple lines may be used for a compound set of licenses. Suggest changing the following statement:
The SPDX License Identifier syntax may consist of a single license (represented by a short identifier from the SPDX license list) or a compound set of licenses (represented by joining together multiple licenses using the license expression syntax).
to:
The SPDX License Identifier syntax may consist of a single license (represented by a short identifier from the SPDX license list) or a compound set of licenses (represented by joining together multiple licenses using the license expression syntax, which is enclosed in parentheses and may span multiple lines).
This has been moved from bugzilla: https://bugs.linuxfoundation.org/show_bug.cgi?id=1327
Kate Stewart 2015-11-19 19:00:15 UTC
Jilayne wrote:
in http://lists.spdx.org/pipermail/spdx-tech/2015-November/002905.html
in http://wiki.spdx.org/view/Technical_Team/Minutes/2014-09-16#Case_sensitivity_for_license_information - the tech team discussed this on 16 Sept 2014, note saying “License ID’s case sensitive”
and then the legal team discussed it - http://wiki.spdx.org/view/Legal_Team/Minutes/2014-09-18 - and concluded:
• Mark raised issue of whether SPDX License List short identifiers and (new) license expression operators should be case sensitive with the Tech Team and discussed further here: decided that for purposes of spec, in terms of a legitimate value, both could be case insensitive (but best practice would be to display with precise capitalization). Mark to go back to tech team with this decision.
So… looks like maybe we didn’t really capture this elsewhere? In any case, I don’t see a reason to have them be case sensitive in terms of matching (for tools), but have them display with the upper/lower case as they are shown in the SPDX License List - it’s easier for humans to read/spot :)
Kate Stewart 2015-11-19 19:01:50 UTC
I'll add it to the 2.1 version of the spec. Also consider adding this as an addendum/erratum for 2.0.
Kate Stewart 2015-12-22 18:13:49 UTC
Discussed on 12/22 - no concerns, going forward with documenting.
Bill Schineller 2016-05-10 17:53:56 UTC
didn't jump out at me where / if we made edit yet to SPDX 2.2
todo
Kate Stewart 2016-05-17 17:01:29 UTC
Have proposed edit to 6.1, and Appendix I. Let's review.
Kate Stewart 2016-05-17 17:14:40 UTC
In discussion, some concern about other tools and matching in future.
Circling back this discussion to include Mark Gisi.
Bill Schineller 2016-05-17 17:15:33 UTC
fwiw:
from http://lists.w3.org/Archives/Public/www-rdf-interest/2003Aug/0002.html
RDF is case-sensitive. From the last call Concepts working draft:
Two RDF URI references are equal if and only if they compare as
equal, character by character, as Unicode strings.
-- http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref
An upper-case 'A' and a lower-case 'a' are different Unicode characters.
Bill Schineller 2016-05-24 17:13:32 UTC
Kate / Jilayne agreed to leave the Spec language as-is for 2.1
as-is means 'it IS case-sensitive'
leaving ticket open with Version 'unspecified' in case we want to revisit in the future.
We were reluctant to make case-insensitive now for 2.1 without understanding the impacts case might have on URIs (website, other tools, RDF graphs, ...)
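For reference, the tool-side behaviour Jilayne describes (match without regard to case, display with the License List's capitalization) could look roughly like the Python sketch below. This is only an illustration of that suggestion, not what the 2.1 spec requires, since the thread concluded that the spec stays case-sensitive. The list of identifiers is a tiny illustrative subset and the helper name canonical_license_id is hypothetical.

```python
# Tiny illustrative subset; the full list is published at spdx.org/licenses.
CANONICAL_IDS = ["MIT", "Apache-2.0", "GPL-2.0", "BSD-3-Clause"]

# Lookup table from lower-cased identifier to its canonical spelling.
_BY_LOWER = {license_id.lower(): license_id for license_id in CANONICAL_IDS}


def canonical_license_id(candidate):
    """Match case-insensitively; return the canonical spelling or None."""
    return _BY_LOWER.get(candidate.strip().lower())


print(canonical_license_id("apache-2.0"))
print(canonical_license_id("not-a-license"))
```

Note that this normalizes only the short identifier. Any URI built from the identifier would still compare character by character, which is the RDF concern raised in the thread.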
The 2.1 spec is not particularly DRY on idstring values. There are a number of local definitions that match up with 1*(ALPHA / DIGIT / "-" / "."), a definition that includes + (perhaps from before it was a Licence Expression operator?) and a “defined in Appendix” (without specifying which appendix). I think we should extend our use of ABNF to include more than just appendix IV. We'd define idstring (or just id?) in the first place we needed it (here?), and then later sections would link that earlier definition and consume its ABNF rule.
I'm happy to work up a PR for this if it sounds useful.
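To make the rule concrete, here is a small Python sketch of a validator for the 1*(ALPHA / DIGIT / "-" / ".") form quoted above. ABNF's ALPHA and DIGIT are the ASCII letters and digits, so the character class is spelled out explicitly; the function name is_idstring is hypothetical.

```python
import re

# 1*(ALPHA / DIGIT / "-" / "."): one or more ASCII letters, digits,
# hyphens, or periods.
IDSTRING = re.compile(r"[A-Za-z0-9.-]+")


def is_idstring(value):
    """True when the whole value matches the idstring rule."""
    return IDSTRING.fullmatch(value) is not None


print(is_idstring("LicenseRef-1.0"))
print(is_idstring("GPL-2.0+"))
```

The second call is rejected because "+" is not part of this variant of the rule, which is the inconsistency between local definitions that the issue points out.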
Thomas: Expand §8.3.4 Annotation Type with new values “LICENSE” | “PATENT”. Enables annotator to more precisely indicate type of annotation
Discussed use case brought up by customers - will save research. OpenCV is case cited. Be able to annotate that a patent has expired, etc.
Use LICENSE when it's not 100% clear, so may want to provide information about equivalence or not with another. Zlib 1.0.6 and another close to it.
Different lawyers handle different roles, want to give lawyers comments that apply to the appropriate reviewers.
Thomas: Copyright holder may be out of business, so may want to have a “COPYRIGHT” as well.
Discussion of who adds the and roles between Alexios & Thomas.
Desire to permit multiple TYPES to be used with ANNOTATION
@tsteenbe Following up on the tech call on 10 July, I updated the license generation tools to create a markdown page for the license list: https://github.com/spdx/license-list-data/blob/master/licenses.md
To make this work for the license-list-data repository, the links and the text are different from the license list chapter itself. You are welcome to use the licenses.md to help generate the chapter text, but it may be easier to just generate the page using a node.js script using a well structured JSON file as input.
I would recommend using the JSON table of contents page at https://github.com/spdx/license-list-data/blob/master/json/licenses.json and https://github.com/spdx/license-list-data/blob/master/json/exceptions.json. The structure and tag names for these pages are stable. If you would like to get the license list for a specific version, check out the tag named after that version.
Extend Appendix VI: External Repository Identifiers category PACKAGE_MANAGER with common developer services: cocoapods, rubygems, pip, sbt, etc.
Yev Bronshteyn 2016-05-30:
The "PACKAGE-MANAGER" category is inconsistent with other names, where we use underscore instead of hyphen (such as "DISTRIBUTION_ARTIFACT" or "DATAFILE_OF" in relationship).
The categories are not demonstrated in the RDF examples. To demonstrate them, we would need to, ideally, represent them with URIs, e.g.
<category rdf:resource="http://spdx.org/rdf/terms#referenceCategory_package_manager" />
This also means categories need to be added to the ontology.
Lastly, upon further reading, I would recommend separating the "target" property in RDF into two: "type" and "locator", which are terms we already define separately. Unlike the tag format, which aims to be readable, the core tenet of RDF is to be resolvable. This way, type can be represented in RDF by a URI that can resolve to provide more information about the target. We can define the vocabulary of that as part of the ontology work for SPDX 2.1 - it needn't be in the spec.
So an example of a full external reference in to a standard repository might be:
<spdx:Package rdf:about="http://yevster.com/packages/foobar">
  <spdx:externalRef>
    <spdx:ExternalRef>
      <spdx:referenceCategory rdf:resource="http://spdx.org/rdf/terms#referenceCategory_package_manager" />
      <spdx:referenceType rdf:resource="http://spdx.org/rdf/refeferences/maven-central" />
      <spdx:referenceLocator>org.apache.commons:commons-lang:3.2.1</spdx:referenceLocator>
    </spdx:ExternalRef>
  </spdx:externalRef>
</spdx:Package>
Yev Bronshteyn 2016-05-31 04:55:57 UTC
It should be pointed out that the approach described for external reference types above is the same one that we use for all other "listed values", including license IDs, relationship types, etc. Anything that's listed in the spec (the body or appendix) is identified by a URI in RDF format. I submit that this should also be the case for reference types.
Note: This was https://bugs.linuxfoundation.org/show_bug.cgi?id=1361
This has been transferred from: https://bugs.linuxfoundation.org/show_bug.cgi?id=1295
Bill Schineller 2015-06-23 16:08:02 UTC
Capture External Identifiers (e.g. Maven GAV, NIST CPE) by which a Package is known in SPDX doc.
So that SPDX data can be easily correlated with data that other repositories, package management, build systems have about the package.
Each of these external systems has their own format for a specific version of a 'package' (what SPDX calls a package, other systems might call an 'artifact' or Vendor-Product-Version...)
Maven
Format: :[:]
Example: activemq:activemq-transport-http:1.3
CPE (Common Platform Enumeration) see https://cpe.mitre.org/specification/
Format: cpe:/a:::[:][: | packed field]
Example: cpe:/a:acegisecurity:acegi-security:1.0.3
Rubygems
Format: [/]
Example: ActionTimer/0.0.2
npmjs
Format: [/]
Example: rethinkdbdash/1.16.3
NuGet
Format: [/]
Example: AForge.Controls/2.2.3
Bill Schineller 2015-06-23 16:17:23 UTC
Bill Schineller 2015-06-23 17:47:14 UTC
Per conversation on tech concall, we should be clear that we only want to accept External Identifiers that point to a specific, discrete version of software / set of files, i.e. no wildcards, no 'this version or greater' semantics.
• the 'namespace', i.e. what system the identifier is unique within, is critical to this
• where to find the repository online is important
• requirements for a 'repository' (repository of information, not necessarily repository of bits) to be legitimate (the identifier must be unique within that repository)
• should be able to get a hardcopy of the software? (nah, NIST CPE is just a list...)
• is there a way to factor out the list of repositories from the spec? maybe a list of 'repositories of information' that we might maintain on spdx.org?
Bill Schineller 2015-07-14 13:59:40 UTC
Draft spec proposal at https://docs.google.com/document/d/1j6LWnkh5GbMV9Xo5_zJ0wTNLROEIa4o1OU279YueI90/edit?usp=sharing
Kate Stewart 2015-12-22 18:49:01 UTC
This is still a work in progress for tech team.
Bill Schineller 2016-05-10 17:41:36 UTC
in Section 3.21 and the new Appendix VI (6) of SPDX 2.1 near-final draft. Note: the Appendix has a finite list of some External repositories, e.g. NIST Common Platform Enumeration (CPE) and Maven GAV. SPDX 2.1 chose not to try to implement custom-defined External repos not in the list. Also a relatively coarse-grained list of Categories.
Bill Schineller 2016-05-24 17:54:25 UTC
reassigning to Kate, to pull the proposed Appendix VI into the SPDX version 2.1 spec.
Bill Schineller 2016-06-28 17:29:29 UTC
Appendix VI: External Repository Identifiers was pulled into SPDX 2.1 https://docs.google.com/document/d/112x3s3g1Qg2tj8bjvIPsqIBlWUp3Sob37cvAx2eiS6U/edit#heading=h.hb0u4akk190q One pending issue is how to have a single type for 'debian' but be able to differentiate different distro versions jessie, wheezy, ...
Transferred from bugzilla: https://bugs.linuxfoundation.org/show_bug.cgi?id=1356
Kate Stewart 2016-05-19 14:06:22 UTC
see: http://lists.spdx.org/pipermail/spdx-tech/2016-May/003101.html
and from farther down the thread.
I see how making the SHA1 algorithm non-mandatory would be a breaking change, and that we'd like to avoid that. But maybe we could at least allow SHA1GIT as an additional algorithm and add it to the spec.
WRT the use-case you're asking for: It's all about performance. In our case scanners actually do scan Git checkouts most of the time, as dependencies (be it build time or run time) are usually included as Git submodules. When scanning these files, it does not make much sense to force the scanner to calculate the SHA1 on each file (in order to create valid SPDX) if the SHA1GIT is already known. However, I have to admit that getting the blob SHA1 for a given file name is a rather slow operation in Git, and for single small files (which is not uncommon for source code files) it might actually be faster to calculate the SHA1 instead of looking up the known SHA1GIT.
Finally, there's also the "reverse" use-case: Suppose you have an SPDX file with a bunch of File Checksums given, and you'd like to know which are the candidate Git commits these files can originate from. If only the SHA1s are given, you'd have to iterate over all eligible commits in your Git repository, check out the files, and calculate the SHA1 on them to see whether there's a match. With the SHA1GIT on the other hand, you could directly search Git's object database to find the trees / commits that contain the given blobs.
I agree it probably is an edge-case, but maybe still enough reason to at least allow SHA1GIT as a File Checksum algorithm.
Regards, Sebastian
Bill Schineller 2016-05-24 17:19:48 UTC
Decided not to change Spec version 2.1 with respect to mandatory SHA1.
Also at present not adding sha1git as a checksum type in Spec version 2.1
Changing the whole story around checksums is the type of thing we would consider discussing for an SPDX version 3.0.
(For now we hope to encourage consistent re-usable SPDX documents by sticking with our current approach of uniquely identifying each file by a SHA1)
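To show why the two values differ for the same file, here is a short Python sketch. Git computes a blob's object ID as the SHA-1 of a header ("blob", a space, the content length in decimal, and a NUL byte) followed by the content, whereas the SPDX File Checksum is the SHA-1 of the content alone. The function names are hypothetical, and "SHA1GIT" is just the label used in this thread, not an algorithm name defined by the 2.1 spec.

```python
import hashlib


def sha1_plain(data: bytes) -> str:
    """SHA-1 of the raw file content, as used for the SPDX File Checksum."""
    return hashlib.sha1(data).hexdigest()


def sha1_git_blob(data: bytes) -> str:
    """SHA-1 that Git assigns to a blob: header plus content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


content = b"example file content\n"
print(sha1_plain(content))
print(sha1_git_blob(content))
```

Because the header is mixed into the hash, neither value can be derived from the other without the file content, which is why a scanner that only knows the Git blob ID still has to read the file to emit a valid SHA1 checksum.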
The following table contains the full names and short identifiers for the SPDX License List, v2.5 which was released July 2016. For the full and most up-to-date version of the SPDX License List as well as other related information, please see http://spdx.org/licenses/
Should we upgrade to v2.6 in the next fix release?
Size of File (optional) - express as number of bytes similar to mechanism used for snippet. Will be useful for heuristics working with snippets and licensing.
This is likely a condition for projects going for CII badging so good thing to do, given public compromises noted. Other notes from earlier discussion on google doc
Enable referring to package repositories by URL, such as Artifactory, Bintray, Nexus, etc.
Examples:
ExternalRef: PACKAGE-MANAGER cocoapods FBSDKCoreKit/4.17.0
ExternalRef: PACKAGE-MANAGER pypi numeral/0.1.0.8
ExternalRef: PACKAGE-MANAGER sbt+repo.scala-sbt.org/scalasbt/sbt-plugin-releases com.sc.sbt %sbt-eslint%1.0.0
ExternalRef: PACKAGE-MANAGER gradle+http://repo.jfrog.org/artifactory#easymock easymock:easymock:2.0
Hi Kate, Gary recommended I open this bug here. Please let me know if this would be better handled elsewhere. Here's the issue:
I noticed a minor issue in the HTML version of SPDX 2.1 specification. All of the HTML links in that section go through the Google redirect service, prepending the SPDX URL with a Google URL (e.g. https://www.google.com/url?q=http://spdx.org/spdx-license-list/matching-guidelines&sa=D&ust=1473291615549000&usg=AFQjCNGAF8fFt6wIxj4Sj1XSOS0LdR2a5A). I'm guessing that this may be due to copy-and-paste from a gdoc draft?
For the PDF version, I didn't do a thorough check, but did look at Appendix II, and this issue does not seem to affect that version. However, I did notice some links to Google Docs in Appendix IV there (the 2nd and 3rd links, to "Appendix I.1" and "Appendix I.2") which are probably meant to be internal links rather than to an s.sfusd.edu Google Doc (this was present in both the PDF and HTML version of the spec).
Best,
Brad
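One way to clean such links mechanically is sketched below in Python. It assumes the redirect always has the shape seen in the example above (host www.google.com, path /url, target in the q query parameter); the function name unwrap_google_redirect is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def unwrap_google_redirect(url):
    """Return the target of a Google redirect link, or the URL unchanged."""
    parts = urlparse(url)
    if parts.netloc in ("www.google.com", "google.com") and parts.path == "/url":
        target = parse_qs(parts.query).get("q")
        if target:
            return target[0]
    return url


wrapped = (
    "https://www.google.com/url?q=http://spdx.org/spdx-license-list/"
    "matching-guidelines&sa=D&ust=1473291615549000"
    "&usg=AFQjCNGAF8fFt6wIxj4Sj1XSOS0LdR2a5A"
)
print(unwrap_google_redirect(wrapped))
```

Links that do not match the redirect shape are returned untouched, so the function can be applied to every href in the HTML export.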
Thomas: Make the matching template formats of license part of the spec - both SPDX listed licenses and NON-SPDX listed licenses. Would like to add matching guidelines annotation to SPDX licenses and to NON-SPDX licenses. Also add templating for copyright holders and dates.
XML specification of license texts. Has templating. Matching guidelines.
Want to add cross references to license that are on the SPDX license list.
Concern: schema to store information about license is ok, but matching templates could become problematic. May be difficult to apply consistently. Old templating language in specification is only available on listed licenses. Make other properties to listed licenses. XML language is being used by legal team to line up with guidelines, but may not be standardized enough. Non-standardized input format, move to output format.
This is possibly 3 different proposals:
Add additional properties to OTHER LICENSE INFORMATION file to bring up to same level as SPDX listed licenses.
Add additional fields for listed licenses, so information present in XML can be made visible as start of output representations (for instance bullets, copyright) (we don’t want them using the input format)
Add in OTHER LICENSE INFORMATION that is not in SPDX license list model to the SPDX license lists (ie. comment)
Some harmonization here is going to be needed. We probably want to include license exceptions in remodeling discussion. This is probably a 3.0 feature.
4.8.2 Intent: Record any copyright notice for the package.
This is in the file section, not package section.
Fix typo in next update.
Note: this was https://bugs.linuxfoundation.org/show_bug.cgi?id=1384.
Which has been closed so we can track all bugs here.
Problem: how do we capture sets of related licenses, esp. Translations
<relatedLicenses>
<relatedLicense relationshipType="official-translation" targetLicenseIdentifier="EUPL-1.1">EUPL-1.1</relatedLicense>
</relatedLicenses>
...
Note: Up to 24 for EUPL, etc.
SPDX 2.1 has 34 mandatory tags. Propose to reduce the number of mandatory fields to minimal fields needed for exchange to reduce friction to participate.
Believe the following fields could be made optional:
§2.4 DocumentName
§2.5 DocumentNamespace - Most of the time these URLs are totally artificial, as producers do not maintain SPDX as linked data
§3.9 PackageVerificationCode - Verification should be optional; license scanners ignore different types of files, such as .git dirs, and as such two scanners can produce different PackageVerificationCode values for the same package
§3.15 PackageLicenseDeclared - PackageLicenseConcluded and PackageLicenseInfoFromFiles provide the same information
§5.3 SnippetByteRange - Making this optional reduces friction to SPDX participation. Maintainer can easily manually specify SnippetLineRange but SnippetByteRange requires tooling
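The PackageVerificationCode point can be illustrated with a short Python sketch. As I read section 3.9 of the 2.1 spec, the code is the SHA1 of the package's file SHA1 values after sorting them and concatenating them without separators; treat that summary, and the function name below, as assumptions to check against the spec text.

```python
import hashlib


def package_verification_code(file_sha1s):
    """SHA1 over the sorted, concatenated per-file SHA1 hex strings."""
    joined = "".join(sorted(file_sha1s))
    return hashlib.sha1(joined.encode("ascii")).hexdigest()


# Hypothetical per-file checksums for the same package.
source_file = hashlib.sha1(b"int main(void) { return 0; }\n").hexdigest()
git_metadata = hashlib.sha1(b"ref: refs/heads/master\n").hexdigest()

# Scanner A skips the .git directory; scanner B includes it.
print(package_verification_code([source_file]))
print(package_verification_code([source_file, git_metadata]))
```

The two printed codes differ even though both scanners looked at the same package, because the set of files each scanner chose to include differs.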
Gary: consider going to profiles? Simpler - same field names, make optional or not. Some redundancy in documentation.
SPDX-Document
2.1.5 SPDXVersion
2.2.5 DataLicense
2.3.5 SPDXID
2.4.5 DocumentName
2.5.5 DocumentNamespace
2.8.5 Creator
2.9.5 Created
SPDX-Package
3.1.5 PackageName
3.2.5 SPDXID
3.7.5 PackageDownloadLocation
3.9.6 PackageVerificationCode
3.13.5 PackageLicenseConcluded
3.14.5 PackageLicenseInfoFromFiles
3.15.5 PackageLicenseDeclared
3.17.5 PackageCopyrightText
SPDX-File
4.1.5 FileName
4.2.5 SPDXID
4.4.6 FileChecksum
4.5.5 LicenseConcluded
4.6.5 LicenseInfoInFile
4.8.5 FileCopyrightText
SPDX-Snippet
5.1.5 SnippetSPDXID
5.2.5 SnippetFromFileSPDXID
5.3.5 SnippetByteRange
5.5.5 SnippetLicenseConcluded
5.8.5 SnippetCopyrightText
Non-SPDX License Identifier
6.1.5 LicenseID
6.2.5 ExtractedText
6.3.5 LicenseName
SPDX-Annotation
8.1.5 Annotator
8.2.5 AnnotationDate
8.3.5 AnnotationType
8.5.5 AnnotationComment
Should we bump the current CC BY 3.0 to the current 4.0 license? The CC covers the improvements here and suggests (in the “Clarity about adaptations” section) that a license bump from 3.0 to 4.0 is a valid adaptation licence choice, so we may not need to collect “I'll provide my previous contributions under the CC BY 4.0 as well” statements from previous contributors.
For the RDF examples, some of the SPDX terms use the spdx: prefix, others use no prefix.
This was https://bugs.linuxfoundation.org/show_bug.cgi?id=1369
Thomas: New tag PackageTag (optional, cardinality - multiple) enables users to add custom (user-defined?) tags to a package for grouping. Useful for automatic assessment of license results.
Example:
PackageName: jUnit
# Similar to Maven dependency scopes e.g. compile, runtime, etc.
PackageTag: scope:test
# Defines type of SW such as build_tool, test_framework, sw_library, utility
PackageTag: type:build_tool
Kate: Currently using Package Comment - but problematic with filtering with other comments that are in the comments. Overloaded, so problematic.
Kate: Could it be a Package Type - like https://spdx.org/spdx-specification-21-web-version#h.7vzbl5vywpa7 ? Yev not sure worth going in this direction.
Gary: Could Annotation be used? Extend annotation types? https://spdx.org/spdx-specification-21-web-version#h.wlc7jg3vsu43
Kate: Use case of package compiled to binary file - would we want to tag a binary or jar file? Thomas, Gary - yes can see it used but not as common as packages
Yev: when documenting a supply chain - what should be tagged, what should be in relationship?
Gary: can see it being useful. Maybe annotations isn’t best approach. But having it applicable to all elements may make sense? Property of element.
Yev: Images, containers, etc.
Alex: Licenses - …
New High level property appropriate to any element?
Yev: We need to be careful with tagging licenses…
Alex: Whatever you put in tags should be interpreted as coming from the author of the SPDX document.
Freeform tagging vs. specific categories - user assertion. Specify tags are declaration of documentation author, vs. Custom tags not declarative. Create adhoc tags, and no meaning as far as SPDX concerns. Spec says it can exist.
Yev: That makes it useful for insourcing
Thomas: Also could be used when customer relationship.
Tending towards: May apply to any element, optional, signal author may want to convey. Enumeration 0 or more.
Yev: Field in Document Scope to describe the meanings of tags? Also could be done in Comments. Explanation of used tags included in creator comment - as best practice.
Conclusion: Keyword Section - may apply to any element, it's optional, with cardinality 0 to many, a signal that authors want to convey. Schedule for 3.0
(spec 7.1 needs to be expanded - which useful ones are missing & definitions.)
We currently have some awkward wording around cardinality (#40). I think the difficulty comes from defining cardinality alongside the child element, when it's really a parent property. I'd rather see each parent clearly define their allowed children, with potential backlinks from children to possible parents. For an example of this in another spec, see HTML's html and p elements, which have a normative “Content model” and a convenience “Contexts in which this element can be used”. Once we shift those around, we could have the root element clearly declare that it could contain zero or more package entries, zero or more file entries, etc.
This would be a rather large change, and there are a number of open PRs already in flight. I'm happy to put together a PR for this, but it would be good to get at least preliminary agreement on the approach first, and ideally have fewer in-flight PRs going on in parallel ;).
Making checksums mandatory is very impractical: you often end up with SPDX docs whose checksums do not match, bit for bit, what you redistribute, making them useless in practice.
Having checksums is not a bad idea, but making them mandatory is a very bad one IMHO, and it makes creating a valid SPDX document difficult without a good reason.
Several links within the 2.1 specification are broken, or it is unclear what they refer to.
Examples
2.9 .. This field is distinct from the fields in section 7, which involves the addition of information during a subsequent review.
Guess this link is incorrect; it refers to the Relationship section. How does this relate to the Created attribute?
5.5 If the Concluded License is not the same as the License Information in File, a written explanation should be provided in the Comments on License field (section X.5). With respect to NOASSERTION, a written explanation in the Comments on License field (section X.7) is preferred.
X.5? X.7? I think both of these should be references to 5.7.
In Appendix I: SPDX License List Master Files -> http://git.spdx.org/?p=license-list.git%3Ba=summary returns a 404; it should point to https://github.com/spdx/license-list
Like #49, but for NOASSERTION instead of NONE. The semantics would be:
NOASSERTION means:
(i) the SPDX License Expression author has attempted to but cannot reach a reasonable objective determination;
(ii) the SPDX License Expression author has made no attempt to determine this field; or
(iii) the SPDX License Expression author has intentionally provided no information (no meaning should be implied by doing so).
That matches our existing usage except for PackageLicenseInfoFromFiles and similar, where we currently drop (i). I don't think those consumers would suffer from the additional case, because I don't see an actionable distinction between those cases. When would you care about the distinction between “tried but gave up”, “did not try”, and “won't tell you”? If folks did care about those distinctions (which I think unlikely), we'd want to be using different tokens for each case.
Other divergent NOASSERTION consumers are:
SnippetLicenseConcluded, which adds an additional case:
the SPDX document creator is uncomfortable concluding a license, despite some license information being available;
I don't think we need to bother with that one at all, since I can't think of a case where I could distinguish between it and the “cannot reach a reasonable objective determination” case, even for license expressions I write myself. But I haven't looked up the background motivation for this case, perhaps it is useful. If so, I don't see the harm in including it for all consumers.
LicenseName, which is completely unrelated to license expressions.
The current license expression syntax states that whitespace must be used between elements of a compound expression. However, it does not explicitly state whether a newline, CR, or LF is included in the definition of whitespace.
Suggest adding a definition of whitespace that includes newlines.
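A tokenizer that already treats line breaks as whitespace shows what the suggested definition would mean in practice. The Python sketch below is only an illustration: the function name is hypothetical, and it relies on the regular-expression class \s, which in Python covers space, tab, CR, and LF among others.

```python
import re

# A token is a single parenthesis, or a run of characters that are
# neither whitespace nor parentheses.
TOKEN = re.compile(r"[()]|[^\s()]+")


def tokenize_expression(expression):
    """Split a license expression into tokens, ignoring all whitespace."""
    return TOKEN.findall(expression)


single_line = "(MIT OR Apache-2.0)"
multi_line = "(MIT OR\r\n    Apache-2.0)"
print(tokenize_expression(single_line))
print(tokenize_expression(multi_line))
```

Both calls print the same token list, so under this reading a compound expression wrapped across lines is equivalent to its single-line form.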
Thomas: See a need for syntax to capture a package's dependencies where the dependencies are specified with a version range, e.g. resulting in non-deterministic builds. SPDX currently only offers to specify dependencies using fixed one-to-one relationships, e.g. Package A depends on Package B v1.1. In reality, Package A specifies that it relies on Package B v1.0 or newer.
Having this in the spec provides package maintainers with a technology-agnostic way to specify their dependencies closer to reality. It provides consumers of these packages with an indicator that including the package may result in a license mix that can change with every build. May also be useful to handle the difference between the declared (by maintainer) and resolved (by package manager) dependencies.
Example - SPDX specifies a dependency on angular 4.1.1, whereas its package.json specifies that it depends on core-js 2.4.1 or newer
Note: approach is not figured out yet, but general agreement that this is a problem and we should look into solving it for the next release.
This was https://bugs.linuxfoundation.org/show_bug.cgi?id=1334
Kate Stewart 2015-12-08 16:57:25 UTC
From David Wheeler on December 4, 2015
The current Appendix IV is also overly complex and confusing:
There's no need to have "compound-expression" as separate from "license-expression". The "license-expression" is defined to be either simple or compound, but a simple-expression is also a legal compound-expression, so the whole indirection is unnecessary and confusing.
In simple-expression, the "+" should just optionally follow license-id; that's how anyone would parse it, and it's easier to explain too.
So I suggest replacing simple-expression, compound-expression (to be removed), and license-expression with this simpler spec:
simple-expression = license-id ["+"] / license-ref
license-expression = simple-expression [ "WITH" license-exception-id ] /
license-expression "AND" license-expression /
license-expression "OR" license-expression /
"(" license-expression ")"
You could change simple-expression to be:
simple-expression = license-id ["+"][ "WITH" license-exception-id ] / license-ref
and omit the ["WITH...] in the following line, but I like the idea of allowing a license-ref with a standard exception. Besides, that's currently allowed, no reason to remove this functionality.
Both this and the original description are silent about left-to-right or right-to-left; I don't think it matters, but if someone wants things to be parsed identically, perhaps that should be mentioned.
I can imagine adding suffixes like "!" (I'm sure it's only this particular version of the license) or "?" (I'm not sure that it's limited to this particular version of the license), in addition to "+".
However, that's a separate discussion.
Also: is there any reason to FORBID the "+" suffix after a license-ref or license-exception-id?
In particular, someone might use a license-ref while waiting for a license to be added to the SPDX license list or exception list.
That would change my proposed grammar above to:
simple-expression = license-id / license-ref
license-expression = simple-expression ["+"] [ "WITH" license-exception-id ["+"] ] /
Kate Stewart 2015-12-15 18:34:44 UTC
From discussion on 20151215 - Mark wants to confirm that the revised version still works in his encoding. Other than that, simpler is better, so once it's proven out, we'll look at changing this in the 2.1 spec.
Bill Schineller 2016-05-17 17:35:11 UTC
Mark?
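To help check whether the simplified grammar is workable, here is a small recursive-descent parser in Python for the first proposed form (simple-expression with an optional "+", an optional WITH clause, AND, OR, and parentheses). It is a sketch under stated assumptions: the proposal is silent on precedence and associativity, so the code assumes AND binds tighter than OR and that both associate left to right; it also rejects "+" after a LicenseRef- identifier, as in that first form. All function names are hypothetical.

```python
import re

OPERATORS = {"AND", "OR", "WITH"}
TOKEN = re.compile(r"\s*([()+]|[A-Za-z0-9.:-]+)")


def tokenize(text):
    """Split text into parentheses, '+', operators, and identifiers."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_identifier(token):
    return token not in OPERATORS and token not in ("(", ")", "+")


def parse_operand(tokens):
    """simple-expression ["WITH" license-exception-id] or "(" expr ")"."""
    if not tokens:
        raise ValueError("unexpected end of expression")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        node, rest = parse_or(rest)
        if not rest or rest[0] != ")":
            raise ValueError("missing closing parenthesis")
        return node, rest[1:]
    if not is_identifier(head):
        raise ValueError(f"unexpected token {head!r}")
    node = head
    if rest and rest[0] == "+":
        if "LicenseRef-" in head:
            raise ValueError('"+" is not allowed after a license-ref')
        node, rest = ("+", head), rest[1:]
    if rest and rest[0] == "WITH":
        if len(rest) < 2 or not is_identifier(rest[1]):
            raise ValueError("WITH needs a license-exception-id")
        node, rest = ("WITH", node, rest[1]), rest[2:]
    return node, rest


def parse_binary(tokens, operator, parse_next):
    """Left-associative chain of operands joined by one operator."""
    node, rest = parse_next(tokens)
    while rest and rest[0] == operator:
        right, rest = parse_next(rest[1:])
        node = (operator, node, right)
    return node, rest


def parse_and(tokens):
    return parse_binary(tokens, "AND", parse_operand)


def parse_or(tokens):
    return parse_binary(tokens, "OR", parse_and)


def parse(text):
    """Parse a license expression into nested tuples."""
    node, rest = parse_or(tokenize(text))
    if rest:
        raise ValueError(f"unexpected token {rest[0]!r}")
    return node


print(parse("GPL-2.0+ WITH Classpath-exception-2.0 OR MIT"))
```

The result is a nested tuple such as ("OR", left, right), which makes it easy to compare how two candidate grammars group the same input.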
Yev Bronshteyn 2016-04-26 17:12:03 UTC
Currently, either all the files in a package must be specified or, via the filesAnalyzed attribute, none.
However, there's a use case for specifying only those files that are exceptions to package-level licensing or, perhaps, other metadata. From the email conversation at http://lists.spdx.org/pipermail/spdx-tech/2016-April/003068.html:
I don't see the value of including the filesAnalyzed tag in my use case. I'm not doing "analysis", I am telling you what the answer is. Others can later do analysis, using that and other data, if they want to. Since this is human-created, I'm trying to minimize the number of lines.
Bill Schineller 2016-05-17 17:57:21 UTC
Won't come to closure on this for 2.1 version of the Spec, so setting bugzilla Version to 'unspecified'
Project Haystack supports multiple formats
• Formats: JSON, Zinc (typed CSV), Trio (YAML)
• RDF/linked data (Microdata, RDFa, JSON-LD, Turtle, SPARQL)
Idea - could we use similar concept to add support for multiple formats in SPDX?
Transferred from https://bugs.linuxfoundation.org/show_bug.cgi?id=1360
Bill Schineller 2016-05-24 17:51:10 UTC
From David Wheeler on mailing list http://lists.spdx.org/pipermail/spdx-tech/2016-May/003091.html
In the Linux Foundation CII "best practices" badge effort I'm noticing an interesting problem. Some projects have different license situations for their source code and documentation, but there's no simple way to express that using SPDX License expressions. Examples of projects where the license isn't easily expressed with SPDX expressions are:
https://bestpractices.coreinfrastructure.org/projects/1
https://bestpractices.coreinfrastructure.org/projects/137
I propose adding a new construct:
"(IF <condition> THEN <license-expression> [ELSE <license-expression>])" to License expressions.
For starters, <condition> can be:
DOCUMENTATION = True if & only if (iff) documentation
SOURCE = True if & only if (iff) source code
So "Source code under MIT, everything else under CC-BY-3.0 or later" becomes this license expression:
"(IF SOURCE THEN MIT ELSE CC-BY-3.0+)".
If there's no "else" and the condition is false, it'd be interpreted as the empty set of rights ("no rights"), so these would mean the same thing:
"MIT OR (IF DOCUMENTATION THEN CC-BY-3.0+)"
"(IF DOCUMENTATION THEN (MIT OR CC-BY-3.0+) ELSE MIT)"
I imagine Condition could be beefed up to allow AND/OR/NOT, file matching, jurisdiction matching, and comparisons with the current date (for timed releases in the future). But that's for a later discussion.
--- David A. Wheeler
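The claimed equivalence of the two expressions above can be checked with a small model. This is only a sketch under stated assumptions: an expression is evaluated to the set of licenses a recipient may choose from, "no rights" is the empty set, and OR is set union. The class names License, Or and If and the function choices are hypothetical, and AND is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class License:
    name: str


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class If:
    condition: str
    then: "Expr"
    otherwise: Optional["Expr"] = None


Expr = Union[License, Or, If]


def choices(expr, facts):
    """Return the set of license names available under the given facts."""
    if isinstance(expr, License):
        return {expr.name}
    if isinstance(expr, Or):
        return choices(expr.left, facts) | choices(expr.right, facts)
    if isinstance(expr, If):
        if facts.get(expr.condition, False):
            return choices(expr.then, facts)
        if expr.otherwise is None:
            return set()  # no ELSE and a false condition: "no rights"
        return choices(expr.otherwise, facts)
    raise TypeError(f"unsupported expression: {expr!r}")


MIT = License("MIT")
CC = License("CC-BY-3.0+")

# "MIT OR (IF DOCUMENTATION THEN CC-BY-3.0+)"
first = Or(MIT, If("DOCUMENTATION", CC))
# "(IF DOCUMENTATION THEN (MIT OR CC-BY-3.0+) ELSE MIT)"
second = If("DOCUMENTATION", Or(MIT, CC), MIT)
```

Under this model both expressions yield the same set whether DOCUMENTATION is true or false, which matches the statement that they mean the same thing.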
There are no good reasons for ( parens ) to be mandatory for most compound expressions.
The only cases I can fathom would be:
Therefore, they should be optional and used only in the cases where they are needed. This would make things much simpler.
Tool-wise, any decent boolean expression handler does not care much about unneeded parens. If there are tools that do, they should be fixed.
Allow relationship types to be predicates
(e.g. http://mynamspace#mypackage spdx:contains http://mynamespace#myfile).
Verbose due to the presence of optional comment field
Details from discussion from Yev:
Package → Relationship
Relationship → File
Package:contains → File
contains: http://myname:myFile
<.... id="myPackage">
<spdx:contains id="http://myname:myFile" />
</rdf:Description>
http://myNamespace#myPackge spdx:contains http://myNamepsace#myFile
Thomas notes they are using relationship comments to customize relationships, so he would not like to see this ability removed. Likes the proposal, but does not want to see “other” removed.
Yev: both should be valid (short version, as well as original). Yev to provide example. Not make old ones go away, just enable addition of a concise way of expressing relationships.
Thomas agrees Tag value will become clearer as result of having this additional syntax.
Open question - can annotation describe a relationship? Based on model, not able to. So not a solution at this point.
Thanks to Alexios for noticing typo.
There are several properties used in the SPDX Listed Licenses which are not documented in the specification.
They are currently documented in the RDFa terms used section of the Accessing SPDX Licenses document. There are also references to these fields in the License XML Elements and Attributes document.
Missing elements include:
Propose we add another appendix Listed License Information which details out all the fields including those in common with extracted license text (e.g. licenseId, etc.).
Enable referring to package repositories by URL, such as Artifactory, Bintray, Nexus, etc.
Examples:
ExternalRef: PACKAGE-MANAGER cocoapods FBSDKCoreKit/4.17.0
ExternalRef: PACKAGE-MANAGER pypi numeral/0.1.0.8
ExternalRef: PACKAGE-MANAGER sbt+repo.scala-sbt.org/scalasbt/sbt-plugin-releases com.sc.sbt %sbt-eslint%1.0.0
ExternalRef: PACKAGE-MANAGER gradle+http://repo.jfrog.org/artifactory#easymock easymock:easymock:2.0
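The examples above appear to follow a "category type locator" layout, with an optional repository joined to the package manager name by "+". A minimal parsing sketch under that reading follows; the ExternalRef tuple, its manager and repository properties, and parse_external_ref are hypothetical names, and the "+" convention is inferred from the examples rather than defined anywhere in the proposal.

```python
from typing import NamedTuple


class ExternalRef(NamedTuple):
    category: str
    ref_type: str
    locator: str

    @property
    def manager(self):
        # Part of the type before "+", e.g. "gradle" (assumed convention).
        return self.ref_type.partition("+")[0]

    @property
    def repository(self):
        # Part of the type after "+", or "" when no repository is given.
        return self.ref_type.partition("+")[2]


def parse_external_ref(line):
    """Split an 'ExternalRef: <category> <type> <locator>' line into its parts."""
    tag, sep, value = line.partition(":")
    if not sep or tag.strip() != "ExternalRef":
        raise ValueError(f"not an ExternalRef line: {line!r}")
    # The locator may itself contain spaces, so split at most twice.
    parts = value.strip().split(None, 2)
    if len(parts) != 3:
        raise ValueError(f"expected category, type and locator: {line!r}")
    return ExternalRef(*parts)


ref = parse_external_ref(
    "ExternalRef: PACKAGE-MANAGER gradle+http://repo.jfrog.org/artifactory#easymock easymock:easymock:2.0"
)
```

Splitting at most twice matters for the sbt example, whose locator contains a space.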
Agreement from those on the call, yes it should be at File & Package level.
This will be an optional field.
Useful for generating notice files, etc.
Consider does it make sense for snippets?
Adding “Attributions” to package field to store required attributions (per Oliver)
Alexios: property named FileAttributionText which will hold the text that has to be reproduced.
It can be considered a combination of information found in properties like FileCopyrightText, LicenseInfoInFile, and FileContributor, but it might not simply be the sum of these values.
The relevant part in the spec could be something like:
4.xx File Attribution Text
4.xx.1 Purpose: This field provides a place for the SPDX data creator to record all attributions found in the file that are required to be communicated. These typically include copyright statement(s), license text, and a disclaimer.
4.xx.2 Intent: The intent is to provide the recipient of the SPDX file with all the legally required attributions in the file, therefore complying with the license obligations.
4.xx.3 Cardinality: Optional, one.
4.xx.4 Data Format: free form text that can (and usually will) span multiple lines
4.xx.5 Tag: "FileAttributionText:"
In Tag:value format, the multiple lines are delimited by "<text>" and "</text>".
Example:
FileAttributionText: <text>
# Copyright (C) 2004 Free Software Foundation, Inc.
# Written by Scott James Remnant, 2004
#
# This file is free software; the Free Software Foundation gives
# unlimited permission to copy and/or distribute it, with or without
# modifications, as long as this notice is preserved.
</text>
4.xx.6 RDF: property fileAttributionText in class spdx:File
Example:
<File rdf:about="...">
<fileAttributionText>
# Copyright (C) 2004 Free Software Foundation, Inc.
# Written by Scott James Remnant, 2004
#
# This file is free software; the Free Software Foundation gives
# unlimited permission to copy and/or distribute it, with or without
# modifications, as long as this notice is preserved.
</fileAttributionText>
</File>
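For the tag:value form shown earlier, a reader has to join everything between <text> and </text> into one value. The sketch below illustrates one way to do that; parse_tag_values is a hypothetical helper, it assumes one tag per line, and it is not the behaviour of any existing SPDX tool.

```python
def parse_tag_values(document):
    """Return (tag, value) pairs, joining <text>...</text> spans into one value."""
    pairs = []
    lines = iter(document.splitlines())
    for line in lines:
        if line.lstrip().startswith("#"):
            continue  # whole-line comment outside a <text> block
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # not a tag line
        value = value.strip()
        if value.startswith("<text>"):
            body = [value[len("<text>"):]]
            # Keep consuming lines until the closing delimiter appears.
            while "</text>" not in body[-1]:
                following = next(lines, None)
                if following is None:
                    raise ValueError("unterminated <text> block")
                body.append(following)
            body[-1] = body[-1].split("</text>", 1)[0]
            value = "\n".join(body).strip("\n")
        pairs.append((tag.strip(), value))
    return pairs


sample = """FileAttributionText: <text>
# Copyright (C) 2004 Free Software Foundation, Inc.
# Written by Scott James Remnant, 2004
</text>
FileName: ./example.sh"""

result = parse_tag_values(sample)
```

Lines starting with "#" inside the block are kept as part of the attribution text, because the inner loop consumes them before the comment check runs.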
We're currently using # and ## to mark h1 and h2 headers (e.g. here and here). But once we get down to h3 and beyond, we start using emphasis ** instead of headers ### (e.g. here). We also stop providing anchors. That means that, while we can link to h2 headers (like this), there's no way to link to the h3+ sections. If folks are ok with it, I'd like to file a PR that converted our h3+ headers to use ###, etc. and gave them all anchors like we have for h2. Thoughts?
Thomas: Instead of PackageCopyrightText and CopyrightText, introduce new PackageCopyrightHolder and CopyrightText. Both are optional, and more than one entry can exist per SPDX-File or SPDX-Package. This better indicates individual rights holders and makes parsing of this data easier.
Kate: https://spdx.org/spdx-specification-21-web-version#h.2grqrue - relax one to many?
Yev: Copyright holder implies present tense, could have been reassigned. Declared vs concluded may be required, not sure we want to go there.
Thomas: Changing cardinality would help.
Yev: We’ll need to apply this to Files & Snippets, as well.
Gary, Kate, Yev, Alexios, Thomas - all +1 on relaxing cardinality. 2.2, permit cardinality to go from one to many. Shelve copyrightholder until more compelling use case.
Tags: PackageCopyrightText, FileCopyrightText, SnippetCopyrightText; RDF: property spdx:copyrightText
Thomas: Extend SPDX-Package with optional “PackageNameAliases” field to record aliases to PackageName. Use case - support renamed packages or the same package that has different names on various distribution platforms. Example: MySQL <-> MariaDB
Idea is being able to record alternate names for same package.
This would be optional. Want to be able to semantically detect this.
Yev: possibly change the cardinality? That would prevent the document creator from stating what the “official name” is.
Gary: Might be valuable to retain an official name? Can see it both ways. Use case, package originator.
Yev: Gets to declared/concluded dichotomy, are we sure we want to go there?
Gary: Not sure, can think of some cases either way.
Yev: Maybe introduce AlternateName (for package, file, license, snippet, etc.) and leave the specified package name as the one chosen by the SPDX document author as the authoritative voice.
Gary: Apply at element level makes sense.
Yev: Not sure if files should have alternate name though. File encoding could be problematic. We’ll need to apply to each with semantic, so not sure we got to there.
Yev: Prefers AlternateName over “..Alias” concept. Cardinality 0 to many.
Thomas: Is good with AlternateName terminology. Only question is should we prefix it? Ie. PackageAlternateName in tag:value. And then AlternateName in RDF.
Yev & Thomas in agreement.
Yev: If one or more alternate names are provided, is PackageName needed?
Gary, Kate: Yes.
Alexios: Only for Packages right now? Yev: Yes, lets limit it to this right now.
Can think of case that it would be good to have license name alternates…
Yev: Possibly, but may want to consult further with legal. Snippet names avoid. File names - no.
Conclusion: ok to add PackageAlternateName as optional field with 0-many.
Comments in SPDX documents (depends on file formats).
In RDFa/XML - there is a specific term defined.
In tag:value - # - at start of line - need to be added to document. Entire line comments.
No mid-line comments in SPDX document.
Comments take the form of '#', as the first non-blank character, and continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker. Comments are treated as white space.
⇒ Alexios recommends aligning with Turtle
What about SPDX-License-Identifier: ( ) in source code.
"#" ok if disallowed character? Need to check specification.
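The comment rule discussed above ('#' as the first non-blank character, running to the end of the line, treated as white space) can be sketched as follows. strip_comments is a hypothetical helper; it does not handle '#' lines inside <text> blocks, which would need separate treatment.

```python
import re


def strip_comments(text):
    """Blank out whole-line comments; a '#' later in a line is left alone."""
    kept = []
    # Lines end at U+000D, U+000A, or the pair, per the rule above.
    for line in re.split(r"\r\n|\r|\n", text):
        if line.lstrip().startswith("#"):
            kept.append("")  # the comment is treated as white space
        else:
            kept.append(line)
    return "\n".join(kept)


cleaned = strip_comments("# header\nPackageName: foo\r\n  # indented\nFileName: a#b")
```

Keeping an empty line in place of each comment preserves line numbers, which can help when a tool reports errors against the original document.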
Would be nice if the SPDX specification described its own licensing with an SPDX file, as CC-BY-3.0 AND MIT