
Comments (12)

zvr commented on August 20, 2024

The example includes the lines:

   "fileName": "internal/helm/testdata/charts/helmchart-0.1.0.tgz",
   "fileTypes": [
    "APPLICATION",
    "ARCHIVE"
   ]

In this case, the correct fileType for the file is application/vnd.cncf.helm.chart.content.v1.tar+gzip.

Both application and archive are valid filePurpose values and can both be present.

from spdx-3-model.

armintaenzertng commented on August 20, 2024

So you are saying that the contentType has to be determined from the filename alone (the provided SPDX2 fileTypes APPLICATION and ARCHIVE are no big help in this).

But for an automated conversion approach, extracting the contentType from the filename is not feasible/maintainable. The list of possible MIME types is huge and dynamic, and the helmchart example above makes use not only of the file extension but also of (from an algorithm's perspective) seemingly random other parts of the filename.
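To illustrate the point, here is a minimal sketch using Python's standard `mimetypes` module, which guesses from the file extension alone. For the Helm chart path quoted above it can only report a generic tar archive with gzip encoding; nothing in the extension leads to the Helm-specific media type. The variable names are illustrative only.

```python
import mimetypes

# A fresh MimeTypes instance, so the guess relies on the module's
# own extension table rather than on anything configured elsewhere.
guesser = mimetypes.MimeTypes()

path = "internal/helm/testdata/charts/helmchart-0.1.0.tgz"
content_type, encoding = guesser.guess_type(path)

# ".tgz" is treated as ".tar.gz": a tar archive with gzip encoding.
print(content_type, encoding)  # application/x-tar gzip
```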

Furthermore, I found this warning, indicating that the file extension might not necessarily be the source of truth for the contentType:

Browsers use the MIME type, not the file extension, to determine how to process a URL, so it's important that web servers send the correct MIME type in the response's Content-Type header.

In light of this, I don't think an automated conversion can do much more than a basic conversion from the fileType Enum to some very generic MIME types. Two of these generic types are text/plain and application/octet-stream. For audio/image/video I have yet to find such a generic analogue.

I will prepare a conversion table for this to help the discussion.

armintaenzertng commented on August 20, 2024

Here is a conversion table from fileType to contentType or SoftwarePurpose. I would consider this issue closed as soon as this table contains no question marks anymore.

| SPDX2 File Type | MIME Content Type | SoftwarePurpose |
|---|---|---|
| SOURCE | | source |
| BINARY | application/octet-stream | |
| ARCHIVE | | archive |
| APPLICATION | | application |
| AUDIO | audio/??? | |
| IMAGE | image/??? | |
| TEXT | text/plain | |
| VIDEO | video/??? | |
| DOCUMENTATION | | documentation |
| SPDX | text/spdx for tag-value, application/spdx+json for json; what about the other formats? | |
| OTHER | | other |

goneall commented on August 20, 2024

SPDX - for tag/value, the media type is text/spdx, for JSON the media type is application/spdx+json.

iamwillbar commented on August 20, 2024

One of the things that we identified is that FileType was being used for two things:

  1. Describing the purpose of the file.
  2. Describing the type of content in the file.

For SPDX 3.0 we split this into two properties:

  • SoftwarePurpose to capture the purpose (which is of type SoftwarePurpose).
  • ContentType to capture the type of content (which is of type MediaType).

The name ContentType was chosen to mirror the Content-Type header in HTTP (which is also of type MediaType) and to express that this is describing the type of content (as opposed to metadata, headers, or something else). For example, if (and not saying we would) we extended File in the future to be able to capture the type of executable header a file has (e.g. ELF), that could also be of type MediaType but the property name might be ExecutableHeaderType.

goneall commented on August 20, 2024

FYI - I updated the SPDX 3 Migration Analysis with @iamwillbar's description above.

armintaenzertng commented on August 20, 2024

Thanks to @iamwillbar, I've updated the conversion table to include SoftwarePurpose as an alternative property to convert to.

This mainly leaves the problem of AUDIO, VIDEO and IMAGE as well as SPDX in the formats rdf-xml, xml and yaml.

goneall commented on August 20, 2024

Since we can't determine the exact format for Audio, Video or Image, I suggest we use the default application/octet-stream:

| SPDX2 File Type | MIME Content Type | SoftwarePurpose |
|---|---|---|
| SOURCE | | source |
| BINARY | application/octet-stream | |
| ARCHIVE | | archive |
| APPLICATION | | application |
| AUDIO | application/octet-stream | |
| IMAGE | application/octet-stream | |
| TEXT | text/plain | |
| VIDEO | application/octet-stream | |
| DOCUMENTATION | | documentation |
| SPDX | text/spdx for tag-value, application/spdx+json for json; what about the other formats? | |
| OTHER | | other |

goneall commented on August 20, 2024

@kestewart @armintaenzertng @zvr - Any objections?

zvr commented on August 20, 2024

Why lose the information of types?
I mean, even if you don't want to look at filenames for a hint (and decide it's an image/gif), you can always use a wildcard subtype image/* for what has previously been tagged as IMAGE.

And a more general question: are the issues discussed here recommendations on how to do a conversion from SPDXv2 data, or must-do-it-this-way?

goneall commented on August 20, 2024

> Why lose the information of types? I mean, even if you don't want to look at filenames for a hint (and decide it's an image/gif), you can always use a wildcard subtype image/* for what has previously been tagged as IMAGE.

If wildcards are valid, then let's go with that. I've updated the migration guide section.

> And a more general question: are the issues discussed here recommendations on how to do a conversion from SPDXv2 data, or must-do-it-this-way?

I would suggest these are all recommendations - I'll be adding an Annex to the spec repo based on the migration guide for review in the next couple of days.

goneall commented on August 20, 2024

No objections to the wildcard approach - the migration guide has been updated. Closing this issue as resolved.
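
For readers who want to try the recommendation, the following is a minimal sketch of the conversion discussed in this thread, not the migration guide's wording. It makes several assumptions: the function and dictionary names are hypothetical, `audio/*` and `video/*` are extrapolated from the `image/*` suggestion, and SPDX serializations other than tag-value and JSON are left unmapped because the thread does not settle them.

```python
# Hypothetical names; values follow the conversion table in this thread.
CONTENT_TYPE_BY_FILE_TYPE = {
    "BINARY": "application/octet-stream",
    "TEXT": "text/plain",
    "IMAGE": "image/*",
    "AUDIO": "audio/*",  # assumption: same wildcard pattern as image/*
    "VIDEO": "video/*",  # assumption: same wildcard pattern as image/*
}

PURPOSE_BY_FILE_TYPE = {
    "SOURCE": "source",
    "ARCHIVE": "archive",
    "APPLICATION": "application",
    "DOCUMENTATION": "documentation",
    "OTHER": "other",
}

# Only the two media types confirmed in the thread.
SPDX_CONTENT_TYPE_BY_FORMAT = {
    "tag-value": "text/spdx",
    "json": "application/spdx+json",
}


def convert_file_types(file_types, spdx_format=None):
    """Split SPDX2 fileTypes into (content type candidates, purposes)."""
    content_types = []
    purposes = []
    for file_type in file_types:
        if file_type == "SPDX":
            # Unknown or unresolved formats yield no content type.
            content_type = SPDX_CONTENT_TYPE_BY_FORMAT.get(spdx_format)
            if content_type is not None:
                content_types.append(content_type)
        elif file_type in CONTENT_TYPE_BY_FILE_TYPE:
            content_types.append(CONTENT_TYPE_BY_FILE_TYPE[file_type])
        elif file_type in PURPOSE_BY_FILE_TYPE:
            purposes.append(PURPOSE_BY_FILE_TYPE[file_type])
        else:
            raise ValueError(f"unknown SPDX2 file type: {file_type}")
    return content_types, purposes


# The Helm chart example from the first comment yields purposes only.
print(convert_file_types(["APPLICATION", "ARCHIVE"]))
# ([], ['application', 'archive'])
```

Returning a list of content type candidates rather than a single value is a design choice of this sketch: an SPDX2 file may carry several fileTypes, and the thread does not say how to pick one when more than one maps to a media type.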
