Comments (5)

SteveLasker commented on June 2, 2024

riffing a bit more:
One of the values of an OCI distribution based service is the named/content addressable storage.
The blobs are details, but not the “human” interaction model:
registry.io/namespace/artifact:tag

What would be interesting is how these named elements can be combined.

If a curl client wants to pull a binary, can it pass registry.io/namespace/binary:version?

If the notation client is invoked as a gate to a curl, it uses the same named reference: notation verify registry.io/namespace/binary:version

If a remote desktop video client wishes to get metadata, it can also use the same named reference.

The registry would serve a redirect to the content (blob) based on the type of request.

This model could preserve the common named reference, pointing at a manifest that describes the artifact, while also supporting the other scenarios like signatures, SBOMs and metadata.

Thoughts?

from artifacts.

awakecoding commented on June 2, 2024

The human-readable tag notation is not flexible enough for this. It's probably better if we simply parse the manifests to find the blob URLs, and then construct associative "file URLs" that combine the content address (digest) with a "human address" (file name + type), so the blob can be pulled as a regular file that the browser will recognize.

The spec uses the following URL structure to pull blobs:
/v2/<name>/blobs/<digest>

Here is what I suggest:
/v2/<name>/files/<digest>/<filename>

So let's say you have a manifest that refers to a PDF presentation, and you now have its digest. You could pull the blob and open it in a PDF viewer, but if you try opening the blob URL directly in a browser, it will likely just download the file instead of launching the browser's built-in PDF viewer, because the response doesn't carry the correct MIME type (application/pdf). Let's fix this with my suggestion:

/v2/<name>/files/<digest>/presentation.pdf

The OCI registry would use the last element of the URL as the file name, but serve the contents of the corresponding blob. With automatic mime types, ".pdf" can be served with "application/pdf" as the mime type, which should make it work inside the built-in PDF viewer of most browsers.
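
As a rough illustration of the proposal above, here is a minimal sketch of how a registry might interpret such a URL. The `/files/` route is only the suggestion from this comment, not part of the distribution spec; `FILES_ROUTE` and `resolve_file_request` are hypothetical names, and the digest pattern is a simplification of the real digest grammar.

```python
import mimetypes
import re

# Hypothetical route from the proposal: /v2/<name>/files/<digest>/<filename>
# The digest pattern is simplified (algorithm:hex), an assumption for this sketch.
FILES_ROUTE = re.compile(
    r"^/v2/(?P<name>.+)/files/(?P<digest>[a-z0-9]+:[a-f0-9]+)/(?P<filename>[^/]+)$"
)


def resolve_file_request(path, default_type="application/octet-stream"):
    """Return (name, digest, content_type) for a files URL, or None if it does not match."""
    match = FILES_ROUTE.match(path)
    if match is None:
        return None
    # The file name only selects the Content-Type; the digest still selects the blob.
    content_type, _encoding = mimetypes.guess_type(match["filename"])
    return match["name"], match["digest"], content_type or default_type


# Usage: the digest below is a made-up placeholder.
request_path = "/v2/net-monitor/files/sha256:" + "ab" * 32 + "/presentation.pdf"
name, digest, content_type = resolve_file_request(request_path)
# content_type is "application/pdf", so a browser can open its built-in viewer.
```

An unknown extension falls back to `application/octet-stream` here, which matches what a plain blob download behaves like today.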

The last improvement to discuss is how to explicitly specify a different MIME type instead of relying on default MIME type detection from the file name. We could add a query parameter or a request header for this.

What do you think?


SteveLasker commented on June 2, 2024

It’s getting closer.

We do have this general request for curling URLs from a few folks, including some internal Azure teams. I’m trying to find a way to meet the URL requirements while staying aligned with some principles around the distribution spec. Some of them aren’t written in the specs but are standard implementation details.

How far the distribution spec should diverge from these core concepts has been the source of much discussion. I want to recognize these challenges and keep expanding the capabilities, continuing the vision that distribution can be the base for most new package managers while maintaining some core principles.

Blob URLs are neither fixed over time nor tied to the same domain or URL as the artifact reference.

A distribution instance has two endpoints:

  • A REST endpoint for auth and discovery
  • A data endpoint for blob content delivery.

A user references: wabbit-networks.io/net-monitor:v1
The blob content (layers for container images) may be served from 1234567.blobs.core.cloud.io

See an example of ACR Dedicated Data endpoints

Distribution clients know how to negotiate this series of requests. A standard and simplified “happy path” would be:

  1. A client requests an artifact (by tag or digest)
  2. The registry responds with a manifest
  3. The client evaluates the blobs defined in the manifest. The blobs are wrapped in descriptors.
  4. Based on the digests (in the descriptors), the client checks whether it already has any of the blobs locally.
  5. The client identifies the missing blobs and sends the server a request for the URL of each one.
  6. Based on various factors, different blob urls are returned. Two requests for the same manifest, or different manifests, even in the same repo, may return different blob urls.
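
The client side of steps 3 to 5 can be sketched roughly as follows. This is a simplified illustration, not any particular client's implementation: it assumes an image-style manifest with `config` and `layers` descriptors, the function names are hypothetical, and the digests are placeholders. The URL it builds is the request to the registry's REST endpoint; the content itself may then be served from a different data endpoint.

```python
def blob_descriptors(manifest):
    """Collect the descriptors a manifest references (config plus layers)."""
    descriptors = []
    if "config" in manifest:
        descriptors.append(manifest["config"])
    descriptors.extend(manifest.get("layers", []))
    return descriptors


def missing_digests(manifest, local_digests):
    """Return digests referenced by the manifest that are not held locally, in order."""
    seen = set(local_digests)
    missing = []
    for descriptor in blob_descriptors(manifest):
        digest = descriptor["digest"]
        if digest not in seen:
            seen.add(digest)  # also avoids requesting a repeated blob twice
            missing.append(digest)
    return missing


def blob_url(registry, name, digest):
    """Build the spec's blob request URL: /v2/<name>/blobs/<digest>."""
    return f"https://{registry}/v2/{name}/blobs/{digest}"


# Usage with placeholder digests: one layer is already on the client.
manifest = {
    "config": {"digest": "sha256:aaa"},
    "layers": [{"digest": "sha256:bbb"}, {"digest": "sha256:ccc"}],
}
to_fetch = [
    blob_url("wabbit-networks.io", "net-monitor", d)
    for d in missing_digests(manifest, {"sha256:bbb"})
]
```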

Reasons for differences:

  • A geo-replicated registry may return regionalized data endpoints for a traffic-routed registry endpoint: [registry].[region].data.azurecr.io. (The wabbitnetworks.azurecr.io registry endpoint may have wabbitnetworks.eastus.data.azurecr.io and wabbitnetworks.westeu.data.azurecr.io data endpoints.)
  • Similar to geo/region replication, zone routing is also done.
  • For tax & trade compliance requirements, a global request must be served within that geo-fenced region. (This was one of the requirements that led to mcr.)
    • When a paying customer purchases software from Australia, it must be billed and delivered from within that region.
  • Same as above, but some customers need their data limited to sovereign boundaries
  • A CDN-backed registry may return a CDN-backed blob URL, where another registry, or even another repo on the same registry, may serve cloud blob URLs directly. Docker Hub and other cloud providers have made CDN changes over time.
  • Some cross-cloud registries have used blob URL re-routing to deliver the expensive (network egress) blobs from within that cloud's data center/region.
  • Windows foreign layers use this model to serve windows layers from mcr.microsoft.com, regardless of where the manifest is located. (this is actually a problematic one we’re working to undo)

What you're asking for is something I'm hoping we can solve. I’m just searching for a solution that keeps the benefits above, uses the same URL to get supporting artifact types (signatures, SBOMs, scan results), and stays true to the core capabilities of the distribution spec, which has scaled to many scenarios.

What you, and others, are asking for is a way for the registry to redirect a request, rather than the client having to negotiate the manifest content.

Perhaps @sajayantony, @stevvooe or @jonjohnsonjr might have some ideas on how we can redirect requests, based on the mediaType in the header.
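
To make the redirect idea concrete, here is a small sketch of how a registry could choose a redirect target from the media types a client lists in its `Accept` header. Nothing here is defined by the distribution spec: `pick_redirect`, the `application/vnd.example.*` media types, and the blob paths are all made up for illustration, and quality values (`q=`) and wildcards are ignored for brevity.

```python
def pick_redirect(accept_header, targets):
    """Return the blob URL for the first requested media type the registry can serve.

    `targets` maps a media type to the blob URL the registry would redirect to.
    Returns None when nothing matches, so the caller can fall back to the manifest.
    """
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.8"; this sketch does not rank by quality.
        media_type = item.split(";", 1)[0].strip()
        if media_type in targets:
            return targets[media_type]
    return None


# Usage: hypothetical supporting artifacts for one named reference.
targets = {
    "application/vnd.example.signature": "https://1234567.blobs.core.cloud.io/sig",
    "application/vnd.example.sbom+json": "https://1234567.blobs.core.cloud.io/sbom",
}
location = pick_redirect("application/vnd.example.sbom+json, */*;q=0.1", targets)
# `location` would become the Location header of a redirect response.
```

Because the redirect target is computed per request, the registry stays free to return regional, CDN, or sovereign data endpoints as described above.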


mikebrow commented on June 2, 2024

We are archiving this repo, and this issue will thus become read-only. Would you like to move it to the distribution spec repo?


mikebrow commented on June 2, 2024

Closing for now due to the pending archive action. Please reopen if the archive is not completed and/or if you believe this close to be in error.

