Comments (6)

fjmilens3 avatar fjmilens3 commented on May 28, 2024

@TheBay0r, when you're testing this, are you modifying the file in any way between uploads? (For example, are you reuploading the exact same file without any changes at all, or are you changing it in some way before reuploading?)

from nexus-repository-composer.

TheBay0r avatar TheBay0r commented on May 28, 2024

@fjmilens3 I wasn't changing the files in between. I'll test whether slightly changing the content of the zip has an impact on the generated JSON.

TheBay0r avatar TheBay0r commented on May 28, 2024

@fjmilens3 So I tested this case. When the zip file contains a change it seems that the json is rebuilt

fjmilens3 avatar fjmilens3 commented on May 28, 2024

@TheBay0r:

So I tested this case. When the zip file contains a change it seems that the json is rebuilt

This is something we're going to have to live with because of other factors related to suppressing unneeded rebuilds, but fortunately it shouldn't be much of a problem.

Explanation

Every time a new upload comes in for an existing artifact, we generate an update event within the data layer:

https://github.com/sonatype/nexus-public/blob/e9668b4f9aeff4a19c263c40121ad40e2780182a/components/nexus-orient/src/main/java/org/sonatype/nexus/orient/entity/EntityHook.java#L272

We then receive this event and if certain conditions hold, we use that to generate a rebuild event for the metadata:

https://github.com/sonatype-nexus-community/nexus-repository-composer/blob/master/src/main/java/org/sonatype/nexus/repository/composer/internal/ComposerHostedMetadataFacetImpl.java#L82

However, since these are database-level events, other events, notably downloads, can also cause the same record to be updated (as we have to increment the last downloaded time, etc.).

We don't want to rebuild metadata in that case, and until we have a better application-level solution, our preferred workaround is to check whether the blob was updated within a short period of time.

If it has, we assume that the update was the result of the blob changing, and if it has not, we infer that it was the result of a download (or other operation) that touched the asset but should not force a rebuild of any associated metadata:

https://github.com/sonatype-nexus-community/nexus-repository-composer/blob/master/src/main/java/org/sonatype/nexus/repository/composer/internal/ComposerHostedMetadataFacetImpl.java#L116
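To make the heuristic concrete, here is a minimal sketch of a "was the blob updated recently?" check. It is not the code behind the link above: the class name, the method name, and the one-minute window are all assumptions chosen for illustration, and the real implementation may use different types and a different threshold.

```java
import java.time.Duration;
import java.time.Instant;

public class RebuildHeuristic {
    // Hypothetical window; the actual threshold is whatever the linked code uses.
    static final Duration WINDOW = Duration.ofMinutes(1);

    /**
     * Returns true when the blob was updated within WINDOW of "now", in which
     * case the asset update is attributed to new content rather than to a
     * download or other operation that merely touched the record.
     */
    static boolean shouldRebuild(Instant blobUpdated, Instant now) {
        if (blobUpdated == null) {
            return false; // no blob timestamp, so nothing suggests new content
        }
        return blobUpdated.isAfter(now.minus(WINDOW));
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-05-28T12:00:00Z");
        // Asset update arrives seconds after the blob changed: treat as an upload.
        System.out.println(shouldRebuild(now.minusSeconds(5), now));
        // Blob last changed hours ago: the update was probably a download.
        System.out.println(shouldRebuild(now.minus(Duration.ofHours(3)), now));
    }
}
```

The trade-off of a time window like this is that it can misclassify a download that happens immediately after an upload, but that only causes one redundant rebuild rather than a missed one.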

Along with the above, Nexus Repository Manager tries to deduplicate blobs (for the same asset), such that if we receive a blob for an asset that's identical to the blob we already have for that asset, we don't have any churn at the storage level. As a practical matter, that means that the blob was not updated, so the blob updated timestamp is not updated either:

https://github.com/sonatype/nexus-public/blob/729ac4987d99f581e6ff95a2c1b92945057107aa/components/nexus-repository/src/main/java/org/sonatype/nexus/repository/storage/StorageTxImpl.java#L722
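The interaction between deduplication and the timestamp can be sketched as follows. This is a simplified illustration, not the StorageTxImpl code linked above: the `Asset` class, its fields, and the `attach` method are hypothetical stand-ins, and comparing a single hash string is an assumption about how "identical" is decided.

```java
import java.time.Instant;

public class BlobDedupSketch {
    /** Hypothetical stand-in for an asset record. */
    static final class Asset {
        String sha1;          // hash of the currently stored blob
        Instant blobUpdated;  // when the stored blob last changed
        Instant lastTouched;  // when the record itself was last written
    }

    /**
     * Attaches uploaded content to the asset. Returns true only if the stored
     * blob actually changed; an identical upload leaves blobUpdated untouched.
     */
    static boolean attach(Asset asset, String incomingSha1, Instant now) {
        asset.lastTouched = now; // the record is written either way
        if (incomingSha1.equals(asset.sha1)) {
            return false; // identical content: keep the existing blob and timestamp
        }
        asset.sha1 = incomingSha1;
        asset.blobUpdated = now;
        return true;
    }

    public static void main(String[] args) {
        Asset asset = new Asset();
        Instant first = Instant.parse("2024-05-28T10:00:00Z");
        Instant later = Instant.parse("2024-05-28T12:00:00Z");

        attach(asset, "abc123", first);                       // initial upload
        boolean changed = attach(asset, "abc123", later);     // identical reupload
        System.out.println(changed);                          // blob did not change
        System.out.println(asset.blobUpdated.equals(first));  // timestamp is still the first upload's
    }
}
```

Because the blob-updated timestamp stays at the time of the first upload, a recency check on that timestamp treats the identical reupload the same way it treats a download.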

The end result is that you won't have any metadata rebuilt because the blob has not changed. Of course, there are changes within the repository manager itself that could be used to handle this, or we could broadcast custom events from within the content facet; however, I'm trying to minimize the divergence between the approach we have here and the approach we have in our supported/proprietary format implementations.

Conclusions

Under normal circumstances this won't matter: if the blob hasn't changed, the generated metadata would not change either (at least not in any meaningful way), since the metadata is extracted from the content of the blob (in our case, the composer.json file in the archive).

However, under unusual circumstances it can be advantageous to have a scheduled task within Nexus that can rebuild metadata for all or part of a repository's contents. This is typically useful either to mitigate the effects of some breaking change or to recover from some unexpected situation where the generated metadata is inaccurate or incomplete (a special case of which would be the scenario you first encountered, where I'd made breaking changes to the metadata generation in that PR, but you didn't see the metadata regenerate after reuploading the same artifact).

If you have no objections (and are satisfied with the above explanation), I would like to consider this "closed" and implement the aforementioned scheduled task in #21 before we someday promote this to 1.0.0.

TheBay0r avatar TheBay0r commented on May 28, 2024

Ah, wow! Thank you for the detailed explanation. My naive assumption was that it would just be hooked up to the POST request, and whenever a POST request comes in an update is triggered no matter what 🤔
But this approach makes sense of course! 🙂

From my point of view this one can be closed, thanks.

fjmilens3 avatar fjmilens3 commented on May 28, 2024

Closing based on conversation with @TheBay0r.
