Comments (3)

ejweber commented on May 29, 2024

Hello @egyptianbman. When I originally read your analysis, it sounded correct. Thanks to @PhanLe1010 (who I discussed it with) for clueing me in to the following!

When Longhorn purges snapshots after a snapshot deletion, the following steps are triggered:

So, to make a long story short, even though the oldest snapshot file appears to be removed in favor of the next oldest snapshot file, data is actually copied in reverse (from the newer snapshot file to the older one). Then, some file system metadata operations take place to make things look "right".

I would not expect the feature you are requesting to speed up this process. If you have the following snapshots:

  • snap1 (contains 1 TiB of writes)
  • snap2 (contains 5 GiB of writes)
  • snap3 (contains 5 GiB of writes)
  • head (contains 5 GiB of writes)

Both of these operations are virtually identical from a Longhorn perspective:

  • Deleting snap1 actually involves copying 5 GiB of data from snap2 to snap1, then "renaming" the resulting file to snap2.
  • Deleting snap2 actually involves copying 5 GiB of data from snap3 to snap2, then "renaming" the resulting file to snap3.
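
To make the comparison concrete, here is a minimal Python sketch of the copy-then-rename behavior described above. It is a toy model, not Longhorn's actual code: the names `delete_snapshot` and `build_chain` are hypothetical, each snapshot is modeled as a dict of block index to contents, one block stands in for 1 GiB, and the exact block ranges are made up for illustration. The sketch also ignores any special handling Longhorn applies to the volume head.

```python
def delete_snapshot(chain, name):
    """Remove `name` from an oldest-first chain of sparse layers.

    chain is a list of (layer_name, blocks) pairs, where blocks maps a
    block index to its contents. Returns the number of blocks copied.
    """
    names = [layer_name for layer_name, _ in chain]
    i = names.index(name)
    if i + 1 >= len(chain):
        raise ValueError("no newer layer to merge from in this sketch")

    _, older = chain[i]
    newer_name, newer = chain[i + 1]

    # Data moves from the NEWER layer into the OLDER one, so the cost
    # depends on the size of the newer layer only.
    copied = 0
    for index, data in newer.items():
        older[index] = data
        copied += 1

    # The merged result then takes over the newer layer's name.
    chain[i:i + 2] = [(newer_name, older)]
    return copied


def build_chain():
    # Assumed layout: 1 block == 1 GiB, so snap1 holds 1 TiB of writes.
    snap1 = {i: "snap1" for i in range(1024)}
    snap2 = {i: "snap2" for i in range(5)}
    snap3 = {i: "snap3" for i in range(5, 10)}
    head = {i: "head" for i in range(10, 15)}
    return [("snap1", snap1), ("snap2", snap2), ("snap3", snap3), ("head", head)]


for victim in ("snap1", "snap2"):
    chain = build_chain()
    copied = delete_snapshot(chain, victim)
    print(victim, copied, [layer_name for layer_name, _ in chain])
```

In this model both deletions copy 5 blocks, even though snap1 is roughly 200 times larger than snap2, which is why deleting the oldest snapshot is no more expensive than deleting the next one.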

Please chime in to keep me honest if I've said something incorrect @PhanLe1010 and @longhorn/dev-data-plane!

egyptianbman commented on May 29, 2024

Thank you so much for such a thorough response! Are the snapshots always growing since Longhorn is essentially merging oldest-1 into oldest? Is oldest then actually always the delta from the first snapshot to the oldest-1?

ejweber commented on May 29, 2024

Are the snapshots always growing since longhorn is essentially merging oldest-1 into oldest?

Each snapshot has a maximum size equal to the nominal size of the volume. If oldest-1 contains writes to blocks that were not in oldest, then the actual space consumed by oldest increases accordingly during the coalescing operation. However, if oldest already consumes the full nominal size of the volume, the coalescing operation simply overwrites blocks.

In the "worst" case, coalescing does not reduce the actual size, because oldest contained some blocks and oldest-1 contained other blocks. Coalescing deletes a file, but all of the data from the snapshots must be retained. In most cases, oldest-1 mostly contains changed blocks that are also in oldest. So coalescing allows us to get rid of the outdated copy of the blocks in oldest, reducing actual space consumption.
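
The space accounting above can be sketched with plain set arithmetic. This is an illustrative model only: `coalesced_size` is a hypothetical helper, and each snapshot is reduced to the set of block indices it has allocated.

```python
def coalesced_size(older_blocks, newer_blocks):
    """Return (blocks in the merged snapshot, blocks reclaimed by merging)."""
    merged = older_blocks | newer_blocks
    # Blocks present in both layers are the outdated copies in the older
    # layer; they get overwritten, so their space is reclaimed.
    reclaimed = len(older_blocks) + len(newer_blocks) - len(merged)
    return len(merged), reclaimed


# "Worst" case: disjoint blocks, so nothing is reclaimed.
print(coalesced_size({0, 1, 2}, {3, 4}))

# Typical case: the newer layer mostly rewrites blocks already in the older one.
print(coalesced_size({0, 1, 2, 3}, {1, 2}))

# Older layer already at the nominal volume size (8 blocks here): pure overwrite.
print(coalesced_size(set(range(8)), {0, 7}))
```

The merged size is the size of the union, which can never exceed the nominal volume size, and the space saved is the size of the overlap between the two layers.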

Is oldest then actually always the delta from the first snapshot to the oldest-1?

I'm not sure I follow.

  • oldest is a self-contained snapshot of the volume as it existed at some point in the past. Maybe there were previous snapshots, but all history of them has been lost during previous coalescing operations.
  • oldest-1 has the changes to the volume since oldest. If you had oldest-1 but not oldest, you would have only a corrupted mess of blocks and that almost certainly would not contain a valid file system.
  • oldest-2 has the changes to the volume since oldest-1. If you had oldest-2 but not oldest AND oldest-1, you would have only a corrupted mess of blocks and that almost certainly would not contain a valid file system.
  • volume-head has the changes to the volume since the most recent snapshot. If you had volume-head but not all of the previous snapshots, you would have only a corrupted mess of blocks and that almost certainly would not contain a valid file system.

In some sense, oldest is the most important snapshot in the chain. If you have it, you have SOMETHING useful, even if it is out-of-date. All of the others are just differences from the previous one. It would be very unlikely (though technically possible) for your data to be intact if you lost the oldest snapshot (outside of intentional coalescing operations performed by Longhorn, of course).
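
One way to picture why the chain depends on oldest is a toy read path that searches the layers from newest to oldest. This is a sketch under assumptions, not Longhorn's implementation: `read_block` is a hypothetical helper, layers are dicts of block index to bytes, and a block that no layer holds is assumed to read back as zeros, the way a hole in a sparse file would.

```python
def read_block(chain, index, block_size=4):
    """Read one block from an oldest-first chain of sparse layers."""
    # The newest layer that contains the block wins.
    for layer in reversed(chain):
        if index in layer:
            return layer[index]
    # Assumption: a block missing from every layer reads as zeros.
    return b"\x00" * block_size


oldest = {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}  # self-contained
oldest_1 = {1: b"bbbb"}                        # changes since oldest
volume_head = {2: b"cccc"}                     # changes since oldest-1

full_chain = [oldest, oldest_1, volume_head]
broken_chain = [oldest_1, volume_head]         # oldest has been lost

print([read_block(full_chain, i) for i in range(3)])
print([read_block(broken_chain, i) for i in range(3)])
```

With the full chain every block resolves to its latest contents. Without oldest, block 0 comes back as zeros because only oldest ever held it, which is the "corrupted mess of blocks" situation described in the list above.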
