Code Monkey home page Code Monkey logo

Comments (1)

sbahra avatar sbahra commented on August 24, 2024

Hey @huonw and @aturon,

First off, thanks much for reviewing the ck_epoch code. However, I believe it is correct. It's fine to be many generations old as long as you don't actually physically destroy objects in this situation or tick the epoch (unless that snapshot is from an older grace period), which is the central concept. The academic implementations have readers tick over the epoch, where following this constraint to the letter matters since they might otherwise be deferring to stale limbo lists (this also implies significant serialization penalty on some write paths). All epoch ticks here occur outside of protected sections and on epoch snapshots with respect to the individual writers.

The point of the fence is ordering with respect to the active flag and not the epoch counter. The epoch counter will be either observably old or has passed a grace period (wrap-around). As for the limbo lists, remember we only garbage collect a limbo list as a result of a successful tick (from the calling thread or another) with respect to the callers observation of the global epoch. Also, the limbo lists are local to every caller of ck_epoch_call/ck_epoch_poll. I think ck_epoch_poll is way too conservative but I'm not confident in relaxing some of the constraints yet (primarily because once I have time to revisit it, there is a better and more-finely grained interface to implement poll).

Let's first tackle ck_epoch_begin ordering semantics. The Concurrency Kit implementation does not apply modulo for the epoch distance calculation and relies on global ordering of memory operations with respect to the epoch tick, this allows for much better amortization across writers trying to detect a grace period or tick (fewer write activity). The memory fence is really about ordering protected sections with respect to setting the active flag and nothing to do with the epoch counter. The epoch counter counter observed is either current, observably old or has passed a grace period.

For example, let's take the execution history you mention where we read an old epoch and stop executing before setting the active flag. Imagine now that it is some handful of generations old. This would be detectable unless the epoch was ticked UINT_MAX generations. For example, global epoch might be 12 but our snapshot was from 4. This doesn't break correctness, because protected section's load operation is either from epoch 4 or current (up to previous generation). Eitherway the epoch counter will not tick because the processor doing the epoch scan will also either see 4 or something current. What about overflow in the tick? If it was ticked UINT_MAX generations, we know that at least a grace period has passed. Since we have a full fence and global memory ordering, we also know that the hazardous references (contents of section between begin and end) would occur with respect to either the current global epoch or the previous, so the epoch invariant is maintained. Again, if that isn't the case, then the epoch counter would not tick to begin with.

Alright, what about ck_epoch_poll? Things get a little bit hairy here because we are applying the modulo semantics. There is a full memory fence up top. Any updates on the writer's side (remember, you have per-writer limbo lists, not global limbo lists) are visible. If all threads have active = 0, then we know that any "stale" epoch (arbitrarily many generations apart) is irrelevant because they haven't hit their fence yet with respect to the fence in ck_epoch_poll (similarly in ck_epoch_synchronize).

We know that any wrap-around for a position in the limbo list represents a grace period with respect to the contents of the limbo list. Given the epoch snapshot of the caller of ck_epoch_poll, we can apply this same invariant to the epoch that we incremented to (but not incremented from, unless it was mutated due to a successful tick which also indicates a grace period with respect to the caller and limbo list offset). This is all assuming that ck_epoch_call defers objects to the same epoch as ck_epoch_poll. What if that isn't the case with respect to readers? For example, I might defer to 2 as a writer but a reader is at "2" (limbo list position) from a long time ago. In this That doesn't matter either because we only collect a limbo list if we have observed an actual tick occur and we always only collect to the tick we mutated to (which implies at least e + 2).

But is there a reason not to move the update of the record past the fence and after active? If so, it would likely be some negligible performance reason (gain / loss depending on protected section).

I would like to note that I think is ck_epoch_poll is utter-shit as far as incremental reclamation semantics, it is best-effort, too conservative and more coarse-grained than I would like. There's definitely room for improvement here and any efforts to take a stab at this are welcome. I think simply adding more limbo lists might solve a lot of the issues along with reclaiming up to modulus e - 2. We use 4 here primarily for performance (modulus on fast path of call sucks when deleting objects in bulk).

Does this clarify things for you? I'll be around #rust-libs for a few in case you want to chat about it.

from ck.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.