
mixt's People

Contributors

mpmilano

mixt's Issues

Language/runtime: "context" enum

This should be part of a larger context object which is passed around as a first-class item and includes a pointer to the current store-specific TransactionContext(s).
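A minimal sketch of what that first-class context object might look like; Context, ConsistencyLevel, and for_store are hypothetical names, not the project's real types:

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical sketch: the "context" enum folded into a first-class
// context object that also carries per-store TransactionContexts.
enum class ConsistencyLevel { causal, linearizable };

struct TransactionContext {  // stand-in for a store-specific context
    std::string store_name;
};

struct Context {
    ConsistencyLevel level;
    // one TransactionContext per store participating in the transaction
    std::map<std::string, std::shared_ptr<TransactionContext>> txn_contexts;

    std::shared_ptr<TransactionContext> for_store(const std::string& store) const {
        auto it = txn_contexts.find(store);
        return it == txn_contexts.end() ? nullptr : it->second;
    }
};
```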

Language/runtime: HandleCaching review

There's something fishy about the way I just fixed HandleCaching. In particular:

  • we'll fetch too many times, probably
  • we still need lots of testing here.

Merge on eiger is unimplemented

I kid you not, and have spent an embarrassing amount of time trying to be wrong about this. It was there in COPS, but it's missing from Eiger. Eiger is a pure last-writer-wins database, which is actually a really poor match for weak consistency in general. Among other things, my demo applications really won't work under this regime.

Last-writer is deterministic based on their logical time-stamping, which is also fairly low-resolution. This may admit local history inversions; i.e., I design my system so that it makes a decision if a value is over a certain threshold, but due to conflicting updates only one replica site is able to observe that value being over that threshold, despite the fact that no operation intends to decrease the value.
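A toy illustration of such an inversion, with made-up numbers: replica A increments 8 to 11 (over a threshold of 10), replica B concurrently increments 8 to 9 with a later timestamp, and last-writer-wins discards A's write even though no operation decreased the value:

```cpp
// Illustrative only: under last-writer-wins, the write with the later
// timestamp simply replaces the other; nothing is merged.
struct Write { long timestamp; int value; };

Write lww_merge(const Write& a, const Write& b) {
    return a.timestamp >= b.timestamp ? a : b;
}
```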

macro simplification: multimethods.

Context: I'm retaining the Level information at the RemoteObject point, not just at the handle point, so that I can specify extra requirements on causal remoteobjects. This means I need to change the code which resolves calls to operations as multimethods. Surprising nobody, the resolution process for multimethods is really hard to follow. I've made the decision to restrict the parameters that declared operations can take in order to simplify the resolution code.

If I get time (unlikely), I should probably either re-generalize the resolution code or add further constraints to declarations in order to simplify it.

don't forget: forbidden merge function

You just thought of having a default-merge function which, in the absence of a static "merge" function, will fall back to forbidding merges in general; i.e., crash the program with an error saying "divergent history for non-mergeable object foo encountered."

It would be really nice if we had a way of statically preventing this, but we honestly don't other than forbidding the creation of causal objects which fail to have some merge() defined on them. Which I honestly don't think I want to do.
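A sketch of that default using a C++17 detection idiom; has_static_merge and merge_or_forbid are illustrative names, not the real codebase's:

```cpp
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical sketch of the "forbidden merge" default: if T declares a
// static merge(), use it; otherwise diverging histories are a fatal error.
template <typename T, typename = void>
struct has_static_merge : std::false_type {};

template <typename T>
struct has_static_merge<
    T, std::void_t<decltype(T::merge(std::declval<T>(), std::declval<T>()))>>
    : std::true_type {};

template <typename T>
T merge_or_forbid(const T& a, const T& b, const std::string& name) {
    if constexpr (has_static_merge<T>::value) {
        return T::merge(a, b);
    } else {
        throw std::runtime_error(
            "divergent history for non-mergeable object " + name + " encountered");
    }
}
```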

Language: disable type coercion

That thing where we can upgrade the handle's level to ro / strong or wo / causal is broken w.r.t. assumptions made in transaction splitting. We should probably just delete that code, honestly.

FileStore: nameDecoder is in-memory; cannot currently retrieve existingObject created in previous run of program.

If I create an object via FileStore, I translate the given int name into a string and store a mapping from that int to that string. When calling existingObject(int) later, I check that mapping for the int and assert-fail if I can't find it. That's fine for a single run, but if we try to persist objects across runs they'll be un-nameable.

Obvious solution is to just store this mapping on the filesystem itself, and reference it from there. But FileStore is for debugging rather than actual use, which means this probably doesn't need to be fixed.
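A sketch of the obvious solution, assuming a trivial whitespace-separated on-disk format; NameDecoder and its layout here are stand-ins for the real FileStore internals:

```cpp
#include <fstream>
#include <map>
#include <optional>
#include <string>

// Hypothetical sketch: persist the nameDecoder's int -> string mapping on
// the filesystem so existingObject(int) can work across runs.
struct NameDecoder {
    std::map<int, std::string> mapping;
    std::string path;

    void save() const {
        std::ofstream out(path, std::ios::trunc);
        for (const auto& [id, name] : mapping)
            out << id << ' ' << name << '\n';
    }

    void load() {
        std::ifstream in(path);
        int id; std::string name;
        while (in >> id >> name) mapping[id] = name;
    }

    std::optional<std::string> lookup(int id) const {
        auto it = mapping.find(id);
        if (it == mapping.end()) return std::nullopt;
        return it->second;
    }
};
```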

Ends: broken?

We need to write causal metadata alongside each causal write of an object. This is so obvious it's almost a tautology. Right now each object has exactly one metadata slot, which we plop our ends object into. This is wrong, because we might have a stale view of other replicas' timestamps, and doing this could over-write them.

The best way to fix this would be for Ends to have semantics at the datastore itself. There are probably other ways to fix it too, but they seem slow.

Things were much easier when we just tracked everything by accident.

Language/runtime: transactionContext association

Right now, TransactionContexts are stored in the various remoteObjects that are currently undergoing the transaction. This will break if we have concurrent or nested transactions. A better idea is to modify levelCall(...) to take in the transaction context along with the Cache and Store. As a stopgap, we could implement this by storing the TransactionContext in the Cache or Store.
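A sketch of the proposed signature, with Cache, Store, and TransactionContext reduced to stand-in types; the depth counter is only there to illustrate that nesting becomes explicit once the context travels with the call:

```cpp
// Stand-in types; the real Cache/Store/TransactionContext are richer.
struct Cache {};
struct Store {};
struct TransactionContext { int depth = 0; };

// Proposed: the transaction context travels with the call, so nested or
// concurrent transactions each carry their own TransactionContext instead
// of stashing it inside the RemoteObjects.
template <typename F>
auto levelCall(Cache& c, Store& s, TransactionContext& txn, F&& op) {
    ++txn.depth;               // nesting is now visible at the call site
    auto r = op(c, s, txn);
    --txn.depth;
    return r;
}
```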

Language/Story: why are we splitting transactions?

It seems like we should be able to accomplish the sort of transactions we're currently supporting without splitting transactions up. Let's explore exactly what benefits can come from splitting transactions.

Hygiene: unify contexts

We have two notions of context right now: one is the state of a running transaction, and the other is the manner in which we should be treating encountered handles. We just refactored the code to allow the first to be passed around during transaction execution; the second is still looked up via "magic number" in the execution Cache. It would be better to make it a member of the transaction context.

Metadata structures: allow custom implementations

Really just for Ends. If we can implement it with a datastructure whose "native" merge semantics match the behavior we want, then everyone wins, especially because I have no idea how Eiger would merge the Ends structure otherwise.

Metadata explosion: we are not updating ends

This is a fairly big-deal bug, but it does go a long way towards explaining why the tracking code has been so terribly slow. Here are the two components of tracking:

  • linearizable writes need to explicitly list causal dependencies
  • causal reads need to read from stores that can satisfy causal dependencies.

We satisfy the second point with tombstones - when you encounter the tombstone for a lin. operation, you have access to all its dependencies. This is intuitive, fast, and sound; everybody wins!

We satisfy the first one with a vector clock named "ends." This tracks a global "known throughout the world" clock, which serves to limit the number of causal operations a particular lin. write needs to track. For some reason, we've actually never been updating this clock - not even in the java prototype code. Which means the tracking set grows without bound.

Actually incrementing ends is not hard; the problem is figuring out when to do so. Also it's probably masking bugs, because that's how this project seems to go.
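A sketch of how the ends clock could be maintained, assuming per-replica vector clocks; update_ends takes the pointwise minimum of what each replica reports having seen, and any dependency covered by ends needs no explicit tracking. All names here are illustrative, not the real tracking code:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One entry per replica; "ends" is the latest timestamp known to have
// reached everyone, i.e. the pointwise minimum over all replicas' reports.
using Clock = std::vector<long>;

Clock update_ends(const std::vector<Clock>& reported) {
    Clock ends(reported.front().size(), 0);
    for (std::size_t i = 0; i < ends.size(); ++i) {
        long m = reported.front()[i];
        for (const auto& r : reported) m = std::min(m, r[i]);
        ends[i] = m;
    }
    return ends;
}

// A dependency at or below "ends" is known everywhere: drop it from the
// tracking set, which is what keeps the set from growing without bound.
bool covered_by_ends(const Clock& dep, const Clock& ends) {
    for (std::size_t i = 0; i < dep.size(); ++i)
        if (dep[i] > ends[i]) return false;
    return true;
}
```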

Notes on VM setup

These aren't really associated with this repo, but I left my notepad in the department, so I'll leave them here regardless.

tc lets you introduce latency, but it'll affect all traffic going out a specific interface, which we don't really want. So what we're going to do is use a USB ethernet adapter (from the department) to add a new interface to both desktops, put these on the 192.168.1.x subnet, and then route all replication traffic over them. This should simulate serious delays in postgres replication.
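A sketch of the intended setup, assuming eth1 is the USB adapter and 192.168.1.2/.3 are the two desktops; interface names and addresses are illustrative:

```shell
# Put the USB adapter on the dedicated replication subnet (on each desktop):
sudo ip addr add 192.168.1.2/24 dev eth1
sudo ip link set eth1 up

# Route only the peer's replication address over the delayed interface:
sudo ip route add 192.168.1.3/32 dev eth1

# Add latency on that interface alone, leaving other traffic untouched:
sudo tc qdisc add dev eth1 root netem delay 100ms

# Then point postgres replication at the peer's 192.168.1.x address, so
# only replication traffic sees the added delay.
```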

CrossStore: merging objects

CrossStore's design assumes all objects stored at causal locations have some native ability to be merged. This isn't enforced or exposed right now. Ideas:

  • assume stores support a merge() function on objects
  • always take from most up-to-date remote store (for the specific constraint we're optimizing)
  • forbid storage of objects which do not implement some Mergable interface.
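A sketch of the third option, using a CRTP-style marker and a static_assert; Mergable and causal_put are illustrative names, not the real CrossStore API:

```cpp
#include <type_traits>

// Hypothetical marker interface: deriving from Mergable<T> is the
// contract that T knows how to merge with itself.
template <typename T>
struct Mergable {
    // derived types are expected to provide: T merge_with(const T&) const
};

// Storing at a causal location is rejected at compile time unless the
// object type opted into the Mergable contract.
template <typename T>
void causal_put(const T& /*obj*/) {
    static_assert(std::is_base_of<Mergable<T>, T>::value,
                  "objects stored at causal locations must be Mergable");
    // ... hand off to the underlying store ...
}
```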

Hygiene: why are EnvironmentExpressions duplicated?

When we search for environment expressions within a transaction, the search comes back with serious duplication. This isn't actually a problem now (so long as it's duplication and not phantom values), but would be if we allowed variable arity for environment expressions. Probably just got lazy with a tuple_cat somewhere, but it will be annoying to track down.

File under: TODO if we ever need variable arity.

Ends: causal metadata with native merge

I should really allow a "native" implementation of Ends which is store-specific and can take advantage of the causal store's natural merge semantics. Right now I have an efficient vector-based map for Ends, which is fine but means the "natural" merge function is the wrong merge function.

namespaces

The transactions language should be in a separate namespace from everything else because of operator overloading. The whole project should be in some non-global namespace.

re-write temporary bindings (let_ifvalid and let_mutable) for new drefing paradigm

Pursuant to a conversation with ACM, I should replace the let_ifValid and let_Mutable constructs with new constructs more focused on referencing/dereferencing:

  • let_ifValid() becomes a dereferencing bind; the bound variable mutates the remote object when assigned to, and is drefd in free_exprs and operations.
  • let_mutable() becomes a box-bind; the bound variable mutates the pointer itself, and is preserved as a raw handle in free_exprs and operations.

Note: we should likely change the syntax of free_expr; there's no need to "dref" a handle syntactically, as that behavior is now dependent on how it was bound.

Note: this should be a large re-write (if done correctly). Will defer until after the current context work and store implementations.

CrossStore: replicate commit

Right now, commit happens only at the client. Obviously not fault-tolerant.

Design for this was: have a bunch of lightweight workers co-located with the stores. Each of these serves the single purpose of accepting commit messages and carrying them out.

Revisiting tracking

See email to andru, 11/4/2015, for an explanation of the problem.

What would be sound: if objects have vector-clock version numbers, we can only drop an extra causal replica once every object observed from that replica has an equal-or-older version at the local replica. This seems unlikely to happen.
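A sketch of that soundness condition, assuming per-object vector-clock version numbers; all names are illustrative:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Version = std::vector<long>;

// Pointwise "a is equal-or-older than b" on vector-clock versions.
bool leq(const Version& a, const Version& b) {
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] > b[i]) return false;
    return true;
}

// The extra replica may be dropped only when, for every object we
// observed from it, our local replica holds an equal-or-newer version.
bool can_drop_replica(const std::map<std::string, Version>& observed_from_replica,
                      const std::map<std::string, Version>& local_versions) {
    for (const auto& [name, seen] : observed_from_replica) {
        auto it = local_versions.find(name);
        if (it == local_versions.end() || !leq(seen, it->second)) return false;
    }
    return true;
}
```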

Natural result of the read-from-extra-replicas scheme is swiftly imposing full quorum reads on the entire cluster, thus upgrading the causal store to a linearizable store.

Could always abandon tracking entirely. Then the composite schedule of causal + linearizable would not be causally consistent, even though each fragment of the schedule would be.
