Comments (3)
> Nonetheless, seems like a good idea to build the partially robust operator, which orders by times and, if we want to support partially ordered times, probably maintains runs of `less_equal` times which do admit binary search (but all of our updates may be an arbitrarily large number of runs).
I implemented this idea as part of MaterializeInc/materialize#25720 and also studied the linked paper out of interest. I noticed an interesting property that I thought I'd share.
In the paper the authors talk about whether a chain decomposition of a poset is optimal, with the lower bound being `w` chains, where `w` is the width of the widest antichain. What I observed is that for the types we are mostly concerned with (arbitrarily nested `Product<_, _>`s) we can always get the optimal decomposition if we extract chains after having sorted the data with the linear extension of the poset.
As an example, if we take `Product<u8, u8>`, we can see that the widest antichain that can be constructed has width `256` and is the following: `{(255, 0), (254, 1), ..., (0, 255)}`. Because the linear extension order will sort things lexicographically, the maximum number of chains we can ever end up with is 256, since each update will be grouped into the chain corresponding to its first coordinate. This seems to generalize to additional levels of nesting.
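To make the observation concrete, here is a minimal sketch of the idea, not the code from the PR. It assumes a hypothetical `Pair` type standing in for `Product<u8, u8>`, with a hand-written `less_equal` for the product partial order and the derived lexicographic `Ord` as the linear extension. The helper names `extract_chains` and `widest_antichain` are also made up for illustration. After sorting, each element is appended to the first chain whose last element is `less_equal` to it, or starts a new chain otherwise.

```rust
/// Hypothetical stand-in for `Product<u8, u8>`. The derived `Ord` compares
/// fields lexicographically, which is a linear extension of the product order.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Pair(u8, u8);

impl Pair {
    /// Product partial order: both coordinates must be less than or equal.
    fn less_equal(&self, other: &Pair) -> bool {
        self.0 <= other.0 && self.1 <= other.1
    }
}

/// Sorts by the linear extension, then greedily splits the data into chains.
fn extract_chains(mut times: Vec<Pair>) -> Vec<Vec<Pair>> {
    times.sort();
    let mut chains: Vec<Vec<Pair>> = Vec::new();
    for t in times {
        // First chain whose last element is `less_equal` to `t`, if any.
        let slot = chains
            .iter_mut()
            .find(|chain| chain.last().map_or(false, |last| last.less_equal(&t)));
        match slot {
            Some(chain) => chain.push(t),
            None => chains.push(vec![t]),
        }
    }
    chains
}

/// The antichain {(255, 0), (254, 1), ..., (0, 255)} from the example above.
fn widest_antichain() -> Vec<Pair> {
    (0..=255u8).map(|a| Pair(a, 255 - a)).collect()
}

fn main() {
    // No two elements of the antichain are comparable, so each gets its own chain.
    let chains = extract_chains(widest_antichain());
    assert_eq!(chains.len(), 256);

    // Even with every possible pair as input, the number of chains stays within 256.
    let all: Vec<Pair> = (0..=255u8)
        .flat_map(|a| (0..=255u8).map(move |b| Pair(a, b)))
        .collect();
    let chains = extract_chains(all);
    assert!(chains.len() <= 256);
    println!("chains for all pairs: {}", chains.len());
}
```

Within each resulting chain the elements are totally ordered by `less_equal`, so a run like this admits binary search. The linear scan over chains is only for clarity; a real operator would likely pick the target chain more efficiently.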
from differential-dataflow.