dspa's Issues
Windowing
Todo:
- Add a last-active timestamp to each post (updated by likes, comments, and replies).
- Add an operator that outputs all active posts, i.e. posts whose last-active timestamp is >= time - 12 hours (like a filter).
- Count the number of comments, the number of replies, and the number of people who interacted with each post in the last 30 minutes.
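The three todo items above can be sketched without any streaming framework. This is a minimal, std-only illustration, not the project's operator code: `ActivityTracker`, `Interaction`, `Kind`, and `PostStats` are hypothetical names, timestamps are assumed to be seconds, and "people who interact" is assumed to mean distinct person ids.

```rust
use std::collections::{HashMap, HashSet};

/// Event time in seconds (assumption: the real pipeline may use another unit).
type Timestamp = u64;

const ACTIVE_WINDOW_SECS: Timestamp = 12 * 60 * 60; // 12 hours
const STATS_WINDOW_SECS: Timestamp = 30 * 60; // 30 minutes

#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Like,
    Comment,
    Reply,
}

/// Hypothetical event shape: one interaction with a post.
#[derive(Debug, Clone)]
struct Interaction {
    post_id: u64,
    person_id: u64,
    kind: Kind,
    timestamp: Timestamp,
}

#[derive(Debug, Default, PartialEq)]
struct PostStats {
    comments: usize,
    replies: usize,
    unique_people: usize,
}

#[derive(Default)]
struct ActivityTracker {
    last_active: HashMap<u64, Timestamp>,
    events: Vec<Interaction>,
}

impl ActivityTracker {
    /// Store the event and bump the post's last-active timestamp.
    fn record(&mut self, ev: Interaction) {
        let last = self.last_active.entry(ev.post_id).or_insert(ev.timestamp);
        *last = (*last).max(ev.timestamp);
        self.events.push(ev);
    }

    /// Posts whose last-active timestamp is >= now - 12 hours, sorted by id.
    fn active_posts(&self, now: Timestamp) -> Vec<u64> {
        let cutoff = now.saturating_sub(ACTIVE_WINDOW_SECS);
        let mut ids: Vec<u64> = self
            .last_active
            .iter()
            .filter(|(_, ts)| **ts >= cutoff)
            .map(|(id, _)| *id)
            .collect();
        ids.sort_unstable();
        ids
    }

    /// Comment, reply, and distinct-person counts for one post in the last 30 minutes.
    fn stats(&self, post_id: u64, now: Timestamp) -> PostStats {
        let cutoff = now.saturating_sub(STATS_WINDOW_SECS);
        let mut stats = PostStats::default();
        let mut people = HashSet::new();
        let in_window = self
            .events
            .iter()
            .filter(|e| e.post_id == post_id && e.timestamp >= cutoff && e.timestamp <= now);
        for ev in in_window {
            match ev.kind {
                Kind::Comment => stats.comments += 1,
                Kind::Reply => stats.replies += 1,
                Kind::Like => {} // likes count only towards the people total
            }
            people.insert(ev.person_id);
        }
        stats.unique_people = people.len();
        stats
    }
}
```

The sketch keeps every event in a `Vec` for simplicity; a real operator would presumably evict events older than the larger window.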
UTF-16 records
Some records in the data are UTF-16, but csv's StringRecord supports only UTF-8. For now, these records are ignored and the following error is logged:
Error: Error(Utf8 { pos: Some(Position { byte: 9271, line: 123, record: 122 }), err: Utf8Error { field: 2, valid_up_to: 7 } })
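One possible alternative to dropping these records is to read raw bytes (the csv crate offers `ByteRecord` for this) and decode each field with a fallback. The std-only sketch below shows just the decoding step. `decode_field` is a hypothetical helper, and it assumes that UTF-16 fields are little-endian unless a byte-order mark says otherwise. Note that UTF-16 text made up only of ASCII characters is also valid UTF-8 (with embedded NUL bytes), so the UTF-8-first check does not catch every UTF-16 field.

```rust
/// Decode a raw field: try UTF-8 first, then fall back to UTF-16.
/// Assumption: UTF-16 input is little-endian unless it starts with a BOM.
fn decode_field(bytes: &[u8]) -> Option<String> {
    if let Ok(s) = std::str::from_utf8(bytes) {
        return Some(s.to_owned());
    }
    // UTF-16 code units are two bytes each, so an odd length cannot be valid.
    if bytes.len() % 2 != 0 {
        return None;
    }
    // Honour a byte-order mark if present and strip it from the payload.
    let (big_endian, payload) = match bytes {
        [0xFE, 0xFF, rest @ ..] => (true, rest),
        [0xFF, 0xFE, rest @ ..] => (false, rest),
        _ => (false, bytes),
    };
    let units: Vec<u16> = payload
        .chunks_exact(2)
        .map(|p| {
            let pair = [p[0], p[1]];
            if big_endian {
                u16::from_be_bytes(pair)
            } else {
                u16::from_le_bytes(pair)
            }
        })
        .collect();
    // Returns None on unpaired surrogates instead of guessing.
    String::from_utf16(&units).ok()
}
```

Returning `None` for undecodable input keeps the current behaviour (skip and log) available as the last resort.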
Kafka not reading from beginning
As in the title: the Kafka consumer does not read from the beginning.
Test DSU
As above.
Watermarks & Tests
-- What if we do watermarking at the producer?
- Watermarking in the producer is done correctly. We need to explain why in detail.
- In the current implementation, a watermark offers guarantees for delivery of all elements with timestamps strictly below its own timestamp. That is because the stash is non-inclusive.
- Watermarks need to move out of the Buffer operator: if watermarks are eliminated at the buffer, it is as if the downstream operators never had them at all, hence no periodic outputs.
- The rest of the operators account for delays and synchronization at the moment. They should not, since the watermarks offer all the needed guarantees.
-- Is watermarking at the producer a good idea in the end? It may make more sense to just add the periodic extra events without any guarantees and then use the logic we already have. The producer is also only one worker and is not an actual part of the streaming system, which is a bit weird considering the task 0 formulation...
-- For tests: input for tests needs to be fed via a source operator, since probes don't downgrade capabilities (at least in the Timely version we use).
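To make the "strictly below" guarantee concrete, here is a minimal, std-only sketch of a non-inclusive stash. It is an illustration under assumptions, not the project's Buffer operator: `Stash`, `insert`, and `release` are hypothetical names, and timestamps are plain `u64` values.

```rust
use std::collections::BTreeMap;

/// Holds elements until a watermark proves that no earlier element can arrive.
struct Stash<T> {
    pending: BTreeMap<u64, Vec<T>>,
}

impl<T> Stash<T> {
    fn new() -> Self {
        Stash { pending: BTreeMap::new() }
    }

    fn insert(&mut self, timestamp: u64, item: T) {
        self.pending.entry(timestamp).or_default().push(item);
    }

    /// Remove and return, in timestamp order, every element whose timestamp is
    /// strictly below `watermark`. Elements stamped exactly `watermark` stay
    /// stashed, which is what makes the stash non-inclusive.
    fn release(&mut self, watermark: u64) -> Vec<(u64, T)> {
        // `split_off` returns the entries with key >= watermark.
        let kept = self.pending.split_off(&watermark);
        let ready = std::mem::replace(&mut self.pending, kept);
        ready
            .into_iter()
            .flat_map(|(ts, items)| items.into_iter().map(move |item| (ts, item)))
            .collect()
    }

    /// Number of elements still waiting for a watermark.
    fn len(&self) -> usize {
        self.pending.values().map(Vec::len).sum()
    }
}
```

With elements stamped 5, 7, 10, and 12 in the stash, a watermark of 10 releases only 5 and 7. The element stamped 10 is held until a later watermark arrives.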