mkuthan / stream-processing
Learn how to develop and test stateful streaming and batch data pipelines
Home Page: http://mkuthan.github.io/tags/#stream-processing
License: Apache License 2.0
How to update state of the streaming job using BigQuery?
TimestampedMatchers delegates to ScioCollectionMatcher but they should be tested anyway
Right now the jobs can be deployed on Dataflow but there is no test data.
It's a great opportunity to use Dataflow data generator template :)
https://cloud.google.com/dataflow/docs/guides/templates/provided/streaming-data-generator
Split shared package into shared (allowed as domain dependency) and infrastructure (allowed only as application dependency)
Unify diagnostic output for all IOs. Typical use case: handling corrupted Pubsub messages when a message cannot be decoded into a raw case class. Currently this is handled explicitly in toll-application:
SCollection
  .unionAll(
    Seq(
      boothEntriesRawDlq.map(x => IoDiagnostic(x.id, x.error)),
      boothExitsRawDlq.map(x => IoDiagnostic(x.id, x.error)),
      vehicleRegistrationsRawUpdatesDlq.map(x => IoDiagnostic(x.id, x.error))
    )
  )
  .keyBy(_.key)
  .writeDiagnosticToBigQuery(IoDiagnosticTableIoId, config.ioDiagnosticTable)
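One possible shape for the unified version is a single helper that accepts any number of dead-letter collections and produces keyed diagnostics. The sketch below uses plain Scala collections in place of SCollection so that it runs without Scio. `DeadLetter`, the fields of `IoDiagnostic`, the `key` derivation and `Diagnostics.unify` are all hypothetical names for illustration, not the repository's actual types.

```scala
// Hypothetical dead-letter envelope: every IO would emit this shape
// when a message cannot be decoded. Covariant so mixed payloads can be unioned.
final case class DeadLetter[+T](id: String, error: String, payload: T)

// Hypothetical stand-in for the repository's IoDiagnostic;
// the real key derivation is not shown in the issue, so this one is a guess.
final case class IoDiagnostic(id: String, reason: String) {
  def key: String = s"$id:$reason"
}

object Diagnostics {
  // Union all dead-letter collections and key each diagnostic,
  // mirroring unionAll(...).keyBy(_.key) from the snippet above.
  def unify(deadLetters: Seq[Seq[DeadLetter[Any]]]): Seq[(String, IoDiagnostic)] =
    deadLetters.flatten
      .map(dl => IoDiagnostic(dl.id, dl.error))
      .map(d => d.key -> d)
}
```

With a helper like this, adding a new IO would only mean passing one more dead-letter collection, rather than repeating the `map(x => IoDiagnostic(x.id, x.error))` call per source.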
Report vehicles with expired registration using side input as registration lookup table.
SELECT EntryStream.EntryTime, EntryStream.LicensePlate, EntryStream.TollId, Registration.RegistrationId
FROM EntryStream TIMESTAMP BY EntryTime
JOIN Registration
ON EntryStream.LicensePlate = Registration.LicensePlate
WHERE Registration.Expired = '1'
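The lookup semantics of the query above can be sketched in plain Scala, with an in-memory `Map` standing in for the side input and a `Seq` standing in for the entry stream. All case classes, field names and `ExpiredRegistrations.report` are assumptions modelled on the query columns, and `expired` is represented as a Boolean rather than the `'1'` string flag. A Scio version would presumably build the map with a side input instead.

```scala
// Hypothetical domain types modelled on the columns used in the query.
final case class TollBoothEntry(entryTime: String, licensePlate: String, tollId: String)
final case class VehicleRegistration(registrationId: String, licensePlate: String, expired: Boolean)
final case class ExpiredRegistrationReport(
    entryTime: String,
    licensePlate: String,
    tollId: String,
    registrationId: String
)

object ExpiredRegistrations {
  def report(
      entries: Seq[TollBoothEntry],
      registrations: Seq[VehicleRegistration]
  ): Seq[ExpiredRegistrationReport] = {
    // The "side input": registrations grouped by license plate for lookup.
    val lookup: Map[String, Seq[VehicleRegistration]] =
      registrations.groupBy(_.licensePlate)

    // Inner join: entries without a registration are dropped,
    // and only expired registrations are reported.
    for {
      entry <- entries
      registration <- lookup.getOrElse(entry.licensePlate, Seq.empty)
      if registration.expired
    } yield ExpiredRegistrationReport(
      entry.entryTime,
      entry.licensePlate,
      entry.tollId,
      registration.registrationId
    )
  }
}
```

The lookup table has to fit in memory for this approach to work, which is the usual constraint on side inputs as well.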
Generated payloads for BigQuery should show what is accepted and what is not.
Domain types like dates and numbers might accept values out of valid ranges.
Re-think diagnostic output:
Show unified batch and streaming for TollApplication.
ClosedTap
and we need dead letters.