onyx-platform / onyx
Distributed, masterless, high performance, fault tolerant data processing
Home Page: http://www.onyxplatform.org
License: Eclipse Public License 1.0
This API will compare it to a UUID and fail. Do the cast inside the function so users don't need to worry about it.
As of 0.3.0, the only strategies for balancing peers across jobs and tasks are round-robin and breadth-first, respectively. This ticket should break the algorithms used for planning and coordination out into functions behind multimethods, and allow for a greedy strategy. A greedy strategy will try to complete an entire job before moving on to the next.
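A minimal sketch of what that multimethod split could look like, dispatching on a strategy keyword. Note that allocate-peers and the strategy keywords here are illustrative names, not part of the Onyx API:

```clojure
;; Hypothetical sketch: allocate-peers is an illustrative name.
(defmulti allocate-peers (fn [strategy jobs n-peers] strategy))

;; Round robin: deal peers out across all running jobs evenly.
(defmethod allocate-peers :round-robin
  [_ jobs n-peers]
  (frequencies (take n-peers (cycle jobs))))

;; Greedy: devote every peer to one job until it completes.
(defmethod allocate-peers :greedy
  [_ jobs n-peers]
  {(first jobs) n-peers})

(allocate-peers :round-robin [:job-1 :job-2] 5) ;; => {:job-1 3, :job-2 2}
(allocate-peers :greedy [:job-1 :job-2] 5)      ;; => {:job-1 5}
```

Putting each strategy behind its own defmethod keeps the planner open for extension, so new strategies can be added without touching the coordination code.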
Removed while construction took place on 0.5.0.
The catalog should offer an optional onyx/max-peers parameter that takes an integer value representing the maximum number of peers that may be executing an instance of that task at any single point in time.
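For illustration, a catalog entry using the proposed key might look like the following. Every key and value besides :onyx/max-peers is an assumed example, not prescribed by this issue:

```clojure
;; Assumed example entry; only :onyx/max-peers is the proposed parameter.
(def catalog-entry
  {:onyx/name :process-segments
   :onyx/fn :my.app/process
   :onyx/type :function
   :onyx/batch-size 100
   ;; no more than 4 peers may execute this task at once
   :onyx/max-peers 4})
```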
Reproduced in core.async plugin tests with a high number of virtual peers. Sometimes, closing a peer will block as it tries to flush its pipeline. The pipeline will block on reading from an ingress queue. This queue should always provide the sentinel value. Something is hanging on to the sentinel as a consumer and never committing it back to the queue, hence the hang.
This is a particularly rough edge case. If a virtual peer receives a grouping task, it's capable of being starved from receiving the sentinel segment off the queue due to the way that HornetQ pins messages as it groups. If a consumer closes out, it might not necessarily requeue the sentinel in a server node where other consumers can reach it. Hence, the other virtual peers may deadlock and wait forever. This only affects batch mode - streaming mode is fine.
Allow value-level parameterization through the catalog.
[{...
  :my/param 42
  :my/other-param 44
  :onyx/params [:my/param :my/other-param]}]
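A minimal sketch of how such params could be resolved and applied, assuming the keys listed under :onyx/params are looked up in the catalog entry and their values passed to the task function ahead of the segment. resolve-params and my-task are illustrative names, not Onyx functions:

```clojure
;; Illustrative catalog entry carrying value-level parameters.
(def entry {:my/param 42
            :my/other-param 44
            :onyx/params [:my/param :my/other-param]})

;; Look up each declared param key in the entry, in order.
(defn resolve-params [entry]
  (map entry (:onyx/params entry)))

;; The task function receives the params before the segment.
(defn my-task [param other-param segment]
  (assoc segment :sum (+ param other-param)))

(apply my-task (concat (resolve-params entry) [{}]))
;; => {:sum 86}
```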
Should enable better searchability of the docs.
With any task preceded by a sequential task, the following exception will be thrown:
org.hornetq.api.core.HornetQInternalErrorException: HQ119000: ClientSession closed while creating session
type: #<INTERNAL_ERROR>
org.hornetq.core.client.impl.ClientSessionFactoryImpl.createSessionInternal ClientSessionFactoryImpl.java: 782
org.hornetq.core.client.impl.ClientSessionFactoryImpl.createSession ClientSessionFactoryImpl.java: 366
sun.reflect.NativeMethodAccessorImpl.invoke0 NativeMethodAccessorImpl.java
sun.reflect.NativeMethodAccessorImpl.invoke NativeMethodAccessorImpl.java: 57
sun.reflect.DelegatingMethodAccessorImpl.invoke DelegatingMethodAccessorImpl.java: 43
java.lang.reflect.Method.invoke Method.java: 606
clojure.lang.Reflector.invokeMatchingMethod Reflector.java: 93
clojure.lang.Reflector.invokeNoArgInstanceMember Reflector.java: 313
onyx.queue.hornetq/eval20966/fn hornetq.clj: 201
clojure.lang.MultiFn.invoke MultiFn.java: 231
onyx.peer.operation/start-lifecycle? operation.clj: 55
onyx.peer.transform/eval21156/fn transform.clj: 95
clojure.lang.MultiFn.invoke MultiFn.java: 231
onyx.peer.task-lifecycle-extensions/merge-api-levels/fn task_lifecycle_extensions.clj: 19
clojure.lang.ArrayChunk.reduce ArrayChunk.java: 63
clojure.core.protocols/fn protocols.clj: 98
clojure.core.protocols/fn/G protocols.clj: 19
The virtual peer will shut down and instantly reboot, continuing as normal. This bug is mostly harmless. It is caused by the concurrency optimizations set in the HornetQ configuration. The Session Factory is swapped out in favor of a different factory, but the new factory doesn't "stick" for new tasks. Virtual peers reuse the old Session Factory that has been closed, and the exception is thrown. After reboot, a fresh Session Factory is used.
Harmless, but annoying to see in the logs.
In 0.4.0, we're going to move away from the tree/map-based workflow to a vector-of-vectors. This will properly support multiple input streams to any task, and continue to support multiple output streams. It will look like this:
[[:in-1 :inc]
[:in-2 :inc]
[:in-3 :inc]
[:inc :out]]
Tasks: read-batch must return a map.
Hard-coded for 250ms.
Observing high processor load after starting and stopping Onyx many times in the same repl session. Reproducing what @prasincs saw a few weeks ago.
There's been some confusion around what the difference between the "event map", "lifecycle event map", and "context map" are. They are all the same thing. This should be fixed in the docs. I think I'd like to choose "lifecycle event" as the canonical term.
Seems like this function should block, but actually returns a future.
Considering a feature that will let a task complete when only one (not all) of its upstream inputs have pushed the sentinel onto the input stream. This would aid use cases where a privileged kill stream is utilized.
Just a placeholder, needs more thought.
When we're running locally, it might be better for performance to use LMAX Disruptor instead of HornetQ for messaging. Requires benchmarking.
As mentioned in #2, workflows should be validated such that only input tasks are missing incoming edges, only output tasks are missing outgoing edges, and the DAG has no cycles (the dependency library will throw an exception for you when creating the graph).
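A library-free sketch of the edge checks, assuming the vector-of-vectors workflow format; cycle detection is left out here since the dependency library already throws when building a cyclic graph. validate-workflow is an illustrative name:

```clojure
;; Sketch only: checks in/out-degree rules for a workflow of [src dst] pairs.
(defn validate-workflow [workflow input-tasks output-tasks]
  (let [sources (set (map first workflow))
        sinks   (set (map second workflow))
        tasks   (into sources sinks)]
    ;; only input tasks may lack an incoming edge
    (assert (every? input-tasks (remove sinks tasks))
            "A non-input task is missing an incoming edge")
    ;; only output tasks may lack an outgoing edge
    (assert (every? output-tasks (remove sources tasks))
            "A non-output task is missing an outgoing edge")
    workflow))

(validate-workflow [[:in :inc] [:inc :out]] #{:in} #{:out})
```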
Coverage Protection is described here: https://github.com/MichaelDrogalis/onyx/blob/12c72be61d056446b1b7fe0a54a33782bbedc03b/doc/design/masterless.md#partial-coverage-protection
This issue serves as a placeholder for the creation of another repository - onyx-dashboard. This dashboard will serve as a point of monitoring the status of what's happening inside Onyx by querying ZooKeeper. The data in ZooKeeper is immutable, and compressed with Fressian.
The exception right now is confusing. start-lifecycle?, inject-lifecycle-resources, inject-temporal-resources, close-temporal-resources, and close-lifecycle-resources all need a type check on their return values.
Some of the examples are incorrect due to design changes, or require pictures for better explanation.
Unsure of how I want it to look, but make it less verbose.
Command section is mostly correct, but needs a final pass to ensure that it's up to date.
If you supply a core.async channel (or other non-serializable object) in a task map within a catalog and call submit-job, it will hang silently.
Ideally there should be some kind of validation of the catalog for serializability.
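One possible check, sketched here as an EDN print/read round trip; Onyx actually serializes this data with Fressian, so this is only an approximation, and serializable? is an illustrative name:

```clojure
;; A value that survives a print/read round trip is safely serializable;
;; things like core.async channels print as unreadable #object forms.
(defn serializable? [task-map]
  (try
    (= task-map (read-string (pr-str task-map)))
    (catch Exception _ false)))

(serializable? {:onyx/name :in :onyx/batch-size 20}) ;; => true
```

Running this over every task map before submit-job would let validation fail fast with a clear error instead of hanging.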
Logs get pretty huge after a short time. Add Rotor to Timbre to fix that.
If a task throws an exception, the Peer crashes and isn't able to service additional work.
This plugin should be capable of reading off the Hadoop file system and writing segments back to it. The point of input and partitioning should be a single file, and the partitioning will happen over the byte sequence representing the file distributed over blocks in the cluster.
Hi, I wonder why a virtual peer uses 15 threads to process data? Are there other considerations?
(-> (fn1 event) fn2 fn3)
Isn't this simpler?
In a very early version of Onyx, if a full batch of messages didn't accrue within a certain period of time, Onyx would stop reading and time out. This was removed due to a bug in HornetQ that didn't preserve sequential ordering. This behavior is useful for sparse message streams, so a workaround should be found to add it back in.
A Kafka plugin should be created that offers both input and output functionality. Additionally, it should be capable of working with Kafka partitions.
If the Coordinator and Peer are running on the same machine, they'll log to the same file. This can be a little bit of a pain during development. Each should log to its own file.
Reads block forever in aggregation readers. Fixed in 0.4.0-SNAPSHOT.
The Coordinator logs very infrequently as of 0.3.0. Logging using Dire should be implemented on events like job submission, task completion, peer birth/death, etc.
Submitting a catalog that doesn't conform to the specification of the informational model throws an unhelpful assertion inside the Coordinator. The catalog should be validated using a library like Schema to obtain helpful error messages.
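A library-free sketch of the kind of checks such a validator would run, trading the bare assertion for a readable error. validate-catalog-entry and the two rules shown are illustrative; the real informational model has many more keys:

```clojure
;; Sketch: check each rule and throw a descriptive error on failure.
(defn validate-catalog-entry [entry]
  (doseq [[k pred msg] [[:onyx/name keyword? "must be a keyword"]
                        [:onyx/batch-size integer? "must be an integer"]]]
    (when-not (pred (get entry k))
      ;; ex-info carries the offending entry and key for debugging
      (throw (ex-info (str k " " msg) {:entry entry :key k}))))
  entry)

(validate-catalog-entry {:onyx/name :in :onyx/batch-size 20})
```

Schema expresses the same idea declaratively and produces error messages that name the offending key and expected type.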
Key it under onyx.core/job-id.
There needs to be an API function that takes a job ID and halts any peer execution of that job's tasks. The job's tasks will no longer be eligible for execution.
Reproduced with the grouping test in Onyx core by turning up the number of virtual peers. Observed that two peers can continually take the sentinel segment off the queue and re-enqueue it infinitely, neither of them able to complete the task.
This silently fails right now; the exception gets swallowed up and everything halts.
If both of these are the same, the same multimethod for lifecycle resources gets dispatched to. This is really confusing. See #36
As of release 0.3.0, Clojure is the only supported language for Onyx. Java users can use the APIs that Clojure offers to tap some of the Onyx functionality, but this becomes problematic for areas such as lifecycle extensions that rely on implementations of multimethods.
Furthermore, EDN isn't the friendliest cross-language data format to send catalogs and workflows through. Part of this issue should explore options that Java users have on this front.
Seeing this happen exactly twice, every single time. Never noticed until now.
This was removed when the underlying mechanism changed for 0.5.0. Reimplement this API function.
Messed this one up during the redesign. Fix this before shipping 0.5.0.
Submitting a workflow that doesn't conform to the specification of the informational model throws an unhelpful assertion inside the Coordinator. The workflow should be validated using a library like Schema to obtain helpful error messages.
When a peer dies, it should attempt to recover by rebooting itself. See core.async test for failure.
From the mailing list:
"Also, it'd be really nice to specify a group-by operation on the input queue of each function, because then you could make things like wordcount really easy--you'd be able to say "send all instances of the same word to the same downstream task", which would enable workflows that require an implicit sort/shuffle step."
https://groups.google.com/forum/#!topic/onyx-user/xniQcgCPEn8
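A sketch of what declaring a grouping key on a task might look like in the catalog. :onyx/group-by-key and every other name here is an assumed example for this proposal, not an existing parameter:

```clojure
;; Assumed example: routes every segment with the same :word value
;; to the same downstream peer, making word count trivial.
(def count-words-entry
  {:onyx/name :count-words
   :onyx/fn :my.app/count-words
   :onyx/type :function
   :onyx/group-by-key :word
   :onyx/batch-size 100})
```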
Aggregators are at a disadvantage, performance-wise, relative to transformers and groupers. An aggregator can only hold a single session open, which needs to be reused across pipeline iterations in the peer. The reason is that if multiple sessions were used, all sessions would need to be read from at the same time, since groupers pin particular message IDs to consumers. These sessions shouldn't be closed, otherwise the messages will be repinned. Further, once the sentinel is read, all other sessions will block indefinitely.
The goal of this issue is to speed up aggregators using an alternate design approach.
I think :hornetq.udp/refresh-timeout should be :hornetq.jgroups/refresh-timeout (along with the others)
It would be pretty great to Jepsen both HornetQ and Onyx itself. Partitioning virtual peers and coordinators fits the bill nicely.
I never got around to adding a test for this because I implemented JGroups before the test suite used embedded HornetQ. This is pretty straightforward to test with an embedded cluster.