rsimmonsjr / axiom
Implementation of a highly-scalable and ergonomic actor model for Rust
License: Other
This should allow the user to send to all system actors in a try fashion and then be told if there were any errors sending to those actors.
Currently the code checks to see if receivable == 1 to determine if it needs to schedule the actor. If the actor has messages already then the dispatcher thread will have put it in the channel. I am wondering if try_send will race the dispatcher threads and an actor with a message not get scheduled. If it's in the channel twice that is fine, though not the best for performance, but that should be preferable to not being in the channel when it has messages. The bad part is that if it is out of sync it won't get back in sync, because the actor will have more than one receivable message and won't be in the channel. Perhaps an AtomicBool should be used instead to track whether the actor is scheduled.
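Something like the following could track it (a minimal sketch; the struct and method names are illustrative, not the crate's current API):

use std::sync::atomic::{AtomicBool, Ordering};

struct ScheduleFlag {
    scheduled: AtomicBool,
}

impl ScheduleFlag {
    /// Returns true only for the caller that flips the flag from false to
    /// true, so exactly one thread enqueues the actor in the work channel.
    fn try_claim(&self) -> bool {
        self.scheduled
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Called by the dispatcher after draining; if more messages arrived in
    /// the meantime the sender will claim the flag again and re-enqueue.
    fn release(&self) {
        self.scheduled.store(false, Ordering::Release);
    }
}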
If a user tries to deserialize a remote ActorId and the remote is not connected, the code currently panics. It should, instead, merely return a deserialization error.
The local name should not be a UUID but something human-readable that a developer can set when they create the actor and then look up afterwards.
Some actors just react to messages and don't hold state. They shouldn't have to declare a processor with a placeholder state.
Currently shutdown is a bit brutal; messages are stranded in the channels. This ticket is to add an additional shutdown process that makes sure all messages are processed and sends Stop to all actors so that they can wrap up work before the threads exit.
This is just a usability thing since the types can easily ditch the prefix. It is also worth considering deleting the Secc prefix on all of the other sub types.
A monitor is a special actor that receives messages from the system and allows one actor to know about the life status of another actor. One possible implementation is to expand SystemMsg to include the message ActorStopped(Arc<ActorId>) to enable the receiver to know which actor was shut down. When an actor is monitoring another actor, the system will track all monitoring actor ids in the actor and then send the message to all of those actors. Of course, if the system is hard killed one cannot be sure the monitor message will be received. In later implementations across a network this should take advantage of location transparency.
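A sketch of the expanded enum (the Start and Stop variants are shown only for context and the exact set is an assumption; ActorId is the crate's existing type):

use std::sync::Arc;

pub enum SystemMsg {
    /// Sent to an actor when it starts.
    Start,
    /// Tells an actor to shut down gracefully.
    Stop,
    /// Proposed: notifies a monitoring actor that the given actor stopped.
    ActorStopped(Arc<ActorId>),
}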
Axiom uses SECC but not the other way around. It should be in its own crate. Once SECC is super stable after being used by Axiom it should be made its own repository.
send_new and try_send_new will wrap the message inefficiently in a double Arc. This needs a new API to make it efficient.
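One possible shape for that API, sketched against the MessageContent enum shown later in this document (the content field name on Message is an assumption):

use std::sync::Arc;

impl Message {
    /// Reuses the caller's Arc as the content rather than wrapping the
    /// value in a second Arc; Arc<T> coerces to Arc<dyn ActorMessage>.
    pub fn from_arc<T: ActorMessage + 'static>(value: Arc<T>) -> Self {
        Message {
            content: MessageContent::Local(value),
        }
    }
}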
This could happen if an actor was already stopped but the user still had an ActorId in serialized form and then tried to deserialize it. Right now that would probably panic and take down the system.
Currently the threads are just flat terminated. It would be better if they were gracefully shut down and informed the remotes that they are shut down. This could be wrapped up in a protocol for system-to-system communication.
Rather than doing only one message at a time, it would be more efficient if a dispatcher thread would perform work up to a certain configurable time limit. The default might be 1 millisecond. If a message takes less than that, the dispatcher thread should handle the next message for the actor, if any, and the next and so on until it reaches the timeout.
While adding this issue the developer will need to add a configurable time_slice to set the time in nanoseconds, and then a configurable time_slice_max which will serve as an upper bound for the time a message executes; beyond this level the system should log a warning.
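A sketch of what those knobs might look like on the configuration struct (the defaults are illustrative):

pub struct ActorSystemConfig {
    /// Time slice, in nanoseconds, that a dispatcher thread spends working
    /// one actor before moving on.
    pub time_slice: u64,
    /// Upper bound, in nanoseconds, for a single message's execution;
    /// beyond this the system should log a warning.
    pub time_slice_max: u64,
}

impl Default for ActorSystemConfig {
    fn default() -> Self {
        ActorSystemConfig {
            time_slice: 1_000_000,      // the suggested 1 millisecond default
            time_slice_max: 10_000_000, // illustrative only
        }
    }
}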
The actor system should maintain a map of UUIDs to ActorIds where the key is the id field inside the ActorId. This will enable a user to look up an actor by its ID even when the actor is remote, because UUID v4 values have an incredibly small chance of colliding.
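A minimal sketch, assuming the uuid crate and that ActorId exposes its id field (the registry type and lock choice are illustrative):

use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use uuid::Uuid;

struct ActorRegistry {
    by_uuid: RwLock<HashMap<Uuid, Arc<ActorId>>>,
}

impl ActorRegistry {
    fn register(&self, aid: Arc<ActorId>) {
        self.by_uuid.write().unwrap().insert(aid.id, aid);
    }

    fn find(&self, id: &Uuid) -> Option<Arc<ActorId>> {
        self.by_uuid.read().unwrap().get(id).cloned()
    }
}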
Currently errors will mostly just panic. This should be changed to be more robust and tolerant.
I'm used to Erlang's pattern of feeding an actor message into a pattern match, and the chains of if let Some(x) = msg.content_as::<Foo>() feel clunky in comparison. I know there's only so much one can do with TypeId but I was wondering if you had any thoughts on whether it would be possible to have a nicer pattern?
Another concern I have that I don't know the answer to is whether TypeId is stable between compiler versions, or even between multiple builds of the same program. If not, having multiple separate programs in the same cluster could lead to their TypeIds being incompatible...
To make Status more consistent the variants should have a consistent cognitive theme:
Processed should be renamed Consume.
Skipped should be renamed Skip.
Note that it might entail renaming Status as well, because the new names are really verbs. Maybe Action.
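The renamed enum would look something like this (a sketch; any variants beyond these two are out of scope here):

pub enum Action {
    /// Formerly Processed: the actor consumed the message.
    Consume,
    /// Formerly Skipped: the actor skipped the message, leaving it in
    /// the channel.
    Skip,
}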
Right now errors in connect will mostly just panic. They should be made more robust and tolerant.
From #53:
TypeId should not be depended on across binaries compiled on different machines; that is not recommended. When running an Axiom cluster (after I get that finished) the recommendation will be to deploy the same compiled binary on all nodes.
I want to throw down some thoughts I have about TypeId and see what you think of them, so that I can get them straight in my own head, and hopefully someone will find this useful someday down the line.
First, I'm coming from ROS, which is designed to allow multiple programs to operate together, even written in different languages, and basically implements a message-based RPC system complete with message definition files similar to gRPC or Cap'n Proto. I sure as heck don't want to build something as complicated as that, but there are certain advantages that come out of it that I do kinda want:
Your recommendation for using the same binary for every node in a cluster makes sense for something scaling horizontally like a web service, but is less convenient for something with lots of asymmetric parts like a robot system. If rebuilding your debugger program requires rebuilding and re-deploying your whole system, and that system is fundamentally stateful, that gets annoying and slow.
So I guess my question is, what can we reliably do with TypeId, and what are the exact constraints? How far can we rely on TypeId::of::<u32>() == whatever being accurate? Obviously if everything is built into one statically-linked binary, all TypeIds will line up with each other. And the TypeId docs say "...it is worth noting that the hashes and ordering will vary between Rust releases. Beware of relying on them inside of your code!", so that's the other extreme. But "same compiler" is pretty easy to guarantee, and the docs don't say a whole lot beyond that. So, will TypeId comparisons be valid if:
Currently in the test cases there is a usage of the downcast function that is very manual and has poor ergonomics:
fn handle(&mut self, aid: Arc<ActorId>, msg: &Arc<Message>) -> Status {
    dispatch(self, aid.clone(), msg.clone(), &StructActor::handle_op)
        .or_else(|| dispatch(self, aid.clone(), msg.clone(), &StructActor::handle_i32))
        .or_else(|| {
            dispatch(
                self,
                aid.clone(),
                msg.clone(),
                move |state: &mut StructActor, aid: Arc<ActorId>, msg: &u8| -> Status {
                    assert_eq!(3, aid.received());
                    assert_eq!(7 as u8, *msg);
                    state.count += *msg as usize;
                    assert_eq!(29 as usize, state.count);
                    Status::Processed
                },
            )
        })
        .unwrap()
}
This is unfortunately necessary due to Rust's mechanics with Any, but it could be improved with a macro that would take the state, the aid, the msg, and then a list of handler functions (which could be closures) and generate the code above.
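A minimal sketch of such a macro, assuming the existing dispatch helper (the macro name is illustrative):

macro_rules! dispatch_handlers {
    ($state:expr, $aid:expr, $msg:expr, $first:expr $(, $rest:expr)* $(,)?) => {
        dispatch($state, $aid.clone(), $msg.clone(), $first)
            $(.or_else(|| dispatch($state, $aid.clone(), $msg.clone(), $rest)))*
            .unwrap()
    };
}

With that, the handle function above collapses to a single dispatch_handlers!(self, aid, msg, &StructActor::handle_op, &StructActor::handle_i32, ...) call, with closures allowed as trailing handlers.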
Currently sending to remote actors is not enabled. Once the system has the ability to connect in the cluster, this ability should be added. The ActorSender enum should be enhanced to allow the message to be serialized, sent to the recipient, decoded on the other side, and handed to a Local sender where the actor lives.
Rather than have a mechanism by which the Actor System reads config from a file, I would like to create a struct with the configuration options and allow the user to instantiate this struct however they want when passing it to the actor system. There should also be a set of defaults so that if the user passes no config structure, or only a partially filled one, the system will configure itself with the defaults.
The preferred way of implementation would be to use the builder pattern as in:
let config = ActorSystem::config().poll_ms(20);
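A partial sketch of the builder approach, showing just poll_ms (the field, default, and the idea that ActorSystem::config() simply returns ActorSystemConfig::default() are all illustrative; the time-slice fields sketched earlier would live here too):

pub struct ActorSystemConfig {
    pub poll_ms: u64,
}

impl ActorSystemConfig {
    pub fn default() -> Self {
        ActorSystemConfig { poll_ms: 10 }
    }

    /// Builder-style setter that consumes and returns the config. In Rust a
    /// field and a method may share a name, which keeps the call site tidy.
    pub fn poll_ms(mut self, value: u64) -> Self {
        self.poll_ms = value;
        self
    }
}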
Currently if you try to send to an actor and your request to send is beaten by the actor stopping and going out of the hashmap, the system would crash. Instead it should gracefully handle this and return an error message. This can happen when an actor is killed or stops gracefully.
Because of the needs of SystemActor and other issues, there is a need to refactor the ActorId to get rid of the thread-local.
Once std::any::type_name is stabilized, add the name to the message to be used for debugging.
In order to enable this application to be monitored, it is worth considering integration of Prometheus APM into the library. This may take the form of a secondary crate such as axiom_prometheus to make it optional, but some of the metrics should be structured for APM monitoring.
There should be a new spawn function that allows a user to spawn an actor with overrides for the mailbox size and other parameters we care to add in the future. The ActorConfig object should also be part of the ActorSystemConfig that is passed at system startup.
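A sketch of the override-aware spawn (the ActorConfig field and the spawn_with_config name are assumptions for illustration):

pub struct ActorConfig {
    /// Mailbox (message channel) capacity for this actor.
    pub mailbox_size: u16,
}

impl Default for ActorConfig {
    fn default() -> Self {
        ActorConfig { mailbox_size: 32 }
    }
}

A hypothetical call site would then read:

let aid = system.spawn_with_config(ActorConfig { mailbox_size: 128 }, state, handler);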
When sending messages there is a need to serialize and deserialize when the message crosses network boundaries. This ticket is geared towards finding a means to do this. For this to work, Message might very well be converted to an envelope struct instead of a type. This is part of enabling remote actors.
This is an issue for efficiency and simplification. Currently the MessageContent type looks like the following:
pub enum MessageContent {
    /// The message is a local message.
    Local(Arc<dyn ActorMessage + 'static>),
    /// The message is from remote and has the given hash of a [`std::any::TypeId`] and the
    /// serialized content.
    Remote(Vec<u8>),
}
Note that the local message is holding the content inside an inner Arc. It would be much nicer if we could get rid of the inner Arc if possible, because it would reduce complexity and indirection.
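One possible simplification, sketched as an idea rather than a decided design: the enclosing Message is already shared behind an Arc, so the Local variant could own its content directly via a Box:

pub enum MessageContent {
    /// The message is a local message, owned directly by the envelope.
    Local(Box<dyn ActorMessage + 'static>),
    /// The message is from remote and holds the serialized content.
    Remote(Vec<u8>),
}

The trade-off is that a bare MessageContent could no longer derive Clone, but since messages travel as Arc<Message> that may never be needed.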
SECC currently has no tests around skipping functions and they need to be added.
Currently if an actor panics while processing a message the whole system will go down. In reality only that actor should go down. This should involve using some means of panic unwinding in the receiver thread in order to shut down the offending actor but keep going.
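A minimal sketch of the unwinding approach inside the dispatcher thread (actor.handle and system.stop are stand-ins for the real calls; AssertUnwindSafe is needed because the closure captures mutable state):

use std::panic::{catch_unwind, AssertUnwindSafe};

let result = catch_unwind(AssertUnwindSafe(|| {
    actor.handle(aid.clone(), &message)
}));
match result {
    Ok(_status) => {
        // The actor returned normally; process the Status as usual.
    }
    Err(_) => {
        // The actor panicked: shut down only the offending actor and
        // let the dispatcher thread keep working other actors.
        system.stop(aid.clone());
    }
}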
When you return ResetSkip it should first dequeue the current message before resetting the pointers.
With the changes from the serialization branch it should be possible now to implement

aid.send(Message::new(x));

given that aid is an ActorId.
When the actor is stopping it should be able to return a reason for that stop via a trait. This reason should implement Debug and Display and be structured so that monitoring actors can decide what to do about the monitored actor that was stopped.
I'm making a pubsub-ish thing and it would be nice if I could store ActorIds in a deterministic order. A PR will come if you want it.
Using a thread builder, give a human readable name to each of the dispatcher threads. This will improve debugging.
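This is straightforward with std::thread::Builder; a sketch (the name format and thread count are illustrative):

use std::thread;

for index in 0..4 {
    thread::Builder::new()
        .name(format!("axiom-dispatcher-{}", index))
        .spawn(|| {
            // The dispatcher loop would run here.
        })
        .expect("failed to spawn dispatcher thread");
}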
Currently the Actor has to be looked up in a hashtable when it is scheduled and that is inefficient and unnecessary. The ActorSender::Local should be refactored to store the actor and avoid the lookup.
The philosophers.rs example occasionally locks up but it was merged because there were so many pertinent changes in the core code. This continues the work on that example.
Currently the actor doesn't know who sent the message so replying becomes a problem.
The secc::receive_await_timeout obeys the timeout but the send does not. This should be added and both functions should be tested to make sure they work as expected.
Actors should track how long their messages take to process and how much time they spend in the channel, and use that to warn the user when they are sending messages that take too long to process. The threshold for warning should be added to the configuration object for the actor system.
Logging using https://github.com/rust-lang-nursery/log should be integrated into the system and should replace all of the output currently produced with println!().
It would be potentially useful to know how long a message has been in the channel. SECC should implement that by tracking the difference in microseconds between enqueue and dequeue time in the SeccNode and then rolling those numbers up into an average when a message is received, in order to report timing metrics. At the same time, any other timing metrics should be explored and implemented, such as "time waiting for capacity" and "time waiting for messages".
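A sketch of the enqueue-time stamping (the enqueued field is an assumption about SeccNode's layout; pointers and other fields are omitted):

use std::time::Instant;

struct SeccNode<T> {
    value: Option<T>,
    enqueued: Option<Instant>,
}

/// Computes the time, in microseconds, the value has waited since enqueue;
/// the receive path would feed this into a rolling average.
fn time_in_channel<T>(node: &SeccNode<T>) -> Option<u128> {
    node.enqueued.map(|t| t.elapsed().as_micros())
}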
If the actor can't be scheduled, temporarily or before the timeout, it currently panics. It should handle this gracefully.
Currently actors go on forever until the system shuts down and that is obviously not optimal. The following capabilities need to be integrated:
A SystemMsg::Stop that an actor can process to shut down gracefully.
An ActorSystem::stop() to force the behavior above immediately.

This is an ergonomics issue as it's unnecessary when there is the module prefix if needed.
The API has changed, so the README must change as well.