petabridge / akkadotnet-code-samples
Akka.NET professional reference code samples
Home Page: http://petabridge.com/blog
License: Apache License 2.0
I have started here
I have started by setting up a FAKE build script. I also moved from NuGet to Paket.
Shutting down a node results in the cluster entering a state that it can't recover from. This affects all the nodes/roles including lighthouse. From what I can tell, this has been brought up in gitter chat room a number of times.
To recreate, follow the instructions described in the readme for how to start the crawl. Then kill the cluster node (or tracker), which results in errors across all the remaining services. Lighthouse starts reporting AssociationErrors, then it reports that the Leader can currently not perform its duties. Starting the crawler node again doesn't recover the cluster: LH reports that the node is joining, but it keeps showing AssociationErrors, and the solution isn't working anymore. The only way to recover from this is to stop and start everything again, including LH.
Shouldn't the cluster recover from failed nodes better? This seems too fragile.
I am running Win10 x64, VS 2015, IIS express. No HOCON changes, everything runs on the default ports and addresses.
I wasn't sure if this should be filed here or under akka.net github repo. I am happy to move it.
Here are the screenshots after the crawler node has been restarted:
In this doc, https://github.com/petabridge/akkadotnet-code-samples/tree/master/PipeTo, there is a section: "What are those TaskContinuationOptions.AttachedToParent & TaskContinuationOptions.ExecuteSynchronously flags you keep using on ContinueWith inside an actor?"
I stumbled on this commit where the | operator is used instead: akkadotnet/akka.net#1383
I think the | is correct, but I'm rusty with bit manipulation.
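For what it's worth, a small standalone snippet (assuming nothing beyond the BCL) confirms that `|` combines the two flags while `&` intersects their bits, which for these two disjoint flags yields TaskContinuationOptions.None:

```csharp
using System;
using System.Threading.Tasks;

class FlagsDemo
{
    static void Main()
    {
        // | sets the union of the flag bits -- the correct way to combine [Flags] values.
        var combined = TaskContinuationOptions.AttachedToParent
                     | TaskContinuationOptions.ExecuteSynchronously;

        // & keeps only the bits common to both operands; these two flags share
        // no bits, so the result is 0, i.e. TaskContinuationOptions.None.
        var intersected = TaskContinuationOptions.AttachedToParent
                        & TaskContinuationOptions.ExecuteSynchronously;

        Console.WriteLine(combined.HasFlag(TaskContinuationOptions.AttachedToParent));     // True
        Console.WriteLine(combined.HasFlag(TaskContinuationOptions.ExecuteSynchronously)); // True
        Console.WriteLine(intersected == TaskContinuationOptions.None);                    // True
    }
}
```

So passing the `&` form to ContinueWith silently passes None, i.e. no continuation options at all, which is why the commit switching to `|` looks right.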
I cloned master, and found that I could not compile because the packages.config references helios v1.4.3, which does not seem to be a published package.
I changed the version to 1.4.2, and was able to restore the package and compile. However, I got a runtime error saying that helios version 1.4.1 could not be found.
Changing the version down to 1.4.1 seemed to fix the problem completely.
Hey guys,
This crawler needs a per-domain throttle. As it stands, it can effectively mount a denial of service against the sites it crawls; it needs to incorporate a crawl delay.
I'm willing to work on this and am asking for feedback on the approach. We would need to coordinate the crawl delay between machines.
The abot design is worth a look with respect to crawl delays.
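For discussion's sake, here is a minimal single-node sketch of the bookkeeping involved. All names (CrawlDelayTracker included) are invented for this issue, not taken from the sample; coordinating this state across machines is the open question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a per-domain crawl-delay tracker; names are
// invented for illustration and are not part of the WebCrawler sample.
public class CrawlDelayTracker
{
    private readonly TimeSpan _delay;
    private readonly Dictionary<string, DateTime> _lastRequest = new Dictionary<string, DateTime>();

    public CrawlDelayTracker(TimeSpan delay)
    {
        _delay = delay;
    }

    // Returns how long the caller must wait before hitting this domain again.
    public TimeSpan TimeUntilAllowed(string domain, DateTime now)
    {
        if (!_lastRequest.TryGetValue(domain, out var last))
            return TimeSpan.Zero; // never seen this domain: no wait required
        var elapsed = now - last;
        return elapsed >= _delay ? TimeSpan.Zero : _delay - elapsed;
    }

    public void RecordRequest(string domain, DateTime now)
    {
        _lastRequest[domain] = now;
    }
}
```

In a clustered deployment this state would have to live at a single coordination point (e.g. the tracking service) so that all crawler nodes observe the same per-domain delays, which is exactly the coordination problem raised above.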
https://github.com/petabridge/akkadotnet-code-samples/tree/master/src/reliability/rabbitmq-backpressure isn't working right now: it's unable to connect to the RabbitMQ Docker container even when the container is started correctly.
Hello,
For my understanding: with Akka.NET it should be possible to create systems that are resilient and self-healing. With the WebCrawler example everything works as expected, but if the CrawlerService crashes and is afterwards restarted, nothing works anymore?
Do I have any configuration issue here or do I misunderstand the situation?
If we want to crawl correctly and politely, the crawler should honor the robots.txt file if it exists.
There is an existing open source Robots project that can analyze the robots.txt file and provides functionality to determine, given a URL, whether crawling it is allowed or not.
I looked at the abot design and this should borrow elements of finding and analyzing robots.txt from there.
Now, since the design of this crawler can span machines, the CrawlJob needs to have the contents of this file available at the TrackingService.DownloadMaster (in my opinion).
I'm willing to work on this and am asking for feedback on the approach.
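To make the discussion concrete, here is a deliberately minimal allow-check sketch. It only understands "User-agent: *" groups and prefix-based Disallow rules, and the RobotsSketch name is invented; the real work should lean on the existing Robots project or abot's implementation as suggested above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal illustrative sketch, not a compliant robots.txt parser.
public static class RobotsSketch
{
    public static bool IsAllowed(string robotsTxt, string path)
    {
        var disallows = new List<string>();
        var inWildcardGroup = false;
        foreach (var raw in robotsTxt.Split('\n'))
        {
            // Strip comments and surrounding whitespace from each line.
            var line = raw.Split('#')[0].Trim();
            if (line.StartsWith("User-agent:", StringComparison.OrdinalIgnoreCase))
            {
                // Only honor rules in the wildcard ("*") user-agent group.
                inWildcardGroup = line.Substring("User-agent:".Length).Trim() == "*";
            }
            else if (inWildcardGroup && line.StartsWith("Disallow:", StringComparison.OrdinalIgnoreCase))
            {
                var rule = line.Substring("Disallow:".Length).Trim();
                if (rule.Length > 0) disallows.Add(rule); // empty Disallow means "allow all"
            }
        }
        // A path is allowed unless some Disallow rule is a prefix of it.
        return !disallows.Any(path.StartsWith);
    }
}
```

Wildcard patterns, Allow rules, per-agent groups, and Crawl-delay directives are deliberately omitted here; a production implementation needs all of them.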
I have a .NET Core background service that processes messages from a RabbitMQ queue. The messages can grow to thousands during peak periods, so currently I have to manually instantiate each background service to process the messages one after the other.
My question is: how can I use the actor model to process the messages concurrently as each message arrives in the queue?
Please help.
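Not speaking for the maintainers, but a common pattern is to have the RabbitMQ consumer callback Tell each incoming message to an actor, and to bound concurrency with a router pool instead of manually instantiating background services. A rough Akka.NET sketch, with every name (QueueMessage, MessageProcessor, "processors") invented for illustration:

```csharp
using Akka.Actor;
using Akka.Routing;

// Hypothetical message envelope; not from any sample.
public sealed class QueueMessage
{
    public QueueMessage(string body) { Body = body; }
    public string Body { get; }
}

public class MessageProcessor : ReceiveActor
{
    public MessageProcessor()
    {
        Receive<QueueMessage>(msg =>
        {
            // Process a single message here. Each routee works through its own
            // mailbox sequentially, so per-actor state stays thread-safe, while
            // the pool as a whole handles up to 10 messages concurrently.
        });
    }
}

// At startup:
//   var system = ActorSystem.Create("consumer");
//   var pool = system.ActorOf(
//       Props.Create<MessageProcessor>().WithRouter(new RoundRobinPool(10)),
//       "processors");
// And in the RabbitMQ consumer's Received callback:
//   pool.Tell(new QueueMessage(messageBody));
```

This repo's reliability/rabbitmq-backpressure sample explores a related approach for keeping the queue from overwhelming the consumers.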
In the WebCrawler sample, the TrackingService implements an ApiMaster actor that uses a configured broadcast-group router to broadcast the FindRunningJob message. The configuration looks as follows:
deployment {
  /api/broadcaster {
    router = broadcast-group
    routees.paths = ["/user/api"]
    cluster {
      enabled = on
      max-nr-of-instances-per-node = 1
      allow-local-routees = on
      use-role = tracker
    }
  }
}
However, when trying this in a local setup with multiple instances of the tracking service running as part of the cluster (according to the Lighthouse logs, all the nodes have been detected correctly), only one message is ever sent. Also, the counter variable OutstandingAcknowledgements that the ApiMaster uses to keep track of the number of routees that responded is always set to 1.
I was also able to reproduce the same behavior in a scenario where the Sender actor was not part of the routees (as it is in the case of the ApiMaster), and I also tried removing the max-nr-of-instances-per-node setting in my own test project.
Could this be a bug in the router, or is there a problem with the configuration?
Per some of the feedback we received during the 2022 Akka.NET User Survey, we are investing in building more end-to-end samples that follow the "Pit of Success" paradigm - these samples don't feature complex business domains, but they will demonstrate how to correctly leverage Akka.NET infrastructure in a variety of combinations and fashions.
This includes:
If you have a specific sample you would like to see addressed, please file a new issue!
I am trying to follow your example but I am getting a bit lost...
Would it be possible to have a list of what each actor does, the normal flow of data between them, and how they are split once the application is deployed across multiple machines? This is my view (in order, and split by tier); I'm not sure I got everything right:
(WEB)
(SERVICE)
There are quite a lot of messages going on between actors, which makes it a bit difficult to follow (for me :-(), sorry.
My last question: if I wanted to have this distributed across 4 machines (2 crawlers, 1 lighthouse, and 1 web server), what would I need to change? I understand I would need to enter the lighthouse IP address on the other 3 machines, but can I leave the helios.tcp section as it currently is?
Sorry I had so many questions.
By the way, when I start 3 or 4 crawler services I start getting errors after a little bit. Is that normal? Sorry, some of the error message is in Spanish; it might just be a timeout.
[ERROR][12/05/2015 1:09:54][Thread 0017][[akka://webcrawler/system/transports/akkaprotocolmanager.tcp.0/akkaProtocol-tcp%3a%2f%2fwebcrawler%40127.0.0.1%3a12690-2]] Underlying transport closed.
Cause: Helios.Exceptions.HeliosConnectionException: A connection attempt failed because the connected party did not properly respond after a period of time, or an established connection failed because the connected host failed to respond ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or an established connection failed because the connected host failed to respond
   at System.Net.Sockets.Socket.Send(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at Helios.Net.Connections.TcpConnection.SendInternal(Byte[] buffer, Int32 index, Int32 length, INode destination)
And after a little while one of the services usually crashes with:
[WARNING][12/05/2015 1:10:34][Thread 0053][remoting] Association to [akka.tcp://webcrawler@127.0.0.1:12690] having UID [817718607] is irrecoverably failed. UID is now quarantined and all messages to this UID will be delivered to dead letters. Remote actorsystem must be restarted to recover from this situation.
I followed the steps from the "Running the sample" section. It says "0 routees". Am I missing anything?
The code in WebCrawler.TrackerService\Actors\IO\CrawlMaster.EndJob ignores its input parameter and hardcodes the status result.
Instead of
private void EndJob(JobStatus finalStatus)
{
    RunningStatus = RunningStatus.WithStatus(JobStatus.Finished).WithEndTime(DateTime.UtcNow);
    PublishJobStatus();
    Self.Tell(PoisonPill.Instance);
}
read
private void EndJob(JobStatus finalStatus)
{
    RunningStatus = RunningStatus.WithStatus(finalStatus).WithEndTime(DateTime.UtcNow);
    PublishJobStatus();
    Self.Tell(PoisonPill.Instance);
}
In akkadotnet-code-samples/PipeTo/src/PipeTo.App/Actors/HttpDownloaderActor.cs, the Initialize method prepares for receipt of the DownloadImage message, which invokes this code:
//asynchronously download the image and pipe the results to ourself
_httpClient.GetAsync(imageUrl).ContinueWith(httpRequest =>
{
    var response = httpRequest.Result;
    // ...
    return new ImageDownloadResult(image, response.StatusCode);
}, TaskContinuationOptions.AttachedToParent & TaskContinuationOptions.ExecuteSynchronously).PipeTo(Self);
The response variable is of the IDisposable type HttpResponseMessage, but the code makes no attempt to call Dispose(), so the resources it holds will not be released until the GC collects it [much] later, which ties up resources.
I believe this code should be wrapped in a using block to ensure the underlying managed/unmanaged resources are released promptly, to the benefit of performance. Something like this:
//asynchronously download the image and pipe the results to ourself
_httpClient.GetAsync(imageUrl).ContinueWith(httpRequest =>
{
    using (var response = httpRequest.Result)
    {
        // ...
        return new ImageDownloadResult(image, response.StatusCode);
    }
}, TaskContinuationOptions.AttachedToParent & TaskContinuationOptions.ExecuteSynchronously).PipeTo(Self);
As an aside, I notice that .NET Core 2.1 now sports a HttpClientFactory that better supports lifetime issues but I have not explored sufficiently to know if normal Framework (or any NuGet package) has equivalent support.
Presumably other Akka sample code (e.g. the full Akka Cluster suite) will exhibit similar challenges, but I haven't investigated. Thanks.
Hi @skotzko,
Here is an issue for the sample.
I've tried the RemoteDeploymentSample solution and noticed that in the case of a "normal" (hit any key) termination of the Deployer, the following exception appears:
[ERROR][16/07/2015 18:02:47][Thread 0021][[akka://DeployTarget/system/transports/akkaprotocolmanager.tcp.0/akkaProtocol-tcp%3a%2f%2fDeployTarget%40127.0.0.1%3a6202-1]] Underlying transport closed.
Cause: Helios.Exceptions.HeliosConnectionException: An existing connection was forcibly closed by the remote host ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.EndReceive(IAsyncResult asyncResult)
   at Helios.Reactor.Tcp.TcpProxyReactor.ReceiveCallback(IAsyncResult ar)
   --- End of inner exception stack trace ---
[ERROR][16/07/2015 18:02:47][Thread 0010][[akka://DeployTarget/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fDeployer%40localhost%3a6201-1/endpointWriter]] AssociationError [akka.tcp://DeployTarget@localhost:8090] <- akka.tcp://Deployer@localhost:6201: Error [Shut down address: akka.tcp://Deployer@localhost:6201] [
   at Akka.Remote.EndpointReader.HandleDisassociated(DisassociateInfo info)
   at Akka.Remote.EndpointReader.OnReceive(Object message)
   at Akka.Actor.UntypedActor.Receive(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Actor.ActorCell.ReceiveMessage(Object message)
   at Akka.Actor.ActorCell.Invoke(Envelope envelope)]
Cause: Unknown
[ERROR][16/07/2015 18:02:47][Thread 0010][akka://DeployTarget/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fDeployer%40localhost%3a6201-1] Shut down address: akka.tcp://Deployer@localhost:6201
Cause: Akka.Remote.ShutDownAssociation: Shut down address: akka.tcp://Deployer@localhost:6201 ---> Akka.Remote.Transport.InvalidAssociationException: The remote system terminated the association because it is shutting down.
   --- End of inner exception stack trace ---
   at Akka.Actor.ActorCell.HandleFailed(Failed f)
   at Akka.Actor.ActorCell.SystemInvoke(Envelope envelope)
How can I make it not throw an exception on Deployer termination?
I'm new to Akka.NET and am wondering what the remote deployment function provided by Akka is. Could you give me some advice? Thanks in advance.
My question is: does it support sending a batch of DLLs to the deploy target and dynamically updating the original ones to provide new features?
In the sample, the 'DeployTarget' references the 'Shared' project. Is it possible to remove this reference and still have it run?
From my understanding, the system works like this: the 'Deployer' sends a message to the target to activate an instance of EchoActor, instead of deploying the DLL containing 'EchoActor' to the target.
So, I am a new user of Akka.NET, and I'm trying to learn. I've done the bootcamp and I've read some documentation.
Then, I come to this repo, which has the mission statement "to provide a small number of exceptionally well-explained and documented examples".
So I open up CrawlerService.cs.
First thing I noticed is that there are no code comments. At all.
So I read one of the first lines
protected ActorSystem ClusterSystem;
public Task WhenTerminated => ClusterSystem.WhenTerminated;
Ok. So what does this do when terminated? There is no static method for handling the termination?
Similarly, the line
return CoordinatedShutdown.Get(ClusterSystem).Run();
Huh? Where did CoordinatedShutdown come from?
I mean, it is all in the documentation. And I'm definitely a novice. But, judging by the statement in the readme, the code in this repo should be rather self explanatory.
Instead, it seems to be another example on how to do as much as possible in as few lines of code as possible. Which is nice to demonstrate the power of a framework. But not so nice when trying to get a better understanding of how the framework should be used.
I'm not sure I understand this logic correctly, and maybe everything is fine.
In DownloadsMaster there is logic to find a tracker.
In the method 'HandleGetDownloadTracker' there is a check for whether the tracker already exists but is dead. This logic sends a "TrackerDead" message.
In the Receive handler there is no logic to return to the ready state:
So in this case a dead tracker will always lead to a Timeout.
So in the case of 3 dead trackers we will need to wait about 5 seconds to continue.
Or am I missing something?
This can help reduce the learning curve
In https://github.com/petabridge/akkadotnet-code-samples/tree/master/PipeTo/README.md it says "send the results to ourselves", whereas the closure relates to the source actor. I believe the comment should say "to originator" (there is no need to perform a closure on Self, as that won't ever change).
In akkadotnet-code-samples/PipeTo/src/PipeTo.App/Actors/FeedParserCoordinator.cs, the FeedParserCoordinator actor receives an ErrorParsingFeed message but ignores its Uri content and instead prints previous state leftovers (i.e. state from intervening activity during the async network operation).
instead of this
Receive<ErrorParsingFeed>(
    feed => SignalFeedProcessingFailure(string.Format("Error parsing feed {0}", _feedUri), _feedUri.ToString()));
IMHO code should be
Receive<ErrorParsingFeed>(
    feed => SignalFeedProcessingFailure(string.Format("Error parsing feed {0}", feed.Feed), feed.Feed.ToString()));
In the FeedParserCoordinator.cs:
protected override void PreRestart(Exception reason, object message)
{
    // Kill any child actors explicitly on shutdown
    Context.Stop(_httpDownloaderActor);
    Context.Stop(_feedParserActor);
    base.PreRestart(reason, message);
}
I thought all children get stopped by the system when the parent is stopped. What is the point of doing that explicitly?
this is a test issue...
I am having some problems communicating between actors in a cluster.
My test project has this structure below.
TestJob [C# Console Project]
MainProject [C# Console Project] //Note: I configured this service as a seed node. I didn't use lighthouse.
Note: I don't want to put actors in Shared project or Main project. The actors that are supposed to do a test job should be under "TestJob" project.
I already followed this article, http://getakka.net/docs/clustering/cluster-overview, and the video, and enabled Akka.Cluster based on the article. I am able to run both console projects, but when I try to "tell" from JobManagerActor to TestJobActor, it doesn't work. There is no error, but it doesn't work.
I have this config in MainProject.
actor {
  provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
  deployment {
    /TestJobAActor {
      router = consistent-hashing-group
      routees.paths = ["/user/TestJobAActor"]
      virtual-nodes-factor = 8
      cluster {
        enabled = on
        max-nr-of-instances-per-node = 2
        allow-local-routees = off
        use-role = backend
      }
    }
  }
}
Here is the code that I use for sending the message.
var backendRouter = Context.ActorOf(Props.Empty.WithRouter(new ClusterRouterGroup(new ConsistentHashingGroup("/user/TestJobAActor"),new ClusterRouterGroupSettings(10, false, "backend", ImmutableHashSet.Create("/user/TestJobAActor")))));
backendRouter.Tell("Yo yo!");
What am I missing? Thanks in advance.
Note: My test project with similar structure can be found here https://github.com/michaelsync/APMDemo . (VS2015 project)
One more question: Can we still use the actor selection when using cluster?
var actorSelection = Context.ActorSelection("akka.tcp://[email protected]:2553/user/BackEndJobAActor"); //This can come from Model
actorSelection.Tell("Yo yo!");
God I wish we had never left this in the sample.
I am getting this error
get-dockerip.sh: line 11: syntax error: unexpected end of file
It seems all issues that were open were closed one hour ago.
Was this an intentional action (ie, is the repo being archived), or did someone test their issue bot against the wrong endpoint?
The sample uses https://github.com/petabridge/akkadotnet-bootstrap and thus doesn't actually need to perform a DNS resolution on startup. Any special reason why it does?
is this production ready?
The hyperlink for "HTML Agility Pack" in akkadotnet-code-samples/Cluster.WebCrawler/README.md is outdated: HAP is now at http://html-agility-pack.net
Shouldn't this: "Technically, if you sent a list of 100 different feed URLs to the FeedValidatorActor then it would create 100 different FeedValidatorActor instances to process each feed in parallel."
read as
"Technically, if you sent a list of 100 different feed URLs to the FeedValidatorActor then it would create 100 different FeedCoordinatorActor instances to process each feed in parallel."
Hi,
I have downloaded the samples, and when I run the PipeTo example, at the end it shows two Dead Letter messages:
'Message DeathwatchNotification from akka://MyFirstActorSystem/user/feedValidator/b$ to akka://MyFirstActorSystem/user/feedValidator/b$ was not delivered. 1 dead letters encountered
There are two such messages. I am new to Akka.NET and could not find where and why this occurs.
I have included a Context.Watch in FeedValidatorActor to track the FeedParserCoordinator actor that is created by the validator, and included a Receive<Terminated> handler in the constructor of the FeedValidatorActor class with a SendMessage like this:
Receive<Terminated>(message => SendMessage(string.Format("Validator: {0}.", message)));
The Terminated Receive is activated once, as expected, but the two Dead Letter messages are still displayed.
How can I avoid these messages?
Thanks
Let's make this sucker scale on the Azures.
Hello!
I've changed the actor system name in Lighthouse\src\Lighthouse\akka.hocon:
lighthouse {
  actorsystem: "webcrawler" #POPULATE NAME OF YOUR ACTOR SYSTEM HERE
}
WebCrawler creates the router, but this router has no routees. I hardcoded all actor creation for simplification:
public static ActorSystem StartAkka()
{
    var config = HoconLoader.ParseConfig("web.hocon");
    SystemActors.ActorSystem = ActorSystem.Create("webcrawler", config.BootstrapFromDocker());
    var router = SystemActors.ActorSystem.ActorOf(Props.Empty.WithRouter(FromConfig.Instance), "tasker");
    var processor = SystemActors.CommandProcessor = SystemActors.ActorSystem.ActorOf(Props.Create(() => new CommandProcessor(router)), "commands");
    SystemActors.SignalRActor = SystemActors.ActorSystem.ActorOf(Props.Create(() => new SignalRActor(processor)), "signalr");

    var startJob = new StartJob(new CrawlJob(new Uri("https://github.com/", UriKind.Absolute), true), SystemActors.SignalRActor);
    processor.Tell(startJob);

    router.Ask<Routees>(new GetRoutees()).ContinueWith(tr =>
    {
        // Members.Count() == 0
        var grrr = new SignalRActor.DebugCluster(string.Format("{0} has {1} routees: {2}",
            router,
            tr.Result.Members.Count(),
            string.Join(",", tr.Result.Members.Select(y => y.ToString()))));
        return grrr;
    });

    return SystemActors.ActorSystem;
}
I am beginning to study Akka.Cluster. Could you help me find my mistake?
The example has a DownloadsMaster next to the ApiMaster. When starting a job, a request handled by the ApiMaster is made to find only one CrawlMaster for the particular domain. A similar process is started to find an existing DownloadsTracker.
What is the benefit of doing this, instead of having the DownloadsTracker directly under the CrawlMaster? Basically, two processes are looking to find a single actor doing something for a job.
Isn't it simpler to have ApiMaster (1) --- (many) CrawlMaster,
CrawlMaster (1) --- (1) DownloadsTracker,
and CrawlMaster (1) --- (1) DownloadCoordinatorRouter?
I am new to Akka.NET so please feel free to correct me if I am wrong.
I understand that Cluster.WebCrawler is just a sample project (not production code), but is it best practice to put all the logic in one main project? I think the main logic for crawling is in WebCrawler.TrackingService and Shared. As the micro-service architecture is being used in this sample, wouldn't it be better if code with different purposes were separated? What I worry about is that when the project gets bigger (e.g. adding new cluster uploaders), we might end up writing more logic in the tracker and more classes in the Shared project.