How to wait for all actors to process their inboxes in a test

In an asynchronous (typed) actor test I have to make sure that a specific message has been received by an actor before I send the next one. This is necessary because messages can reach the actor under test via several child actors (when sending the message directly the order would be guaranteed anyway).
Since the actor might only change its internal state and not signal to the outside world that the message has been received, I have to find another way to wait until the message is processed.
Is there a way to wait until all inboxes are empty? I think that ManualTime.timePasses(0.seconds) could do the job but I'm not sure and it slows down my tests considerably. Obviously I don't want to use Thread.sleep(...) because it doesn't really guarantee that all messages are processed and the tests would be even slower.
I also tried using a BehaviorInterceptor to figure out when the main actor has received a message but this is only possible when I know which messages the child actors send to the main actor. This should actually be transparent in the test so I'm looking for a generic way to assert that the actors are done with processing messages (since I control the scheduler, no timer messages are generated).

I found out this is actually pretty easy to accomplish: a CallingThreadDispatcher processes each actor message immediately on the calling thread, so delivery and processing of messages are deterministic. It is never necessary to wait for messages to be processed, because they are processed in the test thread before the tell (or !) call returns.
You can configure it for your test actor system like this in Akka Typed:
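import akka.actor.testkit.typed.scaladsl.{ActorTestKit, ActorTestKitBase}
import com.typesafe.config.ConfigFactory

// Run every actor on the calling (test) thread, so `tell` returns only after processing.
val config = ConfigFactory.parseString(
  """akka.actor.default-dispatcher {
    |  type = akka.testkit.CallingThreadDispatcherConfigurator
    |}""".stripMargin
)
val testKit = ActorTestKit(ActorTestKitBase.testNameFromCallStack(), config)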
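To illustrate why this removes the need to wait, here is a toy model of the idea in Python. It is only a sketch and not how Akka implements the dispatcher: CallingThreadActor and its tell method are hypothetical names, and it ignores details such as re-entrant sends to the same actor. It only shows that when the handler runs on the caller's thread, a message routed through a child has already reached the main actor when tell returns.

```python
class CallingThreadActor:
    """Toy actor whose messages are handled on the caller's thread, inside tell()."""

    def __init__(self, handler):
        self._handler = handler

    def tell(self, message):
        # No mailbox thread: the handler has finished by the time tell() returns.
        self._handler(message)


state = []
main = CallingThreadActor(state.append)
# The child forwards to the main actor, like the multi-hop setup in the question.
child = CallingThreadActor(lambda msg: main.tell(("via-child", msg)))

child.tell("first")
state_after_first = list(state)  # already contains the forwarded message
child.tell("second")
```

Because nothing runs in the background, the test can inspect the state directly after each tell, without sleeping or advancing time.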

Related

Create a kind of Serial Operation Queue by intentionally awaiting/blocking in a ReceiveAsync handler

Reading here:
https://petabridge.com/blog/akkadotnet-async-actors-using-pipeto/
The actor’s mailbox pushes a new message into the actor’s OnReceive method once the previous call to OnReceive exits.
Followed by
On a ReceiveActor, use ReceiveAsync<T> where T is the type of message this receive handler expects. From there you can use async and await inside the actor to your hearts’ desire.
However, there is a cost associated with this. While your actor awaits any given Task, the actor will not be able to process any other messages sent to it until it finishes processing the message in its entirety. (emphasis mine)
It seems to me that I can use this blocking quality to force an Actor to be a kind of serial operation queue. Yes, if the process crashes and the enqueued messages were not persisted, those messages will be lost. Assume that is OK, though; in my case it is even desirable. Are there any other reasons not to use an Actor like this?
Are there any other reasons not to use an Actor like this?
Your overall question has a flaw in its premise, but the short answer is that you should absolutely use Actors in this manner.
The flaw in your question is that you are referencing a blog post that is talking about using async and PipeTo. What you seem to be missing is that all Actors work this way, whether synchronous or asynchronous, and whether using PipeTo or not!
The whole idea of an Actor (at least in Akka.Net) is built around processing messages from a mailbox one at a time (a "Serial Operation Queue" as you called it).
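As a rough illustration of that one-at-a-time behaviour, here is a minimal sketch in Python rather than Akka.NET. SerialActor, tell and drain are hypothetical names made up for this example, and the model ignores supervision and error handling. It only shows a mailbox drained by a single worker, so a handler never runs concurrently with itself.

```python
import queue
import threading


class SerialActor:
    """Toy actor: one mailbox, one worker thread, one message at a time."""

    def __init__(self, handler):
        self._mailbox = queue.Queue()
        self._handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        # Enqueue and return immediately; the sender never blocks on processing.
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            try:
                self._handler(message)  # the next message waits until this returns
            finally:
                self._mailbox.task_done()

    def drain(self):
        # Block until every message enqueued so far has been handled.
        self._mailbox.join()


processed = []
actor = SerialActor(processed.append)
for i in range(5):
    actor.tell(i)
actor.drain()
```

The messages end up in processed in the order they were sent, which is the "serial operation queue" property the question relies on.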

How Akka.Net handles system faults during message processing

Suppose that one of the cluster nodes received a message and one of its actors started to process it. Somewhere in the middle, this node died for some reason. What will happen to the message? Will it be processed by another available node, or will it be lost?
By default Akka (like most other actor model frameworks) offers at-most-once delivery. This means that messages are sent to actors on a best-effort basis: if they don't reach the target, they won't be redelivered. This also means that if a message reached the target but the process handling it was interrupted before finishing, it won't be retried.
That being said, there are numerous ways to offer a redelivery between actors with various guarantees.
The simplest and least reliable is to use the Ask pattern in combination with a retry library, e.g. Polly. However, this won't help if the node on which the sender lives dies, simply because the messages are still stored only in memory.
A more reliable pattern is to put an event log or queue in front of your cluster (e.g. Azure Service Bus, RabbitMQ or Kafka). In this approach clients send requests via the bus/queue, and the first actor in the processing pipeline is responsible for picking them up. If some actor or node in the pipeline dies, the whole pipeline for that message is retried.
Another idea is to use the at-least-once delivery found in the Akka.Persistence module. It lets you use the event sourcing capabilities of persistent actors to persist messages. However, IMO it requires a bit of experience with Akka.
All of these approaches provide at-least-once delivery guarantees, which means that the same message may be sent to its destination more than once. This also means that your processing logic needs to account for that, either by being idempotent or by recognizing and removing duplicates on the receiver side.
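A minimal sketch of receiver-side deduplication, written in Python for brevity. It assumes the sender attaches a unique ID to every message; DeduplicatingReceiver, receive and the balance field are hypothetical names for illustration, not part of Akka.NET.

```python
class DeduplicatingReceiver:
    """Applies each message at most once, keyed by a sender-assigned message ID."""

    def __init__(self):
        self._seen_ids = set()
        self.balance = 0

    def receive(self, message_id, amount):
        # A redelivered message carries the same ID, so it is not applied again.
        if message_id in self._seen_ids:
            return False
        self._seen_ids.add(message_id)
        self.balance += amount
        return True


receiver = DeduplicatingReceiver()
deliveries = [("m1", 10), ("m2", 5), ("m1", 10)]  # "m1" is redelivered
applied = [receiver.receive(mid, amt) for mid, amt in deliveries]
```

Note that the set of seen IDs lives only in memory here. In a real system it would have to survive a restart (and be pruned at some point), otherwise a crash brings the duplicates back.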

Synchronizing dependent asynchronous functions in Objective-C

I am running into a race condition and have a few ideas for fixing it. I am new to threading, so my knowledge and research are limited. A large number of asynchronous calls can happen when a user receives certain messages from the server, and my design is poor because my objects depend on each other.
Let's say I have two methods:
- (void)addUser:(NSString *)name {
    // does some asynchronous work
}
- (void)messageUser:(NSString *)name {
    // does some more asynchronous work
}
If a user receives a message telling it to addUser "Ryan", it creates a thread and proceeds to look up Ryan and store him. However, if the application is suspended and the buffer of messages waiting to be received contains an addUser request followed by a messageUser request, a race condition occurs, because addUser takes longer to complete than messageUser. Thus, if messageUser is called and (in our example) "Ryan" has not been fully added, it throws an error.
What would be a possible solution to this issue? I looked into locks and semaphores. What I am trying to do is this: when messageUser receives a call, check that no thread is currently processing addUser. If there is none, proceed. Otherwise wait, then proceed after it has finished.
Well it depends on how the messages are being issued in the first place and what the async response events are.
If the operations have dependencies (ordering requirements) then perhaps a background serial queue would be appropriate? That is a simple way to ensure the messages are processed in order.
If the async operations take completion blocks, then you could have the completion block issue the request for the next operation to be performed, though you may not know about that ahead of time.
If you need to solve this in a more general way then you need some kind of system for tracking prerequisites so you can skip work items that don't have their prerequisites met yet. That probably means your own background thread that monitors a list of waiting tasks and receives notification of all task completions so it can scan for items waiting on that completion and issue them.
It seems really complicated though... I suspect you don't really have such strong async parallel processing requirements and a much simpler design would be just as effective. Given your situation where you are receiving messages from a server, I think a serial queue would be the best option. Then you can process messages in the order the server sent them and keep things simple.
// Do this once at app startup.
dispatch_queue_t queue = dispatch_queue_create("com.example.myapp", DISPATCH_QUEUE_SERIAL);

// Handle server responses.
dispatch_async(queue, ^{
    // Handle one server message here; blocks on a serial queue run one at a time, in order.
});
In reality, depending on how you connect to your server you might be able to just move the entire connection handling to the background queue and communicate with it via messages from the UI, and update the UI by dispatching to the dispatch_get_main_queue() which will be the UI thread.

What is considered "best practice" to test semantic race conditions in akka?

Consider the following actor:
class Stateful(worker: ActorRef) extends Actor {
  val queue = // immutable queue with details
  def receive = {
    case NewJob(details) => worker ! details
    case JobRejection(details) if sender == worker => // enqueue
    case JobRequest if sender == worker => // dequeue and send to worker
  }
}
This simple actor forwards all the jobs to its underlying worker. If the worker is too busy, it rejects the job and the job gets enqueued for later. At some point the worker is done and requests another job from the queue, and so on.
In order to test this actor I'm passing a fake worker which rejects the first job, so I can test if it is actually in the queue (there is a GetJobs message for this and the queue is immutable so no worries). After having the job rejected I scheduleOnce to send the JobRequest with a delay of 100 millis.
Now I send the job from my test suite, wait a little using the scheduleOnce technique and send the GetJobs message. If I'm lucky the job is in the queue. I repeat the procedure and this time the queue should be empty again. And sometimes it is.
Is there a better way to control the timing? Essentially there are three delays, which I have to tune manually. And there are no guarantees that this tuning is going to work on a different machine, or even on mine after adding another couple of such tests.
Instead of using a fake worker, use a TestProbe. Then you can use the standard TestKit methods on the probe and have the probe send messages back to the Stateful actor.
See the section on using Probes in the reference manual.

How to know when a set of RabbitMQ tasks are complete?

I am using RabbitMQ to have worker processes encode video files. I would like to know when all of the files are complete - that is, when all of the worker processes have finished.
The only way I can think to do this is by using a database. When a video finishes encoding:
UPDATE videos SET status = 'complete' WHERE filename = 'foo.wmv'
-- etc etc etc as each worker finishes --
And then to check whether or not all of the videos have been encoded:
SELECT count(*) FROM videos WHERE status != 'complete'
But if I'm going to do this, then I feel like I am losing the benefit of RabbitMQ as a mechanism for multiple distributed worker processes, since I still have to manually maintain a database queue.
Is there a standard mechanism for RabbitMQ dependencies? That is, a way to say "wait for these 5 tasks to finish, and once they are done, then kick off a new task?"
I don't want to have a parent process add these tasks to a queue and then "wait" for each of them to return a "completed" status. Then I have to maintain a separate process for each group of videos, at which point I've lost the advantage of decoupled worker processes as compared to a single ThreadPool concept.
Am I asking for something which is impossible? Or, are there standard widely-adopted solutions to manage the overall state of tasks in a queue that I have missed?
Edit: after searching, I found this similar question: Getting result of a long running task with RabbitMQ
Are there any particular thoughts that people have about this?
Use a "response" queue. I don't know any specifics about RabbitMQ, so this is general:
- Have your parent process send out requests and keep track of how many it sent.
- Make the parent process also wait on a specific response queue (that the children know about).
- Whenever a child finishes something (or can't finish for some reason), send a message to the response queue.
- Whenever numSent == numResponded, you're done.
Something to keep in mind is a timeout -- what happens if a child process dies? You have to do slightly more work, but basically:
- With every sent message, include some sort of ID, and add that ID and the current time to a hash table.
- For every response, remove that ID from the hash table.
- Periodically walk the hash table and remove anything that has timed out.
This is called the Request Reply Pattern.
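The bookkeeping described in that answer could look roughly like the following Python sketch. It leaves out RabbitMQ entirely and only models the hash table of outstanding requests. ReplyTracker and its methods are hypothetical names, and the injectable clock is an assumption added so the example runs deterministically.

```python
import time


class ReplyTracker:
    """Tracks outstanding requests by ID and expires those that never answer."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._pending = {}  # request ID -> time the request was sent
        self.timed_out = []

    def sent(self, request_id):
        self._pending[request_id] = self._clock()

    def responded(self, request_id):
        # Ignore late or duplicate replies for IDs that are no longer pending.
        self._pending.pop(request_id, None)

    def expire(self):
        # Walk the table and drop every request older than the timeout.
        now = self._clock()
        for request_id, sent_at in list(self._pending.items()):
            if now - sent_at > self._timeout:
                del self._pending[request_id]
                self.timed_out.append(request_id)

    def done(self):
        return not self._pending


fake_now = [0.0]  # controllable clock so the example is deterministic
tracker = ReplyTracker(timeout_seconds=30, clock=lambda: fake_now[0])
for video in ("a.wmv", "b.wmv", "c.wmv"):
    tracker.sent(video)
tracker.responded("a.wmv")
tracker.responded("b.wmv")
done_before_timeout = tracker.done()  # "c.wmv" is still outstanding
fake_now[0] = 31.0
tracker.expire()  # "c.wmv" never answered, so it is moved to timed_out
```

In a real setup the parent would call responded from its response-queue consumer and run expire on a timer; what to do with the timed-out IDs (retry, alert, give up) is a separate decision.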
Based on Brendan's extremely helpful answer, which should be accepted, I knocked up this quick diagram which may be helpful to some.
I have implemented a workflow where the workflow state machine is implemented as a series of queues. A worker receives a message on one queue, processes the work, and then publishes the same message onto another queue. Then another type of worker process picks up that message, etc.
In your case, it sounds like you need to implement one of the patterns from Enterprise Integration Patterns (that is a free online book) and have a simple worker that collects messages until a set of work is done, and then publishes a single message to a queue representing the next step in the workflow.
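Such a collecting worker might be sketched like this in Python. It is an illustration under assumptions, not a RabbitMQ client: BatchAggregator, expect and on_completed are hypothetical names, a plain list stands in for the next queue, and the batch state is kept only in memory.

```python
class BatchAggregator:
    """Collects completion messages and emits one follow-up message per finished batch."""

    def __init__(self, publish):
        self._publish = publish
        self._remaining = {}  # batch ID -> set of task IDs still outstanding

    def expect(self, batch_id, task_ids):
        self._remaining[batch_id] = set(task_ids)

    def on_completed(self, batch_id, task_id):
        outstanding = self._remaining.get(batch_id)
        if outstanding is None:
            return  # unknown or already finished batch
        outstanding.discard(task_id)
        if not outstanding:
            del self._remaining[batch_id]
            self._publish({"batch": batch_id, "status": "all-encoded"})


next_step_queue = []  # stands in for the queue of the next workflow step
aggregator = BatchAggregator(next_step_queue.append)
aggregator.expect("batch-1", ["foo.wmv", "bar.wmv"])
aggregator.on_completed("batch-1", "foo.wmv")
after_first = list(next_step_queue)  # nothing published yet
aggregator.on_completed("batch-1", "bar.wmv")
aggregator.on_completed("batch-1", "bar.wmv")  # duplicate completion is ignored
```

The encoding workers stay decoupled: they only publish "completed" messages, and this one small worker decides when the next step may start.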