Synchronizing dependent asynchronous functions in Objective-C

So I am running into a race condition and I have a few ideas on how to fix the issue. I am new to threading, so my understanding and research are limited. A large number of asynchronous calls can happen if a user receives certain messages from the server; as a result, my design is poor because of the dependencies between my objects.
Let's say I have two methods:

- (void)addUser:(NSString *)s {
    // does some asynchronous activity
}

- (void)messageUser:(NSString *)s {
    // does some more asynchronous activity
}
If a user were to receive a message telling it to addUser "Ryan", it would then create a thread and proceed with looking up Ryan and storing him. However, if the user has the application in suspended mode, and the buffer of messages waiting to be received contains both an addUser request and a messageUser request, a race condition occurs because addUser takes longer to complete than messageUser. Thus, if messageUser is called and (in our example) "Ryan" has not been fully added, it throws an error.
What would be a possible solution to this issue? I looked into locks and semaphores, and what I am trying to do is: when messageUser receives a call, check to make sure there is no thread currently processing addUser. If there is none, proceed; else wait, then proceed after it has finished.

Well it depends on how the messages are being issued in the first place and what the async response events are.
If the operations have dependencies (ordering requirements) then perhaps a background serial queue would be appropriate? That is a simple way to ensure the messages are processed in order.
If the async operations take completion blocks, then you could have the completion block issue the request for the next operation to be performed, though you may not know about that ahead of time.
If you need to solve this in a more general way then you need some kind of system for tracking prerequisites so you can skip work items that don't have their prerequisites met yet. That probably means your own background thread that monitors a list of waiting tasks and receives notification of all task completions so it can scan for items waiting on that completion and issue them.
It seems really complicated though... I suspect you don't really have such strong async parallel processing requirements and a much simpler design would be just as effective. Given your situation where you are receiving messages from a server, I think a serial queue would be the best option. Then you can process messages in the order the server sent them and keep things simple.
// do this once at app startup (passing NULL for the attributes gives you a serial queue)
dispatch_queue_t queue = dispatch_queue_create("com.example.myapp", NULL);

// handle server responses
dispatch_async(queue, ^{
    // handle the server message here, one at a time
});
In reality, depending on how you connect to your server, you might be able to move the entire connection handling onto the background queue and communicate with it via messages from the UI, updating the UI by dispatching back to dispatch_get_main_queue(), which is the UI thread.

Related

Create a kind of Serial Operation Queue by intentionally awaiting/blocking in a ReceiveAsync handler

Reading here:
https://petabridge.com/blog/akkadotnet-async-actors-using-pipeto/
The actor’s mailbox pushes a new message into the actor’s OnReceive method once the previous call to OnReceive exits.
Followed by
On a ReceiveActor, use ReceiveAsync<T>, where T is the type of message this receive handler expects. From there you can use async and await inside the actor to your heart's desire.
However, there is a cost associated with this. While your actor awaits any given Task, the actor will not be able to process any other messages sent to it until it finishes processing the message in its entirety. (emphasis mine)
It seems to me that I can use this blocking quality to make an Actor act as a kind of serial operation queue. Yes, if the process crashes and the enqueued messages were not persisted, those messages will be lost. Assuming that is OK, however (and in my case it is actually desirable), are there any other reasons not to use an Actor like this?
Are there any other reasons not to use an Actor like this?
Your overall question has a flaw in its premise, but the short answer is that you should absolutely use Actors in this manner.
The flaw in your question is that you are referencing a blog post that is talking about using async and PipeTo. What you seem to be missing is that all Actors work this way, whether synchronous or asynchronous, and whether using PipeTo or not!
The whole idea of an Actor (at least in Akka.Net) is built around processing messages from a mailbox one at a time (a "Serial Operation Queue" as you called it).
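As a rough illustration of that idea (not taken from the linked post; the ProcessItem message type and DoExpensiveWorkAsync are made-up names), a ReceiveActor that awaits inside ReceiveAsync<T> behaves like a serial operation queue:

using System;
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical message type for illustration.
public class ProcessItem
{
    public string Payload { get; set; }
}

// An actor processes one mailbox message at a time, so awaiting inside
// ReceiveAsync<T> effectively gives you a serial operation queue.
public class SerialWorker : ReceiveActor
{
    public SerialWorker()
    {
        ReceiveAsync<ProcessItem>(async msg =>
        {
            // The next mailbox message is not picked up until this handler,
            // including the awaited work, has completed.
            await DoExpensiveWorkAsync(msg.Payload);
        });
    }

    private Task DoExpensiveWorkAsync(string payload)
    {
        // Placeholder for the real asynchronous work.
        return Task.Delay(TimeSpan.FromSeconds(1));
    }
}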

RabbitMQ and Delivery Guarantees in Distributed Database Transaction

I am trying to understand what the right pattern is for dealing with RabbitMQ deliveries in the context of a distributed database transaction.
To make this simple, I will illustrate my ideas in pseudocode, but I'm in fact using Spring AMQP to implement these ideas.
Anything like
void foo(message) {
    processMessageInDatabaseTransaction(message);
    sendMessageToRabbitMQ(message);
}
Where, by the time we reach sendMessageToRabbitMQ(), processMessageInDatabaseTransaction() has successfully committed its changes to the database, or an exception has been thrown before reaching the message-sending code.
I know that for the sendMessageToRabbitMQ() I can use Rabbit transactions or publisher confirms to guarantee that Rabbit got my message.
My interest is in understanding what should happen when things go south, i.e. when the database transaction succeeded but the confirmation does not arrive after a certain amount of time (with publisher confirms), or the Rabbit transaction fails to commit (with Rabbit transactions).
Once that happens, what is the right pattern to guarantee delivery of my message?
Of course, having developed idempotent consumers, I have considered that I could retry the sending of the messages until Rabbit confirms success:
void foo(message) {
    processMessageInDatabaseTransaction(message);
    retryUntilSuccessful {
        sendMessagesToRabbitMQ(message);
    }
}
But this pattern has a couple of drawbacks I dislike. First, if the failure is prolonged, my threads will start to block here and my system will eventually become unresponsive. Second, what happens if my system crashes or shuts down? I will never deliver these messages, since they will be lost.
So, I thought, well, I will have to write my messages to the database first, in pending status, and then publish my pending messages from there:
void foo(message) {
    // the transaction commits, leaving the message in pending status
    processMessageInDatabaseTransaction(message);
}

#Poller(every="10 seconds")
void bar() {
    for(message in readPendingMessagesFromDbStore()) {
        sendPendingMessageToRabbitMQ(message);
        if(confirmed) {
            acknowledgeMessageInDatabase(message);
        }
    }
}
Possibly sending the messages multiple times if I fail to acknowledge the message in my database.
But now I have introduced other problems:
The need to do I/O against the database to publish a message that, 99% of the time, would have been published successfully right away without having to check the database.
The difficulty of getting the poller close to real-time delivery, since I have now added latency to the publication of the messages.
And perhaps other complications, like guaranteeing delivery of events in order, poller executions stepping on one another, multiple pollers, etc.
And then I thought I could make this a bit more complicated: publish from the database until I catch up with the live stream of events and then switch to publishing in real time, i.e. maintain a circular buffer of size b and, as I read pages from the database, check whether each message is already in the buffer; if so, switch to the live subscription.
At this point I realized that how to do this right is not exactly evident, so I concluded that I need to learn the right patterns for solving this problem.
So, does anyone have suggestions on the right way to do this correctly?
While RabbitMQ cannot participate in a truly global (XA) transaction, you can use Spring Transaction management to synchronize the Database transaction with the Rabbit transaction, such that if either update fails, both transactions will be rolled back. There is a (very) small timing hole where one might commit but not the other so you do need to deal with that possibility.
See Dave Syer's Javaworld Article for more details.
When Rabbit fails to receive a message (for whatever reason, but in my experience only because the service is down or unavailable) you should be in a position to catch an error. At that point, you can make a record of that failed attempt, and of any subsequent ones, in order to retry when Rabbit becomes available again. The quickest way of doing this is just logging the message details to a file and iterating over it to re-send when appropriate.
As long as you have that file, you've not lost your messages.
Once messages are inside Rabbit, and you have faith in the rest of the architecture, it should be safe to assume that messages will end up where they are supposed to be, and that no further persistence work needs doing at your end.
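The question is framed around Spring AMQP, but the file-based fallback described above is language-neutral; here is a minimal sketch of it in C# (PublishToRabbit and the log path are hypothetical placeholders, not part of any real API):

using System;
using System.IO;

public static class FailedMessageLog
{
    private const string LogPath = "failed-messages.log"; // hypothetical location

    public static void PublishOrRecord(string messageBody)
    {
        try
        {
            PublishToRabbit(messageBody);
        }
        catch (Exception)
        {
            // Broker down or unreachable: record the message so it is not lost.
            File.AppendAllLines(LogPath, new[] { messageBody });
        }
    }

    public static void RetryRecordedMessages()
    {
        if (!File.Exists(LogPath)) return;
        var pending = File.ReadAllLines(LogPath);
        File.Delete(LogPath);
        foreach (var body in pending)
        {
            PublishOrRecord(body); // anything that still fails is recorded again
        }
    }

    private static void PublishToRabbit(string messageBody)
    {
        // Placeholder for the real publish call (e.g. via your AMQP client).
        throw new NotImplementedException();
    }
}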

How to make a Saga handler Reentrant

I have a task that can be started by the user, that could take hours to run, and where there's a reasonable chance that the user will start the task multiple times during a run.
I've broken the processing of the task up into smaller batches, but the way the data looks it's very difficult to tell what's still to be processed. I batch it using messages that each process a bite sized chunk of the data.
I have thought of using a Saga to control access to starting this process, with a Saga property called Processing that I set at the start of the handler and then unset at the end of the handler. The handler does some work and sends the messages to process the data. I check the value at the start of the handler, and if it's set, then just return.
I'm using Azure storage for Saga storage, if it makes a difference for the next bit. I'm also using NSB 6
I have a few questions though:
Is this the correct approach to re-entrancy with NSB?
When is a change to Saga data persisted? (and is it different depending on the transport?)
Following on from the above, if I set a Saga value in a handler, wait a while and then reset it to its original value will it change the persistent storage at all?
This seems to be cross-posted in the Particular Software Google group:
https://groups.google.com/forum/#!topic/particularsoftware/p-qD5merxZQ
Sagas are very often used for such patterns. The saga instance would track progress and guard that the (sub)tasks aren't invoked multiple times, but it could also take action if the expected task(s) didn't complete or are overdue.
The saga instance data is stored after processing the message and not when updating any of the saga data properties. The logic you described would not work.
The correct way would be having a saga that orchestrates your process and having regular handlers that do the actual work.
In the saga Handle method that creates the saga, check whether the saga was already created or already has the 'busy' status; if it does not have this status, send a message to do some work. This guards that the task is only initiated once, and after that the saga is stored.
The handler can now do the actual task; when it completes, it can 'Reply' back to the saga.
When the saga receives the reply it can start any other follow-up task or raise an event, and it can also 'complete'.
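A rough NServiceBus 6 style sketch of that orchestration (all message, data and property names here are invented for illustration, not taken from the question):

using System.Threading.Tasks;
using NServiceBus;

// Hypothetical messages.
public class StartTask : ICommand { public string TaskId { get; set; } }
public class DoWork : ICommand { public string TaskId { get; set; } }
public class WorkCompleted : IMessage { public string TaskId { get; set; } }

public class TaskSagaData : ContainSagaData
{
    public string TaskId { get; set; }
    public bool Busy { get; set; }
}

public class TaskSaga : Saga<TaskSagaData>,
    IAmStartedByMessages<StartTask>,
    IHandleMessages<WorkCompleted>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<TaskSagaData> mapper)
    {
        mapper.ConfigureMapping<StartTask>(m => m.TaskId).ToSaga(s => s.TaskId);
        mapper.ConfigureMapping<WorkCompleted>(m => m.TaskId).ToSaga(s => s.TaskId);
    }

    public Task Handle(StartTask message, IMessageHandlerContext context)
    {
        // If this instance is already busy, the task was started before: do nothing.
        if (Data.Busy) return Task.CompletedTask;

        Data.Busy = true;
        // The saga only orchestrates; a regular handler does the actual work.
        return context.Send(new DoWork { TaskId = message.TaskId });
    }

    public Task Handle(WorkCompleted message, IMessageHandlerContext context)
    {
        MarkAsComplete();
        return Task.CompletedTask;
    }
}

// A regular handler does the long-running work and replies to the saga.
public class DoWorkHandler : IHandleMessages<DoWork>
{
    public async Task Handle(DoWork message, IMessageHandlerContext context)
    {
        // ... do the actual processing here ...
        await context.Reply(new WorkCompleted { TaskId = message.TaskId });
    }
}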
Optimistic concurrency control and batched sends
If two messages are received that create/update the same saga instance, only the first writer wins. The other will fail because of optimistic concurrency control.
However, if these messages are not processed in parallel but sequentially, both can end up being handled unless the saga checks whether the saga instance is already initialized.
The following sample demonstrates this: https://github.com/ramonsmits/docs.particular.net/tree/azure-storage-saga-optimistic-concurrency-control/samples/azure/storage-persistence/ASP_1
The client sends two identical message bodies. The saga is launched and only 1 message succeeds due to optimistic concurrency control.
Due to retries, the second copy will eventually be processed too, but the saga checks the saga data for a field that it knows would normally be initialized by a message that 'starts' the saga. If that field is already initialized, it assumes the message has already been processed and just returns.
It also demonstrates batched sends: messages are not actually sent until all handlers/sagas have completed.
Saga design
The following video might help you with designing your sagas and understand the various patterns:
Integration Patterns with NServiceBus: https://www.youtube.com/watch?v=BK8JPp8prXc
Keep in mind that Azure Storage isn't transactional and does not provide locking; it is only atomic. Any work you do within a handler or saga can potentially be invoked more than once, and if you use non-transactional resources, make sure that logic is idempotent.
So, after a lot of testing, I don't believe that this is the right approach.
As Archer says, you can manipulate the saga data properties as much as you like; they are only saved at the end of the handler.
So if the saga receives two simultaneous messages the check for Processing will pass both times and I'll have two processes running (and in my case processing the same data twice).
The saga within a saga faces a similar problem too.
What I believe will work (and has done during my PoC testing) is using a database unique index to help out. I'm using Entity Framework and Azure SQL, so database access is not contained within the handler's transaction (this is the important difference between the database and the saga data). The database will also operate across all instances of the endpoint and generally seems like a good solution.
The table that I'm using has each of the columns that make up the saga 'id', and there is a unique index on them.
At the beginning of the handler I retrieve a row from the database. If there is a row, the handler returns (in my case this is okay, in others you could throw an exception to get the handler to run again). The first thing that the handler does (before any work, although I'm not 100% sure that it matters) is to write a row to the table. If the write fails (probably because of the unique constraint being violated) the exception puts the message back on the queue. It doesn't really matter why the database write fails, as NSB will handle it.
Then the handler does the work.
Then remove the row.
Of course there is a chance that something happens during processing of the work, so I'm also using a timestamp and another process to reset it if it's busy for too long. (still need to define 'too long' though :) )
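A hedged sketch of that unique-index guard (assuming EF Core and NSB 6; MyDbContext, WorkLock, StartWork and SagaKey are all invented names, and the real table would hold whatever columns make up the saga 'id', with a unique index over them):

using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using NServiceBus;

public class StartWork : ICommand
{
    public string SagaKey { get; set; }
}

public class WorkLock
{
    public int Id { get; set; }
    public string SagaKey { get; set; }      // assumes a unique index on this column
    public DateTime StartedUtc { get; set; }
}

public class MyDbContext : DbContext
{
    // Connection configured via OnConfiguring or DI in the real application.
    public DbSet<WorkLock> WorkLocks { get; set; }
}

public class StartWorkHandler : IHandleMessages<StartWork>
{
    public async Task Handle(StartWork message, IMessageHandlerContext context)
    {
        using (var db = new MyDbContext())
        {
            // A row already exists: another run is in progress, so just return.
            if (await db.WorkLocks.AnyAsync(l => l.SagaKey == message.SagaKey))
                return;

            // Write the lock row before doing any work. If another instance won
            // the race, the unique index makes SaveChangesAsync throw, the message
            // goes back on the queue, and the check above catches it on retry.
            db.WorkLocks.Add(new WorkLock { SagaKey = message.SagaKey, StartedUtc = DateTime.UtcNow });
            await db.SaveChangesAsync();

            // ... do the actual work / send the batch messages here ...

            db.WorkLocks.Remove(await db.WorkLocks.SingleAsync(l => l.SagaKey == message.SagaKey));
            await db.SaveChangesAsync();
        }
    }
}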
Maybe this can help someone with a similar problem.

How do I get a list of the worker threads of NServiceBus

How do I get a list of the worker threads of NServiceBus? I need to register worker thread IDs in a database and then bind certain types of messages to a specific worker thread. The real goal is handling poison messages: I want to block all of the threads from handling poison messages except specified ones. There will be a separate service that manages the threads through the database.
I would not try to do that. It is almost sure to run into problems.
Of course, in order to get some sort of "identity" for each thread, you could place something like this in your message handler:
[ThreadStatic]
private static readonly Guid ThreadId = Guid.NewGuid();
But again, I wouldn't do that! The guids would change every time the endpoint was restarted, for one.
You could also query the list of threads direct from .NET and try to determine which ones were the message handling threads, but that sounds so scary I don't even want to go into it.
The real issue: Poison Message Handling
As your comment states, the real problem is that a poison message is REALLY poison. Not only is it failing, but it's taking so long to do so that it's really screwing up all the other threads!
Since you are able to identify these messages based on certain properties of the message, I would detect and throw an exception before the operation that times out. All the time.
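A hedged sketch of that fail-fast check (MyMessage, SomeProperty and the slow operation are hypothetical; the "poison" test stands in for whatever properties identify those messages in your system):

using System;
using System.Threading.Tasks;
using NServiceBus;

public class MyMessage : ICommand
{
    public string SomeProperty { get; set; }
}

public class MyMessageHandler : IHandleMessages<MyMessage>
{
    public async Task Handle(MyMessage message, IMessageHandlerContext context)
    {
        if (LooksPoisonous(message))
        {
            // Fail immediately instead of tying the thread up until the timeout.
            throw new InvalidOperationException("Known poison message, failing fast.");
        }

        await DoTheSlowOperation(message);
    }

    private static bool LooksPoisonous(MyMessage message)
    {
        // Replace with whatever message properties identify the poison messages.
        return message.SomeProperty == "known-bad-value";
    }

    private static Task DoTheSlowOperation(MyMessage message)
    {
        // Placeholder for the operation that normally times out.
        return Task.CompletedTask;
    }
}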
If you want to be able to test periodically to see if the issue has been fixed, you have a few options:
Test via other means, and return the messages to the source queue when it has been fixed.
Add an appSetting so that the quick-throw behavior is skipped when the config setting is enabled. Then periodically you can edit the config, restart the endpoint, see if it's fixed, and then switch it back if it isn't.
Create another message handler that maintains a counter, protected by a lock and initialized to zero. Send it a control message to say "Hey, try one now." Then your quick-throw behavior can decrement that value and allow one message through to see what happens. This is also dangerous, of course. Make sure your locking is tight, since you are now sharing this state between different message-processing threads.

How to know when a set of RabbitMQ tasks are complete?

I am using RabbitMQ to have worker processes encode video files. I would like to know when all of the files are complete - that is, when all of the worker processes have finished.
The only way I can think to do this is by using a database. When a video finishes encoding:
UPDATE videos SET status = 'complete' WHERE filename = 'foo.wmv'
-- etc etc etc as each worker finishes --
And then to check whether or not all of the videos have been encoded:
SELECT count(*) FROM videos WHERE status != 'complete'
But if I'm going to do this, then I feel like I am losing the benefit of RabbitMQ as a mechanism for multiple distributed worker processes, since I still have to manually maintain a database queue.
Is there a standard mechanism for RabbitMQ dependencies? That is, a way to say "wait for these 5 tasks to finish, and once they are done, then kick off a new task?"
I don't want to have a parent process add these tasks to a queue and then "wait" for each of them to return a "completed" status. Then I have to maintain a separate process for each group of videos, at which point I've lost the advantage of decoupled worker processes as compared to a single ThreadPool concept.
Am I asking for something which is impossible? Or, are there standard widely-adopted solutions to manage the overall state of tasks in a queue that I have missed?
Edit: after searching, I found this similar question: Getting result of a long running task with RabbitMQ
Are there any particular thoughts that people have about this?
Use a "response" queue. I don't know any specifics about RabbitMQ, so this is general:
Have your parent process send out requests and keep track of how many it sent
Make the parent process also wait on a specific response queue (that the children know about)
Whenever a child finishes something (or can't finish for some reason), send a message to the response queue
Whenever numSent == numResponded, you're done
Something to keep in mind is a timeout: what happens if a child process dies? You have to do slightly more work, but basically:
With every sent message, include some sort of ID, and add that ID and the current time to a hash table.
For every response, remove that ID from the hash table
Periodically walk the hash table and remove anything that has timed out
This is called the Request Reply Pattern.
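A minimal, broker-agnostic sketch of that bookkeeping (the class and member names are invented): record an ID and timestamp per request, remove it when the matching response arrives, and periodically expire anything that has timed out.

using System;
using System.Collections.Concurrent;

public class RequestTracker
{
    private readonly ConcurrentDictionary<Guid, DateTime> pending = new ConcurrentDictionary<Guid, DateTime>();
    private readonly TimeSpan timeout;

    public RequestTracker(TimeSpan timeout)
    {
        this.timeout = timeout;
    }

    // Call when sending a request; include the returned ID in the message.
    public Guid RecordRequest()
    {
        var id = Guid.NewGuid();
        pending[id] = DateTime.UtcNow;
        return id;
    }

    // Call when a message arrives on the response queue.
    public void RecordResponse(Guid id)
    {
        pending.TryRemove(id, out _);
    }

    // Call periodically; anything still pending after the timeout is dropped
    // (or could be re-queued or reported instead).
    public void ExpireTimedOut()
    {
        var now = DateTime.UtcNow;
        foreach (var entry in pending)
        {
            if (now - entry.Value > timeout)
            {
                pending.TryRemove(entry.Key, out _);
            }
        }
    }

    // When nothing is pending, every request has been answered or expired.
    public bool AllDone => pending.IsEmpty;
}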
Based on Brendan's extremely helpful answer, which should be accepted, I knocked up this quick diagram, which may be helpful to some.
I have implemented a workflow where the workflow state machine is implemented as a series of queues. A worker receives a message on one queue, processes the work, and then publishes the same message onto another queue. Then another type of worker process picks up that message, etc.
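A hedged sketch of one stage in that kind of queue-per-step workflow, assuming the pre-7.0 synchronous RabbitMQ.Client API for .NET (the queue names and the work itself are made up):

using System;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

class StageWorker
{
    static void Main()
    {
        var factory = new ConnectionFactory { HostName = "localhost" };
        using (var connection = factory.CreateConnection())
        using (var channel = connection.CreateModel())
        {
            channel.QueueDeclare("stage-1", durable: true, exclusive: false, autoDelete: false);
            channel.QueueDeclare("stage-2", durable: true, exclusive: false, autoDelete: false);

            var consumer = new EventingBasicConsumer(channel);
            consumer.Received += (sender, ea) =>
            {
                var body = ea.Body;
                // ... do this stage's work on the message here ...

                // Forward the same message to the next stage's queue.
                channel.BasicPublish(exchange: "", routingKey: "stage-2", basicProperties: null, body: body);
                channel.BasicAck(ea.DeliveryTag, multiple: false);
            };
            channel.BasicConsume("stage-1", autoAck: false, consumer: consumer);

            Console.ReadLine(); // keep the worker alive
        }
    }
}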
In your case, it sounds like you need to implement one of the patterns from Enterprise Integration Patterns (that is a free online book) and have a simple worker that collects messages until a set of work is done, and then processes a single message to a queue representing the next step in the workflow.