Don't allow NServiceBus saga to be executed again

Is it possible to prevent a saga from being run again for the same correlation ID after it has been completed?
Currently NServiceBus just deletes the saga's row from the database.

If a message of a type that starts the saga arrives after an instance of that saga has already been completed, a new instance will be created. Messages of types that do not start the saga will be ignored.
So either the correlation ID has to be unique per saga instance, or the saga should not be completed until it is safe to do so.
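A minimal sketch of the latter option, assuming NServiceBus 7 and hypothetical StartOrder/OrderSaga names: instead of calling MarkAsComplete(), the saga sets a flag in its data, so the persisted instance survives and absorbs any repeated start message for the same correlation ID.

```csharp
using System.Threading.Tasks;
using NServiceBus;

public class StartOrder : ICommand
{
    public string OrderId { get; set; }
}

public class OrderSagaData : ContainSagaData
{
    public string OrderId { get; set; }
    public bool Finished { get; set; }
}

public class OrderSaga : Saga<OrderSagaData>,
    IAmStartedByMessages<StartOrder>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<OrderSagaData> mapper)
    {
        // Correlate on the business ID so a duplicate StartOrder finds this instance.
        mapper.ConfigureMapping<StartOrder>(m => m.OrderId).ToSaga(s => s.OrderId);
    }

    public Task Handle(StartOrder message, IMessageHandlerContext context)
    {
        if (Data.Finished)
            return Task.CompletedTask; // instance still exists; duplicate start is ignored

        // ... do the actual work, then:
        Data.Finished = true; // flag instead of MarkAsComplete(), so the row survives
        return Task.CompletedTask;
    }
}
```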

Related

How can I ensure that a message will be given to a specific consumer in a queue?

We are working in a microservice architecture and we are using RabbitMQ as a message broker. We want to avoid scenarios where the following happens:
An entity begins its creation, but it takes a while to finish.
The system decides that creation has taken too long and that the entity should be deleted due to a timeout, so it sends out a message to delete the entity, which is still being created.
The delete message gets consumed, and the system checks whether the entity exists but does not find it, because the entity is still in the process of being created.
The delete-message consumer returns an error due to not finding the entity.
How can we ensure that the delete message is consumed after the create message is finished in such a way that we do not block the consumption of other messages?
Let's say your entity creation timeout is N. The worker(s) responsible for creating entities should know about this timeout and should be able to cancel entity creation if N is reached. This isn't strictly necessary, but it sounds like your entity creation may be resource-intensive, so cancellation should be a feature you have.
If your workers know to cancel entity creation when timeout N is reached, then perhaps you don't even need the deletion message?
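As a minimal sketch of that idea (plain C#; CreateEntityAsync and the 5-minute value of N are made-up placeholders): the worker enforces the timeout itself, so a half-created entity never needs a separate delete message.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class EntityCreationWorker
{
    // Assumption: N is the same timeout the rest of the system uses.
    static readonly TimeSpan N = TimeSpan.FromMinutes(5);

    public static async Task RunAsync(Func<CancellationToken, Task> createEntityAsync)
    {
        using var cts = new CancellationTokenSource(N); // cancels automatically once N elapses
        try
        {
            await createEntityAsync(cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Creation overran the timeout: undo any partial work here so no
            // orphaned, half-created entity is left behind.
        }
    }
}
```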
If you keep the delete message, the workers processing it could do the following (a sketch follows the list):
First, ensure your queue has a Dead Letter Exchange configured
Consume the message, and try to delete the entity
If deletion succeeds, great, ack the message with RabbitMQ and you're done
If deletion fails, nack (reject) the message with RabbitMQ and set requeue to false. This will cause the message to be routed to the dead-letter exchange
A worker should consume from a queue bound to this dead-letter exchange. You could have a queue dedicated to re-trying entity deletions. When a worker consumes a message from this queue, it can re-try the deletion. If it fails, you can reject it again (after a delay, of course) and, if this queue has the same dead-letter settings, the same process will happen
Finally, ensure that your deletion workers respect the count field of the x-death header and only try a certain number of times to delete an entity. If the limit is exceeded, this should raise an error in your system
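A sketch of those steps with the RabbitMQ .NET client (queue and exchange names are made up; TryDeleteEntity stands in for the real deletion logic). The retry limit is read from the count field of the x-death header that RabbitMQ adds whenever it dead-letters a message. A similar consumer bound to the retry queue would re-try the deletions.

```csharp
using System;
using System.Collections.Generic;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

class DeleteEntityConsumer
{
    const long MaxAttempts = 5; // assumption: give up after 5 dead-letter round trips

    static void Main()
    {
        var factory = new ConnectionFactory { HostName = "localhost" };
        using var connection = factory.CreateConnection();
        using var channel = connection.CreateModel();

        // Dead-letter exchange plus a retry queue bound to it.
        channel.ExchangeDeclare("entity-dlx", ExchangeType.Fanout);
        channel.QueueDeclare("entity-delete-retry", durable: true, exclusive: false, autoDelete: false);
        channel.QueueBind("entity-delete-retry", "entity-dlx", routingKey: "");

        // The main queue routes rejected messages to the dead-letter exchange.
        channel.QueueDeclare("entity-delete", durable: true, exclusive: false, autoDelete: false,
            arguments: new Dictionary<string, object> { ["x-dead-letter-exchange"] = "entity-dlx" });

        var consumer = new EventingBasicConsumer(channel);
        consumer.Received += (sender, ea) =>
        {
            if (DeathCount(ea.BasicProperties) >= MaxAttempts)
            {
                channel.BasicAck(ea.DeliveryTag, multiple: false); // give up; surface an alert instead
                return;
            }
            if (TryDeleteEntity(ea.Body.ToArray()))
                channel.BasicAck(ea.DeliveryTag, multiple: false);
            else
                // requeue: false routes the message to the dead-letter exchange
                channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: false);
        };
        channel.BasicConsume("entity-delete", autoAck: false, consumer);
        Console.ReadLine();
    }

    // Reads the delivery count from the x-death header added on dead-lettering.
    static long DeathCount(IBasicProperties props)
    {
        if (props?.Headers != null
            && props.Headers.TryGetValue("x-death", out var raw)
            && raw is List<object> deaths && deaths.Count > 0
            && deaths[0] is Dictionary<string, object> death
            && death.TryGetValue("count", out var count))
            return (long)count;
        return 0;
    }

    static bool TryDeleteEntity(byte[] body) => true; // placeholder for the real deletion
}
```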

How to improve the performance of my NServiceBus Saga under load

I have a very simple Saga built with NSB7 using SQL Transport and NHibernate persistence.
The saga listens on a queue, and each message received runs through 4 handlers. These run in sequence, except that 2 of the handlers run in parallel; the last handler runs only once both parallel handlers are complete. The last handler writes a record to the DB.
Let's say each handler takes 1 second per message. When a new message is received, which starts the saga, the expected result is that the record is written to the DB 3-4 seconds later.
If the queue backs up with, say, 1000 messages, then once processing resumes it takes almost 2000 seconds before a new record is created by the last handler. Instead of each message running through the expected ~4 seconds of processing, the messages effectively bunch up in the initial handlers until the queue is emptied, then do the same in the next handler, and so on.
Any ideas on how I could improve the performance of this system under load, so that a constant stream of processed messages comes out the end, rather than messages bunching up and a long delay passing before a single new record comes out the other side?
There is documentation for saga concurrency issues: https://docs.particular.net/nservicebus/sagas/concurrency#high-load-scenarios
I still don't fully understand the issue though. Every message that instantiates a saga should create a record in the database after that message is processed, not after 1000 messages. How else would NServiceBus guarantee consistency?
Beyond that, you probably should not have a single message processed by 4 handlers. If it really needs to work like this, use publish/subscribe and create separate endpoints, as sketched below. The saga should be done with its processing as soon as possible, especially under high-load scenarios.
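A minimal sketch of that split (hypothetical names, NServiceBus 7 style): the saga does nothing but record state and publish an event; the heavy work lives in a plain handler that can be hosted and scaled out in a separate endpoint.

```csharp
using System.Threading.Tasks;
using NServiceBus;

public class StartWork : ICommand { public string WorkId { get; set; } }
public class WorkRequested : IEvent { public string WorkId { get; set; } }

public class WorkSagaData : ContainSagaData
{
    public string WorkId { get; set; }
}

public class WorkSaga : Saga<WorkSagaData>, IAmStartedByMessages<StartWork>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<WorkSagaData> mapper)
    {
        mapper.ConfigureMapping<StartWork>(m => m.WorkId).ToSaga(s => s.WorkId);
    }

    // Publish and return immediately; no heavy work (and no long row lock) in the saga.
    public Task Handle(StartWork message, IMessageHandlerContext context)
        => context.Publish(new WorkRequested { WorkId = message.WorkId });
}

// Hosted in a different endpoint so it can be scaled out independently.
public class WorkRequestedHandler : IHandleMessages<WorkRequested>
{
    public Task Handle(WorkRequested message, IMessageHandlerContext context)
    {
        // ... the expensive processing goes here ...
        return Task.CompletedTask;
    }
}
```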

How to make a Saga handler Reentrant

I have a task that can be started by the user, that could take hours to run, and where there's a reasonable chance that the user will start the task multiple times during a run.
I've broken the processing of the task up into smaller batches, but given the way the data looks, it's very difficult to tell what's still to be processed. I batch it using messages that each process a bite-sized chunk of the data.
I have thought of using a saga to control access to starting this process, with a saga property called Processing that I set at the start of the handler and unset at the end. The handler does some work and sends the messages that process the data. I check the value at the start of the handler, and if it's set, I just return.
I'm using Azure storage for saga storage, if it makes a difference for the next bit. I'm also using NSB 6.
I have a few questions though:
Is this the correct approach to re-entrancy with NSB?
When is a change to Saga data persisted? (and is it different depending on the transport?)
Following on from the above, if I set a saga value in a handler, wait a while, and then reset it to its original value, will it change the persistent storage at all?
This seems to be cross-posted in the Particular Software Google group:
https://groups.google.com/forum/#!topic/particularsoftware/p-qD5merxZQ
Sagas are very often used for such patterns. The saga instance would track progress and guard that the (sub)tasks aren't invoked multiple times, but it could also take action if the expected task(s) didn't complete in time.
The saga instance data is persisted after the message is processed, not each time a saga data property is updated. The logic you described would not work.
The correct way would be to have a saga that orchestrates your process and regular handlers that do the actual work.
In the saga handler method that starts the saga, check whether the saga was already created or already has the 'busy' status; if it doesn't, send a message to do the work. This guards that the task is only initiated once, and after that the saga is stored.
The handler can now do the actual task; when it completes, it can 'Reply' back to the saga.
When the saga receives the reply, it can start any follow-up task or raise an event, and it can also 'complete'.
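Put together, a minimal sketch of that orchestration (hypothetical names; NSB 6/7 async API): the saga only guards and tracks status, while a regular handler does the work and replies.

```csharp
using System.Threading.Tasks;
using NServiceBus;

public class StartTask : ICommand { public string TaskId { get; set; } }
public class DoWork : ICommand { public string TaskId { get; set; } }
public class TaskFinished : IMessage { public string TaskId { get; set; } }

public class TaskSagaData : ContainSagaData
{
    public string TaskId { get; set; }
    public bool Busy { get; set; }
}

public class TaskSaga : Saga<TaskSagaData>,
    IAmStartedByMessages<StartTask>,
    IHandleMessages<TaskFinished>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<TaskSagaData> mapper)
    {
        mapper.ConfigureMapping<StartTask>(m => m.TaskId).ToSaga(s => s.TaskId);
        mapper.ConfigureMapping<TaskFinished>(m => m.TaskId).ToSaga(s => s.TaskId);
    }

    public async Task Handle(StartTask message, IMessageHandlerContext context)
    {
        if (Data.Busy)
            return; // already initiated; ignore the duplicate start request

        Data.Busy = true; // persisted when this handler completes, not on assignment
        await context.Send(new DoWork { TaskId = message.TaskId });
    }

    public Task Handle(TaskFinished message, IMessageHandlerContext context)
    {
        MarkAsComplete(); // or kick off follow-up tasks / raise an event here
        return Task.CompletedTask;
    }
}

// Regular handler: does the actual long-running work and replies to the saga.
public class DoWorkHandler : IHandleMessages<DoWork>
{
    public async Task Handle(DoWork message, IMessageHandlerContext context)
    {
        // ... long-running work ...
        await context.Reply(new TaskFinished { TaskId = message.TaskId });
    }
}
```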
Optimistic concurrency control and batched sends
If two messages are received that create/update the same saga instance, only the first writer wins. The other will fail because of optimistic concurrency control.
However, if these messages are processed sequentially rather than in parallel, the second one will not hit that conflict and will be processed as well, unless the saga checks whether the instance is already initialized.
The following sample demonstrates this: https://github.com/ramonsmits/docs.particular.net/tree/azure-storage-saga-optimistic-concurrency-control/samples/azure/storage-persistence/ASP_1
The client sends two identical message bodies. The saga is launched, and only one message succeeds, due to optimistic concurrency control.
Due to retries, the second copy will eventually be processed too, but the saga checks its data for a field that it knows would normally be initialized by a message that 'starts' the saga. If that field is already initialized, it assumes the message has already been processed and just returns:
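A guard of that shape might look like this (a hypothetical fragment of a saga's start handler, using a made-up Initialized field that only the start message sets):

```csharp
public Task Handle(StartOrder message, IMessageHandlerContext context)
{
    if (Data.Initialized)
    {
        // The saga data was already initialized by an earlier copy of this
        // message, so this delivery is a duplicate: ignore it.
        return Task.CompletedTask;
    }

    Data.Initialized = true; // persisted when the handler completes
    // ... normal start-up logic ...
    return Task.CompletedTask;
}
```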
It also demonstrates batched sends: messages are not sent immediately, but only once all handlers/sagas have completed.
Saga design
The following video might help you with designing your sagas and understand the various patterns:
Integration Patterns with NServiceBus: https://www.youtube.com/watch?v=BK8JPp8prXc
Keep in mind that Azure Storage isn't transactional and does not provide locking; it is only atomic. Any work you do within a handler or saga can potentially be invoked more than once, and if you use non-transactional resources, make sure that logic is idempotent.
So, after a lot of testing, I don't believe that this is the right approach.
As Archer says, you can manipulate the saga data properties as much as you like; they are only saved at the end of the handler.
So if the saga receives two simultaneous messages the check for Processing will pass both times and I'll have two processes running (and in my case processing the same data twice).
The saga within a saga faces a similar problem too.
What I believe will work (and has done during my PoC testing) is using a unique database index to help out. I'm using Entity Framework and Azure SQL, so database access is not contained within the handler's transaction (this is the important difference between the database and the saga data). The database also operates across all instances of the endpoint, and generally this seems like a good solution.
The table that I'm using has a column for each part of the saga 'id', and there is a unique index across them.
At the beginning of the handler I retrieve a row from the database. If there is a row, the handler returns (in my case this is okay; in others you could throw an exception to get the handler to run again). The first thing the handler then does (before any work, although I'm not 100% sure that it matters) is write a row to the table. If the write fails (probably because the unique constraint was violated), the exception puts the message back on the queue. It doesn't really matter why the database write fails, as NSB will handle it.
Then the handler does the work.
Then it removes the row.
Of course there is a chance that something goes wrong during the processing of the work, so I'm also using a timestamp and another process to reset the row if it's been busy for too long. (Still need to define 'too long', though.)
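A minimal sketch of that guard with plain ADO.NET against SQL Server (table, column, and class names are made up; the InProgress table is assumed to have a unique index on TaskId): the handler calls TryAcquireAsync first, does the work only when it returns true, and calls ReleaseAsync at the end. The StartedAtUtc column is the timestamp the clean-up process would use.

```csharp
using System.Data.SqlClient;
using System.Threading.Tasks;

public class ReentrancyGuard
{
    readonly string connectionString;
    public ReentrancyGuard(string connectionString) => this.connectionString = connectionString;

    // Returns false if another handler instance already holds the row.
    public async Task<bool> TryAcquireAsync(string taskId)
    {
        using var conn = new SqlConnection(connectionString);
        await conn.OpenAsync();
        using var cmd = new SqlCommand(
            "INSERT INTO InProgress (TaskId, StartedAtUtc) VALUES (@id, SYSUTCDATETIME())", conn);
        cmd.Parameters.AddWithValue("@id", taskId);
        try
        {
            await cmd.ExecuteNonQueryAsync();
            return true;
        }
        catch (SqlException ex) when (ex.Number == 2627 || ex.Number == 2601)
        {
            return false; // unique index violated: someone else is already processing
        }
    }

    public async Task ReleaseAsync(string taskId)
    {
        using var conn = new SqlConnection(connectionString);
        await conn.OpenAsync();
        using var cmd = new SqlCommand("DELETE FROM InProgress WHERE TaskId = @id", conn);
        cmd.Parameters.AddWithValue("@id", taskId);
        await cmd.ExecuteNonQueryAsync();
    }
}
```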
Maybe this can help someone with a similar problem.

NServiceBus: How to archive completed or terminated sagas

NServiceBus removes saga data, at least in the RavenDB persistence store, when this.MarkAsComplete() is called from the saga itself.
Is there a built-in way to archive the Saga data when the Saga becomes completed or terminated? We need such a feature for traceability reasons.
You can put an internal flag in your saga data, set it to complete instead of calling MarkAsComplete, and check it in your (saga) handlers.
(This way you can restart a saga if you want, and your sagas will live forever.)
Does that make sense?
When using the rest of the Particular Service Platform, all actions on a saga get audited automatically, including the state that the saga was in when it completed.
ServiceInsight provides visualization of all of these state changes.

NServiceBus Sagas - At Least Once Delivery

Using NServiceBus with the NHibernate saga persister, how can I avoid duplicate sagas when it's possible for a message to be received more than once?
Here are some solutions that I've thought of so far:
Never call MarkAsComplete() so the deduplication is handled in the usual fashion by the saga itself.
Implement my own saga persister which stores the correlation ids for completed sagas so duplicate/additional messages are ignored.
The question is what would cause the message to be received multiple times - is it due to retries of the same message (like in the case where there was a deadlock in the DB)? Those kinds of retries (causing the same message to be "processed" multiple times) are already handled by the transactional nature of NServiceBus.
If the situation is due to the message being sent by some other endpoint multiple times, the recommendation would be to see what you could do to prevent that on the sending side. If that isn't possible, then yes, a saga that doesn't ever complete could serve as your filter.