We are having an issue with recovery for messages originating from Sagas.
When a Saga sends a message for processing, the message handler can sometimes fail with an exception. We currently use a try/catch and when an exception is thrown, we "Reply" with a failed message to the Saga. The issue with this approach is that Recoverability retries don't happen since we are handling the error in the message handler.
My thought was to add custom logic to the pipeline: if the command message implements some special interface, the custom logic would send a failed-message response to the Saga when an exception occurs (after the retries fail). However, I'm not sure where to plug into the pipeline so that I can send messages after the retries fail.
Is this a valid approach? If not, how can I get failure messages from the handler back to the Saga after retries are exhausted?
You can use immediate dispatch so that the failure reply is sent right away instead of waiting for the handler to complete.
However, I would like to suggest an alternate approach. Why not create a Timeout in the saga? If the reply from the processing-handler isn't received within a certain TimeSpan, you take an alternate path. The processing-handler gets 5 minutes and if it doesn't respond within 5 minutes, we do something else. If it still responds after 6 minutes, we know we've already taken the alternate path (use a boolean flag or so and store that inside the saga data) and put aside the reply that arrived too late.
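The late-reply handling described above can be sketched without any framework code. This is only an illustration of the decision logic: `ProcessingSagaData`, `Outcome`, and `ProcessingSagaLogic` are hypothetical names, not NServiceBus types, and in a real saga the two methods would correspond to the reply handler and the timeout handler.

```csharp
using System;

// Hypothetical stand-in for the saga data; not an NServiceBus type.
public class ProcessingSagaData
{
    public bool TimedOut { get; set; }   // set once the alternate path was taken
    public bool Completed { get; set; }  // set once the reply arrived in time
}

public enum Outcome
{
    ContinueNormally,
    TakeAlternatePath,
    IgnoreLateReply,
    IgnoreStaleTimeout
}

public static class ProcessingSagaLogic
{
    // Called when the reply from the processing handler arrives.
    public static Outcome OnReply(ProcessingSagaData data)
    {
        if (data.TimedOut)
        {
            // The alternate path was already taken, so the reply is put aside.
            return Outcome.IgnoreLateReply;
        }
        data.Completed = true;
        return Outcome.ContinueNormally;
    }

    // Called when the timeout (for example, 5 minutes) fires.
    public static Outcome OnTimeout(ProcessingSagaData data)
    {
        if (data.Completed)
        {
            // The reply arrived first, so the timeout has nothing left to do.
            return Outcome.IgnoreStaleTimeout;
        }
        data.TimedOut = true;
        return Outcome.TakeAlternatePath;
    }
}
```

Because both flags live in the saga data, the outcome depends only on which of the two messages is processed first, not on how late the other one arrives.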
If you want to start a discussion based on this, check our community platform.
Related
I have a requirement for an endpoint to receive commands from a client and also to subscribe to events from another endpoint such as:
1- the received command is tried only once then sent to the error queue if an exception occurred
2- the received event is tried indefinitely until it is processed
Could MaxRetries be set differently depending on the message type?
The NServiceBus MaxRetries setting is meant to handle transient problems like deadlocks, so it is not really what you want for this scenario.
What you want is to use second-level retries (SLRs) to handle this situation.
To filter based on an exception type, have a look at http://andreasohlund.net/2012/09/26/disabling-second-level-retries-for-specific-exceptions/
Hope this helps!
I looked into the NSB source and noticed that MaxRetries can't take different values for different messages in NSB 3.3. Fortunately, you can override the class that forwards messages to the error queue and implement your own version: it checks whether the failed message is an event and, if so, sends it back to the current endpoint instead of forwarding it to the error queue.
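The decision such a custom failure handler has to make can be sketched as a small function. This is a framework-free illustration under the requirements stated in the question (commands tried once, events retried indefinitely); `MessageKind`, `FailureAction`, and `RetryPolicy` are hypothetical names and not part of NServiceBus.

```csharp
using System;

public enum MessageKind { Command, Event }

public enum FailureAction { SendBackToEndpoint, MoveToErrorQueue }

public static class RetryPolicy
{
    // Decides what to do with a message whose handler has just failed.
    // failedAttempts is the number of attempts made so far (at least 1).
    public static FailureAction Decide(MessageKind kind, int failedAttempts)
    {
        if (failedAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(failedAttempts));
        }

        // Events are retried indefinitely: send them back to the endpoint.
        if (kind == MessageKind.Event)
        {
            return FailureAction.SendBackToEndpoint;
        }

        // Commands are tried only once: the first failure goes to the error queue.
        return FailureAction.MoveToErrorQueue;
    }
}
```

Note that retrying an event indefinitely with no delay can keep the endpoint busy with a message that will never succeed, so a real implementation would probably add a delay between attempts.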
Hopefully this is a simple question, but I need to verify that my assumption is correct: if I send 4 messages in one batch send, and one of the 4 messages causes a fault and fails its retries in its handler, does that single message get forwarded to the error queue, or does the entire batch get placed into the error queue?
Common sense tells me that only the single message would be moved to the error queue, as the batch has been unwrapped and each message delegated to its handler.
The transaction boundary is the handler and therefore each message has its own set of retries. The only complexity to this is that if you are using a pipeline of message handlers you also have to consider that if any of the handlers fail for a given message, a retry will occur.
I have a situation where MaxRetries for my MSMQ endpoint is 5. After 5 attempts, NServiceBus sends the message to the error queue that I have defined. Now I want to perform some further action when this happens (I have to update the status of some processes to Error).
Is it possible to write a handler in my Saga class to read these error queues?
Thanks in Advance
Haris
If you are using 2.x, you may want to consider writing a separate endpoint that uses the error queue as its input queue. The downside is that the messages will be removed from the queue. Assuming you still want to keep them, you'll have to push them to a database or some other type of storage.
You could also write a Saga that polls the error queue to check for messages and updates the appropriate status. After each time you check the queue, you would need to request another Timeout.
In 3.0, you have more control over the exceptions, and can implement your own way to handle the errors. If you implement the interface IManageMessageFailures, you can do your work there.
As an alternative to the solutions provided by Adam, you can subscribe to the events that ServiceControl raises when a message is sent to the error queue. See the official documentation here: http://docs.particular.net/servicecontrol/contracts
Another approach would be the notification API as described here: http://docs.particular.net/nservicebus/errors/subscribing-to-error-notifications. It allows you to subscribe to certain events (not event messages) like "MessageSentToErrorQueue" directly on the endpoint, so you wouldn't need to consume the error queue.
I have an endpoint that has a message handler which does some FTP work.
Since the FTP process can take some time, I wrapped the FTP method in a TransactionScope with TransactionScopeOption.Suppress to prevent transaction timeout exceptions.
Doing this got rid of the timeout exceptions; however, the handler was fired 5 times
(retries is set to 5 in my config file).
The files were FTP'd OK, but they were FTP'd 5 times.
The handler looks like it is re-fired after 10 or 11 minutes.
Some test code looks as follows:
public void Handle(FtpMessage msg)
{
    using (TransactionScope t = new TransactionScope(TransactionScopeOption.Suppress))
    {
        FtpFile(msg);
    }
}
Any help would be greatly appreciated.
thanks.
If this truly is an FTP communication that cannot be completed within the transaction timeout, another approach would be to turn this into a Saga.
The Saga would be started by FtpMessage, and in the handler it would start the FTP work asynchronously, whether in another thread, or via another process, or whatever, and store enough information in saga data to be able to look up the progress later.
Then it would request a timeout from the TimeoutManager for however long makes sense. Upon receiving that timeout, it would look up the state in saga data, and check on the FTP work in progress. If it's completed, mark the saga as complete, if not, request another timeout.
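The check-and-reschedule step described above might look roughly like this. It is a framework-free sketch: `FtpSagaData`, `FtpState`, `FtpSagaLogic`, and the `lookUpProgress` delegate are hypothetical names, and in a real saga a non-null return value would translate into requesting another timeout, while null would mean marking the saga as complete.

```csharp
using System;

public enum FtpState { InProgress, Completed }

// Hypothetical saga data: just enough to look up the FTP work later.
public class FtpSagaData
{
    public string JobId { get; set; }
    public bool IsComplete { get; set; }
    public int ChecksPerformed { get; set; }
}

public static class FtpSagaLogic
{
    // Runs each time a timeout arrives. Returns the delay before the next
    // check, or null when the FTP work is done and the saga can complete.
    public static TimeSpan? OnTimeout(
        FtpSagaData data,
        Func<string, FtpState> lookUpProgress,
        TimeSpan checkInterval)
    {
        data.ChecksPerformed++;

        if (lookUpProgress(data.JobId) == FtpState.Completed)
        {
            data.IsComplete = true;
            return null; // nothing more to schedule
        }

        return checkInterval; // still running: check again later
    }
}
```

Each timeout is handled as its own short message, so no single handler invocation has to stay open for the duration of the FTP transfer.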
Alternatively, you could have a process wrapping the FTP communication that hosts its own Bus but does not have any message handlers of its own. It could receive its FTP information via the command line (including requesting endpoint), do its work, and then send a message back to the requesting endpoint saying it is complete. Then you would not have to wait for a timeout to move on with the process.
I'd recommend configuring that endpoint as non-transactional rather than trying to suppress the transaction. Do that by including .IsTransactional(false) in your initialization code if self-hosting or by implementing IConfigureThisEndpoint, AsA_Client when using the generic host.
My guess is that by not completing the inner scope you're causing the outer scope, created by NSB, to roll back. This will cause NSB to retry your FtpMessage.
Try to add: t.Complete(); after your call to FtpFile and see if that does it for you.
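Applied to the handler from the question, that suggestion would look roughly like the sketch below. `FtpMessage` and `FtpFile` are the asker's names, stubbed here so the example is self-contained; the `SawAmbientTransaction` and `FilesSent` members exist only to make the behaviour observable. Whether the missing `Complete()` call is really what triggers the retries is only a guess, as stated above.

```csharp
using System;
using System.Transactions;

// Stub for the asker's message type.
public class FtpMessage
{
    public string FileName { get; set; }
}

public class FtpHandler
{
    public bool SawAmbientTransaction { get; private set; }
    public int FilesSent { get; private set; }

    public void Handle(FtpMessage msg)
    {
        // Suppress hides any ambient transaction from the code inside the scope.
        using (var t = new TransactionScope(TransactionScopeOption.Suppress))
        {
            FtpFile(msg);
            t.Complete(); // the change suggested in the answer
        }
    }

    // Stub for the asker's FTP method: records whether a transaction is visible.
    private void FtpFile(FtpMessage msg)
    {
        SawAmbientTransaction = Transaction.Current != null;
        FilesSent++;
    }
}
```

Inside the suppressed scope `Transaction.Current` is null, so the FTP work itself is not enlisted in the outer transaction, but the outer transaction's own timeout keeps running while the handler executes.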
Edit: After rereading your question I realized that this won't solve your timeout issue. Have you tried increasing the timeout? (10 minutes is the default maximum, set by maxTimeout in machine.config, so you can't set it higher without modifying machine.config.)
NServiceBus provides a timeout mechanism. From nservicebus.com:
The RequestTimeout method on the base class tells NServiceBus to send a message to another endpoint which will durably keep time for us ... There's a process that comes with NServiceBus called the Timeout Manager which provides a basic implementation of this functionality.
When time is up, the Timeout Manager sends a message back to the saga causing its Timeout method to be called with the same state object originally passed.
As I see it there is a possibility that the timeout is triggered even though the message has been delivered to the receiver (the reply got stuck somewhere for example).
How do I design my application so that it behaves correctly regardless of whether the message made it to the receiver?
If the Client sends a message to the Server and then requests a Timeout, the state of the request will be stored. If the Client receives the Timeout message before the reply from the Server, you can compare the state returned by the Timeout to the current state, see that the Server has not replied, and decide what to do. If the request is no longer valid, you might ignore the reply. If that is the case, you may want to look at the "TimeToBeReceived" attribute for the Server message: messages that cannot be received within the designated time are discarded.