Can timeouts be retried like messages? - nservicebus

We have a long-running saga process which includes timeouts. When a timeout kicks in, the first thing that happens is a call to an external data source.
We're wondering which approach is more appropriate. One option is to query the source directly from the timeout and have the timeout hit the error queues if the source is down (or if some other issue comes up). The other is to have the timeout create and send a message, whose handler queries the source (so that the message hits the error queues if a problem arises) and then replies back to the original sender.
We feel our NServiceBus code is overly complex and are looking for ways to simplify it, and we're wondering if this is a good chance to do so.
public void Timeout(TimeoutEvent timeoutEvent)
{
    // Delegate the external call to a separate message handler.
    bus.Send(new ExternalServiceCallCmd());
}
public void Handle(ExternalServiceCallCmd cmd)
{
    manager.CallToExternalService();
}
If the call to the external service fails, the ExternalServiceCallCmd gets retried and eventually ends up on the error queues.
We're wondering if we can simplify like this:
public void Timeout(TimeoutEvent timeoutEvent)
{
    manager.CallToExternalService();
}
If the call to the external service fails, the TimeoutEvent would then be retried and end up on the error queues if necessary.

Well, the first rule of sagas is: don't do I/O in a saga handler; send commands to a worker handler instead.
Other than interacting with its own internal state, a saga should not access a database, call out to web services, or access other resources - neither directly nor indirectly by having such dependencies injected into it.
Full details are provided here: https://docs.particular.net/nservicebus/sagas/#accessing-databases-and-other-resources-from-a-saga
Does that answer your question?
Please feel free to contact support at support@particular.net for more details :-)

Related

How to handle network time-out exception with rabbit mq while sending messages asynchronously using spring-amqp library

I have written a program that involves interaction between multiple queues: the consumer of one queue writes messages to another queue, and the same program has a consumer that acts on that second queue.
Problem: how do I handle network time-out issues with the queue while sending messages asynchronously using the spring-amqp library? Alternatively, RabbitTemplate.send() must throw an exception if there are network issues.
Currently, I use RabbitTemplate.send(), which returns immediately and works fine. But if the network is down, send() still returns immediately without throwing any exception, and the client code assumes success. As a result, the DB is left in an inconsistent state, recording that the message was processed successfully. Please note that the call to send() is wrapped inside a transactional block, and the goal is that if writing to the queue fails, the DB commit must also roll back. I am exploring the following solutions, but without success:
Can we configure RabbitTemplate to throw a run-time exception on any network connectivity issue, so that the calling client is notified? Please suggest how to do this.
Should we use the synchronous sendAndReceive() call, even though it delays processing? Another problem observed with this function: my consumer code gets notified while sendAndReceive() is still blocked writing the message to the queue. Please advise whether we can delay the notification until sendAndReceive() has returned. The call to sendAndReceive() did throw an AMQP exception when the network was down, which we were able to catch, but it comes with a performance cost.
My application is multi-threaded. If multiple threads send messages using sendAndReceive(), how does the spring-amqp library manage queue communication? Does it internally create a channel per request? If messages are delivered via the same channel, it would significantly impact performance for a multi-threaded application.
Can someone share sample code for using sendAndReceive() with best practices?
Is there any function in the spring-amqp library to check the health of the RabbitMQ server before calling send()? I explored rabbitTemplate.isRunning() but am not getting a proper result. If any specific configuration is required, please suggest it.
Is there any other solution to consider for guaranteed message delivery, or for handling network time-out issues by throwing runtime exceptions to the client?
As per Gary's comment below, I have set rabbitTemplate.setChannelTransacted(true), and it makes the call synchronous. The next part of the problem is that if I have a transaction block on the outer method, the call to RabbitTemplate.send() returns immediately. I expect the transaction block of the outer function to wait for the inner function to return; otherwise, I don't get the expected result, as my DB changes are persisted even though we set setChannelTransacted to true. I tried various transaction propagation levels, but without success. Please advise if I am doing anything wrong, and review the transaction propagation settings below:
@Transactional
public void notifyQueueAndDB(DBRequest dbRequest) {
    logger.info("Updating Request in DB");
    dbService.updateRequest(dbRequest);
    // Below is the call to the RabbitMQ library
    mqService.sendMessage(dbRequest); // If sendMessage fails because of a network outage, I want the DB commit to be rolled back as well.
}
MQService is defined in another library of the project; snippet below.
@Transactional(propagation = Propagation.NESTED)
private void sendMessage(......) {
    try {
        ....
        rabbitTemplate.send(this.queueExchange, queueName, amqpMessage);
    } catch (Exception exception) {
        throw exception;
    }
}
Enable transactions so that the send is synchronous.
or
Use Publisher confirms and wait for the confirmation to be received.
Either one will be quite a bit slower.
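To illustrate what "wait for the confirmation" means for the calling code, here is a minimal Java sketch. It deliberately does not use spring-amqp: the ConfirmingSender class is hypothetical, and the CompletableFuture merely stands in for the broker's asynchronous ack. The idea it shows is that the send path blocks until an ack arrives or a timeout elapses, and throws otherwise, so that an enclosing transaction has a failure to roll back on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a publisher whose broker acks arrive asynchronously.
final class ConfirmingSender {

    // Blocks until the confirm future completes or the timeout elapses.
    // Throws if the broker nacks, the wait times out, or the publish fails,
    // so the caller sees a failure instead of a silent success.
    static void sendAndAwaitConfirm(CompletableFuture<Boolean> confirm, long timeoutMillis) {
        try {
            boolean acked = confirm.get(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!acked) {
                throw new IllegalStateException("Broker rejected the message (nack)");
            }
        } catch (TimeoutException e) {
            throw new IllegalStateException("No confirm within " + timeoutMillis + " ms", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Publish failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for confirm", e);
        }
    }
}
```

Because the failure surfaces as an unchecked exception on the calling thread, a surrounding transactional method would see it before committing. The blocking wait is also where the extra latency mentioned above comes from.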

MessageBus: wait when processing is done and send ACK to requestor

We work with external TCP/IP interfaces, and one of the requirements is to keep the connection open, wait until processing is done, and send an ACK with the results back.
What would be the best approach to achieve that, assuming we want to use a message bus (MassTransit/NServiceBus) for communication with the processing module and for tracing message states: received, processing, succeeded, failed?
Specifically, when a message arrives at the handler/consumer, how will it know about the TCP/IP connection? Should I store it in some custom container and inject it into the consumer?
Any guidance is appreciated. Thanks.
The consumer will know how to initiate and manage the TCP connection lifecycle.
When a message is received, the handler can invoke the code which performs some action based on the message data. Whether this involves displaying a green elephant on a screen somewhere or opening a port, making a call, and then processing the ACK, does not change how you handle the message.
The actual code which is responsible for performing the action could be packaged into something like a nuget package and exposed over some kind of generic interface if that makes you happier, but there is no contradiction with a component having a dual role as a message consumer and processor of that message.
A new instance of the consumer will be created for each message received. Also, in my case, the consumer can't initiate the TCP/IP connection; it has already been opened earlier (and stored somewhere else), and the consumer just needs access to use it.
Sorry, I should have read your original question more closely.
There is a solution to shared access to a resource from NServiceBus, as documented here.
public class SomeEventHandler : IHandleMessages<SomeEvent>
{
    private IMakeTcpCalls _caller;
    public SomeEventHandler(IMakeTcpCalls caller)
    {
        _caller = caller;
    }
    public Task Handle(SomeEvent message, IMessageHandlerContext context)
    {
        // Use the caller
        var ack = _caller.Call(message.SomeData);
        // Do something with ack
        ...
        return Task.CompletedTask;
    }
}
You would ideally have a DI container manage the lifecycle of the IMakeTcpCalls instance as a singleton (though this might get weird in high-volume scenarios), so that you can reuse the open TCP channel.
E.g., in Castle Windsor:
Component.For<IMakeTcpCalls>().ImplementedBy<MyThreadsafeTcpCaller>().LifestyleSingleton();
Castle Windsor integrates with NServiceBus.

In pub/sub model, how to make Subscriber pause processing based on some external state?

My requirement is to make the Subscriber pause processing the messages depending on whether a web service is up or not. So, when the web service is down, the messages should keep coming to the subscriber queue from Publisher and keep piling up until the web service is up again. (These messages should not go to the error queue, but stay on the Subscriber queue.)
I tried to use unsubscribe, but then the publisher stops sending messages, as unsubscribing seems to clear the subscription info in RavenDB. I have also tried setting the MaxConcurrencyLevel on the Transport class, but if I set the worker threads to 0, the messages coming to the Subscriber go directly to the error queue. Finally, I tried Defer, which seems to put the current message in the audit queue, create a clone of the message, and send it locally to the subscriber queue when the timeout completes. However, since I have to keep checking the status of the service and keep deferring, I cannot control the order of messages, as I cannot predict when the web service will be up.
What is the best way to achieve the behavior I have explained? I am using NServiceBus version 4.5.
It sounds like you want to keep trying to handle a message until it succeeds, and not shuffle it back in the queue (keep it at the top and keep trying it)?
I think your only pure-NSB option is to tinker with the MaxRetries setting, which controls First Level Retries: http://docs.particular.net/nservicebus/msmqtransportconfig. Setting MaxRetries to a very high number may do what you are looking for, but I can't imagine doing so would be a good practice.
Second Level Retries will defer the message for a configurable amount of time, but IIRC will allow other messages to be handled from the main queue.
I think your best option is to put retry logic into your own code. So the handler can try to access the service x number of times in a loop (maybe on a delay) before it throws an exception and NSB's retry features kick in.
Edit:
Your requirement seems to be something like:
"When a MyEvent comes in, I need to make a web service call. If the web service is down, I need to keep trying X number of times at Y intervals, at which point I will consider it a failure and handle a failure condition. Until I either succeed or fail, I will block other messages from being handled."
You have some potentially complex logic on handling a message (retry, timeout, error condition, blocking additional messages, etc.). Keep in mind the role that NSB is intended to play in your system: communication between services via messaging. While NSB does have some advanced features that allow message orchestration (e.g. sagas), it's not really intended to be used to replace Domain or Application logic.
Bottom line, you may need to write custom code to handle your specific scenario. A naive solution would be a loop with a delay in your handler, but you may need to create a more robust in-memory collection/queue that holds messages while the service is down and processes them serially when it comes back up.
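The "loop with a delay" idea can be sketched roughly as follows. This is a generic Java illustration of the pattern, not NServiceBus code: the BoundedRetry helper and its parameters are hypothetical names. It retries a call a fixed number of times, sleeps between failed attempts, and rethrows the last failure so that the framework's own retry features can kick in afterwards.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: retries a call a bounded number of times with a fixed delay.
final class BoundedRetry {

    // Runs the action up to maxAttempts times, sleeping delayMillis after each
    // failed attempt except the last. Rethrows the last failure when all attempts fail.
    static <T> T call(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // blocks the handler thread while waiting
                }
            }
        }
        throw last;
    }
}
```

Note that the sleep blocks the thread that is handling the message, which is exactly what keeps later messages from being processed on that thread. That matches the stated requirement, but it also means throughput drops to zero while the service is down.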
The easiest way to achieve something close to the required behavior is the following:
Define one message handler which checks whether the service is available and, if not, calls HandleCurrentMessageLater, and a second message handler which does the actual message processing. Then specify the message handler order so that the handler which checks service availability gets executed first.
public interface ISomeCommand {}
public class ServiceAvailabilityChecker : IHandleMessages<ISomeCommand> {
    public IBus Bus { get; set; }
    public void Handle(ISomeCommand message) {
        try {
            // check service
        }
        catch (SpecificException ex) {
            this.Bus.HandleCurrentMessageLater();
        }
    }
}
public class ActualHandler : IHandleMessages<ISomeCommand> {
    public void Handle(ISomeCommand message) {
    }
}
public class SomeCommandHandlerOrdering : ISpecifyMessageHandlerOrdering {
    public void SpecifyOrder(Order order) {
        order.Specify(First<ServiceAvailabilityChecker>.Then<ActualHandler>());
    }
}
With that design you gain the following:
You can check the availability before the actual business code is invoked.
If the service is not available, the message is put back into the queue.
If the service passes the check but becomes unavailable just before the ActualHandler is invoked, you get First and Second Level Retries (and the availability check runs again in the pipeline).

How to tell NServiceBus to retry a message?

I have a process whereby an admin must be alerted and the message automatically retried if some business logic is not met.
Currently, I throw an Exception to force NServiceBus to retry the message.
I have a feeling this is not what I am supposed to do. Is this the proper way of doing it?
public void Handle(ImportantCmd message)
{
    // do some awesome business logic here
    // ...a business rule is not met...
    // send email alert in case of error
    Bus.Publish<SendEmailCmd>(email =>
    {
        email.To = "pooradmin@awesomecompany.com";
        email.Title = "Important title";
        email.Body = "Important message";
    });
    // then force NServiceBus to retry
    throw new Exception("Blah blah...., retrying this message.");
}
Update: I would like an admin to be alerted whenever some condition is not met, and he/she should be able to see all messages that are affected (perhaps in a dedicated queue?) and possibly retry them.
Basically, our service depends on an external service. This external service could occasionally return an erroneous response (but if we retry, it might work). That is why I am alerting the admin and at the same time retrying the messages.
Given your update (I'm assuming the admin will not alter the message), I would say you can use FLR (First Level Retries) and SLR (Second Level Retries) to retry the messages, as the web service you are calling will eventually be able to process your message.
If that fails, the message will end up in the error queue.
You can monitor the error queue by polling ServiceControl using its API (if you use the platform installer, it will install ServiceControl with NServiceBus) or by subscribing to the MessageFailed event that ServiceControl publishes, as in this spike code (more on David's blog).
Here is a link about SLR
Check Out David's book
The retry mechanism of NServiceBus (driven by throwing an exception) is supposed to be for infrastructure problems (deadlocks, servers unavailable, outright bugs, etc.) that a developer would need to look at. That way, transient failures (deadlocks, web service down) are taken care of by an automatic retry, and permanent errors (whoops, looks like I divided by zero!) go to an error queue for a developer to figure out and take administrative action.
Now, if your endpoint is transactional, your code above will not work as expected, because everything in the message handler is in the transaction. That means if you throw an exception, your Bus.Publish (or Bus.Send, since you can't/shouldn't publish a command) will not actually happen.
Really, I don't understand what sort of business logic would require an alert and a retry. Can you elaborate? What is it that makes your business logic so non-deterministic based on the incoming message? And can anything be done about that?
But at the end of the day, this business logic sounds like it's part of a business process, which should stay expressed in messages, not in errors and retries. So if a condition means you need to notify someone and do something else, publish a ThingHappened event (a subscriber can send an email) and then have another handler do whatever is necessary to handle that business process. If that means that, in the future, a new command comes through with largely the same data, then so be it.

NServiceBus - Problem with using TransactionScopeOption.Suppress in message handler

I have an endpoint that has a message handler which does some FTP work.
Since the FTP process can take some time, I wrapped the FTP method in a TransactionScope with TransactionScopeOption.Suppress to prevent transaction timeout exceptions.
Doing this got rid of the timeout exceptions; however, the handler was fired 5 times (retries is set to 5 in my config file).
The files were FTP'd OK, but they were FTP'd 5 times.
The handler looks like it is re-fired after 10 or 11 minutes.
Some test code looks as follows:
public void Handle(FtpMessage msg)
{
    using (TransactionScope t = new TransactionScope(TransactionScopeOption.Suppress))
    {
        FtpFile(msg);
    }
}
Any help would be greatly appreciated.
thanks.
If this truly is an FTP communication that cannot be completed within the transaction timeout, another approach would be to turn this into a Saga.
The Saga would be started by FtpMessage, and in the handler it would start the FTP work asynchronously, whether in another thread, or via another process, or whatever, and store enough information in saga data to be able to look up the progress later.
Then it would request a timeout from the TimeoutManager for however long makes sense. Upon receiving that timeout, it would look up the state in saga data, and check on the FTP work in progress. If it's completed, mark the saga as complete, if not, request another timeout.
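The polling flow described above can be sketched as a small state machine. This is a plain Java illustration, not NServiceBus API: the FtpPollingSaga class, the NextStep enum, and the transferFinished lookup are all hypothetical names standing in for the saga, its timeout handler, and the progress check against whatever is running the FTP work.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the polling saga described above; not NServiceBus API.
final class FtpPollingSaga {

    enum NextStep { REQUEST_ANOTHER_TIMEOUT, COMPLETE }

    // Looks up whether the background FTP work has finished.
    private final BooleanSupplier transferFinished;
    private boolean completed;
    private int timeoutsHandled;

    FtpPollingSaga(BooleanSupplier transferFinished) {
        this.transferFinished = transferFinished;
    }

    // Called each time a requested timeout fires: either finish or ask to be woken again.
    NextStep onTimeout() {
        timeoutsHandled++;
        if (transferFinished.getAsBoolean()) {
            completed = true;
            return NextStep.COMPLETE;
        }
        return NextStep.REQUEST_ANOTHER_TIMEOUT;
    }

    boolean isCompleted() {
        return completed;
    }

    int timeoutsHandled() {
        return timeoutsHandled;
    }
}
```

Each timeout handler invocation is short, so no single message handling step has to outlive the transaction timeout; the long-running work happens elsewhere and is only checked on.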
Alternatively, you could have a process wrapping the FTP communication that hosts its own Bus but does not have any message handlers of its own. It could receive its FTP information via the command line (including requesting endpoint), do its work, and then send a message back to the requesting endpoint saying it is complete. Then you would not have to wait for a timeout to move on with the process.
I'd recommend configuring that endpoint as non-transactional rather than trying to suppress the transaction. Do that by including .IsTransactional(false) in your initialization code if self-hosting or by implementing IConfigureThisEndpoint, AsA_Client when using the generic host.
My guess is that by not completing the inner scope, you're causing the outer scope, created by NSB, to roll back. This will cause NSB to retry your FtpMessage.
Try to add: t.Complete(); after your call to FtpFile and see if that does it for you.
Edit: After rereading your question, I realized that this won't solve your timeout issue. Have you tried increasing the timeout? (10 minutes is the default maxValue in machine.config, so you can't set it higher without modifying machine.config.)