How to handle network time-out exceptions with RabbitMQ while sending messages asynchronously using the spring-amqp library

I have written a program that requires interaction with multiple queues: the consumer of one queue writes messages to another queue, and the same program has a consumer that acts on that queue.
Problem: how do I handle network time-out issues with the queue while sending messages asynchronously using the Spring RabbitMQ AMQP library? Alternatively, the RabbitTemplate.send() function should throw an exception if there are network issues.
Currently, I call RabbitTemplate.send(), which returns immediately and works fine. But if the network is down, send() still returns immediately without throwing any exception, and the client code assumes success. As a result, I end up with an inconsistent state in the DB saying the message was successfully processed. Note that the call to send() is wrapped inside a transactional block, and the goal is that if writing to the queue fails, the DB commit must also roll back. I am exploring the following solutions, so far without success:
Can we configure rabbitTemplate to throw a runtime exception on any network connectivity issue, so that the caller is notified? Please suggest how to do this.
Should we use the synchronous sendAndReceive() call, even though it delays processing? Another problem observed with this function: my consumer code gets notified while sendAndReceive() is still blocked writing the message to the queue. Please advise whether we can delay the notification to the queue until sendAndReceive() has returned. On the plus side, sendAndReceive() did throw an AMQP exception when the network was down, which we were able to capture, but it carries a performance cost.
My application is multi-threaded. If multiple threads send messages using sendAndReceive(), how does the spring-amqp library manage queue communication? Does it internally create a channel per request? If messages are delivered via the same channel, it would significantly impact performance for a multi-threaded application.
Can someone share sample code for using sendAndReceive() with best practices?
Does the spring-amqp library provide any function to check the health of the RabbitMQ server before calling send()? I explored rabbitTemplate.isRunning() but am not getting proper results. If any specific configuration is required, please suggest it.
Any other solutions to consider for guaranteed message delivery, or for handling network time-out issues by throwing runtime exceptions to the client?
As per Gary's comment below, I have set rabbitTemplate.setChannelTransacted(true); and it makes the call synchronous. The next part of the problem is that with a transaction block on the outer function, the call to RabbitTemplate.send() returns immediately. I expect the transaction block of the outer function to wait for the inner function to return; otherwise I don't get the expected result, as my DB changes are persisted even though setChannelTransacted is true. I tried various transaction propagation levels without success. Please advise whether I am doing anything wrong, and review the transactional propagation settings below.
@Transactional
public void notifyQueueAndDB(DBRequest dbRequest) {
    logger.info("Updating request in DB");
    dbService.updateRequest(dbRequest);
    // Below is the call to the RabbitMQ library. If sendMessage fails because of a
    // network outage, I want the DB commit to be rolled back as well.
    mqService.sendMessage(dbRequest);
}
MQService is defined in another library of the project; snippet below.
@Transactional(propagation = Propagation.NESTED)
private void sendMessage(......) {
    ....
    try {
        rabbitTemplate.send(this.queueExchange, queueName, amqpMessage);
    } catch (Exception exception) {
        throw exception;
    }
}
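Note: Spring's proxy-based @Transactional is ignored on private methods and on self-invocations within the same bean, so the NESTED annotation on sendMessage likely never takes effect, which may explain why the DB changes persist despite channelTransacted being true. Below is a minimal sketch, with illustrative bean and class names, of a channel-transacted template whose send joins the caller's DB transaction:

import org.springframework.amqp.core.Message;
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;

@Configuration
public class RabbitConfig {

    @Bean
    public RabbitTemplate rabbitTemplate(ConnectionFactory connectionFactory) {
        RabbitTemplate template = new RabbitTemplate(connectionFactory);
        // The send now runs on a transacted channel that commits or rolls back
        // together with the surrounding Spring-managed transaction, and failures throw.
        template.setChannelTransacted(true);
        return template;
    }
}

@Service
public class MqService {

    private final RabbitTemplate rabbitTemplate;

    public MqService(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    // Public, so the caller's @Transactional proxy applies; no NESTED propagation needed.
    public void sendMessage(String exchange, String routingKey, Message amqpMessage) {
        rabbitTemplate.send(exchange, routingKey, amqpMessage);
    }
}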

Enable transactions so that the send is synchronous.
or
Use Publisher confirms and wait for the confirmation to be received.
Either one will be quite a bit slower.
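For the publisher-confirms route, here is a minimal sketch, assuming Spring AMQP 2.2 or later (where the ConfirmType enum replaced the older setPublisherConfirms(true)); the exchange, routing key, and timeout are placeholders:

import java.util.concurrent.TimeUnit;

import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.connection.CorrelationData;
import org.springframework.amqp.rabbit.core.RabbitTemplate;

public class ConfirmedSend {

    public static void main(String[] args) throws Exception {
        CachingConnectionFactory cf = new CachingConnectionFactory("localhost");
        cf.setPublisherConfirmType(CachingConnectionFactory.ConfirmType.CORRELATED);

        RabbitTemplate template = new RabbitTemplate(cf);

        // Correlate this publish so we can wait for its individual broker ack.
        CorrelationData correlation = new CorrelationData("msg-1");
        template.convertAndSend("myExchange", "myRoutingKey", "payload", correlation);

        // Block until the broker confirms; throws TimeoutException if no ack arrives.
        CorrelationData.Confirm confirm = correlation.getFuture().get(5, TimeUnit.SECONDS);
        if (!confirm.isAck()) {
            throw new IllegalStateException("Broker nacked the publish: " + confirm.getReason());
        }
    }
}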

Related

ActiveMQ CMS: Can messages be lost between creating a consumer and setting a listener?

Setting up a CMS consumer with a listener involves two separate calls: first, acquiring a consumer:
cms::MessageConsumer* cms::Session::createConsumer( const cms::Destination* );
and then, setting a listener on the consumer:
void cms::MessageConsumer::setMessageListener( cms::MessageListener* );
Could messages be lost if the implementation subscribes to the destination (and receives messages from the broker/router) before the listener is activated? Or are such messages queued internally and delivered to the listener upon activation?
Why isn't there an API call to create the consumer with a listener as a construction argument? (Is it because the JMS spec doesn't have it?)
(Addendum: this is probably a flaw in the API itself. A more logical order would be to instantiate a consumer from a session, and have a cms::Consumer::subscribe( cms::Destination*, cms::MessageListener* ) method in the API.)
I don't think the API is flawed necessarily. Obviously it could have been designed a different way, but I believe the solution to your alleged problem comes from the start method on the Connection object (inherited via Startable). The documentation for Connection states:
A CMS client typically creates a connection, one or more sessions, and a number of message producers and consumers. When a connection is created, it is in stopped mode. That means that no messages are being delivered.
It is typical to leave the connection in stopped mode until setup is complete (that is, until all message consumers have been created). At that point, the client calls the connection's start method, and messages begin arriving at the connection's consumers. This setup convention minimizes any client confusion that may result from asynchronous message delivery while the client is still in the process of setting itself up.
A connection can be started immediately, and the setup can be done afterwards. Clients that do this must be prepared to handle asynchronous message delivery while they are still in the process of setting up.
This is the same pattern that JMS follows.
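For concreteness, here is a minimal Java/JMS sketch of that convention (a hedged illustration using the ActiveMQ JMS client; the CMS C++ API mirrors these calls):

import javax.jms.Connection;
import javax.jms.MessageConsumer;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;

public class StoppedModeSetup {

    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection(); // connections start in stopped mode

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue("my.queue"));
        consumer.setMessageListener(message -> System.out.println("received: " + message));

        connection.start(); // delivery begins only now, after the listener is wired up
    }
}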
In any case I don't think there's any risk of message loss regardless of when you invoke start(). If the consumer is using an auto-acknowledge mode then messages should only be automatically acknowledged once they are delivered synchronously via one of the receive methods or asynchronously through the listener's onMessage. To do otherwise would be a bug in my estimation. I've worked with JMS for the last 10 years on various implementations and I've never seen any kind of condition where messages were lost related to this.
If you want to add consumers after you've already invoked start() you could certainly call stop() first, but I don't see any problem with simply adding them on the fly.

Mule ESB: How to achieve typical ReTry Mechanism in MULE ESB

I need to implement retry logic. An inbound endpoint pushes messages to a REST (outbound) endpoint. If the REST service is unavailable, I need to retry once and then put the message in a queue. But subsequent messages should not retry at all; they should go directly to the queue until the REST service is available again.
Once the service is available, I need to push all the messages from the queue to the REST service (in order) via a batch job.
Questions:
How do I know the service is unavailable for my second message? If I use Until Successful, every message retries before going to the queue; the problem is that the second message shouldn't retry at all.
For the batch, I thought of using Poll, but how do I tell Poll when the service has become available so it can begin the batch process? (Poll is more about configuring timings for when to run the batch.)
The other tricky part is that ordering has to be preserved: once the service is available, the queued messages (i.e., the batch) have to go to the REST service first, then the real-time ones. I doubt whether this is feasible.
Any quick response to help implement this logic would be very helpful.
Using Mule: 3.5.1
I could try something like the following, using flow controls (a rough Java sketch of the gating logic follows this list):
Process a message; if there is an exception or a bad response code, set a variable/property like serviceAvailable=false.
Subsequent message processing first checks the serviceAvailable property; if it is false, en-queue the message into a DB table with status=new/unprocessed.
Create a flow/scheduler to process the messages from the DB sequentially; it does not check the serviceAvailable property before calling the REST service.
If the service throws an exception, it does not store the message in the DB again; if it processes successfully, set serviceAvailable=true and de-queue the message (or change its status). Add another property, moreDBMsg, and set it to true while there are more messages in the DB table.
New messages should not be processed/consumed until moreDBMsg=false.
Once moreDBMsg=false and serviceAvailable=true, start processing messages from the queue again.
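Here is that rough Java sketch, as the gating logic might live in a Spring bean referenced from the flow; RestClient and MessageDao are hypothetical placeholders, not Mule APIs:

import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical collaborators standing in for the real REST call and DB table.
interface RestClient { void send(String payload) throws Exception; }
interface MessageDao { void enqueue(String payload, String status); }

public class RestGate {

    private final AtomicBoolean serviceAvailable = new AtomicBoolean(true);
    private final RestClient client;
    private final MessageDao dao;

    public RestGate(RestClient client, MessageDao dao) {
        this.client = client;
        this.dao = dao;
    }

    /** Returns true if delivered live; false if parked in the DB queue. */
    public boolean process(String payload) {
        if (!serviceAvailable.get()) {
            dao.enqueue(payload, "new"); // service known to be down: park, don't retry
            return false;
        }
        try {
            client.send(payload);
            return true;
        } catch (Exception e) {
            serviceAvailable.set(false); // trip the gate on the first failure
            dao.enqueue(payload, "new");
            return false;
        }
    }

    /** Called by the drain flow once the backlog has been replayed successfully. */
    public void markAvailable() {
        serviceAvailable.set(true);
    }
}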
For the timeout, I would still look at the response code and catch time-outs to determine whether the call was successful or requires a retry. In practice you are normally multi-threaded anyway, so you have multiple calls in parallel, or one call simply starts before the other ends. That is quite normal.
But you can simply retry the queued calls that time out, and after x time-outs you "skip" or defer the retry.
All of this can be done using actual Mule flow components, for example:
MEL http://www.mulesoft.org/documentation/display/current/Mule+Expression+Language+Reference
Or flow controls: http://www.mulesoft.org/documentation/display/current/Choice+Flow+Control+Reference
Or for example you reference a Spring Bean and do it in native Java code.
One possibility for the queue would be to persist it in a database. Mule has a database connector with a "poll" feature; see: http://www.mulesoft.org/documentation/display/current/JDBC+Transport+Reference#JDBCTransportReference-PollingTransport

In pub/sub model, how to make Subscriber pause processing based on some external state?

My requirement is to make the Subscriber pause processing the messages depending on whether a web service is up or not. So, when the web service is down, the messages should keep coming to the subscriber queue from Publisher and keep piling up until the web service is up again. (These messages should not go to the error queue, but stay on the Subscriber queue.)
I tried to unsubscribe, but then the publisher stops sending messages, as unsubscribing seems to clear the subscription info in RavenDB. I have also tried setting MaxConcurrencyLevel on the Transport class: if I set the worker threads to 0, the messages arriving at the subscriber go directly to the error queue. Finally, I tried Defer, which seems to put the current message in the audit queue and send a clone of it to the subscriber queue when the timeout completes. Also, since I have to keep checking the status of the service and keep deferring, I cannot control the order of messages, as I cannot predict when the web service will be up.
What is the best way to achieve the behavior I have explained? I am using NServiceBus version 4.5.
It sounds like you want to keep trying to handle a message until it succeeds, without shuffling it back into the queue (keep it at the top and keep trying it)?
I think your only pure-NSB option is to tinker with the MaxRetries setting, which controls First-Level Retries: http://docs.particular.net/nservicebus/msmqtransportconfig. Setting MaxRetries to a very high number may do what you are looking for, but I can't imagine that would be good practice.
Second Level Retries will defer the message for a configurable amount of time, but IIRC will allow other messages to be handled from the main queue.
I think your best option is to put retry logic into your own code, so the handler can try to access the service X times in a loop (maybe with a delay) before it throws an exception and NSB's retry features kick in.
Edit:
Your requirement seems to be something like:
"When an MyEvent comes in, I need to make a webservice call. If the webservice is down, I need to keep trying X number of times at Y intervals, at which point I will consider it a failure and handle a failure condition. Until I either succeed or fail, I will block other messages from being handled."
You have some potentially complex logic on handling a message (retry, timeout, error condition, blocking additional messages, etc.). Keep in mind the role that NSB is intended to play in your system: communication between services via messaging. While NSB does have some advanced features that allow message orchestration (e.g. sagas), it's not really intended to be used to replace Domain or Application logic.
Bottom line, you may need to write custom code to handle your specific scenario. A naive solution would be a loop with a delay in your handler, but you may need to create a more robust in-memory collection/queue that holds messages while the service is down and processes them serially when it comes back up.
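A minimal sketch of that naive loop-with-delay (shown in Java purely for illustration, since this thread is NServiceBus/C#; callService, the attempt count, and the delay are placeholders):

// Naive bounded retry with delay; after maxAttempts the exception propagates
// and the bus's own retry/error handling takes over.
public void handleWithRetry(Runnable callService) throws InterruptedException {
    final int maxAttempts = 5;
    final long delayMillis = 2_000;

    for (int attempt = 1; ; attempt++) {
        try {
            callService.run();
            return; // success
        } catch (RuntimeException e) {
            if (attempt >= maxAttempts) {
                throw e; // give up; let the framework's retries/error queue kick in
            }
            Thread.sleep(delayMillis); // blocks this handler, by design
        }
    }
}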
The easiest way to achieve roughly the required behavior is the following:
Define one message handler that checks whether the service is available and, if not, calls HandleCurrentMessageLater, and a second message handler that does the actual message processing. Then specify the message handler order so that the availability-checking handler executes first.
public interface ISomeCommand {}

public class ServiceAvailabilityChecker : IHandleMessages<ISomeCommand>
{
    public IBus Bus { get; set; }

    public void Handle(ISomeCommand message)
    {
        try
        {
            // Check the service here; throw SpecificException when it is down.
        }
        catch (SpecificException ex)
        {
            // Put the current message back in the queue to be tried again later.
            this.Bus.HandleCurrentMessageLater();
        }
    }
}

public class ActualHandler : IHandleMessages<ISomeCommand>
{
    public void Handle(ISomeCommand message)
    {
        // Actual message processing goes here.
    }
}

public class SomeCommandHandlerOrdering : ISpecifyMessageHandlerOrdering
{
    public void SpecifyOrder(Order order)
    {
        // Run the availability check before the business handler.
        order.Specify(First<ServiceAvailabilityChecker>.Then<ActualHandler>());
    }
}
With that design you gain the following:
You can check the availability before the actual business code is invoked
If the service is not available the message is put back into the queue
If the service is available and your business code gets invoked, but the service becomes unavailable just before the ActualHandler runs, you get First- and Second-Level Retries (and the availability check runs again in the pipeline)

How to detect alarm-based blocking RabbitMQ producer?

I have a producer sending durable messages to a RabbitMQ exchange. If the RabbitMQ memory or disk exceeds the watermark threshold, RabbitMQ will block my producer. The documentation says that it stops reading from the socket, and also pauses heartbeats.
What I would like is a way to know in my producer code that I have been blocked. Currently, even with a heartbeat enabled, everything just pauses forever. I'd like to receive some sort of exception so that I know I've been blocked and I can warn the user and/or take some other action, but I can't find any way to do this. I am using both the Java and C# clients and would need this functionality in both. Any advice? Thanks.
Sorry to tell you, but with RabbitMQ (at least with 2.8.6) this isn't possible :-(
I had a similar problem, which centred around trying to establish a channel while the connection was blocked. The result was the same as what you're experiencing.
I did some investigation into the core of the RabbitMQ C# .NET library and discovered that the root cause of the problem is that it goes into an infinite blocking state.
You can see more details on the RabbitMQ mailing list here:
http://rabbitmq.1065348.n5.nabble.com/Net-Client-locks-trying-to-create-a-channel-on-a-blocked-connection-td21588.html
One suggestion (which we didn't implement) was to do the work inside a thread and have some other component manage the timeout, killing the thread if it is exceeded. We just accepted the risk :-(
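A rough Java sketch of that workaround, assuming an already-open Channel; the exchange, routing key, body, and timeout are placeholders, and cancellation is best effort since the blocked socket write may not actually die:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import com.rabbitmq.client.Channel;

public class TimedPublisher {

    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public void publishWithTimeout(Channel channel, String exchange, String routingKey, byte[] body)
            throws Exception {
        Future<Void> publish = executor.submit(() -> {
            channel.basicPublish(exchange, routingKey, null, body);
            return null; // Callable<Void>, so the checked IOException surfaces via the Future
        });
        try {
            publish.get(5, TimeUnit.SECONDS); // fail fast instead of hanging forever
        } catch (TimeoutException e) {
            publish.cancel(true); // best effort; the underlying write may still be stuck
            throw new RuntimeException("Publish timed out; the broker may have blocked us", e);
        }
    }
}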
RabbitMQ uses a blocking RPC call that listens for a reply indefinitely.
If you look at the Java client API, what it does is:
AMQChannel.BlockingRpcContinuation k = new AMQChannel.SimpleBlockingRpcContinuation();
k.getReply(-1);
The -1 passed as the argument blocks until a reply is received.
The good thing is you could pass in your timeout in order to make it return.
The bad thing is you will have to update the client jars.
If you are OK with doing that, you could pass in a timeout wherever a blocking call like above is made.
The code would look something like:
try {
    return k.getReply(200);
} catch (TimeoutException e) {
    throw new MyCustomRuntimeorTimeoutException("RabbitTimeout ex", e);
}
In your code you could then handle this exception and perform your logic for this event.
Some related classes that might require this fix would be:
com.rabbitmq.client.impl.AMQChannel
com.rabbitmq.client.impl.ChannelN
com.rabbitmq.client.impl.AMQConnection
FYI: I have tried this and it works.

WCF client causes server to hang until connection fault

The below text is an effort to expand and add color to this question:
How do I prevent a misbehaving client from taking down the entire service?
I have essentially this scenario: a WCF service is up and running with a client callback using straightforward, simple one-way communication, not very different from this one:
public interface IMyClientContract
{
    [OperationContract(IsOneWay = true)]
    void SomethingChanged(simpleObject myObj);
}
I'm calling this method potentially thousands of times a second from the service to what will eventually be about 50 concurrently connected clients, with as low latency as possible (<15 ms would be nice). This works fine until I set a breakpoint on one of the client apps connected to the server; then, after maybe 2-5 seconds, the service hangs and none of the other clients receive any data for about 30 seconds or so, until the service registers a connection fault event and disconnects the offending client. After this, all the other clients continue on their merry way receiving messages.
I've done research on serviceThrottling, concurrency tweaking, setting thread-pool minimum threads, WCF secret sauces, and the whole nine yards, but at the end of the day this article, MSDN - WCF Essentials: One-Way Calls, Callbacks and Events, describes exactly the issue I'm having without really making a recommendation.
The third solution that allows the service to safely call back to the client is to have the callback contract operations configured as one-way operations. Doing so enables the service to call back even when concurrency is set to single-threaded, because there will not be any reply message to contend for the lock.
but earlier in the article it describes the issue I'm seeing, only from a client perspective
When one-way calls reach the service, they may not be dispatched all at once and may be queued up on the service side to be dispatched one at a time, all according to the service configured concurrency mode behavior and session mode. How many messages (whether one-way or request-reply) the service is willing to queue up is a product of the configured channel and the reliability mode. If the number of queued messages has exceeded the queue's capacity, then the client will block, even when issuing a one-way call
I can only assume that the reverse is true: the number of messages queued to the client has exceeded the queue capacity, and the thread pool is now filled with threads attempting to call this client, all of them blocked.
What is the right way to handle this? Should I research a way to check how many messages are queued at the service communication layer per client and abort their connections after a certain limit is reached?
It almost seems that if the WCF service itself is blocking on a queue filling up then all the async / oneway / fire-and-forget strategies I could ever implement inside the service will still get blocked whenever one client's queue gets full.
I don't know much about the client callbacks, but this sounds similar to generic WCF code-blocking issues. I often solve these problems by spawning a BackgroundWorker and performing the client call in that thread. During that time, the main thread counts how long the child thread is taking. If the child has not finished within a few milliseconds, the main thread just moves on and abandons the thread (it eventually dies by itself, so there is no memory leak). This is basically what Mr. Graves suggests with the phrase "fire-and-forget".
Update:
I implemented a fire-and-forget setup to call the client's callback channel, and the server no longer blocks once the buffer to the client fills up.
MyEvent is an event whose delegate matches one of the methods defined in the WCF client contract; when clients connect, I'm essentially adding their callback to the event:
MyEvent += OperationContext.Current.GetCallbackChannel<IFancyClientContract>().SomethingChanged;
etc... and then, to send this data to all clients, I'm doing the following:
// Serialize once using protobuf-net, then fan the bytes out to every subscribed callback.
using (var ms = new MemoryStream())
{
    ProtoBuf.Serializer.Serialize(ms, new SpecialDataTransferObject(inputData));
    byte[] data = ms.ToArray(); // ToArray() trims to the written length, unlike GetBuffer()
    Parallel.ForEach(MyEvent.GetInvocationList(), p => ThreadUtil.FireAndForget(p, data));
}
In the ThreadUtil class, I made essentially the following change to the code defined in the fire-and-forget article:
static void InvokeWrappedDelegate(Delegate d, object[] args)
{
    try
    {
        d.DynamicInvoke(args);
    }
    catch (Exception ex)
    {
        // This will eventually throw once the client's WCF callback channel has filled up
        // and timed out, and it will throw once for every single payload you ever tried to
        // send them, so do some smarter logging here!
        Console.WriteLine("Error calling client, attempting to disconnect.");
        try
        {
            // The channel is an IContextChannel kept in a dictionary of active connections,
            // cross-referenced by hash code just for this exact occasion.
            MyService.SingletonServiceController.TerminateClientChannelByHashcode(d.Target.GetHashCode());
        }
        catch (Exception ex2)
        {
            Console.WriteLine("Attempt to disconnect client failed: " + ex2.ToString());
        }
    }
}
I don't have any good ideas for how to go and kill all the pending packets the server is still waiting to deliver. Once I get the first exception, I should in theory be able to terminate all the other requests queued up somewhere, but this setup is functional and meets the objectives.