active mq delay recovery - activemq

I'm trying to use an ActiveMQ queue as an Apache Storm spout.
I use the "INDIVIDUAL_ACK" strategy.
My plan is to call session.recover() periodically so that messages that were never acknowledged (because of an error in the bolt processing chain) are resent.
But if I do that, every message whose Storm tuple is still being processed would be resent as well. I would like to limit this.
Ideally, I would like to configure a delay so that messages delivered more recently than this delay are not resent (the delay should also match the Storm tuple processing timeout).
I've read about AMQ policies (http://activemq.apache.org/redelivery-policy.html) but I'm not sure that the redeliveryDelay param applies to my problem.
Any hint?
Franck

With INDIVIDUAL_ACKNOWLEDGE you don't use session.recover(); you call message.acknowledge() on each message. Additionally, it's best practice to use JMS-style transactions only for automatically recoverable errors (e.g. a host is down). For a contextual issue (e.g. bad data) that will never succeed, you should move the message to another queue, such as a .DLQ or .ERR queue.
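A minimal sketch of that split, written in plain Python with an in-memory deque rather than a real JMS session. The exception classes, the consume function and the max_attempts limit are hypothetical names made up for illustration; in a real consumer the marked line would be the message.acknowledge() call.

```python
from collections import deque


class TransientError(Exception):
    """Failure that may succeed on retry (e.g. a host is down)."""


class PermanentError(Exception):
    """Failure that will never succeed on retry (e.g. bad data)."""


def consume(queue, handler, error_queue, max_attempts=3):
    """Drain `queue` of (body, attempts) pairs, acknowledging one by one."""
    acked = []
    while queue:
        body, attempts = queue.popleft()
        try:
            handler(body)
            acked.append(body)  # stands in for message.acknowledge()
        except PermanentError:
            error_queue.append(body)  # park it: retrying cannot help
        except TransientError:
            if attempts + 1 >= max_attempts:
                error_queue.append(body)  # gave up after max_attempts tries
            else:
                queue.append((body, attempts + 1))  # try again later
    return acked
```

Only the messages that fail are retried or parked; messages that are still being processed are never touched, which is what the periodic session.recover() idea could not guarantee.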

Related

Immediate flag in RabbitMQ

I have clients that use an API. The API sends messages to RabbitMQ, and RabbitMQ delivers them to workers.
I need to reply to the client if something went wrong: the message was not routed to a particular queue, or it was not picked up for processing right away (full confirmation).
A task that starts 5-10 seconds later makes no sense.
So I need to use the mandatory and immediate flags.
I can't increase the number of workers, and I can't run workers on other servers. That is a requirement.
As far as I could find, the immediate flag has not been supported since RabbitMQ 3.0.
The RabbitMQ developers suggest using TTL=0 on the queue instead, but then I would not be able to check the status of a message.
Is there any way to change that behavior? Please share how you solved problems like this.
Thank you.
I'm not sure, but after reading your original question in Russian, it might be that using both publisher and consumer confirms may be what you want. See last three paragraphs in this answer.
Since you want to get a result from your worker for each published message, it looks like the RPC pattern is what you want. See the RabbitMQ RPC tutorial. Pick the section for the programming language you are most comfortable with; the overall concept is the same. You may also find Direct reply-to useful.
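The core of the RPC pattern (a correlation id plus a bounded wait for the reply) can be sketched without a broker. This is only an illustration using Python's standard queue module in place of RabbitMQ queues; rpc_call and worker are hypothetical names, not part of any RabbitMQ client.

```python
import queue
import threading
import time
import uuid


def rpc_call(request_queue, reply_queue, payload, timeout):
    """Send a request and wait at most `timeout` seconds for its reply."""
    correlation_id = str(uuid.uuid4())
    request_queue.put((correlation_id, payload))
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None  # timed out
        try:
            reply_id, result = reply_queue.get(timeout=remaining)
        except queue.Empty:
            return None  # timed out
        if reply_id == correlation_id:
            return result
        # A reply with another id belongs to an earlier call; skip it.


def worker(request_queue, reply_queue):
    """Handle one request and echo its correlation id with the result."""
    correlation_id, payload = request_queue.get()
    reply_queue.put((correlation_id, payload.upper()))
```

A caller would start worker in a thread (for example with threading.Thread) and treat a None result from rpc_call as "no worker picked this up in time".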
It's not the same as the immediate flag's functionality, but if all your publishers rely on the immediate scenario, AMQP may not be the best choice for this kind of task. Immediate means "deliver this message right now or burn in hell", and you may end up publishing more than you can process. In such cases RPC plus a response timeout on the application side (e.g. a socket timeout) may be a good choice. But that doesn't work well for non-idempotent RPC calls, because the message may still be processed after the caller has given up, so you may want to use a per-queue or per-message TTL (or set a queue length limit). If the message is dead-lettered, you can pick it up from there (in case you need that for some reason).
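As a rough mental model of a TTL of 0 combined with dead-lettering, here is a toy Python function. It does not talk to RabbitMQ; publish_with_zero_ttl and its arguments are hypothetical, and a real broker's notion of "a consumer is ready" also depends on prefetch settings.

```python
def publish_with_zero_ttl(message, idle_consumers, dead_letters=None):
    """Model a queue whose message TTL is 0.

    `idle_consumers` is a list of inboxes (lists) that can take a message
    right now. `dead_letters` models a configured dead letter target.
    """
    if idle_consumers:
        inbox = idle_consumers.pop(0)
        inbox.append(message)
        return "delivered"
    if dead_letters is not None:
        dead_letters.append(message)  # expired, but still observable
        return "dead-lettered"
    return "expired"  # gone, and no basic.return is issued
```

In this model the dead letter list is the only place where the publishing side can learn that a message was not picked up immediately.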
TL;DR
"Something" can go wrong at different levels, which for simplicity we define as:
before RabbitMQ, such as a failure in the sending application or network problems;
inside RabbitMQ, say, a missing destination queue, a message timeout, a queue length limit, or some hard and unexpected internal error;
after RabbitMQ, in most cases an error in the message-processing application or an outage of a third-party service such as the data persistence or caching layer.
Some errors like network outage or hardware error are a bit epic and are not a subject of this q/a.
The typical way to guarantee message delivery is to use publisher confirms or transactions (which are slower). Once you get a confirm, it means that RabbitMQ received your message and, if a route exists, placed it in a queue. If no route exists, the message is dropped, or, if the mandatory flag is set, returned to you with the basic.return method.
For consumers it's similar: after basic.consume/basic.get, once the client has acked the message, it is considered received and is removed from the queue.
So when you use confirms on both ends, you are protected from message loss (leaving aside the possibility of a bug in RabbitMQ itself).
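The publish-side outcomes described above can be summarised in a toy broker. This is a sketch in plain Python, not the RabbitMQ client API; ToyBroker and its methods are hypothetical and routing is reduced to "a queue with that name exists".

```python
class ToyBroker:
    """In-memory stand-in for the publish path of a broker."""

    def __init__(self):
        self.queues = {}    # queue name -> list of message bodies
        self.returned = []  # bodies sent back, modelling basic.return

    def declare_queue(self, name):
        self.queues.setdefault(name, [])

    def publish(self, routing_key, body, mandatory=False):
        """Return True to model the confirm sent for every publish."""
        if routing_key in self.queues:
            self.queues[routing_key].append(body)  # routed and enqueued
        elif mandatory:
            self.returned.append(body)  # unroutable: handed back
        # Unroutable without mandatory: silently dropped.
        return True
```

Note that in this model the confirm alone does not say whether the message was routed; only the mandatory flag makes an unroutable message visible to the publisher.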
Bogdan, thank you for your reply.
It seems I did not express my thoughts clearly enough.
The scheme may look like this. Each component of the system must do what it is supposed to do :)
The idea is to make every component simpler.
How a task is performed.
Clients send requests to the HTTP API and must get a response like this:
Positive - the task has been put in the queue
Negative - a response with an error and a reason
When I talked about confirmation, I meant that I must know that a message was delivered (if there are no free workers, RabbitMQ may remove the message), and the client must be notified.
If a sent message could not be delivered to a particular queue, the client must be notified.
How a message is handled.
Messages are sent for processing.
The processing status is written to HeartBeat Status.
Clients fetch the status from HeartBeat themselves and then decide what they have to do.
I'm not sure RPC would be useful for us: RPC means that clients must wait for a response from the server, and tasks may run for a long time. It adds excessive coupling between clients and servers and additional logic on the client side.
A limited queue size may not be useful either.
The queue size may end up greater than the number of workers (because of a configuration problem or the defined settings).
Then the 5-10 second idea doesn't make sense.
TTL isn't useful because of:
Setting the TTL to 0 causes messages to be expired upon reaching a
queue unless they can be delivered to a consumer immediately. Thus
this provides an alternative to basic.publish's immediate flag, which
the RabbitMQ server does not support. Unlike that flag, no
basic.returns are issued, and if a dead letter exchange is set then
messages will be dead-lettered.
direct reply-to :
The RPC server will then see a reply-to property with a generated
name. It should publish to the default exchange ("") with the routing
key set to this value (i.e. just as if it were sending to a reply
queue as usual). The message will then be sent straight to the client
consumer.
Then I will not be able to route messages.
Sorry if I'm mixing up terms; I'm new to AMQP and RabbitMQ.

RabbitMQ - purge a queue from all of its unacked messages

I have thousands of unacked messages in my dev environment which I can't restart.
Is there a way to remove (purge) all messages even if they are unacknowledged?
Close the channel that the unacked messages reside on, which will nack them back into the queue, then call purge.
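Why the order matters can be seen in a toy model of a queue that tracks ready and unacked messages separately. This is plain Python for illustration only; ToyQueue and its methods are hypothetical and do not correspond to a RabbitMQ client class.

```python
class ToyQueue:
    """Queue with separate 'ready' and per-channel 'unacked' messages."""

    def __init__(self, messages):
        self.ready = list(messages)
        self.unacked = {}  # channel id -> delivered but not yet acked

    def deliver(self, channel_id, count):
        """Hand `count` ready messages to a consumer on `channel_id`."""
        batch, self.ready = self.ready[:count], self.ready[count:]
        self.unacked.setdefault(channel_id, []).extend(batch)
        return batch

    def purge(self):
        """Remove ready messages only; return how many were removed."""
        removed = len(self.ready)
        self.ready = []
        return removed

    def close_channel(self, channel_id):
        """Closing a channel puts its unacked messages back as ready."""
        self.ready = self.unacked.pop(channel_id, []) + self.ready
```

Purging while a consumer still holds messages leaves those messages untouched; purging after the channel is closed removes them.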
You have to make the consumer ack (or nack) them; only after that will they be removed. Alternatively, you can shut down the consumers and purge the queue completely.
If you are looking for a way to purge all unacked messages directly, there is no such feature in either the AMQP protocol or RabbitMQ.
It looks like your consumer is the cause of the problem, so you have to adjust (rewrite) it to release each message immediately after it has been processed or has failed.
Once there are no "ready" messages in the queue, delete it and recreate.
YOU WILL LOSE THE QUEUE CONTENTS with this method.
You need to put messages back into the queue before you can purge them:
close the channel
close the connection (the script doesn't work for me)
Alternatively, these options don't require waiting:
delete and recreate the queue
restart the server
You need to call basic.recover to force all unacked messages to be re-enqueued to a channel that failed. Be aware of the errata concerning this function specifying that only the requeue mode is supported by RabbitMQ.
For software developers, use the code below.
channel.purgeQueue(queueName);
This clears the queue's contents while the queue itself continues to exist.
One way this can happen is if the consumer is stuck recycling the same messages due to a processing error. In this case, the RabbitMQ queue management interface may show the messages as Unacked, but really they are being read from the queue and processed (to the point of the failure) then requeued (to enable a retry) at a rapid pace -- maybe thousands of times per second.
During this loop, the messages exist briefly in the Ready state, but are immediately removed again by your application, and the cycle begins again. As an example, this auto-requeue behavior is the default for Spring AMQP.
Since the messages are never left in the Ready state, the Management Interface's Get Message(s) button is unlikely to work. What can work, if you have queue access, is to run a separate custom consumer instance, perhaps locally, with the specific intent of removing and not requeuing the messages in question.
By RabbitMQ's Fair Dispatch mechanism, your additional consumer will likely receive the messages in question and have the opportunity to perform your custom handling.
You might even write a custom utility to do this, with logic to filter, analyze, or deadletter the messages of interest.
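The filtering logic of such a utility might look like the sketch below. It works on an in-memory deque rather than a live RabbitMQ queue; drain, is_poison and dead_letters are hypothetical names chosen for this example.

```python
from collections import deque


def drain(queue, is_poison, dead_letters):
    """Take every message once; park poison ones, put the rest back.

    Returns the number of messages moved to `dead_letters`.
    """
    kept = deque()
    parked = 0
    while queue:
        message = queue.popleft()
        if is_poison(message):
            dead_letters.append(message)  # removed for good, not requeued
            parked += 1
        else:
            kept.append(message)
    queue.extend(kept)  # healthy messages return in their original order
    return parked
```

With a real broker the "park" branch would ack the message (after copying it somewhere for analysis) and the "keep" branch would requeue it.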
If you want to clear the contents of the queue, you can use the AMQP method queue.purge: http://www.rabbitmq.com/amqp-0-9-1-reference.html#queue.purge
You could do similar using the management plugin.

How to specify another timeout queue for NSB?

I am using NSB 4.4.2
I want to have something like heartbeats on my saga to show processing statistics.
When I request a timeout, it is sent to the saga's input queue.
If many messages are ahead of this timeout message, IHandleTimeouts may not fire at the requested time.
Is this a bug? Or how can I use a separate queue for timeout messages?
Thanks
You are correct - when a timeout is ready to be dispatched, it is sent to the incoming queue of the endpoint, and if there are already many other messages in there, it will have to wait its turn to be processed.
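To get a feel for the size of that wait, here is a back-of-the-envelope model in Python. It assumes a single-threaded endpoint with a fixed processing time per message; both the function name and those assumptions are made up for illustration and say nothing about how NServiceBus schedules work internally.

```python
def completion_times(input_queue, start, seconds_per_message):
    """Return {message: time its handling finishes} for a FIFO queue.

    `start` is the moment the endpoint begins working through the queue,
    here taken as the moment the timeout was dispatched into it.
    """
    clock = start
    finished = {}
    for message in input_queue:
        clock += seconds_per_message
        finished[message] = clock
    return finished
```

For example, a timeout dispatched at second 60 behind three business messages that take 2 seconds each is not handled until second 68 in this model.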
Another thing you might want to consider, is that the endpoint may be down at that time.
If you want to guarantee that your saga code will be invoked at (or very close to) the time of the timeout, you'll need to set up a high availability deployment first. Then, you should look at setting the SLA required of that endpoint - how quickly messages should be processed, and then monitor the time to breach SLA performance counter.
See here for more information: http://docs.particular.net/nservicebus/monitoring-nservicebus-endpoints
You should be prepared to scale out your endpoint as needed to guarantee enough processing power to keep up with the load coming in.
NOTE: The reason we use the same incoming queue for processing these timeouts is by design. A timeout message is almost always the same priority or lower than the other business messages being processed by a saga. As such, it doesn't make sense to have them cut ahead of other messages in line.
Timeouts are sent to the [endpointname].timeouts

How to set a redelivery policy in RabbitMQ/AMQP

I'm currently using ActiveMQ for my queueing system, and I'm wanting to make the transition to RabbitMQ. One feature I've been using that belongs to ActiveMQ is a redelivery policy, as sometimes our consumer rejects a message because it cannot handle it at this time, but may want to try again later, so it requeues it.
Right now in AMQP, when I reject a message, it's instantly pulled off the queue again immediately and tried again.
Is there a way, in RabbitMQ, to specify a redelivery policy for a queue, consumer, or message?
I also had problems with that behaviour. According to the documentation (as far as I remember; something may have changed in newer versions), it is not specified where a message is placed after a requeue (it was described as undetermined). In my test cases (with version 2.8.2) some of the messages were put at the end of the queue and one message (precisely the first one from the client's prefetch) landed at the beginning and was consumed immediately. In our application this caused a livelock.
You could work around this by publishing a copy of the message to the queue and acking the already delivered one in a single transaction (but I recommend reading the section about transactions in the docs carefully), or use dead-lettering to deal with temporarily unprocessable messages.
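One common way to read the dead-lettering suggestion is a separate wait queue with a message TTL whose dead letter target is the original work queue, so a rejected message comes back only after the TTL has passed. The sketch below models that with a simulated clock in plain Python; DelayedRetry and its methods are hypothetical, and the actual queue arguments are left out.

```python
class DelayedRetry:
    """Toy model of a work queue fed by a TTL-based wait queue."""

    def __init__(self, delay):
        self.delay = delay
        self.work = []     # messages ready for the consumer
        self.waiting = []  # (expires_at, message) pairs in the wait queue

    def reject(self, message, now):
        """Park a message that cannot be handled right now."""
        self.waiting.append((now + self.delay, message))

    def tick(self, now):
        """Move expired messages back to work (the dead-letter hop)."""
        still_waiting = []
        for expires_at, message in self.waiting:
            if expires_at <= now:
                self.work.append(message)
            else:
                still_waiting.append((expires_at, message))
        self.waiting = still_waiting
```

Unlike a plain requeue, the rejected message is out of the consumer's reach for the whole delay, which avoids the immediate redelivery loop described in the question.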

NServiceBus Retry Delay

What is the optimal way to configure/code NServiceBus to delay retrying messages?
In its default configuration retry happens almost immediately up to the number of attempts defined in the configuration file. I'd ideally like to retry again after an hour, etc.
Also, how does HandleCurrentMessageLater() work? What does the Later aspect refer to?
NSB retries are there to remedy temporary problems such as deadlocks. Longer retries are better handled by creating another process that monitors the error queue and puts messages back into the source queue at the interval you like. Take a look at the ReturnToSourceQueue.exe that comes with NSB for reference.
Edit: NServiceBus now supports this; we call it Second Level Retries. See http://docs.particular.net/ for more details.
Here is a blog post on why NServiceBus doesn't include a retry delay that I wrote after asking Udi this very same question in his distributed systems architecture course:
NServiceBus Retries: Why no back-off delay?
And here is a discussion thread covering some of the points involved in building an error queue monitor/retry endpoint:
http://tech.groups.yahoo.com/group/nservicebus/message/10964
As for HandleCurrentMessageLater(), all it does is put the current message back at the end of the queue. If there are no other messages waiting, it's going to be processed again immediately.
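The effect is easy to see on a plain double-ended queue. This Python snippet only imitates the described behaviour; handle_current_message_later is a stand-in written for this example, not the NServiceBus method itself.

```python
from collections import deque


def handle_current_message_later(queue):
    """Move the message at the head of the queue to the back."""
    queue.append(queue.popleft())


busy = deque(["current", "other-1", "other-2"])
handle_current_message_later(busy)
# "current" now waits behind the two other messages.

idle = deque(["current"])
handle_current_message_later(idle)
# "current" is at the head again, so it is picked up right away.
```

So "later" means "after whatever else is already queued", not "after some amount of time".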
As of NServiceBus 3.2.1, they provide an out of the box solution to handle back off delays in the event of consecutive message failures. The previously existing retry mechanism still retries failures without a delay to handle cases like Database deadlocks, quickly self healing network issues, etc.
Once a message has been retried the configured number of times, the message is moved to a "Second Level Retry" queue. This queue retries after delays of 10, 20, and 30 seconds; after that the message is moved to the configured error queue. You're free to change these values to something that better suits your environment.
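The 10, 20, 30 second schedule corresponds to a linearly growing delay with a fixed number of retries. A small Python sketch of that arithmetic follows; the function and parameter names are hypothetical and are not NServiceBus configuration keys.

```python
def second_level_retry_delay(retry_number, time_increase=10, number_of_retries=3):
    """Delay in seconds before the given second-level retry (1-based).

    Returns None once the retries are used up, meaning the message
    goes to the error queue instead.
    """
    if retry_number > number_of_retries:
        return None
    return retry_number * time_increase
```

Changing the increment to 3600 in this model would give the hourly spacing asked about in the question.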
You can also check out this link:
http://docs.particular.net/nservicebus/second-level-retries