RabbitMQ reordering messages - rabbitmq

RabbitMQ ticks all the boxes for the project I am planning, save one. I would have different workers listening on a queue and it is important that they process the newest messages (i.e., latest sequence number) first (LIFO).
My application is such that newer messages pretty much obsolete older messages. If you have workers to spare you could still process the older messages but it is important the newer ones are done first.
After trawling the various forums and such I can only see one solution and that is for a client to process a message it should first:
consume all messages
re-order them according to the sequence number
re-submit to the queue
consume the first message
Ugly and problematic if the client dies halfway. But mabye somebody here has a better solution.
My research is based (in part) on:
http://groups.google.com/group/rabbitmq-discuss/browse_thread/thread/e79e77d86bc7a3b8?fwc=1
http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2010-July/007934.html
http://groups.google.com/group/rabbitmq-discuss/browse_thread/thread/e40d1069dcebe2cc
http://old.nabble.com/Priority-Queue-implementation-and-performance-td29946348.html
Note: the expected traffic of messages will roughly be in the range of 1 msg/hour for some queues and 100/minute for others. So nothing stellar.

Since there is no reply I guess I did my homework rather well ;)
Anyway, after discussing the requirements with the other stakeholders it was decided I can drop the LIFO requirement for now. We can worry about that when it comes to it.
A solution that we will probably end up adopting is for the worker to open a second queue that the master can use to let the worker know what jobs to ignore + provide additional control/monitoring information (which it looks like we will need anyway).
The RabbitMQ implementing the AMQP 1.0 spec may also help here.
So I will mark this question as answered for now. Somebody else is still free to add or improve.

One possibility might be to use basic.get in a loop and wait for the response basic-ok.message-count to become zero (throwing away all other messages):
while (<get ok> = <call basic.get>) {
if (<get ok>.message-count == 0) {
// Now <get ok> is the most recent message on this queue
break;
} else if (<is get-empty>) {
// Someone else got it
}
}
Of course, you'd have to set up the message routing patterns on the broker such that 1 consumer throwing away messages doesn't mess with another. Try to avoid re queueing messages as they will re queue at the top of the stack, making them look like the most recent.

Related

ActiveMQ: How do I limit the number of messages being dispatched?

Let's say I have one ActiveMQ Broker and an undefined numbers of consumers.
Problem:
To process a message, consumers need an external service which is either "DATA1" or "DATA2" (specified in the message)
Each server, "DATA1" and "DATA2", can only handle 20 connections
So at most 20 "DATA1" and 20 "DATA2" messages must be dispatched at any time
Because of priorization, the messages must be enqueued in the same queue
Even if message A has a higher prio than message B, if A can't be processed because the external service has no free slots, message B needs to be processed instead
How can this be solved? As long as I was using message pulling (prefetch of 0), I was able to do this by using a BrokerPlugin that, on messagePull, achieved this by using semaphores and selectors. If the limits were reached, the pull returned null.
However, due to performance issues I had to set prefetch to 1 and use push instead. Therefore, my messagePull hack no longer works (it's never called).
So far I'm considering implementing a custom Cursor but I was wondering if someone knows a better solution.
Update the custom cursor worked but broke features like message removal. I tried with a custom Queue and QueueDispatchSelector (which is a pain to configure since there isn't a proper API to do so) and it mostly works but I still have synchronisation issues.
Also, a very suitable API seems to be DispatchPolicy, however, while it is referenced by Queue, it's never used.
Queues give you buffering for system processing time for free. Messages are delivered on demand. With prefetch=0 or prefetch=1, should effectively get you there. Messages will only be delivered to a consumer when the consumer is ready (ie.. during the consumer.receive() method).
consumer.receive() is a blocking call, so you should not need any custom plugin or other to delay delivery until the consumer process (and its required downstream services) are ready to handle it.
The behavior should work out-of-the-box, or there are some details to your use case that are not provided to shed more light on the scenario.

Immediate flag in RabbitMQ

I have a clients that uses API. The API sends messeges to rabbitmq. Rabbitmq to workers.
I ought to reply to clients if somethings went wrong - message wasn't routed to a certain queue and wasn't obtained for performing at this time ( full confirmation )
A task who is started after 5-10 seconds does not make sense.
Appropriately, I must use mandatory and immediate flags.
I can't increase counts of workers, I can't run workers on another servers. It's a demand.
So, as I could find the immediate flag hadn't been supporting since rabbitmq v.3.0x
The developers of rabbitmq suggests to use TTL=0 for a queue instead but then I will not be able to check status of message.
Whether any opportunity to change that behavior? Please, share your experience how you solved problems like this.
Thank you.
I'm not sure, but after reading your original question in Russian, it might be that using both publisher and consumer confirms may be what you want. See last three paragraphs in this answer.
As you want to get message result for published message from your worker, it looks like RPC pattern is what you want. See RabbitMQ RPC tuttorial. Pick a programming language section there you most comfortable with, overall concept is the same. You may also find Direct reply-to useful.
It's not the same as immediate flag functionality, but in case all your publishers operate with immediate scenario, it might be that AMQP protocol is not the best choice for such kind of task. Immediate mean "deliver this message right now or burn in hell" and it might be a situation when you publish more than you can process. In such cases RPC + response timeout may be a good choice on application side (e.g. socket timeout). But it doesn't work well for non-idempotent RPC calls while message still be processed, so you may want to use per-queue or per-message TTL (or set queue length limit). In case message will be dead-lettered, you may get it there (in case you need that for some reason).
TL;DR
As to "something" can go wrong, it can go so on different levels which we for simplicity define as:
before RabbitMQ, like sending application failure and network problems;
inside RabbitMQ, say, missed destination queue, message timeout, queue length limit, some hard and unexpected internal error;
after RabbitMQ, in most cases - messages processing application error or some third-party services like data persistence or caching layer outage.
Some errors like network outage or hardware error are a bit epic and are not a subject of this q/a.
Typical scenario for guaranteed message delivery is to use publisher confirms or transactions (which are slower). After you got a confirm it mean that RabbitMQ got your message and if it has route - placed in a queue. If not it is dropped OR if mandatory flag set returned with basic.return method.
For consumers it's similar - after basic.consumer/basic.get, client ack'ed message it considered received and removed from queue.
So when you use confirms on both ends, you are protected from message loss (we'll not run into a situation that there might be some bug in RabbitMQ itself).
Bogdan, thank you for your reply.
Seems, I expressed my thought enough clearly.
Scheme may looks like this. Each component of system must do what it must do :)
The an idea is make every component more simple.
How to task is performed.
Clients goes to HTTP-API with requests and must obtain a respones like this:
Positive - it have put to queue
Negative - response with error and a reason
When I was talking about confirmation I meant that I must to know that a message is delivered ( there are no free workers - rabbitmq can remove a message ), a client must be notified.
A sent message couldn't be delivered to certain queue, a client must be notified.
How to a message is handled.
Messages is sent for performing.
Status of perfoming is written into HeartBeat
Status.
Clients obtain status from HeartBeat by itself and then decide that
it's have to do.
I'm not sure, that RPC may be useful for us i.e. RPC means that clients must to wait response from server. Tasks may works a long time. Excess bound between clients and servers, additional logic on client-side.
Limited size of queue maybe not useful too.
Possible situation when a size of queue maybe greater than counts of workers. ( problem in configuration or defined settings ).
Then an idea with 5-10 seconds doesn't make sense.
TTL doesn't usefull because of:
Setting the TTL to 0 causes messages to be expired upon reaching a
queue unless they can be delivered to a consumer immediately. Thus
this provides an alternative to basic.publish's immediate flag, which
the RabbitMQ server does not support. Unlike that flag, no
basic.returns are issued, and if a dead letter exchange is set then
messages will be dead-lettered.
direct reply-to :
The RPC server will then see a reply-to property with a generated
name. It should publish to the default exchange ("") with the routing
key set to this value (i.e. just as if it were sending to a reply
queue as usual). The message will then be sent straight to the client
consumer.
Then I will not be able to route messages.
So, I'm sorry. I may flounder in terms i.e. I'm new in AMQP and rabbitmq.

Dealing with dead letters in RabbitMQ

TL;DR: I need to "replay" dead letter messages back into their original queues once I've fixed the consumer code that was originally causing the messages to be rejected.
I have configured the Dead Letter Exchange (DLX) for RabbitMQ and am successfully routing rejected messages to a dead letter queue. But now I want to look at the messages in the dead letter queue and try to decide what to do with each of them. Some (many?) of these messages should be replayed (requeued) to their original queues (available in the "x-death" headers) once the offending consumer code has been fixed. But how do I actually go about doing this? Should I write a one-off program that reads messages from the dead letter queue and allows me to specify a target queue to send them to? And what about searching the dead letter queue? What if I know that a message (let's say which is encoded in JSON) has a certain attribute that I want to search for and replay? For example, I fix a defect which I know will allow message with PacketId: 1234 to successfully process now. I could also write a one-off program for this I suppose.
I certainly can't be the first one to encounter these problems and I'm wondering if anyone else has already solved them. It seems like there should be some sort of Swiss Army Knife for this sort of thing. I did a pretty extensive search on Google and Stack Overflow but didn't really come up with much. The closest thing I could find were shovels but that doesn't really seem like the right tool for the job.
Should I write a one-off program that reads messages from the dead letter queue and allows me to specify a target queue to send them to?
generally speaking, yes.
you could set up a delayed re-try to resend the message back to the original queue, using a combination of the delay message exchange plugin.
but this would only automate the retries on an interval, and you may not have fixed the problem before the retries happen.
in some circumstances this is ok - like when the error is caused by an external resource being temporarily unavailable.
in your case, though, i believe your thoughts on creating an app to handle the dead letters is the best way to go, for several reasons:
you need to search through the messages, which isn't possible RMQ
this means you'll need a database to store the messages from the DLX/queue
because you're pulling the messages out of the DLX/queue, you'll need to ensure you get all the header info from the message so that you can re-publish to the correct queue when the time comes.
I certainly can't be the first one to encounter these problems and I'm wondering if anyone else has already solved them.
and you're not!
there are many solutions to this problem that all come down to the solution you've suggested.
some larger "service bus" implementations have this type of feature built in to them. i believe NServiceBus (or the SaaS version of it) has this built in, for example - though I'm not 100% sure of it.
if you want to look into this further, do some search for the term "poison message" - this is generally the term used for this situation. I've found a few things on google with a quick search, that may help you down the path:
http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2013-January/025019.html
https://web.archive.org/web/20170809194056/http://tafakari.co.ke/2014/07/rabbitmq-poison-messages/
https://web.archive.org/web/20170809170555/http://kjnilsson.github.io/blog/2014/01/30/spread-the-poison/
hope that helps!

Does EasyNetQ support nack?

What I'm really trying to do is leave the message on the queue in the case where it is rejected by the current consumer. In RabbitMQ I could send a NACK to accomplish this. Is NACK supported in EasyNetQ? Is there another way to achieve the behavior I'm looking for?
Update: not a lot of responses, so I'm wondering how people are generally handling the lack of NACK in EasyNetQ. Not having the equivalent of basic.reject limits consumers to "I can always process every message" scenarios. I suppose consumers could throw a specific "rejected" exception to cause EasyNetQ to dequeue the message to the error queue, and I could requeue messages with those errors. Anyone else have other workarounds in place?
I used EasyNetQ for almost a year, but no matter how we tweaked it (amongst other things added our own implementation of IConsumerErrorStrategy) I never really got it to work the way I wanted. The fact that it is single threaded gave us some unexpected behaviour (sometimes deadlocks) when performing RequestAsync while in a SubscribeAsync handler.
The solution for us was to move from EasyNetQ. After working with the official RabbitMq Client for a while, I spent a few days writing a super thin client on top of that. It is influenced by EasyNetQ and supports most of the concepts that EasyNetQ has. However, I added some neat features like pluggable message contexts. I think that the Nack feature of IAdvancedMessageContext that I just added can be something for you:
var client = service.GetService<IBusClient<AdvancedMessageContext>>();
client.RespondAsync<BasicRequest, BasicResponse>((req, ctx) =>
{
ctx?.Nack(); // the context implements IAdvancedMessageContext.
return Task.FromResult<BasicResponse>(null);
}, cfg => cfg.WithNoAck(false));
If you're interested you can read more about it at the Github page (especially the NackTests.cs).
I think you can change the behavior by implementing your own IConsumerErrorStrategy:
https://github.com/EasyNetQ/EasyNetQ/blob/master/Source/EasyNetQ/Consumer/DefaultConsumerErrorStrategy.cs
But if you need that kind of control you might consider just using the RabbitMQ client directly?
It sounds like you are trying to handle failures. You can NACK a message, but that means it sits at the head of the queue. Great, but then it means that you could end up with a bunch of messages that are truthfully unable to be processed, and you will be unable to actually process real messages.
The solution that I have always used when using RabbitMQ is to utilize the default error handling of EasyNetQ, and have a separate application to resend messages. That is, when an exception is captured in RabbitMQ, it routes the message to a queue called "EasyNetQ_Default_Error_Queue". You are able to override this name and have different queues go to different error queues, but for now let's stick with the default. You can then have a Windows Service/Azure Worker role reading these messages, and working out what to do. That may include having a "RetryCount" on your message envelope/wrapper to make sure that it only loops around so many times. All in all, it's going to be a bit of work.
What you are finding, is what many people run into when using RabbitMQ/EasyNetQ. She's pretty raw.

RabbitMQ: throttling fast producer against large queues with slow consumer

We're currently using RabbitMQ, where a continuously super-fast producer is paired with a consumer limited by a limited resource (e.g. slow-ish MySQL inserts).
We don't like declaring a queue with x-max-length, since all messages will be dropped or dead-lettered once the limit is reached, and we don't want to loose messages.
Adding more consumers is easy, but they'll all be limited by the one shared resource, so that won't work. The problem still remains: How to slow down the producer?
Sure, we could put a flow control flag in Redis, memcached, MySQL or something else that the producer reads as pointed out in an answer to a similar question, or perhaps better, the producer could periodically test for queue length and throttle itself, but these seem like hacks to me.
I'm mostly questioning whether I have a fundamental misunderstanding. I had expected this to be a common scenario, and so I'm wondering:
What is best practice for throttling producers? How is this done with RabbitMQ? Or do you do this in a completely different way?
Background
Assume the producer actually knows how to slow himself down with the right input. E.g. a hardware sensor or hardware random number generator, that can generate as many events as needed.
In our particular real case, we have an API that users can use to add messages. Instead of devouring and discarding messages, we'd like to apply back-pressure by having our API return an error if the queue is "full", so the caller/user knows to back-off, or have the API block until the consumer catches up. We don't control our user, so regardless of how fast the consumer is, I can create a producer that is faster.
I was hoping for something like the API for a TCP socket, where a write() can block and where a select() can be used to determine if a handle is writable. So either having the RabbitMQ API block or have it return an error if the queue is full.
For the x-max-length property, you said you don't want messages to be dropped or dead-lettered. I see there was an update in adding some more capabilities for this. As I see it is specified in the documentation:
"Use the overflow setting to configure queue overflow behaviour. If overflow is set to reject-publish, the most recently published messages will be discarded. In addition, if publisher confirms are enabled, the publisher will be informed of the reject via a basic.nack message"
So as I understand it, you can use queue limit to reject the new messages from publishers thus pushing some backpressure to the upstream.
I don't think that this is in any way rabbitmq specific. Basically you have a scenario, where there are two systems of different processing capabilities, and this mismatch will either pose a risk of overflowing the queue (whatever it would be), or even in case of a constant mismatch between producer and consumer, simply create more and more time-distance between event creation and its handling.
I used to deal with this kind of scenarios, and unfortunately there is no magic bullet. You either have to speed up even handling (better hardware, more suited software?) or throttle the event creation (which has nothing to do with MQ really).
Now, I would ask you what's the goal and how the events are produced. Are the events are produced constantly, with either unlimitted or just very high rate (for example readings from sensors - the more, the better), or are they created in batches/spikes (for example: user requests in specific time periods, batch loads from CRM system). I assume that the goal is to process everything cause you mention you don't want to loose any queued message.
If the output is constant, then some limiter (either internal counter, if the producer is the only producer, or external queue length checks if queue can be filled with some other system) is definitely in place.
IF eventsInTimePeriod/timePeriod > estimatedConsumerBandwidth
THEN LowerRate()
ELSE RiseRate()
In real world scenarios we used to simply limit the output manually to the estimated values and there were some alerts set for queue length, time from queue entry to queue leaving etc. Where such limiters were omitted (by mistake mostly) we used to find later some tasks that were supposed to be handled in few hours, that were waiting for three months for their turn.
I'm afraid it's hard to answer to "How to slow down the producer?" if we know nothing about it, but some ideas are: aforementioned rate check or maybe a blocking AddMessage method:
AddMessage(message)
WHILE(getQueueLength() > maxAllowedQueueLength)
spin(1000); // or sleep or whatever
mqAdapter.AddMessage(message)
I'd say it all depends on specific of the producer application and in general your architecture.