I would like to have this constraint on a queue in RabbitMQ:
Next message in the queue can't be dequeued before previous message (the one being processed) is acked.
Through this I will achieve ordered processing of events and parallel processing across multiple queues. How do I/can I configure RabbitMQ for this?
Edit (clarification): There will be many consumers all trying to get work from all the queues and since they can't get work from a queue that has an event being processed that isn't acked - ordered processing is maintained.
Next message in the queue can't be dequeued before previous message (the one being processed) is acked.
You can do this through the consumer prefetch limit for a single consumer.
Through this I will achieve ordered processing of events and parallel processing across multiple queues.
Unfortunately, this won't have the effect that you want.
The prefetch limit is set for an individual consumer. That consumer will wait for a message to be acknowledged before getting the next one.
However, this applies to the individual consumer, not the queue.
If you have 2 consumers, each of them will process a message in parallel. If you have 10 consumers, 10 messages will be processed in parallel.
The only way to process every message in order is to have a single consumer with a prefetch of 1.
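The point about parallelism can be illustrated with a small toy model. This is only a sketch in plain Java, not the RabbitMQ client: the class `PrefetchModel` and the method `inFlight` are hypothetical names, and the model assumes the broker hands out messages round-robin until each consumer holds `prefetch` unacknowledged messages.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model (not the RabbitMQ client): counts how many messages can be
// unacknowledged at once when every consumer uses the same prefetch limit.
class PrefetchModel {

    // Returns how many messages are handed out before any ack arrives.
    static int inFlight(int queued, int consumers, int prefetch) {
        Deque<Integer> queue = new ArrayDeque<>();
        for (int i = 0; i < queued; i++) {
            queue.add(i);
        }
        int[] unacked = new int[consumers];
        int delivered = 0;
        boolean progressed = true;
        // Round-robin until every consumer has reached its prefetch limit
        // or the queue is empty.
        while (progressed && !queue.isEmpty()) {
            progressed = false;
            for (int c = 0; c < consumers && !queue.isEmpty(); c++) {
                if (unacked[c] < prefetch) {
                    queue.poll();
                    unacked[c]++;
                    delivered++;
                    progressed = true;
                }
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(inFlight(100, 1, 1));   // 1
        System.out.println(inFlight(100, 2, 1));   // 2
        System.out.println(inFlight(100, 10, 1));  // 10
    }
}
```

With a prefetch of 1, the number of messages being processed at the same time equals the number of consumers, so only the single-consumer case keeps strict ordering.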
Related
I have a RabbitMQ setup in which jobs are sent to an exchange, which passes them to a queue. A consumer carries out the jobs from the queue correctly in turn. However, these jobs are long processes (several minutes at least). For scalability, I need to be able to have multiple consumers picking a job from the top of the queue and executing it.
The consumer is running on a Heroku dyno called 'queue'. When I scale the dyno, it appears to create additional consumers for each dyno (I can see these on the RabbitMQ dashboard). However, the number of tasks in the queue is unchanged - the extra consumers appear to be doing nothing. Please see the picture below to understand my setup.
Am I missing something here?
Why are the consumers showing as 'idle'? I know from my logs that at least one consumer is actively working through a task.
How can my consumer utilisation be 0% when at least one consumer is definitely working hard.
How can I make the other three consumers actually pull some jobs from the queue?
Thanks
EDIT: I've discovered that the round robin dispatching is actually working, but only if the additional consumers are already running when the messages are sent to the queue. This seems like counterintuitive behaviour to me. If I saw a large queue and wanted to add more consumers, the added consumers would do nothing until more items are added to the queue.
To pick out the key point from the other answer, the likely culprit here is pre-fetching, as described under "Consumer Acknowledgements and Publisher Confirms".
Rather than delivering one message at a time and waiting for it to be acknowledged, the server will send batches to the consumer. If the consumer acknowledges some but then crashes, the remaining messages will be sent to a different consumer; but if the consumer is still running, the unacknowledged messages won't be sent to any new consumer.
This explains the behaviour you're seeing:
You create the queue, and deliver some messages to it, with no consumer running.
You run a single consumer, and it pre-fetches all the messages on the queue.
You run a second consumer; although the queue isn't empty, all the messages are marked as sent to the first consumer, awaiting acknowledgement; so the second consumer sits idle.
A new message arrives in the queue; it is distributed in round-robin fashion to the second consumer.
The solution is to specify the basic.qos option in the consumer. If you set this to 1, RabbitMQ won't send a message to a consumer until it has acknowledged the previous message; multiple consumers with that setting will receive messages in strictly round-robin fashion.
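The sequence above can be sketched with a toy model. It is plain Java rather than the RabbitMQ client, `LateConsumerModel` and `deliverOnJoin` are hypothetical names, and the numbers (78 queued messages, four consumers) are purely illustrative. The model assumes consumers join one after another and each is immediately given as many ready messages as its prefetch limit allows, with 0 meaning "no limit" as in basic.qos.

```java
import java.util.Arrays;

// Toy model (not the RabbitMQ client) of consumers joining a queue that
// already holds messages.
class LateConsumerModel {

    // prefetch == 0 means "no limit", mirroring basic.qos semantics.
    // Returns how many unacknowledged messages each consumer ends up holding.
    static int[] deliverOnJoin(int queued, int consumers, int prefetch) {
        int[] held = new int[consumers];
        int ready = queued;
        for (int c = 0; c < consumers; c++) {
            int take = (prefetch == 0) ? ready : Math.min(prefetch, ready);
            held[c] = take;
            ready -= take;
        }
        return held;
    }

    public static void main(String[] args) {
        // No prefetch limit: the first consumer takes the whole backlog.
        System.out.println(Arrays.toString(deliverOnJoin(78, 4, 0))); // [78, 0, 0, 0]
        // Prefetch of 1: every consumer gets work straight away.
        System.out.println(Arrays.toString(deliverOnJoin(78, 4, 1))); // [1, 1, 1, 1]
    }
}
```

In the Java client the equivalent setting would be `channel.basicQos(1)` on each consumer's channel, together with manual acknowledgements.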
I am not familiar with Heroku, so I don't know how a Heroku worker builds its RabbitMQ consumer; I have only had a quick look at the Heroku documentation.
Why are the consumers showing as 'idle'?
I think you mean the queue is 'idle'? The queue's state describes the queue's traffic, so 'idle' just means there is no ongoing work for the queue's job thread. It will become 'running' when a message is published to the queue.
How can my consumer utilisation be 0% when at least one consumer is definitely working hard.
As with the queue state, the official explanation says that when consumer utilisation is low, messages could be delivered faster if:
There were more consumers
The consumers were faster
The consumers had a higher prefetch count
In your situation, prefetch_count = 0 means there is no limit on prefetch, so it is effectively too large. And Messages.total = Messages.unacked = 78 means your consumer is too slow: too many messages have been delivered to the consumer and are still waiting to be acknowledged.
So if your message rate is not high enough, the state and consumer utilisation fields of the queue are not useful.
If I saw a large queue and wanted to add more consumers, the added consumers would do nothing until more items are added to the queue.
Because these unacked messages have already been prefetched by the existing consumers, they will not be consumed by new consumers unless you requeue the unacked messages.
Let's suppose we have one producer, one queue and some consumers which are subscribed on queue.
Producer -> Queue -> Consumers
The queue contains messages about life events. All consumers should receive these messages.
When will the queue be erased?
When all consumers have received the message?
Or when one of the consumers confirms the message with the ack flag (true)?
And how can I manage consumer priority, that is, which consumer gets a message first or last (not to be confused with message priority)?
For instance, I have 10 consumers and I want the fifth consumer to get a message first, and the remaining consumers to get it later, after a specified time.
Be careful: when there are many consumers on one queue, only one of them will receive a given message, provided that it is consumed and acked properly. You need to bind as many queues as consumers to an exchange to have all consumers receive the message.
For your priority question, there is no built-in mechanism to have consumers receive the same message with a notion of priority: consumer priority exists (see https://www.rabbitmq.com/consumer-priority.html), but it is meant to have one consumer receive a given message before the others on a given queue, so the other consumers won't receive this message. If you need to orchestrate the delivery of your messages, you have to think of a more complex system (maybe a saga or a resequencer?).
Note that you can delay messages using this pattern. Again, this requires having multiple queues.
Finally, there are many scenarios when a queue is deleted. Take a look at the documentation, these are well explained.
I have one direct exchange. There is also one queue, bound to this exchange.
I have two consumers for that queue. The consumers are manually ack'ing the messages once they've done the corresponding processing.
The messages are logically ordered/sorted, and should be processed in that order. Is it possible to enforce that all messages are received and processed sequentially across consumer A and consumer B? In other words, prevent A and B from processing messages at the same time.
Note: the consumers are not sharing the same connection and/or channel. This means I cannot use <channel>.basicQos(1);.
Rationale of this question: both consumers are identical. If one goes down, the other consumer starts processing messages and everything keeps working without any required intervention.
One approach to handling failover in a case where you want redundant consumers but need to process messages in a specific order is to use the exclusive consumer option when setting up the bind to the queue, and to have two consumers who keep trying to bind even when they can't get the exclusive lock.
The process is something like this:
Consumer A starts first and binds to the queue as an exclusive consumer. Consumer A begins processing messages from the queue.
Consumer B starts next and attempts to bind to the queue as an exclusive consumer, but is rejected because the queue already has an exclusive consumer.
On a recurring basis, consumer B attempts to get an exclusive bind on the queue but is rejected.
Process hosting consumer A crashes.
Consumer B attempts to bind to the queue as an exclusive consumer, and succeeds this time. Consumer B starts processing messages from the queue.
Consumer A is brought back online, and attempts an exclusive bind, but is rejected now.
Consumer B continues to process messages in FIFO order.
While this approach doesn't provide load sharing, it does provide redundancy.
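The failover sequence above can be sketched as a toy model. This is plain Java, not the RabbitMQ client, and `ExclusiveQueueModel` with its methods are hypothetical names. In the real Java client, exclusivity is requested through the `exclusive` flag of `basicConsume`, and the broker answers a refused attempt with an error rather than a boolean; the retry loop for consumer B is left out here.

```java
// Toy model of a queue that admits one exclusive consumer at a time.
// It is not the RabbitMQ client; all names here are hypothetical.
class ExclusiveQueueModel {
    private String holder; // consumer currently holding exclusive access, or null

    // Returns false when another consumer already holds the queue,
    // standing in for the broker refusing a second exclusive consumer.
    synchronized boolean tryConsumeExclusive(String consumer) {
        if (holder != null) {
            return false;
        }
        holder = consumer;
        return true;
    }

    // Called when a consumer's process crashes or its connection drops.
    synchronized void release(String consumer) {
        if (consumer.equals(holder)) {
            holder = null;
        }
    }

    synchronized String activeConsumer() {
        return holder;
    }

    public static void main(String[] args) {
        ExclusiveQueueModel queue = new ExclusiveQueueModel();
        System.out.println(queue.tryConsumeExclusive("A")); // true: A starts first
        System.out.println(queue.tryConsumeExclusive("B")); // false: B is rejected
        queue.release("A");                                 // A crashes
        System.out.println(queue.tryConsumeExclusive("B")); // true: B takes over
        System.out.println(queue.tryConsumeExclusive("A")); // false: A is rejected now
        System.out.println(queue.activeConsumer());         // B
    }
}
```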
Even though this is already answered, maybe this can help others.
RabbitMQ has a feature known as Single Active Consumer, which matches your case.
We can have N consumers attached to a Queue but only 1 (one) of them will be actively consuming messages from the Queue. Fail-over happens only when active consumer fails.
Kindly take a look at the link https://www.rabbitmq.com/consumers.html#single-active-consumer
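As a rough sketch, the feature is switched on per queue through the optional queue argument `x-single-active-consumer` when the queue is declared. The class and method names below are hypothetical; the map would be passed as the last parameter of `channel.queueDeclare(...)` in the Java client, which is not called here so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Builds the optional queue arguments that enable Single Active Consumer.
// Class and method names are illustrative only.
class SingleActiveConsumerArgs {

    static Map<String, Object> queueArguments() {
        Map<String, Object> args = new HashMap<>();
        // Only one of the registered consumers receives messages at a time.
        args.put("x-single-active-consumer", true);
        return args;
    }

    public static void main(String[] args) {
        // With the Java client this map would be used roughly as:
        // channel.queueDeclare("my-queue", true, false, false, queueArguments());
        System.out.println(queueArguments());
    }
}
```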
Thank you
Usually the point of a MQ system is to distribute workload. Of course, there are some situations where processing of message N depends on result of processing the message N-1, or even the N-1 message itself.
If A and B can't process messages at the same time, then why not just have A or just B? As I see it, you are not saving anything with having 2 consumers in a way that one can work only when the other one is not...
In your case, it would be best to have one consumer but to actually do the parallelisation (not a word really) on the processing part.
Just to add that RMQ distributes messages evenly to all consumers (in round-robin fashion) regardless of any criteria. Of course this is when prefetch is set to 1; by default there is no prefetch limit. More info on that here, look for "fair dispatch".
We run multiple concurrent RabbitMQ consumers, each of which executes "basicGet" in a loop. We see that a single consumer gets most of the messages. Is there a way to spread messages more evenly between all consumers? Basically, can we somehow interrupt RabbitMQ serving the first consumer and switch to the next in line? Note: we must pull messages (basicGet) and cannot switch to push (basicConsume). Thanks.
Set a consumer prefetch limit of 1, and put the consumer into noAck: false mode.
... that may be autoAck: false, instead of noAck...
This will force your consumer to only retrieve 1 message at a time, and require you to manually ack the message.
With these two things in place, your messages should distribute across multiple consumers more evenly, assuming you have multiple messages in the queue.
I want to read the payload, or messageId of unacknowledged messages in a RabbitMQ queue. Is this possible?
The reason I want to do so is that I am trying to use the RabbitMQ dead letter feature to build a cycle for auto-generating messages periodically.
Briefly, create two queues - work queue and delay queue.
Set the TTL of the message in the delay queue to the period at which the job needs to repeat. Different messages can have different TTLs for different job purposes.
Put a message into the delay queue. When the message expires, it gets republished into the work queue. The message can sit in the work queue as long as needed until a consumer is up to consume it.
One consumer picks up the message and processes it. If processing succeeds, the consumer needs to acknowledge the message on the work queue and then write it back to the delay queue. If processing fails (e.g., the thread crashes), there is no acknowledgement, so the message re-appears in the work queue automatically and another consumer can take up the job. When the message sent back to the delay queue expires again, it gets republished and then re-consumed by a consumer, and so on. A cycle is constructed and the workload is distributed.
I want to make sure there are no missing or duplicate messages in the cycle, since I do not want to miss a job or run the same job twice at the same time. However, there is a tiny chance that duplicate messages can occur. The code below shows the consumer first writing the message back to the delay queue and then acknowledging the work queue. If the thread crashes right between these two lines, the message will be in the delay queue and, since it was never acknowledged, RabbitMQ will also redeliver it to the work queue. We end up with duplicate messages in the cycle.
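The delay queue in this setup could be declared with arguments along the following lines. This is only a sketch: `DelayQueueArgs`, `delayQueueArguments` and the exchange name are hypothetical, while `x-dead-letter-exchange` and `x-message-ttl` are standard RabbitMQ queue arguments. It sets one TTL for the whole queue; a per-message TTL would instead be set through the message's `expiration` property when publishing.

```java
import java.util.HashMap;
import java.util.Map;

// Builds queue arguments for a delay queue whose expired messages are
// dead-lettered to the exchange feeding the work queue.
// Class, method and exchange names are illustrative only.
class DelayQueueArgs {

    static Map<String, Object> delayQueueArguments(String workExchange, int ttlMillis) {
        Map<String, Object> args = new HashMap<>();
        // Expired messages are republished to this exchange.
        args.put("x-dead-letter-exchange", workExchange);
        // Default time-to-live, in milliseconds, for messages in this queue.
        args.put("x-message-ttl", ttlMillis);
        return args;
    }

    public static void main(String[] args) {
        // With the Java client this map would be passed as the last
        // parameter of channel.queueDeclare(...) for the delay queue.
        System.out.println(delayQueueArguments("work.exchange", 60000));
    }
}
```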
channel.basicPublish(DELAY_EXCHANGE, "", null, message.getBytes());
channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
To prevent the above, I want to add watchdog logic after the two lines above:
Check the total number of messages in the cycle (total messages in both queues) to see whether it is equal to my expected number (I expect the number to be less than 10);
If the number does not match, I want to figure out which message is missing or which one is duplicated, and then deal with it. I do not care if the sequence of those messages or the frequency has been disturbed, since this is a real edge case. I can easily retrieve the messages which are ready and requeue them. But the problem is how to deal with the unacknowledged messages?
Thank you very much in advance!
Roy
It's not possible to read unacknowledged messages from any context other than the one in which the original message was consumed and is being held as unacked.