I have a setup of multiple queues to distribute different types of memory- and time-intensive tasks to workers.
While the worker instances are intended to listen to different subsets of the queues, each worker should only consume a limited number of messages (usually 1) at a time to prevent it from running out of memory. We are using a SimpleMessageListenerContainer for the queues with a concurrency and prefetch count of e.g. 1.
The problem is that the prefetch count seems to be set on the channel with the global option set to false, so it is not applied per channel, but per consumer / queue (see docs).
As a result, a single worker that processes a long running task blocks a message on every queue it listens to, while the other workers idle.
Because of the requirement that the workers can be configured for changing subsets of task types, we cannot route all tasks to a single queue or worker specific ones.
I couldn't find a way to change the qos setting on the channel before the consumers are subscribed, because everything happens within the start method of the BlockingQueueConsumer.
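To make the difference concrete, here is a toy model in Python (not Spring AMQP code) of one worker whose single channel consumes from several queues. The `Channel` class, the `messages_held` helper and the queue names are all made up for illustration; the only thing taken from RabbitMQ is the rule that a non-global prefetch limit applies to each consumer separately, while a global one is shared by all consumers on the channel.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Toy model of one worker's channel consuming from several queues."""
    prefetch: int
    global_qos: bool
    unacked: dict = field(default_factory=dict)  # queue name -> in-flight count

    def can_deliver(self, queue: str) -> bool:
        if self.global_qos:
            # The limit is shared by every consumer on the channel.
            return sum(self.unacked.values()) < self.prefetch
        # The limit applies separately to each consumer (one per queue here).
        return self.unacked.get(queue, 0) < self.prefetch

    def deliver(self, queue: str) -> bool:
        if not self.can_deliver(queue):
            return False
        self.unacked[queue] = self.unacked.get(queue, 0) + 1
        return True


def messages_held(queues, global_qos, prefetch=1):
    """Count messages one worker ends up holding when every queue has work."""
    channel = Channel(prefetch=prefetch, global_qos=global_qos)
    return sum(channel.deliver(q) for q in queues)


queues = ["images", "reports", "exports"]  # hypothetical queue names
print(messages_held(queues, global_qos=False))  # 3: one per queue
print(messages_held(queues, global_qos=True))   # 1: one for the whole worker
```

With the per-consumer limit the busy worker pins one message on every queue it listens to, which matches the blocking described above; with a channel-wide limit it would hold only the message it is actually processing.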
Related
I have a RabbitMQ setup in which jobs are sent to an exchange, which passes them to a queue. A consumer carries out the jobs from the queue correctly in turn. However, these jobs are long processes (several minutes at least). For scalability, I need to be able to have multiple consumers picking a job from the top of the queue and executing it.
The consumer is running on a Heroku dyno called 'queue'. When I scale the dyno, it appears to create additional consumers for each dyno (I can see these on the RabbitMQ dashboard). However, the number of tasks in the queue is unchanged - the extra consumers appear to be doing nothing. Please see the picture below to understand my setup.
Am I missing something here?
Why are the consumers showing as 'idle'? I know from my logs that at least one consumer is actively working through a task.
How can my consumer utilisation be 0% when at least one consumer is definitely working hard?
How can I make the other three consumers actually pull some jobs from the queue?
Thanks
EDIT: I've discovered that the round robin dispatching is actually working, but only if the additional consumers are already running when the messages are sent to the queue. This seems like counterintuitive behaviour to me. If I saw a large queue and wanted to add more consumers, the added consumers would do nothing until more items are added to the queue.
To pick out the key point from the other answer, the likely culprit here is pre-fetching, as described under "Consumer Acknowledgements and Publisher Confirms".
Rather than delivering one message at a time and waiting for it to be acknowledged, the server will send batches to the consumer. If the consumer acknowledges some but then crashes, the remaining messages will be sent to a different consumer; but if the consumer is still running, the unacknowledged messages won't be sent to any new consumer.
This explains the behaviour you're seeing:
You create the queue, and deliver some messages to it, with no consumer running.
You run a single consumer, and it pre-fetches all the messages on the queue.
You run a second consumer; although the queue isn't empty, all the messages are marked as sent to the first consumer, awaiting acknowledgement; so the second consumer sits idle.
A new message arrives in the queue; it is distributed in round-robin fashion to the second consumer.
The solution is to set a prefetch count with basic.qos in the consumer. If you set this to 1, RabbitMQ won't send another message to a consumer until it has acknowledged the previous one; with that setting, each message goes to a consumer that is free to take it, so a newly added consumer can pick up work from the existing backlog.
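The sequence above can be reproduced with a small simulation. This is only a sketch of the dispatch rule, not broker code: `dispatch` and the job names are invented for the example, `prefetch=None` stands in for an unlimited prefetch, and no message is ever acknowledged, as if every job were long running.

```python
from collections import deque


def dispatch(backlog, consumers_in_join_order, prefetch):
    """Hand out a backlog as consumers join one after another.

    prefetch=None models an unlimited prefetch; an integer caps the number of
    unacknowledged messages each consumer may hold.
    """
    queue = deque(backlog)
    held = {name: [] for name in consumers_in_join_order}
    for name in consumers_in_join_order:
        # A joining consumer is sent messages until its prefetch limit is hit.
        while queue and (prefetch is None or len(held[name]) < prefetch):
            held[name].append(queue.popleft())
    return held, list(queue)


jobs = ["job1", "job2", "job3", "job4", "job5"]

held, waiting = dispatch(jobs, ["first", "second"], prefetch=None)
print(held, waiting)
# {'first': ['job1', 'job2', 'job3', 'job4', 'job5'], 'second': []} []

held, waiting = dispatch(jobs, ["first", "second"], prefetch=1)
print(held, waiting)
# {'first': ['job1'], 'second': ['job2']} ['job3', 'job4', 'job5']
```

In the unlimited case the second consumer has nothing to do even though five jobs are outstanding; with a limit of 1 the remaining jobs stay in the queue for whichever consumer acknowledges first.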
I am not familiar with Heroku, so I don't know how a Heroku worker builds its RabbitMQ consumer; I only had a quick look at the Heroku documentation.
Why are the consumers showing as 'idle'?
I think you mean the queue is 'idle'? The queue's state reflects the queue's traffic, so 'idle' just means there is no ongoing work for the queue's own job thread. It becomes 'running' when a message is published to the queue.
How can my consumer utilisation be 0% when at least one consumer is definitely working hard?
As with the queue state, the official explanation is that a consumer utilisation below 100% means messages could be delivered faster if:
There were more consumers
The consumers were faster
The consumers had a higher prefetch count
In your situation, prefetch_count = 0 means there is no limit on prefetch, so it is effectively too large. And Messages.total = Messages.unacked = 78 means your consumer is too slow: too many messages have already been delivered to the consumer and are still waiting to be acknowledged.
So if your message rate is not high enough, the state and consumer utilisation fields of the queue are of little use.
If I saw a large queue and wanted to add more consumers, the added consumers would do nothing until more items are added to the queue.
Because these unacked messages have already been prefetched by the existing consumers, they will not be consumed by new consumers unless the unacked messages are requeued.
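A minimal sketch of a consumer that limits its own prefetch, so a backlog is not swallowed by whichever consumer connects first. It assumes a channel object with the method names used by the Python client pika 1.x (`basic_qos`, `basic_consume`, `basic_ack`); `start_worker` and `handle` are hypothetical names, and the Heroku worker in the question may well use a different client.

```python
def start_worker(channel, queue_name, handle):
    """Consume from queue_name with at most one unacknowledged message."""
    # Cap the number of in-flight deliveries before subscribing.
    channel.basic_qos(prefetch_count=1)

    def on_message(ch, method, properties, body):
        handle(body)
        # Acknowledge only after the work is done; this frees the slot
        # so the broker can send the next message.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(
        queue=queue_name, on_message_callback=on_message, auto_ack=False
    )
    return on_message
```

With pika the channel would typically come from `pika.BlockingConnection(...).channel()`, followed by `channel.start_consuming()` after the call above. Manual acknowledgement matters here: with automatic acknowledgement the prefetch limit has no effect, because messages count as acknowledged as soon as they are sent.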
Similar to this question, we have FIFO queues and the messages must be processed in order. We want competing consumers from different machines for redundancy and performance reasons, but only one consumer on one machine should handle a message for a given queue at a time.
I tried setting the prefetch count to 1, but I believe this will only work if used with a single machine. Is this possible by default with RabbitMQ or do we need to implement our own lock?
Given a single queue with multiple consumers, there is no way to block one of the consumers; all of them receive the messages in round-robin fashion.
EDIT
See https://www.rabbitmq.com/consumers.html#single-active-consumer
/EDIT
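For reference, single active consumer is switched on per queue with the optional queue argument `x-single-active-consumer` at declaration time (available in RabbitMQ 3.8 and later). A minimal sketch, assuming a channel object with a pika-style `queue_declare` method; the helper name and the choice of a durable queue are illustrative assumptions, not part of the question.

```python
def declare_single_active_queue(channel, queue_name):
    """Declare a durable queue that delivers to one consumer at a time."""
    return channel.queue_declare(
        queue=queue_name,
        durable=True,
        # Optional queue argument that enables single active consumer.
        arguments={"x-single-active-consumer": True},
    )
```

Consumers on every machine can then subscribe as usual; the broker delivers to only one of them and moves on to another registered consumer if the active one is cancelled or dies. Note that a queue which already exists with different arguments cannot be redeclared this way; it would have to be deleted and recreated.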
You could look at this plugin, https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange, to distribute the load across different queues.
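The idea behind spreading load over several queues while keeping order can be sketched as follows. This is a simplified stand-in, not the plugin's algorithm: the plugin uses consistent hashing with weighted bindings, whereas this example just takes a CRC32 of the routing key modulo the number of queues. `pick_queue`, the queue names and the keys are hypothetical.

```python
import zlib


def pick_queue(routing_key, queues):
    """Map a routing key to one queue so equal keys always share a queue."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    index = zlib.crc32(routing_key.encode("utf-8")) % len(queues)
    return queues[index]


queues = ["orders.0", "orders.1", "orders.2"]  # hypothetical shard queues
for key in ["customer-17", "customer-42", "customer-17"]:
    print(key, "->", pick_queue(key, queues))
```

Because every message with the same key lands in the same queue, ordering is preserved per key as long as each queue has a single consumer at a time, while different keys can be processed in parallel.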
I tried setting the prefetch count to 1
prefetch=1 means that the consumers take one message at a time.
do we need to implement our own lock
Yes, if you want a single consumer per queue and need to keep other consumers out.
EDIT
There are also the Exclusive Queues https://www.rabbitmq.com/queues.html#exclusive-queues but note:
Exclusive queues are deleted when their declaring connection is closed or gone (e.g. due to underlying TCP connection loss). They, therefore, are only suitable for client-specific transient state.
I have a camel route processing messages from a RabbitMQ endpoint. I am keeping the defaults for concurrentConsumers (1) and threadPoolSize(10).
I am relatively new to RabbitMQ, and still do not quite understand the relationship between the concurrentConsumers and threadPoolSize properties. The messages in my queues need to be processed in sequence, which I think can be achieved by using a single consumer. However, will using a threadPoolSize value greater than one cause messages to be processed in parallel?
The default value is 10 (source : https://camel.apache.org/components/latest/rabbitmq-component.html)
It won't affect your concurrency. It means that the single consumer will have 10 threads available for processing. You can look at exclusiveConsumer if you want only one consumer shared between all your apps (needed if multiple apps could target the queue).
We're seeing an issue where consumers of our message queues are picking up messages from queues at the top of the alphabetical range. We have two applications: a producer, and a subscriber. We're using RabbitMQ 3.6.1.
Let's say that the message queues are setup like so:
Our first application, the producer, puts say 100 messages/second onto each queue:
Our second application, the subscriber, has five unique consumer methods that can deal with messages on each respective queue. Each method binds to its respective queue. A subscriber has a prefetch of 1, meaning it can only hold one message at a time, regardless of queue. We may run numerous instances of the subscriber like so:
So the situation is thus: each queue is receiving 100 msg/sec, and we have four instances of subscriber consuming these messages, so each queue has four consumers. Let's say that the consumer methods can deal with 25 msg/sec each.
What happens is that instead of all the queues being consumed equally, the alphabetically higher queues get priority. It seems as though when the subscriber becomes ready, RabbitMQ looks down the list of queues that this particular ready channel is bound to, and picks the first queue with pending messages.
In our situation, A_QUEUE will have every message consumed. B_QUEUE may have some consumed in certain race conditions, but C_QUEUE/D_QUEUE and especially E_QUEUE will rarely get touched.
If we turn off the publisher, the queues will eventually drain, top to bottom.
Is it possible to configure either RabbitMQ itself or possibly even the channel to use some sort of round robin distribution policy or maybe even random policy so that when a channel has numerous bound queues, all with messages pending, the distribution is even?
To clarify: you have a single subscriber application with multiple consumers in it, right?
I'm guessing you're using a single RabbitMQ Connection within the subscriber app.
Are you also re-using a single RabbitMQ Channel for all of your consumers? If so, that would be a problem. Be sure to use a new Channel for each consumer you start.
Maybe the picture is wrong, but if it's not then your setup is wrong. You don't need 4 queues if you are going to have subscribers that listen to each and every queue. You'd just need one queue, that has multiple instances of the same subscriber consuming from it.
Now to answer: yes (but there is no need to configure anything, as long as prefetch is 1), RabbitMQ does distribute messages evenly. You can find out about that here, along with how your setup should look. Here is a quote from the link.
RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.
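The quoted default behaviour, with no prefetch limit in play, amounts to plain round-robin assignment. A small sketch with made-up consumer names (`round_robin` is a helper written for this example, not a RabbitMQ API):

```python
from itertools import cycle


def round_robin(messages, consumers):
    """Assign each message to the next consumer in turn, ignoring load."""
    assignment = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        assignment[name].append(message)
    return assignment


print(round_robin(range(1, 7), ["c1", "c2", "c3"]))
# {'c1': [1, 4], 'c2': [2, 5], 'c3': [3, 6]}
```

Each consumer gets the same number of messages regardless of how long each one takes, which is why a prefetch limit is needed when processing times vary.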
I'm new to RabbitMQ and I'm wondering how to implement the following: a producer creates tasks for multiple sites, and there's a bunch of consumers that should process these tasks one by one, but only talking to one site with a concurrency of 1, without starting a new task for this site before the previous one has ended. This way slow sites would be processed slowly, and fast ones fast (as opposed to slow sites taking up all the worker capacity).
Ideally a site would be processed only by one worker at a time, being replaced by another worker if it dies. This seems like a task for exclusive queues, but apparently there's no easy way to list and subscribe to new queues. What is the proper way to achieve such results with RabbitMQ?
I think you may have things the wrong way round. For workers you have 1 or more producers sending to 1 exchange. The exchange has 1 queue (you can send directly to the queue, but all that is really doing is going via the default exchange; I prefer to be explicit). All consumers connect to the single queue and read off tasks in turn. You should set the queue to require messages to be ACKed before removing them. That way, if a worker process dies, its unacknowledged message should be returned to the queue and picked up by the next consumer/worker.
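The acknowledgement behaviour described here can be illustrated with a toy queue. This models the idea only and is not RabbitMQ code; `WorkQueue`, its methods and the worker names are hypothetical, and each consumer is assumed to hold at most one message at a time.

```python
from collections import deque


class WorkQueue:
    """Toy queue that keeps a message until its delivery is acknowledged."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}  # consumer name -> message in flight

    def deliver(self, consumer):
        if consumer in self.unacked or not self.ready:
            return None
        message = self.ready.popleft()
        self.unacked[consumer] = message
        return message

    def ack(self, consumer):
        # The work is finished, so the message is removed for good.
        self.unacked.pop(consumer)

    def consumer_died(self, consumer):
        # The in-flight message goes back to the head of the queue.
        message = self.unacked.pop(consumer, None)
        if message is not None:
            self.ready.appendleft(message)


q = WorkQueue(["task1", "task2"])
q.deliver("worker-a")         # worker-a takes task1
q.consumer_died("worker-a")   # worker-a crashes before acknowledging
print(q.deliver("worker-b"))  # task1
```

The task is not lost when the first worker dies, because the queue only forgets it once an acknowledgement arrives.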