I want a consumer to perform some actions every time a message is received. Must the consumer be running 24/7, "listening" to the queue, or can it be run only when an appropriate message is received?
I am not sure your question makes sense. A message can only be received from a queue by a consumer of that queue. To know whether a message is in the queue, something must look at the queue, and the only way to do that is to be a consumer.
If you really wanted to, you could have a script that runs the command-line interface for the management plugin. It could poll the queue and, when the queue has messages waiting, start a program that runs a consumer to consume from the queue.
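For completeness, here is a rough sketch of that polling idea; instead of the management plugin's CLI it uses a passive queue declare from the Java client, which reports the number of ready messages without consuming them (the host, the queue name, and the launched consumer program are all hypothetical):

import com.rabbitmq.client.*;

public class QueuePoller {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                                  // assumed broker host
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // A passive declare fails if the queue does not exist and returns its current counts.
            AMQP.Queue.DeclareOk ok = channel.queueDeclarePassive("work_queue"); // hypothetical queue name
            if (ok.getMessageCount() > 0) {
                // Start whatever program actually consumes the queue, e.g. a separate process.
                new ProcessBuilder("java", "-jar", "consumer.jar").start();      // hypothetical consumer launcher
            }
        }
    }
}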
None of this really makes sense, though. A consumer that is just sitting waiting on the queue and doing nothing else consumes hardly any resources, so I do not see what the problem would be with running a consumer 24/7.
Of course the consumer doesn't have to run 24/7; that's part of the point of MQ. It is asynchronous: the consumer does not have to be running when the producer writes to the queue. You could therefore have a scheduled task that runs your consumer periodically to check for and process messages from the queue. But I do not think that is what you want.
The whole point of listening is: do nothing until a message comes, process the message, then do nothing until the next message. That is exactly what the first sentence of your question asks for, so why the objection to listening?
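For illustration, a minimal sketch of such a listening consumer with the RabbitMQ Java client (the queue name is hypothetical); it sits idle until a message arrives, processes it, and then goes back to waiting:

import com.rabbitmq.client.*;

public class ListeningConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare("task_queue", true, false, false, null); // hypothetical queue

        DeliverCallback onMessage = (consumerTag, delivery) -> {
            String body = new String(delivery.getBody(), "UTF-8");
            System.out.println("Received: " + body);
            // ... perform the per-message actions here ...
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        };
        // basicConsume registers the callback and returns immediately; the client's own
        // (non-daemon) threads keep the process alive and idle until messages arrive.
        channel.basicConsume("task_queue", false, onMessage, consumerTag -> { });
    }
}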
I have a publisher that sends messages to a consumer that moves a motor.
The motor has a work queue which I cannot access, and it processes commands more slowly than the rate of the incoming messages, so I'm trying to control the traffic on the consumer side.
To keep updated and relevant data coming to the motor without the queue filling up and creating a traffic jam, I set the RabbitMQ queue size limit to 5 and basicQos to 1.
The idea is that the RabbitMQ queue will drop the oldest messages when it fills up, so that only the newest commands remain in the queue.
Also, by setting basicQos to 1, I ensure that the consumer doesn't grab all the messages from the queue and bombard the motor at once, which is exactly what I'm trying to avoid, since I can't do anything once a command has been sent to the motor.
This way the consumer takes messages from the queue one by one, while new messages replace the old ones on the queue.
In practice, this moves the bottleneck to the RabbitMQ queue instead of the motor's queue.
I also cannot check the motor's work queue, so all traffic control must be done on the consumer.
I added a messageId and tested, and found that many messages are still coming through long after the publisher has been shut down.
I'm expecting around 5 messages after shutdown, since that's the size limit of the queue, but I'm getting hundreds.
I also added a few seconds of sleep inside the callback to make sure this isn't the robot's queue acting up, but I'm still getting many messages after shutdown, and I can see in the logs that the callback is being called every time, so the consumer is definitely still getting messages from somewhere.
Please help.
Thanks.
Moving the acknowledgment to the end of the callback solved the problem.
I'm guessing that with basicQos set to 1 the callback was indeed executed for each message one after another, but because the message was being acknowledged before the work was done, the client kept grabbing further messages from the queue in the background.
So even after the publisher was shut down, the consumer still held messages it had already taken from the queue, and those were the messages I saw being executed.
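For reference, a rough sketch of that working pattern with the RabbitMQ Java client, assuming an already-open channel; the queue name and the sendToMotor call are hypothetical, and the key point is that basicAck is the last thing the callback does:

channel.basicQos(1); // at most one unacknowledged message in flight for this consumer

DeliverCallback onCommand = (consumerTag, delivery) -> {
    sendToMotor(delivery.getBody());   // hypothetical call that forwards the command to the motor
    // Acknowledge only after the work is done; acking earlier lets the broker
    // (and the client's internal buffer) race far ahead of the motor.
    channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
};
channel.basicConsume("motor_commands", false, onCommand, consumerTag -> { }); // autoAck must be false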
I am using a RabbitMQ Server (v3.8.9) with Java clients.
Use case is:
Our Backend creates messages for different clients. We send them out to their respective Endpoints.
1 Producer -> Outbound Queue -> 1 Consumer
The producer creates messages for n clients
Which the consumer should send out to the clients' endpoints
Messages must be kept in the correct order regarding each client
This works fine as long as all clients are up and running. Problem: if one client becomes unavailable, we need to have a bulletproof retry mechanism for that.
Say:
Wait 1 Minute and try again
All following messages must NOT be delivered before the first failed one and must be kept in the correct order
If a retry works, then ALL other messages should be sent to the client immediately
As you can see, it is not a solution to just "suspend" the consumer, because it should still deliver messages to the other (alive) clients. Due to application limitations and a dynamic number of clients, we cannot spawn one consumer per client queue.
My best approach right now is to dynamically create one queue per client, which would then be routed to a single outbound queue. If one message to one client cannot be delivered by the consumer, I would like to "pause" that client's queue for x minutes. An API call like "queue_pause('client_q1', '5 Minutes')" would help. But even then I would have to deal with the other messages already routed to that particular client and keep them in the correct order...
Any better ideas?
I think the key here is that a single consumer script can consume from multiple queues. So if I'm understanding correctly, you could model this as:
Each client has its own queue. These could be created by the consumer script when it starts up, or by a back-end process when a new client is created.
The consumer script subscribes to each queue separately
When a message is received, the consumer tries to send it immediately to the client; if it succeeds, it is manually acknowledged with basic.ack, and the consumer is ready to send the next message to that client.
When a message cannot be delivered to the client, it is requeued (basic.nack or basic.reject with requeue=1), retaining its position in the client's queue.
The consumer then needs to pause consuming from that particular queue. Depending on how it's written, that could be as simple as a sleep in that particular thread, but if that's not practical, you can effectively "pause" the subscription to the queue:
Cancel the subscription to that queue, leaving the other subscriptions intact
Store the queue name and the retry time in an appropriate variable
If the consumer script is implemented with an event/polling loop, check the list of "paused" subscriptions each time around that loop; if the retry time has been reached, re-subscribe.
Alternatively, if the library / framework supports it, register a delayed event that will fire at the appropriate time and re-subscribe to the queue. The exact mechanics of this depend on the technologies you're using; a rough sketch of the cancel/re-subscribe approach follows below.
All the other subscriptions will continue, so messages to other clients will be delivered. The queue with no subscribers will retain the messages for the offline client in order until the consumer script starts consuming them again.
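Here is a minimal sketch of that cancel/re-subscribe idea with the RabbitMQ Java client. The queue names, the trySendToClient call, and the one-minute retry delay are all assumptions; the queues themselves are assumed to already exist:

import com.rabbitmq.client.*;
import java.util.Map;
import java.util.concurrent.*;

public class PerClientConsumer {
    // Consumer tag per client queue, so each subscription can be cancelled individually.
    private static final Map<String, String> consumerTags = new ConcurrentHashMap<>();
    private static final ScheduledExecutorService retryTimer = Executors.newSingleThreadScheduledExecutor();

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.basicQos(1); // one unacked message per consumer, so nothing extra is buffered client-side
        for (String queue : new String[]{"client_q1", "client_q2"}) { // hypothetical per-client queues
            subscribe(channel, queue);
        }
    }

    static void subscribe(Channel channel, String queue) throws Exception {
        DeliverCallback onMessage = (consumerTag, delivery) -> {
            long deliveryTag = delivery.getEnvelope().getDeliveryTag();
            if (trySendToClient(queue, delivery.getBody())) {          // hypothetical delivery attempt
                channel.basicAck(deliveryTag, false);
            } else {
                channel.basicNack(deliveryTag, false, true);           // requeue, keeping the client's order
                channel.basicCancel(consumerTags.remove(queue));       // pause only this client's queue
                retryTimer.schedule(() -> {
                    try {
                        subscribe(channel, queue);                     // resume after the retry delay
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }, 1, TimeUnit.MINUTES);
            }
        };
        consumerTags.put(queue, channel.basicConsume(queue, false, onMessage, cancelTag -> { }));
    }

    static boolean trySendToClient(String queue, byte[] body) { return true; } // placeholder for the real endpoint call
}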
I have a RabbitMQ setup in which jobs are sent to an exchange, which passes them to a queue. A consumer carries out the jobs from the queue correctly in turn. However, these jobs are long processes (several minutes at least). For scalability, I need to be able to have multiple consumers picking a job from the top of the queue and executing it.
The consumer is running on a Heroku dyno called 'queue'. When I scale the dyno, it appears to create additional consumers for each dyno (I can see these on the RabbitMQ dashboard). However, the number of tasks in the queue is unchanged - the extra consumers appear to be doing nothing. Please see the picture below to understand my setup.
Am I missing something here?
Why are the consumers showing as 'idle'? I know from my logs that at least one consumer is actively working through a task.
How can my consumer utilisation be 0% when at least one consumer is definitely working hard?
How can I make the other three consumers actually pull some jobs from the queue?
Thanks
EDIT: I've discovered that the round-robin dispatching is actually working, but only if the additional consumers are already running when the messages are sent to the queue. This seems like counterintuitive behaviour to me: if I saw a large queue and wanted to add more consumers, the added consumers would do nothing until more items were added to the queue.
To pick out the key point from the other answer, the likely culprit here is pre-fetching, as described under "Consumer Acknowledgements and Publisher Confirms".
Rather than delivering one message at a time and waiting for it to be acknowledged, the server will send batches to the consumer. If the consumer acknowledges some but then crashes, the remaining messages will be sent to a different consumer; but if the consumer is still running, the unacknowledged messages won't be sent to any new consumer.
This explains the behaviour you're seeing:
You create the queue, and deliver some messages to it, with no consumer running.
You run a single consumer, and it pre-fetches all the messages on the queue.
You run a second consumer; although the queue isn't empty, all the messages are marked as sent to the first consumer, awaiting acknowledgement; so the second consumer sits idle.
A new message arrives in the queue; it is distributed in round-robin fashion to the second consumer.
The solution is to specify the basic.qos option in the consumer. If you set this to 1, RabbitMQ won't send a message to a consumer until it has acknowledged the previous message; multiple consumers with that setting will receive messages in strictly round-robin fashion.
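With the Java client, for instance, that looks roughly like this (the queue name and job handler are hypothetical); the prefetch limit is set on the channel before subscribing, and manual acknowledgement must be used for it to have any effect:

channel.basicQos(1); // deliver at most one unacknowledged message to this consumer at a time

DeliverCallback onJob = (consumerTag, delivery) -> {
    runLongJob(delivery.getBody()); // hypothetical long-running job
    channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
};
channel.basicConsume("jobs", false, onJob, consumerTag -> { }); // manual ack, so the qos limit applies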
I am not familiar with Heroku, so I don't know how a Heroku worker builds a RabbitMQ consumer; I have only had a quick look at the Heroku documentation.
Why are the consumers showing as 'idle'?
I think you mean the queue is 'idle'? The queue's state reflects the queue's traffic: 'idle' just means there is currently no ongoing work for the queue's process, and it will change to 'running' when a message is published to the queue.
How can my consumer utilisation be 0% when at least one consumer is definitely working hard.
The same applies as for the queue state. According to the official explanation, consumer utilisation would be higher if:
There were more consumers
The consumers were faster
The consumers had a higher prefetch count
In your situation, prefetch_count = 0 means there is no limit on prefetch, so it is effectively as large as possible. And Messages.total = Messages.unacked = 78 means your consumer is too slow: too many messages have been delivered to the consumer but not yet acknowledged.
So if your message rate is not high enough, the state and consumer utilisation fields of the queue are not very meaningful.
If I saw a large queue and wanted to add more consumers, the added consumers would do nothing until more items are added to the queue.
Because these unacked messages have already been prefetched by the existing consumers, they will not be consumed by new consumers unless you requeue them.
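For completeness, the already-prefetched messages can be released either by closing the consumer's channel or connection, or by asking the broker to requeue them from that same channel (Java client sketch, assuming an open channel):

// Requeue every message that was delivered on this channel but not yet acknowledged,
// making it available to other consumers.
channel.basicRecover(true);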
I am using Celery and RabbitMQ, but after pushing several tasks onto the queue my server's memory utilization rises above 40%, and RabbitMQ then stops accepting further tasks. I want to delete the messages that have already been executed, but because of RabbitMQ's durable behaviour those messages are not deleted automatically. I want to set some configuration like autoAck=True, so that once a message is consumed by Celery it is deleted from the RabbitMQ queue and freed from my server's memory. Please explain how I can do that.
OK, so while I don't fully understand why you have the problem you have, it is clear what is going on.
A publisher puts a message task in the queue
Your worker process pulls the message and processes it
The message is never actually removed from the queue
This behavior happens when a consumer fails to acknowledge the processing of a message. To confirm, if you look at the RabbitMQ management plug-in, you'll see a whole bunch of unacknowledged messages. These will be unavailable for consumption, but will continue to be held on the server, taking up disk space and memory.
Further, if you do a Basic.Recover, all of these messages will then get dumped back into the queue to be processed again.
This problem is due to incorrect configuration of your consumer. There are two ways to address this:
You can configure the consumer to auto-ack (i.e. acknowledge the message automatically upon receipt). This is done when you declare the consumer (using Basic.Consume). Edit: It looks like this may be the default behavior of Celery.
You can configure your worker process to submit an acknowledgement (using Basic.Ack). Edit: this is done via the acks_late property in Celery.
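In raw AMQP terms the two options look roughly like this (sketched with the RabbitMQ Java client rather than Celery itself, assuming an open channel; "celery" is Celery's default queue name, process is a hypothetical handler, and only one of the two consume calls would actually be used):

// Option 1: auto-ack - the broker removes the message as soon as it is delivered.
DeliverCallback fireAndForget = (consumerTag, delivery) -> process(delivery.getBody()); // hypothetical handler
channel.basicConsume("celery", true, fireAndForget, consumerTag -> { });

// Option 2: manual ack - the message is removed only when the worker sends Basic.Ack after the work is done.
DeliverCallback ackWhenDone = (consumerTag, delivery) -> {
    process(delivery.getBody());
    channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
};
channel.basicConsume("celery", false, ackWhenDone, consumerTag -> { });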
I want to read the payload, or messageId of unacknowledged messages in a RabbitMQ queue. Is this possible?
The reason I want to do so is that I am trying to use RabbitMQ's dead-letter feature to build a cycle for auto-generating messages periodically.
Briefly, create two queues - work queue and delay queue.
Set the TTL of each message in the delay queue to the period at which it needs to recur. Different messages can have different TTLs for different job purposes;
put a message into the delay queue. When the message expires, it gets republished into the work queue. The message can sit in the work queue as long as needed until a consumer is up to consume it.
One consumer picks up the message and processes it. If processing succeeds, the consumer acknowledges the work queue and then writes the message back to the delay queue; if processing fails (e.g., the thread crashes), there is no acknowledgement, so the message re-appears in the work queue automatically and another consumer can take up the job. When the message sent back to the delay queue expires again, it gets republished and re-consumed by a consumer... A cycle is constructed and the workload distributed.
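For reference, a rough sketch of how that wiring can be set up with the Java client, assuming an open channel and that the exchange/queue names (DELAY_EXCHANGE, WORK_EXCHANGE, DELAY_QUEUE, WORK_QUEUE) are constants defined elsewhere; the delay queue dead-letters expired messages to an exchange that feeds the work queue, and the per-message TTL sets the period:

// The delay queue dead-letters expired messages to WORK_EXCHANGE, which feeds the work queue.
channel.exchangeDeclare(DELAY_EXCHANGE, "fanout", true);
channel.exchangeDeclare(WORK_EXCHANGE, "fanout", true);

java.util.Map<String, Object> delayArgs = new java.util.HashMap<>();
delayArgs.put("x-dead-letter-exchange", WORK_EXCHANGE);
channel.queueDeclare(DELAY_QUEUE, true, false, false, delayArgs);
channel.queueBind(DELAY_QUEUE, DELAY_EXCHANGE, "");

channel.queueDeclare(WORK_QUEUE, true, false, false, null);
channel.queueBind(WORK_QUEUE, WORK_EXCHANGE, "");

// Seed the cycle with one message whose TTL is the desired period (milliseconds, as a string).
AMQP.BasicProperties props = new AMQP.BasicProperties.Builder().expiration("60000").build();
channel.basicPublish(DELAY_EXCHANGE, "", props, "job-1".getBytes());

One caveat worth noting: expired messages are only dead-lettered when they reach the head of the delay queue, so mixing very different TTLs in a single delay queue can hold back messages whose TTL has already passed.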
I want to make sure there are no missing or duplicate messages in the cycle, since I do not want a job to be skipped or done twice at the same time. However, there is a tiny chance that duplicate messages can occur. The two lines below show the consumer first writing the message back to the delay queue and then acknowledging the work queue. If the thread crashes right between these two lines, the message is already in the delay queue, and RabbitMQ will also republish the original message into the work queue, so the cycle ends up with duplicate messages.
channel.basicPublish(DELAY_EXCHANGE, "", null, message.getBytes());
channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
To prevent the above, I want to add watchdog logic after those two lines:
Check the total number of messages in the cycle (the total across both queues) to see whether it matches my expected number (I expect the number to be less than 10);
If the number does not match, I want to figure out which message is missing or which is duplicated, and then deal with it. I do not care about the sequence of the messages, or that the frequency has been disturbed, since this is a really, really rare edge case. I can easily retrieve the messages that are ready and requeue them. But the problem is: how do I deal with the unacknowledged messages?
Thank you very much in advance!
Roy
It's not possible to read unacknowledged messages from any context other than the one in which the original messages were consumed and are being held as un-acked.
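As a partial workaround for the counting part, a passive declare from any connection does return the number of ready messages, just not the unacknowledged ones (a sketch, assuming an open channel and hypothetical queue-name constants):

// getMessageCount() counts READY messages only; messages delivered but not yet acked are excluded.
int readyInCycle = channel.queueDeclarePassive(WORK_QUEUE).getMessageCount()
        + channel.queueDeclarePassive(DELAY_QUEUE).getMessageCount();
// The count (but not the payload) of unacked messages is only visible via the management plugin/API.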