RabbitMQ has non-visible queue messages

For the past couple of days my RMQ server has been accumulating messages that never get cleared from the queue. I'm using RMQ as a Celery task broker, and the messages are accumulating in the main celery worker queue:
The queue intermittently switches to an active state when tasks are piped in, processes them, and then goes idle again. Those 21 messages, however, are there permanently. When I click through to the celery queue UI and try to look up these messages for inspection, I'm told that my queue is empty:
I've tried inspecting the messages via the URL endpoint https://<user>:<password>@<host url>/api/queues/%2f/celery. I can see in the response that there are 21 messages, but I can't inspect the messages individually:
'message_stats': {'ack': 274597,
'ack_details': {'rate': 0.0},
'deliver': 274634,
'deliver_details': {'rate': 0.0},
'deliver_get': 274634,
'deliver_get_details': {'rate': 0.0},
'deliver_no_ack': 0,
'deliver_no_ack_details': {'rate': 0.0},
'get': 0,
'get_details': {'rate': 0.0},
'get_empty': 0,
'get_empty_details': {'rate': 0.0},
'get_no_ack': 0,
'get_no_ack_details': {'rate': 0.0},
'publish': 274618,
'publish_details': {'rate': 0.0},
'redeliver': 16,
'redeliver_details': {'rate': 0.0}},
'messages': 21,
'messages_details': {'rate': 0.0},
I have two questions:
Are there any other RMQ API endpoints that I should be using for individual-level message inspection? I'm trying to avoid firing up a flower server just for debugging purposes.
Are there any obvious reasons why these 21 tasks are not inspectable via the UI, or why they seem to just idly accumulate in this queue?

Those remaining messages are unacked, which means they were delivered to some consumer that has yet to acknowledge them.
If, from the management UI, you manually close the connections of the consumers that have failed to acknowledge them, the messages should revert to the ready state and become available for consumption again.
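As for other API endpoints: the management HTTP API also exposes a /get endpoint that pulls messages from a queue for inspection. Note that it only returns messages in the ready state, which is exactly why the 21 unacked messages don't show up in it either. Below is a minimal sketch using only the standard library; the host name and credentials are placeholders:

```python
import base64
import json
import urllib.request

def build_get_request(host, user, password, vhost="%2f", queue="celery", count=10):
    """Build a POST request for the management API's /get endpoint.

    Caveat: /get only returns messages in the *ready* state; unacked
    messages stay invisible to it until their channel closes.
    """
    url = f"https://{host}/api/queues/{vhost}/{queue}/get"
    payload = {
        "count": count,
        "ackmode": "ack_requeue_true",  # peek without permanently consuming
        "encoding": "auto",
    }
    auth = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Basic {auth}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_get_request("rabbit.example.com", "guest", "guest")
# urllib.request.urlopen(req) would return a JSON list of ready messages
```

With "ackmode": "ack_requeue_true" the fetched messages are requeued after inspection, so this is safe for debugging.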

Related

rabbitmq round-robin consumption by celery workers

I'm using a RabbitMQ broker, and there is a Celery worker which is subscribed to the broker. From my testing, it looks like RabbitMQ treats messages in FIFO order. Because one queue has been populated, then another, then another, and so on, my worker consumes all the messages from queue 1, and only moves on to queue 2 once it is done with queue 1.
Is it possible to change this behavior? I would like the Celery worker to consume in a round-robin style instead, ie consume a message from queue 1, then a message from queue 2, and so on, only coming back to queue 1 once a message has been consumed from each of the other queues.
Yes, you have to reduce your prefetch_count to 1 so only 1 message is fetched at a time. In Celery you can achieve this by setting worker_prefetch_multiplier (CELERYD_PREFETCH_MULTIPLIER in older versions) to 1. You may also want to set task_acks_late = True; make sure you read the documentation on both.
from the Celery docs:
To disable prefetching, set worker_prefetch_multiplier to 1. Changing
that setting to 0 will allow the worker to keep consuming as many
messages as it wants.
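Putting both settings together, a minimal Celery configuration sketch could look like this (the module name is just a convention; adjust to your project):

```python
# celeryconfig.py -- illustrative settings only

# worker_prefetch_multiplier=1 makes each worker process reserve only one
# message at a time, so deliveries from several queues interleave
# round-robin instead of draining one queue before the next.
worker_prefetch_multiplier = 1

# task_acks_late=True acknowledges a task *after* it runs, so a message
# reserved by a crashed worker is returned to the queue instead of lost.
task_acks_late = True
```

You would load it with app.config_from_object('celeryconfig').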

RabbitMQ take one message from queue on launch

I publish many messages to my queue in RabbitMQ while the worker is off.
Then I start my worker and it takes all the messages from the queue. It works fine, but when I stop my worker process before it finishes, all the messages are deleted from the queue.
If I then start a second, identical worker, it has no messages to take.
$channel->queue_declare($action, false, false, false, false);
$channel->basic_qos(null, 1, null);
$channel->basic_consume($action, '', false, true, false, false, $callback);
How can I make my worker take only one message at a time from the queue, so that when the worker stops and restarts it continues taking messages from where it left off?
If there are no messages left to take, the messages have already been acknowledged.
"Auto acknowledgements" should be disabled so that an acknowledgement must be explicitly sent back manually (ie. after all work is done) to complete consuming the message.
$no_ack = false; # If no_ack is true the server does NOT expect a
# basic.ack response and will dequeue the message immediately!
$channel->basic_consume($action, '', false, $no_ack, false, false, $callback);
The PHP tutorial on the RabbitMQ site shows how to send an ack from within the callback.
As long as the worker is killed before the ack has been sent the message will be made available again when the connection is closed - and can be picked up by other consumers. See Message Acknowledgements.
Specifically, wrt the no-ack bit:
If this field is set (true) the server does not expect acknowledgements for messages. That is, when a message is delivered to the client the server assumes the delivery will succeed and immediately dequeues it. This functionality may increase performance but at the cost of reliability. Messages can get lost if a client dies before they are delivered to the application.
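To make those semantics concrete, here is a toy model in plain Python (not a RabbitMQ client) of the broker-side bookkeeping: with no_ack=True a message is dequeued on delivery and lost if the consumer dies, while with manual acks the broker requeues it when the consumer's channel closes:

```python
from collections import deque

class ToyBroker:
    """Toy model of broker-side ack bookkeeping (not real RabbitMQ code)."""

    def __init__(self, messages):
        self.ready = deque(messages)   # messages waiting for delivery
        self.unacked = {}              # delivery_tag -> message

    def deliver(self, no_ack):
        msg = self.ready.popleft()
        if not no_ack:
            tag = len(self.unacked) + 1   # naive tag scheme, toy only
            self.unacked[tag] = msg       # held until basic.ack arrives
            return tag, msg
        return None, msg                  # dequeued immediately, gone forever

    def ack(self, tag):
        del self.unacked[tag]

    def consumer_died(self):
        # Channel closed without acks: requeue everything outstanding.
        for msg in self.unacked.values():
            self.ready.append(msg)
        self.unacked.clear()

# With no_ack=True, a crash loses the message:
b = ToyBroker(["job1"])
b.deliver(no_ack=True)
b.consumer_died()
assert len(b.ready) == 0          # job1 is gone

# With manual acks, the broker requeues on disconnect:
b = ToyBroker(["job1"])
b.deliver(no_ack=False)
b.consumer_died()
assert list(b.ready) == ["job1"]  # job1 is ready again
```

This mirrors the PHP fix above: passing $no_ack = false is what moves you from the first scenario to the second.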

RabbitMQ redelivery backoff of rejected requeued messages?

I have a simple service that subscribes to messages from RabbitMQ and writes them down to a datastore. Sometimes this datastore is unavailable for some short periods of time (sometimes seconds but sometimes minutes). If this happens we do a basic.reject on the failed message with requeue set to true. While this works the message seems to get redelivered immediately. I'd like RabbitMQ to gracefully backoff the redelivery. For example first try to redeliver "immediately" then after 2, 3, 5, 8, 13 seconds etc. Is this possible and if so how?
In addition to what Louis F. posted as a comment, check out the Delayed Message Exchange plugin: https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/
You could set up a dead-letter exchange using the delayed message exchange type and accomplish this very easily, without having to do a bunch of per-queue TTL configuration.
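With the plugin, the delay is set per message via an x-delay header (in milliseconds, per the plugin's README), so the backoff schedule from the question (2, 3, 5, 8, 13 seconds, ...) can be computed on the consumer side at each rejection. A sketch of that schedule only; the actual republish-to-delayed-exchange step is omitted:

```python
def fibonacci_backoff(max_delay=60):
    """Yield redelivery delays in seconds: 2, 3, 5, 8, 13, ... capped at max_delay."""
    a, b = 2, 3
    while True:
        yield min(a, max_delay)
        a, b = b, a + b

# On the nth rejection, republish to the delayed exchange with
# headers={"x-delay": delay * 1000} -- the plugin expects milliseconds.
gen = fibonacci_backoff()
delays = [next(gen) for _ in range(5)]
# delays == [2, 3, 5, 8, 13]
```

Capping the delay keeps a long datastore outage from pushing redelivery arbitrarily far into the future.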

How does RabbitMQ send messages to consumers?

I am a newbie to RabbitMQ, hence need guidance on a basic question:
Does RabbitMQ send messages to consumer as they arrive?
OR
Does RabbitMQ send messages to consumer as they become available?
At message consumption endpoint, I am using com.rabbitmq.client.QueueingConsumer.
Looking at the Spring client source code, I could figure out that
QueueingConsumer keeps listening on socket for any messages the broker sends to it
Any message that is received is parsed and stored as Delivery in a LinkedBlockingQueue encapsulated inside the QueueingConsumer.
This implies that even if the message processing endpoint is busy, messages will be pushed to QueueingConsumer
Is this understanding right?
TLDR: you poll messages from RabbitMQ till the prefetch count is exceeded, in which case you will block and only receive heartbeat frames till the fetched messages are ACKed. So you can poll, but you will only get new messages if the number of non-acked messages is less than the prefetch count. New messages are put on the QueueingConsumer, and in theory you should never have much more than the prefetch count in that QueueingConsumer's internal queue.
Details:
Low-level-wise (and I'm probably going to get some of this wrong), RabbitMQ itself doesn't actually push messages. The client has to continuously read the connection for frames based on the AMQP protocol. It's hard to classify this as push or pull, but just know the client has to continuously read the connection, and because the Java client is sadly blocking I/O, it is a blocking/polling operation. The blocking/polling is based on the AMQP heartbeat frames, regular frames, and the socket timeout configuration.
What happens in the Java RabbitMQ client is that there is a thread for each channel (or maybe it's per connection), and that thread loops gathering frames from RabbitMQ, which eventually become commands that are put in a blocking queue (I believe it's like a SynchronousQueue, aka a handoff queue, but Rabbit has its own special one).
The QueueingConsumer is a higher-level API and will pull commands off of that handoff queue, because if commands are left on the handoff queue it will block the channel's frame-gathering loop. This can be bad, because it can time out the connection. The QueueingConsumer also allows work to be done on a separate thread instead of in the same thread as the frame-gathering loop mentioned earlier.
Now if you look at most Consumer implementations you will probably notice that they are almost always unbounded blocking queues. I'm not entirely sure why the bound of these queues can't be a multiple of the prefetch, but if it is less than the prefetch it will certainly cause problems with the connection timing out.
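The frame-loop/handoff arrangement described above can be sketched in Python, with a plain queue.Queue standing in for QueueingConsumer's internal LinkedBlockingQueue (a toy model, not client code):

```python
import queue
import threading

# Toy model: a "frame loop" thread parses deliveries off the socket and
# hands them to the consumer's internal queue; the application thread
# drains that queue at its own pace, so a slow handler never blocks the
# frame loop itself.
deliveries = queue.Queue()  # QueueingConsumer's internal buffer (unbounded)

def frame_loop(raw_frames):
    for frame in raw_frames:   # stands in for reading frames off the socket
        deliveries.put(frame)  # hand off; never process in-line
    deliveries.put(None)       # sentinel: connection closed

t = threading.Thread(target=frame_loop, args=(["msg1", "msg2", "msg3"],))
t.start()

processed = []
while (msg := deliveries.get()) is not None:
    processed.append(msg)      # slow work happens here, off the frame loop

t.join()
# processed == ["msg1", "msg2", "msg3"]
```

Because the buffer is unbounded, nothing in this arrangement limits how many messages sit in it; it is the broker-side prefetch count that keeps it from growing past a handful of entries.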
I think the best answer is the product's own documentation, since RabbitMQ has both push and pull mechanisms defined as part of the protocol. Have a look: https://www.rabbitmq.com/tutorials/amqp-concepts.html
RabbitMQ mainly uses the push mechanism. Polling would consume server bandwidth, and the time gaps between polls would prevent low latency. RabbitMQ pushes a message to the client as soon as a consumer is available for the queue, so the connection is long-running; reading a frame in RabbitMQ basically means waiting for incoming frames.

How can I recover unacknowledged AMQP messages from other channels than my connection's own?

It seems the longer I keep my rabbitmq server running, the more trouble I have with unacknowledged messages. I would love to requeue them. In fact there seems to be an amqp command to do this, but it only applies to the channel that your connection is using. I built a little pika script to at least try it out, but I am either missing something or it cannot be done this way (how about with rabbitmqctl?)
import pika

credentials = pika.PlainCredentials('***', '***')
parameters = pika.ConnectionParameters(host='localhost', port=5672,
                                       credentials=credentials, virtual_host='***')

def handle_delivery(body):
    """Called when we receive a message from RabbitMQ"""
    print(body)

def on_connected(connection):
    """Called when we are fully connected to RabbitMQ"""
    connection.channel(on_channel_open)

def on_channel_open(new_channel):
    """Called when our channel has opened"""
    global channel
    channel = new_channel
    channel.basic_recover(callback=handle_delivery, requeue=True)

try:
    connection = pika.SelectConnection(parameters=parameters,
                                       on_open_callback=on_connected)
    # Loop so we can communicate with RabbitMQ
    connection.ioloop.start()
except KeyboardInterrupt:
    # Gracefully close the connection
    connection.close()
    # Loop until we're fully closed, will stop on its own
    connection.ioloop.start()
Unacknowledged messages are those which have been delivered across the network to a consumer but have not yet been ack'ed or rejected, while that consumer still hasn't closed the channel or connection over which it originally received them. The broker therefore can't tell whether the consumer is just taking a long time to process those messages or has forgotten about them, so it leaves them in an unacknowledged state until either the consumer dies or they get ack'ed or rejected.
Since those messages could still be validly processed in the future by the still-alive consumer that originally consumed them, you can't (to my knowledge) insert another consumer into the mix and try to make external decisions about them. You need to fix your consumers to make decisions about each message as they get processed rather than leaving old messages unacknowledged.
If messages are unacked there are only two ways to get them back into the queue:
basic.nack
This command will cause the message to be placed back into the queue and redelivered.
Disconnect from the broker
This action will force all unacked messages from this channel to be put back into the queue.
NOTE: basic.recover will try to republish unacked messages on the same channel (to the same consumer), which is sometimes the desired behaviour.
RabbitMQ spec for basic.recover and basic.nack
The real question is: Why are the messages unacknowledged?
Possible scenarios to cause unacked messages:
Consumer fetching too many messages, then not processing and acking them quickly enough.
Solution: Prefetch as few messages as appropriate.
Buggy client library (I have this issue currently with pika 0.9.13). If the queue has a lot of messages, a certain number of messages will get stuck unacked, even hours later.
Solution: I have to restart the consumer several times until all unacked messages are gone from the queue.
All the unacknowledged messages will go to ready state once all the workers/consumers are stopped.
Ensure all workers are stopped by confirming with a grep on ps aux output, and stopping/killing them if found.
If you are managing workers with supervisor and it shows them as stopped, you may still want to check for zombies: supervisor can report a worker as stopped while zombie processes still turn up when you grep the ps aux output. Killing the zombie processes will bring the messages back to the ready state.