How can RabbitMQ's (Pika) unack redelivery time be increased?

How can RabbitMQ's (Pika) unack redelivery time be increased? We have a long-running task that takes longer than the default redelivery time, so when the consumer tries to ack once processing is finished, it causes an "unable to send ack" error.

I'm assuming you're running into the channel ack timeout that is discussed in full here - https://github.com/rabbitmq/rabbitmq-server/pull/2990#issuecomment-1002089576
Long-running tasks like yours should publish the message being worked on to another queue representing in-progress work (perhaps a queue per worker), then ack the original message. When the work completes, consume and ack the message in the in-progress queue, and re-consume from the original queue.
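A minimal Pika sketch of that pattern, under stated assumptions: the queue names are invented, and in production the long-running work would run on its own thread so heartbeats keep flowing (omitted here for brevity):

```python
import pika
import time

# Hypothetical queue names -- adapt to your topology.
TASKS = "tasks"
IN_PROGRESS = "in_progress.worker1"  # e.g. one in-progress queue per worker

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue=TASKS, durable=True)
channel.queue_declare(queue=IN_PROGRESS, durable=True)

def do_long_running_work(body):
    time.sleep(60)  # stand-in for the real long-running task

def on_task(ch, method, properties, body):
    # Park a copy of the message in the in-progress queue, then ack the
    # original right away, well before the ack timeout can fire.
    ch.basic_publish(
        exchange="",
        routing_key=IN_PROGRESS,
        body=body,
        properties=pika.BasicProperties(delivery_mode=2),  # persistent
    )
    ch.basic_ack(delivery_tag=method.delivery_tag)

    do_long_running_work(body)

    # Work done: drain and ack the parked copy. With one in-progress
    # queue per worker and one task at a time, basic_get returns the
    # message we just parked.
    parked, _, _ = ch.basic_get(queue=IN_PROGRESS)
    if parked is not None:
        ch.basic_ack(delivery_tag=parked.delivery_tag)

channel.basic_consume(queue=TASKS, on_message_callback=on_task)
channel.start_consuming()
```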
Yes, it is more work, but the channel ack timeout was introduced for a very good reason. If you know you won't run into the issue described, you can increase the timeout in RabbitMQ's configuration.
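If you do go that route, the setting is consumer_timeout (configurable in rabbitmq.conf in recent versions, 3.8.17 and later), expressed in milliseconds; the value below is only illustrative:

```
# rabbitmq.conf -- value in milliseconds; pick something that safely
# exceeds your worst-case processing time (this is 2 hours)
consumer_timeout = 7200000
```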
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

Related

How MassTransit deals with a slow consumer

I am using MassTransit 7.0.0 with a consumer to process messages from RabbitMQ. I have observed that when the consumer is slow to process a message, or when I pause on a breakpoint in Visual Studio, after some time my consumer receives the message again. My assumption is that MassTransit thought the consumer crashed and sent the message again.
In my case, my consumer is sometimes slow to process a message because it talks to external endpoints, but processing takes around 30-40 seconds, not minutes.
Could you please explain how MassTransit knows the consumer has crashed (or whatever it is) and resends the message? If it waits a certain amount of time for an ack, is there a setting anywhere to increase this time a little, specific to a queue/consumer?
Thanks
With RabbitMQ, there is no timeout after which the message would be redelivered. If you are stopping the consumer in the debugger, you are likely causing the RabbitMQ socket connection to close. When the connection is closed, MassTransit will reconnect at which point any messages not previously acknowledged would be redelivered.
In normal operation, consumers that take 30, 60, 90, even 300 seconds are rarely an issue as the broker connection is maintained.
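MassTransit hides the raw client, but this broker-level behavior is easy to see with any AMQP client. A minimal Pika sketch (the queue name is invented): if the connection dies while a delivery is unacked, the broker requeues it, and the next delivery arrives with the redelivered flag set; there is no timer involved, only connection loss.

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="orders", durable=True)  # illustrative name

def handle(ch, method, properties, body):
    # Set only when a previous connection died before acking this
    # delivery -- the cue that the handler should be idempotent.
    if method.redelivered:
        print("redelivered after a connection drop:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="orders", on_message_callback=handle)
channel.start_consuming()
```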

RabbitMQ quorum queues: how to ensure a retry if data is lost

I read that quorum queues do not support TTL for either messages or queues.
The producer in my system maintains state in a database with the message marked "READY_TO_SUBMIT" and then sends it to a cluster of quorum queues. If the RabbitMQ queue crashes, or the message is for any reason not delivered to the consumer, how will my producer know that it should retry the message?
With a mirrored queue, I assume I could set a TTL, and once the TTL expires my producer could retry if the consumer has not updated the status from "READY_TO_SUBMIT" to "SUBMITTED".
Your producers absolutely must use publisher confirms correctly: https://www.rabbitmq.com/confirms.html
Please see the detailed tutorial here: https://www.rabbitmq.com/tutorials/tutorial-seven-java.html
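In Pika terms the same idea looks roughly like this (queue name and payload are invented; the pattern, not the specifics, is the point): only treat the send as done once the broker has confirmed it.

```python
import pika
from pika.exceptions import NackError, UnroutableError

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="submissions", durable=True)  # illustrative

# Put the channel in confirm mode; basic_publish now blocks until the
# broker confirms or refuses each message.
channel.confirm_delivery()

try:
    channel.basic_publish(
        exchange="",
        routing_key="submissions",
        body=b"READY_TO_SUBMIT payload",
        properties=pika.BasicProperties(delivery_mode=2),  # persistent
        mandatory=True,
    )
    # Confirm received: the broker has taken responsibility for the
    # message, so the producer does not need to retry this send.
except (UnroutableError, NackError):
    # No confirm: leave the row in READY_TO_SUBMIT and retry later.
    pass
```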

RabbitMQ: unacked message not going away after broker restart

We have observed the following behavior of RabbitMQ and are trying to understand if it is correct and how to resolve it.
Scenario:
A (persistent) message is delivered into a durable queue
The (single) Consumer (Spring-AMQP) takes the message and starts processing => Message goes from READY to UNACK
Now the broker is shut down => Client correctly reports "Channel shutdown"
The consumer finishes the processing, but can not acknowledge the message as the broker is still down
Broker is started again => Client reconnects
As a result, one message remains unack'ed forever (or until the client is restarted).
Side note: in the RabbitMQ admin UI, I can see that two channels now exist: the "dead" one created before the broker restart, containing the unacked message, and a new one that is healthy.
Is this behavior expected? It seems "correct" to me in the sense that RabbitMQ cannot know, after the broker restart, whether the message processing was completed or not. But what solution exists, then, to get that unacked message back into the queue and heal the system without restarting the consumer process?
Is this behavior expected? It seems "correct" to me in the sense that RabbitMQ cannot know, after the broker restart, whether the message processing was completed or not.
Yes, you are observing expected behavior. RabbitMQ will re-enqueue the message once it determines that the consumer is really dead. Since your consumer re-connects with what must be the same consumer tag as before, it is up to that process to ack or nack the message.
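As an illustration only (not Spring-AMQP specific; a hedged Pika sketch with an invented queue name): dropping the dead connection and re-consuming is usually enough to heal things, because once that connection is gone the broker requeues whatever was unacked on it.

```python
import pika
import time
from pika.exceptions import AMQPConnectionError

def handle(ch, method, properties, body):
    # Processing must tolerate redelivery (check method.redelivered).
    ch.basic_ack(delivery_tag=method.delivery_tag)

# Illustrative recovery loop: every delivery left unacked on a dead
# connection is requeued by the broker, so reconnecting and consuming
# again picks the message back up.
while True:
    try:
        connection = pika.BlockingConnection(
            pika.ConnectionParameters("localhost"))
        channel = connection.channel()
        channel.queue_declare(queue="work", durable=True)  # illustrative
        channel.basic_consume(queue="work", on_message_callback=handle)
        channel.start_consuming()
    except AMQPConnectionError:
        time.sleep(5)  # broker restarting; retry until it is back
```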

Behavior of channels in "confirm" mode with RabbitMQ

I'm having some trouble understanding RabbitMQ's confirms. I see the following explanation from RabbitMQ:
Notes
The broker loses persistent messages if it crashes before said messages are written to disk. Under certain conditions, this causes the broker to behave in surprising ways. For instance, consider this scenario:
a client publishes a persistent message to a durable queue,
a client consumes the message from the queue (noting that the message is persistent and the queue durable), but doesn't yet ack it,
the broker dies and is restarted, and
the client reconnects and starts consuming messages.
At this point, the client could reasonably assume that the message will be delivered again. This is not the case: the restart has caused the broker to lose the message. In order to guarantee persistence, a client should use confirms. If the publisher's channel had been in confirm mode, the publisher would not have received an ack for the lost message (since the consumer hadn't ack'd it and it hadn't been written to disk).
Then I used http://hg.rabbitmq.com/rabbitmq-java-client/file/default/test/src/com/rabbitmq/examples/ConfirmDontLoseMessages.java to run some basic tests and verify confirms, but I got some odd results:
The waitForConfirmsOrDie method doesn't block the producer, which differs from my expectation; I had supposed waitForConfirmsOrDie would block the producer until all the messages had been ack'd or one of them had been nack'd.
I removed channel.confirmSelect() and channel.waitForConfirmsOrDie() from the publisher and changed the consumer from auto ack to manual ack. I published all messages to the queue and consumed them one by one, then stopped the RabbitMQ server during the consuming process. What I expected was that the remaining messages would be lost after the RabbitMQ server restarted, because the channel was not in confirm mode, but I still see all the other messages in the queue after the server restart.
Since I am new to RabbitMQ, can anyone tell me where my understanding of confirms goes wrong?
My understanding is that publisher ("channel") confirms are the broker confirming that it successfully received a message from the producer, regardless of whether any consumer has acked that message. Exactly when the confirm is sent depends on the queue type and message delivery mode; see http://www.rabbitmq.com/confirms.html for details.
The messages are confirmed when:
it decides a message will not be routed to queues (if the mandatory flag is set then the basic.return is sent first), or
a transient message has reached all its queues (and mirrors), or
a persistent message has reached all its queues (and mirrors) and been persisted to disk (and fsynced), or
a persistent message has been consumed (and if necessary acknowledged) from all its queues.
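For example, the first case in that list can be observed from Pika (the routing key below is made up and deliberately unroutable):

```python
import pika
from pika.exceptions import UnroutableError

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.confirm_delivery()

try:
    channel.basic_publish(
        exchange="",
        routing_key="no.queue.bound.here",  # made-up, unroutable key
        body=b"hello",
        mandatory=True,
    )
except UnroutableError:
    # basic.return arrived first: the broker decided the message would
    # not be routed to any queue, then settled the publish.
    print("message returned as unroutable")
```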
Old question but oh well..
I published all messages to the queue and consumed them one by one, then stopped the RabbitMQ server during the consuming process. What I expected was that the remaining messages would be lost after the RabbitMQ server restarted, because the channel was not in confirm mode, but I still see all the other messages in the queue after the server restart.
This is actually how it should work, if persistence is enabled. If the server crashes or something else goes wrong, the messages cannot be acknowledged and thus won't be removed from the queue.
Messages are only removed from the queue once they are acknowledged as handled; a message is lost only if the broker had not yet written it to disk before the server crashed.
Confirms and acknowledgements can be switched off if wanted, so the producer won't wait for acks. I cannot find the exact command for it right now, but it does exist.
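For what it's worth, in Pika at least the switches being alluded to are (a) never enabling confirm mode on the publishing channel and (b) auto_ack=True on the consumer; a small sketch with an invented queue name:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="demo")  # illustrative name

# Publisher: without channel.confirm_delivery(), basic_publish is
# fire-and-forget -- the producer never waits for confirms.
channel.basic_publish(exchange="", routing_key="demo", body=b"hi")

# Consumer: auto_ack=True makes the broker treat each message as
# acknowledged the moment it is delivered, so no explicit acks are
# sent (and a crash mid-processing loses the message).
def handle(ch, method, properties, body):
    print("got", body)

channel.basic_consume(queue="demo", on_message_callback=handle, auto_ack=True)
channel.start_consuming()
```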
More on the acks and confirms: https://www.rabbitmq.com/reliability.html

MSMQ "Insufficient Resources" error: transactional dead-letter queue is filling up

I was running a system that uses multiple MSMQ queues on the same machine. It ran fine for about a day, then I got the "Insufficient Resources" error when trying to post a message to one of the queues. I investigated via this blog post:
http://blogs.msdn.com/b/johnbreakwell/archive/2006/09/18/761035.aspx
I don't see anything in there about investigating the dead-letter queue.
I looked at the queues and realized the only queue with any messages left in it was the transactional dead-letter queue. I purged it, and now the app(s) run again and can post messages to private queues.
I guess my main question is: what is the transactional dead-letter queue, and how can I manage it?
thanks.
There will be nothing in the blog about the Dead Letter Queue as it is just a queue, like any other.
You have messages in the DLQ because you have enabled Negative Source Journaling in your application. An error condition has meant the original messages have died and ended up in the DLQ, as requested by your application. Ideally, if you are using the DLQ, you have a separate thread looking for messages in it.
You should have monitoring enabled on the total number of messages in the server so that you get an early alert when messages start piling up somewhere unexpectedly.
Cheers
John Breakwell
Ran into this issue today with our MSMQ/NServiceBus setup. From what I understand, manual queue purges will move messages to the transactional dead-letter queue. Clearing that queue out resolved the problem for us.