I'm trying to implement exponential backoff with a RabbitMQ headers exchange. Each queue is bound with x-match: "all" and x-retry-count: [RETRY COUNT FOR THIS LEVEL]. However, with backoff queues for 100, 200, 400, and 800 millisecond wait times, every task I send to the retry exchange somehow matches every queue.
As you can see in the picture below, for the 200ms backoff queue, I'm binding the header x-retry-count: 2, but a task with the header x-retry-count: 1 is matching it (and the x-retry-count values for all other queues in the backoff exchange too). Why would that be?
Found what was going on. x-retry-count doesn't count as a header that can be matched on because it starts with x-; naming the header retry-count does work.
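To make the effect concrete, here is a toy model in plain Python of how x-match: "all" behaves when binding keys that start with x- are skipped. It is only a sketch of the matching rule described above, not RabbitMQ's implementation; the queue names (backoff_100ms and so on) and both helper functions are made up for illustration.

```python
def matches(binding_args, message_headers):
    """Toy model of headers-exchange matching with x-match = "all"."""
    # Binding keys that start with "x-" (including x-match itself)
    # are not used for matching.
    required = {k: v for k, v in binding_args.items() if not k.startswith("x-")}
    # With nothing left to compare, all() is vacuously True.
    return all(message_headers.get(k) == v for k, v in required.items())


def backoff_bindings(header_name, delays_ms):
    """Build one hypothetical binding per backoff level (level 1 = first delay)."""
    return {
        f"backoff_{delay}ms": {"x-match": "all", header_name: level}
        for level, delay in enumerate(delays_ms, start=1)
    }


delays = [100, 200, 400, 800]
broken = backoff_bindings("x-retry-count", delays)
fixed = backoff_bindings("retry-count", delays)

# With the x- prefixed header, a first retry matches every queue.
print([q for q, args in broken.items() if matches(args, {"x-retry-count": 1})])
# With the renamed header, only the first level matches.
print([q for q, args in fixed.items() if matches(args, {"retry-count": 1})])
```

In the first case every binding ends up with no usable keys, so each one matches any message, which is the behaviour seen in the question.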
I have this MassTransit (RabbitMQ) configuration on my consumer. The retry policy is to resend a message whenever a timeout occurs, up to a maximum of 5 times at 30-minute intervals.
    ep.ConfigureConsumer<T>(ctx,
        c => {
            c.UseMessageRetry(r =>
            {
                r.Interval(5, TimeSpan.FromMinutes(30));
                r.Handle<Exception>(m => m.GetFullMessage().Contains("Timeout"));
                r.Ignore(typeof(ApplicationException));
            });
            c.UseConcurrentMessageLimit(1);
        });
    ep.Bind(massTransitSettings.ExchangeName, y =>
    {
        y.Durable = true;
        y.AutoDelete = false;
        y.ExchangeType = massTransitSettings.ExchangeType;
        y.RoutingKey = massTransitSettings.RoutingKey;
    });
});
Everything works fine except when, occasionally, two consecutive timeouts occur for the same message.
Here is part of the log file for message id: fa300000-56b5-0050-e012-08d9f39bc934
19/02/2022 12:34:28.699 - PAYMENT INCOMING Message -> Sql timeout, retry in 30 min
19/02/2022 13:04:28.572 - PAYMENT INCOMING Message -> 2th Sql timeout, retry in 30 min
19/02/2022 13:34:59.722 - PAYMENT INCOMING Message -> Process ok
19/02/2022 13:35:13.983 - PAYMENT HANDLED Message -> Message handled (74 secs)
19/02/2022 13:35:31.606 - PAYMENT INCOMING Message -> This should not incoming, causing an Application Exception (message in _error queue)
Where am I wrong?
You can't use message retry for intervals that long, since RabbitMQ has a default consumer timeout of 30 minutes. Per the documentation:
MassTransit retry filters execute in memory and maintain a lock on the message. As such, they should only be used to handle short, transient error conditions. Setting a retry interval of an hour would fall into the category of bad things. To retry messages after longer waits, look at the next section on redelivering messages.
For longer retry intervals, use message redelivery (sometimes called second-level retry). If you're using RabbitMQ, the built-in delayed redelivery requires the delayed-exchange plug-in. Or you can use an external message scheduler.
I'm trying to understand how RabbitMQ works with multiple consumers and prefetch_count.
I have three consumers consuming from the same queue, and all of them are configured with the QoS prefetch_count = 200.
Now assume that at some point the queue has an unlimited backlog of messages and consumers A, B, and C connect to it. Would A get messages 1-200, B get 201-400, and C get 401-600 simultaneously? That would mean messages 1, 201, and 401 are processed first, ahead of the rest. I don't want that; I'd like the messages to be processed sequentially.
If that's the case I guess this implies that the messages may be processed disordered based on how consumers are setup, even though the queue follows FIFO.
Or should I set prefetch_count = 1 to make sure of REAL FIFO?
Edited:
Just set up a local env of rabbitmq and experimented a bit. I used a producer to bombard a queue with numbers 0 to 100000 sequentially to accumulate backlog messages in a queue. Later on, I had two consumers A, B consuming messages from that queue with prefetch_count = 200.
From what I observed, A got 0-199 and B got 200-399 at the very beginning. After that, however, A started getting {401, 403, 405, 406 ...} and B got {400, 402, 404, ...}.
I guess A and B got consecutive messages at the beginning because I didn't start the two consumers at exactly the same time. But the pattern that follows shows how prefetch_count works: it doesn't necessarily send a consumer consecutive messages (I knew that messages are dispatched in a round-robin fashion, but the experiment makes this more intuitive). There's no guarantee about the order in which messages will be processed when using prefetch_count.
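The observed pattern can be reproduced with a small toy model in plain Python. This is only a sketch under simplifying assumptions (consumers attach one after another, and every ack frees exactly one slot that is refilled from the head of the queue); it is not RabbitMQ's actual scheduler, and the simulate function and its parameters are hypothetical.

```python
from collections import deque


def simulate(total, prefetch, consumers, ack_order):
    """Toy model of one FIFO queue feeding several consumers with a prefetch window."""
    queue = deque(range(total))
    received = {name: [] for name in consumers}
    unacked = {name: 0 for name in consumers}

    def fill(name):
        # Push messages until this consumer's prefetch window is full.
        while queue and unacked[name] < prefetch:
            received[name].append(queue.popleft())
            unacked[name] += 1

    # Consumers attach one after another, so each fills a whole window.
    for name in consumers:
        fill(name)

    # Each ack frees one slot, which is refilled from the head of the queue.
    for name in ack_order:
        unacked[name] -= 1
        fill(name)
    return received


result = simulate(total=1000, prefetch=200, consumers=["A", "B"],
                  ack_order=["B", "A"] * 3)
print(result["A"][:3], result["A"][200:])  # [0, 1, 2] [401, 403, 405]
print(result["B"][:3], result["B"][200:])  # [200, 201, 202] [400, 402, 404]
```

Once both windows are full, which consumer receives the next message depends only on who acknowledges first, so the two streams interleave.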
I'm trying to achieve a reject/delay loop using RabbitMQ's operations, i.e.:
I have:
Main Queue with Main Exchange bound to it and DLX set to StandBy Exchange.
StandBy Queue with StandBy Exchange bound to it, with a 60s TTL and DLX set to Main Exchange.
Basically I want to:
1. Consume from Main Queue.
2. Reject the message (under certain circumstances).
3. Have it redirected to StandBy Queue because of the rejection.
4. When the TTL expires, re-queue the message to Main Queue.
Steps 1, 2 and 3 work, but the last one drops the message instead of re-queueing it.
The theory from RabbitMQ's docs that I used to design this was:
Messages from a queue can be 'dead-lettered'; that is, republished to another exchange when any of the following events occur:
The message is rejected (basic.reject or basic.nack) with requeue=false,
The TTL for the message expires; or
The queue length limit is exceeded.
...
It is possible to form a cycle of message dead-lettering. For instance, this can happen when a queue dead-letters messages to the default exchange without specifiying a dead-letter routing key. Messages in such cycles (i.e. messages that reach the same queue twice) will be dropped if there was no rejections in the entire cycle.
The theory says that the message should be re-queued, because the cycle contains a rejection (step #2). Can you help me figure out why it drops the message instead of re-queueing it?
UPDATE:
The version I was targeting was 2.8.4, and it seems that at that time the "if there was no rejections in the entire cycle" exception wasn't part of the behaviour. You can check this yourselves in the RabbitMQ 2.8.x Docs.
I'll accept george's answer, as the original objective can be achieved with this code.
Rafael, I am not sure what client you are using but with the Pika client in Python you could implement something like this. For simplicity I only use one exchange. Are you sure you are setting the exchange and the routing-key properly?
sender.py
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
# Argument names below follow pika 1.x (exchange_type replaced the older type=).
channel.exchange_declare(exchange='cycle', exchange_type='direct')
# Messages wait here for 10 seconds, then are dead-lettered back to main_queue.
channel.queue_declare(queue='standby_queue',
                      arguments={
                          'x-message-ttl': 10000,
                          'x-dead-letter-exchange': 'cycle',
                          'x-dead-letter-routing-key': 'main_queue'})
# Rejected messages are dead-lettered to standby_queue.
channel.queue_declare(queue='main_queue',
                      arguments={
                          'x-dead-letter-exchange': 'cycle',
                          'x-dead-letter-routing-key': 'standby_queue'})
# Bind each queue under its own name so the routing keys above resolve.
channel.queue_bind(queue='main_queue', exchange='cycle', routing_key='main_queue')
channel.queue_bind(queue='standby_queue', exchange='cycle', routing_key='standby_queue')
channel.basic_publish(exchange='cycle',
                      routing_key='main_queue',
                      body="message body")
connection.close()
receiver.py
import pika

def callback(ch, method, properties, body):
    print("Processing message: {}".format(body.decode()))
    # replace with condition for rejection
    if True:
        print("Rejecting message")
        # requeue=False dead-letters the message instead of putting it back
        ch.basic_nack(delivery_tag=method.delivery_tag, multiple=False, requeue=False)

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
# pika 1.x signature; older releases took the callback as the first argument
channel.basic_consume(queue='main_queue', on_message_callback=callback)
channel.start_consuming()
I have an HTTP server which receives messages and must reply 200 when a message is successfully stored in a queue and 500 if the message is not added to the queue.
I would like RabbitMQ to refuse my messages when the queue reaches a size limit.
How can I do it?
Actually, you can't configure RabbitMQ in such a way, but you can programmatically check the queue size, like:
`DeclareOk queueOkStatus = channel.queueDeclare(queueOutputName, true, false, false, null);
if(queueOkStatus.getMessageCount()==0){//your logic here}`
But be careful: this method returns only the number of messages that are ready for delivery; messages that are still awaiting acknowledgement are not counted.
If you want to be aware of the queue length, you can check the queue's message count before inserting. The request is sent on the same channel. Asserting the queue returns messageCount, which is the number of 'Ready' messages. Note: this does not include messages in the unacknowledged state.
If you do not need to be aware of the queue length, then, as specified in the first comment on the question:
x-max-length :
How many (ready) messages a queue can contain before it starts to drop them from its head.
(Sets the "x-max-length" argument.)
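The size check described in these answers could look roughly like this in Python. It is a sketch only: channel is assumed to be a pika-style blocking channel, status_for_publish and limit are hypothetical names, and the check and the publish are not atomic, so another publisher can still push the queue over the limit in between.

```python
def ready_count(channel, queue_name):
    """Return the number of ready messages reported by a passive declare."""
    # passive=True only inspects the queue; it does not create or change it.
    result = channel.queue_declare(queue=queue_name, passive=True)
    return result.method.message_count


def status_for_publish(channel, queue_name, body, limit):
    """Hypothetical helper: publish if the queue is below `limit`, else refuse.

    Returns the HTTP status the server would reply with.
    """
    if ready_count(channel, queue_name) >= limit:
        return 500
    # Default exchange: the routing key is the queue name.
    channel.basic_publish(exchange="", routing_key=queue_name, body=body)
    return 200
```

Because only ready messages are counted, a queue whose messages are mostly unacknowledged will look emptier than it is.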
We have set up a workflow environment with RabbitMQ.
It solves our needs, but I'd like to know whether the way we handle scheduled tasks is also good practice.
Scheduling here does not mean mission-critical, exactly timed execution. If a job should be retried after 60 seconds, that means 60+ seconds, depending on when the queue is handled.
I have created one queue, Q_WAIT, and use some headers to transport settings.
It works like this:
A worker is subscribed to Q_ACTION
If the action fails (e.g. SMTP server not reachable)
-> (Re-)Publish the message to Q_WAIT and set properties.headers["scheduled"] = time + 60seconds
Another process loops every 15 seconds through all messages in Q_WAIT using pop(), NOT a subscription
q_WAIT.pop(:ack => true) do |delivery_info,properties,body|...
if (properties.headers["scheduled"] has reached its time)
-> (Re-)Publish the message back to Q_ACTION
ack(message)
After each loop, the connection is closed so that the messages that were NOT (re-)published are left in Q_WAIT, because they were not acknowledged.
Can someone confirm that this is a working (good) practice?
Sure, you can use a looping process like the one described in the original question.
Alternatively, you can use the Time-To-Live extension together with the Dead Letter Exchanges extension.
First, set the x-dead-letter-exchange argument on the Q_WAIT queue to the current exchange, and x-dead-letter-routing-key to the routing key that Q_ACTION is bound with.
Then set the x-message-ttl queue argument, or set the message expiration property when publishing if you need a custom per-message TTL (this is not best practice because there are some well-known caveats, but it works too).
In this case your messages will be dead-lettered from Q_WAIT to Q_ACTION right after their TTL expires, without any additional consumers, which is more reliable and stable.
Note: if you need advanced re-publish logic (changing the message body or properties), you need an additional queue (say Q_PRE_ACTION) to consume messages from, change them, and then publish them to the target queue (say Q_ACTION).
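As a rough illustration of the arguments named in this answer, the Q_WAIT declaration could be sketched as follows. The exchange name "workflow", the routing key "action", and both helper functions are assumptions made for the example; channel is assumed to be a pika-style channel.

```python
def wait_queue_arguments(exchange, action_routing_key, ttl_ms):
    """Build the queue arguments that turn Q_WAIT into a delay queue."""
    return {
        # Expired messages are republished to this exchange...
        "x-dead-letter-exchange": exchange,
        # ...with the routing key that Q_ACTION is bound with.
        "x-dead-letter-routing-key": action_routing_key,
        # How long a message waits in Q_WAIT, in milliseconds.
        "x-message-ttl": ttl_ms,
    }


def declare_wait_queue(channel, exchange, action_routing_key, ttl_ms=60000):
    """Declare Q_WAIT on a pika-style channel (hypothetical helper)."""
    channel.queue_declare(
        queue="Q_WAIT",
        durable=True,
        arguments=wait_queue_arguments(exchange, action_routing_key, ttl_ms),
    )


# "workflow" and "action" are placeholder names for this sketch.
print(wait_queue_arguments("workflow", "action", 60000))
```

With this in place, the worker only has to publish failed messages to Q_WAIT; no polling process is needed.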
As suggested in the comments, I tried the x-dead-letter-exchange feature and it worked for most requirements. One question / misunderstanding concerns the per-message TTL option.
Please look at the example below. From my understanding:
the delayed queue has a TTL of 10 seconds,
so the first message will reach the subscriber 10 seconds after publishing.
The second message is published 1 second after the first, with a per-message TTL (expiration) of 3 seconds.
I would expect the second message to be delivered 3 seconds after publishing, before the first message.
But it did not work like that; both arrive after 10 seconds.
Q: Shouldn't the message expiration override the queue TTL?
#!/usr/bin/env ruby
# encoding: utf-8
require 'bunny'

B = Bunny.new ENV['CLOUDAMQP_URL']
B.start

DELAYED_QUEUE='work.later'
DESTINATION_QUEUE='work.now'

def publish
  ch = B.create_channel
  # declare a queue with the DELAYED_QUEUE name
  q = ch.queue(DELAYED_QUEUE, :durable => true, arguments: {
    # set the dead-letter exchange to the default exchange
    'x-dead-letter-exchange' => '',
    # when the message expires, change the routing key to the destination queue name
    'x-dead-letter-routing-key' => DESTINATION_QUEUE,
    # the time in milliseconds to keep the message in the queue
    'x-message-ttl' => 10000,
  })
  # publish to the default exchange with the delayed queue name as routing key,
  # so that the message ends up in the newly declared delayed queue
  ch.basic_publish('message content 1 ' + Time.now.strftime("%H-%M-%S"), "", DELAYED_QUEUE, :persistent => true)
  puts "#{Time.now}: Published the message 1"
  # wait a moment before the next publish
  sleep 1.0
  # publish the second message with a shorter per-message TTL
  ch.basic_publish('message content 2 ' + Time.now.strftime("%H-%M-%S"), "", DELAYED_QUEUE, :persistent => true, :expiration => "3000")
  puts "#{Time.now}: Published the message 2"
  ch.close
end

def subscribe
  ch = B.create_channel
  # declare the destination queue
  q = ch.queue DESTINATION_QUEUE, durable: true
  q.subscribe do |delivery, headers, body|
    puts "#{Time.now}: Got the message: #{body}"
  end
end

subscribe()
publish()
# keep the main thread alive so the subscriber keeps receiving
sleep
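One likely explanation is the well-known caveat with per-message TTL: as far as I know, a classic queue only expires messages once they reach the head of the queue, so a short-lived message stuck behind a longer-lived one waits for it. The toy model below, in plain Python, illustrates that rule with the timings from the example; dead_letter_times is a hypothetical helper and this is not how the broker is implemented.

```python
def dead_letter_times(messages, queue_ttl):
    """Toy model of head-of-queue expiry.

    messages: list of (publish_time, per_message_ttl or None) in publish order.
    Returns the time at which each message leaves the delayed queue.
    """
    times = []
    head_free_at = 0
    for published, ttl in messages:
        # The effective TTL is the lower of the queue TTL and the message TTL.
        effective = queue_ttl if ttl is None else min(ttl, queue_ttl)
        expires = published + effective
        # A message is only dropped once everything ahead of it is gone.
        leaves = max(expires, head_free_at)
        times.append(leaves)
        head_free_at = leaves
    return times


# Message 1 at t=0 with the 10 s queue TTL, message 2 at t=1 with a 3 s TTL.
print(dead_letter_times([(0, None), (1, 3)], queue_ttl=10))  # [10, 10]
```

Under this model, using one delay queue per distinct delay avoids the problem, because every message in a given queue then expires in publish order.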