I'm trying to understand how rabbitmq works with multiple consumer and prefetch_count.
I have three consumers consuming on the same queue and all of these consumers have configured with the QoS prefetch_count = 200.
Now assuming at a certain point I have unlimited backlog messages in the queue and consumers A,B,C are connecting to the queue, would A get message 1-200, B get 201-400, C get 401-600 from the queue simultaneously? That seems like message 1, 201, 401 got processed at the first place compared to the rest. Somehow I don't want that, I'd like to have these messages being processed sequentially.
If that's the case I guess this implies that the messages may be processed disordered based on how consumers are setup, even though the queue follows FIFO.
Or should I set prefetch_count = 1 to make sure of REAL FIFO?
Edited:
Just set up a local env of rabbitmq and experimented a bit. I used a producer to bombard a queue with numbers 0 to 100000 sequentially to accumulate backlog messages in a queue. Later on, I had two consumers A, B consuming messages from that queue with prefetch_count = 200.
From what I observed, A got 0-199 and B got numbers 200-399 at very beginning. However, A started getting numbers {401, 403, 405, 406 ...} and B gets {400, 402, 404, ...} after that.
I guess A and B got non-skipped messages at the beginning was because I wasn't strictly spinning up these two consumers simultaneously. But the following pattern explains well how prefetch_count works. It doesn't necessarily send consumers consecutive messages(I knew it's processed in a round robin fashion, but I guess this is more intuitive with an experiment). There's no guarantee in what order the messages will be processed if using prefetch_count.
Related
When using StreamMessageListenerContainer a subscription for a consumer group can be created by calling:
receive(consumer, readOffset, streamListener)
Is there a way to configure the container/subscription so that it will always attempt to re-process any PENDING messages before moving on to polling for new messages?
The goal would be to keep retrying any message that wasn't acknowledged until it succeeds, to ensure that the stream of events is always processed in exactly the order it was produced.
My understanding is if we specify the readOffset as '>' then on every poll it will use '>' and it will never see any messages from the PENDING list.
If we provide a specific message id, then it can see messages from the PENDING list, but the way the subscription updates the lastMessageId is like this:
pollState.updateReadOffset(raw.getId().getValue());
V record = convertRecord(raw);
listener.onMessage(record);
So even if the listener throws an exception, or just doesn't acknowledge the message id, the lastMessageId in pollState is still updated to this message id and won't be seen again on the next poll.
I am following this guide- https://spring.io/guides/gs/messaging-jms/
I have few messages with higher priority that needs to be sent before any other message.
I have already tried following -
jmsTemplate.execute(new ProducerCallBack(){
public Object doInJms(Session session,MessageProducer producer){
Message hello1 =session.createTextMessage("Hello1");
producer.send(hello1, DeliveryMode.PERSISTENT,0,0); // <- low priority
Message hello2 =session.createTextMessage("Hello2");
producer.send(hello1, DeliveryMode.PERSISTENT,9,0);// <- high priority
}
})
But the messages are sent in order as they are in the code.What I am missing here?
Thank you.
There are a number of factors that can influence the arrival order of messages when using priority. The first question would be did you enable priority support and the second would be is there a consumer online at the time you sent the message.
There are many good resources for information on using prioritized messages with ActiveMQ, here is one. Keep in mind that if there is an active consumer online when you sent those messages then the broker is just going to dispatch them as they arrive since and the consumer will of course process them in that order.
I have a http server which receives some messages and must reply 200 when a message is successfully stored in a queue and 500 is the message is not added to the queue.
I would like rabbitmq to refuse my messages when the queue reach a size limit.
How can I do it?
actually you can't configure RabbitMq is such a way. but you may programatically check queue size like:
`DeclareOk queueOkStatus = channel.queueDeclare(queueOutputName, true, false, false, null);
if(queueOkStatus.getMessageCount()==0){//your logic here}`
but be careful, because this method returns number of non-acked messages in queue.
If you want to be aware of this , you can check Q count before inserting. It sends request on the same channel. Asserting Q returns messageCount which is Number of 'Ready' Messages. Note : This does not include the messages in unAcknowledged state.
If you do not wish to be aware of the Q length, then as specified in 1st comment of the question:
x-max-length :
How many (ready) messages a queue can contain before it starts to drop them from its head.
(Sets the "x-max-length" argument.)
Up until now, my RabbitMQ consumer clients have used a prefetch value of 1. I'm looking to increase the value in order to gain performance. If I set the value to 2, will the RabbitMQ server send each consumer 2 messages at once such that I will need to parse the two messages and store the second one in a List until the first is processed and acknowledged? Or will the API handle this behind the scenes?
I'm using the Java AMQP client library:
ConnectionFactory factory = new ConnectionFactory();
...
Connection connection = factory.newConnection();
Channel channel = connection.createChannel();
channel.basicQos(2);
QueueingConsumer consumer = new QueueingConsumer(channel);
channel.basicConsume(CONSUME_QUEUE_NAME, false, consumer);
while (!Thread.currentThread().isInterrupted()) {
try {
QueueingConsumer.Delivery delivery = consumer.nextDelivery();
String m = new String(delivery.getBody(), "UTF-8");
// Will m contain two messages? Will I have to each message and keep track of them within a List?
...
}
The api handles this behind the scenes, so there are no worries there for you.
Regarding which message gets where, RMQ will just deliver by using round robin, that is if you have the queue: 1 2 3 4 5 6 and consumer1 and consumer2.
consumer1 will have 1 3 5
consumer2 will have 2 4 6
Should the connection die to any of your consumers the prefetched messages will be redelivered to the active consumers using the same delivery method.
This should be interesting reading and a good starting point to figure more exactly what happens:
Tutorial no.2 which I'm sure you've read
Reliability
The api internally queue messages in a blocking queue.
Setting the prefetch count more than 1 is actually a good idea since your worker need not wait for each and every message to arrive. It can read up to N messages (where N is the prefetch count). It can start working on a message as soon as it has finished the previous one.
Also, you have the option to acknowledge multiple messages at once instead of acknowledging individually.
channel.basicAck(lastDeliveryTag, true);
boolean true indicates to acknowledge all the messages upto and including the supplied lastDeliveryTag
I am using celery with a rabbitmq backend. It is producing thousands of queues with 0 or 1 items in them in rabbitmq like this:
$ sudo rabbitmqctl list_queues
Listing queues ...
c2e9b4beefc7468ea7c9005009a57e1d 1
1162a89dd72840b19fbe9151c63a4eaa 0
07638a97896744a190f8131c3ba063de 0
b34f8d6d7402408c92c77ff93cdd7cf8 1
f388839917ff4afa9338ef81c28aad75 0
8b898d0c7c7e4be4aa8007b38ccc00ea 1
3fb4be51aaaa4ac097af535301084b01 1
This seems to be inefficient, but further I have observed that these queues persist long after processing is finished.
I have found the task that appears to be doing this:
#celery.task(ignore_result=True)
def write_pages(page_generator):
g = group(render_page.s(page) for page in page_generator)
res = g.apply_async()
for rendered_page in res:
print rendered_page # TODO: print to file
It seems that because these tasks are being called in a group, they are being thrown into the queue but never being released. However, I am clearly consuming the results (as I can view them being printed when I iterate through res. So, I do not understand why those tasks are persisting in the queue.
Additionally, I am wondering if the large number queues that are being created is some indication that I am doing something wrong.
Thanks for any help with this!
Celery with the AMQP backend will store task tombstones (results) in an AMQP queue named with the task ID that produced the result. These queues will persist even after the results are drained.
A couple recommendations:
Apply ignore_result=True to every task you can. Don't depend on results from other tasks.
Switch to a different backend (perhaps Redis -- it's more efficient anyway): http://docs.celeryproject.org/en/latest/userguide/tasks.html
Use CELERY_TASK_RESULT_EXPIRES (or on 4.1 CELERY_RESULT_EXPIRES) to have a periodic cleanup task remove old data from rabbitmq.
http://docs.celeryproject.org/en/master/userguide/configuration.html#std:setting-result_expires