AMQP/RabbitMQ - Process messages sequentially

I have one direct exchange. There is also one queue, bound to this exchange.
I have two consumers for that queue. The consumers are manually ack'ing the messages once they've done the corresponding processing.
The messages are logically ordered/sorted, and should be processed in that order. Is it possible to enforce that all messages are received and processed sequentially across consumer A and consumer B? In other words, prevent A and B from processing messages at the same time.
Note: the consumers are not sharing the same connection and/or channel. This means I cannot use <channel>.basicQos(1);.
Rationale of this question: both consumers are identical. If one goes down, the other consumer starts processing messages and everything keeps working without any required intervention.

One approach to handling failover in a case where you want redundant consumers but need to process messages in a specific order is to use the exclusive consumer option when subscribing to the queue, and to have two consumers that keep trying to subscribe even when they can't get the exclusive lock.
The process is something like this:
Consumer A starts first and subscribes to the queue as an exclusive consumer. Consumer A begins processing messages from the queue.
Consumer B starts next and attempts to subscribe to the queue as an exclusive consumer, but is rejected because the queue already has an exclusive consumer.
On a recurring basis, consumer B attempts to get an exclusive subscription on the queue but is rejected.
The process hosting consumer A crashes.
Consumer B attempts to subscribe to the queue as an exclusive consumer, and succeeds this time. Consumer B starts processing messages from the queue.
Consumer A is brought back online and attempts an exclusive subscription, but is now rejected.
Consumer B continues to process messages in FIFO order.
While this approach doesn't provide load sharing, it does provide redundancy.
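The sequence above can be sketched as a small in-memory model. This is not RabbitMQ client code: `FakeQueue`, `AccessRefused`, and `try_to_consume` are hypothetical stand-ins used only to show the retry-until-exclusive pattern. With a real client, the same idea would map to consuming with the exclusive flag set and retrying whenever the broker refuses the subscription.

```python
from collections import deque


class AccessRefused(Exception):
    """Stand-in for the broker refusing a second exclusive consumer."""


class FakeQueue:
    """Hypothetical in-memory queue that allows at most one exclusive consumer."""

    def __init__(self, messages):
        self.messages = deque(messages)
        self.owner = None  # name of the current exclusive consumer, if any

    def consume_exclusive(self, name):
        if self.owner is not None and self.owner != name:
            raise AccessRefused(f"{name} refused: {self.owner} holds the queue")
        self.owner = name

    def release(self, name):
        # Models the broker dropping the consumer when its process crashes.
        if self.owner == name:
            self.owner = None


def try_to_consume(queue, name, max_messages, log):
    """One attempt: subscribe exclusively, then process messages in FIFO order."""
    try:
        queue.consume_exclusive(name)
    except AccessRefused:
        return False  # the caller is expected to retry later
    for _ in range(max_messages):
        if not queue.messages:
            break
        log.append((name, queue.messages.popleft()))
    return True


queue = FakeQueue(["m1", "m2", "m3", "m4"])
log = []

a_ok = try_to_consume(queue, "A", 2, log)        # A wins the exclusive subscription
b_ok = try_to_consume(queue, "B", 2, log)        # B is refused while A is alive
queue.release("A")                               # the process hosting A crashes
b_retry_ok = try_to_consume(queue, "B", 2, log)  # B's periodic retry now succeeds
a_back_ok = try_to_consume(queue, "A", 2, log)   # A comes back and is refused

print(log)
```

In this model the messages are handled strictly in queue order, first by A and then by B, and at no point do both consumers hold the queue.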

Even though this is already answered, maybe this can help others.
RabbitMQ has a feature known as Single Active Consumer, which matches your case.
We can have N consumers attached to a queue, but only one of them will be actively consuming messages from it. Fail-over happens only when the active consumer is cancelled or fails.
Kindly take a look at the link https://www.rabbitmq.com/consumers.html#single-active-consumer
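As a rough sketch of how this might look with the Python `pika` client (an assumption; the question does not say which client is used): the feature is enabled per queue through the `x-single-active-consumer` queue argument at declaration time. The function names `subscribe` and `main`, the queue name `ordered-work`, and the `durable=True` choice are all hypothetical.

```python
def subscribe(channel, queue_name, handle):
    """Declare a single-active-consumer queue and register a manual-ack consumer.

    `channel` is expected to behave like a pika BlockingChannel.
    """
    channel.queue_declare(
        queue=queue_name,
        durable=True,  # assumption: the queue should survive broker restarts
        arguments={"x-single-active-consumer": True},
    )
    # Assumption: keep at most one unacknowledged message per consumer.
    channel.basic_qos(prefetch_count=1)

    def on_message(ch, method, properties, body):
        handle(body)  # do the work first...
        ch.basic_ack(delivery_tag=method.delivery_tag)  # ...then ack

    channel.basic_consume(queue=queue_name, on_message_callback=on_message)
    return on_message


def main():
    # Imported here so the sketch can be read and tested without pika installed.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    subscribe(channel, "ordered-work", print)
    channel.start_consuming()
```

Every instance would run the same `main()`; the broker then picks one registered consumer as the active one. Note that the argument has to be set when the queue is first declared: redeclaring an existing queue with different arguments is rejected by the broker.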
Thank you

Usually the point of an MQ system is to distribute workload. Of course, there are some situations where processing of message N depends on the result of processing message N-1, or even on message N-1 itself.
If A and B can't process messages at the same time, then why not just have A or just B? As I see it, you are not saving anything with having 2 consumers in a way that one can work only when the other one is not...
In your case, it would be best to have one consumer but to actually do the parallelisation (not a word really) on the processing part.
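Just to add that RabbitMQ distributes messages evenly to all consumers (in round-robin fashion) regardless of any criteria. For it to hold back messages from a busy consumer, prefetch must be set to 1; note that this is not the default, since without a QoS setting the prefetch is unlimited. For more info on that, look for "fair dispatch" in the RabbitMQ work queues tutorial.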

Related

RabbitMQ competing consumers processing 1 message at a time sequentially

Similar to this question, we have FIFO queues and the messages must be processed in order. We want competing consumers from different machines for redundancy and performance reasons, but only one consumer on one machine should handle a message for a given queue at a time.
I tried setting the prefetch count to 1, but I believe this will only work if used with a single machine. Is this possible by default with RabbitMQ or do we need to implement our own lock?
Given a single queue with multiple consumers, there is no way to block one of the consumers; all of them receive the messages in round-robin fashion.
EDIT
See https://www.rabbitmq.com/consumers.html#single-active-consumer
/EDIT
You could also look at this plugin, https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange, to distribute the load across different queues.
"I tried setting the prefetch count to 1"
prefetch=1 means that each consumer takes one message at a time.
"do we need to implement our own lock"
Yes, if you want a single consumer per queue to the exclusion of the other consumers.
EDIT
There are also the Exclusive Queues https://www.rabbitmq.com/queues.html#exclusive-queues but note:
Exclusive queues are deleted when their declaring connection is closed or gone (e.g. due to underlying TCP connection loss). They, therefore, are only suitable for client-specific transient state.

RabbitMQ distributing messages unevenly to consumers

We're seeing an issue where consumers of our message queues are picking up messages from queues at the top of the alphabetical range. We have two applications: a producer, and a subscriber. We're using RabbitMQ 3.6.1.
Let's say that the message queues are setup like so:
Our first application, the producer, puts say 100 messages/second onto each queue:
Our second application, the subscriber, has five unique consumer methods that can deal with messages on each respective queue. Each method binds to its respective queue. A subscriber has a prefetch of 1, meaning it can only hold one message at a time, regardless of queue. We may run numerous instances of the subscriber like so:
So the situation is thus: each queue is receiving 100 msg/sec, and we have four instances of subscriber consuming these messages, so each queue has four consumers. Let's say that the consumer methods can deal with 25 msg/sec each.
What happens is that instead of all the queues being consumed equally, the alphabetically higher queues get priority. It seems as though when the subscriber becomes ready, RabbitMQ looks down the list of queues that this particular ready channel is bound to, and picks the first queue with pending messages.
In our situation, A_QUEUE will have every message consumed. B_QUEUE may have some consumed in certain race conditions, but C_QUEUE/D_QUEUE and especially E_QUEUE will rarely get touched.
If we turn off the publisher, the queues will eventually drain, top to bottom.
Is it possible to configure either RabbitMQ itself or possibly even the channel to use some sort of round robin distribution policy or maybe even random policy so that when a channel has numerous bound queues, all with messages pending, the distribution is even?
To clarify: you have a single subscriber application with multiple consumers in it, right?
I'm guessing you're using a single RabbitMQ Connection within the subscriber app.
Are you also re-using a single RabbitMQ Channel for all of your consumers? If so, that would be a problem. Be sure to use a new Channel for each consumer you start.
Maybe the picture is wrong, but if it's not then your setup is wrong. You don't need 4 queues if you are going to have subscribers that listen to each and every queue. You'd just need one queue, that has multiple instances of the same subscriber consuming from it.
Now to answer: yes (and there is no need to configure anything, as long as prefetch is 1), RabbitMQ does distribute messages evenly. You can read about that here, and in the same place see how your setup should look. Here is a quote from the link.
RabbitMQ just dispatches a message when the message enters the queue.
It doesn't look at the number of unacknowledged messages for a
consumer. It just blindly dispatches every n-th message to the n-th
consumer.
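To make the quoted behaviour concrete, here is a toy model of that dispatch rule. It only illustrates "every n-th message to the n-th consumer" and is not broker code; `round_robin_dispatch` and the consumer names are made up for the example.

```python
from itertools import cycle


def round_robin_dispatch(messages, consumers):
    """Assign every n-th message to the n-th consumer, ignoring how busy each one is."""
    assignments = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        assignments[name].append(message)
    return assignments


result = round_robin_dispatch(range(1, 7), ["C1", "C2", "C3"])
print(result)  # {'C1': [1, 4], 'C2': [2, 5], 'C3': [3, 6]}
```

The model never looks at unacknowledged messages, which is exactly what the quote describes: a slow consumer still receives its share.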

AMQP basic.get concurrent consumers pulling from Queue

When using RabbitMQ as Message Broker, I have a scenario where multiple concurrent consumers pull messages from a Queue using the basic.get AMQP method and use explicit acknowledgement for deleting the message from the Queue. Assuming the following setup
Q has messages M1, M2, M3 and has consumers C1, C2 and C3 (each having its own connection and channel) connected to it.
How is concurrency handled in the basic.get method? Is the call to basic.get method synchronized to handle concurrent consumers each using its own connection and channel? C1, C2 and C3 issue a basic.get call to receive a message at the same time (assume the server receives all 3 requests simultaneously).
C1 requests a message using basic.get and gets M1. When C2 requests a message, since it's using a different connection, does it get M1 again?
How can consumers pull messages in batches of a predefined size?
Your questions really hit at the heart of queuing and process theory, so I will answer from that standpoint (RabbitMQ is really a generic message broker as far as my answers are concerned, as this applies to any message broker).
How is concurrency handled in the basic.get method? Is the call to
basic.get method synchronized to handle concurrent consumers each
using its own connection and channel? C1, C2 and C3 issue a basic.get
call to receive a message at the same time (assume the server receives
all 3 requests simultaneously).
Answer 1: RabbitMQ is designed to be a reliable message broker. It contains internal processes and controls to ensure that the same message does not get passed out multiple times to different consumers. Now, due to the impracticality of testing the scenario that you describe, does it work perfectly? Who knows. That is why properly-designed applications using message-based architecture will use idempotent transactions, such that if the same transaction is processed multiple times, the result will be the same as if the transaction was processed once.
Takeaway: Design your application so that the answer to this question is unimportant.
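A minimal sketch of what an idempotent handler could look like, assuming every message carries a unique id (an assumption; the question does not mention one). `IdempotentProcessor` and the `tx-...` ids are hypothetical, and the in-memory set stands in for a durable store that a real application would update atomically with the side effect.

```python
class IdempotentProcessor:
    """Applies each message id at most once, so redelivery does not change the result."""

    def __init__(self):
        self.seen_ids = set()  # stand-in for a durable "already processed" store
        self.balance = 0

    def handle(self, message_id, amount):
        if message_id in self.seen_ids:
            return False  # duplicate delivery: skip the side effect
        self.balance += amount
        self.seen_ids.add(message_id)
        return True


processor = IdempotentProcessor()
deliveries = [("tx-1", 50), ("tx-2", 25), ("tx-1", 50)]  # tx-1 is delivered twice
applied = [processor.handle(mid, amount) for mid, amount in deliveries]

print(applied, processor.balance)  # [True, True, False] 75
```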
C1 requests a message using basic.get and gets M1. When C2 requests
for a message, since it's using a different connection, does it get M1
again?
Answer 2: No. Subject to the assumptions of my previous answer, the RabbitMQ broker will not serve the same message back once it has been delivered. Depending on the settings of the channel and queue, the message may be automatically acknowledged upon delivery and will never be redelivered. Other settings will have the message requeue automatically upon the "death" of the processing thread/channel or a negative acknowledgment from your processing thread. This is important functionality, since a "poison" message could repeatedly wreak havoc in your application if it could be served up to multiple consumers. Takeaway: you may safely rely on this assumption in designing your application.
How can consumers pull messages in batches of a predefined size?
Answer: They can't, nor would it make sense for them to. In any queuing system, the fundamental assumption is that items are removed from the queue in single file. Attempts to violate this assumption result in unpredictable behavior; furthermore, single-piece flow is commonly the most efficient method of processing. However, in the real world, there are cases where batch sizes > 1 are necessary. In such cases, it makes sense to load the batch into its own single message, so this may require a separate processing thread that pulls messages from the queue and batches them together, or put them in batches initially. Keep in mind that once you have multiple consumers, there is no possible way to guarantee single messages will be processed in order. Takeaway: Batching should be avoided wherever possible, but where it is not practical to avoid, you may not assume that batches will contain individual messages in any particular order.
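The "load the batch into its own single message" idea could be sketched like this. `batch_messages` is a hypothetical helper, and JSON is just one possible encoding for the combined message body.

```python
import json


def batch_messages(messages, batch_size):
    """Group a stream of single messages into lists of at most batch_size items."""
    batch = []
    for message in messages:
        batch.append(message)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # trailing partial batch


batches = list(batch_messages(["M1", "M2", "M3", "M4", "M5"], 2))
# Each batch becomes the body of one message that a batching process republishes.
bodies = [json.dumps(batch) for batch in batches]

print(batches)  # [['M1', 'M2'], ['M3', 'M4'], ['M5']]
```

A real batching process would also need a timeout so that a partial batch is not held back indefinitely; that is left out here.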
You might want to read the RabbitMQ API guide and the introduction to AMQP.
First of all, avoid consuming messages using basicGet in your consumers. Rather, use the push-based basicConsume with the Consumer interface. This allows RabbitMQ to push messages to you as they arrive on the queue. Everything else is a waste of resources here, as it boils down to busy polling.
When using basicConsume, RabbitMQ will even push you more messages in the background, up to a certain prefetch count. This allows you to process multiple messages concurrently, as well as minimizing the time you need to wait for your next message to process (if some message is available).
Concurrency is not an issue at all, that's what you're using a queue for!
When having multiple consumers on one queue, a message will always only be delivered to one consumer (as long as the message is ACKed). Otherwise you need private queues for each consumer and route your messages accordingly.
Btw, if you're able to share the connection among your consumers, you should do so.
Just make sure to use one channel per thread.
There is no special configuration required for that scenario. Each client will atomically fetch and receive one message from the queue, just as you would like to happen.
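For reference, a single pull with explicit acknowledgement might look like the following with the Python `pika` client (an assumption; the question does not name a client). `poll_once`, `main`, and the queue name `Q` are hypothetical. When the queue is empty, `basic_get` on a blocking channel returns `(None, None, None)`.

```python
def poll_once(channel, queue_name, handle):
    """Pull at most one message, process it, then ack it.

    `channel` is expected to behave like a pika BlockingChannel.
    Returns True if a message was processed, False if the queue was empty.
    """
    method, properties, body = channel.basic_get(queue=queue_name, auto_ack=False)
    if method is None:
        return False  # nothing to fetch right now
    handle(body)
    # The message leaves the queue for good only once it is acknowledged.
    channel.basic_ack(delivery_tag=method.delivery_tag)
    return True


def main():
    # Imported here so the sketch can be read and tested without pika installed.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    while poll_once(channel, "Q", print):
        pass  # drain until the queue is empty
    connection.close()
```

Each of C1, C2 and C3 would run this on its own channel; a message that is fetched but not yet acknowledged is not handed to another consumer in the meantime.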

RabbitMQ Round Robin With Acknowledge

Let's say I have a queue with a bunch of messages in it. I have 2 consumers connected to that queue, both set with a prefetch = 1. The work that these consumers do takes some time, and I don't want to acknowledge the message until the work is done (in case the consumer crashes or something - I want the message to automatically reenter the queue in exceptional cases).
But I also want these consumers to work in parallel, and that doesn't appear to be happening. In other words, as long as there are 2+ messages in the queue, I'd expect both consumers to be busy.
What appears to be happening instead is that consumer 1 receives a message, but consumer 2 will wait until consumer 1 has acknowledged the message. Then consumer 2 receives a message and consumer 1 waits, etc.
Is there an option I'm missing? Or should this be working, I just have a bug in my code somewhere? Or is this not possible?
You should be able to pull messages off the queue while previous messages are still being processed by other consumers. The RabbitMQ tutorial specifically points to parallelism as a strength of round-robin dispatching (http://www.rabbitmq.com/tutorials/tutorial-two-python.html). Are your two consumers running as threads in the same process? I wonder if you've just made a mistake in the implementation.

Send One Message to only one of Multiple Consumers in RabbitMQ

I have a somewhat unique use case with RabbitMQ and I'm not sure how to go about solving the problem. I want to have one queue with multiple consumers bound to it and then have RabbitMQ send out one message to only one consumer at a time and wait for an ACK before sending out another message to any other consumer.
I realize this kills throughput and can essentially starve the other consumers, but for me that's OK. The reason for this odd use case is that the service that the consumers talk to can only handle one concurrent request at a time, so I need a way to limit this, but consumers can also die unexpectedly and I need another consumer to pick up processing the messages if this happens. I know there is the prefetch option, but that still allows multiple consumers to get a message, and there are exclusive queues, but I'm not sure those accomplish what I want. Is it possible to configure RabbitMQ to do this?
No; there is no way to limit competing consumers on the same queue such that there is one and only one message in process across all consumers until the ack is received.
A similar question came up some time ago; I don't remember if it was here or in the Spring forums, but I believe the solution was to have the consumers acquire a global lock of some kind, using something like Hazelcast, or even a simple database table row lock (with prefetch=1 so that each consumer has only one "in process" message, which is processed as and when that consumer gets the lock).
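The locking idea can be illustrated with a single-process simulation. Here a `threading.Lock` stands in for the global lock (Hazelcast or a database row lock in a real deployment), a `queue.Queue` stands in for the RabbitMQ queue, and the two threads stand in for consumers on different machines; all names are hypothetical.

```python
import queue
import threading
import time

work = queue.Queue()  # stand-in for the RabbitMQ queue
for n in range(6):
    work.put(n)

global_lock = threading.Lock()  # stand-in for a Hazelcast lock or a database row lock
in_process = 0
max_in_process = 0
processed = []


def consumer(name):
    global in_process, max_in_process
    while True:
        try:
            message = work.get_nowait()  # prefetch=1: hold at most one unacked message
        except queue.Empty:
            return
        with global_lock:  # only the lock holder may call the downstream service
            in_process += 1
            max_in_process = max(max_in_process, in_process)
            time.sleep(0.01)  # pretend to call the one-request-at-a-time service
            processed.append((name, message))
            in_process -= 1
        # a real consumer would ack the message here, after the work is done


threads = [threading.Thread(target=consumer, args=(name,)) for name in ("A", "B")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(max_in_process, sorted(message for _, message in processed))
```

The lock ensures that only one message is being worked on at any moment, which is what the downstream service needs. It does not by itself guarantee that messages are processed in queue order, since whichever consumer gets the lock first goes first.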