How are data loss and data sequence managed in RabbitMQ?

Kafka uses offsets to manage the sequence of data that needs to be transferred to consumers. How does RabbitMQ manage the data sequence and prevent data loss?

RabbitMQ consumers do not maintain a client-side offset the way Kafka consumers must. Kafka stores all messages for a configured time period and clients manage their own offsets, so different clients can consume messages from different offsets within the same topic partition. This means that different Kafka consumers are not competing consumers unless they coordinate and share their offset.
RabbitMQ is very different. Messages are stored until they are delivered to a consumer and acknowledged, or until they expire. If a queue has multiple consumers, they are competing consumers: each message is consumed by only one of them.
RabbitMQ has the concept of a delivery tag, a monotonically increasing number that increments with each message delivered over a channel. It has no global meaning: it is scoped to a single channel, so it is not shared between consumers on different servers or processes. Consumers only need to track this number for acknowledgement purposes; RabbitMQ itself is responsible for choosing which message is delivered to which consumer.
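To make the channel scoping concrete, here is a minimal in-memory sketch. It is not RabbitMQ client code: the `Channel` and `Queue` classes and their methods are hypothetical stand-ins, and the alternating dispatch is only an assumption used to show two competing consumers. It shows that tags restart at 1 on every channel and that unacknowledged messages go back to the queue when a channel goes away, which is what protects against loss.

```python
from collections import deque


class Channel:
    """Toy model of a channel: delivery tags start at 1 and are local to it."""

    def __init__(self):
        self.next_tag = 1
        self.unacked = {}  # delivery tag -> message awaiting acknowledgement

    def deliver(self, message):
        tag = self.next_tag
        self.next_tag += 1
        self.unacked[tag] = message
        return tag

    def ack(self, tag):
        # Once acknowledged, the message is no longer tracked.
        del self.unacked[tag]


class Queue:
    """Toy queue: each message goes to exactly one of the competing consumers."""

    def __init__(self, messages):
        self.ready = deque(messages)

    def dispatch(self, channels):
        deliveries = []
        i = 0
        while self.ready:
            channel = channels[i % len(channels)]
            message = self.ready.popleft()
            deliveries.append((channel, channel.deliver(message), message))
            i += 1
        return deliveries

    def requeue_unacked(self, channel):
        # Simulates a channel closing: its unacked messages return to the
        # front of the queue in their original order.
        for tag in sorted(channel.unacked, reverse=True):
            self.ready.appendleft(channel.unacked.pop(tag))


queue = Queue(["m1", "m2", "m3", "m4"])
ch_a, ch_b = Channel(), Channel()
deliveries = queue.dispatch([ch_a, ch_b])
tags = [("A" if ch is ch_a else "B", tag, msg) for ch, tag, msg in deliveries]
print(tags)  # both channels use tags 1 and 2 independently

ch_a.ack(1)                  # consumer A confirms "m1"
queue.requeue_unacked(ch_b)  # consumer B disappears without acking
print(list(queue.ready))     # "m2" and "m4" are available again
```

Tag 1 on channel A and tag 1 on channel B refer to different messages, which is why a tag is only meaningful to the consumer that received it.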

Related

RabbitMQ delivery throttle

So I'm testing RabbitMQ in one node. Plain and simple,
One producer sends messages to the queue,
Multiple consumers take tasks from that queue.
Currently the consumers process thousands of messages per second. They are too fast, so I need them to slow down. Consumer-side throttling is not an option because of the unreliable nature of the network.
Collectively, the consumers must not take more than 10 messages per second from that queue.
Is there a way to configure RabbitMQ so that the queue dispatches at most 10 messages per second?
If I remember correctly, once RabbitMQ has delivered a message to the queue, it is up to the consumers to consume it. There are various consumers in different languages and you haven't mentioned anything specific, so I'm giving a generic answer.
As I understand it, you shouldn't try to impose restrictions on RabbitMQ itself. Instead, consider implementing a pool of message consumers that handles no more than X messages simultaneously on the client side. Alternatively, you can put some kind of semaphore in the handler itself, but not on the RabbitMQ server.
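The semaphore idea can be sketched with the standard library alone. This is only an illustration under assumptions: `handle` is a hypothetical message handler, `MAX_IN_FLIGHT` is an arbitrary stand-in for "X", and the threads play the role of consumer callbacks. Note that a semaphore bounds how many messages are processed at the same time; it does not by itself enforce a per-second rate.

```python
import threading
import time

MAX_IN_FLIGHT = 3  # hypothetical limit, the "X" from the answer

slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)
lock = threading.Lock()
in_flight = 0
peak_in_flight = 0


def handle(message):
    """Hypothetical message handler guarded by the semaphore."""
    global in_flight, peak_in_flight
    with slots:  # blocks while MAX_IN_FLIGHT handlers are already running
        with lock:
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
        time.sleep(0.01)  # stand-in for real work
        with lock:
            in_flight -= 1


# Twenty "messages" arrive at once; at most three are handled concurrently.
threads = [threading.Thread(target=handle, args=(n,)) for n in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak_in_flight <= MAX_IN_FLIGHT)
```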

RabbitMQ: Prioritize consuming messages from multiple queues

If I have two queues from which I want to consume messages, and I use a single SimpleMessageListenerContainer for both, in which order are the listeners invoked and the messages consumed when both queues have messages?
Let me be more specific about the problem I am working on:
I have a consumer application which needs to consume messages from two queues, say regular-jobs-queue and infrequent-jobs-queue. If there are any messages in ‘infrequent-jobs-queue’, I want to consume those before consuming messages from ‘regular-jobs-queue’. I might not be able to combine these into a single RabbitMQ priority queue and assign a higher priority to the infrequent-job messages, because of some upcoming use cases such as purging regular jobs without affecting infrequent jobs.
I am aware that RabbitMQ supports consumer priority, but I am not sure it applies here. I want all instances of my consumer application to consume messages from infrequent-jobs-queue first, if there are any, rather than prioritize amongst the consumers.
Or should I have two containers, with dedicated consumer thread(s) per queue, and an internal priority-queue data structure into which I put messages as they are consumed from the RabbitMQ queues?
Any help would be really appreciated. Thanks.
~Rashida
You can't do what you want; messages will be delivered with equal priority.
Moving them to an internal in-memory queue will risk message loss.
You might want to consider using one of the RabbitTemplate.receive() or receiveAndConvert() methods instead of a message-driven container.
That way you have complete control.
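The polling approach amounts to checking the high-priority queue before every receive from the regular one. The sketch below shows only that control flow, in Python and with in-memory deques standing in for the broker; the `receive` function is a hypothetical stand-in for a polling call that returns one message or nothing, not the Spring API.

```python
from collections import deque

# In-memory stand-ins for the two broker queues named in the question.
queues = {
    "infrequent-jobs-queue": deque(["i1"]),
    "regular-jobs-queue": deque(["r1", "r2"]),
}


def receive(queue_name):
    """Hypothetical stand-in for a polling receive: one message or None."""
    q = queues[queue_name]
    return q.popleft() if q else None


def next_job():
    """Poll the high-priority queue first; fall back to the regular one."""
    for name in ("infrequent-jobs-queue", "regular-jobs-queue"):
        message = receive(name)
        if message is not None:
            return name, message
    return None


order = [next_job()[1]]
queues["infrequent-jobs-queue"].append("i2")  # arrives while work is ongoing
while (job := next_job()) is not None:
    order.append(job[1])
print(order)
```

Because a message is only taken from the broker when the application is ready for it, nothing is parked in an in-memory buffer where it could be lost.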

RabbitMQ distributing messages unevenly to consumers

We're seeing an issue where consumers of our message queues are picking up messages from queues at the top of the alphabetical range. We have two applications: a producer, and a subscriber. We're using RabbitMQ 3.6.1.
Let's say that the message queues are set up like so:
Our first application, the producer, puts say 100 messages/second onto each queue:
Our second application, the subscriber, has five unique consumer methods that can deal with messages on each respective queue. Each method binds to its respective queue. A subscriber has a prefetch of 1, meaning it can only hold one message at a time, regardless of queue. We may run numerous instances of the subscriber like so:
So the situation is thus: each queue is receiving 100 msg/sec, and we have four instances of subscriber consuming these messages, so each queue has four consumers. Let's say that the consumer methods can deal with 25 msg/sec each.
What happens is that instead of all the queues being consumed equally, the alphabetically higher queues get priority. It seems as though, when the subscriber becomes ready, RabbitMQ looks down the list of queues that this particular ready channel is bound to and picks the first queue with pending messages.
In our situation, A_QUEUE will have every message consumed. B_QUEUE may have some consumed in certain race conditions, but C_QUEUE/D_QUEUE and especially E_QUEUE will rarely get touched.
If we turn off the publisher, the queues will eventually drain, top to bottom.
Is it possible to configure either RabbitMQ itself or possibly even the channel to use some sort of round robin distribution policy or maybe even random policy so that when a channel has numerous bound queues, all with messages pending, the distribution is even?
To clarify: you have a single subscriber application with multiple consumers in it, right?
I'm guessing you're using a single RabbitMQ Connection within the subscriber app.
Are you also re-using a single RabbitMQ Channel for all of your consumers? If so, that would be a problem. Be sure to use a new Channel for each consumer you start.
Maybe the picture is wrong, but if it's not then your setup is wrong. You don't need 4 queues if you are going to have subscribers that listen to each and every queue. You'd just need one queue, that has multiple instances of the same subscriber consuming from it.
Now to answer: yes (and there is no need to configure anything, as long as prefetch is 1), RabbitMQ does distribute messages evenly. You can find out about that here, along with how your setup should look. Here is a quote from the link.
RabbitMQ just dispatches a message when the message enters the queue.
It doesn't look at the number of unacknowledged messages for a
consumer. It just blindly dispatches every n-th message to the n-th
consumer.
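The dispatch rule in the quote can be simulated in a few lines. This is a toy model, not broker code: the `round_robin` function and the consumer names are made up for illustration, and it deliberately ignores acknowledgements, just as the quote describes.

```python
from itertools import cycle


def round_robin(messages, consumers):
    """Assign the n-th message to the n-th consumer, wrapping around."""
    assignment = {consumer: [] for consumer in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        assignment[consumer].append(message)
    return assignment


# Eight messages on one queue, four consumers on that queue.
result = round_robin(list(range(1, 9)), ["c1", "c2", "c3", "c4"])
print(result)
```

Each consumer ends up with the same number of messages regardless of how long it takes to process them, which is the "blind" part of the behaviour.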

Persistent message queue with at-least-once delivery

I have been looking at message queues (currently choosing between Kafka and RabbitMQ) for one of my projects, where these are the biggest must-have features.
Must have features
Messages in queues should be persistent. (only until they are processed successfully by consumers.)
Messages in queues should be removed only when downstream consumers were able to process the message successfully. Basically, a consumer should ACK that it processed a message successfully.
Good to have features
To increase throughput, consumers should be able to pull a batch of messages from the queue.
If you go with Kafka, it retains messages only for a configurable duration, after which they are discarded to free up space, whether or not they have been consumed.
It is simply the responsibility of the Kafka consumers to keep track of what has been consumed.
IMHO, if you need to keep the messages persisted forever, then consider using a different storage medium (a database, maybe).
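The retention behaviour described above can be modelled with a toy log. This is a sketch under assumptions, not the Kafka API: the `Log` class, its methods, and the integer timestamps are all invented for illustration. It shows that records are dropped by age alone and that each consumer keeps its own offset.

```python
class Log:
    """Toy append-only log: records are dropped by age, never by consumption."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.records = []  # (offset, timestamp, value)
        self.next_offset = 0

    def append(self, value, now):
        self.records.append((self.next_offset, now, value))
        self.next_offset += 1

    def expire(self, now):
        # Keep only records younger than the retention period.
        self.records = [
            r for r in self.records if now - r[1] < self.retention_seconds
        ]

    def read_from(self, offset):
        return [(o, v) for o, _, v in self.records if o >= offset]


log = Log(retention_seconds=10)
log.append("a", now=0)
log.append("b", now=5)
log.append("c", now=8)

fast_offset = 3  # this consumer has already read everything
slow_offset = 0  # this consumer has read nothing yet

log.expire(now=12)  # "a" is now 12 seconds old and is discarded
remaining = log.read_from(slow_offset)
print(remaining)    # the slow consumer can no longer see "a"
```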

Does rabbitmq support to push the same data to multi consumers?

I have a rabbitmq cluster used as a working queue. There are 5 kinds of consumers who want to consume exactly the same data.
What I know so far is that I can use a fanout exchange to "copy" the data to 5 DIFFERENT queues, so that each of the 5 consumers consumes from its own queue. This is kind of a waste of resources because the data is the same in all five queues.
My question is: does RabbitMQ support pushing the same data to multiple consumers? For example, could a message need to be acked a specified number of times before it is deleted?
I got the following answer from rabbitmq email group. In short, the answer is no... and what I did above is the correct way.
http://rabbitmq.1065348.n5.nabble.com/Does-rabbitmq-support-to-push-the-same-data-to-multi-consumers-td36169.html#a36170
... fanout exchange to "copy" the data to 5 DIFFERENT queues. And the 5 consumers can consume different queue. This is kind of wasting resources because the data is the same in five queues.
You can consume with 5 consumers from one queue if you do not want to duplicate messages.
does rabbitmq support to push the same data to multiple consumers
In AMQP terms, you publish a message to an exchange and the broker (RabbitMQ) decides what to do with it. Assuming it works out which queue or queues the message is intended for, it puts the message on each of those queues (queues in RabbitMQ are classic FIFO queues, which somewhat departs from the AMQP model). Only after that may the message be delivered to a consumer (or die because of a queue length limit or a per-queue or per-message TTL, if any).
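The fanout routing discussed here can be pictured with a small in-memory model. It is only a sketch: `FanoutExchange`, `bind`, and `publish` are hypothetical names for this toy, and the queue names are invented. Every bound queue receives its own copy of each message, so each of the five kinds of consumer sees the full stream.

```python
from collections import deque


class FanoutExchange:
    """Toy fanout exchange: every bound queue gets a copy of each message."""

    def __init__(self):
        self.queues = {}

    def bind(self, queue_name):
        self.queues[queue_name] = deque()

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)


exchange = FanoutExchange()
for kind in range(1, 6):
    exchange.bind(f"consumer-kind-{kind}")  # one queue per kind of consumer

exchange.publish("data-1")
exchange.publish("data-2")

for name, queue in exchange.queues.items():
    print(name, list(queue))
```

By contrast, five consumers on a single queue would split the messages between them instead of each receiving all of them.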
message need to be acked for a specified times to be deleted
There is no way to change the message body or attributes after the message has been published (actually, the Dead Letter Exchanges extension and some others may change the routing key, for example, and add, remove, or change some headers, but this is a very specific case). So if you want to track the number of acks, you have to re-publish the consumed message with a changed body or header (depending on where you plan to store the ack counter; headers fit this nicely).
Also note that there is a redelivered message attribute, which denotes whether the message was already delivered to a consumer and then redelivered. This flag doesn't count the number of redeliveries, so its usage is quite limited.