How does RabbitMQ send messages to consumers? - rabbitmq

I am a newbie to RabbitMQ, hence need guidance on a basic question:
Does RabbitMQ send messages to consumer as they arrive?
OR
Does RabbitMQ send messages to consumer as they become available?
At message consumption endpoint, I am using com.rabbitmq.client.QueueingConsumer.
Looking at the client source code, I could figure out that:
QueueingConsumer keeps listening on the socket for any messages the broker sends to it.
Any message that is received is parsed and stored as a Delivery in a LinkedBlockingQueue encapsulated inside the QueueingConsumer.
This implies that even if the message processing endpoint is busy, messages will be pushed to the QueueingConsumer.
Is this understanding right?

TLDR: You keep reading messages from RabbitMQ until the prefetch count is reached, at which point you block and receive only heartbeat frames until the fetched messages are ACKed. So you can poll, but you will only get new messages if the number of unacknowledged messages is less than the prefetch count. New messages are put on the QueueingConsumer, and in theory you should never have much more than the prefetch count in the QueueingConsumer's internal queue.
Details:
At a low level (I'm probably going to get some of this wrong), RabbitMQ itself doesn't actually push messages. The client has to continuously read the connection for frames, as defined by the AMQP protocol. It's hard to classify this as push or pull, but just know that the client has to continuously read the connection, and because the Java client is sadly BIO (blocking I/O), it is a blocking/polling operation. The blocking/polling is governed by the AMQP heartbeat frames, regular frames, and the socket timeout configuration.
In the Java RabbitMQ client there is a thread for each channel (or maybe it's per connection), and that thread loops gathering frames from RabbitMQ, which eventually become commands that are put in a blocking queue (I believe it's like a SynchronousQueue, aka a handoff queue, but Rabbit has its own special one).
The QueueingConsumer is a higher-level API and will pull commands off the handoff queue mentioned earlier, because if commands are left on the handoff queue they will block the channel's frame-gathering loop. This can be bad because the connection may time out. The QueueingConsumer also allows work to be done on a separate thread instead of on the frame-gathering thread mentioned earlier.
Now if you look at most Consumer implementations, you will probably notice that they almost always use unbounded blocking queues. I'm not entirely sure why the bound on these queues can't be a multiple of the prefetch count, but if it is less than the prefetch count it will certainly cause problems with the connection timing out.
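To make the prefetch behaviour above concrete, here is a minimal sketch in Python. It is a toy model only: ToyBroker, consumer_buffer and the other names are made up for illustration and are not part of the RabbitMQ client or broker. It assumes the broker stops delivering once the number of unacknowledged messages reaches the prefetch count and resumes on each ack.

```python
from collections import deque


class ToyBroker:
    """Toy model of prefetch-limited delivery; not the real RabbitMQ client or broker."""

    def __init__(self, messages, prefetch_count):
        self.ready = deque(messages)      # messages still waiting in the queue
        self.unacked = set()              # delivered but not yet acknowledged
        self.prefetch_count = prefetch_count
        self.consumer_buffer = deque()    # stands in for QueueingConsumer's internal queue

    def deliver(self):
        """Deliver messages until the unacked count reaches the prefetch limit."""
        while self.ready and len(self.unacked) < self.prefetch_count:
            msg = self.ready.popleft()
            self.unacked.add(msg)
            self.consumer_buffer.append(msg)

    def ack(self, msg):
        """Acknowledge one message, which frees a slot in the prefetch window."""
        self.unacked.remove(msg)
        self.deliver()


broker = ToyBroker(messages=range(10), prefetch_count=3)
broker.deliver()
buffered_before_ack = list(broker.consumer_buffer)

first = broker.consumer_buffer.popleft()   # the application takes one delivery
broker.ack(first)                          # and acknowledges it after processing
buffered_after_ack = list(broker.consumer_buffer)
```

In this model the consumer-side buffer never holds more than prefetch_count messages, which is the reason the internal queue of the QueueingConsumer should stay small in practice even though it is unbounded.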

I think the best answer is the product's own, as RMQ has both push and pull mechanisms defined as part of the protocol. Have a look: https://www.rabbitmq.com/tutorials/amqp-concepts.html

RabbitMQ mainly uses a push mechanism. Polling consumes server bandwidth and leaves time gaps between polls, so it cannot achieve low latency. RabbitMQ pushes a message to the client as soon as there are consumers available for the queue, so the connection is long-running. ReadFrame in RabbitMQ basically waits for incoming frames.

Related

Resiliently processing messages from RabbitMQ

I'm not sure how to resiliently handle RabbitMQ messages in the event of an intermittent outage.
I subscribe in a Windows service, read the message, then store it in my database. If I can't process the record because of the data, I publish it to a dead letter queue for a human to address and reprocess.
I am not sure what to do if I have some intermittent technical issue that will fix itself (database reboot, network outage, drive space, etc.). I don't want hundreds of messages showing up on the dead letter queue that just needed to wait for a glitch to pass but now would be waiting on a human.
Currently, I re-queue the event and retry it once, but it retries so fast the issue is not usually resolved. I thought of retrying forever but I don't want a real issue to get stuck in an infinite loop.
This is a broad topic, but on the server side you could persist your messages and make your queues durable. This means that if the server gets restarted, they won't be lost. See more here: How to persist messages during RabbitMQ broker restart?
For the consumer (client), it will depend on how you configure your client. From the docs:
In the event of network failure (or a node crashing), messages can be duplicated, and consumers must be prepared to handle them. If possible, the simplest way to handle this is to ensure that your consumers handle messages in an idempotent way rather than explicitly deal with deduplication.
If a message is delivered to a consumer and then requeued (because it was not acknowledged before the consumer connection dropped, for example) then RabbitMQ will set the redelivered flag on it when it is delivered again (whether to the same consumer or a different one). This is a hint that a consumer may have seen this message before (although that's not guaranteed, the message may have made it out of the broker but not into a consumer before the connection dropped). Conversely if the redelivered flag is not set then it is guaranteed that the message has not been seen before. Therefore if a consumer finds it more expensive to deduplicate messages or process them in an idempotent manner, it can do this only for messages with the redelivered flag set.
Check more here: https://www.rabbitmq.com/reliability.html#consumer
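As a rough illustration of the last point in the quoted docs, the sketch below runs the duplicate check only for messages whose redelivered flag is set. It is plain Python with no broker involved, and the handle function, the message_id argument and the in-memory processed_ids set are all assumptions made for the example; a real service would keep that record in its database.

```python
processed_ids = set()   # hypothetical record of handled messages (in-memory for the sketch)
stored = []             # stand-in for the rows written to the database


def handle(message_id, body, redelivered):
    """Process one message, paying for the duplicate check only on redeliveries."""
    if redelivered and message_id in processed_ids:
        return "skipped-duplicate"
    stored.append(body)
    processed_ids.add(message_id)
    return "processed"


first_attempt = handle("m1", "row-1", redelivered=False)
second_attempt = handle("m1", "row-1", redelivered=True)    # seen before, so skipped
never_seen = handle("m2", "row-2", redelivered=True)        # flag set but not seen before
```

The third call shows why the flag is only a hint: a redelivered message may never have reached the consumer, so it still has to be processed.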

RabbitMQ delivery throttle

So I'm testing RabbitMQ in one node. Plain and simple,
One producer sends messages to the queue,
Multiple consumers take tasks from that queue.
Currently the consumers process thousands of messages per second; they are too fast, so I need them to slow down. Managing consumer-side throttling is not possible due to the unreliable nature of the network.
Collectively, the consumers must not take more than 10 messages per second from that queue.
Is there a way to configure RabbitMQ so that the queue dispatches a maximum of 10 messages per second?
If I remember correctly, once RabbitMQ has delivered a message to the queue, it's up to the consumers to consume it. There are consumers in various languages and you haven't mentioned anything specific, so I'm giving a generic answer.
In my understanding, you shouldn't try to impose any restrictions on RabbitMQ itself. Instead, consider implementing a pool of message consumers that handles no more than X messages simultaneously on the client side. Alternatively, you can put some kind of semaphore in the handler itself, but not on the RabbitMQ server.
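A minimal sketch of the semaphore idea, in plain Python threads rather than any particular RabbitMQ client. MAX_IN_FLIGHT plays the role of X and its value is an assumption, as are the handle function and the sleep that stands in for real work. Note that this caps how many messages are handled at the same time; it does not by itself enforce a messages-per-second rate.

```python
import threading
import time

MAX_IN_FLIGHT = 3                               # the "X" above; the value is an assumption
slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)
lock = threading.Lock()
in_flight = 0
peak_in_flight = 0
handled = []


def handle(message):
    """Handle one message while holding one of the semaphore slots."""
    global in_flight, peak_in_flight
    with slots:
        with lock:
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
        time.sleep(0.01)                        # stand-in for the real work
        with lock:
            in_flight -= 1
            handled.append(message)


# Twelve deliveries arrive at once; only MAX_IN_FLIGHT are processed concurrently.
threads = [threading.Thread(target=handle, args=(i,)) for i in range(12)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```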

API design around RabbitMQ for publisher/subscriber

TL;DR - What's the best way to expose RabbitMQ to a consumer via REST API?
I'm creating an API to publish and consume messages from RabbitMQ. In my current design, the publisher is going to make a POST request. My API will route the POST request to the exchange. In this way, the publisher doesn't have to know the server address, exchange name, etc. while publishing.
Now the consumer part is where I'm not sure how to proceed.
At the beginning there will be no queues. When a new consumer wants to subscribe to a TOPIC, I will create a queue and bind it to the exchange. I need help with a few questions:
Once I create a queue for the consumer, what's the next step to let the consumer get messages from that queue?
I make the consumer ask for a batch of messages(say 50 messages) from the queue. Then once I receive an ack from the consumer I will send the next 50 messages from queue. If I don't receive an ack I will requeue the 50 messages back into the queue. Isn't this expensive in terms of opening and closing connection between the consumer and my API?
If there is a better approach, please suggest it.
In general, your idea of putting RMQ behind a REST API is a good one. You don't want to expose RMQ to the world, directly.
For the specific questions:
Once I create a queue for the consumer, what's the next step to let the consumer get messages from that queue?
Have you read the tutorials? I would start there, for the language you are working with: http://www.rabbitmq.com/getstarted.html
Isn't this expensive in terms of opening and closing connection between the consumer and my API?
Don't open and close connections for each batch of messages.
Your application instance (the "consumer" app) should have a single connection. That connection stays open as long as you need it - across as many calls to RabbitMQ as you want.
I typically open my RMQ connection as soon as the app starts, and I leave it open until the app shuts down.
Within the consumer app, using that one single connection, you will create multiple channels through the connection. A channel is where the actual work is done.
Depending on your language, you will have a single channel per thread, a single channel per queue being consumed, etc.
You can create and destroy channels very quickly, unlike connections.
More specifically, for your idea of batch processing: this is handled by putting a prefetch limit on your consumer and then requiring messages to be acknowledged after they are processed.

Can any of my consumer take the messages from queue?

I am developing an app and I am using ActiveMQ. Is there any way to have one producer always send messages to one broker while there are 3 consumers on the other side? Each consumer listens to the broker and can take any message from the queue. Is this possible?
I am using ActiveMQ to write my app's logs to a database. As you know, writing logs to a database is a time-consuming process, which is why the consumer is much slower than the producer. For example, I send 100,000 messages (huge objects). The producer finishes sending them in 20 minutes, but by the time the producer has finished, the consumer has processed only 4,000 messages.
Yes, what you are describing is possible. In fact, you can have any number of consumers listening on a single queue. The messages are dispatched in a round-robin fashion between consumers.
What you should be aware of is that ActiveMQ performs much better sending small messages than large ones. If you need to send very large payloads (e.g. 100 MB), you are far better off saving the payload to a location that is accessible by both the producer and the consumers (e.g. a network file system) and sending the location instead. The consumer can then use that location to read the payload itself. This way you get a relatively small amount of traffic through the message broker.
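The "send the location, not the payload" idea can be sketched as follows. This is plain Python with stand-ins: a temporary directory plays the shared file system and a queue.Queue plays the broker queue, and publish_large, consume_large and the file name are hypothetical names for illustration, not ActiveMQ APIs.

```python
import os
import queue
import tempfile

shared_dir = tempfile.mkdtemp()     # stand-in for a shared network file system
broker_queue = queue.Queue()        # stand-in for the broker queue


def publish_large(payload: bytes, name: str) -> None:
    """Write the payload to shared storage and enqueue only its path."""
    path = os.path.join(shared_dir, name)
    with open(path, "wb") as f:
        f.write(payload)
    broker_queue.put(path)          # the message itself is just a short string


def consume_large() -> bytes:
    """Take a path off the queue and read the payload from shared storage."""
    path = broker_queue.get()
    with open(path, "rb") as f:
        return f.read()


payload = b"x" * 1_000_000
publish_large(payload, "log-0001.bin")
received = consume_large()
```

The broker only ever carries the path, so message size stays small regardless of how large the stored payload is.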

RabbitMQ - Does one consumer block the other consumers of the same queue?

I'm in a phase of learning RabbitMQ/AMQP from the RabbitMQ documentation. Something that is not clear to me that I wanted to ask those who have hands-on experience.
I want to have multiple consumers listening to the same queue in order to balance the work load. What I need is pretty much close to the "Work Queues" example in the RabbitMQ tutorial.
I want the consumer to acknowledge message explicitly after it finishes handling it to preserve the message and delegate it to another consumer in case of crash. Handling a message may take a while.
My question is whether AMQP postpones next message processing until the previous message is ack'ed? If so how do I achieve load balancing between multiple workers and guarantee no messages get lost?
No, the other consumers don't get blocked. Other messages will get delivered even if they have unacknowledged but delivered predecessors. If a channel closes while holding unacknowledged messages, those messages get returned to the queue.
See RabbitMQ Broker Semantics
Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject and basic.nack), or due to a channel closing while holding unacknowledged messages.
EDIT In response to your comment:
Time to dive a little deeper into the AMQP specification then perhaps:
3.1.4 Message Queues
A message queue is a named FIFO buffer that holds message on behalf of a set of consumer applications.
Applications can freely create, share, use, and destroy message queues, within the limits of their authority.
Note that in the presence of multiple readers from a queue, or client transactions, or use of priority fields,
or use of message selectors, or implementation-specific delivery optimisations the queue MAY NOT
exhibit true FIFO characteristics. The only way to guarantee FIFO is to have just one consumer connected
to a queue. The queue may be described as “weak-FIFO” in these cases. [...]
3.1.8 Acknowledgements
An acknowledgement is a formal signal from the client application to a message queue that it has
successfully processed a message.[...]
So acknowledgement confirms processing, not receipt. The broker will hold on to a message until it has been acknowledged, so that it can redeliver it if necessary. But it is free to deliver more messages to consumers even if the preceding messages have not yet been acknowledged. The consumers will not be blocked.
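Here is a small toy model of that behaviour, written in plain Python. ToyQueue and its methods are invented for the illustration and do not correspond to any RabbitMQ API. It assumes round-robin dispatch with no prefetch limit, so messages keep flowing to consumers regardless of outstanding acks, and a closing consumer has its unacknowledged messages returned to the queue.

```python
from collections import deque


class ToyQueue:
    """Toy model of one queue shared by several consumers; not the real broker."""

    def __init__(self, messages, consumers):
        self.ready = deque(messages)
        self.unacked = {name: [] for name in consumers}   # outstanding messages per consumer
        self.turn = 0

    def dispatch_all(self):
        """Hand out every ready message round-robin, without waiting for acks."""
        names = list(self.unacked)
        while self.ready and names:
            name = names[self.turn % len(names)]
            self.unacked[name].append(self.ready.popleft())
            self.turn += 1

    def ack(self, name, msg):
        """The consumer confirms it has finished processing msg."""
        self.unacked[name].remove(msg)

    def close(self, name):
        """Return the consumer's unacknowledged messages to the queue and redeliver them."""
        returned = self.unacked.pop(name)
        self.ready.extendleft(reversed(returned))
        self.dispatch_all()


q = ToyQueue(["m1", "m2", "m3", "m4"], ["A", "B"])
q.dispatch_all()
after_dispatch = {name: list(msgs) for name, msgs in q.unacked.items()}

q.ack("B", "m2")      # B finishes m2 while A is still busy with m1
q.close("A")          # A crashes before acknowledging anything
after_crash = {name: list(msgs) for name, msgs in q.unacked.items()}
```

B received m2 and m4 while A still held m1 unacknowledged, and after A closed, m1 and m3 went back to the queue and were delivered to B, so nothing was lost.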