RabbitMQ Clustering node failure

RabbitMQ Clustering node failure - rabbitmq

I have a RabbitMQ cluster (without HA) setup with the nodes in multiple instances. From the documentation, what I understood is, in cluster mode, the queues are not mirrored and it is owned by the node where it is declared.
So, now the question is, what will happen when the node which owns the queue goes down? Correct me if I'm wrong, since the queues are not mirrored, the client applications will throw up for missing queues.
Should we write our logic to figure out if the node goes down, the queues have to be re declared and in this case, what will happen to the messages?

So, now the question is, what will happen when the node which owns the queue goes down?
From the docs:
When RabbitMQ quits or crashes it will forget the queues and messages unless you tell it not to. Two things are required to make sure that messages aren't lost: we need to mark both the queue and messages as durable.
next question:
Should we write our logic to figure out if the node goes down, the queues have to be re declared and in this case, what will happen to the messages?
Yes, its a good idea to re-declare your queues.
In case when your node is going down, all consumers connected to it will be disconnected. Every time consumer connects, it should assume it's queue does not exist and so, it needs to fire declare queue request as first request when its connected.
If consumer sends declare queue request and queue does exist then:
declare queue won't affect queue's
messages in any way. If messages were persisted, they continue to be
in the queue.
Under normal circumstances (if you don't change the queue's
properties) no errors will be thrown

Related

RabbitMQ more messages than expected on fixed size queue

I have a publisher that sends messages to a consumer that moves a motor.
The motor has a work queue which I cannot access, and it works slower than the rate of the incoming messages, so I'm trying to control the traffic on the consumer.
To keep updated and relevant data coming to the motor without the queue filling up and creating a traffic jam, I set the RabbitMQ queue size limit to 5 and basicQos to 1.
The idea is that the RabbitMQ queue will drop the old messages when it is filled up, so the newest commands are at the front of the queue.
Also by setting basicQos to 1 I ensure that the consumer doesn't grab all messages from the queue and bombards the motor at once, which is exactly what i'm trying to avoid since I can't do anything once the command was sent to the motor.
This way the consumer takes messages from the queue one by one, while new messages replace the old ones on the queue.
Practically this moves the bottleneck to the RabbitMQ queue instead of the motor's queue.
I also cannot check the motor's work queue, so all traffic control must be done on the consumer.
I added messageId and tested, and found out many messages are still coming and going long after the publisher is being shut down.
I'm expecting around 5 messages after shutdown since that's the size of the queue, but i'm getting hundreds.
I also added a few seconds of sleep inside the callback to make sure this isn't the robot queue that's acting up, but i'm still getting many messages after shutdown, and I can see in the logs the callback is being called every time so it's definitely still getting messages from somewhere.
Please help.
Thanks.

Moving the acknowledgment to the end of the callback solved the problem.
I'm guessing that by setting basicQos to 1 it did execute the callback for each message one after another, but in the background it kept grabbing messages from the queue.
So even when the publisher was shutdown, the consumer still had messages that were taken from the queue in it, and those messages were the ones that I saw being executed.

Resiliently processing messages from RabbitMQ

I'm not sure how to resiliently handle RabbitMQ messages in the event of an intermittent outage.
I subscribe in a windows service, read the message, then store it my database. If I can't process the record because of the data I publish it to a dead letter queue for a human to address and reprocess.
I am not sure what to do if I have some intermittent technical issue that will fix itself (database reboot, network outage, drive space, etc). I don't want hundreds of messages showing up on dead letter that just needed to wait for a for a glitch but now would be waiting on a human.
Currently, I re-queue the event and retry it once, but it retries so fast the issue is not usually resolved. I thought of retrying forever but I don't want a real issue to get stuck in an infinite loop.

Is a broad topic but from the server side you could persist your messages and make your queues durable, this means that in the eventuality the server gets restarted they won't be lost, check more here How to persist messages during RabbitMQ broker restart?
For the consumer (client) it will depend on how you configure your client, from the docs:
In the event of network failure (or a node crashing), messages can be duplicated, and consumers must be prepared to handle them. If possible, the simplest way to handle this is to ensure that your consumers handle messages in an idempotent way rather than explicitly deal with deduplication.
If a message is delivered to a consumer and then requeued (because it was not acknowledged before the consumer connection dropped, for example) then RabbitMQ will set the redelivered flag on it when it is delivered again (whether to the same consumer or a different one). This is a hint that a consumer may have seen this message before (although that's not guaranteed, the message may have made it out of the broker but not into a consumer before the connection dropped). Conversely if the redelivered flag is not set then it is guaranteed that the message has not been seen before. Therefore if a consumer finds it more expensive to deduplicate messages or process them in an idempotent manner, it can do this only for messages with the redelivered flag set.
Check more here: https://www.rabbitmq.com/reliability.html#consumer

RabbitMQ: dropping messages when no consumers are connected

I'm trying to setup RabbitMQ in a model where there is only one producer and one consumer, and where messages sent by the producer are delivered to the consumer only if the consumer is connected, but dropped if the consumer is not present.
Basically I want the queue to drop all the messages it receives when no consumer is connected to it.
An additional constraint is that the queue must be declared on the RabbitMQ server side, and must not be explicitly created by the consumer or the producer.
Is that possible?
I've looked at a few things, but I can't seem to make it work:
durable vs non-durable does not work, because it is only useful when the broker restarts. I need the same effect but on a connection.
setting auto_delete to true on the queue means that my client can never connect to this queue again.
x-message-ttl and max-length make it possible to lose message even when there is a consumer connected.
I've looked at topic exchanges, but as far as I can tell, these only affect the routing of messages between the exchange and the queue based on the message content, and can't take into account whether or not a queue has connected consumers.
The effect that I'm looking for would be something like auto_delete on disconnect, and auto_create on connect. Is there a mechanism in rabbitmq that lets me do that?

After a bit more research, I discovered that one of the assumptions in my question regarding x-message-ttl was wrong. I overlooked a single sentence from the RabbitMQ documentation:
Setting the TTL to 0 causes messages to be expired upon reaching a queue unless they can be delivered to a consumer immediately
https://www.rabbitmq.com/ttl.html
It turns out that the simplest solution is to set x-message-ttl to 0 on my queue.

You can not doing it directly, but there is a mechanism not dificult to implement.
You have to enable the Event Exchange Plugin. This is a exchange at which your server app can connect and will receive internal events of RabbitMQ. You would be interested in the consumer.created and consumer.deleted events.
When these events are received you can trigger an action (create or delete the queue you need). More information here: https://www.rabbitmq.com/event-exchange.html
Hope this helps.

If your consumer is allowed to dynamically bind / unbind a queue during start/stop on the broker it should be possible by that way (e.g. queue is pre setup and the consumer binds the queue during startup to an exchange it wants to receive messages from)

AMQP/RabbitMQ - Process messages sequentially

I have one direct exchange. There is also one queue, bound to this exchange.
I have two consumers for that queue. The consumers are manually ack'ing the messages once they've done the corresponding processing.
The messages are logically ordered/sorted, and should be processed in that order. Is it possible to enforce that all messages are received and processed sequentially accross consumer A and consumer B? In other words, prevent A and B from processing messages at the same time.
Note: the consumers are not sharing the same connection and/or channel. This means I cannot use <channel>.basicQoS(1);.
Rationale of this question: both consumers are identicall. If one goes down, the other queue starts processing messages and everything keeps working without any required intervention.

One approach to handling failover in a case where you want redundant consumers but need to process messages in a specific order is to use the exclusive consumer option when setting up the bind to the queue, and to have two consumers who keep trying to bind even when they can't get the exclusive lock.
The process is something like this:
Consumer A starts first and binds to the queue as an exclusive consumer. Consumer A begins processing messages from the queue.
Consumer B starts next and attempts to bind to the queue as an exclusive consumer, but is rejected because the queue already has an exclusive consumer.
On a recurring basis, consumer B attempts to get an exclusive bind on the queue but is rejected.
Process hosting consumer A crashes.
Consumer B attempts to bind to the queue as an exclusive consumer, and succeeds this time. Consumer B starts processing messages from the queue.
Consumer A is brought back online, and attempts an exclusive bind, but is rejected now.
Consumer B continues to process messages in FIFO order.
While this approach doesn't provide load sharing, it does provide redundancy.

Even though this is already answered. May be this can help others.
RabbitMQ has a feature known as Single Active Consumer, which matches your case.
We can have N consumers attached to a Queue but only 1 (one) of them will be actively consuming messages from the Queue. Fail-over happens only when active consumer fails.
Kindly take a look at the link https://www.rabbitmq.com/consumers.html#single-active-consumer
Thank you

Usually the point of a MQ system is to distribute workload. Of course, there are some situations where processing of message N depends on result of processing the message N-1, or even the N-1 message itself.
If A and B can't process messages at the same time, then why not just have A or just B? As I see it, you are not saving anything with having 2 consumers in a way that one can work only when the other one is not...
In your case, it would be best to have one consumer but to actually do the parallelisation (not a word really) on the processing part.
Just to add that RMQ is distributing messages evenly to all consumers (in round-robin fashion) regardless on any criteria. Of course this is when prefetch is set to 1, which by default it is. More info on that here, look for "fair dispatch".

RabbitMQ clustering and mirror queues behavior behind the scenes

Can someone please explain what is going on behind the scenes in a RabbitMQ cluster with multiple nodes and queues in mirrored fashion when publishing to a slave node?
From what I read, it seems that all actions other than publishes go only to the master and the master then broadcasts the effect of the actions to the slaves(this is from the documentation). Form my understanding it means a consumer will always consume message from the master queue. Also, if I send a request to a slave for consuming a message, that slave will do an extra hop by getting to the master for fetching that message.
But what happens when I publish to a slave node? Will this node do the same thing of sending first the message to the master?
It seems there are so many extra hops when dealing with slaves, so it seems you could have a better performance if you know only the master. But how do you handle master failure? Then one of the slaves will be elected master, so you have to know where to connect to?
Asking all of this because we are using RabbitMQ cluster with HAProxy in front, so we can decouple the cluster structure from our apps. This way, whenever a node goes done, the HAProxy will redirect to living nodes. But we have problems when we kill one of the rabbit nodes. The connection to rabbit is permanent, so if it fails, you have to recreate it. Also, you have to resend the messages in this cases, otherwise you will lose them.
Even with all of this, messages can still be lost, because they may be in transit when I kill a node (in some buffers, somewhere on the network etc). So you have to use transactions or publisher confirms, which guarantee the delivery after all the mirrors have been filled up with the message. But here another issue. You may have duplicate messages, because the broker might have sent a confirmation that never reached the producer (due to network failures, etc). Therefore consumer applications will need to perform deduplication or handle incoming messages in an idempotent manner.
Is there a way of avoiding this? Or I have to decide whether I can lose couple of messages versus duplication of some messages?

Can someone please explain what is going on behind the scenes in a RabbitMQ cluster with multiple nodes and queues in mirrored fashion when publishing to a slave node?
This blog outlines exactly what happens.
But what happens when I publish to a slave node? Will this node do the same thing of sending first the message to the master?
The message will be redirected to the master Queue - that is, the node on which the Queue was created.
But how do you handle master failure? Then one of the slaves will be elected master, so you have to know where to connect to?
Again, this is covered here. Essentially, you need a separate service that polls RabbitMQ and determines whether nodes are alive or not. RabbitMQ provides a management API for this. Your publishing and consuming applications need to refer to this service either directly, or through a mutual data-store in order to determine that correct node to publish to or consume from.
The connection to rabbit is permanent, so if it fails, you have to recreate it. Also, you have to resend the messages in this cases, otherwise you will lose them.
You need to subscribe to connection-interrupted events to react to severed connections. You will need to build in some level of redundancy on the client in order to ensure that messages are not lost. I suggest, as above, that you introduce a service specifically designed to interrogate RabbitMQ. You client can attempt to publish a message to the last known active connection, and should this fail, the client might ask the monitor service for an up-to-date listing of the RabbitMQ cluster. Assuming that there is at least one active node, the client may then establish a connection to it and publish the message successfully.
Even with all of this, messages can still be lost, because they may be in transit when I kill a node
There are certain edge-cases that you can't cover with redundancy, and neither can RabbitMQ. For example, when a message lands in a Queue, and the HA policy invokes a background process to copy the message to a backup node. During this process there is potential for the message to be lost before it is persisted to the backup node. Should the active node immediately fail, the message will be lost for good. There is nothing that can be done about this. Unfortunately, when we get down to the level of actual bytes travelling across the wire, there's a limit to the amount of safeguards that we can build.
herefore consumer applications will need to perform deduplication or handle incoming messages in an idempotent manner.
You can handle this a number of ways. For example, setting the message-ttl to a relatively low value will ensure that duplicated messages don't remain on the Queue for extended periods of time. You can also tag each message with a unique reference, and check that reference at the consumer level. Of course, this would require storing a cache of processed messages to compare incoming messages against; the idea being that if a previously processed message arrives, its tag will have been cached by the consumer, and the message can be ignored.
One thing that I'd stress with AMQP and Queue-based solutions in general is that your infrastructure provides the tools, but not the entire solution. You have to bridge those gaps based on your business needs. Often, the best solution is derived through trial and error. I hope my suggestions are of use. I blog about a number of RabbitMQ design solutions here, including the issues you mentioned, here if you're interested.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas