We have set up a RabbitMQ cluster with 3 nodes. In an effort to have some form of load balancing, we set up the policy to only sync across 2 of the nodes:
rabbitmqctl set_policy ha-2 . '{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}'
This works as expected when all 3 nodes are online.
When we shut down one of the nodes (to simulate a failure), queues mastered on the failed node are still available (on the slave) but are not synchronized to another node. If we manually re-apply the policy, the queues then synchronize as expected.
Should we expect all queues to remain mirrored under this policy when one node fails?
Works as expected in RabbitMQ 3.5.4
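If a broker does leave queues unmirrored after a node failure, the manual workaround from the question (re-applying the policy) can be scripted. This is only a minimal sketch: the policy name ha-2 and the pattern . come from the question, the function names are hypothetical, and reapply_ha_policy assumes rabbitmqctl is on the PATH of the host where it runs.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON definition for an "exactly N" mirroring policy.
ha_exactly_definition() {
    printf '{"ha-mode":"exactly","ha-params":%s,"ha-sync-mode":"automatic"}' "$1"
}

# Hypothetical helper: re-apply the ha-2 policy from the question.
# Assumes rabbitmqctl is installed and allowed to manage the broker.
reapply_ha_policy() {
    rabbitmqctl set_policy ha-2 "." "$(ha_exactly_definition 2)"
}
```

Calling reapply_ha_policy after a node has been taken down repeats the manual step described above; it does not change how the broker itself reacts to the failure.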
We have the following setup:
Now, on the Upstream side, I see two connections to the Cluster. One to rabbitmq-1 and one to rabbitmq-2.
The one to rabbitmq-1 is piling up messages. Note the message count of 413'584.
In the downstream, on the Cluster, I see only the connection to rabbitmq-2.
If I delete the queue to rabbitmq-1, it reappears after some time.
Why are there two queues, and why is the one to rabbitmq-1 not processing any messages?
This happens in the following case:
Your cluster has no name defined. In that case, the name of the node is used as the cluster name.
Your cluster is behind a load balancer which selects a node randomly.
You use the load balancer URL to set up the federation upstream. In that case, when a node restarts, the connection is made from another node, which has a different name.
Solution
The easiest solution is to set the cluster name on any node in the cluster with the following command:
rabbitmqctl set_cluster_name "rabbitmq-cluster"
After that, all nodes in the cluster will return the same name and no redundant exchanges or queues will be created.
Overview
A RabbitMQ broker is a logical grouping of one or several Erlang
nodes, each running the RabbitMQ application and sharing users,
virtual hosts, queues, exchanges, bindings, and runtime parameters.
Sometimes we refer to the collection of nodes as a cluster.
Why would you do this? I understand that it increases the durability of messages (if a node goes down, other queues still get the messages). But what about performance? How does a cluster improve performance? Won't all consumers/producers connect to the master node's queue anyway? If so, aren't we still getting traffic on a single node regardless? Do we put a load balancer in front so traffic is directed at different nodes each time?
How does RabbitMQ cluster increase performance?
Well, right after that paragraph, the documentation states the following:
What is Replicated?
All data/state required for the operation of a RabbitMQ broker is
replicated across all nodes. An exception to this are message queues,
which by default reside on one node, though they are visible and
reachable from all nodes. To replicate queues across nodes in a
cluster, see the documentation on high availability (note that you
will need a working cluster first).
So, you would cluster to provide higher capacity in your RabbitMQ broker than a single node can provide alone. Note that clustering by itself is not a high-availability strategy.
Your assertion that message durability is increased is false, as message queues continue to reside on one broker (unless mirroring is used).
By default, contents of a queue within a RabbitMQ cluster are located on a single node (the node on which the queue was declared) [1]
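Without mirroring, when that node goes down, the queue becomes unavailable and any messages on it that were not persisted are lost. A non-durable queue can be redeclared on a different node, but a durable queue stays unavailable until its node comes back. RabbitMQ does not handle network partitions well, so this can be a bit of a problem.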
"Aren't we still getting traffic on a single node regardless?" - if you only have one queue, then yes. However, a bigger question is "why would you run a message broker with only one queue?" Similarly, if you only create queues on one node, then you will still have one point of failure in the system.
I set up a lab to test HA for RabbitMQ using clustering and mirrored queues.
I am using CentOS 7 and rabbitmq-server 3.3.5 with three servers (ha1, ha2, ha3).
I have just joined ha1 and ha2 to ha3, but have not set a policy for mirrored queues. When I create a test queue named "hello" on ha1 and then check ha2 and ha3 using rabbitmqctl list_queues, the hello queue exists on all nodes in the cluster.
My question: why, when I have not set a mirroring policy on the cluster, does the queue automatically appear on every node in the cluster?
Please advise whether I have made a mistake, or whether simply joining nodes to a cluster causes queues to be mirrored on all nodes. Thanks
In RabbitMQ, by default, a queue is stored on only one node. When you create a cluster, the queue is visible across all nodes.
But that doesn't mean the queue is mirrored: if the node goes down, the queue is marked as down and you can't access it.
Suppose you create a queue on one node; the queue will work as long as the node is up, as:
If the node is down, you will have:
You should always apply a mirroring policy; otherwise you could lose messages.
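To check whether a queue is merely visible cluster-wide or actually mirrored, the slave_pids column of rabbitmqctl list_queues can be inspected. The following is a rough sketch; the function name is hypothetical, and it assumes an unmirrored queue shows an empty (or "[]") slave_pids column, which may differ between RabbitMQ releases.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>slave_pids" lines on stdin and print
# the names of queues that have no mirrors.
# Assumption: an unmirrored queue shows an empty (or "[]") slave_pids column.
unmirrored_queues() {
    awk -F '\t' '$1 != "" && ($2 == "" || $2 == "[]") { print $1 }'
}

# Example use on a cluster node (-q suppresses informational messages;
# newer releases may also print a header row that needs stripping first):
#   rabbitmqctl -q list_queues name slave_pids | unmirrored_queues
```

In the lab described in the question, the "hello" queue would be expected to show up in this list until a mirroring policy matches it.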
I would like to run RabbitMQ Highly Available Queues in a cluster of two RabbitMQ instances on two separate servers. It's not clear to me from the documentation how I can detect which node RabbitMQ considers the master, in order to determine which node I should publish messages to and consume from.
Is that something that RabbitMQ resolves internally (and so I can publish and consume from master even when connected to a slave node) or should the application know about master node for each queue and connect only to it?
RabbitMQ will take care of that. The idea of HA queues is that you publish and consume from either node, and RabbitMQ will try to keep a consistent state.
I have a clustered HA rabbitmq setup. I am using the "exactly" policy similar to:
rabbitmqctl set_policy ha-two "^two\." '{"ha-mode":"exactly","ha-params":10,"ha-sync-mode":"automatic"}'
I have 30 machines running, of which 10 are HA nodes with queues replicated. When my broker goes down (randomly assigned to be the first HA node), I need my celery workers to point to a new HA node (one of the 9 left). I have a script that automates this. The problem is that I do not know how to distinguish between a regular cluster node and an HA node. When I issue the command:
rabbitmqctl cluster_status
The categories I get are "running nodes", "disc", and "ram", but there is no way here to tell if a node is HA.
Any ideas?
In a cluster, every node shares everything with the others, so you don't have to distinguish between nodes in your application in order to access all entities.
In your case, when one of the HA nodes goes down (their number drops to 9), HA queues will be replicated to the first available node (it doesn't matter whether it is disc or ram).
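If a script still needs to know which nodes currently hold a queue, mirroring is reported per queue rather than per node: rabbitmqctl list_queues accepts the pid and slave_pids columns, which identify the master and its mirrors. Below is a small sketch for turning a printed pid into a node name. The function names are hypothetical, and the pid format (node name followed by three numeric components, such as <rabbit@ha1.1.234.0>) is an assumption about the output.

```shell
#!/bin/sh
# Hypothetical helper: extract the node name from a printed queue pid.
# Assumption: pids look like <rabbit@ha1.1.234.0>, i.e. the node name
# followed by three numeric components.
node_of_pid() {
    printf '%s\n' "$1" | sed -e 's/^<//' -e 's/>$//' \
        -e 's/\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*$//'
}

# Hypothetical helper: read "name<TAB>pid" lines and print "name<TAB>node".
queue_master_nodes() {
    while IFS="$(printf '\t')" read -r name pid; do
        [ -n "$name" ] || continue
        printf '%s\t%s\n' "$name" "$(node_of_pid "$pid")"
    done
}

# Example use on a cluster node:
#   rabbitmqctl -q list_queues name pid | queue_master_nodes
```

Since clients can reach every queue through any running node, a lookup like this is mainly useful for diagnostics rather than for deciding where workers should connect.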