Balance RabbitMQ master queues across nodes

I have a cluster of 3 RabbitMQ nodes and I want to keep master queues balanced across all nodes, even after node reboots. However, master queues don't rebalance when a new node joins the cluster or when one of the nodes disconnects and reconnects.
Example: I create 100 queues on nodes A, B and C.
If node C shuts down, the master queues from C are redistributed roughly evenly between nodes A and B. So at this point, nodes A and B each have approximately 50 master queues.
Now, if I reconnect node C, it will keep 0 master queues until new queues are created. This is problematic because I want all my nodes to perform the same amount of work.
My exchanges are durable, my queues are durable and mirrored, and my messages are persistent. I want to avoid losing messages.
I know there is a way to change the master node manually using a policy trick. But this is not satisfactory, since it breaks HA (by inducing a resynchronisation of all mirrors).

You can use this command:
rabbitmq-queues rebalance <type> --vhost-pattern <pattern> --queue-pattern <pattern>
For example:
rabbitmq-queues rebalance "all" --vhost-pattern "a-vhost" --queue-pattern ".*"
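To check how queue masters ended up distributed after a rebalance, you can list each queue with the pid of its master process (the pid includes the hosting node's name); the vhost below is just the example from above:
rabbitmqctl list_queues -p "a-vhost" name pid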

One solution is to use Federated Queues.
A federated queue links to other queues (called upstream queues). It will retrieve messages from upstream queues in order to satisfy demand for messages from local consumers.
You can create a completely new cluster which is both upstream and downstream from the original cluster. You also need to ensure that both your publishers and consumers reconnect periodically (to prevent one cluster from monopolizing all connections, which would defeat the load balancing).
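A minimal sketch of the federation wiring, assuming the federation plugin is enabled on the downstream cluster; the upstream URI, upstream name and the ^fed\. queue-name pattern are only placeholders:
rabbitmq-plugins enable rabbitmq_federation
rabbitmqctl set_parameter federation-upstream my-upstream '{"uri":"amqp://upstream-host"}'
rabbitmqctl set_policy --apply-to queues federate-queues '^fed\.' '{"federation-upstream-set":"all"}'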
As you pointed out, there's also Simon MacMullen's trick from the rabbitmq-users group.
# rabbitmqctl set_policy --apply-to queues --priority 100 my-queue '^my-queue$' '{"ha-mode":"nodes", "ha-params":["rabbit@master-node"]}'
# rabbitmqctl clear_policy my-queue
But it has the undesirable side effect of making mirrors lose synchronisation for a while. This might or might not be acceptable, depending on your requirements, but I think it's worth mentioning that it's possible.
A more advanced technique might come in 4.x, but that is not certain at all.
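If you do use that trick, you can at least watch for the mirrors catching up again and force a synchronisation; a small sketch with the classic mirrored-queue CLI (the queue name is a placeholder):
rabbitmqctl list_queues name slave_pids synchronised_slave_pids
rabbitmqctl sync_queue my-queue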

Related

Ensure new RabbitMQ quorum queue replicas are spread among the cluster's availability zones

I'm going to run a multi-node (3 zones, 2 nodes in each, expected to grow) RabbitMQ cluster with many dynamically created quorum queues. It will be unmanageable to tinker with the queue replicas manually.
I need to ensure that a new quorum queue (1 leader, 2 replicas) always spans all 3 AZs to be able to survive an AZ outage.
Is there a way to configure node affinity to achieve that goal?
There is one poor man's solution (actually, pretty expensive to do right) that comes to mind:
create queues with x-quorum-initial-group-size=1
have a reconciliation script that runs periodically and adds replica members on nodes in the right zones (a rough sketch follows below)
Of course, a built-in feature for configurable node affinity, which I might have missed somehow, would be the best option.
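For what it's worth, a rough sketch of that reconciliation step, assuming the rabbitmq-queues CLI and purely illustrative queue/node names:
# add one replica of a specific quorum queue on a node in the under-represented zone
rabbitmq-queues add_member --vhost "/" "my-quorum-queue" rabbit@node-in-zone-b
# or grow every matching quorum queue onto that node in one go
rabbitmq-queues grow rabbit@node-in-zone-b "all" --vhost-pattern ".*" --queue-pattern ".*"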

RabbitMQ Mirrored Queues on Multiple Clusters

Is it possible to use RabbitMQ HA with multiple (2) RabbitMQ clusters?
Here is my requirement:
We have 2 RabbitMQ clusters (each with 4 nodes). All the nodes in both clusters will use the same Erlang cookie, so that even though the 2 clusters are physically in separate locations, they will act as a single cluster with 8 nodes.
We are planning to use HAProxy to load balance both clusters (8 nodes). Both publishers and consumers will use this proxy to connect to the broker.
We would like to use mirrored queues for HA with ha-mode:exactly, ha-params:4, ha-sync-mode:automatic along with auto-heal for cluster_partition_handling.
Question:
In case of HA, is there a way we can specify using 2 nodes from the first cluster and 2 nodes from the second cluster? As I understand, this can be done via the policy ha-mode:nodes with node names, but that way it will always use the same nodes. Can this setup be dynamic?
As both clusters are very reliable, is it the right approach to use auto-heal for cluster_partition_handling (in case of split brain)?
As per this: "By default, queues within a RabbitMQ cluster are located on a single node (the node on which they were first declared). This is in contrast to exchanges and bindings, which can always be considered to be on all nodes." Does this mean exchanges are mirrored by default? So when a message arrives at an exchange and that node goes down, will the message be available on the other exchange on the other node?
The RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
So that, even though these 2 clusters are physically in separate locations, but will act as a single cluster with 8 nodes.
Please do not do this. RabbitMQ clusters require reliable network connections with low latency. If your cluster crosses a WAN or availability zone, your chance of having network partitions greatly increases. See this section of the docs for more information. You should use either the shovel or the federation feature.
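For instance, a dynamic shovel moving messages between the two clusters can be declared with a runtime parameter; the URIs and queue names below are placeholders, and the rabbitmq_shovel plugin must be enabled first:
rabbitmq-plugins enable rabbitmq_shovel
rabbitmqctl set_parameter shovel my-shovel '{"src-uri":"amqp://cluster-a-host","src-queue":"orders","dest-uri":"amqp://cluster-b-host","dest-queue":"orders"}'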
Does this mean exchanges are mirrored by default? So when a message arrives at an exchange and that node goes down, will the message be available on the other exchange on the other node?
Yes and yes.

Scaling of RabbitMQ

What scaling options can we use if RabbitMQ metrics reach a threshold? I have a VM on which RabbitMQ is running. If the queue length exceeds 90% of the total queue capacity, can we increase the instance count by 1, with a separate queue, so that those messages are processed on a priority basis?
In short, what scaling options do we have for RabbitMQ based on different parameters?
Take a look at the RabbitMQ Sharding Plugin.
From their README:
RabbitMQ Sharding Plugin
This plugin introduces the concept of sharded queues for RabbitMQ.
Sharding is performed by exchanges, that is, messages will be
partitioned across "shard" queues by one exchange that we should
define as sharded. The machinery used behind the scenes implies
defining an exchange that will partition, or shard messages across
queues. The partitioning will be done automatically for you, i.e: once
you define an exchange as sharded, then the supporting queues will be
automatically created on every cluster node and messages will be
sharded across them.
Auto-scaling
One interesting property of this plugin, is that if you add more nodes
to your RabbitMQ cluster, then the plugin will automatically create
more shards in the new node. Say you had a shard with 4 queues in node
a and node b just joined the cluster. The plugin will automatically
create 4 queues in node b and join them to the shard partition.
Already delivered messages will not be rebalanced, but newly arriving
messages will be partitioned to the new queues.
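A hedged sketch of that setup, following the plugin README loosely (the exchange name, shard count and routing key are only examples; the plugin must be enabled first):
rabbitmq-plugins enable rabbitmq_sharding
rabbitmqctl set_policy images-shard "^shard.images$" '{"shards-per-node": 2, "routing-key": "1234"}'
# then declare an exchange named shard.images with type x-modulus-hash; the plugin creates the shard queues on every node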

Is RabbitMQ Clustering including scalability too?

I want to build a RabbitMQ system which is able to scale out for the sake of performance.
I've gone through the official documentation on RabbitMQ clustering. However, clustering doesn't seem to support scalability, because we can only publish/consume through the master queue, even though the master queue is reachable from any node of the cluster. Other than the node on which a master queue resides, we can't process any publish/consume.
Why do we cluster then?
Why do we cluster then?
To ensure availability.
To enforce data replication.
To spread the load/data across queues on different nodes. Master queues can be stored on different nodes and replicated with a factor smaller than the number of cluster nodes.
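For example, a policy roughly like this keeps 2 mirrors of every queue in a 3-node cluster (the policy name and pattern are just examples):
rabbitmqctl set_policy --apply-to queues ha-two "^" '{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}'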
Other than the node on which a master queue resides, we can't process
any publish/consume.
A client can connect to any node of the cluster. That node will transfer 'the request' to the node hosting the master queue and vice versa. As a downside, it will generate an extra hop.
To answer the question in the title, Is RabbitMQ Clustering including scalability too? - yes, it does; this is achieved simply by adding nodes to or removing nodes from the cluster. Of course, you have to consider high availability as well - that is, queue and exchange mirroring, etc.
And just to make something clear regarding:
However, its clustering doesn't seem to support scalability. That's
because only through master queue we can publish/consume, even though
the master queue is reachable from any node of a cluster.
Publishing is done to an exchange; queues have nothing to do with publishing. A publishing client publishes only to an exchange and a routing key. It doesn't need any knowledge of the queue.
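As a small illustration, publishing through rabbitmqadmin (part of the management plugin) names only an exchange and a routing key, never a queue; the exchange and key here are placeholders:
rabbitmqadmin publish exchange=amq.direct routing_key=my.key payload="hello"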

How load balancer works in RabbitMQ

I am new to RabbitMQ, so please excuse me for trivial questions:
1) In case of clustering in RabbitMQ, if a node fails, the load shifts to another node (without stopping the other nodes). Similarly, we can also add fresh nodes to the existing cluster without stopping the existing nodes in the cluster. Is that correct?
2) Assume that we start with a single RabbitMQ node, and create 100 queues on it. Now producers start sending messages at a faster rate. To handle this load, we add more nodes and make a cluster. But the queues exist on the first node only. How is the load balanced among the nodes now? And if we need to add more queues, on which node should we add them? Or can we add them using a load balancer?
Thanks In Advance
1) In case of clustering in RabbitMQ, if a node fails, the load shifts to another node (without stopping the other nodes). Similarly, we can also add fresh nodes to the existing cluster without stopping the existing nodes in the cluster. Is that correct?
If a node on which the queue was created fails, RabbitMQ will elect a new master for that queue within the cluster, as long as mirroring is enabled for the queue. Clustering provides HA based on a policy that you can define.
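Such a policy might look roughly like this (the name and pattern are only examples); it mirrors queues to all nodes and keeps the mirrors synchronised automatically:
rabbitmqctl set_policy --apply-to queues ha-all "^" '{"ha-mode":"all","ha-sync-mode":"automatic"}'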
2) Assume that we start with a single RabbitMQ node, and create 100 queues on it. Now producers start sending messages at a faster rate. To handle this load, we add more nodes and make a cluster. But the queues exist on the first node only. How is the load balanced among the nodes now?
The load is not balanced. The distributed cluster provides HA and not load balancing. Your requests will be redirected to the node in the cluster on which the queue resides.
And if we need to add more queues, on which node should we add them? Or can we add them using a load balancer?
That depends on your use case. Some folks use a round robin and create queues on separate nodes.
In summary
For HA use mirroring in the cluster.
To balance load across nodes, use an LB to distribute across queues.
If you'd like to load balance the queue itself take a look at Federated Queues. They allow you to fetch messages on a downstream queue from an upstream queue.
Let me try to answer your questions; these are issues most devs may encounter.
Question 1) In case of clustering in RabbitMQ, if a node fails, the load shifts to another node (without stopping the other nodes). Similarly, we can also add fresh nodes to the existing cluster without stopping the existing nodes in the cluster. Is that correct?
Answer: absolutely correct (if RabbitMQ is running on a single host), but RabbitMQ's queues behave differently in a cluster. Queues only live on one node in the cluster by default. But Rabbit (since 2.6.0) gives us a built-in active-active redundancy option for queues: mirrored queues. Declaring a mirrored queue is just like declaring a normal queue; you pass an extra argument called x-ha-policy, which tells Rabbit that you want the queue to be mirrored across all nodes in the cluster. This means that if a new node is added to the cluster after the queue is declared, it will automatically begin hosting a slave copy of the queue.
Question 2) Assume that we start with a single RabbitMQ node, and create 100 queues on it. Now producers start sending messages at a faster rate. To handle this load, we add more nodes and make a cluster. But the queues exist on the first node only. How is the load balanced among the nodes now? And if we need to add more queues, on which node should we add them? Or can we add them using a load balancer?
This question has multiple sub-questions.
How is the load balanced among the nodes now?
When set to all, x-ha-policy tells Rabbit that you want the queue to be mirrored across all nodes in the cluster. This means that if a new node is added to the cluster after the queue is declared, it will automatically begin hosting a slave copy of the queue.
On which node should we add them?
See the answer above.
Can we add them using a load balancer?
No, but technically yes (you would have to call the RabbitMQ API from the LB, which is not a best-practice approach). A load balancer is used for a resilient messaging infrastructure: your cluster nodes are the servers behind the load balancer, and your producers and consumers are the customers.