Does RabbitMQ clustering include scalability too?

I want to build a RabbitMQ system which is able to scale out for the sake of performance.
I've gone through the official RabbitMQ Clustering documentation. However, clustering doesn't seem to support scalability, because we can only publish/consume through the master queue, even though the master queue is reachable from any node of the cluster. On any node other than the one where the master queue resides, we can't process any publish/consume.
Why do we cluster then?

Why do we cluster then?
To ensure availability.
To enforce data replication.
To spread the load/data across queues on different nodes. Master queues can be stored on different nodes and replicated with a factor smaller than the number of cluster nodes.
On any node other than the one where the master queue resides, we can't process any publish/consume.
Clients can connect to any node of the cluster. That node will forward the request to the master queue's node and relay the response back. The downside is that this generates an extra hop.

To answer the question in the title, Does RabbitMQ clustering include scalability too? - yes, it does. This is achieved simply by adding nodes to, or removing nodes from, the cluster. Of course you also have to consider high availability, that is, queue and exchange mirroring etc.
And just to make something clear regarding:
However, clustering doesn't seem to support scalability, because we can only publish/consume through the master queue, even though the master queue is reachable from any node of the cluster.
Publishing is done to an exchange; queues have nothing to do with publishing. A publishing client publishes only to an exchange and a routing key. It doesn't need any knowledge of the queues.
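To illustrate, a minimal publisher sketch with the Python pika client (the host, exchange, and routing key names are placeholders); note that no queue appears anywhere in the publisher's code:

import pika

# Connect to any cluster node; the broker routes the message internally.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit-node1"))
channel = connection.channel()

channel.basic_publish(
    exchange="orders",             # the broker routes from this exchange to queue(s)
    routing_key="orders.created",  # bindings decide which queues receive the message
    body=b"example payload",
)
connection.close()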

Related

Handling RabbitMQ node failures in a cluster in order to continue publishing and consuming

I would like to create a cluster for high availability and put a load balancer in front of it. In our configuration, we would like to create exchanges and queues manually, so that once exchanges and queues are created, no client should need to redeclare them. I am using a direct exchange with a routing key, so it's possible to route the messages into different queues on different nodes. However, I have some issues with clustering and queues.
As far as I can tell from the RabbitMQ documentation, a queue is specific to the node it was created on. Moreover, there can be only one queue with a given name in a cluster, and it must be alive at the time of publish/consume operations. If the node dies, the queue on that node is gone and its messages may not be recoverable (depending on the configuration, of course). So even if I route the same message to different queues on different nodes, I still have to figure out how to use them in order to continue consuming messages.
I wonder if it is possible to handle this failover scenario without using mirrored queues. Say I would like to switch to a new node in case of a failure and continue consuming from the same queue. The publisher only uses a routing key, and its messages can go into more than one queue, but the same is not possible for the consumers.
In short, what can I do to cope with failures in the environment described in the first paragraph? Is queue mirroring, with its performance penalty in the cluster, the best approach, or does a more practical solution exist?
Data replication (mirrored queues in RabbitMQ) is the standard approach to achieving high availability. I suggest using them. If you don't replicate your data, you will lose it when a node fails.
If you are worried about performance: RabbitMQ does not scale particularly well. The only ways I know to improve performance are to make your nodes bigger or to create a second cluster; adding nodes to a cluster does not really improve things. Also, if you are planning to use TLS, it will decrease throughput significantly as well. If you have a high-throughput requirement plus HA, I'd consider Apache Kafka.
If your use case allows you not to care about HA, then just re-declare queues/exchanges whenever your consumers/publishers connect to the broker, which is absolutely fine. Declaring a queue that already exists does no harm: the queue won't be purged, and the same goes for exchanges.
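A minimal pika sketch of that declare-on-connect pattern (the exchange, queue, and binding names are placeholders); re-running it against a broker where they already exist with the same arguments changes nothing:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit-node1"))
channel = connection.channel()

# Idempotent: declaring existing entities with identical arguments is a no-op,
# so every client can safely declare what it needs on connect.
channel.exchange_declare(exchange="orders", exchange_type="direct", durable=True)
channel.queue_declare(queue="orders.eu", durable=True)
channel.queue_bind(queue="orders.eu", exchange="orders", routing_key="eu")

connection.close()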
Also, check out the RabbitMQ sharding plugin; maybe that will do for your use case.

RabbitMQ clustering queue mirroring for high availability: get the master node's IP for a queue at time t

From my understanding, RabbitMQ clustering is for scalability, not availability, but using mirrored queues provides availability as well, in that if the master fails, an up-to-date slave can be promoted to master.
From the documentation:
Messages published to the queue are replicated to all slaves. Consumers are connected to the master regardless of which node they connect to, with slaves dropping messages that have been acknowledged at the master. Queue mirroring therefore enhances availability, but does not distribute load across nodes (all participating nodes each do all the work).
Therefore, load-balancing across the nodes for a given queue doesn't make sense as this will always add an extra trip from the node contacted to the master node for the queue (unless I'm misunderstanding something). Hence, we'd want to always be able to know which node is the master for a given queue.
I haven't really worked with RabbitMQ very much, so perhaps I'm just missing it in the documentation, but there seems to be no way to determine a mirrored queue's master's IP after a master failure causes a slave to be promoted to master. Every source I see merely mentions the ability to set the initial master node, which isn't very helpful for me. For any time t, how do I find the master node's IP for a given queue?
PS: It also seems bad to simply put the nodes behind a load balancer, since if there's a network partition (which can occur even with nodes on the same LAN), we could be hitting nodes that can't communicate with the master for the queue, or worse, we could be feeding a split brain.
You can create a smart client which keeps track of the queue mirroring topology. This is possible using the Management Plugin and its REST API.
E.g. for a queue, curl -i -u guest:guest http://[HOST]:[PORT]/api/queues/[VHOST]/[QUEUE] will return a payload like the following:
{
  "messages": 0,
  "slave_nodes": [
    "rabbit@node1",
    "rabbit@node0"
  ],
  "synchronised_slave_nodes": [
    "rabbit@node0",
    "rabbit@node1"
  ],
  "recoverable_slaves": [
    "rabbit@node0"
  ],
  "state": "running",
  "name": "myQueue",
=> "node": "rabbit@node2"
}
For myQueue, your client should favor connecting to node2 (the myQueue master node) to minimize hops.
I'm not sure if it is worth the cost: it will increase the number of connections and the client complexity. I would be happy to receive feedback if you implement something.
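For reference, the same lookup as a small Python requests sketch (host, port, credentials, vhost, and queue name are placeholders; the management plugin must be enabled):

import requests

HOST, PORT, VHOST, QUEUE = "rabbit-node0", 15672, "%2F", "myQueue"

resp = requests.get(
    f"http://{HOST}:{PORT}/api/queues/{VHOST}/{QUEUE}",
    auth=("guest", "guest"),
)
resp.raise_for_status()

# The "node" field of the payload names the node currently hosting the
# queue master, e.g. "rabbit@node2"; connect there to avoid the extra hop.
print(resp.json()["node"])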
You don't need the master node's IP; you just need the queues to be mirrored, so that all the messages in the queues are on all the nodes. The paragraph above the one you quoted contains this sentence:
Each mirrored queue consists of one master and one or more slaves,
with the oldest slave being promoted to the new master if the old
master disappears for any reason.
So the words master and slave relate to queues, not RabbitMQ nodes; I'm guessing that's where the confusion lies. When I read the question and then the docs again, it got me thinking for a while, but we can't say that a mirrored queue consists of master and slave RabbitMQ nodes ;)
As for load balancing of the cluster, you can make clients always connect to a RabbitMQ node which is alive either by using an actual load balancer, or by making the clients "smarter", i.e. they should reconnect to the IP of another node if the original node goes down. The first approach is recommended; just look for Connecting to Clusters from Clients here.
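A rough sketch of the "smarter client" option with the Python pika library (the node hostnames are placeholders); the load-balancer approach needs no client code at all:

import pika
import pika.exceptions

# Try each cluster node in turn until one accepts the connection.
NODES = ["rabbit-node0", "rabbit-node1", "rabbit-node2"]

def connect():
    for host in NODES:
        try:
            return pika.BlockingConnection(pika.ConnectionParameters(host=host))
        except pika.exceptions.AMQPConnectionError:
            continue  # this node is down or unreachable, try the next one
    raise RuntimeError("no cluster node reachable")

connection = connect()
channel = connection.channel()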

Is it necessary to use three nodes to build a RabbitMQ cluster?

I have to say the official website provides very little information to understand RabbitMQ clearly.
The official website suggests using three nodes to build a cluster. What is the reason for that? I suppose it's like ZooKeeper, which needs an odd number of nodes to form a quorum and elect a master.
Also, what is the advantage of using a non-HA cluster? Improved performance, or what? If the node on which a queue resides goes down, then the queue stops working. So, in all situations, is it necessary to configure the cluster with mirrored queues and auto-sync?
Three nodes is the minimum needed for reasonable HA.
Suppose you have a queue mirrored on two nodes: if one of them goes down, another node will be promoted as the new slave or master.
Please read here the section Automatically handling partitions and the section More about pause-minority mode:
It is therefore not a good idea to enable pause-minority mode on a cluster of two nodes since, in the event of any network partition or node failure, both nodes will pause.
RabbitMQ can handle the cluster in different ways, depending on where you deploy it: LAN, WAN, unstable LAN, etc. You can also use federation or the shovel plugin.
what is the advantage of using a non-HA cluster? Improved performance, or what?
I'd say yes, or simply that you have an environment where you don't need HA queues because you have only temporary queues.
is it necessary to configure the cluster with mirrored queues and auto-sync?
You can also opt for manual sync, since while a queue is synchronising it is blocked, and if you have lots of messages to sync, that can be a problem. For example, you can decide to sync the queues when you don't have traffic.
Here (section Unsynchronised Slaves) it is explained clearly.
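If it helps, here is a hedged sketch of that approach using the management plugin's HTTP API with the Python requests library (host, credentials, queue pattern, and queue name are placeholders): the policy mirrors matching queues with manual sync, and the sync is triggered explicitly later.

import requests

BASE = "http://rabbit-node0:15672/api"   # management plugin on its default port
AUTH = ("guest", "guest")

# Mirror queues matching the pattern, but do not sync new mirrors automatically.
requests.put(
    f"{BASE}/policies/%2F/ha-manual",
    auth=AUTH,
    json={
        "pattern": "^important\\.",
        "definition": {"ha-mode": "all", "ha-sync-mode": "manual"},
        "apply-to": "queues",
    },
).raise_for_status()

# Later, during a quiet period, explicitly trigger the (blocking) synchronisation.
requests.post(
    f"{BASE}/queues/%2F/important.orders/actions",
    auth=AUTH,
    json={"action": "sync"},
).raise_for_status()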
Your question is a bit general, and the answer depends on what you are looking for.

How a load balancer works with RabbitMQ

I am new to RabbitMQ, so please excuse the trivial questions:
1) In a RabbitMQ cluster, if a node fails, the load shifts to another node (without stopping the other nodes). Similarly, we can add fresh nodes to an existing cluster without stopping the existing nodes in the cluster. Is that correct?
2) Assume that we start with a single RabbitMQ node and create 100 queues on it. Now producers start sending messages at a faster rate. To handle this load, we add more nodes and make a cluster. But the queues exist on the first node only. How is the load balanced among nodes now? And if we need to add more queues, on which node should we add them? Or can we add them using the load balancer?
Thanks In Advance
1) In a RabbitMQ cluster, if a node fails, the load shifts to another node (without stopping the other nodes). Similarly, we can add fresh nodes to an existing cluster without stopping the existing nodes in the cluster. Is that correct?
If the node on which a queue was created fails, RabbitMQ will elect a new master for that queue in the cluster, as long as mirroring is enabled for the queue. Clustering provides HA based on a policy that you can define.
2) Assume that we start with a single RabbitMQ node and create 100 queues on it. Now producers start sending messages at a faster rate. To handle this load, we add more nodes and make a cluster. But the queues exist on the first node only. How is the load balanced among nodes now?
The load is not balanced. A distributed cluster provides HA, not load balancing. Your requests will be redirected to the node in the cluster on which the queue resides.
And if we need to add more queues, on which node should we add them? Or can we add them using the load balancer?
That depends on your use case. Some folks use a round robin and create queues on separate nodes.
In summary:
For HA, use mirroring in the cluster.
To balance the load across nodes, use an LB to distribute connections across queues.
If you'd like to load-balance the queue itself, take a look at Federated Queues. They allow a downstream queue to fetch messages from an upstream queue.
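For illustration only, a hedged sketch of wiring up federated queues through the management HTTP API with Python requests (assumes the federation plugin is enabled; hostnames, vhost, pattern, and names are placeholders):

import requests

BASE = "http://downstream-node:15672/api"
AUTH = ("guest", "guest")

# 1) Define an upstream on the downstream broker, pointing at the source broker.
requests.put(
    f"{BASE}/parameters/federation-upstream/%2F/my-upstream",
    auth=AUTH,
    json={"value": {"uri": "amqp://upstream-node"}},
).raise_for_status()

# 2) Apply a policy so that matching queues federate from that upstream.
requests.put(
    f"{BASE}/policies/%2F/federate-queues",
    auth=AUTH,
    json={
        "pattern": "^fed\\.",
        "definition": {"federation-upstream-set": "all"},
        "apply-to": "queues",
    },
).raise_for_status()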
Let me try to answer your questions; these are issues most developers may encounter.
Question 1) In a RabbitMQ cluster, if a node fails, the load shifts to another node (without stopping the other nodes). Similarly, we can add fresh nodes to an existing cluster without stopping the existing nodes in the cluster. Is that correct?
Answer: absolutely correct (if RabbitMQ is running on a single host), but RabbitMQ's queues behave differently in a cluster: by default, a queue lives on only one node in the cluster. However, Rabbit (2.6.0) gave us a built-in active-active redundancy option for queues: mirrored queues. Declaring a mirrored queue is just like declaring a normal queue; you pass an extra argument called x-ha-policy, which, when set to all, tells Rabbit that you want the queue to be mirrored across all nodes in the cluster. This means that if a new node is added to the cluster after the queue is declared, it'll automatically begin hosting a slave copy of the queue.
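For illustration, a minimal pika sketch of that declaration style (host and queue name are placeholders); note that since RabbitMQ 3.0 mirroring is configured with policies rather than the x-ha-policy queue argument described above:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit-node0"))
channel = connection.channel()

# Pre-3.0 style: ask the broker to mirror this queue on every node in the cluster.
channel.queue_declare(
    queue="critical-jobs",               # hypothetical queue name
    durable=True,
    arguments={"x-ha-policy": "all"},
)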
Question 2) Assume that we start with a single RabbitMQ node and create 100 queues on it. Now producers start sending messages at a faster rate. To handle this load, we add more nodes and make a cluster. But the queues exist on the first node only. How is the load balanced among nodes now? And if we need to add more queues, on which node should we add them? Or can we add them using the load balancer?
This question has multiple sub-questions.
How is the load balanced among nodes now?
When set to all, x-ha-policy tells Rabbit that you want the queue to be mirrored across all nodes in the cluster. This means that if a new node is added to the cluster after the queue is declared, it'll automatically begin hosting a slave copy of the queue.
On which node should we add them?
See the answer above.
Can we add them using the load balancer?
No, but also yes (you would have to call the RabbitMQ API from within the LB, which is not a best-practice approach). A load balancer is used for a resilient messaging infrastructure: your cluster nodes are the servers behind the load balancer, and your producers and consumers are the clients.

How distributed should queues be in a RabbitMQ cluster?

Assume you have a small RabbitMQ system of 3 nodes that is supposed to handle 100+ decently high-volume queues on the same exchange. Given that queues only exist on the node they are created on (we're not using replicated, high-availability queues), what's the best way to create the queues? Is there any benefit to distributing the queues among the cluster nodes, or is it better to keep them all on one node and have RabbitMQ do the routing?
It depends on your application, really.
RabbitMQ is smart about sending messages, so it'll only send a message to a node in the cluster if
a queue that holds that message resides on that node or
if a consumer has connected to that node and has requested the message.
In general, you should aim to declare queues on the nodes to which both the publishers and the consumers of those queues will connect. In other words, you should aim to connect publishers and consumers to the node that holds the queues they use. This assumes you're trying to conserve overall bandwidth.
If you're using clustering to improve throughput (and you probably are), and you don't care about internal bandwidth used, you should aim to connect your publishers/consumers to the nodes in a balanced way and not worry about the internal routing mechanisms.
One last thing to think about is memory and disk space. Queues store messages in main memory and fall back to disk if that's insufficient. So if you declare all your queues in one place, you'll end up with one node that's "over-worked" and two nodes with memory to spare.
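If you want a quick check of how the queues ended up spread across the nodes, here is a small sketch against the management API (placeholder host and credentials; /api/queues reports the hosting node for each queue):

import requests
from collections import Counter

resp = requests.get("http://rabbit-node0:15672/api/queues", auth=("guest", "guest"))
resp.raise_for_status()

# Count how many queues each node hosts.
per_node = Counter(q["node"] for q in resp.json())
for node, count in per_node.items():
    print(f"{node}: {count} queue(s)")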
As part of a move towards redundancy and failover in an application I'm working on, I've just finished setting up a RabbitMQ cluster behind a proxy, and have all of my publishers and consumers connect via the proxy, which round robins connections to the individual nodes as they come in from the clients. Prior to upgrading RabbitMQ to 2.7.1, this seemed to pretty evenly distribute queues to the separate nodes, though this would of course depend pretty heavily on how your proxy balances the requests and when your clients try to connect (and declare a queue)...
Having said all that, I just upgraded to RabbitMQ 2.7.1, which was pretty painless and gave us HA queues, which is a pretty big win for our apps. At any rate, if you're interested in the setup and think it would help with your queue problem, I'd be happy to share it.