How distributed should queues be in a RabbitMQ cluster?

Assume you have a small RabbitMQ cluster of 3 nodes that is supposed to handle 100+ decently high-volume queues bound to the same exchange. Given that queues only exist on the node they are created on (we're not using replicated, highly available queues), what's the best way to create the queues? Is there any benefit to having the queues distributed among the cluster nodes, or is it better to keep them all on one node and have RabbitMQ do the routing?

It depends on your application, really.
RabbitMQ is smart about sending messages: it will only send a message to a node in the cluster if a queue that holds that message resides on that node, or if a consumer has connected to that node and requested the message.
In general, you should aim to declare queues on the node to which both the publishers and the consumers for that queue will connect. In other words, connect publishers and consumers to the node that holds the queues they use. This assumes you're trying to conserve overall bandwidth.
If you're using clustering to improve throughput (and you probably are), and you don't care about internal bandwidth used, you should aim to connect your publishers/consumers to the nodes in a balanced way and not worry about the internal routing mechanisms.
One last thing to think about is memory and disk space. Queues store messages in main memory and fall back to disk if that's insufficient. So, if you declare all your queues in one place, you'll end up with one node that's overworked and two nodes with memory to spare.
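As a concrete illustration, here is a minimal sketch using the Python pika client, assuming a three-node cluster with hypothetical hostnames rabbit1/rabbit2/rabbit3. A classic (non-mirrored) queue lives on the node it was declared on, so declaring it over a connection to rabbit2 places it there, and publishing over that same connection avoids inter-node hops:

```python
import pika

# Hypothetical hostname of the node that should own the queue.
params = pika.ConnectionParameters(host="rabbit2")

conn = pika.BlockingConnection(params)
ch = conn.channel()

# The queue's home node is rabbit2, because that is the node this
# connection (and therefore the declaration) is attached to.
ch.queue_declare(queue="orders", durable=True)

# Publishing over the same connection keeps the message on rabbit2,
# so no internal routing between cluster nodes is needed.
ch.basic_publish(exchange="", routing_key="orders", body=b"hello")

conn.close()
```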

As part of a move towards redundancy and failover in an application I'm working on, I've just finished setting up a RabbitMQ cluster behind a proxy, and have all of my publishers and consumers connect via the proxy, which round-robins connections to the individual nodes as they come in from the clients. Prior to upgrading RabbitMQ to 2.7.1, this seemed to distribute queues fairly evenly across the separate nodes, though this of course depends heavily on how your proxy balances the requests and when your clients try to connect (and declare a queue).
Having said all that, I just upgraded to RabbitMQ 2.7.1, which was pretty painless and gave us HA queues, which is a pretty big win for our apps. At any rate, if you're interested in the setup and think it would be of benefit to your queue problem, I'd be happy to share it.

Related

ActiveMQ datastore for cluster setup

We have been using ActiveMQ version 5.16.0 brokers as single instances in production. Now we are planning to use a cluster of AMQ brokers for HA and load distribution, with consistency in message data. Currently we are using only one queue.
HA can be achieved using failover, but do we need to use the same datastore, or can it be separated? If I use different instances for the AMQ brokers, then how do I set up a common datastore?
Please guide me on how to set up the datastore for HA and load distribution.
Multiple ActiveMQ servers clustered together can provide HA in a few ways:
1. Scale message flow by using compute resources across multiple broker nodes
2. Maintain message flow during a planned or unplanned outage of a single broker node
3. Share a data store in the event of ActiveMQ process failure
A network of brokers solves #1 and #2. A standard 3-node cluster will give you excellent performance and the ability to scale the number of producers and consumers, along with splitting the full flow across 3 nodes to provide increased capacity.
Solving for #3 is complicated, in all messaging products. Brokers are always working to be completely empty, so clustering the data store of a single broker becomes an anti-pattern of sorts. Many times, relying on RAID disk with a single broker node will provide higher reliability than adding NFSv4, GFSv2, or JDBC and using a shared store.
That being said, if you must use a shared store, follow best practices and use GFSv2 or NFSv4. JDBC is much slower and requires significant DB maintenance to keep running efficiently.
Note: Kevin Boone's note about CIFS/SMB is incorrect and CIFS/SMB should not be used. Otherwise, his responses are solid.
You can configure ActiveMQ so that instances share a message store, or so they have separate message stores. If they share a message store, then (essentially) the brokers will automatically form a master-slave configuration, such that only one broker (at a time) will accept connections from clients, and only one broker will update the store. Clients need to identify both brokers in their connection URIs, and will connect to whichever broker happens to be master.
With a shared message store like this, locks in the message store coordinate the master-slave assignment, which makes the choice of message store critical. Stores can be shared filesystems or shared databases. Only a few shared filesystem implementations work properly: anything based on NFS 4.x should work. CIFS/SMB stores can work, but there's so much variation between providers that it's hard to be sure. NFS v3 doesn't work, however well implemented, because the locking semantics are inappropriate. In any case, the store needs to be robust, or replicated, or both, because the whole broker cluster depends on it. No store, no brokers.
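To make the "clients identify both brokers" point concrete, here is a rough sketch using the Python stomp.py client over STOMP (rather than the JMS failover URI), with placeholder hostnames and credentials, and assuming the brokers' STOMP connectors are enabled. The client is given both brokers of the master-slave pair and ends up on whichever one is currently accepting connections:

```python
import stomp

# Both brokers of the shared-store master-slave pair (placeholder hosts).
# stomp.py tries the listed hosts in turn, so the client lands on the
# current master, since the slave does not accept connections.
conn = stomp.Connection(host_and_ports=[("broker1", 61613), ("broker2", 61613)])
conn.connect("admin", "admin", wait=True)

conn.send(destination="/queue/orders", body="hello")
conn.disconnect()
```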
In my experience, it's easier to get good throughput from a shared file store than from a shared database, although, of course, there are many factors to consider. Poor network connectivity will make it hard to get good throughput with any kind of shared store (or any kind of cluster, for that matter).
When using individual message stores, it's typical to put the brokers into some kind of mesh, with 'network connectors' to pass messages from one broker to another. Both brokers will accept connections from clients (there is no master), and the network connections will deal with the situation where messages are sent to one broker, but need to be consumed from another.
Clients don't necessarily need to specify all brokers in their connection URIs, but generally will, in case one of the brokers is down.
A mesh is generally easier to set up, and (broadly speaking) can handle more client load than a master-slave with a shared filestore. However, (a) losing a broker amounts to losing any messages that were associated with it (until the broker can be restored) and (b) the mesh interferes with messaging patterns like message grouping and exclusive consumers.
There's really no hard-and-fast rule to determine which configuration to use. Many installers who already have some sort of shared store infrastructure (a decent relational database, or a clustered NFS, for example) will tend to want to use it. The rise in cloud deployments has had the effect that mesh operation with no shared store has become (I think) a lot more popular, because it's so symmetric.
There's more, a lot more, that could be said here. As a broad question, I suspect this is a bit out of scope for SO. You'll probably get more traction if you break your question up into smaller, more focused parts.

RabbitMQ how to optimally publish/consume messages in a cluster?

I am just curious what the optimal way to publish and consume messages is, ignoring durability, persistence, and similar things, and looking at it purely from the network perspective in a cluster.
If we publish a message over a connection opened to server 1 (s1), but the queue's master node is on server 2 (s2), RabbitMQ has to move that message from s1 to s2, right?
It would be optimal to always consume from queues that are "local" to the server we are connected to, meaning that all the queues we consume from over our connection are located on that server, wouldn't it?
Is this overcomplicating things? Or would it be best to always publish to and consume from servers where the queue is located? I am dealing with somewhere around 3B messages daily, so I am trying to reduce latency and load as much as possible.
Yes, always publishing to and consuming from the queue master node is optimal. Your understanding of what happens when you connect to a non-master node is correct. Of course, this means you will have to make your applications aware of this information (from the HTTP API).
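For example, here is a hedged sketch of reading that information from the management HTTP API, assuming the management plugin is enabled; the hostname, credentials, and queue name are placeholders:

```python
import requests

# The default vhost "/" is URL-encoded as %2F in the management API path.
resp = requests.get(
    "http://rabbit1:15672/api/queues/%2F/orders",
    auth=("guest", "guest"),
)
resp.raise_for_status()

# The "node" field names the queue's master node, e.g. "rabbit@rabbit2";
# connect the publishers and consumers for this queue to that node.
print(resp.json()["node"])
```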
If you're not worried about message loss, there's little need for a cluster in this scenario.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
You are ignoring important factors in the guidance, such as persistence and message size. Depending on message size, persistence, and workload, you have three potential resource bottlenecks: 1) CPU, 2) network, 3) storage. In addition, there is also the possibility of a contention bottleneck, depending on the number of clients on each queue.

In RabbitMQ, how to make queues in different clusters highly available without clustering?

In RabbitMQ, if two clusters are hosted in geographically different locations, then we can't use clustering. How can we then make them highly available, i.e. if one site's whole cluster goes down, the messages should be mirrored to the other site, and the other site should be able to cater to those messages? Note: the sites are connected by WAN.
I can't lose any messages on either site. Publishing messages to the right site can be taken care of, but if messages are sitting in a queue (work queue) or are being processed by a consumer and suddenly the site goes down, including the broker and consumer, how can those messages be catered to by the other site? In a cluster, if one node dies, the other one has all the messages mirrored and knows which were acknowledged, but how can this be achieved over a WAN, since clustering across a WAN is not practical?
I think the question illustrates a conceptual problem with the design. To summarize,
There are two sites, connected via WAN
One site is the primary, while the other is the active standby
There is a desire for complete replication of system state (total consistency) between sites A and B, including the status of messages in the queue and messages being processed.
Essentially, you want 100% consistency, availability, and partition tolerance. Such a design is not possible according to the CAP theorem. What RabbitMQ provides is either consistency and availability (with low partition tolerance) via clustering, or availability and partition tolerance via federation or shovel. RabbitMQ does not deal very well with the case of needing both consistency and partition tolerance, since message brokers really handle highly transient traffic.
Instead, what is needed is to fully scope the problem to something that can be solved using the available technologies. It sounds to me like the correct approach (since it's over a WAN) is to sacrifice availability for consistency and partition tolerance, and have your application handle the failover case. You may be able to configure RabbitMQ sufficiently in this regard - see https://www.rabbitmq.com/partitions.html.
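If you do go the federation route mentioned above, a rough sketch of configuring it through the management HTTP API on the downstream site might look like the following. It assumes the federation and management plugins are enabled, and the hostnames, credentials, upstream name, and policy pattern are placeholders:

```python
import requests

downstream = "http://site-b-rabbit:15672"   # placeholder downstream broker
auth = ("guest", "guest")                   # placeholder credentials

# 1. Define a federation upstream pointing at site A over the WAN.
requests.put(
    f"{downstream}/api/parameters/federation-upstream/%2F/site-a",
    auth=auth,
    json={"value": {"uri": "amqp://site-a-rabbit"}},
).raise_for_status()

# 2. Apply a policy so that matching exchanges are federated from site A.
requests.put(
    f"{downstream}/api/policies/%2F/federate-orders",
    auth=auth,
    json={
        "pattern": "^orders\\.",
        "definition": {"federation-upstream-set": "all"},
        "apply-to": "exchanges",
    },
).raise_for_status()
```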

Handling RabbitMQ node failures in a cluster in order to continue publishing and consuming

I would like to create a cluster for high availability and put a load balancer in front of it. In our configuration, we would like to create exchanges and queues manually, so that once exchanges and queues are created, no client should make a call to redeclare them. I am using a direct exchange with a routing key, so it's possible to route the messages into different queues on different nodes. However, I have some issues with clustering and queues.
As far as I read in the RabbitMQ documentation, a queue is specific to the node it was created on. Moreover, we can have only one queue with a given name in a cluster, and it needs to be alive at the time of publish/consume operations. If the node dies, then the queue on that node will be gone and messages may not be recoverable (depending on the configuration, of course). So, even if I route the same message to different queues on different nodes, I still have to figure out how to use them in order to continue consuming messages.
I wonder if it is possible to handle this failover scenario without using mirrored queues. Say I would like to switch to a new node in case of a failure and continue to consume from the same queue. Because the publisher is just using a routing key and these messages can go into more than one queue, the same is not possible for the consumers.
In short, what can I do to cope with failures in the environment described in the first paragraph? Is queue mirroring the best approach, despite the performance penalty in the cluster, or does a more practical solution exist?
Data replication (mirrored queues in RabbitMQ) is a standard approach to achieve high availability. I suggest using it. If you don't replicate your data, you will lose it.
If you are worried about performance: RabbitMQ does not scale well.
The only way I know to improve performance is to make your nodes bigger or create a second cluster. Adding nodes to a cluster does not really improve things. Also, if you are planning to use TLS, it will decrease throughput significantly as well. If you have a high throughput requirement plus HA, I'd consider Apache Kafka.
If your use case allows you not to care about HA, then just re-declare queues/exchanges whenever your consumers/publishers connect to the broker, which is absolutely fine. When you declare a queue that already exists, nothing bad happens: the queue won't be purged, etc. The same goes for exchanges. A minimal sketch of this is shown below.
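This is a rough illustration of the re-declare-on-connect approach using the Python pika client; the hostname and entity names are placeholders. Declaring an existing queue or exchange with identical arguments is a no-op, so it is safe to run on every connect:

```python
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit1"))
ch = conn.channel()

# Safe to run on every connect: if these already exist with the same
# arguments, the broker simply confirms them and nothing is purged.
ch.exchange_declare(exchange="events", exchange_type="direct", durable=True)
ch.queue_declare(queue="events.audit", durable=True)
ch.queue_bind(queue="events.audit", exchange="events", routing_key="audit")

conn.close()
```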
Also, check out the RabbitMQ sharding plugin; maybe that will do for your use case.

Is it necessary to use three nodes to build RabbitMQ cluster?

I have to say the official website provides very little information to understand RabbitMQ clearly.
The official website suggests using three nodes to build a cluster. What is the reason for that? I suppose it's like ZooKeeper, which needs an odd number of nodes to form a quorum and elect a master.
Also, what is the advantage of using a non-HA cluster? Improve the performance or what? If the node on which a queue resides goes down, then the queue stops working. So, in all situations, is it necessary to configure the cluster with mirrored queues and auto-sync?
Three nodes is the minimum for reasonable HA.
Suppose you have a queue mirrored across two nodes: if one of them goes down, another one will be promoted as the new slave or master.
Please read the sections Automatically handling partitions and More about pause-minority mode at https://www.rabbitmq.com/partitions.html:
"It is therefore not a good idea to enable pause-minority mode on a cluster of two nodes since in the event of any network partition or node failure, both nodes will pause."
RabbitMQ can handle the cluster in different ways, depending on where you deploy it: LAN, WAN, unstable LAN, etc. You can also use federation or the shovel plugin.
"What is the advantage of using a non-HA cluster? Improve the performance or what?"
I'd say yes, or you may simply have an environment where you don't need HA queues because you only use temporary queues.
"Is it necessary to configure the cluster with mirrored queues and auto-sync?"
You can also opt for manual sync, since a queue is blocked while it is being synchronised, and if you have lots of messages to sync, that can be a problem. For example, you can decide to sync the queues when you don't have traffic.
Here (section Unsynchronised Slaves) it is explained clearly.
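As an illustration, here is a hedged sketch of opting into manual sync and then triggering a sync off-peak through the management HTTP API; the hostname, credentials, queue name, and policy pattern are placeholders:

```python
import requests

base = "http://rabbit1:15672"   # placeholder management endpoint
auth = ("guest", "guest")       # placeholder credentials

# Mirror matching queues, but do NOT sync new mirrors automatically.
requests.put(
    f"{base}/api/policies/%2F/ha-manual-sync",
    auth=auth,
    json={
        "pattern": "^orders\\.",
        "definition": {"ha-mode": "all", "ha-sync-mode": "manual"},
        "apply-to": "queues",
    },
).raise_for_status()

# Later, during a quiet period, explicitly synchronise a queue
# (the equivalent of `rabbitmqctl sync_queue orders.incoming`).
requests.post(
    f"{base}/api/queues/%2F/orders.incoming/actions",
    auth=auth,
    json={"action": "sync"},
).raise_for_status()
```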
Your question is a bit general, and it depends on what you are looking for.