About RabbitMQ two concept is unknow to me, cluster and node ? what is different between them?
A RabbitMQ node is the basic "message broker" service (process running on a server) which provides core RabbitMQ features such as exchanges, virtual hosts, queues, etc. You need at least one RabbitMQ node to be up-and-running, to use RabbitMQ.
A RabbitMQ cluster is simply a grouping of one or more RabbitMQ nodes. From the documentation, a cluster is:
"a logical grouping of one or several nodes, each sharing users, virtual hosts, queues, exchanges, bindings, runtime parameters and other distributed state.".
Why can it be useful to place nodes in a cluster? Again from the documentation:
Clustering nodes can help improve availability, data safety of queue contents and sustain more concurrent client connections.
So, a cluster of nodes gives you more flexibility (than a single node), about how you design and provide your overall RabbitMQ services.
The terms "node" and "cluster" are not specific to RabbitMQ - they are fairly generic terms, used much more widely than just for RabbitMQ.
Related
Overview
A RabbitMQ broker is a logical grouping of one or several Erlang
nodes, each running the RabbitMQ application and sharing users,
virtual hosts, queues, exchanges, bindings, and runtime parameters.
Sometimes we refer to the collection of nodes as a cluster.
Why would you do this? I understand to increase durability of messages (if a node goes down, other queues still get the messages). But what about performance? How does cluster improve performance. Won't all consumers/producers connect to the master node's queue anyway? If so, aren't we still getting traffic on a single node regardless? Do we put a load balancer so traffic is directed at different nodes each time?
How does RabbitMQ cluster increase performance?
Well, right after that paragraph, the documentation states the following:
What is Replicated?
All data/state required for the operation of a RabbitMQ broker is
replicated across all nodes. An exception to this are message queues,
which by default reside on one node, though they are visible and
reachable from all nodes. To replicate queues across nodes in a
cluster, see the documentation on high availability (note that you
will need a working cluster first).
So, you would cluster to provide higher capacity in your RabbitMQ broker than a single node can provide alone. Note that clustering by itself is not a high-availability strategy.
Your assertion that message durability is increased is false, as message queues continue to reside on one broker (unless mirroring is used).
By default, contents of a queue within a RabbitMQ cluster are located on a single node (the node on which the queue was declared) [1]
Without mirroring, when that node goes down, messages on it will be lost. The cluster will put the queue onto a different node. RabbitMQ does not handle network partitions well, so this can be a bit of a problem.
"Aren't we still getting traffic on a single node regardless?" - if you only have one queue, then yes. However, a bigger question is "why would you run a message broker with only one queue?" Similarly, if you only create queues on one node, then you will still have one point of failure in the system.
When creating a RabbitMQ cluster, non-mirrored queues from other nodes are "remotely accessible" from other nodes.
To a naive developer they will seemingly be able to publish to and consume from any node in an cluster and it will give them a false sense of high-availability.
If the node hosting the queue dies, the consumer will no longer be able to reach the queue from the other node.
Is there a way to disable this behaviour so that it's obvious that one has to either have a mirrored queue or needs to create a distinct queues on each server, consume from both and then handle duplicates.
Thanks
It is not possible disable this behaviour, this is one of the main reasons why you create a cluster.
BTW, you can create a federated cluster by using federation plug-in.
So you can:
have isolated nodes
share only the exchanges or/and queues you prefer.
this is a use case question on RabbitMQ clustering. In the past, I have clustered RabbitMQ to make queues highly available (HA). I understand you can cluster RabbitMQ nodes without making HA queues but why would you do that? From a message consumer's POV, clustering in itself buys you nothing unless the queues are made HA (or so I feel). What kind of use-cases can you cite for make a non-HA RabbitMQ cluster?
By having more servers you can get more throughput, be able to accept more connected clients and so on. The non HA cluster is able to see resources in all nodes in the clusters, despite of where the resources were declared.
I am trying to set up cluster of brokers, which should have same feature like rabbitMQ cluster, but over WAN (my machines are in different locations), so rabbitMQ cluster does not work.
I am looking to alternatives, rabbitMQ federation is just backup the messages in the downstream, can not make sure they have exactly the same messages available at any time (downstream still keeps the old messages already consumed in the upstream)
how about ActiveMQ Master/Slave, I have found :
http://activemq.apache.org/how-do-distributed-queues-work.html
"queues and topics are all replicated between each broker in the cluster (so often to a master and maybe a single slave). So each broker in the cluster has exactly the same messages available at any time so if a master fails, clients failover to a slave and you don't loose a message."
My concern is that if it can automatically update to make sure Master/Slave always have the same messages, which means the consumed messages in Master will also disappear in Slaves.
Thanks :)
ActiveMQ has various clustering features.
First there is High Availability - "Master/Slave". The idea is that several physical servers act as a single logical ActiveMQ broker. If one goes down, another takes it place without losing data. You can do that by sharing the message store (shared file system or shared JDBC), or you could setup a replicated cluster, which replicates read/writes to the master down to all slaves (you need three+ servers). ActiveMQ is using LevelDB and Apache Zookeeper to achieve this.
The other format of cluster available in ActiveMQ is to be able to distribute load and separate security over several logical brokers. Brokers are then connected in a network of brokers. Messages are by default passed around to the broker with available consumers for that message. However, there is a rich toolbox of features in ActiveMQ to tweak a network of brokers to do things as always send a copy of a message to specific broker etc. It takes some messing with the more advanced features though (static network connectors and queue mirroring, maybe more).
Maybe there is a better way to solve your requirements, which is not really specified in the question?
I am wanting to setup RabbitMQ as a two (or more) node cluster with HA.
Use case: a client producer app (C#.NET) knows that the cluster has two nodes and publishes to the cluster. Various consumer apps (also C#.NET) connect to the cluster and get all messages generated by the producer. So long as at least one node is up and running the producer and consumers will all continue to work without error. Supposing nodes A and B are running and B dies for a while, then gets restarted, then a while later A dies, the clients all continue to function without receiving an error since at all times at least one node is up.
Can it be made to work like this out of the box?
Are there any other MQs that would be more appropriate (commercial ok) for a Windows/.NET application environment?
RabbitMQ v2.6.0 now supports high-availability queues using active/active clustering. Microsoft and a number of other companies have collaborated on Apache QPid which has C# bindings and which also supports active/active HA clustering.
Can it be made to work like this out of the box?
No. When a node goes down, all of its connections are closed. Since AMQP connections are stateful, there's no way around this. What you could achieve is 1) broker goes down, 2) all clients disconnect, 3) clients connect to other node (masquerading as original) and are none the wiser.
On a side note, rabbit does not support active-active HA clustering at the moment. It does support active-passive clustering and a form of logical clustering (which might be what you're looking for).