Is it possible to connect to other RabbitMQ nodes when one node is down? - rabbitmq

The environment consists of two separate servers, each running the RabbitMQ service. They are correctly clustered and the queues are mirrored correctly.
Node A is the master
Node B is the slave
My question concerns specifically the case where Node A goes down but Service A is still up, and Node B and Service B are still up. At this point, Node B is promoted to master. When an application connects to Node B it connects fine, of course.
rabbitmqctl cluster_status on Node B shows the cluster is up with two nodes and Node B running. rabbitmqctl cluster_status on Node A shows the node is down. This is expected behavior.
Is it possible for an application to connect to Node A and still publish/consume queue items as normal?
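The `cluster_status` checks described above can be scripted so an application (or a health check) decides which node to use. A minimal sketch, assuming the classic Erlang-term output format; `NODE` and the embedded sample status text are illustrative stand-ins for a real `rabbitmqctl cluster_status` call:

```shell
#!/bin/sh
# Sketch: decide whether a given node appears in running_nodes of
# `rabbitmqctl cluster_status` output (classic Erlang-term format).
# NODE and the sample STATUS text are illustrative assumptions; in a
# real deployment use: STATUS=$(rabbitmqctl cluster_status)
NODE="rabbit@nodeA"
STATUS=$(cat <<'EOF'
Cluster status of node 'rabbit@nodeB' ...
[{nodes,[{disc,['rabbit@nodeA','rabbit@nodeB']}]},
 {running_nodes,['rabbit@nodeB']},
 {cluster_name,<<"rabbit@nodeA">>},
 {partitions,[]}]
EOF
)

# Pull out the running_nodes list, then look for the node of interest.
running=$(printf '%s' "$STATUS" | tr -d '\n ' \
  | sed 's/.*{running_nodes,\[\([^]]*\)\].*/\1/')
msg="$NODE is down"
if printf '%s' "$running" | grep -q "'$NODE'"; then
  msg="$NODE is running"
fi
echo "$msg"
```

With the sample status above (only nodeB running), the script reports that `rabbit@nodeA` is down, which matches the expected behavior the question describes.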

Related

Setup docker-swarm to use specific nodes as backup?

Is there a way to set up docker-swarm to use only specific nodes (workers or managers) as fail-over nodes? For instance, if one specific worker dies (or a service on it dies), only then should another node be used; until that happens, it should be as if that node weren't in the swarm.
No, that is not possible out of the box. However, docker-swarm does provide the pieces to build it yourself. Say you have 3 worker nodes on which you want to run service A: 2 of the 3 nodes will always be available and the third will be the backup.
Add a label to the 3 nodes, e.g. runs=serviceA, and constrain the service to that label. This makes sure your service only runs on those 3 nodes.
Make the 3rd node unable to schedule tasks by running docker node update --availability drain <NODE-ID>
Whenever you need the node back, run docker node update --availability active <NODE-ID>
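The steps above can be sketched end to end as a command sequence; this is an operational fragment run against a live swarm, and the node IDs, the `runs=serviceA` label, the service name, and `myimage:latest` are all placeholder assumptions:

```shell
# Label the three candidate nodes so serviceA is constrained to them
# (node1/node2/node3 are placeholder node IDs).
docker node update --label-add runs=serviceA node1
docker node update --label-add runs=serviceA node2
docker node update --label-add runs=serviceA node3

# Create the service, constrained to the labelled nodes only.
docker service create --name serviceA \
  --constraint 'node.labels.runs == serviceA' \
  --replicas 2 myimage:latest

# Keep node3 as a standby: drain it so no tasks are scheduled there.
docker node update --availability drain node3

# When node1 or node2 fails, bring the standby back into rotation.
docker node update --availability active node3
```

The drain/active toggle is the manual fail-over switch; swarm itself will reschedule the tasks onto node3 once it becomes active.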

Rabbitmq primary node rejoining a cluster

I have a cluster of 3 rabbitmq nodes spread across 3 different servers. The second and third nodes join the first node and form the cluster. While testing failover I am finding that once the primary node is killed, I am not able to make it rejoin the cluster. The documentation does not state that I have to use join_cluster or any other command after startup. I tried join_cluster, but it is rejected since the cluster name is the same as the node's host name. Is there a way to make this work?
cluster_status displays the following (not from the primary node):
Cluster status of node 'rabbit@<secondary>' ...
[{nodes,[{disc,['rabbit@<primary>','rabbit@<secondary>',
                'rabbit@<tertiary>']}]},
 {running_nodes,['rabbit@<secondary>','rabbit@<tertiary>']},
 {cluster_name,<<"rabbit@<primary>">>},
 {partitions,[]}]
On one of the nodes that is still in the cluster, use the command
rabbitmqctl forget_cluster_node rabbit@rabbitmq1
to make the current cluster forget the old primary.
Now you should be able to rejoin the cluster from the old primary (rabbitmq1):
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rabbitmq2
rabbitmqctl start_app
See the RabbitMQ clustering guide for reference.
A quote from it:
Nodes that have been joined to a cluster can be stopped at any time.
It is also ok for them to crash. In both cases the rest of the cluster
continues operating unaffected, and the nodes automatically "catch up"
with the other cluster nodes when they start up again.
So you just need to start the node that you killed/stopped. It doesn't make a difference whether it was the "primary" or not; if the primary was killed, some other node becomes the primary.
I've just tested this (with docker, of course) and it works as expected.

HA RabbitMQ without setting a mirror policy

I set up a lab for RabbitMQ HA using a cluster and mirrored queues.
I am using CentOS 7 and rabbitmq-server 3.3.5 with three servers (ha1, ha2, ha3).
I have joined ha1 and ha2 to ha3, but have not set a policy for queue mirroring. When I create a queue named "hello" on ha1 and then check ha2 and ha3 using rabbitmqctl list_queues, the hello queue exists on every node in the cluster.
My question: since I did not set a mirroring policy on the cluster, why does the queue appear on every node automatically?
Please advise whether I made a mistake, or whether merely joining nodes to a cluster mirrors queues on all nodes. Thanks
In RabbitMQ, by default, a queue is stored on only one node. When you create a cluster, the queue becomes visible and reachable from every node.
But that doesn't mean the queue is mirrored: if its home node goes down, the queue is marked as down and you can't access it.
Suppose you create a queue on one node: the queue will keep working only as long as that node is up; if the node goes down, the queue becomes unavailable.
You should always apply a mirroring policy, otherwise you could lose messages.
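Following that advice, applying a mirroring policy and then checking that mirrors actually exist might look like the fragment below. This is an operational sketch against a live cluster; the `ha-all` policy name is an assumption, and the `^` pattern mirrors every queue:

```shell
# Mirror every queue ("^" matches all names) to all nodes and
# synchronise mirrors automatically.
rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all","ha-sync-mode":"automatic"}'

# Verify: for each queue, slave_pids should list the mirror
# processes on the other nodes (empty means not mirrored).
rabbitmqctl list_queues name policy pid slave_pids
```

Without such a policy, `list_queues` run on ha2 or ha3 merely shows that the queue exists somewhere in the cluster; the `slave_pids` column is what tells you it is mirrored.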

How can I determine which nodes in my rabbitmq cluster are HA?

I have a clustered HA rabbitmq setup. I am using the "exactly" policy similar to:
rabbitmqctl set_policy ha-two "^two\." '{"ha-mode":"exactly","ha-params":10,"ha-sync-mode":"automatic"}'
I have 30 machines running, of which 10 are HA nodes with replicated queues. When my broker goes down (randomly assigned to be the first HA node), I need my celery workers to point to a new HA node (one of the 9 left). I have a script that automates this. The problem is that I do not know how to distinguish a regular cluster node from an HA node. When I issue the command:
rabbitmqctl cluster_status
The categories I get are "running nodes", "disc", and "ram", but there is no way here to tell whether a node is HA.
Any ideas?
In a cluster every node shares definitions with the others, so you don't have to distinguish nodes in your application in order to access all entities.
In your case, when one of the HA nodes goes down (dropping their number to 9), the HA queues will be replicated to the first available node (disc or ram doesn't matter).
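Since `cluster_status` can't answer this, one workaround sketch is to derive the set of nodes currently hosting queue masters or mirrors from per-queue pids. The sample text below is a made-up stand-in for real `rabbitmqctl list_queues name pid slave_pids` output (queue and host names are assumptions):

```shell
#!/bin/sh
# Illustrative stand-in for: rabbitmqctl list_queues name pid slave_pids
QUEUES=$(cat <<'EOF'
two.orders   <rabbit@host1.3.456.0>   [<rabbit@host2.3.457.0>, <rabbit@host4.3.458.0>]
two.audit    <rabbit@host2.3.500.0>   [<rabbit@host3.3.501.0>]
EOF
)

# Collect every node name that appears as a queue master or mirror:
# these are the nodes currently acting as HA nodes for your queues.
ha_nodes=$(printf '%s\n' "$QUEUES" \
  | grep -o 'rabbit@[A-Za-z0-9]*' \
  | sort -u)
echo "$ha_nodes"
```

A failover script could pick any entry from that list instead of guessing which cluster members hold replicas.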

RabbitMQ queue being removed when node stopped

I have created two RabbitMQ nodes (say A and B) and clustered them. I then did the following in the management UI:
(note that node A is initially the master)
On node A I created a queue (durable=true, auto-delete=false) and can see it shared on node B
Stopped node A, I can still see it on B (great)
Started node A again
Stopped node B, the queue has been removed from node A
This seems strange as node B was not even involved in the creation of the queue.
I then tried the same from node B:
On node B I created a queue (durable=true, auto-delete=false) and can see it shared on node A
Stopped node A, I can still see it on B (great)
Started node A again
Stopped node B, the queue has been removed from node A
The situation I am looking for is that no matter which node is stopped that the queue is still available on the other node.
I just noticed that the policies I set up had been removed from each node... no idea why. In case somebody else hits the same issue, you can create policies using (e.g.)
rabbitmqctl set_policy ha-all "^com\.mydomain\." '{"ha-mode":"all","ha-sync-mode":"automatic"}'
It's immediately noticeable in the RabbitMQ Web UI as you can see the policy on the queue definition (in this case "ha-all").
See https://www.rabbitmq.com/ha.html for creating policies, and the Policy Management section of http://www.rabbitmq.com/man/rabbitmqctl.1.man.html for administration.
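If policies keep disappearing (e.g. because a node was reset), one hedge, assuming the rabbitmq_management plugin and its rabbitmqadmin tool are installed, is to keep the broker definitions in a file you can re-import:

```shell
# Export all definitions (queues, exchanges, bindings, policies, users)
# to a JSON file; requires the rabbitmq_management plugin.
rabbitmqadmin export definitions.json

# Later, re-import them, e.g. after a reset wiped the policies.
rabbitmqadmin import definitions.json
```

This is an operational fragment against a live broker, but it makes the HA policy itself durable configuration rather than something recreated by hand.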