I am trying to set up clustering on RabbitMQ. I have added two nodes but am unable to add a third. I have clustered rabbit@node1 and rabbit@node2, and now I am trying to cluster rabbit@node3 with rabbit@node1.
Here is what I am trying to do:
rabbitmqctl join_cluster rabbit@node1
Clustering node rabbit@node3 with rabbit@node1 ...
Error: mnesia_not_running
How can I add a third node to the cluster? Or is there any solution for Error: mnesia_not_running?
When joining a cluster, the RabbitMQ application must be running on the target node and stopped on the source (current) node. The application is stopped and started with rabbitmqctl stop_app and rabbitmqctl start_app.
Perhaps you stopped the application on rabbit@node1 when you joined it to the cluster. In that case, run rabbitmqctl start_app on rabbit@node1, or rabbitmqctl -n rabbit@node1 start_app, so that other nodes can join its cluster. Alternatively, you can join via rabbit@node2 and start the app later.
To have a working cluster, start the application on all nodes after joining.
This happens when the target node's app is stopped. When joining a node to a RabbitMQ cluster, only the source node (the node you are trying to add) should be stopped.
On the master node:
rabbitmqctl start_app
on the current node:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@node1
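The answers above can be combined into one small helper, run on the node that is being added. This is only a sketch: the function name join_cluster and the RABBITMQCTL override are illustrative, the target is written in the usual rabbit@host form, and it assumes the application is already running on the target node.

```shell
# RABBITMQCTL can be overridden, e.g. RABBITMQCTL="echo rabbitmqctl" for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Hypothetical helper: join the local node to the cluster of the given target node.
join_cluster() {
    target="$1"   # e.g. rabbit@node1; its app must be started
    # Stop the local app, join, then start the local app again.
    # Each step runs only if the previous one succeeded.
    $RABBITMQCTL stop_app &&
    $RABBITMQCTL join_cluster "$target" &&
    $RABBITMQCTL start_app
}
```

On rabbit@node3 this would be called as join_cluster rabbit@node1. With the dry-run override it only prints the three commands in order.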
We're using redis-cluster extensively in our production environment. We currently have a 30-node cluster (15 masters, 15 slaves).
We're trying to grow the cluster, so we've created new servers and joined them to the cluster. So far all is well.
Next, we're trying to reshard the slots to the new masters. We wrote a script that does this using the redis-trib reshard command.
However, the migration fails midway (but not very far from the start) with this error:
[ERR] Calling MIGRATE: ERR Target instance replied with error: BUSYKEY Target key name already exists.
This happens sporadically: at times it manages to move some slots before failing, at times it fails on the first slot.
Each such failure requires a manual fix, which makes the reshard operation very hard to manage.
We have not found any concrete example of this, nor any idea of how to prevent it other than a downtime migration, which we are trying to avoid.
Versions:
redis server 4.0.2
redis trib 3.3.3 (downgraded from 4.0.2 following this issue : redis cluster reshard [ERR] Calling MIGRATE: ERR Syntax error)
Our next step is to upgrade to latest redis (4.0.11), even though we didn't find any indication in the release notes of this issue.
Hoping to hear that we're doing something wrong and how to fix it. Or is redis-cluster not built for live resharding?
Thanks
I faced a similar problem while working on redis-cluster support for our own project. I found a problem with the redis-trib reshard command: it works fine only if no keys are stored in the slots being migrated from one master to another.
But Redis 5 (still in development, not stable yet) has its own redis-cli, which I think has no problem with the reshard command. It only happens with versions lower than 5.
If you look at the official Redis docs on cluster reconfiguration and cluster resharding, you'll find what they do internally to reshard.
So I solved the problem by performing those steps in a bash script instead of running the redis-trib reshard command.
Suppose you want to reshard some slots from one master node to another master node. We'll call the node that currently owns the hash slot the source node, and the node we want to migrate it to the destination node.
For each slot do the following steps:
Remember that the order of these steps is important, according to the official Redis docs.
1. Send CLUSTER SETSLOT <slot> IMPORTING <source-node-id> to the destination node to set the slot to the importing state.
2. Send CLUSTER SETSLOT <slot> MIGRATING <destination-node-id> to the source node to set the slot to the migrating state.
3. Get keys from the source node with the CLUSTER GETKEYSINSLOT command and move them to the destination node using the following MIGRATE command:
MIGRATE target_host target_port key target_database_id timeout
In Redis Cluster there is no need to specify a database other than 0, but MIGRATE is a general command that can be used for other tasks not involving Redis Cluster.
4. When the migration is finally finished, use CLUSTER SETSLOT <slot> NODE <destination-node-id> on both the source node and the destination node to set the slot back to its normal state. The same command is usually sent to all other nodes to avoid waiting for the natural propagation of the new configuration across the cluster.
A simple example bash script to do this is also given here:
source-ip: 172.17.0.5. source-id: 1f70a5107e0042a7d33a9efaf88dbdfecd78076a
destination-ip: 172.17.0.4. destination-id: 7e428bae84697a3882ecad19bd0d13ac7ee97d02
another master ip: 172.17.0.7
# Move slots 0..5460 from the source (172.17.0.5) to the destination (172.17.0.4).
for i in $(seq 0 5460); do
    # Destination first (importing), then source (migrating).
    redis-cli -c -h 172.17.0.4 cluster setslot "${i}" importing 1f70a5107e0042a7d33a9efaf88dbdfecd78076a
    redis-cli -c -h 172.17.0.5 cluster setslot "${i}" migrating 7e428bae84697a3882ecad19bd0d13ac7ee97d02
    # Move the keys one at a time until the slot is empty.
    while true; do
        key=$(redis-cli -c -h 172.17.0.5 cluster getkeysinslot "${i}" 1)
        if [ -z "$key" ]; then
            echo "no keys left in slot ${i}"
            break
        fi
        redis-cli -h 172.17.0.5 migrate 172.17.0.4 6379 "${key}" 0 5000
    done
    # Assign the slot to the destination on every master.
    redis-cli -c -h 172.17.0.5 cluster setslot "${i}" node 7e428bae84697a3882ecad19bd0d13ac7ee97d02
    redis-cli -c -h 172.17.0.4 cluster setslot "${i}" node 7e428bae84697a3882ecad19bd0d13ac7ee97d02
    redis-cli -c -h 172.17.0.7 cluster setslot "${i}" node 7e428bae84697a3882ecad19bd0d13ac7ee97d02
done
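One weakness of a loop like this is that a failed MIGRATE (for example the BUSYKEY error from the question) leaves the key in the slot, so the inner loop can spin forever. Below is a minimal sketch of a more defensive per-slot function. It assumes that MIGRATE replies with OK or NOKEY on success and that key names contain no whitespace or glob characters. The name migrate_slot and the REDIS_CLI override are illustrative and not part of redis-cli.

```shell
# REDIS_CLI can be overridden for testing; by default the real redis-cli is used.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# Hypothetical helper: migrate one slot and stop at the first failing step.
# Arguments: slot, source host, source node id, destination host, destination node id.
migrate_slot() {
    slot="$1"; src_host="$2"; src_id="$3"; dst_host="$4"; dst_id="$5"

    # Step 1 and 2: destination goes to importing, then source goes to migrating.
    $REDIS_CLI -h "$dst_host" cluster setslot "$slot" importing "$src_id" || return 1
    $REDIS_CLI -h "$src_host" cluster setslot "$slot" migrating "$dst_id" || return 1

    # Step 3: move keys in batches of up to 10 until the slot is empty.
    while true; do
        keys=$($REDIS_CLI -h "$src_host" cluster getkeysinslot "$slot" 10) || return 1
        if [ -z "$keys" ]; then
            break
        fi
        # Assumption: keys are split on whitespace, one key per word.
        for key in $keys; do
            reply=$($REDIS_CLI -h "$src_host" migrate "$dst_host" 6379 "$key" 0 5000)
            case "$reply" in
                OK|NOKEY) ;;
                *) echo "slot $slot: MIGRATE of $key failed: $reply" >&2; return 1 ;;
            esac
        done
    done

    # Step 4: assign the slot to the destination on both nodes.
    $REDIS_CLI -h "$src_host" cluster setslot "$slot" node "$dst_id" || return 1
    $REDIS_CLI -h "$dst_host" cluster setslot "$slot" node "$dst_id" || return 1
}
```

A caller would loop over the slot range and stop on the first non-zero return, which leaves at most one slot in a half-migrated state to inspect by hand.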
This is similar to "RabbitMQ has Nodedown Error", but this one is for Ubuntu 16.04, and the working solution, posted below, differs from the Windows one as well.
Something has gone wrong with my rabbitmq server. Trying to start the application gives an error:
$ sudo rabbitmqctl start_app
Starting node rabbit@daniel ...
Error: unable to connect to node rabbit@daniel: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit@daniel]
rabbit@daniel:
* connected to epmd (port 4369) on daniel
* epmd reports: node 'rabbit' not running at all
no other nodes on daniel
* suggestion: start the node
current node details:
- node name: 'rabbitmq-cli-6647@daniel'
- home dir: /var/lib/rabbitmq
- cookie hash: T1R4ztWXXH1w2IQe+fui9g==
Currently the only way I know of solving this is uninstalling/reinstalling rabbitmq. But, I'm hoping a more sensible solution is possible...
This is the important part of that message:
node 'rabbit' not running at all
You need to start RabbitMQ with systemctl start rabbitmq-server. You should also check the logs to see why it wasn't running in the first place.
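To script the check described above, a small filter can look for that line in the diagnostics before starting the service. This is a sketch only: the function name node_not_running is made up for illustration, and the pattern assumes the wording shown in the output above.

```shell
# Hypothetical helper: reads rabbitmqctl output on stdin and succeeds
# if epmd reported that the node is not running at all.
node_not_running() {
    grep -q "epmd reports: node '.*' not running at all"
}

# Example use (commented out so that sourcing this file has no side effects):
# if sudo rabbitmqctl status 2>&1 | node_not_running; then
#     sudo systemctl start rabbitmq-server
#     sudo journalctl -u rabbitmq-server   # then look for why it stopped
# fi
```

The journalctl line assumes a systemd-based install such as the Ubuntu 16.04 package. The broker's own log files are another place to look.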
Try to run it with sudo rabbitmq-server
I keep getting this error every time I try to do something with RabbitMQ:
attempted to contact: [fdbvhost@FORTE]
fdbvhost@FORTE:
* connected to epmd (port 4369) on FORTE
* epmd reports: node 'fdbvhost' not running at all
no other nodes on FORTE
* suggestion: start the node
current node details:
- node name: 'rabbitmq-cli-54@FORTE'
- home dir: C:\Users\Jesus
- cookie hash: iuRlQy0F81aBpoY9aQqAzw==
This is the output I get when I run rabbitmqctl -n fdbvhost status or /rabbitmqctl -n fdbvhost list_vhosts.
I've tried rabbitmqctl -n fdbvhost start which gives me the following output:
Error: could not recognise command
Usage:
rabbitmqctl [-n <node>] [-t <timeout>] [-q] <command> [<command options>]
...
So this doesn't start it. I cannot find anything about starting a node in the documentation. How do I actually start my node/vhost?
Try running the following command from the sbin directory of the RabbitMQ installation:
rabbitmq-server start -detached
This should start the broker node if it was stopped for some reason.
Check if you have RabbitMQ installed as a service in the /etc/init.d/ folder
sudo su # might be needed
cd /etc/init.d/
ls . | grep rabbit
The output should be rabbitmq-server
If that's the case, then, try restarting your service with:
sudo service rabbitmq-server restart
For mac users
To Start
brew services start rabbitmq
To Restart
brew services restart rabbitmq
To Stop
brew services stop rabbitmq
To Know the status of the server
brew services info rabbitmq
I installed RabbitMQ server on OS X and started it from the command line. Now it is not obvious how I should stop it. After I ran:
sudo rabbitmq-server -detached
I get:
Activating RabbitMQ plugins ...
0 plugins activated:
That was it. How should I properly shut it down? The documentation mentions using rabbitmqctl(1), but it's not clear to me what that means. Thanks.
Edit: As per comment below, this is what I get for running sudo rabbitmqctl stop:
(project_env)mlstr-1:Package mlstr$ sudo rabbitmqctl stop
Password:
Stopping and halting node rabbit@h002 ...
Error: unable to connect to node rabbit@h002: nodedown
DIAGNOSTICS
===========
nodes in question: [rabbit@h002]
hosts, their running nodes and ports:
- h002: [{rabbit,62428},{rabbitmqctl7069,64735}]
current node details:
- node name: rabbitmqctl7069@h002
- home dir: /opt/local/var/lib/rabbitmq
- cookie hash: q7VU0JjCd0VG7jOEF9Hf/g==
Why is there still a 'current node'? I have not run any client program but only the RabbitMQ server, does that mean a server is still running?
It turns out that it is related to permissions. Somehow my rabbitmq server was started as user 'rabbitmq' (which is strange), so I had to run:
sudo -u rabbitmq rabbitmqctl stop
In my dev environment where I keep it running all the time, I use:
launchctl unload ~/Library/LaunchAgents/homebrew.mxcl.rabbitmq.plist
and to start it
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.rabbitmq.plist
Even easier....
brew services stop rabbitmq
brew services start rabbitmq
Use rabbitmqctl stop to stop any node. If you need to specify the node giving you trouble, add the -n rabbit@[hostname] option.
You can also use the shortcut RabbitMQ Service - stop if you don't like the commands
stop
sudo systemctl stop rabbitmq-server
start
sudo systemctl start rabbitmq-server
For Windows, use PowerShell as Admin, then run
.\rabbitmq-service.bat stop
stop Stop the service. The service must be running for this command to have any effect.
https://www.rabbitmq.com/man/rabbitmq-service.8.html
For OP's answer above,
It turns out that it is related to permissions.
I have no knowledge on this.
For mac users
To Stop
brew services stop rabbitmq
To Start
brew services start rabbitmq
To Restart
brew services restart rabbitmq
To Know the status of the server
brew services info rabbitmq
Rabbitmq server does not start, saying it's already running:
$: rabbitmq-server
Activating RabbitMQ plugins ...
0 plugins activated:
node with name "rabbit" already running on "android-d1af002161676bee"
diagnostics:
- nodes and their ports on android-d1af002161676bee: [{rabbit,52176},
{rabbitmqprelaunch2254,
59205}]
- current node: 'rabbitmqprelaunch2254@android-d1af002161676bee'
- current node home dir: /Users/Jordan
- current node cookie hash: ZSx3slRJURGK/nHXDTBRqQ==
But, rabbitmqctl seems to think otherwise:
rabbitmqctl -n rabbit status
Status of node 'rabbit@android-d1af002161676bee' ...
Error: unable to connect to node 'rabbit@android-d1af002161676bee': nodedown
diagnostics:
- nodes and their ports on android-d1af002161676bee: [{rabbit,52176},
{rabbitmqctl2462,59256}]
- current node: 'rabbitmqctl2462@android-d1af002161676bee'
- current node home dir: /Users/Jordan
- current node cookie hash: ZSx3slRJURGK/nHXDTBRqQ==
Any takers?
The rabbitmq server was running somewhere but it just couldn't be connected to.
One of the following will mention something about rabbits:
$: ps aux | grep epmd
$: ps aux | grep erl
Kill the process with kill -9 {pid of rabbitmq process}
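To pull the PID out of the ps listing without picking up the search command itself, a small awk filter can help. This is a sketch: the function name rabbit_pids is illustrative, and it assumes the standard ps aux layout where the PID is the second column.

```shell
# Hypothetical helper: reads `ps aux` output on stdin and prints the PIDs
# of lines that mention rabbit, skipping the awk process itself.
rabbit_pids() {
    awk '/rabbit/ && !/awk/ { print $2 }'
}

# Example use (commented out so that sourcing this file kills nothing):
# for pid in $(ps aux | rabbit_pids); do
#     kill "$pid"        # plain kill (SIGTERM) first
# done
# kill -9 only for processes that are still alive afterwards
```

Trying a plain kill before kill -9 gives the Erlang VM a chance to shut down cleanly.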
I was having the same problem, then I realized I was not issuing the right command.
./rabbitmqctl stop
This works every time, although it does take down the Erlang runtime too. Also mind where your config file is.
I used rabbitmqctl stop and then restarted using rabbitmq-server as root.
This problem can have two causes:
1. Rabbit is already running on the server. If that is the case, use the answer you found and kill the currently running process (find it with ps aux | grep rabbit | grep -v grep).
2. You have changed the IP address of your machine but have not updated the /etc/hosts file to reflect the new IP address.
The first cause is more common, but the second is harder to find, especially if you have rabbit running on the other machine. If rabbit is installed on the other machine, your node will look at the old IP address, see another machine already running rabbitmq, and give you the same error. This has caused me grief in the past.
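A quick way to check for a stale /etc/hosts entry is to print the address a hostname maps to and compare it with the machine's current IP. A minimal sketch follows. The function name hosts_ip_for is made up for illustration, and it only reads a hosts-format file. It does not consult DNS.

```shell
# Hypothetical helper: print the IP address that a hosts file maps to a name.
# Usage: hosts_ip_for <hostname> [hosts-file]   (defaults to /etc/hosts)
hosts_ip_for() {
    name="$1"; file="${2:-/etc/hosts}"
    # Strip comments, then look for the name among the aliases (fields 2..NF).
    awk -v n="$name" '{
        sub(/#.*/, "")
        for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit }
    }' "$file"
}
```

For example, hosts_ip_for "$(hostname -s)" prints the address recorded for the local short hostname. If it differs from the address the machine actually has now, the entry is stale.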
I was having this same error on Win 7, but the solutions above did not work for me. What did solve it was removing and reinstalling the service, using a console with admin rights:
rabbitmq-service remove
rabbitmq-service install
I hope this might help someone else too
$CD RabbitMQ Server\rabbitmq_server-3.7.8\sbin
rabbitmq-service remove
rabbitmq-service install
Go to: Windows Services
Find: RabbitMQ and start it
After this, enable the plugin:
rabbitmq-plugins enable rabbitmq_management
In my case under Ubuntu 11.10 it helped to
#rabbitmqctl cluster MASTER SLAVE
#rabbitmqctl start_app
Before that, I always got this error message.
Using an admin console on Win 2012 R2 with RabbitMQ 3.5.5, I got it to work by running the remove and install commands, then rabbitmq-server restart,
then Ctrl-C to terminate the job. After that I was able to use the Windows service console to start the RabbitMQ service.
In my case (Windows):
1. I just stopped the service.
2. Then I started the service.