Mnesia can't connect to another node - rabbitmq

I am setting up a RabbitMQ cluster and ran into an issue during one step in the process. It's straight out of the RabbitMQ clustering guide.
root@celery:~# rabbitmqctl status
Status of node celery@celery ...
[{pid,20410},
{running_applications,[{rabbit,"RabbitMQ","2.5.1"},
{os_mon,"CPO CXC 138 46","2.2.4"},
{sasl,"SASL CXC 138 11","2.1.8"},
{mnesia,"MNESIA CXC 138 12","4.4.12"},
{stdlib,"ERTS CXC 138 10","1.16.4"},
{kernel,"ERTS CXC 138 10","2.13.4"}]},
{os,{unix,linux}},
{erlang_version,"Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:30] [hipe] [kernel-poll:true]\n"},
{memory,[{total,25296704},
{processes,9680280},
{processes_used,9662720},
{system,15616424},
{atom,1099393},
{atom_used,1082732},
{binary,89768},
{code,11606637},
{ets,726848}]}]
...done.
root@celery:~# rabbitmqctl cluster_status
Cluster status of node celery@celery ...
[{nodes,[{disc,[celery@celery]}]},{running_nodes,[celery@celery]}]
...done.
root@celery:~# rabbitmqctl stop_app
Stopping node celery@celery ...
...done.
root@celery:~# rabbitmqctl reset
Resetting node celery@celery ...
...done.
root@celery:~# rabbitmqctl cluster worker1@worker1
Clustering node celery@celery with [worker1@worker1] ...
Error: {failed_to_cluster_with,[worker1@worker1],
"Mnesia could not connect to some nodes."}
What are the possible reasons one node wouldn't be able to connect to another?
Here's the guide I'm following: http://www.rabbitmq.com/clustering.html

I jumped into the #rabbitmq channel on freenode. Here's the discussion that followed:
14:29 shakakai: hey all, i'm having a little issue with clustering rabbitmq http://stackoverflow.com/questions/6948624/mnesia-cant-connect-to-another-node
14:30 shakakai: has anyone run into that problem before?
14:30 antares_: shakakai: make sure that epmd is running on every node
14:30 antares_: shakakai: and that port it uses (4369) is open in your firewall
14:31 |Blaze|: shakakai: is your dns correct? Can you ping worker1 from celery and celery from worker1
14:31 shakakai: |Blaze|: hmm...i'll check
14:32 shakakai: |Blaze|: this is where I'm a little confused, the rabbitmq nodename is worker1@worker1 but the fqdn to ping the box is "ping worker1.mydomain.com"
14:33 |Blaze|: can you "ping worker1"
14:34 shakakai: |Blaze|: no
14:34 |Blaze|: k, you'll need to fix that
14:37 shakakai: |Blaze|: gotcha, so I setup a hosts file and i should be good
14:37 |Blaze|: yup
14:37 |Blaze|: in both directions
TL;DR
Make sure you can ping the rabbit nodename from each of the boxes you are clustering. If you can't, set up a hosts file for each rabbit nodename.
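For example, a minimal /etc/hosts setup might look like the following (the IP addresses here are placeholders; use the real private addresses of your servers):
# /etc/hosts on celery
10.0.0.2 worker1
# /etc/hosts on worker1
10.0.0.1 celery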

I installed RabbitMQ with Docker and ran into similar problems along the way.
The main cause was an incorrect /var/lib/rabbitmq/mnesia/rabbit/cluster_nodes.config configuration file, which prevented the node from connecting.
Mnesia is a distributed, soft real-time database management system written in the Erlang programming language.
There are several ways to fix this problem:
Fix the configuration file using the correct cluster node name. From the log we can see that our node name is rabbit@cb43449d5d72:
// log info
...
rabbitmq | Starting broker...2019-11-27 16:18:22.621 [info] <0.304.0>
rabbitmq | node : rabbit@cb43449d5d72
...
// This is the wrong configuration file:
$ cat ./mnesia/rabbit/cluster_nodes.config
{[rabbit@cb43449d5d72,rabbit@dc3288264c34],[rabbit@dc3288264c34]}.
// Update it with correctly config node name, and restart RabbitMQ server:
$ cat ./mnesia/rabbit/cluster_nodes.config
{[rabbit@cb43449d5d72],[rabbit@cb43449d5d72]}.
The simplest way is to remove the mnesia directory and configure the correct node name, for example rabbit@my-rabbit with a matching /etc/hosts entry of 127.0.0.1 my-rabbit. After doing that, you should see the following configuration details:
$ find . -name cluster_nodes.config
./mnesia/rabbit/cluster_nodes.config
./mnesia/rabbit@my-rabbit/cluster_nodes.config
$ cat ./mnesia/rabbit@my-rabbit/cluster_nodes.config
{['rabbit@my-rabbit'],['rabbit@my-rabbit']}.

There are several things to check before you can get the cluster to work well:
0) Ensure you are running the exact same RabbitMQ version on each node
1) Set up networking so that you can ping each server from the other
2) Cookies - you must have the exact same Erlang cookie in the .erlang.cookie file on each server
One useful trick is to run this command from one node to see whether you can reach another one from RabbitMQ:
rabbitmqctl eval 'net_adm:ping(rabbit@othernode).'
This should return pang if the node is unreachable, or pong if it is reachable.
Be careful not to forget the dot at the end of the eval expression.
I got it working fine after several hours of unsuccessful trials.
3) Bear in mind that there may be an issue when restarting a cluster node if it was not the last one to be stopped - it won't start before the last node that was stopped has been restarted.
When all the above (0 to 2) are correct, 3 may well be the root cause of your problem...
Hope this helps,
cheers,
jb

One thing I've read is that the Erlang cookie needs to be the same on all cluster nodes so that they can communicate. I believe it lives in /var/lib/rabbitmq/.erlang.cookie.
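As a rough sketch of syncing it (assuming a Debian/Ubuntu-style install where RabbitMQ runs as the rabbitmq user, and with "COOKIEVALUE" standing in for the cookie copied from the first node):
sudo service rabbitmq-server stop
# write the same cookie value as on the first node (COOKIEVALUE is a placeholder)
sudo sh -c 'echo -n "COOKIEVALUE" > /var/lib/rabbitmq/.erlang.cookie'
sudo chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
sudo chmod 400 /var/lib/rabbitmq/.erlang.cookie
sudo service rabbitmq-server start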

Related

RabbitMQ cluster on a single machine

I want to create a three node RabbitMQ cluster on a single RHEL8 machine for testing purposes. I tried the instructions given in the RabbitMQ official guide and also tried to follow this guide.
The first node works fine and it's running. However, the second node cannot be started and throws up an error.
I used the commands below, as mentioned in the guide.
RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
RABBITMQ_NODE_PORT=5673 RABBITMQ_NODENAME=hare rabbitmq-server -detached
rabbitmqctl -n hare stop_app
This command throws up below error.
DIAGNOSTICS
attempted to contact: [hare@localhost]
hare@localhost:
connected to epmd (port 4369) on localhost
epmd reports: node 'hare' not running at all
other nodes on localhost: [rabbit]
On further inspection of the logs, it seems that this node tries to use the same ports as the first node (e.g. MQTT port 1883).
I think I might have to use the other option of declaring /etc/rabbitmq/rabbitmq.conf. Mainly because it seems to give more options to change ports etc.
A sample config file resembling the one needed in my case or a link to a proper guide is highly appreciated.
You didn't specify, but you must have the MQTT plugin enabled for there to be a conflict on that port, correct?
The easiest work-around would be to have two configuration files specifying different ports for MQTT, AMQP and anything else. Then, use the RABBITMQ_CONFIG_FILE environment variable to point to the appropriate file:
RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit0 \
RABBITMQ_CONFIG_FILE=/path/to/rabbitmq-0.conf rabbitmq-server -detached
RABBITMQ_NODE_PORT=5673 RABBITMQ_NODENAME=rabbit1 \
RABBITMQ_CONFIG_FILE=/path/to/rabbitmq-1.conf rabbitmq-server -detached
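As a rough sketch of what the second node's file might contain (assuming the MQTT and management plugins are enabled; the file path and port numbers are just examples, and the AMQP port is already handled by RABBITMQ_NODE_PORT above):
# /path/to/rabbitmq-1.conf (hypothetical example)
mqtt.listeners.tcp.default = 1884
management.tcp.port = 15673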
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

How can I extend a Redis database with the redisgraph.so module?

Unable to load the redisgraph module redisgraph.so into the Redis database.
I successfully compiled redisgraph.so from sources.
redisgraph.so execution rights are set for everyone.
I tried:
$ redis-cli
> shutdown ((stop redis-server))
$ redis-server --loadmodule pathto/redisgraph.so
((System replies:))
# oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
# Redis version=4.0.9, bits=64, commit=00000000, modified=0, pid=2407, just started
# Configuration loaded
* Increased maximum number of open files to 10032 (it was originally set to 1024).
# Creating Server TCP listening socket *:6379: bind: Address already in use
$ redis-cli
> module list
(empty list or set)
> module load pathto/redisgraph.so
(error) ERR Error loading the extension. Please check the server logs.
((log file says: *no permission*))
The Redis database works fine as a key-value database.
But I fail to extend it with graph functionality.
So far I am unable to issue commands like "GRAPH.QUERY" (Redis replies: "unknown command").
I have no idea why redis-server seems to ignore the module load command or why redis-cli complains about permissions.
The error indicates that you already have a running process bound to the same port (probably another redis-server).
Also, you'd be better off using redisgraph with the latest Redis version (i.e. v5).
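As a sketch (the path is just an example), once the old redis-server process is actually stopped you can load the module either on the command line or permanently via redis.conf:
# start the server with the module (example path)
redis-server --loadmodule /path/to/redisgraph.so
# or add this line to redis.conf so it loads on every start
loadmodule /path/to/redisgraph.so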
It's better to have Redis managed by systemd, and you could configure it as follows:
Update the supervised directive in /etc/redis/redis.conf to use systemd by setting supervised systemd.
Create a Redis systemd unit file at /etc/systemd/system/redis.service and set the unit, service and install directives:
[Unit]
Description=Redis In-Memory Data Store
After=network.target
[Service]
User=redis
Group=redis
ExecStart=/usr/local/bin/redis-server /etc/redis/redis.conf
ExecStop=/usr/local/bin/redis-cli shutdown
Restart=always
[Install]
WantedBy=multi-user.target
Then start redis
sudo systemctl start redis
sudo systemctl status redis
If you would like Redis to start automatically when your server boots, enable the systemd service:
sudo systemctl enable redis

Could not connect to Redis at 127.0.0.1:6379: Connection refused with homebrew

I used Homebrew to install Redis, but when I try to ping Redis it shows this error:
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Note:
I tried turning off the firewall and editing the conf file, but I still cannot ping.
I am using macOS Sierra and homebrew version 1.1.11
After installing Redis, type this in the terminal:
redis-server
and the Redis server will start.
I found this question while trying to figure out why I could not connect to redis after starting it via brew services start redis.
tl;dr
Depending on how fresh your machine or install is you're likely missing a config file or a directory for the redis defaults.
You need a config file at /usr/local/etc/redis.conf. Without this file redis-server will not start. You can copy over the default config file and modify it from there with
cp /usr/local/etc/redis.conf.default /usr/local/etc/redis.conf
You need /usr/local/var/db/redis/ to exist. You can do this easily with
mkdir -p /usr/local/var/db/redis
Finally just restart redis with brew services restart redis.
How do you find this out!?
I wasted a lot of time trying to figure out if redis wasn't using the defaults through homebrew and what port it was on. Services was misleading because even though redis-server had not actually started, brew services list would still show redis as "started." The best approach is to use brew services --verbose start redis which will show you that the log file is at /usr/local/var/log/redis.log. Looking in there I found the smoking gun(s)
Fatal error, can't open config file '/usr/local/etc/redis.conf'
or
Can't chdir to '/usr/local/var/db/redis/': No such file or directory
Thankfully the log made the solution above obvious.
Can't I just run redis-server?
You sure can. It'll just take up a terminal or interrupt your terminal occasionally if you run redis-server &. Also it will put dump.rdb in whatever directory you run it in (pwd). I got annoyed having to remove the file or ignore it in git so I figured I'd let brew do the work with services.
If after installing you need Redis running all the time, just type this in the terminal:
redis-server &
Running redis using upstart on Ubuntu
I've been trying to understand how to setup systems from the ground up on Ubuntu. I just installed redis onto the box and here's how I did it and some things to look out for.
To install:
sudo apt-get install redis-server
That will create a redis user and install the init.d script for it. Since upstart is now the replacement for using init.d, I figure I should convert it to run using upstart.
To disable the default init.d script for redis:
sudo update-rc.d redis-server disable
Then create /etc/init/redis-server.conf with the following script:
description "redis server"
start on runlevel [23]
stop on shutdown
exec sudo -u redis /usr/bin/redis-server /etc/redis/redis.conf
respawn
This is the script upstart uses to know which command to run to start the process. The last line also tells upstart to keep trying to respawn the process if it dies.
One thing I had to change in /etc/redis/redis.conf is daemonize yes to daemonize no. If you don't change it, redis-server will fork and daemonize itself, and the parent process will go away. When this happens, upstart thinks the process has died/stopped and you won't have control over the process from within upstart.
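As a quick sketch of that change (assuming the default config path), the following would flip the setting in place:
sudo sed -i 's/^daemonize yes/daemonize no/' /etc/redis/redis.conf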
Now you can use the following commands to control your redis-server:
sudo start redis-server
sudo restart redis-server
sudo stop redis-server
Hope this was helpful!
redis-server --daemonize yes
I solved this issue by running this command.
This worked for me:
sudo service redis-server start
Date: Dec 2021
There are a couple of reasons for this error. I read an article that fixed the issue for me, so I will just summarize what to check, one by one.
1. Check: Redis server not started
redis-server
Also to run Redis in the background, the following command could be used.
redis-server --daemonize yes
2. Check: Firewall Restriction
sudo ufw status (shows whether the firewall is active or inactive)
sudo ufw enable (enabling the firewall for the first time might block ssh, so allow port 22 to keep ssh access)
sudo ufw allow 22
sudo ufw allow 6379
3. Check: Resource usage
ps -aux | grep redis
4. Config setup restriction
sudo vi /etc/redis/redis.conf
Comment the following line.
# bind 127.0.0.1 ::1
Note: binding Redis only to the addresses you intend makes it more difficult for malicious actors to make requests or gain access to your server, so make sure you bind to the correct IP addresses for your network.
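For example, rather than commenting the line out entirely, you could bind to both loopback and your server's own network address (10.0.0.5 here is just a placeholder):
bind 127.0.0.1 ::1 10.0.0.5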
Hope it helps someone. For more information read the following article.
https://bobcares.com/blog/could-not-connect-to-redis-connection-refused/
Here is a better way to connect to your Redis.
First, check the IP address the Redis server is running on, like this:
ps -ef | grep redis
The result looks something like "redis 1184 1 0 .... /usr/bin/redis-server 172.x.x.x:6379".
Then you can connect to Redis with the -h (hostname) option, like this:
redis-cli -h 172.x.x.x
Try this:
sudo service redis-server restart
For the error connecting to Redis on Apple Silicon (MacBook Pro M1 - Dec 2020), you just have to know two things:
Running redis-server with sudo removes the server startup error:
shell% sudo redis-server
To run it as a service, "daemonize" it, which allows it to run in the background:
shell% sudo redis-server --daemonize yes
Verify using the step below:
shell% redis-cli ping
Hope this helps all MacBook Pro M1 users who are really worried about the lack of documentation on this.
I was stuck on this for a long time. After a lot of tries I was able to configure it properly.
There can be different reasons for this error. I will try to provide each reason and the solution to get past it. Make sure you have installed redis-server properly.
Port 6379 is not allowed by the ufw firewall.
Solution: type the following command: sudo ufw allow 6379
The issue can be related to the permissions of the redis user. Maybe the redis user doesn't have permission to modify the necessary Redis directories. The redis user should have permissions on the following directories:
/var/lib/redis
/var/log/redis
/run/redis
/etc/redis
To give ownership of these directories to the redis user, type the following commands:
sudo chown -R redis:redis /var/lib/redis
sudo chown -R redis:redis /var/log/redis
sudo chown -R redis:redis /run/redis
sudo chown -R redis:redis /etc/redis
Now restart redis-server with the following command:
sudo systemctl restart redis-server
Hope this will be helpful for somebody.
First you need to start all the Redis nodes using the command below, one by one for all the conf files.
# Note: if you are setting up a cluster then you should have 6 nodes; 3 will be masters and 3 will be slaves. redis-cli will automatically select the masters and slaves out of the 6 nodes using the --cluster command, as shown in my commands below.
[xxxxx@localhost redis-stable]$ redis-server xxxx.conf
then run
[xxxxx@localhost redis-stable]$ redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 --cluster-replicas 1
The output of the above should look like:
>>> Performing hash slots allocation on 6 nodes...
A second way to set everything up automatically:
you can use the utils/create-cluster script to set everything up for you, like
starting all the nodes and creating the cluster.
You can follow https://redis.io/topics/cluster-tutorial
Thanks
Actually you need to run "redis-server &" after installation to start the service; when you only run "redis-server" the service runs in the foreground (undetached). Emphasis on the "&".
I just had this same problem because I had used improper syntax in my config file. I meant to add:
maxmemory-policy allkeys-lru
to my config file, but instead only added:
allkeys-lru
which evidently prevented Redis from parsing the config file, which in turn prevented me from connecting through the cli. Fixing this syntax allowed me to connect to Redis.
I had that issue with Homebrew on macOS; the problem was some sort of missing permission on the /usr/local/var/log directory (see the issue here).
In order to solve it I deleted /usr/local/var/log and reinstalled Redis with brew reinstall redis.
In my case, it was the password that contained some characters like '; after changing it, the server started without problems.
Just like Aaron, in my case brew services list claimed redis was running, but it wasn't. I found the following information in my log file at /usr/local/var/log/redis.log:
4469:C 28 Feb 09:03:56.197 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
4469:C 28 Feb 09:03:56.197 # Redis version=4.0.9, bits=64, commit=00000000, modified=0, pid=4469, just started
4469:C 28 Feb 09:03:56.197 # Configuration loaded
4469:M 28 Feb 09:03:56.198 * Increased maximum number of open files to 10032 (it was originally set to 256).
4469:M 28 Feb 09:03:56.199 # Creating Server TCP listening socket 192.168.161.1:6379: bind: Can't assign requested address
That turns out to be caused by the following configuration:
bind 127.0.0.1 ::1 192.168.161.1
which was necessary to give my VMWare Fusion virtual machine access to the redis server on macOS, the host. However, if the virtual machine wasn't started, this binding failure caused redis not to start up at all. So starting the virtual machine solved the problem.
I was trying to connect to my Redis running in WSL2 from VS Code running on Windows.
I have listed down what worked for me and the order in which I have performed these actions:
1) sudo ufw allow 6379
2) Update redis.conf to bind 127.0.0.1 ::1 192.168.1.7
3) sudo service redis-server restart
NOTE: This is the first time I have installed Redis on wsl2 and have not run a single command yet.
Let me know if it works for you.
Thanks.
Redis for Mac:
1- brew install redis
2- brew services start redis
3- redis-cli ping
$ brew services start redis
$ brew services stop redis
$ brew services restart redis
Launch autostart options:
$ ln -sfv /usr/local/opt/redis/*.plist ~/Library/LaunchAgents
# autostart activate
$ launchctl load ~/Library/LaunchAgents/homebrew.mxcl.redis.plist
# autostart deactivate
$ launchctl unload ~/Library/LaunchAgents/homebrew.mxcl.redis.plist
Redis conf default path : /usr/local/etc/redis.conf
In my case, someone had come along and incorrectly edited the redis.conf file to this:
bind 127.0.0.1 ::1
bind 192.168.1.7
when it really needed to be this (one line):
bind 127.0.0.1 ::1 192.168.1.7
I am using Ubuntu 18.04.
I just entered this command in the terminal:
sudo systemctl start redis-server
And it is now working, so I think my Redis server was simply not started, which is why it was showing me the error
Could not connect to Redis at 127.0.0.1:6379: Connection refused.

Auto reconnect to RabbitMQ cluster after server restart

I have a master-slave configuration of RabbitMQ, running as two Docker containers with dynamic internal IPs (changed on every restart).
Clustering works fine on a clean run, but if one of the servers gets restarted it cannot reconnect to the cluster:
rabbitmqctl join_cluster --ram rabbit@master
Clustering node 'rabbit@slave' with 'rabbit@master' ...
Error: {ok,already_member}
And following:
rabbitmqctl cluster_status
Cluster status of node 'rabbit@slave' ...
[{nodes,[{disc,['rabbit@slave']}]}]
says that the node is not in a cluster.
The only way I found is to remove this node and only then try to rejoin the cluster, like this:
rabbitmqctl -n rabbit@master forget_cluster_node rabbit@slave
rabbitmqctl join_cluster --ram rabbit@master
That works, but it doesn't look good to me. I believe there should be a better way to rejoin a cluster than forgetting the node and joining again. I see there is also an update_cluster_nodes command, but it seems that this is something different; I'm not sure if it could help.
What is the correct way to rejoin the cluster on container restart?
I realize that this has been open for a year, but I thought I would answer just in case it might help someone.
I believe that this issue has been resolved in a recent RabbitMQ release.
I implemented a Dockerized RabbitMQ Cluster using the Rabbit management 3.6.5 image and my nodes are able to auto rejoin the cluster on container or Docker host restart.

Rabbit will not cluster on ec2

I am having server issues with getting rabbit to cluster.
I boot up two nodes on ec2.
On the first node booted I do this.
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl start_app
I boot another node.
sudo service rabbitmq-server stop
#Copy cookie from the first server booted
sudo su - -c 'echo -n "cookie" > /var/lib/rabbitmq/.erlang.cookie'
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl cluster rabbit@server1
1) server1 is running
2) What ports need to be open? I have 22, 4369, 5672
sudo rabbitmqctl cluster rabbit@aws-rabbit-server-east-development-20121102162143
Clustering node 'rabbit@aws-rabbit-server-east-development-20121103033005' with ['rabbit@aws-rabbit-server-east-development-20121102162143'] ...
Error: {no_running_cluster_nodes,['rabbit@aws-rabbit-server-east-development-20121102162143'],
['rabbit@aws-rabbit-server-east-development-20121102162143']}
What could possibly be missing from their docs, or what am I missing?
I had a similar problem on EC2 with two Windows machines. I eventually got it working, but I'm not sure I did it in the correct way, so there may be a better solution.
The issue I found was that the two nodes could not see each other when trying to cluster. Each time you start a Rabbit node it seemed to be assigned a port number dynamically.
This obviously makes it very difficult to know which port to open up in the security group so to solve this, I restricted the range of ports Rabbit chose from when assigning the port. I restricted this to a range of 1 port on each node so I always know which port was being assigned.
The easiest way I found to do this was by editing the sbin\rabbitmq-service.bat file.
find the line -kernel inet_default_connect_options "[{nodelay,true}]" ^
add the following two lines to the file underneath:
-kernel inet_dist_listen_min ##### ^
-kernel inet_dist_listen_max ##### ^
replacing ##### with your chosen port number.
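For example (the port number is hypothetical; pick any free port and use the same one on every node), if you chose 25672 the two lines would read:
-kernel inet_dist_listen_min 25672 ^
-kernel inet_dist_listen_max 25672 ^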
So you should now open up the following ports:
5672 - RabbitMQ’s listening port
4369 - Erlang Port Mapper Daemon
##### - the chosen port number for the Erlang nodes to communicate via
Because Erlang does not recognise FQDNs, you may need to modify the hosts file on all the servers to make sure they are all able to resolve the Erlang node names to an IP address, e.g.
123.123.123.111 NODE1
123.123.123.222 NODE2
Once this is done you should be able to see each node from the other. You can check this by calling the following from the command line (replacing rabbit@NODE2 with whichever node you want to see):
rabbitmqctl status -n rabbit@NODE2
Hope this gives you some help; I'm no expert but I found this got things working for me!