Rabbitmq cluster does not work on ec2 - rabbitmq

Clustering with RabbitMQ does not work. I have all ports open and I am on Ubuntu 12.04, yet I get the error below. Does any RabbitMQ dev ever read these posts? Why does this happen? Why don't the docs reflect how to cluster properly? Both nodes have the same cookie.
thanks
sudo rabbitmqctl stop_app
rabbitmqctl join_cluster --ram rabbit@ip-172-31-12-135.us-west-1.compute.internal
Clustering node 'rabbit@ip-172-31-2-103' with 'rabbit@ip-172-31-12-135.us-west-1.compute.internal' ...
Error: unable to connect to nodes ['rabbit@ip-172-31-12-135.us-west-1.compute.internal']: nodedown
=ERROR REPORT==== 26-Aug-2014::07:25:21 ===
** System NOT running to use fully qualified hostnames **
** Hostname ip-172-31-12-135.us-west-1.compute.internal is illegal **
DIAGNOSTICS
===========
attempted to contact: ['rabbit@ip-172-31-12-135.us-west-1.compute.internal']
rabbit@ip-172-31-12-135.us-west-1.compute.internal:
* connected to epmd (port 4369) on ip-172-31-12-135.us-west-1.compute.internal
* epmd reports node 'rabbit' running on port 25672
* TCP connection succeeded but Erlang distribution failed
* suggestion: hostname mismatch?
* suggestion: is the cookie set correctly?
current node details:
- node name: 'rabbitmqctl20516@ip-172-31-2-103'
- home dir: /var/lib/rabbitmq
- cookie hash: deaU3MfVotDW9r05xrIWwA==

Sorry to revive this but I ran into something similar on Windows with Rabbit 3.4.2 and Erlang OTP 17.3. I'm certain this is also an issue with Rabbit back to at least 3.3.5.
My goal was to set up a RabbitMQ cluster on the same vnet. The machines can see each other and can reach each other's shares, etc. The machines had the same Erlang cookies and reported no errors. I could connect to each broker but could not get them to cluster together.
There's not a lot of help on this around the web so after struggling with it for (many) hours here's how I fixed it. For me it was a casing issue. My vms were named rabbitMq00 and rabbitMq99 so my cluster command from rabbitMq00 was:
rabbitmqctl join_cluster rabbit@rabbitMq99
Wrong! The error message it produced was the same as in the original question:
rabbit@rabbitMq99:
* connected to epmd (port 4369) on rabbitMq99
* epmd reports node 'rabbit' running on port 25672
* TCP connection succeeded but Erlang distribution failed
* suggestion: hostname mismatch?
* suggestion: is the cookie set correctly?
Windows/Erlang/WhoKnows wants caps (probably because of the NetBIOS of olden days). The proper command is:
rabbitmqctl join_cluster rabbit@RABBITMQ99
Server name must be uppercase. Sad but true. Hopefully this can help someone.

Answered on the mailing list: https://groups.google.com/d/msg/rabbitmq-users/9P-BAwGVHJU/fwOpZPJywwYJ, including my response here.
** System NOT running to use fully qualified hostnames **
** Hostname ip-172-31-12-135.us-west-1.compute.internal is illegal **
The 3 most common issues are:
1. Host names: see "Issues with hostname" on http://www.rabbitmq.com/ec2.html
2. Firewalls, port access: see "Firewalled nodes" on http://www.rabbitmq.com/clustering.html
3. Different Erlang versions across the cluster: "If using clustered nodes, all nodes should use the same version of Erlang" on http://www.rabbitmq.com/which-erlang.html
so I'm not sure it's fair to claim that the docs are unhelpful.
Your issue seems to be 1 or 2, although all 3 need to be checked to be sure.
We'll try to cross link the pages above better.
Also, a quick search for the error message above yields multiple results, e.g.:
http://markmail.org/thread/2tgytqbittfvb2jq
http://markmail.org/thread/qfpphcemg73luf4j
http://markmail.org/thread/2f5alpmgwn2xybvj
which may clarify some of the issues in a bit more detail.
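For the firewall item in the list above, a rough reachability check can be scripted. This is only a sketch: the port numbers are the ones shown in the diagnostics output in this thread (4369 for epmd, 25672 for inter-node and CLI traffic), the function names `cluster_ports`, `port_open` and `check_cluster_ports` are made up for illustration, and it assumes bash's `/dev/tcp` redirection and the coreutils `timeout` command are available.

```shell
# Hypothetical helpers; the names are illustrative, not part of RabbitMQ.

# Ports seen in the diagnostics above: epmd and the Erlang distribution port.
cluster_ports() {
  echo "4369 25672"
}

# Return 0 if a TCP connection to host:port can be opened within 3 seconds.
port_open() {
  local host=$1 port=$2
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report the state of every cluster port on the given host.
check_cluster_ports() {
  local host=$1 port
  for port in $(cluster_ports); do
    if port_open "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}
```

Calling `check_cluster_ports ip-172-31-12-135` from the joining node would cover the firewall item only. As the "TCP connection succeeded but Erlang distribution failed" line shows, open ports do not rule out a hostname or cookie problem.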

Starting from version 3.7.0 there is an environment variable for this, RABBITMQ_USE_LONGNAME=true,
or you need to pass the --longnames option to the CLI tools.
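A minimal sketch of how the choice between short and long node names could be scripted. The helper names (`needs_longnames`, `short_host`, `join_command`) are hypothetical; the only rule encoded is the one discussed in this thread, namely that a host part containing a dot calls for long names. The generated command only prints what would be run, and it assumes the target node itself was started with long names enabled (for example via RABBITMQ_USE_LONGNAME=true).

```shell
# Hypothetical helpers for choosing between short and long node names.

# Succeeds when the host part contains a dot, i.e. it looks fully qualified.
needs_longnames() {
  case $1 in
    *.*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Strip everything from the first dot onwards, leaving the short host name.
short_host() {
  echo "${1%%.*}"
}

# Print (not run) the join_cluster command that fits the given target host.
join_command() {
  local host=$1
  if needs_longnames "$host"; then
    echo "rabbitmqctl join_cluster --longnames rabbit@${host}"
  else
    echo "rabbitmqctl join_cluster rabbit@${host}"
  fi
}
```

For the host in the original question, `short_host ip-172-31-12-135.us-west-1.compute.internal` yields the short name that a node running without long names would expect.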

Related

RabbitMQ: node is unreachable on windows after update

My windows machine recently auto-updated, and since then rabbitmq is not working. I know the errors I'm getting appear in many stackoverflow questions, but none of them have helped me resolve my issue. Any rabbitmqctl command I run returns the same result, that the node is unreachable (see below).
What I want to know:
How can I diagnose my issue?
Where can I find rabbitmq or erlang logs for what is happening? I can't find any and the only environment variable defined on my machine is the RABBIT_MQ_HOME var.
Any suggestions on fixing the issue (I have listed what I have tried below).
The error I am getting on any rabbitmqctl command:
rabbitmqctl.bat start_app
Starting node rabbit@DESKTOP-BG3LMOM ...
Error: unable to perform an operation on node 'rabbit@DESKTOP-BG3LMOM'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@DESKTOP-BG3LMOM
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit@DESKTOP-BG3LMOM']
rabbit@DESKTOP-BG3LMOM:
* connected to epmd (port 4369) on DESKTOP-BG3LMOM
* epmd reports: node 'rabbit' not running at all
no other nodes on DESKTOP-BG3LMOM
* suggestion: start the node
Current node details:
* node name: 'rabbitmqcli-352-rabbit@DESKTOP-BG3LMOM'
System information:
RabbitMQ version: 3.9.13
Erlang version: 12.0
Windows 10 build: 19044.1526
What I have tried:
I've checked the erlang cookie is synced in all locations
I've uninstalled RabbitMQ and Erlang and reinstalled them both to the latest version (via choco), with machine restarts between uninstalling and reinstalling.
Followed every suggestion listed in this thread

Can't establish TCP connection, RabbitMQ

I'm new to RabbitMQ and I want to run a RabbitMQ server instance on CentOS 7 using the following command:
sudo systemctl start rabbitmq-server
The command seemed to take forever and when I stopped the process and checked the log files, everything was ok and it said that rabbit is up and running. But when I try to execute any command using rabbitmqctl I'm getting the following error:
Error: unable to perform an operation on node 'rabbit@hostname'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@hostname
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
**DIAGNOSTICS**
attempted to contact: [rabbit@hostname]
rabbit@hostname:
* connected to epmd (port 4369) on hostname
* epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
* can't establish TCP connection to the target node, reason: timeout (timed out)
* suggestion: check if host 'hostname' resolves, is reachable and ports 25672, 4369 are not blocked by firewall
Current node details:
* node name: 'rabbitmqcli-806330-rabbit@hostname'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: KgAE7WR3dl5/FGAyWKE5LA==
I tried killing the processes manually but it didn't work.
Every needed port is listening and I can telnet to them. Can you please help me find where the problem might be?

The client machine cannot resolve the hostname pointing to the RabbitMQ server.
If the IP address isn't publicly propagated, you have to put the IP/host combination in /etc/hosts file.
You could also try to connect to the IP address instead of the hostname to clear any other network related issues.
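The /etc/hosts suggestion can be checked with a small script. The sketch below is an illustration only: `host_in_hosts_file` and `hosts_entry` are hypothetical helper names, and the IP address and host name used in the usage note are placeholders, not values from the question.

```shell
# Hypothetical helper: does a hosts-format file contain an entry for a host?
# Usage: host_in_hosts_file <hostname> [file]   (file defaults to /etc/hosts)
host_in_hosts_file() {
  local host=$1 file=${2:-/etc/hosts}
  awk -v h="$host" '
    /^[[:space:]]*#/ { next }                 # skip whole-line comments
    {
      sub(/#.*/, "")                          # drop trailing comments
      for (i = 2; i <= NF; i++)               # field 1 is the IP address
        if ($i == h) found = 1
    }
    END { exit (found ? 0 : 1) }
  ' "$file"
}

# Print the line that would need to be added for an IP/hostname pair.
hosts_entry() {
  printf '%s %s\n' "$1" "$2"
}
```

For example, `host_in_hosts_file rabbit1 || hosts_entry 172.31.12.135 rabbit1` prints the missing line rather than editing the file. On Linux, `getent hosts rabbit1` shows what the resolver actually returns, which also covers DNS.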

TCP connection succeeded, Erlang distribution failed

We installed the Erlang VM (erlang-23.2.1-1.el7.x86_64.rpm) and RabbitMQ server (rabbitmq-server-3.8.19-1.el7.noarch.rpm) on 3 different machines and successfully started the RabbitMQ server on each of them, as three separate single-node clusters. But when we tried to cluster these RabbitMQ nodes we got an "Erlang distribution failed" error. We googled it and found it might be due to an Erlang cookie mismatch. Can anyone help us solve this mismatch, if it is the root cause?
Error message :
Error: unable to perform an operation on node 'rabbit@keng03-dev01-ins01-dmq67-app-1627533565-1'. Please see diagnostics information and suggestions below.
The most common reasons for this are:
Target node is unreachable (e.g. due to hostname resolution, TCP connection, or firewall issues)
CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
Target node is not running
In addition to the diagnostics info below:
See the CLI, clustering, and networking guides on https://rabbitmq.com/documentation.html to learn more
Consult server logs on node rabbit@keng03-dev01-ins01-dmq67-app-1627533565-1
If a target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
attempted to contact: ['rabbit@keng03-dev01-ins01-dmq67-app-1627533565-1']
rabbit@keng03-dev01-ins01-dmq67-app-1627533565-1:
connected to epmd (port 4369) on keng03-dev01-ins01-dmq67-app-1627533565-1
epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
TCP connection succeeded but Erlang distribution failed
suggestion: check if the Erlang cookie is identical for all server nodes and CLI tools
suggestion: check if all server nodes and CLI tools use consistent hostnames when addressing each other
suggestion: check if inter-node connections may be configured to use TLS. If so, all nodes and CLI tools must do that
suggestion: see the CLI, clustering, and networking guides on https://rabbitmq.com/documentation.html to learn more
Current node details:
node name: 'rabbitmqcli-616-rabbit@keng03-dev01-ins01-dmq67-app-1627533565-2'
effective user's home directory: /var/lib/rabbitmq
Erlang cookie hash: AFJEXwyuc44Sp8oYi00SOw==

I had the same error description. In my case the Erlang cookies matched among the cluster nodes, but I seemed to hit some case sensitivity in the rabbitmqctl join_cluster command.
With an elevated command prompt on host 2179NBXXXDP
this failed: rabbitmqctl join_cluster rabbit@2179ASXXX02
and this worked: rabbitmqctl join_cluster rabbit@2179asxxx02
(Hostname of the latter turned out to be indeed lowercased in my case.)
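Since the node name seems to be compared case-sensitively, a quick check can flag a name that differs from the expected one only in letter case. This is a sketch with hypothetical helper names (`lower`, `differs_only_by_case`); it does not query RabbitMQ, it only compares two strings you supply, for example the name typed into join_cluster and the name the target node reports about itself.

```shell
# Hypothetical helper: convert a string to lower case.
lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Succeeds when the two names differ, but only in upper/lower case.
differs_only_by_case() {
  [ "$1" != "$2" ] && [ "$(lower "$1")" = "$(lower "$2")" ]
}
```

If `differs_only_by_case rabbit@2179ASXXX02 rabbit@2179asxxx02` succeeds, retrying join_cluster with the exact casing the target node uses is worth a try before looking at cookies.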

rabbitmq TCP connection succeeded but Erlang distribution failed

I'm getting below error while joining the cluster
root@hostname02:~# rabbitmqctl join_cluster --longnames rabbit@hostname01
Error: unable to perform an operation on node 'rabbit@hostname02'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@hostname02
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit@hostname02']
rabbit@hostname02:
* connected to epmd (port 4369) on hostname02
* epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
* TCP connection succeeded but Erlang distribution failed
* Node name (or hostname) mismatch: node "rabbit@hostname02" believes its node name is not "rabbit@hostname02" but something else.
All nodes and CLI tools must refer to node "rabbit@hostname02" using the same name the node itself uses (see its logs to find out what it is)
Current node details:
* node name: 'rabbitmqcli-7548-rabbit@hostname02'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: Uaa+OhOna5fm+J0oPGPAiw==
root@hostname02:~#
The .erlang.cookie files on both servers are the same.
I cannot see what I am missing here. Can someone help me resolve it?

Check the --longnames option. Erlang nodes (and thus the distribution protocol) only need long names if there is a dot in the host part. Your host names have no dot, so try the command without it.
Also, check the node name reported by RabbitMQ on startup, and verify that it matches rabbit@hostname02.

RabbitMQ Cluster : unable to connect to nodes : nodedown

I have installed RabbitMQ on two Linux machines, and both work well on their own. Then I ran the command rabbitmqctl join_cluster rabbit@gz2, and it did not work. The error output:
Error: unable to connect to nodes [rabbit@gz2]: nodedown
attempted to contact: [rabbit@gz2]
rabbit@gz2:
connected to epmd (port 4369) on gz2
epmd reports node 'rabbit' running on port 25672
TCP connection succeeded but Erlang distribution failed
suggestion: hostname mismatch?
suggestion: is the cookie set correctly?
suggestion: is the Erlang distribution using TLS?

suggestion: is the cookie set correctly?
You need to ensure both RabbitMQ nodes are using the same cookie file. Copy the file /var/lib/rabbitmq/.erlang.cookie from one node to the other, then restart RabbitMQ on the node to which you copied the file. You will be able to create a cluster after that.
Clustering and the Erlang cookie is documented here.
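The copy-and-restart steps above could be sketched as follows. Treat it as an illustration under assumptions, not a tested procedure: the helper names are hypothetical, the remote host is a placeholder argument, and the `rabbitmq` user/group, root SSH access and the `rabbitmq-server` systemd unit are assumptions about a typical Linux package install. Only the cookie path comes from the answer.

```shell
# Hypothetical helper: succeed when two cookie files have identical contents.
cookies_match() {
  cmp -s "$1" "$2"
}

# Sketch of the copy-and-restart steps; nothing runs until this is called.
# Assumes root SSH access to the other node and a systemd-managed service.
sync_cookie() {
  local remote=$1 cookie=/var/lib/rabbitmq/.erlang.cookie
  scp "$cookie" "root@${remote}:${cookie}" &&
    ssh "root@${remote}" \
      "chown rabbitmq:rabbitmq ${cookie} && chmod 400 ${cookie} && systemctl restart rabbitmq-server"
}
```

`cookies_match` is handy for comparing a local cookie with a copy fetched from the other node before and after the sync. The restrictive file mode matters because Erlang refuses a cookie file that other users can access.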
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.