Mesos Failed to connect error to IP:5050 - apache

I am new to Mesos and have just finished setting up Mesos along with ZooKeeper on my test server.
Unfortunately, the Mesos web console keeps showing an error saying it failed to connect to Mesos on port 5050, and I can't figure out why.
I have included the error in the screenshot below.
The Mesos log files don't point to a cause either.

I resolved the problem by starting the master with an explicit IP and hostname:
./bin/mesos-master.sh --ip=x.x.x.x --work_dir=/var/lib/mesos --hostname=x.x.x.x

We can avoid this problem by starting mesos-master with the following options:
--ip=xx.xx.xx.xx --hostname_lookup=false

I have resolved this problem. Open the web page in Chrome and open the developer tools; you will see that Chrome is accessing the site by domain name. In my case the domain name is "mesosphere", and because there is no "mesosphere" entry in DNS, the requests failed.
I solved the problem by adding mesosphere to the hosts file, C:\Windows\System32\drivers\etc\hosts.
If you use a domain name for the Mesos cluster, you must add that name to the Windows hosts file.
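A quick way to confirm this kind of name-resolution failure from the client machine, sketched in Python. The helper name resolve is made up for illustration; it only asks the local resolver (which consults the hosts file and DNS) what a name maps to.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated addresses the local resolver
    gives for hostname, or an empty list if the name does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when lookup fails
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

For example, if resolve("mesosphere") returns an empty list on the machine running the browser, the hosts entry (or DNS record) is probably still missing.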

There can be multiple issues here.
Is your mesos-master running and healthy?
If so, has the leader election process completed?
Check whether you are able to run
ping leader.mesos
If the ping doesn't work, the leader has not been elected. Fix that first.
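Since ping only shows that the name resolves and the host answers ICMP, it can also help to check the master's TCP port directly. Below is a minimal Python sketch; the function name can_connect and the 3-second timeout are my own choices, not anything Mesos provides.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened
    within timeout seconds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors
        return False
```

Something like can_connect("leader.mesos", 5050) run from the machine showing the error would tell you whether the master port is reachable at all, independent of the web UI.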

I had this problem too. Luckily, I also have a working Mesos server, so I could compare my demo setup against it. I captured the packets between client and server in my demo and found that the browser didn't send fresh requests, only some keepalive packets.
But when I captured the packets on the working Mesos server, I found that the browser sent GET requests frequently, as in the image.
I think that if you run some tasks or add an agent, it may prompt the browser to send requests frequently, and then the "Failed to connect" message will disappear.

I was having the same issue, and what fixed it for me was the ZooKeeper configuration. In my case I was using the EC2 public IP address rather than the private one. Once I changed the /etc/mesos/zk file to zk://<private IP>:2181/mesos I was able to connect without the constant error messages. In other words, ZooKeeper was reporting that it was running on one IP while mesos-master was trying to connect using a different IP.
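To double-check which addresses a zk:// URL actually points at, something like the following Python sketch might help. The function names and the 10.0.0.5 address in the usage note are made up for illustration; the parser assumes the usual zk://host1:port1,host2:port2/path layout with IPv4 addresses or hostnames, and falls back to ZooKeeper's default port 2181.

```python
import ipaddress


def zk_hosts(zk_url):
    """Split a zk:// URL into a list of (host, port) pairs and the znode path."""
    prefix = "zk://"
    if not zk_url.startswith(prefix):
        raise ValueError("expected a URL starting with zk://")
    servers, _, path = zk_url[len(prefix):].partition("/")
    # Drop optional "user:password@" credentials in front of the host list
    servers = servers.rsplit("@", 1)[-1]
    pairs = []
    for server in servers.split(","):
        host, _, port = server.partition(":")
        pairs.append((host, int(port) if port else 2181))
    return pairs, "/" + path


def is_private_ip(host):
    """True if host is an IP literal in a private range; False for public
    addresses and for anything that is not an IP literal (e.g. a hostname)."""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        return False
```

For instance, zk_hosts("zk://10.0.0.5:2181/mesos") gives ([("10.0.0.5", 2181)], "/mesos"), and is_private_ip("10.0.0.5") is True, which is what you would want to see on EC2 in the situation described above.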

My configuration was correct as suggested, but the mesos-master service still failed to start. There is an alternative way to start the mesos-master node with exactly the same configuration. Commands to start mesos-master:
$ cd /usr/sbin [or mesos_installation directory/bin]
$ sudo ./mesos-master --work_dir=/var/lib/mesos --log_dir=/home/rajeev/logs/mesos/
This started mesos-master successfully for me.

Related

Is it possible to host a Minecraft server on GitHub Codespaces?

I downloaded the Fabric server jar file to a GitHub Codespace and am able to run the server without trouble. However, I am unable to determine the IP needed to connect to the server. Starting the server automatically forwards port 25565 and I make the port public. However, I can't figure out which IP to paste into Minecraft to connect to it. How do I figure out the IP of the server?
I found an answer thanks to inspiration from this question.
Steps:
1) Set up the Fabric server jar as you normally would, but on the codespace. Start the server.
2) Split the terminal so that one pane runs Java (the server console) and the other runs bash.
3) Install ngrok via npm i ngrok --save-dev.
4) Once the server has finished setting up, run the command ./node_modules/.bin/ngrok tcp 25565.
5) Copy the address shown under Forwarding (without the tcp:// part, but including the port). It should look something like 4.tcp.ngrok.io:17063.
You now have the address of the server!
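If you want to script the "strip tcp:// but keep the port" part, here is a small Python sketch. The function name minecraft_address is hypothetical, and it assumes the Forwarding value always has the tcp://host:port shape shown above.

```python
from urllib.parse import urlsplit


def minecraft_address(forwarding_url):
    """Turn an ngrok forwarding URL such as tcp://4.tcp.ngrok.io:17063
    into the host:port string to paste into Minecraft."""
    parts = urlsplit(forwarding_url)
    if parts.scheme != "tcp" or parts.hostname is None or parts.port is None:
        raise ValueError("expected a URL of the form tcp://host:port")
    return f"{parts.hostname}:{parts.port}"
```

With the example address from the steps, minecraft_address("tcp://4.tcp.ngrok.io:17063") returns "4.tcp.ngrok.io:17063".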
Note: The free version of ngrok has URLs which change every time, as well as a limit, but for small-scale servers this shouldn't be an issue. You are also limited by the free codespace usage limit GitHub puts in place. However, you can easily get around this by creating a secondary account that you use codespaces on only for the server.

Payara server starts for a brief time but doesn't connect

I recently had to reinstall IntelliJ IDEA, and ever since then I've been unable to run one app that runs on Payara. I have Payara 5.2022.3 (full) installed and the project is using Java 11.
This is the server log:
Artifact my_project-ear:ear exploded: Waiting for server connection to start artifact deployment…
Detected server admin port: 4848
Detected server http port: 8080
And then nothing happens.
And if I terminate the process I get a message:
Application Server was not connected before run configuration stop, reason: Unable to ping server at localhost:4848
Based on my observation it seems like a process starts running on port 4848 for a few seconds but then stops abruptly.
I checked the CrashDumps folder, and here is the .dmp file in question:
https://drive.google.com/file/d/1AyLU2HOyXKxREjaDNyIU9eRYMnzaBspw/view?usp=sharing
I've already tried:
Running it on a different port, and checking that no other process was blocking the ports in use.
"Renewing" the domain.xml in case it was corrupted somehow.
Using different JDK.
Reinstalling Windows
I'm positive there is no problem with the app's code (I saw a friend run it on his computer today), and I don't think the run/debug configuration or the Payara and domain configuration has changed since it last worked before the IDEA reinstall.
(I'm also very new to Payara, and to software development in general, so I'm not very skilled at solving this kind of problem.)
Thank you for all your answers.
It looks like a bug in the Java version/vendor you are using which is causing the crash of the process. Updating to a more recent Java build or to a different JVM vendor should help.

Failed to start RabbitMQ on Ubuntu server - Node is not running

I have installed RabbitMQ on my server. Suddenly I needed to remove a queue and could not. I decided to remove and reinstall RabbitMQ, but now I cannot run it. I used this article https://linoxide.com/ubuntu-how-to/install-setup-rabbitmq-ubuntu-16-04/ and got this error:
https://imgur.com/a/lL0xln6
The problem was related to some DNS and hostname settings on my server. I simply switched to a new server and now use AMQP as a SaaS offering.

Apache Drill-embedded can't connect due to VPN

I'm trying to use Apache Drill in embedded mode (drill-embedded) however when it starts it shows an error:
Error: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: CONNECTION : io.netty.channel.ConnectTimeoutException: connection timed out: /192.168.1.11:31010 (state=,code=0)
If I disconnect the corporate VPN, it starts up just fine. Connections to the IP of the network adapter are being blocked by the VPN software, which is expected, so I need it to connect to the loopback (127.0.0.1) instead. How can I configure this? I have several other servers/services running fine that use the loopback, but for whatever reason Drill insists on using the IP of the adapter.
I've tried various settings in drill-override.conf but can't seem to find the right one that would cause it to connect to the loopback.
Any ideas?
A similar question was asked on the mailing list, and the following answer by Aditya seems to have fixed the problem for the user:
Could you please check your "/etc/hosts" file for a possible
mis-configuration of "localhost" and ensure that all localhost names
are set to default.
I had the same problem and was able to work around it by setting the environment variable DRILL_HOST_NAME=localhost. I think you can also set this in conf/drill-env.sh; there's a line there that exports that environment variable, which you can uncomment and set.
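Neither answer says why Drill picks the adapter IP. One plausible cause (an assumption on my part, not something confirmed in this thread) is that the machine's own hostname resolves to that address instead of to the loopback. A rough Python sketch to check what the hostname resolves to, with made-up helper names:

```python
import ipaddress
import socket


def local_hostname_addresses():
    """Return the IPv4 addresses this machine's hostname resolves to,
    or an empty list if the hostname cannot be resolved."""
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
        return socket.gethostbyname_ex(socket.gethostname())[2]
    except OSError:
        return []


def all_loopback(addresses):
    """True only if the list is non-empty and every address is a loopback address."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_loopback for addr in addresses
    )
```

If all_loopback(local_hostname_addresses()) is False and the list contains something like 192.168.1.11, that would be consistent with the /etc/hosts suggestion and with the DRILL_HOST_NAME workaround.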

RabbitMQ Shovel plugin stuck on "starting" status

RabbitMQ starts up just fine, but the shovel plugin status is listed as "starting".
Each broker is running on a separate AWS instance. The remote server is Windows Server 2008; the local server is Amazon Linux.
I'm using the following rabbitmq.config:
[{rabbitmq_shovel,
  [{shovels,
    [{scrape_request_shovel,
      [{sources, [{broker, "amqp://test_user:test_password@localhost"}]},
       {destinations, [{broker, "amqp://test_user:test_password@ec2-###-##-###-###.compute-1.amazonaws.com"}]},
       {queue, <<"scp_request">>},
       {ack_mode, on_confirm},
       {publish_properties, [{delivery_mode, 2}]},
       {publish_fields, [{exchange, <<"">>},
                         {routing_key, <<"scp_request">>}]},
       {reconnect_delay, 5}
      ]}
    ]}
  ]}
].
Running the following command:
sudo rabbitmqctl eval 'rabbit_shovel_status:status().'
returns:
[{scrape_request_shovel,starting,{{2012,7,11},{23,38,47}}}]
According to this question, this can happen if the users haven't been set up correctly on the two brokers. However, I've double-checked that I set up the users correctly via rabbitmqctl add_user on both machines -- I have even tried it with a different set of users, to be sure.
I also ran an nmap scan of port 5672 on the remote host to verify that it was up and running on that port.
UPDATE: The problem isn't solved, but it does appear to be a result of connection problems with the remote server. I changed reconnect_delay to 0 in my config file to stop the shovel from retrying the connection indefinitely. I highly recommend that others with this problem do the same, as it lets you get error messages out of rabbit_shovel_status. In my case I got the following error:
[{scrape_request_shovel,
{terminated,
{{badmatch,{error,access_refused}},
[{rabbit_shovel_worker,make_conn_and_chan,1},
{rabbit_shovel_worker,handle_cast,2},
{gen_server2,handle_msg,2},
{proc_lib,init_p_do_apply,3}]}},
{{2012,7,12},{0,4,37}}}]
Answering my own question here, in case others encounter this issue. This error (and also a timeout error if you get it, {{badmatch,{error,etimedout}}, ), is almost certainly a communications problem between the two machines, most likely due to port access / firewall settings.
There were a couple of dumb things I was doing here:
1) I was using the wrong public DNS name for my remote EC2 instance (D'oh! really dumb -- can't tell you how long I spent banging my head against the wall on this one...). Remember that stopping and starting your instance generates a new public DNS name if you don't have an Elastic IP associated with the instance.
2) My remote instance is a Windows server, and I realized you have to open port 5672 both in Windows Firewall and in the EC2 security groups. These are two overlapping levels of access control, so opening the port in the EC2 management console isn't sufficient on its own; you also have to configure the Windows Server firewall.
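Both mistakes come down to which host and port the broker URI really points at. As a sanity check, here is a small Python sketch that pulls them out of an AMQP URI. The function name and the broker.example.com host in the usage note are hypothetical; the default ports are the standard ones for amqp (5672) and amqps (5671).

```python
from urllib.parse import urlsplit

# Standard default ports for the two AMQP URI schemes
DEFAULT_PORTS = {"amqp": 5672, "amqps": 5671}


def broker_endpoint(amqp_uri):
    """Return the (host, port) a URI like amqp://user:password@host[:port]
    points at, filling in the scheme's default port when none is given."""
    parts = urlsplit(amqp_uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("expected an amqp:// or amqps:// URI")
    if not parts.hostname:
        raise ValueError("URI has no host part")
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port
```

broker_endpoint("amqp://test_user:test_password@broker.example.com") returns ("broker.example.com", 5672). That host has to be the instance's current DNS name, and that port has to be open at every firewall layer between the two brokers.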