Open MPI can't launch remote nodes via SSH - ssh

I am trying to set up Open MPI between a few machines on our network.
Open MPI works fine locally, but I just can't get it to work on a remote node.
I can ssh into the remote machine (without password) just fine, but if I try something like
mpiexec -n 4 --host remote.host hello_c
then the ssh connection just times out.
I checked several tutorials, but the only configuration instruction they give is "make sure you can ssh into the remote machine without a password". I did, and I still can't launch nodes on remote machines. What's the problem?

I have the same issue. Try connecting over ssh with RSA keys.
Edit 03/24: This does not work, sorry.

Related

Spyder -- Connect to remote kernel via proxy

I'm trying to connect to a remote kernel in Spyder, however the machine on which it is running is not directly accessible. Rather, to connect to it I must go through a bastion host / jumpbox as follows:
ssh -i ~/.ssh/id_rsa -J me@jumpbox me@remote which logs me directly into remote, automatically sending the connection through jumpbox.
I have python -m spyder_kernels.console running on remote, where I want to do my computing, but no way to connect to it directly since it's only accessible from jumpbox. I've tried setting up my ssh config with a ProxyJump entry, which works for logging into the machine through ssh on the command line, but it appears that Spyder ignores the config file when setting up the remote kernel connection.
Is there a way to connect to this remote kernel? It appears there's a way to do this with IPython and I know I can do it with Jupyter Notebook, but I'm wondering if I can do this in Spyder.
(Related: Connect spyder to a remote kernel via ssh tunnel)
I don't know if you're still looking for an answer to this, but for future people arriving here, and for my own reference:
Yes, you can. You have to create an ssh-tunnel and connect Spyder to the kernel via localhost. For you that would look something like this:
ssh -L 3336:remote:22 me@jumpbox
22 is the port the ssh server on remote is listening on. This is usually 22, unless the administrator changed it. 3336 is the port on localhost to connect to; you can choose any number you like above 1023 (ports below 1024 are privileged).
Then proceed as explained in the Spyder docs, i.e., launch the Spyder kernel (in the environment you want) on remote
python -m spyder_kernels.console
copy the connection file (kernel-pid.json) to your local computer:
scp -oProxyJump=me@jumpbox remote:/path/to/json-file/kernel-pid.json ~/Desktop
Change /path/to/json-file to the path of the connection file (which you can find by running jupyter --runtime-dir on remote, in the same environment the Spyder kernel is running in) and kernel-pid.json to the real file name. ~/Desktop copies it to your Desktop folder; you can change that to wherever you want.
Connect Spyder to the kernel via "Connect to existing kernel", point it to the connection file you just copied, check the This is a remote kernel (via SSH) box and enter localhost as the Hostname, and 3336 as the port (or whichever port you changed it to).
That should do it.
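For reference, the connection file is a small JSON document that lists the ports the kernel listens on. Below is a minimal Python sketch for inspecting it, assuming the standard Jupyter connection-file fields (shell_port, iopub_port, stdin_port, control_port, hb_port). The function name kernel_ports and the sample values are hypothetical and only for illustration.

```python
import json

# Port fields found in a standard Jupyter kernel connection file.
PORT_FIELDS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def kernel_ports(connection_file_text):
    """Return {field: port} for the port entries present in a connection file."""
    info = json.loads(connection_file_text)
    return {name: info[name] for name in PORT_FIELDS if name in info}


if __name__ == "__main__":
    # Hypothetical sample content; a real file also carries a key and transport.
    sample = '{"shell_port": 50001, "iopub_port": 50002, "hb_port": 50005, "ip": "127.0.0.1"}'
    for name, port in kernel_ports(sample).items():
        print(name, port)
```

These are the ports that have to reach remote through the tunnel when Spyder connects with the "remote kernel" option.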
Note that, as is the case for me, your jumpbox server may drop the ssh connection over which you launched the Spyder kernel, which will kill the kernel. So you might want to use
python -m spyder_kernels.console &
to have it run in the background, or launch it in a screen session. However, note that you cannot shut down a remote kernel with exit, and it will keep running (see here), so you have to kill it in a different way.

SSH to Github not working

SSH has been working fine for the last few weeks since I got my new PC. I've had no problems but today I started getting:
ssh: connect to host github.com port 22: resource temporarily unavailable
I did some googling and found that there is a common issue with WSL which sometimes causes this, but I'm unable to SSH from my bash shell, or from cmd/powershell.
This is the part that confuses me: if I do ssh -T git@192.30.253.113 I am prompted for the password to my key, it successfully authenticates and responds with "Hi alexmk92! You've successfully authenticated".
Great, that at least proves that my firewall isn't blocking SSH on port 22. But why does git@github.com throw the resource failed error? My initial thought is that this could be a DNS problem.
So I tried to configure my network adapter to use Google's DNS servers (8.8.8.8 and 8.8.4.4); I even configured the IPv6 DNS servers just in case. Following this I did an ipconfig /flushdns, attempted to connect via git@github.com again and BAM, the same result; however, git@192.30.253.113 still works.
I'm guessing another potential cause is that github.com is behind a load balancer and one of the IPs in the cluster could be blacklisted somewhere on my machine? I'm just pulling guesses out of thin air now. Any help would be greatly appreciated; this is driving me insane.
After some further Googling it turned out that my machine did not have a hosts entry for github.com and it was unable to automatically resolve it.
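A quick way to tell a name-resolution failure apart from a blocked port is to ask the OS resolver directly. The following is a minimal Python sketch using only the standard library; the helper name resolve is hypothetical, and the hostnames in the usage section are the ones mentioned in this thread.

```python
import socket


def resolve(hostname, port=22):
    """Return the sorted unique addresses the OS resolver gives for hostname.

    An empty list means the name could not be resolved at all.
    """
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for name in ("github.com", "ssh.github.com"):
        print(name, resolve(name) or "does not resolve")
```

If the name does not resolve but connecting by IP address works, the problem is on the DNS side rather than with SSH itself.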
In Windows Subsystem for Linux I created a ssh config file
touch ~/.ssh/config
(for some reason the base distro of Ubuntu 18.04 on the windows marketplace didn't have one) I then had to make sure the file permissions were correct:
chmod 755 ~/.ssh/config
Once the file was created, I edited it with
sudo nano ~/.ssh/config
and added github.com as a Host.
Host github.com
    Hostname ssh.github.com
    Port 22
Upon saving, I ran
sudo /etc/init.d/ssh restart
and attempted
ssh -T git@github.com
Everything now seems to be working.
In my case my ISP did not allow ssh, so it was not working from either cmd or WSL. I got around it by using a VPN.
For a successful SSH connection to GitHub, your SSH key has to be imported into GitHub:
Open Git Bash or a terminal
Run the command ssh-keygen
Accept all the default options
A private and a public key are generated in the folder <user_home>/.ssh/
Log in to GitHub.com
Navigate to the account settings
Choose "SSH and GPG keys" from the side navigation bar
Click "New SSH key"
Copy the public key content from <user_home>/.ssh/id_rsa.pub, paste it in, and save

Syncing folder with Windows guest in Vagrant and using with IIS

I'm running a Windows 2012 R2 eval box (mwrock/Windows2012R2) on Mac OS Sierra host.
I'm trying to setup IIS to run a web site from the synced folder, but I keep getting an HTTP Error 500.19 - Internal Server Error.
After researching, I found out that it seems to be related to permissions. I tried every possible combination of permissions, granting to Users, IIS_IUSRS, vagrant, etc., changing the app pool user, etc., nothing got it working.
Then I figured I could just use rsync. So I tried to change my Vagrantfile to use type: rsync, and then got an error saying rsync was not found in the PATH.
No problem, I installed rsync using Chocolatey and tried again. This time, I got an SSH error: Error: ssh_exchange_identification: Connection closed by remote host. I figured there probably wasn't anything set up for SSH on the Windows guest, so I found this post on setting up OpenSSH.
I followed the instructions and tried again. rsync still wouldn't work, but now there's no error, it just stalls. I tested just doing a regular SSH to the windows guest and that works fine. However, while doing plain old ssh works, doing vagrant ssh does not.
I get the following error after entering the password: ssh_dispatch_run_fatal: Connection to 127.0.0.1 port 2222: incomplete message. Meanwhile, doing ssh vagrant@127.0.0.1 -p 2222 works fine. So I'm running out of ideas at this point.
Has anybody managed to get this working with a setup similar to mine?

Docker to run X applications while connected through SSH

I have used these instructions for Running Gui Apps with Docker to create images that allow me to launch GUI based applications.
It all works flawlessly when running Docker on the same machine, but it stops working when running it on a remote host.
Locally, I can run
docker run --rm -ti -e DISPLAY -e <X tmp> <image_name> xclock
And I can get xclock running on my host machine.
When connecting remotely to a host with XForwarding, I am able to run X applications that show up on my local X Server, as anyone would expect.
However, if I try to run the above docker command on the remote host, it fails to connect to the DISPLAY (usually localhost:10.0).
I think the problem is that the X forwarding is set up on the localhost interface of the remote host.
So the docker container has no way to connect to DISPLAY=localhost:10.0, because that localhost means the remote host, which is unreachable from docker itself.
Can anyone suggest an elegant way to solve this?
Regards
Alessandro
EDIT1:
One possible way I guess is to use socat to forward the remote /tmp/.X11-unix to the local machine. This way I would not need to use port forwarding.
It also looks like openssh 6.7 will natively support unix socket forwarding.
When running X applications through SSH (ssh -X), you are not using the /tmp/.X11-unix socket to communicate with the X server. You are rather using a tunnel through SSH reached via "localhost:10.0".
In order to get this to work, you need to make sure the SSH server supports X connections to the external address by setting
X11UseLocalhost no
in /etc/ssh/sshd_config.
Then $DISPLAY inside the container should be set to the IP address of the Docker host computer on the docker interface - typically 172.17.0.1. So $DISPLAY will then be 172.17.0.1:10
You need to add the X authentication token inside the docker container with "xauth add" (see here)
If there is any firewall on the Docker host computer, you will have to open up the TCP ports related to this tunnel. Typically you will have to run something like
ufw allow from 172.17.0.0/16 to any port $TCPPORT proto tcp
if you use ufw.
Then it should work. I hope it helps. See also my other answer here https://stackoverflow.com/a/48235281/5744809 for more details.
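The port to open follows from the display number: X over TCP listens on port 6000 plus the display number, so display :10 uses TCP port 6010. Here is a minimal Python sketch of that mapping, assuming the docker bridge address 172.17.0.1 mentioned above; the function name container_display is hypothetical.

```python
def container_display(ssh_display, docker_host_ip="172.17.0.1"):
    """Map the DISPLAY set by ssh -X (e.g. 'localhost:10.0') to the DISPLAY
    value to use inside the container and the TCP port the firewall must allow."""
    # DISPLAY has the form host:display[.screen]
    _host, _, rest = ssh_display.rpartition(":")
    display_number = int(rest.split(".")[0])
    return f"{docker_host_ip}:{display_number}", 6000 + display_number


if __name__ == "__main__":
    display, tcp_port = container_display("localhost:10.0")
    print(display, tcp_port)  # 172.17.0.1:10 6010
```

The second value is what would go in place of $TCPPORT in the ufw rule.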

ssh server connect to host xxx port 22: Connection timed out on linux-ubuntu [closed]

I am trying to connect to a remote server via ssh, but I am getting a connection timeout.
I ran the following command
ssh testkamer@test.dommainname.com
and got the following result
ssh: connect to host testkamer@test.dommainname.com port 22: Connection timed out
but if I try to connect to another remote server, I can log in successfully.
So I don't think the problem is with ssh itself, and when another person tries the same login and password, he can log in to the server successfully.
Please help me
Thanks.
Here are a couple of things that could be preventing you from connecting to your Linode instance:
DNS problem: if the computer that you're using to connect to your remote server isn't resolving test.kameronderdehamer.nl properly then you won't be able to reach your host. Try to connect using the public IP address assigned to your Linode and see if it works (e.g. ssh user@123.123.123.123). If you can connect using the public IP but not using the hostname, that would confirm that you're having some problem with domain name resolution.
Network issues: there might be some network issues preventing you from establishing a connection to your server. For example, there may be a misconfigured router in the path between you and your host, or you may be experiencing packet loss. While this is not frequent, it has happened to me several times with Linode and can be very annoying. It could be a good idea to check this just in case. You can have a look at Diagnosing network issues with MTR (from the Linode library).
That error message means the server to which you are connecting does not reply to SSH connection attempts on port 22. There are three possible reasons for that:
You're not running an SSH server on the machine. You'll need to install it to be able to ssh to it.
You are running an SSH server on that machine, but on a different port. You need to figure out on which port it is running; say it's on port 1234, you then run ssh -p 1234 hostname.
You are running an SSH server on that machine, and it does use the port on which you are trying to connect, but the machine has a firewall that does not allow you to connect to it. You'll need to figure out how to change the firewall, or maybe you need to ssh from a different host to be allowed in.
EDIT: as (correctly) pointed out in the comments, the third is certainly the case; the other two would result in the server sending a TCP "reset" packet back upon the client's connection attempt, resulting in a "connection refused" error message, rather than the timeout you're getting. The other two might also be the case, but you need to fix the third first before you can move on.
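The difference between "refused" and "timed out" can be checked without ssh at all. The following is a minimal Python sketch using only the standard library; the function name probe and its return labels are hypothetical.

```python
import socket


def probe(host, port=22, timeout=5.0):
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered with a TCP reset: nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all: typically a firewall silently dropping the packets.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. the hostname does not resolve or the network is unreachable.
        return f"error: {exc}"


if __name__ == "__main__":
    print(probe("127.0.0.1", 22, timeout=2.0))
```

A "refused" result points at the first two causes (no server, or a different port), while "timeout" points at a firewall or routing problem between the two machines.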
I got this error and found that I hadn't whitelisted my SSH port (a non-standard number) in ConfigServer Firewall.
Just adding this here because it worked for me. Without changing any settings (to my knowledge), I was no longer able to access my AWS EC2 instance with: ssh -i /path/to/key/key_name.pem admin@ecx-x-x-xxx-xx.eu-west-2.compute.amazonaws.com
It turned out I needed to add a rule for inbound SSH traffic, as explained here by AWS. For Port range 22, I added 0.0.0.0/0, which allows all IPv4 addresses to access the instance using SSH.
Note that making the instance accessible to all IPv4 addresses is a security risk; it is acceptable for a short time in a test environment, but you'll likely need a longer term solution.
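To see what a given inbound rule actually permits, you can test an address against its CIDR range. This is a minimal Python sketch using the standard ipaddress module; the function name rule_allows is hypothetical, and the 203.0.113.x addresses are documentation placeholders, not real client addresses.

```python
import ipaddress


def rule_allows(client_ip, rule_cidr):
    """Return True if client_ip falls inside the CIDR range of an inbound rule."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(rule_cidr)


if __name__ == "__main__":
    print(rule_allows("203.0.113.7", "0.0.0.0/0"))       # True: open to every IPv4 address
    print(rule_allows("203.0.113.7", "203.0.113.7/32"))  # True: only this one address
    print(rule_allows("203.0.113.8", "203.0.113.7/32"))  # False: a changed IP is locked out
```

The last case also shows why a rule restricted to a single address stops working when your public IP changes.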
If you are on a public network, the firewall will block all incoming connections by default. Check your firewall settings or use a private network to SSH.
One possibility is that SSH is not enabled on your server/system.
Check whether sudo systemctl status ssh reports it as active.
If it's not active, try installing it with these commands
sudo apt update
sudo apt install openssh-server
Now try to access the server/system with the following command
ssh username@ip_address
This happens because of the firewall.
Reset your firewall from your hosting provider's website.
It will start working.
After connecting to the server again, add this rule to your firewall (ufw)
sudo ufw allow 22/tcp
There can be many possible reasons for this failure.
Some are listed above. I faced the same issue, and it is very hard to find the root cause of the failure.
I recommend checking the session timeout for ssh in the ssh_config file.
Try increasing the session timeout and see if it fails again.
My VPN connection was not enabled. I was trying every possible way to open up the firewall and ports until I realized that I was working from home and my VPN connection was down.
But yes, Firewall and ssh configurations can be a reason.
Try connecting to a vpn, if possible. That was the reason I was facing problem.
Tip: if you're using an ec2 machine, try rebooting it. This worked for me the other day :)
I had this issue while trying to ssh into a local nextcloud server from my Mac.
I had no issues ssh-ing in once, but if I tried to have more than one concurrent connection, it would hang until it timed out.
Note, I was sshing to user@public-ip-address.
I realized the second connection failed only when I tried to ssh in from the same network as the server, i.e. my home network.
Furthermore, when I tried ssh user@server-domain it worked!
The end fix was to use ssh user@server-domain rather than ssh user@public-ip
I have experienced a couple of nasty issues that led to these errors, and these are different from everyone else's answers here:
Wrong folder access rights. You need to have specific permissions on your ssh directories and files.
a. The .ssh directory permissions should be 700 (drwx------).
b. The public key (.pub file) should be 644 (-rw-r--r--).
c. The private key (id_rsa) on the client host, and the authorized_keys file on the server, should be 600 (-rw-------).
Nasty docker network configuration. This just happened to me on an AWS EC2 instance. It turned out that I had a docker network with an ip range that interfered with the ssh access granted by the security group and VPC. The docker network's range was e.g. 192.168.176.0/20 (i.e. a range from 192.168.176.1->192.168.191.254), whereas the security group had a range of 192.168.179.0/24; interfering with the SSH access.
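The overlap in the second case can be checked directly. Below is a minimal Python sketch using the standard ipaddress module with the two ranges quoted above; the function name networks_conflict is hypothetical.

```python
import ipaddress


def networks_conflict(cidr_a, cidr_b):
    """Return True if the two CIDR ranges share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


if __name__ == "__main__":
    docker_net = ipaddress.ip_network("192.168.176.0/20")
    # First and last usable host addresses of the docker network.
    print(docker_net[1], docker_net[-2])  # 192.168.176.1 192.168.191.254
    print(networks_conflict("192.168.176.0/20", "192.168.179.0/24"))  # True
```

Running the same check against the output of docker network inspect for each network is one way to spot this kind of clash before it blocks SSH.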
I had this error when trying to SSH into my Raspberry Pi from my MBP via a bash terminal. My RPi was connected to the network via wifi/wlan0, and its IP had been changed by my router's DHCP upon restart.
Check IP being used to login via SSH is correct. Re-check IP of device being SSH'd into (in my case the RPI), which can be checked using hostname -I
Confirm/amend SSH login credentials on "guest" device (in my case the MBP) and it worked fine in my attempt.
I faced a similar issue. I checked the following:
If ssh is not installed on your machine, you will have to install it first. (You will get a message saying ssh is not recognized as a command.)
Check whether port 22 is open on the server you are trying to ssh into.
If the remote server is under your control and you have permissions, try disabling the firewall on it.
Try to ssh again.
If the port is not the issue, then check the firewall settings, as the firewall is what is blocking your connection.
For me too it was a firewall issue between my machine and the remote server. I disabled the firewall on the remote server and I was able to make a connection using ssh.
My main machine is Windows 10 and I have a CentOS 7 VBox.
Search your main machine for "known_hosts".
On Windows, known_hosts is usually located at "user/.ssh/known_hosts".
Open it with Notepad and delete the line containing your CentOS VBox IP,
then try to connect from your terminal.
On macOS you can find known_hosts at "~/.ssh/known_hosts".
Make sure to ask the admin to authorize your device.
On Linux run:
sudo zerotier-cli listnetworks
if it returns status ACCESS DENIED ask the admin to authorize your node. This is mentioned here.
https://discuss.zerotier.com/t/solved-cant-join-network/1919
This issue can also occur if the Dynamic Host Configuration Protocol (DHCP) is not set up properly.
To solve this, first check whether your IP address is configured, using
ping ipaddress
If there is no packet loss and the IP address is working fine, try one of the other solutions. If there is no response and you have 100% packet loss, your IP address is not working and not configured.
Now configure your IP address using
sudo dhclient -v devicename
To find your device name you can use the 'ip a' command.
For example, my device was usb0, since I had connected the device through USB.
This will configure an IP address automatically, and you can even see which one is configured. You can check again with the 'ip a' command to confirm.
This may be very case specific and work in some cases only but
check to see if you were previously connecting through some VPN software/application.
Try connecting again to the VPN. Worked in my case.
This happened to me after enabling port 22 with "sudo ufw allow ssh". Before that, I was getting a refusal from my machine when connecting with ssh from another one. After enabling it, I thought it would work, but instead it showed the message "connection timed out". As I had just installed Ubuntu with the option of installing basic functions alongside, I checked whether I had openssh-server with the command sudo apt list --installed | grep openssh-server. It turned out that Ubuntu had installed openssh-client by default instead. I uninstalled it and installed openssh-server with these basic commands:
sudo apt-get purge openssh-client
sudo apt update
sudo apt install openssh-server
After that, a simple "sudo ufw allow ssh" worked perfectly and I was finally able to access the machine with an ssh command.
What worked for me was going to my security group and resetting my IP.
Here are some considerations which i took to resolve a similar issue that I had:
Port 22
IGW (Internet Gateway)
VPC
Scene 1> This is for port 22 not enabled with right configurations. If the port is set to custom or myip, the probable scene is this won't work.
Scene 2> When you delete the internet gateway, the network is created and the instance will be functional too, but the routing from the internet will not work. Hence make sure that if there is a VPC, it has an Internet Gateway attached.
Scene 3> Check the VPC for the subnet associations and routing table entries. This might probably tell you the cause. I found one in this kind of troubleshooting. The route used to land up in a "blackhole" (shows up in the route table section of the console). To fix this I had to check and find out my internet gateway and found the issue with the IGW.
Moral of the story: always trace backward in the network!
In my case I'm on Windows; I reset my firewall settings, and that fixed it.
If you get an error, first check the installed version with ssh -V. If it is not installed, install it with the sudo apt-get install openssh-server command.
Check your virtual machine's ssh service with sudo service ssh status at the console.
Check the "Active" row, and if it says inactive (dead), run sudo service ssh start in the console.
Result: you can now check the service with the sudo service ssh status command again and send an ssh connection request.
Reset the firewall and reboot your VPS from your hosting service, it will start working perfectly fine
Check whether you have accidentally deleted the default VPC or default subnets while creating your own VPC and subnets.
I made this mistake while creating a VPC, and hence got this error while connecting via ssh.
Also check whether you have attached an IGW to the public subnets.
It's not complicated.
First, check that OpenSSH is active, then disable your firewall from your hosting control panel.
Then use PuTTY or any alternative to disable the firewall on the machine itself with this command: sudo ufw disable
Try again now.
Update the security group of that instance. Your local IP has probably changed. Every time your IP changes, you will have to update the security group.