OpenMPI: multi-host usage - ssh

I have two hosts with a passwordless SSH connection between them: 10.XXX.XXX.XX1 and 10.XXX.XXX.XX2.
I have a simple binary, "matrix_mult", which runs successfully with mpirun matrix_mult.
Now I want to run it in multi-host mode, so I try the following:
mpirun -H 10.XXX.XXX.XX1,10.XXX.XXX.XX2 matrix_mult
But it returns "10.0.0.0: Connection timeout".
I also tried using a hostfile with the same hosts, but the result was the same.
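For reference, the hostfile attempt was along these lines (slot counts here are just placeholders):
# hostfile "hosts"
10.XXX.XXX.XX1 slots=2
10.XXX.XXX.XX2 slots=2
mpirun --hostfile hosts matrix_mult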
Thank you in advance!

Related

How to disable NFS client caching?

I'm having trouble with NFS client file caching. The client can still read a file that was removed from the server many minutes earlier.
My two servers are both CentOS 6.5 (kernel: 2.6.32-431.el6.x86_64)
I'm using server A as the NFS server; its /etc/exports contains:
/path/folder 192.168.1.20(rw,no_root_squash,no_all_squash,sync)
Server B is used as the client; the mount options are:
nfsstat -m
/mnt/folder from 192.168.1.7:/path/folder
Flags: rw,sync,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,nosharecache,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.20,minorversion=0,lookupcache=none,local_lock=none,addr=192.168.1.7
As you can see, the "lookupcache=none" and "noac" options are already used to disable caching, but they don't seem to work.
I did the following steps (shell equivalent shown below):
Create a simple text file from server A
Print the file from server B with "cat", and it's there
Remove the file from server A
Wait a couple of minutes and print the file from server B again, and it's still there!
But if I run "ls" from server B at that time, the file is not in the output. This inconsistent state can last a few minutes.
I think I've checked all the NFS mount options, but I can't find a solution.
Are there any other options I've missed? Or maybe the issue isn't about NFS at all?
Any ideas would be appreciated :)
I tested the same steps you described with the parameters below, and it works perfectly. I added one more parameter, "fg", to the client-side mount.
sudo mount -t nfs -o fg,noac,lookupcache=none XXX.XX.XX.XX:/var/shared/ /mnt/nfs/fuse-shared/
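If you want to apply the same options to the existing mount from the question, something along these lines should work (server address and paths taken from the question):
sudo umount /mnt/folder
sudo mount -t nfs -o fg,noac,lookupcache=none 192.168.1.7:/path/folder /mnt/folder
nfsstat -m    # confirm that noac and lookupcache=none now appear in the flags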

How to setup SSH connection with Ansible?

I am brand new to learning Ansible. Here is a pretty easy example.
I have computer A, where I will be running playbooks from.
And 10 other host machines that need to be configured. My question is: do I just need to put the public SSH key of computer A on the 10 hosts in ~/.ssh/authorized_keys?
I guess my understanding of how to efficiently set up SSH connections between my main computer and all the clients is a little fuzzy. Any help would be appreciated here.
You create a file called hosts with this content:
[test-vms]
10.0.0.100 ansible_ssh_pass='password' ansible_ssh_user='username'
In the above hosts file, leave off ansible_ssh_pass='password' if you are using SSH keys. Then you can create a playbook with the commands and call the playbook as shown below. The first line of the playbook needs to have the hosts declaration:
---
- hosts: test-vms
  tasks:
    - name: "This is a test task"
      command: /bin/hostname
Finally, you call the playbook like this:
ansible-playbook -i <hosts-file> <playbook.yaml>
Ansible simply uses SSH, so you can either copy the public key as you describe or use password authentication with the --user and --ask-pass flags.
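For example, a quick sketch of both routes (the username and the key are placeholders; ssh-copy-id assumes a key pair already exists on the control machine):
# copy your public key to a host, then run with key auth:
ssh-copy-id username@10.0.0.100
ansible-playbook -i hosts playbook.yaml --user username
# or use password authentication instead:
ansible-playbook -i hosts playbook.yaml --user username --ask-pass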
Yes. As far as connections to hosts go, Ansible sets up SSH connections between the master machine and the host machines. You have to accept the SSH host keys of the end machines. You can always skip the "Are you sure you want to continue connecting (yes/no/[fingerprint])?" step, i.e. adding the fingerprints to .ssh/known_hosts, by setting host_key_checking=False.
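For example, a minimal sketch of that setting, either in ansible.cfg or per run via an environment variable:
# ansible.cfg
[defaults]
host_key_checking = False

# or for a single run:
ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i hosts playbook.yaml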
I found this great video for initial Ansible Setup - https://youtu.be/-Q4T9wLsvOQ - maybe this can help!

docker Mule-server curl: (56) Recv failure: Connection reset by peer

This might just be my rookie knowledge of Docker,
but I can't get the networking to work.
I'm trying to run a Mule-server via the pr3d4t0r/mule repository.
I can run it and hot-swap applications, but I can't reach it.
I can run a local server without Docker, and it works flawlessly.
But not when I try it with Docker.
When I try to do a simple curl command I get "curl: (56) Recv failure: Connection reset by peer"
curl http://localhost:8090/Sven
I have tried exposing the ports via -P and separately via -p 8090:8090 but no luck.
When the container is running it holds the ports (I tried running the Docker container and the normal server at the same time, but the normal one said the ports were already in use).
When I try another image like jboss/wildfly with -p 8080:8080 there's no problem, it works perfectly.
The application in the Mule server logs and responds with a simple "hello World"; the output says the application is deployed, but there are no messages or logging while I try to reach it.
Any suggestions?
In my case it was actually the app that was configured incorrectly. It had localhost as the host; it should have been 0.0.0.0. Without this it was listening only on localhost, i.e. inside the Docker container, and was not reachable from outside it.
You should not need to use --net=host.
So check whether there is such a configuration: in application.properties the IP needs to be set to 0.0.0.0, not 127.0.0.1.
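If the app reads a Spring-style application.properties, a sketch of the relevant lines would be (property names are assumptions; adjust to whatever config your app actually uses):
# application.properties
server.address=0.0.0.0
server.port=8090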
The error
"curl: (56) Recv failure: Connection reset by peer"
means that no process in the Docker container is listening on that port. The -p option binds a port on the host system to a port in the container:
-p <port on the host to bind to>:<port in the container>
So check your image; maybe the app in the container uses a different port and you need
-p 8080:8090
If you have server.address=localhost in your application.properties, comment it out or remove it.
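To double-check the port mapping and what the app actually bound to, standard Docker commands can help (the container name is a placeholder):
docker ps                      # the PORTS column should show 0.0.0.0:8090->8090/tcp
docker port <container-name>   # lists the published port mappings
docker logs <container-name>   # see which address/port the app reports it is listening on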

fabric appears to start apache2 but doesn't

I'm using Fabric to remotely start a micro AWS server, install git and a git repository, adjust the Apache config and then restart the server.
If at any point from the fabfile I issue either
sudo('service apache2 restart') or run('sudo service apache2 restart'), or a stop and then a start, the command apparently runs and I get the response indicating Apache has started, for example:
[ec2-184-73-1-113.compute-1.amazonaws.com] sudo: service apache2 start
[ec2-184-73-1-113.compute-1.amazonaws.com] out: * Starting web server apache2
[ec2-184-73-1-113.compute-1.amazonaws.com] out: ...done.
[ec2-184-73-1-113.compute-1.amazonaws.com] out:
However, if I try to connect, the connection is refused, and if I SSH into the server and run
sudo service apache2 status, it says that "Apache is NOT running".
While SSHed in, if I run
sudo service apache2 start, the server starts and I can connect. Has anyone else experienced this? Or does anyone have any tips as to where I could look, in log files etc., to work out what has happened? There is nothing in apache2/error.log, syslog or auth.log.
It's not that big a deal, I can work around it. I just don't like such silent failures.
Which version of fabric are you running?
Have you tried changing the pty argument (try changing the shell too, though it should not make a difference)?
http://docs.fabfile.org/en/1.0.1/api/core/operations.html#fabric.operations.run
You can set the pty argument like this:
sudo('service apache2 restart', pty=False)
Try this:
sudo('service apache2 restart', pty=False)
This worked for me after running into the same problem. I'm not sure why this happens.
This is an instance of this issue and there is an entry in the FAQ that has the pty answer. Unfortunately CentOS 6 doesn't support pty-less sudo commands, and I didn't like the nohup solution since it killed the output.
The final entry in the issue mentions using sudo('set -m; service servicename start'). This turns on Job Control and therefore background processes are put in their own process group. As a result they are not terminated when the command ends.
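Applied to this question, that would look like:
sudo('set -m; service apache2 start')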
When connecting to your remotes on behalf of a user granted enough privileges (such as root), you can manage system services as shown below:
from fabtools import service
service.restart('apache2')
https://fabtools.readthedocs.org/en/0.13.0/api/service.html
P.S. This requires installing fabtools:
pip install fabtools
A couple more ways to fix the problem:
You could run the fab target with the --no-pty option:
fab --no-pty <task>
Inside the fabfile, set always_use_pty to False on Fabric's global env object, before your target code executes:
env.always_use_pty = False
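A minimal fabfile sketch of that approach (Fabric 1.x API; the task name is just an example):
# fabfile.py
from fabric.api import env, sudo

env.always_use_pty = False

def restart_apache():
    sudo('service apache2 restart')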
Using pty=False still didn't solve it for me. The solution that ended up working for me was doing a double nohup, like so:
run.sh
#! /usr/bin/env bash
nohup java -jar myapp.jar 2>&1 &
fabfile.py
...
sudo("nohup ./run.sh &> nohup.out", user=env.user, warn_only=True)
...

Can I forward env variables over ssh?

I work with several different servers, and it would be useful to be able to set some environment variables such that they are active on all of them when I SSH in. The problem is, the contents of some of the variables contain sensitive information (hashed passwords), and so I don't want to leave it lying around in a .bashrc file -- I'd like to keep it only in memory.
I know that you can use SSH to forward the DISPLAY variable (via ForwardX11) or an SSH Agent process (via ForwardAgent), so I'm wondering if there's a way to automatically forward the contents of arbitrary environment variables across SSH connections. Ideally, something I could set in a .ssh/config file so that it would run automatically when I need it to. Any ideas?
You can, but it requires changing the server configuration.
Read the entries for AcceptEnv in sshd_config(5) and SendEnv in ssh_config(5).
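A minimal sketch of both sides (the variable name is just an example):
# server side, in /etc/ssh/sshd_config (restart sshd after editing):
AcceptEnv MY_SECRET

# client side, in ~/.ssh/config:
Host myserver
    SendEnv MY_SECRET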
Update:
You can also pass them on the command line:
ssh foo@host "FOO=foo BAR=bar doz"
Regarding security, note that anybody with access to the remote machine will be able to see the environment variables passed to any running process.
If you want to keep that information secret it is better to pass it through stdin:
cat secret_info | ssh foo@host remote_program
You can't do it automatically (except for $DISPLAY which you can forward with -X along with your Xauth info so remote programs can actually connect to your display) but you can use a script with a "here document":
ssh ... <<EOF
export FOO="$FOO" BAR="$BAR" PATH="\$HOME/bin:\$PATH"
runRemoteCommand
EOF
The unescaped variables will be expanded locally and the result transmitted to the remote side. So the PATH will be set with the remote value of $HOME.
THIS IS A SECURITY RISK: don't transmit sensitive information like passwords this way, because anyone can see the environment variables of every process on the same computer.
Something like:
ssh user@host bash -c "set -e; $(env); . thescript.sh"
...might work (untested).
It's a bit of a hack, but if you cannot change the server config for some reason it might work.