Ansible SSH error during play - ssh

I get a strange error with Ansible. The first role works fine, but when Ansible tries to execute the second one it fails because of an SSH error.
Environment:
OS: CentOS 7
Ansible version: 2.2.1.0
Python version: 2.7.5
OpenSSH version: OpenSSH_6.6.1p1, OpenSSL 1.0.1e-fips 11 Feb 2013
Ansible command being executed:
ansible-playbook -vvvv -i inventory/dev playbook_update_system.yml --limit "db[0]"
Playbook:
- name: "HUB Playbook | Updating system packages on {{ ansible_hostname }}"
hosts: release_first_half
roles:
- upgrade_system_package
- reboot_server
Role: upgrade_system_package:
- name: "upgrading CentOS system packages on {{ ansible_hostname }}"
shell: sudo puppet apply -e 'exec{"upgrade-package":command => "/usr/bin/yum clean all; /usr/bin/yum -y update;"}'
when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names
Role: reboot_server:
- name: "reboot CentOS [{{ ansible_hostname }}] server"
shell: sudo puppet apply -e 'exec{"reboot-os":command => "/usr/sbin/reboot"}'
when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names
Current behavior:
Connection to "db1" node and execute role "upgrade system packages" => OK
Try to connect to "db1" and execute role "reboot_server" => failed due to ssh.
Error message returned by Ansible:
fatal: [db1]: UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013\r\ndebug1: Reading configuration data /USR/newtprod/.ssh/config\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 56: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 64994\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Control master terminated unexpectedly\r\nShared connection to db1 closed.\r\n",
"unreachable": true
}
I don't understand, because the previous role was executed successfully on this node. Moreover, we have a lot of playbooks that use the same inventory file and they work fine. I tried on another node too, but got the same result.

It's a simple and pretty well-known issue: the shutdown process causes the SSH daemon to quit, which breaks the current SSH session (you get the "broken pipe" error). The server reboots properly, but the Ansible flow gets interrupted.
You need to add a delay to your shell command and run the task with the async option, so that Ansible's SSH session can finish before it gets killed.
shell: sleep 5; sudo puppet apply -e 'exec{"reboot-os":command => "/usr/sbin/reboot"}'
async: 1   # must be greater than 0, otherwise the task still runs synchronously
poll: 0    # fire and forget: do not wait for the command to finish
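If later tasks (or a second play) need the host again, the play has to wait for SSH to come back before continuing. The following is only a minimal sketch, not part of the original answer: it uses the wait_for module as a local action (wait_for_connection did not exist yet in Ansible 2.2), and the delay/timeout values are arbitrary.
- name: "wait for {{ inventory_hostname }} to come back after reboot"
  local_action:
    module: wait_for
    host: "{{ inventory_hostname }}"
    port: 22
    delay: 30      # give the server time to actually go down first
    timeout: 300   # give up after five minutes
  become: false    # no privilege escalation needed for a local port check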

Related

Ansible failed to connect to the host via ssh, but ssh command works

From inventory.yml:
myhost:
  ansible_host: myhost  # actually it was ansible_ssh_host (see my answer)
  ansible_user: myuser  # actually it was ansible_ssh_user (see my answer)
  ansible_pass: mypass  # actually it was ansible_ssh_pass (see my answer)
So far, Ansible worked fine. I could also ssh myuser@myhost.
Then, I changed the ssh port from default 22 to 23 and edited the inventory.yml:
myhost:
  ansible_host: myhost
  ansible_user: myuser
  ansible_pass: mypass  # THE PROBLEM! Must be ansible_ssh_pass. (see my answer)
  ansible_port: 23
As expected, I can ssh myuser@myhost -p 23, but Ansible gives the error:
fatal: [staging]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: myuser#myhost: Permission denied (publickey,password).", "unreachable": true}
What could be causing the error?
The solution is quite unexpected and slightly embarrassing:
While changing the SSH port, I also read this:
Ansible 2.0 has deprecated the “ssh” from ansible_ssh_user,
ansible_ssh_host, and ansible_ssh_port to become ansible_user,
ansible_host, and ansible_port.
I edited inventory.yml a bit too eagerly, as I also changed ansible_ssh_pass to ansible_pass. Hence: missing password -> permission denied.
So, my question had been phrased in the wrong way. I have updated it accordingly.
The variable for the password is ansible_password. See the documentation here on how to create your inventory.yml properly.
Note that you should never store your password in plain text; use a vault instead.
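As a hedged sketch of what that could look like (the vault_myhost_password variable name is made up for illustration and would live in an encrypted vars file):
all:
  hosts:
    myhost:
      ansible_host: myhost
      ansible_user: myuser
      ansible_password: "{{ vault_myhost_password }}"  # hypothetical variable from an Ansible Vault file
      ansible_port: 23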
In my case, where only key-based SSH is allowed, ansible ping was failing with:
"msg": "Failed to connect to the host via ssh: Permission denied (publickey).",
The inventory file had an entry for ansible_user=$user. Removing that entry from the inventory file helped.
My environment:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.7 LTS
Release: 16.04
Codename: xenial
$ ansible --version
ansible 2.9.27
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/home/USER_NAME/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.12 (default, Mar 1 2021, 11:38:31) [GCC 5.4.0 20160609]

Ansible SSL error "unable to get local issuer certificate" when triggered by vagrant

I have set the following ansible variables:
ansible_port: 5986
ansible_connection: winrm
ansible_winrm_server_cert_validation: ignore
and running my playbook via ansible-playbook -i ansible/inventory.ini -vvvvv ansible/playbook.yml works fine.
Now I would like Vagrant to trigger the Ansible provisioning. The Vagrantfile looks like this:
Vagrant.configure(2) do |config|
  config.vm.define "virtualbox_windows_server_2016_1" do |s|
    ...
    s.vm.provision "ansible" do |ansible|
      ansible.playbook = "ansible/playbook.yml"
      ansible.inventory_path = "ansible/inventory.ini"
      ansible.config_file = "ansible/ansible.cfg"
      ansible.verbose = "-vvvvv"
    end
  end
end
Doing a vagrant provision or vagrant up --provision results in the following error:
fatal: [virtualbox_windows_server_2016_1]: UNREACHABLE! => {
"changed": false,
"msg": "ssl: HTTPSConnectionPool(host='192.168.57.3', port=5986): Max retries exceeded with url: /wsman (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)')))",
"unreachable": true
}
The vagrant log info says it runs the following ansible command:
PYTHONUNBUFFERED=1 ANSIBLE_FORCE_COLOR=true ANSIBLE_CONFIG='ansible/ansible.cfg' ANSIBLE_HOST_KEY_CHECKING=false ANSIBLE_SSH_ARGS='-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o IdentityFile=/Users/user/.vagrant.d/insecure_private_key -o ControlMaster=auto -o ControlPersist=60s' ansible-playbook --connection=ssh --timeout=30 --extra-vars=ansible_user\=\'vagrant\' --limit="virtualbox_windows_server_2016_1" --inventory-file=ansible/inventory.ini -vvvvv ansible/playbook.yml
Interestingly, when I copy and paste the above command and run it separately (i.e. on the terminal not via vagrant), there is no error and everything works just like the short ansible-playbook command I mentioned above.
It also works with and without vagrant if I set
ansible_port: 5985 # not 5986
What is the problem here?
Vagrant 2.2.4
ansible 2.8.1
Python 3.7.3
macOS 10.13.6

Ansible provisioning ERROR! Using a SSH password instead of a key is not possible

I would like to provision my three nodes from the last one using Ansible.
My host machine is Windows 10.
My Vagrantfile looks like:
Vagrant.configure("2") do |config|
(1..3).each do |index|
config.vm.define "node#{index}" do |node|
node.vm.box = "ubuntu"
node.vm.box = "../boxes/ubuntu_base.box"
node.vm.network :private_network, ip: "192.168.10.#{10 + index}"
if index == 3
node.vm.provision :setup, type: :ansible_local do |ansible|
ansible.playbook = "playbook.yml"
ansible.provisioning_path = "/vagrant/ansible"
ansible.inventory_path = "/vagrant/ansible/hosts"
ansible.limit = :all
ansible.install_mode = :pip
ansible.version = "2.0"
end
end
end
end
end
My playbook looks like:
---
# my little playbook
- name: My little playbook
  hosts: webservers
  gather_facts: false
  roles:
    - create_user
My hosts file looks like:
[webservers]
192.168.10.11
192.168.10.12
[dbservers]
192.168.10.11
192.168.10.13
[all:vars]
ansible_connection=ssh
ansible_ssh_user=vagrant
ansible_ssh_pass=vagrant
After executing vagrant up --provision I got the following error:
Bringing machine 'node1' up with 'virtualbox' provider...
Bringing machine 'node2' up with 'virtualbox' provider...
Bringing machine 'node3' up with 'virtualbox' provider...
==> node3: Running provisioner: setup (ansible_local)...
node3: Running ansible-playbook...
PLAY [My little playbook] ******************************************************
TASK [create_user : Create group] **********************************************
fatal: [192.168.10.11]: FAILED! => {"failed": true, "msg": "ERROR! Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."}
fatal: [192.168.10.12]: FAILED! => {"failed": true, "msg": "ERROR! Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."}
PLAY RECAP *********************************************************************
192.168.10.11 : ok=0 changed=0 unreachable=0 failed=1
192.168.10.12 : ok=0 changed=0 unreachable=0 failed=1
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
I extended my Vagrantfile with ansible.limit = :all and added [all:vars] to the hosts file, but still cannot get past the error.
Has anyone encountered the same issue?
Create a file ansible/ansible.cfg in your project directory (i.e. ansible.cfg in the provisioning_path on the target) with the following contents:
[defaults]
host_key_checking = false
This assumes that your Vagrant box already has sshpass installed. It is unclear whether it does: the error message in your question suggests it was installed (otherwise it would be "ERROR! to use the 'ssh' connection type with passwords, you must install the sshpass program"), yet in your answer you install it explicitly (sudo apt-get install sshpass), as if it was not.
I'm using Ansible version 2.6.2 and the solution with host_key_checking = false doesn't work.
Adding the environment variable export ANSIBLE_HOST_KEY_CHECKING=False skips the fingerprint check.
This error can also be solved by simply exporting the ANSIBLE_HOST_KEY_CHECKING variable:
export ANSIBLE_HOST_KEY_CHECKING=False
source: https://github.com/ansible/ansible/issues/9442
This SO post gave the answer.
I just extended the known_hosts file on the machine that is responsible for the provisioning like this:
Snippet from my modified Vagrantfile:
...
if index == 3
  node.vm.provision :pre, type: :shell, path: "install.sh"
  node.vm.provision :setup, type: :ansible_local do |ansible|
    ...
My install.sh looks like:
# add web/database hosts to known_hosts (IP is defined in Vagrantfile)
ssh-keyscan -H 192.168.10.11 >> /home/vagrant/.ssh/known_hosts
ssh-keyscan -H 192.168.10.12 >> /home/vagrant/.ssh/known_hosts
ssh-keyscan -H 192.168.10.13 >> /home/vagrant/.ssh/known_hosts
chown vagrant:vagrant /home/vagrant/.ssh/known_hosts
# reload ssh in order to load the known hosts
/etc/init.d/ssh reload
I had a similar challenge when working with Ansible 2.9.6 on Ubuntu 20.04.
When I run the command:
ansible all -m ping -i inventory.txt
I get the error:
target | FAILED! => {
"msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."
}
Here's how I fixed it:
When you install Ansible, it creates a file called ansible.cfg, which can be found in the /etc/ansible directory. Simply open the file:
sudo nano /etc/ansible/ansible.cfg
Uncomment this line to disable SSH host key checking:
host_key_checking = False
Save the file and you should be fine now.
Note: you could also add the host's fingerprint to your known_hosts file by SSHing into the server from your machine; this prompts you to save the fingerprint:
promisepreston@ubuntu:~$ ssh myusername@192.168.43.240
The authenticity of host '192.168.43.240 (192.168.43.240)' can't be established.
ECDSA key fingerprint is SHA256:9Zib8lwSOHjA9khFkeEPk9MjOE67YN7qPC4mm/nuZNU.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.43.240' (ECDSA) to the list of known hosts.
myusername@192.168.43.240's password:
Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-53-generic x86_64)
That's all.
I hope this helps
Run the command below; it resolved my issue:
export ANSIBLE_HOST_KEY_CHECKING=False && ansible-playbook -i
All the provided solutions require changing a global config file or adding an environment variable, which creates problems when onboarding new people.
Instead, you can add the following variable to your inventory or host vars:
ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
Adding ansible_ssh_common_args='-o StrictHostKeyChecking=no' to your inventory, either globally, like:
[all:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
[all:children]
servers
[servers]
host1
OR:
[servers]
host1 ansible_ssh_common_args='-o StrictHostKeyChecking=no'
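For a YAML-format inventory, the equivalent would be the following minimal sketch mirroring the INI examples above:
all:
  vars:
    ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
  children:
    servers:
      hosts:
        host1: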

Using Command in Ansible playbook throws "Failed to connect to the host via ssh"

I have three tasks in my playbook. For all of those, Ansible needs to connect to hosts specified in the inventory file. The first two tasks executed well. The third task says
<10.182.1.23> ESTABLISH SSH CONNECTION FOR USER: root
<10.182.1.23> SSH: EXEC sshpass -d12 ssh -C -q -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r 10.182.1.23 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1485219301.67-103341754305609 `" && echo ansible-tmp-1485219301.67-103341754305609="` echo $HOME/.ansible/tmp/ansible-tmp-1485219301.67-103341754305609 `" ) && sleep 0'"'"''
fatal: [10.182.1.23]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh.", "unreachable": true}
Here is my playbook (playbook.yml):
---
- hosts: all
  strategy: debug
  gather_facts: no
  vars:
    contents: "{{ lookup('file','/etc/redhat-release')}}"
    mydate: "{{lookup('pipe','date +%Y%m%d%H%M%S**.%5N**')}}"
  tasks:
    - name: cat file content
      debug: msg='the content of file is {{contents}} at date {{mydate}}.'
    - name: second task
      debug: msg='this is second task at time {{mydate}}.'
    - name: fourth task
      command: sudo cat /etc/redhat-release
      register: result
    - debug: var=result
Here is my inventory file
[hosts]
10.182.1.23 ansible_connection=ssh ansible_ssh_user=username ansible_ssh_pass=passphrase
I am not able to understand how it is able to connect to the host for the top two tasks and why it threw an error for the third.
I am new to using Ansible. Please help me with this.
I have three tasks in my playbook. For all of those, Ansible needs to connect to hosts specified in the inventory file.
That's not true. All lookup plugins perform their actions locally on the control machine.
"Lookups: Like all templating, these plugins are evaluated on the Ansible control machine"
I am not able to understand how it is able to connect to the host for the top two tasks and why it threw an error for the third.
Because your first two tasks use the debug module with lookup plugins. They just print the value of the msg argument to the output and do not require a connection to the remote host.
So your first two tasks display the contents of the local file /etc/redhat-release and the local date-time.
The third task tries to run the code on the target machine and fails to connect.
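If the intent is to read the file from the remote host, the read itself has to be a task that runs there. A minimal sketch, not part of the original answer (the release_contents variable name is just an example):
- name: read /etc/redhat-release on the target host (runs over SSH)
  command: cat /etc/redhat-release
  register: release_contents

- name: show the remote file contents
  debug:
    msg: "the content of the file is {{ release_contents.stdout }}"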
Try the Linux command below to determine whether SSH itself works:
ssh remoteuser@remoteserver
It shouldn't prompt for a password.

Ansible script ssh error

I am creating a VM in OpenStack (a Linux VM) and launching an Ansible script from there. I am getting the following SSH error.
---
- hosts: licproxy
  user: my-user
  sudo: yes
  tasks:
    - name: Install tinyproxy#
      command: sudo apt-get install tinyproxy
    - name: Update tinyproxy
      command: sudo apt-get update
    - name: Install bind9
      shell: yes '' | sudo apt-get install bind9
However, I am able to SSH directly to machine 10.32.1.40 from the Linux box in OpenStack using admin-keydev29.
PLAY [licproxy] ***********************************************************
GATHERING FACTS ***************************************************************
<10.32.1.40> ESTABLISH CONNECTION FOR USER: my-user
<10.32.1.40> REMOTE_MODULE setup
<10.32.1.40> EXEC ssh -C -tt -vvv -o StrictHostKeyChecking=no -o IdentityFile="/opt/apps/installer/tenant-dev29/ssh/admin-key-dev29" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=my-user -o ConnectTimeout=10 10.32.1.40 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1450797442.33-90087292637238 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1450797442.33-90087292637238 && echo $HOME/.ansible/tmp/ansible-tmp-1450797442.33-90087292637238'
EXEC previous known host file not found for 10.32.1.40
fatal: [10.32.1.40] => SSH Error: ssh: connect to host 10.32.1.40 port 22: Connection refused
while connecting to 10.32.1.40:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
TASK: [Install tinyproxy] *****************************************************
FATAL: no hosts matched or all hosts have already failed -- aborting
I removed the entry from known_hosts and ran the script again; it still shows me the same message.
UPDATE
I observed that manual SSH works fine, but the Ansible script gives an SSH error.
I logged in to the newly created VM using the SSH key and checked the /var/log/auth.log file:
Dec 30 13:00:33 licproxy-vm sshd[1184]: Server listening on :: port 22.
Dec 30 13:01:10 licproxy-vm sshd[1448]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Dec 30 13:01:10 licproxy-vm sshd[1448]: Connection closed by 192.168.0.106 [preauth]
Dec 30 13:01:32 licproxy-vm sshd[1450]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
The VM runs sshd version OpenSSH_6.6.1.
I checked the /etc/ssh folder and found ssh_host_ed25519_key and ssh_host_ed25519_key.pub missing.
I created those files using the command ssh-keygen -A.
Now I want to know why these files were missing from the ssh folder. Is this a bug?
The problem was SSH port 22: the port was not up yet.
I added the following code, which simply waits for the SSH port to come up.
while ! nc -z $PROXY_SERVER_IP 22; do
  sleep 10s
done
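The same wait can be expressed in Ansible itself with a local wait_for task, so no wrapper shell script is needed. This is only a sketch under that assumption: the delay and timeout values are arbitrary, and it has to run before anything that actually needs the remote host (hence gather_facts: false).
- hosts: licproxy
  gather_facts: false          # facts can't be gathered until SSH is reachable
  tasks:
    - name: wait for SSH on the new VM to come up
      wait_for:
        host: "{{ inventory_hostname }}"   # or the VM's IP address
        port: 22
        delay: 10              # initial pause before the first check
        timeout: 300           # give up after five minutes
      delegate_to: localhost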