Using knife ec2 plugin to create VM in VPC private subnet - ssh

Although I've written a fair amount of Chef, I'm fairly new to both AWS/VPC and to administering network traffic (especially through a bastion host).
Using the knife ec2 plugin, I would like the capability to dynamically create and bootstrap a VM from my developer workstation. The VM should be able to exist in either a public or private subnet of my VPC. I would like to do all of this without the use of an Elastic IP. I would also like my bastion host to be hands-off (i.e. I would like to avoid having to create explicit per-VM listening tunnels on the bastion host).
I have successfully used the knife ec2 plugin to create a VM in the legacy EC2 model (i.e. outside of my VPC). I am now trying to create an instance in my VPC. On the knife command line, I'm specifying a gateway, security groups, subnet, etc. The VM gets created, but knife fails to ssh to it afterward.
Here's my knife command line:
knife ec2 server create \
  --flavor t1.micro \
  --identity-file <ssh_private_key> \
  --image ami-3fec7956 \
  --security-group-ids sg-9721e1f8 \
  --subnet subnet-e4764d88 \
  --ssh-user ubuntu \
  --server-connect-attribute private_ip_address \
  --ssh-port 22 \
  --ssh-gateway <gateway_public_dns_hostname (route 53)> \
  --tags isVPC=true,os=ubuntu-12.04,subnet_type=public-build-1c \
  --node-name <VM_NAME>
I suspect that my problem has to do with the configuration of my bastion host. After a day of googling, I wasn't able to find a configuration that works. I'm able to ssh to the bastion host, and from there I can ssh to the newly created VM. I cannot get knife to successfully duplicate this using the gateway argument.
I've played around with /etc/ssh/ssh_config on the bastion host. Here is how it stands today:
ForwardAgent yes
#ForwardX11 no
#ForwardX11Trusted yes
#RhostsRSAAuthentication no
#RSAAuthentication yes
#PasswordAuthentication no
#HostbasedAuthentication yes
#GSSAPIAuthentication no
#GSSAPIDelegateCredentials no
#GSSAPIKeyExchange no
#GSSAPITrustDNS no
#BatchMode no
CheckHostIP no
#AddressFamily any
#ConnectTimeout 0
StrictHostKeyChecking no
IdentityFile ~/.ssh/identity
#IdentityFile ~/.ssh/id_rsa
#IdentityFile ~/.ssh/id_dsa
#Port 22
#Protocol 2,1
#Cipher 3des
#Ciphers aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc
#MACs hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-ripemd160
#EscapeChar ~
Tunnel yes
#TunnelDevice any:any
#PermitLocalCommand no
#VisualHostKey no
#ProxyCommand ssh -q -W %h:%p gateway.example.com
SendEnv LANG LC_*
HashKnownHosts yes
GSSAPIAuthentication yes
GSSAPIDelegateCredentials no
GatewayPorts yes
I have also set /home/ubuntu/.ssh/identity to the matching private key of my new instance.
UPDATE:
I notice the following in the bastion host's /var/log/auth.log:
May 9 12:15:47 ip-10-0-224-93 sshd[8455]: Invalid user from <WORKSTATION_IP>
May 9 12:15:47 ip-10-0-224-93 sshd[8455]: input_userauth_request: invalid user [preauth]

I finally resolved this. I was missing the username when specifying my gateway. I originally thought that the --ssh-user argument would be used for both the gateway AND the VM I'm attempting to bootstrap. This was incorrect: the username must be specified separately for each.
knife ec2 server create \
  --flavor t1.micro \
  --identity-file <ssh_private_key> \
  --image ami-3fec7956 \
  --security-group-ids sg-9721e1f8 \
  --subnet subnet-e4764d88 \
  --ssh-user ubuntu \
  --server-connect-attribute private_ip_address \
  --ssh-port 22 \
  --ssh-gateway ubuntu@<gateway_public_dns_hostname (route 53)> \
  --tags isVPC=true,os=ubuntu-12.04,subnet_type=public-build-1c \
  --node-name <VM_NAME>
Just the line containing the update (notice the ubuntu@ in front):
--ssh-gateway ubuntu@<gateway_public_dns_hostname (route 53)>
I have now gone through and locked my bastion host back down, including removal of /home/ubuntu/.ssh/identity, as storing the private key on the bastion host was really bugging me.
FYI: When setting up a bastion host, the "out of the box" configuration of sshd will work when using the Amazon Linux AMI image. Also, some of the arguments above are optional, such as --ssh-port.
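For what it's worth, here is a minimal sketch of a workstation-side ~/.ssh/config that keeps the private key off the bastion entirely, relying on agent forwarding plus a ProxyCommand (the bastion alias and the 10.0.*.* subnet pattern are my assumptions; substitute your own):
Host bastion
    HostName <gateway_public_dns_hostname>
    User ubuntu
    ForwardAgent yes

Host 10.0.*.*
    User ubuntu
    ProxyCommand ssh -q -W %h:%p bastion
With the instance key loaded into ssh-agent on the workstation (ssh-add <ssh_private_key>), the bastion merely relays the TCP connection and never needs a copy of the key.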

Related

csshX using a jumphost/bastion

I am currently using the following command to log in to an EC2 instance via a jumphost:
ssh -J jumphost:2222 some_ip
I have installed csshX, as I need to log in to multiple instances simultaneously. I am not sure how to specify a jumphost in csshX.
You can use an ssh config file, the default location being ~/.ssh/config, with a configuration like the one below; the ssh client honours it.
Host 192.168.*.*
    ProxyCommand ssh jumphost -W %h:%p
and when you do csshX 192.168.0.10, it will go through the jumphost. (Tested and working from a Mac.)
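If the jumphost listens on a non-standard port, as with port 2222 in the question, the port can presumably be folded into the same ProxyCommand (a sketch, untested):
Host 192.168.*.*
    ProxyCommand ssh -p 2222 jumphost -W %h:%p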

Kubespray with bastion and custom SSH port + agent forwarding

Is it possible to use Kubespray with a bastion, but on a custom port and with agent forwarding? If it is not supported, what changes does one need to make?
Yes, always, since you can configure that at three separate levels: via the host user's ~/.ssh/config, for the entire playbook via group_vars, or as inline config (that is, on the command line or in the inventory file).
The ssh config is hopefully straightforward:
# or whatever pattern matches the target instances
Host 1.2.* *.example.com
    ProxyJump someuser@some-bastion:1234
    # and then the Agent should happen automatically, unless you mean
    # ForwardAgent yes
I'll speak to the inline config next, since it's a little simpler:
ansible-playbook -i whatever \
  -e '{"ansible_ssh_common_args": "-o ProxyJump=\"someuser@jump-host:1234\""}' \
  cluster.yaml
or via the inventory in the same way:
master-host-0 ansible_host=1.2.3.4 ansible_ssh_common_args="-o ProxyJump='someuser@jump-host:1234'"
Or via group_vars: you can either add the variable to an existing group_vars/all.yml, or, if it doesn't exist, create a group_vars directory containing the all.yml file as a child of the directory that holds your inventory file.
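For example, a minimal group_vars/all.yml sketch (reusing the placeholder bastion details from the examples above):
# group_vars/all.yml
ansible_ssh_common_args: '-o ProxyJump="someuser@jump-host:1234"'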
If you have more complex ssh config than you wish to encode in the inventory/command-line/group_vars, you can also instruct the ansible-invoked ssh to use a dedicated config file via the ansible_ssh_extra_args variable:
ansible-playbook -e '{"ansible_ssh_extra_args": "-F /path/to/special/ssh_config"}' ...
In my case, where I needed to access the hosts on particular ports, I just had to modify the host machine's ~/.ssh/config to be:
Host 10.40.45.102
    ForwardAgent yes
    User root
    ProxyCommand ssh -W %h:%p -p 44057 root@example.com

Host 10.40.45.104
    ForwardAgent yes
    User root
    ProxyCommand ssh -W %h:%p -p 44058 root@example.com
where 10.40.* are the internal IPs.

Cannot SSH Dataproc Master in Cluster

I've been trying to create a cluster in Dataproc using the Jupyter initialization script from the dataproc-initialization-actions repository.
But when I try to ssh to the master, in order to access the Jupyter interface, by running this command:
gcloud compute ssh --zone=zone_name \
--ssh-flag="-D 10000" \
--ssh-flag="-N" \
--ssh-flag="-n" "cluster1-m" &
I get the error:
Permission denied (publickey). ERROR: (gcloud.compute.ssh)
[/usr/bin/ssh] exited with return code [255].
I confirmed that all the ssh keys were created normally. I then tried this other option:
gcloud compute ssh --zone=zone_name \
--ssh-flag="-D 10000" \
--ssh-flag="-N" \
--ssh-flag="-n" "will#cluster1-m" &
Which seems to work, as I can ssh into the instance, but now I get the error:
bind: Cannot assign requested address
channel_setup_fwd_listener_tcpip: cannot listen to port: 10000
Could not request local forwarding.
For creating the cluster I used:
gcloud dataproc clusters create $CLUSTER_NAME \
--metadata "JUPYTER_PORT=8124,JUPYTER_CONDA_PACKAGES=numpy:pandas:scikit-learn:jinja2:mock:pytest:pytest-cov" \
--initialization-actions \
gs://dataproc-initialization-actions/jupyter/jupyter.sh \
--bucket $BUCKET_NAME
and I'm running this from a Docker image based on Debian 8.9 (jessie).
If you need any extra information please let me know.
If you've verified that you can do a normal SSH into the cluster, then the bind: Cannot assign requested address error probably means another SSH session on your current machine is already using local port 10000 for local port forwarding. When you see the bind error, first try a different local port, such as -D 12345. You can use top or your task manager to check whether a hanging ssh -D command is still running and occupying port 10000.
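For example, a quick way to check the port and retry (assuming lsof is available locally; 12345 is an arbitrary free local port):
lsof -i :10000    # shows any process already bound to local port 10000
gcloud compute ssh --zone=zone_name \
  --ssh-flag="-D 12345" \
  --ssh-flag="-N" \
  --ssh-flag="-n" "cluster1-m" &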

Ansible percent expand

I have an Ansible playbook which connects to a virtual machine via a non-standard ssh port (forwarded to localhost) and as a different user (vagrant) than the host user.
The ssh port is specified in the ansible inventory:
[vms]
localhost:2222
The username is given on the command line to ansible-playbook:
ansible-playbook -i <inventory from above> <some playbook> -u vagrant
The communication with the VM works correctly; however, %p always expands to 22 and %r to the host username.
Consequently, I cannot flush the SSH connection (for the user's changed group membership to take effect) like this:
- name: flush the ssh connection
  command: ssh -o ControlPath="~/.ansible/cp/ansible-ssh-%h-%p-%r" -O stop {{inventory_hostname}}
  delegate_to: 127.0.0.1
Am I making a silly mistake somewhere? Alternatively, is there a different way to flush the SSH connection?
The percent placeholders are not expanded by Ansible, but by ssh later on.
Sorry, I forgot to add the most important part.
Using
command: ssh -o ControlPath=[...] -O stop {{inventory_hostname}}
will use the default port, because you didn't specify it on the command line. You would also have to specify the port to "flush" the connection this way:
command: ssh -o ControlPath=[...] -O stop -p {{inventory_port}} {{inventory_hostname}}
But I don't think it is needed: Ansible should clean up the connections when the playbook ends, and I don't see any other reason to do that.
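If you do want the explicit flush, here is a minimal sketch of the combined task (my assumption: the non-standard port is exposed to the play as ansible_port, falling back to 22):
- name: flush the ssh connection
  command: >
    ssh -o ControlPath="~/.ansible/cp/ansible-ssh-%h-%p-%r"
    -O stop -p {{ ansible_port | default(22) }} {{ inventory_hostname }}
  delegate_to: 127.0.0.1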

gcloud compute ssh from one VM to another VM on Google Cloud

I am trying to ssh into a VM from another VM in Google Cloud using the gcloud compute ssh command. It fails with the below message:
/usr/local/bin/../share/google/google-cloud-sdk/./lib/googlecloudsdk/compute/lib/base_classes.py:9: DeprecationWarning: the sets module is deprecated
import sets
Connection timed out
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255]. See https://cloud.google.com/compute/docs/troubleshooting#ssherrors for troubleshooting hints.
I made sure the ssh keys are in place, but it still doesn't work. What am I missing here?
There is an assumption that you have connected to the externally-visible instance using SSH beforehand with gcloud.
From your local machine, start ssh-agent with the following command to manage your keys for you:
me@local:~$ eval `ssh-agent`
Call ssh-add to load the gcloud compute key from your local computer into the agent, so it is used for authentication in all SSH commands:
me@local:~$ ssh-add ~/.ssh/google_compute_engine
Log into an instance with an external IP address while supplying the -A argument to enable authentication agent forwarding.
gcloud compute ssh --ssh-flag="-A" INSTANCE
source: https://cloud.google.com/compute/docs/instances/connecting-to-instance#sshbetweeninstances.
I am not sure about the 'flags', because it's not working for me, but maybe I have a different OS or gcloud version and it will work for you.
Here are the steps I ran on my Mac to connect to the Google Dataproc master VM and then hop onto a worker VM from the master VM. I ssh'd to the master VM to get the IP.
$ gcloud compute ssh cluster-for-cameron-m
Warning: Permanently added '104.197.45.35' (ECDSA) to the list of known hosts.
I then exited. I enabled forwarding for that host.
$ nano ~/.ssh/config
Host 104.197.45.35
    ForwardAgent yes
I added the gcloud key.
$ ssh-add ~/.ssh/google_compute_engine
I then verified that it was added by listing the key fingerprints with ssh-add -l. I reconnected to the master VM and ran ssh-add -l again to verify that the keys were indeed forwarded. After that, connecting to the worker node worked just fine.
ssh cluster-for-cameron-w-0
About using SSH Agent Forwarding...
Because instances are frequently created and destroyed in the cloud, the (recreated) host fingerprint keeps changing. If the new fingerprint doesn't match the one in ~/.ssh/known_hosts, SSH automatically disables agent forwarding. The solution is:
$ ssh -A -o UserKnownHostsFile=/dev/null ...
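In practice that might look like the following (the user and host are placeholders; adding StrictHostKeyChecking=no is my assumption, to also suppress the interactive host-key prompt):
$ ssh -A -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no <user>@<host>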