Problems connecting one GCE instance to another via SSH - ssh

I am attempting to connect (via SSH) one GCE VM instance to another GCE VM instance (which will be referred to as Machine 1 and Machine 2 from now on).
So far I have generated (via ssh-keygen -t rsa -f ~/.ssh/ssh_key) a public and private key on Machine 1, and have added the contents of ssh_key.pub to the ~/.ssh/authorized_keys file on Machine 2.
However, whenever I try to connect via SSH using the following command:
gcloud compute ssh --project [PROJECT_ID] --zone [ZONE] [Machine_2_Name]
it simply times out (Connection timed out. ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].)
I have double-checked that each VM instance has plenty of disk space, that their firewall settings are permissive, and that OS Login is not enabled. I have read through the answer here, but nothing is working.
What am I doing wrong? How do I properly SSH from one GCE VM instance to another?

The problem I was having was that each VM was on a different network/sub-network with different firewall configurations. After recreating one of the VMs on the same network/sub-network as the other, I was able to easily SSH into one from the other via
username@machine1:~$ ssh machine2
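For reference, a minimal sketch of putting both instances on the same network and allowing SSH between them (the instance, network, and range values below are illustrative, not the ones from my setup):
# Create both VMs on the same VPC network/subnet
gcloud compute instances create machine1 machine2 \
    --zone us-central1-a \
    --network my-vpc --subnet my-subnet
# Allow SSH within the network's internal range (adjust to your subnet ranges)
gcloud compute firewall-rules create my-vpc-allow-internal-ssh \
    --network my-vpc \
    --allow tcp:22 \
    --source-ranges 10.128.0.0/9
With both instances on the same VPC network, Compute Engine's internal DNS resolves the instance name, which is why the plain ssh machine2 above works.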

I tested the same scenario on my side and got the same result you describe. Then I ran this command from inside machine 1 to debug the SSH process and narrow down the issue:
gcloud compute ssh YOUR_INSTANCE_NAME --zone ZONE --ssh-flag="-vvv"
Then I got this result:
debug1: connect to address 35.x.x.x port 22: Connection timed out
ssh: connect to host 35.x.x.x port 22: Connection timed out
So that means instance 1 is unable to reach the external IP address of instance 2 on port 22. I only added a new firewall rule and it worked.
If, after running the above command, you see a permission denied message instead, it means the public key was not copied to the destination machine (Machine 2) properly.
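For reference, the kind of rule that fixes this looks roughly like the following (a sketch, assuming the instances are on the default network; tighten --source-ranges for real use):
gcloud compute firewall-rules create default-allow-ssh \
    --network default \
    --allow tcp:22 \
    --source-ranges 0.0.0.0/0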

Related

GCP: how to use CLI to connect with SSH to newly created VM?

I think I'm missing one step in the script below.
The first time I run it, the VM gets created just fine, but the connection is refused. It continues to be refused even if I wait ten minutes after creating the VM.
However, if I use the GCP console to connect manually "Open in browser window", I get the message "Transferring SSH keys...", and the connection works. After this step, the script can connect fine.
What should I add to this script to get it to work without having to manually connect from the console?
#!/bin/bash
MY_INSTANCE="janne"
MY_TEMPLATE="dev-tf-nogpu-template"
HOME_PATH="/XXX/data/celeba/"
# Create instance
gcloud compute instances create $MY_INSTANCE --source-instance-template $MY_TEMPLATE
# Start instance
gcloud compute instances start $MY_INSTANCE
# Copy needed directories & files
gcloud compute scp ${HOME_PATH}src/ $MY_INSTANCE:~ --recurse --compress
gcloud compute scp ${HOME_PATH}save/ $MY_INSTANCE:~ --recurse --compress
gcloud compute scp ${HOME_PATH}pyinstall $MY_INSTANCE:~
gcloud compute scp ${HOME_PATH}gcpstartup.sh $MY_INSTANCE:~
# Execute startup script
gcloud compute ssh --zone us-west1-b $MY_INSTANCE --command "bash gcpstartup.sh"
# Connect over ssh
gcloud compute ssh --project XXX --zone us-west1-b $MY_INSTANCE
The full output of this script is:
(base) xxx@ubu-dt:/XXX/data/celeba$ bash gcpcreate.sh
Created [https://www.googleapis.com/compute/v1/projects/XXX/zones/us-west1-b/instances/janne].
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
janne us-west1-b n1-standard-1 XXX XXX RUNNING
Starting instance(s) janne...done.
Updated [https://compute.googleapis.com/compute/v1/projects/xxx/zones/us-west1-b/instances/janne].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
ssh: connect to host 34.83.3.161 port 22: Connection refused
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
Edit: adding gcloud version info
(base) bjorn@ubu-dt:/media/bjorn/data/celeba$ gcloud version
Google Cloud SDK 269.0.0
alpha 2019.10.25
beta 2019.10.25
bq 2.0.49
core 2019.10.25
gsutil 4.45
kubectl 2019.10.25
The solution I found is this: wait.
For OS login, SSH starts working about 20 seconds after the instance is started.
For non-OS login, it takes about a minute.
So I just added this after gcloud compute instances start $MY_INSTANCE
sleep 20s
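A slightly more robust variant (just a sketch, reusing the zone from the script) is to poll until sshd actually accepts connections instead of sleeping a fixed amount:
# Retry until gcloud compute ssh succeeds, up to ~5 minutes
for i in $(seq 1 30); do
    if gcloud compute ssh --zone us-west1-b "$MY_INSTANCE" --command "true" 2>/dev/null; then
        break
    fi
    sleep 10
done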
When you connect through Console it manages the keys for you.
Your last comment leads me to believe that when you connect from the console it generates an SSH key for you, and that is what allows the script to run afterwards. I would recommend taking a look at how to manage SSH keys in metadata and creating your own SSH key to access the instance through the SDK.
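For example, pushing your own public key to project-wide metadata through the SDK looks roughly like this (a sketch; the file name is an example, and the file should list every key you want to keep, one "USERNAME:ssh-rsa AAAA... comment" per line, because the ssh-keys value is replaced wholesale):
gcloud compute project-info add-metadata \
    --metadata-from-file ssh-keys=ssh_keys.txt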
If you cannot SSH directly through the SDK outside of the script either, then I assume it is for the same reason: the key generated from the console is not the one the SDK is using.
Also, please make sure that the service account has the correct permissions when using the SDK.
Let me know.

GCE VM cannot SSH to the new GCE VM it has just created in a different project

I'd like to solve the following problem using command line:
I'm trying to run the following PoC script from a GCE VM in project-a.
gcloud config set project project-b
gcloud compute instances create gce-vm-b --zone=us-west1-a
gcloud compute ssh --zone=us-west1-a gce-vm-b -- hostname
The VM is created successfully:
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
gce-vm-b us-west1-a n1-standard-16 10.12.34.56 12.34.56.78 RUNNING
But get the following error when trying to SSH:
WARNING: The public SSH key file for gcloud does not exist.
WARNING: The private SSH key file for gcloud does not exist.
WARNING: You do not have an SSH key for gcloud.
WARNING: SSH keygen will be executed to generate a key.
Generating public/private rsa key pair.
Your identification has been saved in /root/.ssh/google_compute_engine.
Your public key has been saved in /root/.ssh/google_compute_engine.pub.
The key fingerprint is:
...
Updating project ssh metadata...
.....................Updated [https://www.googleapis.com/compute/v1/projects/project-b].
done.
Waiting for SSH key to propagate.
ssh: connect to host 12.34.56.78 port 22: Connection timed out
ERROR: (gcloud.compute.ssh) Could not SSH into the instance. It is possible that your SSH key has not propagated to the instance yet. Try running this command again. If you still cannot connect, verify that the firewall and instance are set to accept ssh traffic.
Running gcloud compute config-ssh hasn't changed anything in the error message. It's still ssh: connect to host 12.34.56.78 port 22: Connection timed out
I've tried adding a firewall rule to the project:
gcloud compute firewall-rules create default-allow-ssh --allow tcp:22
Creating firewall...
...........Created [https://www.googleapis.com/compute/v1/projects/project-b/global/firewalls/default-allow-ssh].
done.
NAME NETWORK DIRECTION PRIORITY ALLOW DENY
default-allow-ssh default INGRESS 1000 tcp:22
The error is now Permission denied (publickey).
gcloud compute ssh --zone=us-west1-a gce-vm-b -- hostname
Pseudo-terminal will not be allocated because stdin is not a terminal.
Warning: Permanently added 'compute.4123124124324242' (ECDSA) to the list of known hosts.
Permission denied (publickey).
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
P.S. The project-a "VM" is a container run by Prow cluster (which is run by GKE).
"Permission denied (publickey)" means it is unable to validate the public key for the username.
You haven't specified the user in your command, so the user from the environment is selected and it may not be allowed into the instance gce-vm-b. Specify a valid user for the instance in your command according to the public SSH key metadata.
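A sketch (the username here is a placeholder; it must match an entry in the instance or project ssh-keys metadata):
gcloud compute ssh my-user@gce-vm-b --zone=us-west1-a -- hostname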

GCP- SSH connection timed out

I've been using SSH to connect to my Google Cloud Compute instance and it's been working fine. However, I left some code running on my instance and shut down my laptop. After turning it back on, I saw that the connection had dropped with a port 22: Broken pipe error. Since then, I haven't been able to SSH into my instance. I get this error each time:
ssh: connect to host <IP> port 22: Operation timed out
I'm new to SSH (just a data scientist trying to train some models on GCP..) and not sure how to proceed. Would appreciate any pointers. Thanks!
Check whether your public key is present in ~/.ssh/authorized_keys on the instance (ls -la ~/.ssh will show it). If it is, connect with ssh -i [PATH_TO_PRIVATE_KEY] [USERNAME]@[EXTERNAL_IP_ADDRESS]; if not, generate a key pair with ssh-keygen first and add the public key to the instance.
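Putting those checks together (a sketch; the key file name is an example and the bracketed values are placeholders):
# On the instance (e.g. via the browser-based SSH console), verify the key is present
ls -la ~/.ssh
cat ~/.ssh/authorized_keys
# From your local machine, connect with the matching private key
ssh -i [PATH_TO_PRIVATE_KEY] [USERNAME]@[EXTERNAL_IP_ADDRESS]
# If you have no key pair yet, generate one and add the .pub contents to authorized_keys
ssh-keygen -t rsa -f ~/.ssh/my_gce_key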

Permission Denied (public key)

I'm running a google cloud instance. I'm able to successfully connect to the instance via ssh.
But I'm not able to do the port forwarding to my localhost.
Here's the command I used:
ssh -L 16006:127.0.0.1:8080 username@instance_external_ip
When I run the above command , I get the following error
The authenticity of the host cannot be determined.
username@instance_external_ip : Permission Denied (public key)
How to solve this problem?
I found the answer to this question. The problem I had was that the server did not know about my SSH keys. So I did the following and it worked.
I deleted all the SSH keys on my local machine and connected to my gcloud instance using the following command. The gcloud command creates the SSH keys automatically and transfers them to the instance, so there is no need to copy and paste the keys manually.
gcloud compute --project "project_name" ssh --zone "zone_name" "instance_name"
After this I connected to my instance using ssh. If at this point you try to open the tunnel with a plain ssh -L ..., it will still say permission denied, because plain ssh does not present the key that gcloud generated.
Therefore, instead of connecting directly with ssh -L ..., pass the key file stored in the .ssh directory. Use the following command:
ssh -i ~/.ssh/google_compute_engine -L <local_port>:127.0.0.1:<remote_port> username@server_ip
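Alternatively, you can let gcloud open the tunnel itself, since it already knows which key it generated (a sketch using the same placeholder names and the ports from the question):
gcloud compute ssh "instance_name" --project "project_name" --zone "zone_name" --ssh-flag="-L 16006:127.0.0.1:8080"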

Creating Instances from Snapshots

I have an f1-micro instance, on which I've been testing Docker, created as follows:
$ gcloud compute instances create dockerbox \
--image container-vm-v20140731 \
--image-project google-containers \
--zone europe-west1-b \
--machine-type f1-micro
This all works fine.
I'm now in the process of upgrading to a larger Google Compute Engine VM. I've taken a snapshot of the f1-micro dockerbox, then used this as the boot source for the larger n1-standard-8 VM... this seems to create without problems, until I try to ssh onto it.
via the command line:
$ gcloud compute --project "secure-electron-631" ssh --zone "europe-west1-b" "me@biggerbox"
ssh: connect to host xx.xx.xx.xx port 22: Connection timed out
ERROR: (gcloud.compute.ssh) Your SSH key has not propagated to your instance yet. Try running this command again.
via the browser, ssh connection I get:
Connection Failed
We are unable to connect to the VM on port 22. Please check that the VM is healthy and the SSH server is running.
I've tried multiple times but get the same result.
I've confirmed that biggerbox is RUNNING; not sure about sshd.
OK, the problem seemed to stem from not detaching a mounted persistent disk from the micro instance when I took the snapshot. I detached and unmounted the PD volume, snapshotted the micro instance again, and based a new n1-standard-8 on it. Works OK now.
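For reference, the rough sequence for that flow looks like this (a sketch; the disk and snapshot names are examples, and the data volume should be unmounted inside the guest before detaching):
# Detach the extra persistent disk, snapshot the boot disk, then build the bigger VM from it
gcloud compute instances detach-disk dockerbox --disk my-data-disk --zone europe-west1-b
gcloud compute disks snapshot dockerbox --snapshot-names dockerbox-snap --zone europe-west1-b
gcloud compute disks create biggerbox-boot --source-snapshot dockerbox-snap --zone europe-west1-b
gcloud compute instances create biggerbox \
    --disk name=biggerbox-boot,boot=yes \
    --machine-type n1-standard-8 --zone europe-west1-b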
FYI, also handy for those troubleshooting GCE instance ssh:
https://github.com/GoogleCloudPlatform/compute-ssh-diagnostic-sh