SSH into Google Cloud VM instance not working using terminal, Cloud Shell, or browser window - ssh

I am new to Google Cloud. Recently I have not been able to SSH into my VM instance; previously it worked by clicking the SSH button, which opened a new browser window. I have tried several fixes based on internet searches, summarized below:
When I click the SSH button, it says "Transferring Keys to the VM" but never logs in and eventually times out.
Attempt 1: I followed another thread, Google VM Instance not opening with SSH, increased the persistent disk size, and reset the VM. Now it says "Could not connect, retrying..." when I click the SSH button and eventually times out.
Attempt 2: I also tried SSHing into the VM from Google Cloud Shell using the command
gcloud beta compute ssh --zone "myZONE" "VM_instance_name" --project "PROJECT_ID"
and got the following error:
Permission denied (publickey).
ERROR: (gcloud.beta.compute.ssh) [/usr/bin/ssh] exited with return code [255].
I created new keys using
sudo gcloud compute config-ssh
and it gave me this error
ERROR: (gcloud.compute.config-ssh) The project property is set to the empty string, which is invalid.
I ran the following and it seemed to work:
gcloud config set project myPROJECT_ID
Updated property [core/project].
But when I ran the sudo gcloud command again, it gave me the same empty-string error.
Attempt 3: I also tried setting up the login locally using the google-cloud-sdk. I followed the interactive installation instructions in Using the Google Cloud SDK installer. I tried obtaining the SSH key using
gcloud compute project-info describe --project myPROJECT_ID
and copied it into the SSH keys for the VM through the Cloud Console website. I also tried
sudo gcloud compute config-ssh
which seemed to work and gave the following
Updating project ssh metadata...⠼Updated [https://...].
Updating project ssh metadata...done.
You should now be able to use ssh/scp with your instances.
For example, try running:
$ ssh VMinstance.myzone.PROJECT_ID
When I try running the ssh command, I get the following error:
ssh: Could not resolve hostname VMinstance.myzone.PROJECT_ID: nodename nor servname provided, or not known
My instance has "enable connecting to serial ports" activated.
Any assistance is greatly appreciated.
Thank you in advance.
Update: for the VM instance in question, I clicked on serial port 1 (console), and here are the first 100 lines:
serialport: Connected to PROJECT_ID.Zone.VMInstance port 1 (session ID: ##..., active connections: 1).
Jul 2 12:19:02 INSTANCE google-accounts: INFO Removing user root.
Jul 2 12:19:02 INSTANCE google-accounts: INFO Removing user root from the Google sudoers group.
[1##.##8] google_accounts_daemon[822]: Removing user root from group google-sudoers
Jul 2 12:19:02 INSTANCE google_accounts_daemon[822]: Removing user root from group google-sudoers
[1##.##0] google_accounts_daemon[822]: gpasswd: /etc/group.####: No space left on device
Jul 2 12:19:02 INSTANCE google_accounts_daemon[822]: gpasswd: /etc/group.####: No space left on device
[1##.##6] google_accounts_daemon[822]: gpasswd: cannot lock /etc/group; try again later.
Jul 2 12:19:02 INSTANCE google_accounts_daemon[822]: gpasswd: cannot lock /etc/group; try again later.
Jul 2 12:19:02 INSTANCE google-accounts: WARNING Could not update user root. Command '['gpasswd', '-d', 'root', 'google-sudoers']' returned non-zero exit status 1..
Jul 2 12:19:02 INSTANCE google-accounts: ERROR Exception calling the response handler. [Errno 2] No usable temporary directory found in ['/tmp', '/var/tmp', '/usr/tmp', '/'].#012Traceback (most recent call last):#012 File "/usr/lib/python3/dist-packages/google_compute_engine/metadata_watcher.py", line 200, in WatchMetadata#012 handler(response)#012 File "/usr/lib/python3/dist-packages/google_compute_engine/accounts/accounts_daemon.py", line 285, in HandleAccounts#012 self.utils.SetConfiguredUsers(desired_users.keys())#012 File "/usr/lib/python3/dist-packages/google_compute_engine/accounts/accounts_utils.py", line 318, in SetConfiguredUsers#012 mode='w', prefix=prefix, delete=True) as updated_users:#012 File "/usr/lib/python3.6/tempfile.py", line 681, in NamedTemporaryFile#012 prefix, suffix, dir, output_type = _sanitize_params(prefix, suffix, dir)#012 File "/usr/lib/python3.6/tempfile.py", line 269, in _sanitize_params#012 dir = gettempdir()#012 File "/usr/lib/python3.6/tempfile.py", line 437, in gettempdir#012 tempdir = _get_default_tempdir()#012 File "/usr/lib/python3.6/tempfile.py", line 372, in _get_default_tempdir#012 dirlist)#012FileNotFoundError: [Errno 2] No usable temporary directory found in ['/tmp', '/var/tmp', '/usr/tmp', '/']
Jul 2 12:19:41 INSTANCE systemd[1]: snapd.service: Start operation timed out. Terminating.
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: sed: couldn't flush /etc/ssh/sedYpckqQ: No space left on device
[1##.##1] google_accounts_daemon[822]: sed: couldn't flush /etc/ssh/sedYpckqQ: No space left on device
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedeXXx0O: No space left on device
[1##.##4] google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedeXXx0O: No space left on device
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sed27Z7HO: No space left on device
[1##.##2] google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sed27Z7HO: No space left on device
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sed3bBBFO: No space left on device
[1##.##6] google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sed3bBBFO: No space left on device
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedUKduxP: No space left on device
[1##.##3] google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedUKduxP: No space left on device
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedhtixlP: No space left on device
[1##.##0] google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedhtixlP: No space left on device
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedy6zVAS: No space left on device
[1##.##9] google_accounts_daemon[822]: sed: couldn't flush /etc/pam.d/sedy6zVAS: No space left on device
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: Restarting optional services.
[1##.##7] google_accounts_daemon[822]: Restarting optional services.
Jul 2 12:20:03 INSTANCE systemd[1]: Stopping Login Service...
Jul 2 12:20:03 INSTANCE systemd[1]: Stopped Login Service.
Jul 2 12:20:03 INSTANCE systemd[1]: Starting Login Service...
Jul 2 12:20:03 INSTANCE systemd[1]: Started Login Service.
Jul 2 12:20:03 INSTANCE systemd[1]: Stopping Regular background program processing daemon...
Jul 2 12:20:03 INSTANCE systemd[1]: Stopped Regular background program processing daemon.
Jul 2 12:20:03 INSTANCE systemd[1]: Started Regular background program processing daemon.
Jul 2 12:20:03 INSTANCE google_accounts_daemon[822]: Restarting SSHD
[1##.##5] google_accounts_daemon[822]: Restarting SSHD
Jul 2 12:20:03 INSTANCE systemd[1]: Stopping OpenBSD Secure Shell server...
Jul 2 12:20:03 INSTANCE systemd[1]: Stopped OpenBSD Secure Shell server.
Jul 2 12:20:03 INSTANCE systemd[1]: Starting Google Compute Engine Instance Setup...
Jul 2 12:20:03 INSTANCE instance-setup: WARNING [Errno 28] No space left on device
Jul 2 12:20:03 INSTANCE instance-setup: INFO Running google_set_multiqueue.
Jul 2 12:20:03 INSTANCE instance-setup: INFO Setting /proc/irq/31/smp_affinity_list to 0 for device virtio1.
Jul 2 12:20:03 INSTANCE instance-setup: INFO /proc/irq/31/smp_affinity_list: real affinity 0
Jul 2 12:20:03 INSTANCE instance-setup: INFO Setting /proc/irq/32/smp_affinity_list to 0 for device virtio1.
Jul 2 12:20:03 INSTANCE instance-setup: INFO /proc/irq/32/smp_affinity_list: real affinity 0
Jul 2 12:20:03 INSTANCE instance-setup: INFO /usr/bin/google_set_multiqueue: line 139: echo: write error: No such file or directory
Jul 2 12:20:03 INSTANCE instance-setup: INFO cat: /sys/class/net/ens4/queues/tx-0/xps_cpus: No such file or directory
Jul 2 12:20:03 INSTANCE instance-setup: INFO Queue 0 XPS=/sys/class/net/ens4/queues/tx-0/xps_cpus for
Jul 2 12:20:03 INSTANCE instance-setup: WARNING [Errno 28] No space left on device
Jul 2 12:20:03 INSTANCE systemd[1]: Started Google Compute Engine Instance Setup.
Jul 2 12:20:03 INSTANCE systemd[1]: Starting OpenBSD Secure Shell server...
Jul 2 12:20:03 INSTANCE systemd[1]: Started OpenBSD Secure Shell server.
Jul 2 12:20:03 INSTANCE systemd[1]: Stopping OpenBSD Secure Shell server...
Jul 2 12:20:03 INSTANCE systemd[1]: Stopped OpenBSD Secure Shell server.
Jul 2 12:20:03 INSTANCE systemd[1]: Starting Google Compute Engine Instance Setup...
Jul 2 12:20:04 INSTANCE instance-setup: WARNING [Errno 28] No space left on device
Jul 2 12:20:04 INSTANCE instance-setup: INFO Running google_set_multiqueue.
Jul 2 12:20:04 INSTANCE instance-setup: INFO Setting /proc/irq/31/smp_affinity_list to 0 for device virtio1.
Jul 2 12:20:04 INSTANCE instance-setup: INFO /proc/irq/31/smp_affinity_list: real affinity 0
Jul 2 12:20:04 INSTANCE instance-setup: INFO Setting /proc/irq/32/smp_affinity_list to 0 for device virtio1.
Jul 2 12:20:04 INSTANCE instance-setup: INFO /proc/irq/32/smp_affinity_list: real affinity 0
Jul 2 12:20:04 INSTANCE instance-setup: INFO /usr/bin/google_set_multiqueue: line 139: echo: write error: No such file or directory

If your instance is running and you can restart it, you may try logging in via the serial console. This method is independent of any firewall / network settings :)
Just add a user with a password: create a startup script like the one below (it will create a user and add it to the google-sudoers group so you can do everything with this account):
#! /bin/bash
# create the user non-interactively, set its password, and grant sudo via google-sudoers
adduser --disabled-password --gecos "" sudouser
echo 'sudouser:userspass' | chpasswd
usermod -aG google-sudoers sudouser
Then connect to the serial console and log in, either using the console window or Cloud Shell: gcloud compute connect-to-serial-port instance-name
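For reference, one way to attach such a script is via instance metadata; this is a sketch, assuming the script above is saved locally as startup.sh and the VM is named instance-name:
gcloud compute instances add-metadata instance-name --metadata-from-file startup-script=startup.sh
gcloud compute instances reset instance-name   # reset so the script runs at boot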
This will allow you to log in, unless there's something really wrong with your VM.
But - judging from your description - you want regular SSH access.
For that, make sure that (a quick verification sketch follows this list):
your firewall rules don't block port 22 to this machine
the firewall on the VM in question allows traffic on port 22
the ethernet interface is up in the VM
the SSH server is listening on port 22 (ps aux | grep sshd)
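One way to verify each point; the firewall-rules command runs in Cloud Shell, the rest on the VM once you have a serial-console shell:
gcloud compute firewall-rules list           # look for a rule allowing tcp:22
sudo iptables -L -n | grep -w 22             # OS firewall on the VM
ip link show                                 # the interface should be UP
sudo ss -tlnp | grep ':22'                   # sshd should be listening here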
Then check whether your VM has an external IP; if not, you will only be able to connect to it via another VM that has one, or via Cloud Shell (and of course the serial console as described above).
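You can check the external IP from Cloud Shell; a sketch, using the names from the question:
gcloud compute instances describe VM_instance_name --zone myZONE --format='get(networkInterfaces[0].accessConfigs[0].natIP)'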
However - since you got this Permission denied (publickey) message, for some reason your SSH keys don't work (corrupted, deleted, etc.). When you log in via the serial console, check whether they're there. If not, you can add them manually.
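For example, once you have a serial-console shell (the key below is a placeholder for your own public key):
cat ~/.ssh/authorized_keys                   # check which keys the account accepts
echo 'ssh-rsa AAAA... you@yourhost' >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys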
----------- UPDATE ------------
Look at this line: Jul 2 12:19:02 INSTANCE google_accounts_daemon[822]: gpasswd: /etc/group.####: No space left on device - it says that your VM ran out of disk space. Stop your VM, resize its disk first with gcloud compute disks resize mydiskname --size=100GB --zone=us-central1-a, and start it again. After the start, the VM's partition will be resized.
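Put together, the sequence might look like this (substitute your own instance name, disk name, size, and zone):
gcloud compute instances stop VM_instance_name --zone=us-central1-a
gcloud compute disks resize mydiskname --size=100GB --zone=us-central1-a
gcloud compute instances start VM_instance_name --zone=us-central1-a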

Related

Redis service is not starting

I built Redis from source and installed it on my system following this DigitalOcean guide. But after running
$ sudo systemctl status redis
I get this failed status report.
● redis.service - Redis In-Memory Data Store
   Loaded: loaded (/etc/systemd/system/redis.service; disabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2018-04-03 01:51:54 +0530; 1s ago
  Process: 24974 ExecStart=/usr/local/bin/redis-server /etc/redis/redis.conf (code=exited, status=203/EXEC)
 Main PID: 24974 (code=exited, status=203/EXEC)
systemd[1]: redis.service: Unit entered failed state.
systemd[1]: redis.service: Failed with result 'exit-code'.
systemd[1]: redis.service: Service hold-off time over, scheduling restart.
systemd[1]: Stopped Redis In-Memory Data Store.
systemd[1]: redis.service: Start request repeated too quickly.
systemd[1]: Failed to start Redis In-Memory Data Store.
systemd[1]: redis.service: Unit entered failed state.
systemd[1]: redis.service: Failed with result 'exit-code'.
My system is Ubuntu 17.10 x64
Check by typing this:
sudo /usr/local/bin/redis-server /etc/redis/redis.conf
Running it by hand will print the error directly and tell you what's wrong (status=203/EXEC means systemd could not execute the binary at all).
I always have this kind of trouble. Usually I run the install script from the utils directory of the source package and the problem is solved.
sudo /tmp/redis-stable/utils/install_server.sh
I find this approach works well.
Could you please tell me what happens when you execute the following command? redis-server
Please don't forget to give more information about your redis.conf file, because the root of the problem could be there, and about the Redis logs too; you can add the following line to your config file to get some error logs.
logfile /path/to/my/log/file.log
After that, restart or reload the service to get additional information.
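For instance, assuming the service unit is named redis as above:
sudo systemctl restart redis
tail -f /path/to/my/log/file.log             # watch for startup errors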
I hope this information helps you!
I had the same error, and I solved the problem by fixing dir in redis.conf (that is the directory where dump.rdb is written).
dir /some/directory
That way dump.rdb is located in a directory the Redis user has permission to write to.
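A minimal sketch, assuming the service runs as a redis user and you choose /var/lib/redis as the data directory:
sudo mkdir -p /var/lib/redis
sudo chown redis:redis /var/lib/redis
# then in redis.conf:
#   dir /var/lib/redis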

DCOS navstar service failed to start on agent nodes

I'm setting up DC/OS on dev servers, and one of the agent nodes is failing to run the navstar service:
# journalctl -u dcos-navstar -b
Mar 18 13:45:15 localhost.localdomain systemd[1]: Starting Navstar: A distributed systems & network overlay orchestration engine...
Mar 18 13:45:15 localhost.localdomain check-time[5868]: Checking whether time is synchronized using the kernel adjtimex API.
Mar 18 13:45:15 localhost.localdomain check-time[5868]: Time can be synchronized via most popular mechanisms (ntpd, chrony, systemd-timesyncd, etc.)
Mar 18 13:45:15 localhost.localdomain check-time[5868]: Time is in sync!
Mar 18 13:45:15 localhost.localdomain ping[5870]: ping: ready.spartan: Name or service not known
Mar 18 13:45:15 localhost.localdomain systemd[1]: dcos-navstar.service: control process exited, code=exited status=2
Mar 18 13:45:15 localhost.localdomain systemd[1]: Failed to start Navstar: A distributed systems & network overlay orchestration engine.
The ntpd service is installed and running (the service is active). Time synchronization with ntpd works fine. Please advise.
Check that port 123 is open and not blocked by iptables or another firewall. Or try using chrony as the service to synchronize the system clock with NTP servers (it is more accurate and has more features than ntpd).
For CentOS:
yum install chrony
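A sketch of the full switch on CentOS, assuming ntpd is currently enabled:
sudo systemctl stop ntpd && sudo systemctl disable ntpd
sudo yum install -y chrony
sudo systemctl enable chronyd && sudo systemctl start chronyd
chronyc tracking                             # verify the clock is synchronizing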
I had the same trouble with DC/OS, but not only navstar.service - metronome.service also failed (same time-sync issue). I spent lots of time searching for the root of the problem. Finally I migrated to chrony and the problem disappeared.
For long-running tasks use Marathon; for one-time or cron-style tasks use Chronos. You simply use the REST API to place and manage your tasks on DC/OS through the frameworks mentioned above. And I recommend you use containers. You can read more here: micro-services at DC/OS

How to authenticate an LDAP user and log in on the server via GUI - it should log in on the server directly

I am new to system administration. My problem: in my department there are 30 first-year students and 30 second-year students, divided into two groups, say group1 and group2, who need to log in as LDAP users via the Ubuntu 14.04 GUI from any system connected to the LAN. Every user's home directory should be created on the server side and mounted at GUI login on Ubuntu 14.04, and no user should be able to access anyone else's home directory.
[I don't want to authenticate the user against the LDAP server and create the home directory on the local machine; instead I want a central directory on the server side, so that it looks like logging in to the server.]
Server side: Ubuntu 14.04
I tried this and it works fine for me.
Client side: Ubuntu 14.04
I tried this, and it also works,
but the issue is that this tutorial creates the home directory on the local machine instead of mounting the server directory. I know where it does this.
What I want: if I log in as an LDAP user, it should log in on the server via the GUI, not into a home directory on the local machine.
On the client side, the file /var/log/auth.log shows:
Jul 28 11:53:06 issc systemd-logind[674]: System is rebooting.
Jul 28 11:53:23 issc systemd-logind[650]: New seat seat0.
Jul 28 11:53:23 issc systemd-logind[650]: Watching system buttons on /dev/input/event1 (Power Button)
Jul 28 11:53:23 issc systemd-logind[650]: Watching system buttons on /dev/input/event4 (Video Bus)
Jul 28 11:53:23 issc systemd-logind[650]: Watching system buttons on /dev/input/event0 (Power Button)
Jul 28 11:53:24 issc sshd[833]: Server listening on 0.0.0.0 port 22.
Jul 28 11:53:24 issc sshd[833]: Server listening on :: port 22.
Jul 28 11:53:25 issc lightdm: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared object file: No such file or directory
Jul 28 11:53:25 issc lightdm: PAM adding faulty module: pam_kwallet.so
Jul 28 11:53:25 issc lightdm: pam_unix(lightdm-greeter:session): session opened for user lightdm by (uid=0)
Jul 28 11:53:25 issc systemd-logind[650]: New session c1 of user lightdm.
Jul 28 11:53:25 issc systemd-logind[650]: Linked /tmp/.X11-unix/X0 to /run/user/112/X11-display.
Jul 28 11:53:26 issc lightdm: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared object file: No such file or directory
Jul 28 11:53:26 issc lightdm: PAM adding faulty module: pam_kwallet.so
Jul 28 11:53:26 issc lightdm: pam_succeed_if(lightdm:auth): requirement "user ingroup nopasswdlogin" not met by user "scicomp"
Jul 28 11:53:29 issc lightdm: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared object file: No such file or directory
Please help me; I have tried many tutorials online and every tutorial looks the same, like this one. I have been trying for the last 2 weeks and it's not working. Thank you for your time.
You need to install and configure autofs for this to work. autofs will automatically mount users' home directories on the client machine from an NFS server. I'm not sure about creating them on the server on the fly, but if it does work, you will likely need to enable the pam_mkhomedir module in the appropriate /etc/pam.d file(s), as described here. A configuration sketch follows.
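Purely as a sketch - the server IP and export path are taken from the answer below, and the map names are the conventional ones:
# /etc/auto.master - hand /home over to autofs
/home   /etc/auto.home   --timeout=60
# /etc/auto.home - wildcard map: mount each user's directory on demand
*   -fstype=nfs,rw   198.1.10.45:/ldap/batch2015part1/home/&
# /etc/pam.d/common-session - create missing home directories at login
session required pam_mkhomedir.so skel=/etc/skel umask=0077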
Yep! I tried it and it worked for me.
**Server side:** Package required to install:
$ sudo apt-get install nfs-kernel-server
I updated the file below like this:
abdulrahim@issc-ldap:/ldap/batch2016part2$ sudo vi /etc/exports
#/homes 198.1.10.*(fsid=0,rw,insecure,no_subtree_check,sync)
/ldap/batch2015part1/home 198.1.10.*(fsid=1,rw,insecure,no_subtree_check,sync)
/ldap/batch2015part2/home 198.1.10.*(fsid=2,rw,insecure,no_subtree_check,sync)
Exported as below:
abdulrahim@issc-ldap:/ldap/batch2016part2$ sudo exportfs -r
root@issc-ldap:/ldap/rnd# showmount -e 198.1.10.45
Export list for 198.1.10.45:
/ldap/batch2015part1/home
/ldap/batch2015part2/home
**On the client side:** Package required to install:
$ sudo apt-get install nfs-common
Now on the client side - mount, permissions, ownership:
$ sudo gedit /etc/fstab
# below are partitions mounted from the server
198.1.10.45:/ldap/batch2015part1/home /ldap/batch2015part1/home nfs nfsvers=3,sync 0 3
198.1.10.45:/ldap/batch2015part2/home /ldap/batch2015part2/home nfs nfsvers=3,sync 0 4
# or like this below
198.1.10.45:/ldap/batch2015part1/home /ldap/batch2015part1/home nfs noauto,x-systemd.automount 0 3
198.1.10.45:/ldap/batch2015part2/home /ldap/batch2015part2/home nfs noauto,x-systemd.automount 0 4
Now mount all partitions from the server side as below:
$ sudo mount -a
Check the mounted partitions with the command below:
$ df -h

Cannot connect to Google Compute Engine instance via SSH in browser

I cannot connect to GCE via SSH. It shows Connection Failed, and I am unable to connect to the VM on port 22.
And the serial console output shows:
Jul 8 10:09:26 Instance sshd[10103]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Jul 8 10:09:27 Instance sshd[10103]: User username from 0.0.0.0 not allowed because not listed in AllowUsers
Jul 8 10:09:27 Instance sshd[10103]: input_userauth_request: invalid user username [preauth]
Jul 8 10:09:27 Instance sshd[10103]: Connection closed by 0.0.0.0 [preauth]
Yesterday it was working fine, but today it shows this error. I am new to GCE. Any suggestions?
UPDATE
I'd like to post this update to mention that in June 2016 a new feature was released that lets you enable interactive access to the serial console, so you can more easily troubleshoot instances that are not booting properly or that are otherwise inaccessible. See Interacting with the Serial Console for more information.
-----------------------------------------------------------------------------------
It looks like you've added AllowUsers in /etc/ssh/sshd_config configuration file.
To resolve this issue, you'll need to attach the boot disk of your VM instance to a healthy instance as the second disk. Mount it, edit the configuration file and fix the issue.
Here are the steps you can take to resolve the issue:
First of all, take a snapshot of your instance's disk; in case loss or corruption happens, you can recover your disk.
In the Developers Console, click on your instance. Uncheck Delete boot disk when instance is deleted and then delete the instance. The boot disk will remain under "Disks", and now you can attach it to another instance. You can also do this step using the gcloud command:
$ gcloud compute instances delete NAME --keep-disks all
Now attach the disk to a healthy instance as an additional disk. You can do this through the Developers Console or using the gcloud command:
$ gcloud compute instances attach-disk EXAMPLE-INSTANCE --disk DISK --zone ZONE
SSH into your healthy instance.
Determine where the secondary disk lives:
$ ls -l /dev/disk/by-id/google-*
Mount the disk:
$ sudo mkdir /mnt/tmp
$ sudo mount /dev/disk/by-id/google-persistent-disk-1-part1 /mnt/tmp
Where google-persistent-disk-1 is the name of the disk
Edit the sshd_config configuration file, remove the AllowUsers line, and save it:
$ sudo nano /mnt/tmp/etc/ssh/sshd_config
Now unmount the disk:
$ sudo umount /mnt/tmp
Detach it from the VM instance. This can be done through the Developers Console or using the command below:
$ gcloud compute instances detach-disk EXAMPLE-INSTANCE --disk DISK
Now create a new instance using your fixed boot disk.
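This can be done through the Developers Console, or with something along these lines (the new instance name is a placeholder):
$ gcloud compute instances create FIXED-INSTANCE --disk name=DISK,boot=yes --zone ZONE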

User gets instantly disconnected after connection successful on a chrooted SSH

I configured a jail with Chroot in SSH following this tutorial.
I found another question on Stack Overflow dealing with the same problem; however, the answers didn't work for me either.
The auth.log file contains the following:
Mar 16 18:36:06 *** sshd[30509]: Accepted password for thenewone from x.x.x.x port 49583 ssh2
Mar 16 18:36:06 *** sshd[30509]: pam_unix(sshd:session): session opened for user thenewone by (uid=0)
Mar 16 18:36:07 *** sshd[30509]: lastlog_openseek: Couldn't stat /var/log/lastlog: No such file or directory
Mar 16 18:36:07 *** sshd[30509]: lastlog_openseek: Couldn't stat /var/log/lastlog: No such file or directory
Mar 16 18:36:07 *** sshd[30509]: pam_unix(sshd:session): session closed for user thenewone
My sshd_config file contains the following:
Match User thenewone
ChrootDirectory /home/thenewone
AllowTcpForwarding no
X11Forwarding no
My /home/thenewone directory is owned by root:root and contains the chrooted system (all files but /home/thenewone/home/thenewone owned by root:root)
I don't understand why the connection succeeds and then simply closes.
Problem found: some binary dependencies were missing, even for the shell associated with the chrooted account...
The shell failed to load --> disconnection!
If you are experiencing the same trouble as mine, use ldd <binary> to find all the dependencies needed in the chroot jail.
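A sketch of how that lookup-and-copy might be scripted, assuming the jail is at /home/thenewone and the account's shell is /bin/bash:
#!/bin/bash
# copy a binary and every shared library ldd reports into the jail
JAIL=/home/thenewone
BIN=/bin/bash
mkdir -p "$JAIL$(dirname "$BIN")"
cp -v "$BIN" "$JAIL$BIN"
for lib in $(ldd "$BIN" | grep -o '/[^ ]*'); do
    mkdir -p "$JAIL$(dirname "$lib")"
    cp -v "$lib" "$JAIL$lib"
done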