How to prevent "connection refused" with SOCKS on dataproc? - ssh

I'm trying to set up a SOCKS connection to my Dataproc Spark cluster following the Google Jupyter guide, but after launching the browser (Chrome) I keep getting "connection refused" errors:
channel 4: open failed: connect failed: Connection refused
channel 5: open failed: connect failed: Connection refused
channel 6: open failed: connect failed: Connection refused
channel 7: open failed: connect failed: Connection refused
channel 8: open failed: connect failed: Connection refused
channel 5: open failed: connect failed: Connection refused
channel 5: open failed: connect failed: Connection refused
channel 5: open failed: connect failed: Connection refused
channel 16: open failed: connect failed: Connection refused
channel 16: open failed: administratively prohibited: open failed
channel 17: open failed: administratively prohibited: open failed
channel 18: open failed: administratively prohibited: open failed
channel 16: open failed: connect failed: Connection refused
channel 18: open failed: connect failed: Connection refused
channel 18: open failed: connect failed: Connection refused
channel 18: open failed: connect failed: Connection refused
This happens with both --proxy-server="socks5://localhost:1080" and --proxy-server="socks5://127.0.0.1:1080"

I'm not 100% sure where the "administratively prohibited" messages come from, but in my experience they are false alarms: I see those open failed: administratively prohibited: open failed messages even when my SOCKS proxy is functioning correctly.
As for the actual problem: due to the way things like YARN bind their web services, I've gotten Connection refused when trying to access the YARN UI at http://localhost:8088 instead of http://<master-hostname>:8088. This matches the behavior of running wget inside the cluster:
dhuo@dhuo-jupyter-m:~$ wget http://localhost:8124
...
Saving to: ‘index.html.13’
index.html.13 100%[=============================================================================================================================================================================>] 11.41K --.-KB/s in 0s
2016-07-15 23:26:25 (222 MB/s) - ‘index.html.13’ saved [11686/11686]
dhuo@dhuo-jupyter-m:~$ wget http://localhost:8088
--2016-07-15 23:26:28-- http://localhost:8088/
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:8088... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:8088... failed: Connection refused.
dhuo@dhuo-jupyter-m:~$ wget http://`hostname`:8124
...
Saving to: ‘index.html.14’
index.html.14 100%[=============================================================================================================================================================================>] 11.41K --.-KB/s in 0s
2016-07-15 23:26:34 (260 MB/s) - ‘index.html.14’ saved [11686/11686]
dhuo@dhuo-jupyter-m:~$ wget http://`hostname`:8088
...
Saving to: ‘index.html.15’
index.html.15 100%[=============================================================================================================================================================================>] 10.81K --.-KB/s in 0s
2016-07-15 23:26:37 (248 MB/s) - ‘index.html.15’ saved [11067/11067]
As you can see, this differs from the Jupyter behavior (I ran Jupyter on port 8124): the Jupyter webapp binds so that localhost:8124 works on the master, while YARN on 8088 does not. Since, with the linked instructions, name resolution happens on the master, the browser resolves hosts the same way wget does on the node you've tunneled into.
So if you just use your master's hostname instead of localhost it should work.
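For example, a sketch of the tunnel-plus-browser launch from the linked guide, pointed at the master's hostname rather than localhost (the cluster name mycluster-m and the zone are hypothetical; adjust them to your cluster; the --host-resolver-rules flag makes Chrome resolve hostnames through the SOCKS proxy, i.e. on the master):

```shell
# Open the SOCKS tunnel to the Dataproc master (runs in the background).
gcloud compute ssh mycluster-m --zone us-central1-a -- -D 1080 -N &

# Launch Chrome with a separate profile, proxying all traffic and DNS
# lookups through the tunnel, and point it at the master's hostname.
google-chrome \
  --proxy-server="socks5://localhost:1080" \
  --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" \
  --user-data-dir=/tmp/mycluster-m-profile \
  "http://mycluster-m:8088"
```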

Related

GCP: how to use CLI to connect with SSH to newly created VM?

I think I'm missing one step in the script below.
The first time I run it, the VM gets created just fine, but the connection is refused. It continues to be refused even if I wait ten minutes after creating the VM.
However, if I use the GCP console to connect manually "Open in browser window", I get the message "Transferring SSH keys...", and the connection works. After this step, the script can connect fine.
What should I add to this script to get it to work without having to manually connect from the console?
#!/bin/bash
MY_INSTANCE="janne"
MY_TEMPLATE="dev-tf-nogpu-template"
HOME_PATH="/XXX/data/celeba/"
# Create instance
gcloud compute instances create $MY_INSTANCE --source-instance-template $MY_TEMPLATE
# Start instance
gcloud compute instances start $MY_INSTANCE
# Copy needed directories & files
gcloud compute scp ${HOME_PATH}src/ $MY_INSTANCE:~ --recurse --compress
gcloud compute scp ${HOME_PATH}save/ $MY_INSTANCE:~ --recurse --compress
gcloud compute scp ${HOME_PATH}pyinstall $MY_INSTANCE:~
gcloud compute scp ${HOME_PATH}gcpstartup.sh $MY_INSTANCE:~
# Execute startup script
gcloud compute ssh --zone us-west1-b $MY_INSTANCE --command "bash gcpstartup.sh"
# Connect over ssh
gcloud compute ssh --project XXX --zone us-west1-b $MY_INSTANCE
The full output of this script is:
(base) xxx@ubu-dt:/XXX/data/celeba$ bash gcpcreate.sh
Created [https://www.googleapis.com/compute/v1/projects/XXX/zones/us-west1-b/instances/janne].
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
janne us-west1-b n1-standard-1 XXX XXX RUNNING
Starting instance(s) janne...done.
Updated [https://compute.googleapis.com/compute/v1/projects/xxx/zones/us-west1-b/instances/janne].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
ssh: connect to host 34.83.3.161 port 22: Connection refused
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
ssh: connect to host 34.83.3.161 port 22: Connection refused
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
Edit: adding gcloud version info
(base) bjorn@ubu-dt:/media/bjorn/data/celeba$ gcloud version
Google Cloud SDK 269.0.0
alpha 2019.10.25
beta 2019.10.25
bq 2.0.49
core 2019.10.25
gsutil 4.45
kubectl 2019.10.25
The solution I found is this: wait.
For OS login, SSH starts working about 20 seconds after the instance is started.
For non-OS login, it takes about a minute.
So I just added this after gcloud compute instances start $MY_INSTANCE:
sleep 20s
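A fixed sleep can still race against slow boots. A sketch of polling until SSH actually accepts a command instead (the gcloud invocation in the usage comment reuses the instance name and zone from the script above):

```shell
#!/bin/bash
# retry_until DELAY CMD...: re-run CMD until it succeeds, sleeping DELAY
# seconds between attempts. Returns once CMD exits 0.
retry_until() {
  local delay="$1"; shift
  until "$@"; do
    echo "not ready yet, retrying in ${delay}s..." >&2
    sleep "$delay"
  done
}

# Hypothetical usage with the script's variables:
# retry_until 5 gcloud compute ssh --zone us-west1-b "$MY_INSTANCE" --command true
```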
When you connect through Console it manages the keys for you.
Your last comment leads me to believe that connecting from the console generates an SSH key, which is what then lets the script run. I would recommend taking a look at how to manage SSH keys in metadata and creating your own SSH key to connect through the SDK.
If you cannot SSH directly through the SDK outside of the script either, I assume it is for the same reason: the generated key.
Also please make sure that when using the SDK the service account has the correct permissions.
Let me know.
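A sketch of registering a key yourself, so gcloud can connect without the console's "Transferring SSH keys..." step (the key path and metadata-file name are illustrative; which variant applies depends on whether the project uses OS Login):

```shell
# If the project uses OS Login, register your public key with your account:
gcloud compute os-login ssh-keys add --key-file="$HOME/.ssh/id_rsa.pub"

# Otherwise, attach the key to the instance metadata. The file must contain
# lines in the form USERNAME:PUBLIC_KEY.
# gcloud compute instances add-metadata "$MY_INSTANCE" \
#   --metadata-from-file ssh-keys="$HOME/.ssh/gcp_ssh_keys.txt"
```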

SSH port-forwarding works on one port but not on the other

I am trying this command
ssh username@example -L 27017:10.230.0.6:27017 -L 9201:10.290.0.8:9200 -L 5601:10.210.0.5:5601
Port forwarding works for 27017 but not the others; do I need to override the ports?
I always get the same error which is:
channel 8: open failed: connect failed: Connection timed out
channel 7: open failed: connect failed: Connection timed out
ssh username@example ... -L 9201:10.290.0.8:9200 -L 5601:10.210.0.5:5601
...
channel 8: open failed: connect failed: Connection timed out
When you connect to port 9201 or 5601 on your local system, that connection is tunneled through your ssh link to the remote ssh server. From there, the ssh server makes a TCP connection to the target of the tunnel (10.290.0.8:9200 or 10.210.0.5:5601) and relays data between the tunneled connection and the connection to that target.
The "Connection timed out" error is coming from the remote ssh server when it tries to make the TCP connection to the target of the tunnel. "Connection timed out" means that the ssh server process transmitted a TCP connection request to the target system, and it never received a response.
Common reasons for a connection timeout include:
The target system is down or disconnected from the network.
Some firewall or other network device is blocking traffic between the ssh server and the target system.
The IP address and/or port is incorrect, and the connection attempts are going to the wrong place.
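To narrow down which of these applies, test reachability from the ssh server itself. A small sketch using bash's built-in /dev/tcp pseudo-device (so it works even where nc isn't installed); the target addresses in the usage comments are taken from the question's -L options:

```shell
#!/bin/bash
# check_port HOST PORT: attempt a TCP connection with a 5-second timeout
# and print "open" or "closed/filtered".
check_port() {
  local host="$1" port="$2"
  if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed/filtered"
  fi
}

# Run these on the remote ssh server:
# check_port 10.230.0.6 27017
# check_port 10.290.0.8 9200
# check_port 10.210.0.5 5601
```

A refused connection fails immediately, while a firewalled one hits the 5-second timeout, which itself helps distinguish the firewall case from the wrong-address case.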

packer: ssh communicator ignores "ssh_port"

I am working on building a VirtualBox VM with type "virtualbox-iso" and OpenSUSE 42.3 as the guest OS. I am specifying the host and port that the ssh communicator should use during the build, but it looks like Packer is ignoring the port specification.
I am overwriting the default settings for host and port in my "builders" section. This is an excerpt of my json file:
"builders": [
{
"type": "virtualbox-iso",
"communicator": "ssh",
"ssh_host": "192.168.1.5",
"ssh_port": "22",
"ssh_username": "some_user",
"ssh_password": "some_password",
"ssh_timeout": "20m",
"ssh_handshake_attempts": "1000",
Packer is unable to connect to the VM because packer is ignoring the port I am providing with "ssh_port".
This is the debug output (enabled with PACKER_LOG=1):
2019/06/10 15:10:10 packer: 2019/06/10 15:10:10 [INFO] Waiting 1s
2019/06/10 15:10:11 ui: ==> opensuse-master-box: Using ssh communicator to connect: 192.168.1.5
2019/06/10 15:10:11 packer: 2019/06/10 15:10:11 [INFO] Waiting for SSH, up to timeout: 20m0s
2019/06/10 15:10:11 ui: ==> opensuse-master-box: Waiting for SSH to become available...
2019/06/10 15:10:26 packer: 2019/06/10 15:10:26 [DEBUG] TCP connection to SSH ip/port failed: dial tcp 192.168.1.5:4240: i/o timeout
2019/06/10 15:10:31 packer: 2019/06/10 15:10:31 [DEBUG] TCP connection to SSH ip/port failed: dial tcp 192.168.1.5:4240: connect: connection refused
2019/06/10 15:10:36 packer: 2019/06/10 15:10:36 [DEBUG] TCP connection to SSH ip/port failed: dial tcp 192.168.1.5:4240: connect: connection refused
2019/06/10 15:10:41 packer: 2019/06/10 15:10:41 [DEBUG] TCP connection to SSH ip/port failed: dial tcp 192.168.1.5:4240: connect: connection refused
Is this expected behavior or am I doing something wrong?
This is because of how VirtualBox NAT networks work. From the host you can't reach the guest VM directly, so Packer solves this by setting up a port forwarding rule: a random port between ssh_host_port_min and ssh_host_port_max on the host is forwarded to the guest VM's ssh_port.
If you want to turn this off, set ssh_skip_nat_mapping to true, but then you have to ensure a network setup in which Packer can reach the guest directly.
It is because you are doing "22" instead of 22. The config is looking for an int, not a string.
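Combining both answers, the relevant part of the builder might look like this (a sketch; the IP is carried over from the question and assumes a host-only or bridged network the host can actually reach):

```json
{
  "type": "virtualbox-iso",
  "communicator": "ssh",
  "ssh_host": "192.168.1.5",
  "ssh_port": 22,
  "ssh_skip_nat_mapping": true
}
```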

SSH works fine with .ssh/config, but fabric breaks

I have an .ssh/config:
Host host01 host01.in.mynet
User costello
HostName 1.2.3.4
Port 22222
Which is working fine with plain ssh:
ssh host01
costello@host01 ~ »
But fabric is not using that config:
$ fab deploy:host=host01
[host01] Executing task 'deploy'
Fatal error: Low level socket error connecting to host host01 on port 22: Connection refused (tried 1 time)
Underlying exception:
Connection refused
Aborting.
Why is fabric not using the ssh configuration? I would really like to avoid duplicating the configuration for fabric or, even worse, changing the ssh port of my server.

Jenkins could not start Windows slave

When I'm trying to start a Jenkins slave (Windows), I get this error:
[07/07/15 12:54:15] [SSH] Opening SSH connection to pcskala:22105.
[07/07/15 12:54:15] [SSH] Authentication successful.
[07/07/15 12:54:15] [SSH] The remote users environment is:
Unable to execute command or shell on remote system: Failed to Execute process.
[07/07/15 12:54:15] [SSH] Starting sftp client.
[07/07/15 12:54:15] [SSH] SFTP failed. Copying via SCP.
hudson.util.IOException2: Could not copy slave.jar into 'c:\Users\jenkins' on slave
at hudson.plugins.sshslaves.SSHLauncher.copySlaveJarUsingSCP(SSHLauncher.java:1065)
at hudson.plugins.sshslaves.SSHLauncher.copySlaveJar(SSHLauncher.java:1024)
at hudson.plugins.sshslaves.SSHLauncher.access$300(SSHLauncher.java:133)
at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:709)
at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:696)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Sorry, this connection is closed.
at com.trilead.ssh2.transport.TransportManager.ensureConnected(TransportManager.java:587)
at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:660)
at com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:572)
at com.trilead.ssh2.Session.<init>(Session.java:42)
at com.trilead.ssh2.Connection.openSession(Connection.java:1129)
at com.trilead.ssh2.Connection.exec(Connection.java:1551)
at hudson.plugins.sshslaves.SSHLauncher.copySlaveJarUsingSCP(SSHLauncher.java:1048)
... 8 more
Caused by: java.net.SocketException: Socket closed
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:121)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at com.trilead.ssh2.crypto.cipher.CipherOutputStream.flush(CipherOutputStream.java:75)
at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:193)
at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:107)
at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:677)
at com.trilead.ssh2.channel.ChannelManager.closeChannel(ChannelManager.java:304)
at com.trilead.ssh2.Session.close(Session.java:565)
at com.trilead.ssh2.Connection.exec(Connection.java:1568)
at hudson.plugins.sshslaves.SSHLauncher.reportEnvironment(SSHLauncher.java:1071)
at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:704)
... 5 more
[07/07/15 12:54:15] Launch failed - cleaning up connection
[07/07/15 12:54:15] [SSH] Connection closed.
The Windows slave is running SSH and SFTP servers. The keys should be set up fine, because I can connect to that machine via ssh and copy files via scp and sftp without entering a password (from a Unix system). I apologize that I can't post any more info, but I do not have direct access to the slave's configuration in Jenkins.
I don't know if this is important, but slave.jar file is located in home folder: C:\Users\jenkins.
I have found a thread in the Jenkins issue tracker where someone had a similar issue and resolved it by changing permissions on /dev/null, but I don't think (just a thought) that this would be the reason here.
Do you have any ideas how to fix this?
Thank you