I have some files to be uploaded to an SFTP server, so I use JSch to accomplish this goal.
I have these options for implementation:
JSch opens one session and one channel
JSch opens one session and multiple channels
Which of the two is more efficient?
Does one session correspond to a TCP connection, or does one channel correspond to a TCP connection?
If one session corresponds to a TCP connection, then multiple channels must share the same TCP connection. Can that actually be more efficient?
One SSH session corresponds to one TCP connection. A channel is just a virtual "connection" within the one SSH/TCP connection.
As you have rightly assumed, it can hardly be more efficient to use multiple channels.
The option to use multiple channels is there for flexibility, not for efficiency (imo).
Actually using multiple channels can be less efficient.
It depends on how efficiently the SSH parties implement SSH flow control (the sliding window), compared to the efficiency of TCP flow control (which will usually be super-optimized).
Some SFTP clients, when they know that only one channel will be opened, deliberately set client-side SSH window to a huge number, to leave the flow control to TCP (expecting it to be more efficient).
Also, PuTTY-based SFTP clients (like psftp or WinSCP) announce to the server that they will only ever use one channel (using a proprietary simple@putty.projects.tartarus.org message), so that the server can also opt to leave flow control to TCP. Not that I know of any SSH server that actually takes advantage of this.
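The session-vs-channel relationship can be pictured outside SSH entirely. The sketch below multiplexes two logical "channels" over one connected socket pair, which is essentially what SSH does inside its single TCP connection. The framing (channel id + length prefix) is invented for illustration and is not the real SSH wire format:

```python
import socket
import struct

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed")
        data += chunk
    return data

def send_on_channel(sock, channel_id, payload):
    # One shared connection; each frame carries the id of the virtual
    # channel it belongs to, so many channels interleave on one socket.
    sock.sendall(struct.pack("!II", channel_id, len(payload)) + payload)

def recv_frame(sock):
    channel_id, length = struct.unpack("!II", recv_exact(sock, 8))
    return channel_id, recv_exact(sock, length)

a, b = socket.socketpair()  # stands in for the single SSH/TCP connection
send_on_channel(a, 1, b"SFTP: put file1.gz")  # "channel 1"
send_on_channel(a, 2, b"SFTP: put file2.gz")  # "channel 2"
print(recv_frame(b))  # (1, b'SFTP: put file1.gz')
print(recv_frame(b))  # (2, b'SFTP: put file2.gz')
a.close(); b.close()
```

Note that both channels still share the one socket's bandwidth and TCP flow control, which is why adding channels does not by itself add throughput.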
I'm trying to use "scp" to copy TB-sized files, which is fine, until whatever router or other issue throws a tantrum and drops my connections (lost packets or unwanted RSTs or whatever).
# scp user@rmt1:/home/user/*z .
user@rmt1's password:
log_backups_2019_02_09_07h44m14.gz
16% 6552MB   6.3MB/s   1:27:46 ETA
client_loop: send disconnect: Broken pipe
lost connection
It occurs to me that (if ssh doesn't already support this) it should be possible for something at each end of the connection to simply reconnect with its peer when "stuff goes wrong", and to transparently just bloody handle it (basically, retry and reconnect indefinitely).
Anyone know the solution?
My "normal" way of tunnelling remote machines into a local connection is of course ssh itself; catch-22: that's the very thing that's breaking, so I can't use it here...
SSH uses TCP, and TCP is generally designed to be relatively fault-tolerant, with retries for dropped packets, acknowledgements, and other techniques to overcome occasional network problems.
If you're seeing dropped connections nevertheless, then you are seeing excessive network problems, more than any standard protocol can be expected to handle, or you are seeing a malicious attacker intentionally try to disrupt the connection, which cannot be avoided. Those are both issues that no reasonable network protocol can overcome, and so you're going to have to deal with them. That's true whether you're using SSH or some other protocol.
You could try using SFTP instead of SCP, because SFTP supports resuming (e.g., put -a), but that's the best that's going to be possible. You can also try a command like lftp, which may have more scripting possibilities to copy and retry (e.g., mirror --continue --loop), and can also use SFTP under the hood.
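The resume behaviour that `put -a` relies on is simple to picture: skip the bytes the destination already has and append the rest. A minimal local sketch of that logic (paths and chunk size are arbitrary; a real SFTP resume does the same thing over the wire):

```python
import os

def resume_copy(src_path, dst_path, chunk_size=64 * 1024):
    # Skip whatever the destination already has, then append the remainder;
    # this is the idea behind resumed SFTP transfers.
    already = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(already)
        while True:
            block = src.read(chunk_size)
            if not block:
                break
            dst.write(block)
```

An interrupted transfer can then be retried in a loop until source and destination sizes match, which is effectively what `lftp mirror --continue --loop` automates.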
Your best bet is to find out what the network problem is and get that fixed. mtr may be helpful for finding where your packet loss is.
At the low level of the SSL/TLS protocol there are 4 types of messages:
Handshake Protocol
ChangeCipherSpec Protocol
Alert Protocol
Application Data Protocol
After the handshake is completed and the symmetric keys have been exchanged, the client will send Application Data messages to the server.
However, the same server can handle multiple clients, and each of those clients has its own symmetric key.
Does the server keep the connection open with all of the clients? If not, how does the server know which symmetric key to use for an incoming connection? Does the Application Data Protocol provide some sort of session id that the server can use to map to the right key?
Does the server keep the connection open with all of the clients?
It can, depending on how the server is implemented.
how does the server know what symmetric key to use for an incoming connection?
Imagine you have a multiplayer game. Your server code typically looks something like this:
sock = socket.listen(port)
clients = []
if sock.new_client_waiting:
    new_client = sock.accept()
    clients.append(new_client)
for client in clients:
    if client.new_data_waiting:
        data = client.read()
        # handle incoming actions...
So if there are two clients, the server will just know about both and have a socket object for both. Therein lies your answer: the OS (its TCP stack) handles the concept of connections and by providing you a socket, you can read/write from and to that socket, and you know from which clients it originates (to some degree of certainty anyway).
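The pseudocode above can be made concrete with Python's selectors module. The OS hands the server one socket per TCP connection, and per-client state (for TLS, that would include the negotiated keys) simply lives in a dict keyed by that socket; no session id in the data is needed. Addresses and payloads below are arbitrary:

```python
import selectors
import socket

sel = selectors.DefaultSelector()
server = socket.socket()
server.bind(("127.0.0.1", 0))  # any free port
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ)

state = {}  # socket -> per-connection state (e.g. that client's keys)

# Two clients connect over two distinct TCP connections:
c1 = socket.create_connection(server.getsockname())
c2 = socket.create_connection(server.getsockname())
c1.sendall(b"hello from c1")
c2.sendall(b"hello from c2")

received = []
while len(received) < 2:
    for key, _ in sel.select(timeout=5):
        if key.fileobj is server:
            conn, addr = server.accept()  # a brand-new socket per client
            conn.setblocking(False)
            state[conn] = {"peer": addr}  # this client's private state
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = key.fileobj.recv(4096)
            if data:
                # The socket itself tells us which client (and which state).
                received.append((state[key.fileobj]["peer"], data))

for s in (c1, c2, server):
    s.close()
```

Each accepted socket identifies its client, so the lookup "which key belongs to this data?" is answered by the connection it arrived on.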
Many servers work differently, e.g. web server code is much more like this:
sock = socket.listen(port)
while True:  # loop forever...
    client = sock.accept()  # this call blocks until a client is available
    spawn_new_http_handler(client)
Whenever a new person connects, a worker thread will pick it up and manage things from there. But still, it will have its socket to read from and write to.
Does Application Data Protocol provide some sort of session id that the server can use to map to the right key?
I do not know these specs by heart, but I am pretty sure the answer is no. Session resumption is done earlier, at the handshake stage, and is meant for returning clients. E.g. if I connected to https://example.com 30 minutes ago and return now, it might still have my session, and we don't need to do the whole handshake again. It doesn't have anything to do with telling clients apart.
Background: The default setting for MaxStartups in OpenSSH is 10:30:60, and most Linux distributions keep this default. That means there can be only 10 ssh connections at a time that are exchanging keys and authenticating before sshd starts dropping 30% of new incoming connections, and at 60 unauthenticated connections, all new connections will be dropped. Once a connection is set up, it doesn't count against this limit. See e.g. this question.
Problem: I'm using GNU parallel to run some heavy data processing on a large number of backend nodes. I need to access those nodes through a single frontend machine, and I'm using ssh's ProxyCommand to set up a tunnel to transparently access the backends. However, I'm constantly hitting the maximum unauthenticated connection limit because parallel is spawning more ssh connections than the frontend can authenticate at once.
I've tried to use ControlMaster auto to reuse a single connection to the frontend, but no luck.
Question: How can I limit the rate at which new ssh connections are opened? Could I control how many unauthenticated connections there are open at a given time, and delay new connections until another connection has become authenticated?
I think we need a 'spawn at most this many jobs per second per host' option for GNU Parallel. It would probably make sense to have the default work for hosts with MaxStartups = 10:30:60, fast CPUs, but with 500 ms latency.
Can we discuss it on parallel@gnu.org?
Edit:
--sshdelay was implemented in version 20130122.
Using ControlMaster auto still sounds like the way to go. It shouldn't hit MaxStartups, since it keeps a single connection open (and opens sessions on that connection). In what way didn't it work for you?
Other relevant settings that might prevent ControlMaster from working, given your ProxyCommand setup are ControlPath:
ControlPath %r@%h:%p - name the socket {user}@{host}:{port}
and ControlPersist:
ControlPersist yes - persists initial connection (even if closed) until told to quit (-O exit)
ControlPersist 1h - persist for 1 hour
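Putting those options together, a ~/.ssh/config sketch for the proxied setup might look like the following (host names are placeholders for your frontend and backends; the socket path is one common choice). With a persistent master to the frontend, every backend's ProxyCommand hop reuses one already-authenticated connection, so only the backend authentications count against the frontend's MaxStartups:

```
Host backend*
    ProxyCommand ssh -W %h:%p frontend
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 1h

Host frontend
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 1h
```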
From what I understand IMAP requires a connection per each user. I'm writing an IMAP client (currently just gmail) that supports many (100s, 1000s maybe 10000s+) users at a time. Obviously cutting down the number of open connections would be great. I'm wondering if it's possible to use thread pooling on my side to connect to gmail via IMAP or if that simply isn't supported by the IMAP protocol.
IMAP typically uses SSL over TCP/IP. And a TCP/IP connection will need to be maintained per IMAP client connection, meaning that there will be many simultaneous open connections.
These multiple simultaneous connections can easily be maintained in a non-threaded (single-thread) implementation without affecting the state of the TCP connections. You'll need some sort of flow concept per IMAP TCP/IP connection, and store all of the flows in a container (a C++ STL map, for instance) using the TCP/IP five-tuple (or socket fd) as a key. For each data packet received, look up the flow and handle the packet accordingly. There is nothing about this approach that will affect the TCP or IMAP connections.
Considering that this will work in a single-thread environment, adding a thread pool will only increase the throughput of the application, since you can handle data packets for several flows simultaneously (assuming it's a multi-core CPU). You will just need to make sure that two threads don't handle data packets for the same flow at the same time, which could cause the packets to be handled out of order. One approach could be to assign a group of flows per thread, maybe using IP pools or something similar.
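The flow-table idea from the answer can be sketched as below (five-tuples and per-flow fields are made up for illustration; real code would parse IMAP instead of buffering bytes):

```python
# One entry per IMAP/TCP connection, keyed by its five-tuple. Every packet
# is routed to its own flow's state, so flows never interfere even though
# one thread (or a pool) serves them all.
flows = {}

def handle_packet(five_tuple, payload):
    flow = flows.setdefault(five_tuple, {"pending": []})
    flow["pending"].append(payload)  # stand-in for real IMAP handling
    return len(flow["pending"])

# Two interleaved client connections to the same server port:
f1 = ("10.0.0.1", 50001, "192.0.2.10", 993, "tcp")
f2 = ("10.0.0.2", 50002, "192.0.2.10", 993, "tcp")
handle_packet(f1, b"a1 LOGIN user1 ...")
handle_packet(f2, b"a1 LOGIN user2 ...")
handle_packet(f1, b"a2 SELECT INBOX")
```

With a thread pool, hashing the five-tuple to pick a worker keeps each flow on a single thread, which preserves per-flow packet order without locks.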
When I use ssh to log in to a remote server and open vim, the session times out if I don't type anything, and I have to log in again.
But if I run a command like top, the session never times out.
What's the reason?
Note that the behavior you're seeing isn't related to vim or to top. Chances are good some router along the way is culling "dead" TCP sessions. This is often done by a NAT firewall or a stateful firewall to reduce memory pressure and protect against simple denial of service attacks.
Probably the ServerAliveInterval configuration option can keep your idle-looking sessions from being reaped:
ServerAliveInterval
Sets a timeout interval in seconds after which if no
data has been received from the server, ssh(1) will
send a message through the encrypted channel to request
a response from the server. The default is 0,
indicating that these messages will not be sent to the
server, or 300 if the BatchMode option is set. This
option applies to protocol version 2 only.
ProtocolKeepAlives and SetupTimeOut are Debian-specific
compatibility aliases for this option.
Try adding ServerAliveInterval 180 to your ~/.ssh/config file. This sends a keepalive probe every three minutes, which should be more frequent than most firewall timeouts.
vim will just sit there waiting for input, and (unless you've got a clock or something on the terminal screen) will also produce no output. If this continues for long, most firewalls will see the connection as dead and kill it, since there's no activity.
top, by comparison, updates the screen every few seconds, which counts as activity, so the connection is kept open: there IS data flowing over it on a regular basis.
There are options you can add to the SSH server's configuration to send timed "null" packets to keep a connection alive, even though no actual user data is going across the link: http://www.howtogeek.com/howto/linux/keep-your-linux-ssh-session-from-disconnecting/
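On the server side (assuming OpenSSH), the relevant sshd_config options are ClientAliveInterval and ClientAliveCountMax; the values below are example choices, not defaults:

```
# /etc/ssh/sshd_config
ClientAliveInterval 180   # probe the client after 3 minutes of silence
ClientAliveCountMax 3     # drop the session after 3 unanswered probes
```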
Because "top" is constantly sending data through your SSH session, the session will remain active.
"vim" will not, because it is static and only transmits data in response to your key presses.
The lack of transferred data causes the SSH session to time out.