I am testing the performance of the scp command. I want to minimize the overhead of establishing the TCP/SSH connection that scp makes for each transfer.
How can I open the first ssh connection and reuse it over time?
Thanks for your help.
// I should have mentioned that one way to achieve this is to zip the files and send them all at once, but that only works when all the files are already available. Let's assume instead that the files are generated in a streaming fashion on the source side, and I want to send each one as soon as possible after it is generated.
// Please refer to this link for the answer I found (How To Reuse SSH Connection To Speed Up Remote Login Process): http://www.cyberciti.biz/faq/linux-unix-reuse-openssh-connection/
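// In case that link goes away: the approach it describes is OpenSSH connection multiplexing. A minimal sketch of the ~/.ssh/config entry involved (host pattern, socket path, and timeout are placeholders):
Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 10m
// Create the socket directory first (mkdir -p ~/.ssh/sockets). The first ssh/scp call to a host opens the master connection; later scp calls reuse it instead of opening a new TCP/SSH session.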
If you are copying so many tiny files that the connection overhead comes into play, you could try tar'ing everything up on the fly and sending that instead.
Try something like this:
tar zcvf - data | ssh user@server "cat > data.tar.gz"
You can also drop the z if compression isn't desired or helpful.
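If you actually want the files unpacked on the remote side rather than stored as an archive there, a variant of the same idea (the destination path is a placeholder) is:
tar zcf - data | ssh user@server "tar zxf - -C /path/to/destination"
Either way only a single SSH connection is opened for the whole transfer.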
I want to programmatically download a file from a remote machine.
So, I know the host's IP and port:
Login data:
I also know that it creates an SSH tunnel.
Any suggestions? Is it even possible knowing just that data?
My knowledge on that topic is very scarce.
My answer focuses on SSH usage. In order to download a file via SSH, you need to run the scp command, like
scp yourusername@server.url:/the/path/to/the/file.extension ./
That's enough to download the file. However, it may not work by itself. First, the remote machine needs to know your SSH public key, so on the server run
vim ~/.ssh/authorized_keys
press insert, and paste your public SSH key at the end of the file. Don't remove anything. If it still doesn't work, SSH might be disabled on the server and you will need to enable it. Example for Ubuntu: https://linuxize.com/post/how-to-enable-ssh-on-ubuntu-18-04/
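As a shortcut, assuming the standard OpenSSH helper is installed on your machine, ssh-copy-id does that append for you:
ssh-copy-id yourusername@server.url
It prompts for your password once and installs your public key into ~/.ssh/authorized_keys on the server.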
Your user needs access to the file you want to download, otherwise this won't work.
Alternatively, you could set up an SFTP connection and use that.
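For example, a quick interactive SFTP session (same host and path as above) could look like:
sftp yourusername@server.url
get /the/path/to/the/file.extension
exit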
How would you, in Ansible, make one remote node connect to another remote node?
My goal is to copy a file from remote node a to remote node b and untar it on the target; however, one of the files is extremely large.
So doing it the normal way (fetch to the controller, copy from the controller to remote b, then unarchive) is unacceptable. Ideally, I would run something like this from _remote_a_:
ssh remote_b cat filename | tar -x
This is to speed things up. I can use the shell module to do this; however, my main problem is that this way I lose Ansible's handling of SSH connection parameters. I have to manually pass an SSH private key (if any), or a password in a non-interactive way, or whatever else, to _remote_b_. Is there any better way to do this without multiple copies?
Also, doing it over SSH is a requirement in this case.
Update/clarification: Actually, I know how to do this from a shell, and I could do the same in Ansible. I was just wondering whether there is a better, more Ansible-like way to do it. The file in question is really large. The main problem is that when Ansible executes commands on remote hosts, I can configure everything in the inventory. But in this case, if I wanted a similar level of configurability/flexibility for the parameters of that manually established SSH connection, I would have to write it from scratch (maybe even as an Ansible module) or something similar. Otherwise, for example, just running ssh hostname command would require passwordless login or a default private key; I wouldn't be able to change the private key path used in the inventory without adding it manually, and the ssh connection plugin actually has two different variables that may be used to set a private key.
This looks more like a shell question than an Ansible one.
If the two nodes cannot talk to each other, you can do:
ssh remote_a cat file | ssh remote_b tar xf -
If they can talk (one of the nodes can connect to the other), you can tell one remote node to connect to the other, like:
ssh remote_b 'ssh remote_a cat file | tar xf -'
(The quoting may be wrong; running ssh under ssh is sometimes confusing.)
In this last case you will probably need to enter a password or properly set up public/private SSH keys.
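For example, if you need to pin specific private keys rather than relying on the defaults, you can pass them with -i; note that the inner path refers to a key stored on remote_b (both paths are placeholders):
ssh -i ~/.ssh/remote_b_key remote_b 'ssh -i ~/.ssh/remote_a_key remote_a cat file | tar xf -'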
I need an SSH client capable of reconnecting if the connection breaks and, on reconnection, of getting the password from a file or web address.
Thank you.
Looking around, there is autossh.
Rely on autossh to reconnect broken connections (if everything else is right).
Then you may fetch files or execute any other commands.
Alternatively, you can run a script that loops and checks the port, something like this.
Or you may simply use scp to fetch the required file.
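For example (a sketch, assuming key-based auth for the persistent connection and the separate sshpass utility for the password-from-a-file case; host and paths are placeholders):
autossh -M 0 -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" user@host
sshpass -f /path/to/password.txt scp user@host:/remote/path/file ./
The first command keeps a session alive and lets autossh reconnect it when it drops; the second fetches a file non-interactively, reading the password from a local file.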
I need to transfer one file from an old host with no SSH access to a new host on which I do have SSH access. I'm having a hard time figuring this out and am looking for a simple answer if there is one. I'm also trying to avoid the slow upload times from my local machine, hence the need for a server-to-server transfer.
Are you able to use FTP? You could use that to transfer the files.
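For example, assuming the old host exposes the file over FTP and you know the credentials (all placeholders here), you could pull it straight onto the new host with:
wget ftp://olduser:oldpassword@old-host.example.com/path/to/file.zip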
If you have the URL of the file you want to move from your old host, you can use the wget command in your SSH terminal. You can use this for any file extension, or folders if you want.
For example, if you want to move http://www.yourhost.com/file.zip to your new host, you would SSH in, change to the folder you want to download it to, and type:
wget http://www.yourhost.com/file.zip
I have a bash file that contains wget commands to download over 100,000 files totalling around 20 GB of data.
The bash file looks something like:
wget http://something.com/path/to/file.data
wget http://something.com/path/to/file2.data
wget http://something.com/path/to/file3.data
wget http://something.com/path/to/file4.data
And there are exactly 114,770 rows of this. How reliable would it be to SSH into a server I have an account on and run this? Would my SSH session time out eventually? Would I have to stay logged in over SSH the entire time? What if my local computer crashed or got shut down?
Also, does anyone know roughly how much of the server's resources this would take? Am I crazy to want to do this on a shared server?
I know this is a weird question, just wondering if anyone has any ideas. Thanks!
Use
nohup ./scriptname &> logname.log &
This will ensure that:
The process will continue even if the SSH session is interrupted
You can monitor it while it is running
I would also recommend printing a progress message at regular intervals; it will be good for log analysis, e.g. echo "1000 files copied".
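For example, instead of running the generated file line by line, a small wrapper loop (a sketch, assuming the URLs have been extracted into a plain list called urls.txt) prints such a message every 1000 downloads:
count=0
while read -r url; do
    wget "$url"
    count=$((count + 1))
    if [ $((count % 1000)) -eq 0 ]; then
        echo "$count files copied"
    fi
done < urls.txt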
As far as resource utilisation is concerned, it depends entirely on the system and mainly on network characteristics. Theoretically, you can calculate the time with just data size and bandwidth, but in real life delays, latency, and data loss come into the picture.
So make some assumptions, do some maths, and you'll get the answer :)
It depends on the reliability of the communication medium, hardware, ...!
You can use screen to keep it running while you disconnect from the remote computer.
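For example (the session name is just a placeholder):
screen -S downloads
./scriptname
Detach with Ctrl-a d, log out, and later reattach with screen -r downloads to check on it.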
You want to disconnect the script from your shell and have it run in the background (using nohup), so that it continues running when you log out.
You also want to have some kind of progress indicator, such as a log file that records every file that was downloaded, as well as all the error messages. By default, nohup redirects stdout and stderr into a file (nohup.out) if you don't redirect them yourself.
With such a file, you can pick up broken downloads and aborted runs later on.
Give it a test-run first with a small set of files to see if you got the command down and like the output.
I suggest you detach it from your shell with nohup.
$ nohup ./myLongRunningScript.sh > script.stdout 2> script.stderr &
$ exit
The script will run to completion - you don't need to be logged in throughout.
Do check for any options you can give wget to make it retry on failure.
If possible, generate MD5 checksums for all of the files and use them to check whether everything was transferred correctly.
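For example (a sketch; the retry values are arbitrary and the checksum file is assumed to exist):
wget --tries=5 --waitretry=10 --continue http://something.com/path/to/file.data
md5sum -c checksums.md5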
Start it with
nohup ./scriptname &
and you should be fine.
Also, I would recommend logging the progress so that you can find out where it stopped, if it does.
wget url >> logfile.log 2>&1
could be enough.
To monitor progress live you could:
tail -f logfile.log
It may be worth looking at an alternative technology, like rsync. I've used it on many projects and it works very, very well.
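For example, a typical invocation (paths are placeholders) that can resume partial transfers and show progress:
rsync -avz --partial --progress user@server:/path/to/files/ ./files/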