I'm trying to run rsync over ssh in parallel to transfer files between two machines for evaluation purposes. I wanna see how faster can I get compared to a single rsync process.
I tried these two solutions:
https://wiki.ncsa.illinois.edu/display/~wglick/Parallel+Rsync but with no great success.
https://gist.github.com/rcoup/5358786 (I couldn't make it work)
Based on the first link I run a command like this:
ssh HOST "mkdir -p ~/destdir/basefolder"
cd ./basefolder; ls | xargs -n1 -P 4 -I% rsync -arvuz -e ssh % HOST:~/destdir/basefolder/.
and I get the files transfered, but it doesn't seem to work well... In this case, It will run a process for every file and folder in the basefolder, but when it finds a folder, it will transfer everything inside that folder using only 1 process.
I tried to use find -type f, but I got problems because I loose the file hierarchy.
Does anyone how some methods to do what I want? (Use rsync in parallel over ssh while keeping files and folders hierarchy).
Since you tagged your question 'gnu-parallel' the obvious is to refer you to http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync
cd src-dir; find . -type f -size +100000 | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\;rsync -Havessh {} fooserver:/dest-dir/{}
Related
After making a full forensic copy of a harddrive using dd, I would like to keep up with changes between the original and backup harddisc, therefore I started using rsync.
Whenever, I run
sync -a -v -n --progress /media/drive1 /media/drive2
the command would start listing all files contained in drive1. However, only a couple of them has changed after I did DD.
Trying that on a single folder
sync -a -v -n --progress /media/drive1/folder /media/drive2
works fine and just displays the new files in that folder - those which are not contained in /media/drive2/folder.
However, executing the command on the level of both volumes
sync -a -v -n --progress /media/drive1 /media/drive2
does not account for the differentials, contrary to the documentation which is everywhere available, but takes all files which are already on both drives.
What is my mistake?
The way rsync treats its source and destination paths is easy to get wrong. When you use the command:
sync -a -v -n --progress /media/drive1 /media/drive2
...it tries to sync the drive1 folder into drive2; that is, it creates and populates /media/drive2/drive1. When you add "/folder" to the source path, it works as expected because then it's trying to sync with /media/drive2/folder, which is what you want.
Fortunately, the solution is easy: add "/" to the end of the source path, which tells it to sync the contents of drive1 into drive2, rather than the folder itself:
sync -a -v -n --progress /media/drive1/ /media/drive2
BTW, I'd recommend adding --dry-run to make sure it's doing what you want before running it "for real". You'll probably also have to delete /media/drive2/drive1.
I have about a thousand files on a remote server (all in different directories). I would like to scp them to my local machine. I would not want to run scp command a thousand times in a row, so I have created a text file with a list of file locations on the remote server. It is a simple text file with a path on each line like below:
...
/iscsi/archive/aat/2005/20050801/A/RUN0010.FTS
/iscsi/archive/aat/2006/20060201/A/RUN0062.FTS
/iscsi/archive/aat/2013/20130923/B/RUN0010.FTS
/iscsi/archive/aat/2009/20090709/A/RUN1500.FTS
...
I have searched and found someone trying to do a similar but not the same thing here. The command I would like to edit is below:
cat /location/file.txt | xargs -i scp {} user#server:/location
In my case I need something like:
cat fileList.txt | xargs -i scp user#server:{} .
To download files from a remote server using the list in fileList.txt located in the same directory I run this command from.
When I run this I get an error: xargs: illegal option -- i
How can I get this command to work?
Thanks,
Aina.
You get this error xargs: illegal option -- i because -i was deprecated. Use -I {} instead (you could also use a different replace string but {} is fine).
If the list is remote, the files are remote, you can do this to retrieve it locally and use it with xargs -I {}:
ssh user#server cat fileList.txt | xargs -I {} scp user#server:{} .
But this creates N+1 connections, and more importantly this copies all remote files (scattered in different directories you said) to the same local directory. Probably not what you want.
So, in order to recreate a similar hierarchy locally, let's say everything under /iscsi/archive/aat, you can:
use cut -d/ to extract the part you want to be identical on both sides
use a subshell to create the command that creates the target directory and copies the file there
Thus:
ssh user#server cat fileList.txt \
| cut -d/ -f4- \
| xargs -I {} sh -c 'mkdir -p $(dirname {}); scp user#server:/iscsi/archive/{} ./{}'
Should work, but that's starting to look messy, and you still have N+1 connections, so now rsync looks like a better option. If you have passwordless ssh connection, this should work:
rsync -a --files-from=<(ssh user#server cat fileList.txt) user#server:/ .
The leading / is stripped by rsync and in the end you'll get everything under ./iscsi/archive/....
You can also copy the files locally first, and then:
rsync -a --files-from=localCopyOfFileList.txt user#server:/ .
You can also manipulate that file to remove for example 2 levels:
rsync -a --files-from=localCopyOfFileList2.txt user#server:/iscsi/archive .
etc.
I have a Arduino Yun and want setup the server for Yun.
So what I want is to copy a folder that contain a py file and a index.html to my Yun
I used mac terminal to do this operation
the command looks like this
scp -r /Users/gudi/Desktop/LobsterHeartRate root#192.168.240.1:/mnt/sda1
and then terminal asked for the password
after I typed, it shows
scp: /mnt/sda1/LobsterHeartRate: Not a directory
I didn't type /mnt/sda1/LobsterHeartRate why it shows this error
Your code
scp -r /Users/gudi/Desktop/LobsterHeartRate root#192.168.240.1:/mnt/sda1
requires that the remote directory /mnt/sda1 exists. This looks like it is not true in your case. Check it using ssh root#192.168.240.1 ls /mnt/sda1.
scp is simple tool and it does not allow you to rename directories on the fly and the target directory must exists. You might try
scp -r /Users/gudi/Desktop/LobsterHeartRate root#192.168.240.1:/mnt/
ssh root#192.168.240.1 mv /mnt/LobsterHeartRate /mnt/sda1
or so, if it will suit your needs. But copying more files, rsync is usually more suitable. Check its manual page and give it a try next time.
As #Jens Höpken notes, your post is a bit sparse. But trying to read between the lines of your post I suspect that LobsterHeartRate is a DIRECTORY on your local system but a FILE named LobsterHeartRate in your target system. This might be happening right at the top of the directory tree, or perhaps you have directories/files of the same name further down the tree. scp -rv might help resolve any confusions here.
Beware: scp -r resolves symbolic links. If you want to preserve symlinks you need to do something else. For historic reasons I use the following, though cpio with a find front-end opens up interesting possibilities for fine-grained file selections.
( cd /Users/gudi/Desktop && tar -cf - LobsterHeartRate ) |
ssh root#192.168.240.1 'cd /mnt/sda1 && tar -xf -'
For a safe "dry run" you could change the -xf to a -tf. The && chains are required to prevent bad things from happening if any prior command fails.
Disclaimer: any debugging is left as an exercise for the student.
Whenever I try to SCP files (in bash), they end up in a seemingly random(?) order.
I've found a simple but not-very-elegant way of keeping a desired order, described below. Is there a clever way of doing it?
Edit: deleted my early solution from here, cleaned, adapted using other suggestions, and added as an answer below.
To send files from a local machine (e.g. your laptop) to a remote (e.g. your calculation server), you can use Merlin2011's clever solution:
Go into the folder in your local machine where you want to copy files from.
Execute the scp command, assuming you have an access key for the remote server:
scp -r $(ls -rt) user#foo.bar:/where/you/want/them/.
Note: if you don't have a public access key it may be better to do something similar using tar, then send the tar file, i.e. tar -zcvf files.tar.gz $(ls -rt), and then send that tar file on its own using scp.
But to do it the other way around you might not be able to run the scp command directly from the remote server to send files to, say, your laptop. Instead, you may need to, let's say bring files into your laptop. My brute-force solution is:
In the remote server, cd into the folder you want to copy files from.
Create a list of the files in the order you want. For example, for reverse order of creation (most recent copied last):
ls -rt > ../filenames.txt
Now you need to add the path to each file name. Before you go up to the directory where the list is, print the path using pwd. Now do go up: cd ..
You now need to add this path to each file name in the list. There are many ways to do this, here's one using awk:
cat filenames.txt | awk '{print "path/to/files/" $0}' > delete_me.txt
You need the filenames to be in the same line, separated by a space, so change newlines to spaces:
tr '\n' ' ' < delete_me.txt > filenames.txt
Get filenames.txt to the local server, and put it in the folder where you want to copy the files into.
The scp run would be:
scp -r user#foo.bar:"$(cat filenames.txt)" .
Similarly, this assumes you have a private access key, otherwise it's much simpler to tar the file in the remote, and bring that.
One can achieve file transfer with alphabetical order using rsync:
rsync -P -e ssh -r user#remote_host:/some_path local_path
P allows partial downloading, e sets the SSH protocol and r downloads recursively.
You can do it in one line without an intermediate using xargs:
ls -r <directory> | xargs -I {} scp <Directory>/{} user#foo.bar:folder/
Of course, this would require you to type your password multiple times if you do not have public key authentication.
You can also use cd and still skip the intermediate file.
cd <directory>
scp $(ls -r) user#foo.bar:folder/
I want to make a list of files of locate's output.
I want scp to take the list.
I am not sure about the syntax.
My attempt with pseudo-code
locate labra | xargs scp {} masi#11.11.11:~/Desktop/
How can you move the files to the destination?
xargs normally takes as many arguments it can fit on the command line, but using -I it suddenly only takes one. GNU Parallel http://www.gnu.org/software/parallel/ may be a better solution:
locate labra | parallel -m scp {} masi#11.11.11:~/Desktop/
Since you are looking at scp, may I suggest you also check out rsync?
locate labra | parallel -m rsync -az {} masi#11.11.11:~/Desktop/
Typically, {} is a findism:
find ... -exec cmd {} \;
Where {} is the current file that find is working on.
You can get xargs to behave similar with:
locate labra | xargs -I{} echo {} more arguments
However, you'll quickly notice that it runs the commands multiple times instead of one call to scp.
So in the context of your example:
locate labra | xargs -I{} scp '{}' masi#11.11.11:~/Desktop/
Notice the single quotes around the {} as it'll be useful for paths with spaces in them.