What is the most efficient way to use rsync to confirm transfer of large folder - ssh

I'm using a Synology DS214, and I have a 6 TB folder that I was originally transferring with USB Copy to an external drive connected directly to the NAS. That process failed somewhere along the way, so I then tried to rsync each folder within the 6 TB individually using the flags -avPc. What is the most efficient command to run to ensure that all the files are synced and complete?
Would --ignore-existing be the best flag in this case?
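For illustration, a checksum-only verification pass might look like the sketch below (the paths are placeholders for the source share and the USB destination). --dry-run makes rsync list differences without transferring anything, and -c forces a full read of both sides, so expect it to be slow on 6 TB:
$ rsync -avc --dry-run /volume1/share/ /volumeUSB1/usbshare/share/
Any files printed in the output are missing or differ by checksum on the destination; an empty file list means the copy is complete.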

Related

Accessing external hard drive after logging into a remote machine using ssh command

I am doing an intensive computing project with a very old C program. The program requires the Sun Performance Library, which is commercial software. Instead of purchasing the library myself, I run the program by logging onto a Solaris machine in our computer lab with the ssh command, while the working directory that stores the output data is still on my local Mac.
Now a problem has occurred: the program uses a large amount of disk space to save intermediate results, and the space on my local Mac is quickly filled (50 GB per user, as prescribed by the administrator). These results are necessary for the next stage of computing, and I cannot delete any of them before the program finally produces its output data. Therefore, I have to move the working directory to an external hard drive in order to continue. Obviously,
cd /Volumes/VOLNAME
is not the correct way to do it, because the remote machine responds with
/Volumes/VOLNAME: No such file or directory.
So, what is the correct way to do it?
sshfs recently added support for "slave mode", which allows you to do this. Assuming you have sshfs on Solaris (I'm not sure about this), the following command (run from your Mac) will do what you want:
dpipe /usr/lib/openssh/sftp-server = ssh SOLARISHOSTNAME sshfs MACHOSTNAME:/Volumes/VOLNAME MOUNTPOINT -o slave
This will result in the MOUNTPOINT directory on the server being mounted to your local external drive. Note that I'm not sure whether macOS has dpipe. If it doesn't, you can replace it with one of the equivalent solutions at How to make bidirectional pipe between two programs?. Also, if your SFTP server binary is somewhere else, substitute its path.
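If dpipe turns out to be unavailable, one of those equivalent solutions is the classic named-pipe idiom; a minimal sketch, with the same placeholder host names and mount point, and assuming the macOS sftp-server lives at /usr/libexec/sftp-server:
mkfifo /tmp/sshfs.fifo
/usr/libexec/sftp-server < /tmp/sshfs.fifo | ssh SOLARISHOSTNAME sshfs MACHOSTNAME:/Volumes/VOLNAME MOUNTPOINT -o slave > /tmp/sshfs.fifo
rm /tmp/sshfs.fifo
This wires sftp-server's stdout to sshfs's stdin and vice versa, which is exactly what dpipe does.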
The common way to mount a remote volume in Solaris is via NFS, but that usually requires root permissions.
Another approach would be to make your application read its data from stdin and output its results to stdout, without using the file system directly. Then you could just redirect the data from/to your local machine through ssh. For instance:
ssh user@host your_program </Volumes/VOLNAME/input.data >/Volumes/VOLNAME/output.data

How to move files from a remote server to S3 at the command line

I have a lot of big files on a remote server and I want to move them into S3. I want to do it at the command line or with a bash script (i.e., I do NOT want to use a GUI app like Cyberduck) so that I can automate/replicate the effort.
I have tried mounting my remote server onto my local machine using OSXFUSE and sshfs and then pushing the files to S3 using s3cmd. This does work, but I keep running into errors (the connection being lost for no apparent reason, mount errors, etc.).
Is this the best way to do it? Does anyone know a better way to do it?
You can use the MinIO Client, aka mc, to do this.
$ mc cp --recursive localDir/ s3/remoteBucket
If the network disconnects, mc will give you the option to resume the upload.
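The s3/remoteBucket target above assumes an alias named s3 has already been registered; a hedged setup sketch for recent mc releases (the endpoint and keys are placeholders, and older releases use mc config host add instead):
$ mc alias set s3 https://s3.amazonaws.com ACCESS_KEY SECRET_KEY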
Is your remote server in EC2? Your current setup requires two copies (first pulling the data to your local machine via sshfs, then pushing it to S3 via s3cmd); if you run s3cmd on the remote server directly, you can reduce that to one.
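A minimal sketch of the single-copy approach, run on the remote server itself (the bucket name and paths are placeholders; this assumes s3cmd is installed and configured there via s3cmd --configure):
$ s3cmd sync /data/bigfiles/ s3://your-bucket/bigfiles/
sync only uploads files that are missing or changed on the bucket side, so an interrupted run can simply be re-executed.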
If you want to mount S3 as a filesystem, you can also use tools like goofys or s3fs. Again, you should do that on the remote server to avoid extra copies.
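For the mount route, a hedged s3fs example (the bucket name and mount point are placeholders; s3fs reads credentials from a passwd file):
$ s3fs your-bucket /mnt/s3 -o passwd_file=${HOME}/.passwd-s3fs
$ cp -r /data/bigfiles /mnt/s3/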

Smart local copy of a remote directory

Currently I have a bunch of local copies of dev/production websites. Each copy contains a "files" directory, which holds the files uploaded by site users. Currently I use rsync to synchronize the directories' contents from the remote servers (via ssh).
There are some annoyances:
I have to run rsync manually each time I want fresh files (this could be automated, of course, but since I have a lot of website copies it's not a good idea).
The rsync execution takes some time.
Disk space on my laptop is running out.
I think all of this could be solved if there were some kind of software that works like a proxy:
When I list files, it requests the file list from the remote server and caches the results for some (configurable) time.
When I request a file's contents for the first time, it retrieves the remote file and saves it locally.
When I update a file, it only gets updated locally.
When I save a new file in the "files" directory, it does not go to the remote server.
Of course, the logic of such software would have to be much more complex, but I hope my idea is clear: don't waste disk space, download files on demand, and never push changes to the remote server.
Is there any software that works like that?
Map a network drive with NFS or sshfs. Make local copies if you really need a file.
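A quick sketch of the sshfs route (the host, remote path, and mount point are placeholders; unmount with fusermount -u on Linux or umount on macOS):
$ mkdir -p ~/mnt/site-files
$ sshfs user@remote:/var/www/site/files ~/mnt/site-files
$ cp ~/mnt/site-files/some-upload.png ~/work/
Nothing is stored locally except the files you explicitly copy out of the mount.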
I did not mention it in the question, but I needed this for working with Drupal, and I have now found a Drupal-only solution: the Stage File Proxy module.
It does exactly what I need: downloads files from a remote server only when they are requested.

Use rsync without copying files that are in use

I have a server (Machine A) that receives uploads throughout the day from other machines. I have a script, run from cron on another internal server (Machine B), that uses rsync to pull these files onto itself and remove the originals from Machine A. Some of these uploads last an hour or more.
How do I use rsync so that it won't attempt to copy files that are currently being uploaded (i.e., still being written to)? I don't want it to pull partial uploads and then attempt to process them.
I'm using Ubuntu 10.04 64-bit on both Machine A and Machine B.
In order to make incremental backups with rsync, you should add the --update (-u) option. With it, a file that already exists on the receiver is skipped when the receiver's copy is newer; when a file has the same timestamp on both ends, it is updated only if the sizes differ.
About partial transfers: rsync writes each incoming file to a temporary file and only moves it into the destination directory once it has arrived completely. You can add --partial so that, after an rsync or network problem, the partially transferred file is kept and the next run resumes it.
You can check all of the options in the man page.
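As a hedged sketch of the cron job on Machine B (the host and paths are placeholders; --remove-source-files deletes each file on Machine A only after it has transferred completely):
$ rsync -au --partial --remove-source-files machineA:/var/uploads/ /srv/incoming/
Note that this alone does not detect files that are still being written; a common complement is to have the uploaders write to a temporary name and rename on completion, so that rsync only ever sees finished files.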

Best Method for online update

I am trying to work out the best method of synchronising an update of files via VB.NET. I want to automatically update some files in a few folders from an FTP site to a local drive, ideally recreating the same folder structure under a specified local folder. I have looked at the FTP methods in .NET and am able to transfer individual files successfully, but I wondered whether there is an elegant method for collecting an entire folder and its contents from the FTP site and re-creating it locally. I have established reading an XML file via LINQ from the FTP site to act as an update signal and decide whether the local machine is up to date or not; it's just the transfer of the folders that I'm stuck on, as it's different from using an IO method.
Can any gurus confirm that this is the best way of approaching this task?
Your approach looks fine to me. You will have to transfer the files/folders individually if you are using the base class library.
There may be third-party libraries available that will do this in one shot.