rsync "failed to set times on "XYZ": No such file or directory (2)" - backup

I have a D-Link NAS (dns-323) in RAID1 that I use to back up family photos, videos and some other data. I also manually rsync to a dedicated backup drive on a little Atom Linux box whenever we add a lot of new files to the NAS. I finally lost a drive on the NAS and, through a misstep of my own, also lost the entire volume. No problem, that's what the backup drive is for. I used the same rsync command in reverse to restore files to the NAS after I replaced the bad drive and created a new RAID volume. This worked well, except that after the command finished, I noticed that it did not preserve timestamps. Timestamps were preserved in the NAS->backup direction, but not in the backup->NAS direction.
I run the rsync command on the Atom Linux box with these options (this does preserve timestamps):
rsync --archive --human-readable --inplace --numeric-ids --delete /mnt/dns-323 /mnt/dlink_backup --progress --verbose --itemize-changes
The reverse command to restore the volume from the backup (which did not preserve timestamps) is very similar:
rsync --archive --human-readable --inplace --numeric-ids --delete /mnt/dlink_backup/dns-323/ /mnt/dns-323/ --progress --verbose --itemize-changes
which actually restores the files, but gives many errors like:
rsync: failed to set times on "/mnt/dns-323/Rich/Code/.emacs": No such file or directory (2)
I've been googling most of the afternoon and trying different things, but so far haven't solved my problem. I used the 'touch' command to successfully modify the times of one or two files on the NAS, just to prove that it can be done since I believe that is one thing that rsync must do. I've tried doing this as my user and as root. By this I mean that I've run sudo rsync ..... as well as rsync --rsync-path='/usr/bin/sudo /usr/bin/rsync' ..... where ..... is all of the previously mentioned parameters. My /etc/fstab has these entries for the NAS and the backup drive, respectively:
# the dns-323
//192.168.1.202/Volume_1 /mnt/dns-323 cifs guest,rw,uid=1000,gid=1000,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777 0 0
# the dlink_backup drive
/dev/sdb /mnt/dlink_backup ext3 defaults 0 0
It's not absolutely critical to preserve timestamps if it just plain can't be done, but it seems like it should be possible - I'm just stumped.
Thanks in advance. Let me know if I can provide any additional information.

I've earned my "tumbleweed" badge as a result of this one. pats self on back
What I've learned:
My solution:
1) Removed the left hard drive from the dns-323, which is half of the RAID1 volume.
2) Mounted (ext3) this drive using a USB-to-SATA adapter to the machine where I run rsync.
3) Performed the rsync command for the restore outlined above. I removed the --delete option, which really shouldn't be there, and added the option --size-only. With --size-only, timestamps were essentially the only thing that got restored, since the files themselves had already been restored properly.
4) Unmounted the left drive from the Atom machine and returned that drive to the dns-323, while also removing the right drive. The right drive needs to be removed so that the dns-323 recognizes that the RAID volume is degraded.
5) Re-added the right drive to the dns-323 and told it to rebuild the RAID volume.
6) All timestamps are now good.
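Step 3 can be sketched roughly like this. The destination mount point /mnt/nas_disk/ is hypothetical (it stands for wherever the USB-attached NAS drive is mounted), and the RSYNC variable is only a convenience I added for previewing; the rsync options are the ones from the restore command above, minus --delete and plus --size-only.

```shell
# Sketch of step 3. NAS_DISK is a hypothetical mount point for the NAS drive
# attached through the USB-to-SATA adapter; adjust both paths to your setup.
SRC=${SRC:-/mnt/dlink_backup/dns-323/}
NAS_DISK=${NAS_DISK:-/mnt/nas_disk/}
RSYNC=${RSYNC:-rsync}   # set RSYNC=echo to print the command instead of running it

# Same options as the original restore, without --delete and with --size-only,
# so files whose sizes already match are not copied again and only their
# attributes (including modification times) are brought in line.
restore_times() {
    "$RSYNC" --archive --human-readable --inplace --numeric-ids \
        --size-only --progress --verbose --itemize-changes \
        "$@" "$SRC" "$NAS_DISK"
}
```

Calling restore_times --dry-run first shows what rsync would change without touching anything; restore_times on its own performs the run.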
A possible alternate solution:
I've read enough about rsync and NFS/Samba/cifs now to understand that this problem is likely related to permissions on the Samba/cifs server (dns-323). Internally, the user/group ids in the dns-323 are 501/501. No permutation of how I mounted the dns-323 on the Atom box would allow rsync to properly set timestamps. I do believe that changing my user account on the Atom box to have uid/gid of 501/501 would have worked, though. My user had the default 1000/1000 and root had 0/0 IIRC.

Related

Why does Tensorboard not refresh when used with rsync?

I am running a tensorflow experiment on a remote machine continuously writing to the same events.out.tfevents.xxx file. I would expect tensorboard to refresh automatically every minute or so displaying the new logs. This does work when using sshfs to mount the remote machine on my laptop and using the mounted directory to run tensorboard on.
However, when using rsync to copy the files over and run tensorboard on the local files, the tensorboard never refreshes, I have to restart it in order to get the updates.
This is my rsync command:
rsync -aP --del -e ssh server_name:folder_on_server local_folder --exclude='*checkpoints*' --exclude='*.json' --exclude='*.DS_Store'
Any help would be greatly appreciated!
This is a known issue with TensorBoard; see this issue on GitHub.
Here's a quote from the issue (emphasis is mine):
It looks like when the tensorboard reads an event file from local directory - it will not notice that the event file was deleted and recreated (which is quite valid case when you are using [...] rsync to sync the data)
One workaround is to use --inplace as an option in your rsync command.
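To see why --inplace helps: by default rsync writes an updated file to a temporary copy and then renames it over the destination, so the path ends up pointing at a new file; with --inplace it writes into the existing file. The sketch below imitates both behaviours with plain shell tools (no rsync involved), and the file name is made up for the demo.

```shell
# Print the inode number of a file.
inode_of() { ls -i "$1" | awk '{print $1}'; }

dir=$(mktemp -d)
log="$dir/events.out.tfevents.demo"   # hypothetical file name
echo "step 1" > "$log"
before=$(inode_of "$log")

# Default rsync behaviour: build a temporary copy, then rename it over the target.
{ cat "$log"; echo "step 2"; } > "$dir/.tmp"
mv "$dir/.tmp" "$log"
after_rename=$(inode_of "$log")

# --inplace behaviour: write the new data into the existing file.
echo "step 3" >> "$log"
after_inplace=$(inode_of "$log")

echo "original inode:       $before"
echo "after temp + rename:  $after_rename (a new file; an open reader keeps the old one)"
echo "after in-place write: $after_inplace (same file as before the write)"
rm -rf "$dir"

# Applied to the command from the question, the workaround would look like:
#   rsync -aP --inplace --del -e ssh server_name:folder_on_server local_folder ...
```

A reader that opened the file before the rename keeps reading the old, now unlinked file, which matches the behaviour described in the quoted issue.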

How do I copy a file into a docker-cloud container? (AKA How to copy a file over ssh without using scp)

docker-machine has an scp command, but docker-cloud doesn't seem to have any way to transfer a file from my local machine to the cloud container or vice-versa.
I'm submitting an answer below that I've finally figured out (in hopes that it will help someone), but I'd love to hear better answers if there are any!
(I realize docker-cloud is going away, but perhaps this will be helpful for other cloud platforms as well)
To transfer a file from your local machine to a docker-cloud instance that is running linux with the tee command available:
docker-cloud container exec id12345 tee filename.ext < file_to_copy.ext > /dev/null
(you'll want to redirect output to /dev/null as shown unless you want the entire contents of the file to be echoed to the terminal... twice)
Transferring a file to your local machine is somewhat easier:
docker-cloud container exec id12345 cat file_to_copy.ext > filename.ext
Note: I'm not sure this works for binary files, and it can even cause issues with linefeed characters in text files, based on terminal settings, etc. - but it's the best answer I've got short of using an external service like https://transfer.sh
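One way to sidestep the binary and linefeed concerns might be to base64-encode the data before it crosses the terminal and decode it on the other side. The sketch below only demonstrates the round trip locally; the helper names are mine, and the docker-cloud line in the final comment is an untested assumption that also requires base64 to be available inside the container.

```shell
# Encode to plain ASCII before sending, decode after receiving.
# Stray carriage returns added by a terminal are stripped before decoding.
encode_for_transfer() { base64 < "$1"; }
decode_after_transfer() { tr -d '\r' | base64 -d > "$1"; }

# Local round trip with bytes that text channels tend to mangle (NUL, CR, LF, 0xFF).
tmp=$(mktemp -d)
printf '\000\001\r\n\377binary' > "$tmp/original.bin"
encode_for_transfer "$tmp/original.bin" | decode_after_transfer "$tmp/copy.bin"

if cmp -s "$tmp/original.bin" "$tmp/copy.bin"; then
    roundtrip=ok
else
    roundtrip=corrupted
fi
echo "round trip: $roundtrip"
rm -rf "$tmp"

# Hypothetical use with the download command above (untested):
#   docker-cloud container exec id12345 base64 file_to_copy.ext | tr -d '\r' | base64 -d > filename.ext
```

Comparing a checksum of the file on both ends afterwards would confirm that nothing was altered in transit.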

Speed up gsutil rsync between s3 and gs

I want to rsync a bucket with 100M files between s3 and gs. I've got a c3.8xlarge instance and did a quick dry run:
$ time gsutil -m rsync -r -n s3://s3-bucket/ gs://gs-bucket/
Building synchronization state...
At source listing 10000...
^C
real 4m11.946s
user 0m0.560s
sys 0m0.268s
About 4 minutes for 10k files. At this rate, it's going to take 27 days just to compute the sync state. Anything I can do to speed this up?
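The estimate can be reproduced with a bit of shell arithmetic, assuming the listing rate stays constant. Using the measured 252 seconds instead of a rounded 4 minutes gives about 29 days, so the same order of magnitude.

```shell
# Back-of-the-envelope estimate using the numbers from the dry run above.
total_files=100000000      # 100M objects in the bucket
listed=10000               # objects listed before the run was interrupted
elapsed_s=252              # "real 4m11.946s", rounded up to whole seconds

batches=$(( total_files / listed ))     # number of 10k-object batches
total_s=$(( batches * elapsed_s ))      # total listing time in seconds
days=$(( total_s / 86400 ))             # whole days, rounded down
echo "estimated listing time: ${total_s}s (~${days} days)"
```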
I also noticed [and fixed] the following warning:
WARNING: gsutil rsync uses hashes when modification time is not available at
both the source and destination. Your crcmod installation isn't using the
module's C extension, so checksumming will run very slowly. If this is your
first rsync since updating gsutil, this rsync can take significantly longer than
usual. For help installing the extension, please see "gsutil help crcmod".
Are the file hashes computed or am I just waiting for listing 100M files?
When setting up a sync process between two buckets, the first iteration is going to be the slowest because it needs to copy all of the data in source-bucket to dest-bucket. For cross-provider syncs, this is further slowed down by the need for two separate connections per object -- one to pull the data from the source to the host machine, and another to funnel it through from the host to the destination (gsutil refers to this as "daisy-chain" mode).
For the initial sync (and possibly subsequent syncs as well) between buckets, you might be better off using GCS's transfer service, which allows GCS to copy the objects on your behalf. This tends to be much faster than doing all the work with one machine running gsutil.
As for the warning, it's a general warning that's printed at the beginning of the command execution if you don't have the crcmod C extension installed, regardless of what's present at the destination.
Skyplane is a much faster alternative for transferring data between clouds (up to 110x for large files). You can transfer data with the command:
# copy data
skyplane cp -r s3://aws-bucket-name/ gcs://google-bucket-name/
# sync data
skyplane sync -r s3://aws-bucket-name/ gcs://google-bucket-name/

Bacula/Bareos disaster recover from scratch using bextract

The Bacula/Bareos documentation stresses that the Catalog bootstrap file must be saved somewhere safe. I know the Catalog backup consists of a MySQL DB dump and, optionally, the Bacula/Bareos config files, but how exactly does anyone recover from scratch if the whole backup infrastructure is gone?
Is it enough to install all the Bacula/Bareos software, import the MySQL dump and the config, and then start the Director?
A bit of an old question, but I'll provide some feedback.
If you've done a mysqldump of the database (or pg_dump, depending on the backend), you essentially have the catalog in its full state. I believe that you can simply restore this database to a new server and restore the old config files (these are not stored in the dump but rather in /etc/bareos). Also, make sure that the database user has the same user/password as specified in the bareos-dir.conf file, or else you will not be able to connect to the database. Depending on how your storage devices are set up, you may need to adjust the bareos-sd.conf file.
To answer the OP's other question, you can use a volume without a catalog. It's a bit cumbersome, but it is possible with the following:
http://www.bacula.org/5.0.x-manuals/en/utility/utility/Volume_Utility_Tools.html
For example:
List jobs on a volume: bls -j -V Full_1-1886 FileStorage1
List files on a volume: bls -V Full_1-1886 FileStorage1
Once you have found the file, or directory (Note wildcard characters are supported) you can extract the file:
bextract -i restoreFiles -V Full_2-1277 FileStorage2 /tmp/
Where:
restoreFiles is a file that lists the files/directories to restore, one per line
/tmp/ is the destination of the restore
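As a small illustration, the include list can be created like this. Both paths in it are hypothetical examples, and the temporary file stands in for restoreFiles; the bextract call in the final comment is the one from above.

```shell
# Build the include list: one path per line, written as stored in the backup.
# Both paths are hypothetical examples.
include_list=$(mktemp)
printf '%s\n' \
    '/etc/bareos' \
    '/home/user/documents' > "$include_list"

echo "include list written to $include_list:"
cat "$include_list"

# Then pass it to bextract as in the answer above:
#   bextract -i "$include_list" -V Full_2-1277 FileStorage2 /tmp/
```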

How can I use boto or boto-rsync a full backup of 1000+ files to an S3-compatible cloud?

I'm trying to back up my entire collection of over 1000 work files, mainly text but also pictures and a few large (0.5-1G) audio recordings, to an S3 cloud (Dreamhost DreamObjects). I have tried to use boto-rsync to perform the first full 'put' with this:
$ boto-rsync --endpoint objects.dreamhost.com /media/Storage/Work/ \
> s3:/work.personalsite.net/ > output.txt
where '/media/Storage/Work/' is on a local hard disk, 's3:/work.personalsite.net/' is a bucket named after my personal web site for uniqueness, and output.txt is where I wanted a list of the files uploaded and error messages to go.
Boto-rsync grinds its way through the whole directory tree, but the constantly refreshing progress output for each file doesn't look good when it's written to a file. Still, while the upload is running, I 'tail output.txt' and see that most files are uploaded, but some are only uploaded to less than 100%, and some are skipped altogether. My questions are:
Is there any way to confirm that a transfer is 100% complete and correct?
Is there a good way to log the results and errors of a transfer?
Is there a good way to transfer a large number of files in a big directory hierarchy to one or more buckets for the first time, as opposed to an incremental backup?
I am on Ubuntu 12.04 running Python 2.7.3. Thank you for your help.
You can wrap the command in a script and start it with nohup:
nohup script.sh
nohup automatically creates a nohup.out file that captures all output of the script/command.
To choose the log file yourself, you can do:
nohup script.sh > /path/to/log
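Note that a plain > only redirects standard output, so error messages written to standard error may not end up in the log. A minimal sketch that captures both streams and runs the job in the background follows; the sh -c command is just a stand-in for script.sh, and the log path is a temporary file for the demo.

```shell
# Capture stdout and stderr of a nohup'ed command in one log file.
log=$(mktemp)

# Stand-in for script.sh: prints one normal line and one error line.
nohup sh -c 'echo "uploaded file A"; echo "error on file B" >&2' > "$log" 2>&1 &

wait $!        # only for the demo; a real long-running job would be left alone
cat "$log"
```

For a real run this would be nohup script.sh > /path/to/log 2>&1 &, and tail -f /path/to/log follows the progress.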
br
Eddi