Does rsync make a backup copy or a history of backup copies?

Many tutorial sites mention making hourly/daily/weekly backups with rsync [1], [2], and even more claim to set up rsync like Mac's Time Machine [3], [4], [5]. But when I look at the code, it seems like they make a currentBackup/ folder, then the next time around they rsync against this currentBackup so only the necessary changes are copied over, delete the old currentBackup, and make the new folder the currentBackup. But what if I want daily backups like so:
March-10-44-BC/
March-11-44-BC/
March-12-44-BC/
March-13-44-BC/
March-14-44-BC/
March-15-44-BC/
So on March 16 I can 'roll back' to the March 15 version or the March 14 version. I have noticed each site mentions something called hard links. Since I don't understand what these are, perhaps they retain the information needed to perform such 'roll backs'. If not, what am I supposed to do? Keep all previous backups and tar.gz them?

For these purposes hard links serve as "copy on write" copies of the files. If a file is unchanged, the hard-linked "copies" of it in each dated backup directory take up no extra space. When a file has changed, rsync writes a new copy into the new backup directory instead of touching the linked one, so the older backups keep the old contents. See Wikipedia for more information.
So rsync mainly does backup, but it can provide some archiving functionality as well. This is in contrast to other backup software like Time Machine, which always and automatically provides archiving as well as backup.
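If you want the dated-directory layout from the question, the usual trick is rsync's --link-dest option: each run copies only the changed files and hard-links everything unchanged against the previous backup directory. A minimal sketch, assuming made-up paths and a "latest" symlink that you maintain yourself:
#!/bin/sh
# Hypothetical locations; adjust for your setup.
SRC=/home/me/documents/
DEST=/backups
TODAY=$(date +%Y-%m-%d)
# Copy changed files; hard-link unchanged files against the newest existing backup.
# (On the very first run "latest" doesn't exist yet; rsync just warns and copies everything.)
rsync -a --link-dest="$DEST/latest" "$SRC" "$DEST/$TODAY/"
# Point "latest" at the backup just made, so tomorrow's run links against it.
ln -sfn "$DEST/$TODAY" "$DEST/latest"
Rolling back to the March 14 version is then just reading (or copying) from that day's directory; each directory looks like a complete copy, but unchanged files share disk space through the hard links.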

Related

Rsync seems to erase EXIF data from photos

Trying to set up a simple backup solution for my wife's computer. Have a volume on my server upstairs mounted locally using OSX automount, so it should just be a simple
rsync -a sourceDir targetDir
When I look at the files it syncs over though, all metadata is lost on jpg files. The created date is preserved on the file and the modified date ends up being the timestamp when the rsync runs, but I can't imagine why EXIF data (Device, exposure etc) would disappear when it should just be a straight file copy. Hoping someone has run into this before and can shed some light on it.
This can't be an rsync problem; there must be something else going on. rsync just does a binary copy from source to destination, so the most probable explanation is a simple user error (e.g. you copied from the wrong source directory, the source files were already missing their EXIF data, and so on).
For normal copies on reliable hardware, rsync is without doubt the best tool for the job, especially considering the huge number of filesystems it has to cope with.
There are some corner cases where rsync may not behave as it should, at least with default parameters. For example, right now I'm investigating an issue where, copying to a "not-so-reliable" USB drive, rsync continued to copy happily even after the drive disconnected from USB and the device disappeared.
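If you want to rule rsync out, compare a copied photo byte-for-byte with its source; if they are identical, whatever strips the EXIF data is happening before or after the copy, not during it. A quick check, with the file name being just a placeholder:
# Byte-for-byte comparison; no output and exit status 0 means the files are identical.
cmp sourceDir/IMG_0001.jpg targetDir/IMG_0001.jpg && echo identical
# Or let rsync itself report any files whose contents differ (checksum, dry run, itemized).
rsync -rcn --itemize-changes sourceDir/ targetDir/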

Move file in one AccuRev workspace that has been edited in another workspace

We need to refactor a code base. The thing is that this will be done by one person, and it would be desirable to avoid having the rest of the development team sitting idle while this job takes place.
We therefore tried the following scenario to see if it is possible to work in parallel.
1. Created file test.txt in directory first in developer A's workspace.
2. Promoted this file.
3. Updated developer B's workspace, thereby getting file test.txt.
4. In A's workspace moved file test.txt to directory second.
5. Promoted this move.
6. In B's workspace edited file test.txt while it still resides in directory first (no update is made, thereby emulating that work is done while refactoring is taking place).
7. Tried to promote and got a message saying that file test.txt had been modified (correct, file has been moved).
8. Tried to merge but got an error message saying that AccuRev can't merge since the file is missing in directory second (where it has been moved).
9. Tried to update B's workspace but that is not allowed since there is a modified file that needs to be merged first.
We are now stuck in a catch-22 situation.
We did try to place a fake file in directory second, but it was not recognized since that file does not belong to the workspace.
Has anyone out there tried something like this and gotten it to work?
It is of course possible to copy files, but if there is a better way we would be grateful to hear about it, or to learn whether this is a known bug or limitation in the tool.
We will also contact AccuRev support, but I thought that I might be able to get some useful tips from the community.
Currently we are using AccuRev client 5.5.0.
Thanks for any suggestions on how to make the tool support this operation.
Referring to your steps 6 & 7: In AccuRev 5.5, after a file is edited and has a (modified) status, you first have to keep it before you can promote it.
At step 8 you could try doing the merge from the Browse Versions view of the file. That way you can select any node to merge with, including the one that has been moved.
Step 9: An AccuRev update will not run successfully if one of the files to be updated is (modified). This is by design. You can keep the file so it has (kept)(member) status, then run the update.
David Howland
After contacting AccuRev support, the answer is that the only option available is to copy the file to some temp directory, revert the changes, update the workspace, and copy the file into the new location in the workspace.
AccuRev will at least tell you which files you have to copy since they will be marked as modified.
I could experimentally verify David's remark to step 9 using AccuRev 5.5.
Let's assume that in the workspace of user A the file was moved and the move was promoted, while in the workspace of user B the file was modified and user B is about to promote his/her change.
Before the file is kept, user B will be able neither to merge nor to update. But after keeping the modified file, the update is possible. The file is first marked as overlap, then the merge succeeds in the new location. Basically, this avoids creating a copy of the file, reverting it, and restoring it in the new location after an update, which can be quite cumbersome, as AccuRev does not easily reveal where the file has been moved to.
If user B promotes the modification before user A promotes the move, all goes smoothly, i.e. on update the moved file appears as overlap, but easily merges into the moved file in the new location.
Similar results are obtained when the two users have workspaces connected to different streams and the overlap occurs on a common parent stream. An error can occur only if the file is unkept (i.e. only if the move is promoted before the change). Then a simple keep allows you to proceed as usual (update, merge, then promote).
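For reference, the keep-first sequence described above maps onto the AccuRev command line roughly like this; the paths and comments are placeholders, and the exact options may vary between versions:
# In user B's workspace: keep the locally modified file first.
accurev keep -c "edits made while the refactoring was in flight" first/test.txt
# Update the workspace; the kept file then shows up as (overlap) in its new location.
accurev update
# Merge, then promote the result from the new location.
accurev merge second/test.txt
accurev promote -c "merged edits into moved file" second/test.txt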

How Can I Recover a .nfs file?

I have been working on an algorithm in Python, and I was using Vim to edit the file. I opened it up, saved it, and it came up with an error, something like it occasionally does:
"WARNING: YOUR FILE CANNOT BE SAVED! ALL CHANGES WILL BE LOST! CANNOT WRITE THE FILE!"
As this happens occasionally, I did what I normally do, and I hit :q! to quit without writing any changes. No harm, no foul. When I looked at my file, everything had been erased! Everything!
I asked around the office, and it seems that the NFS mount was full, which was why I couldn't save anything. A huge script was generating a ton of data, which caused the mount to fill up temporarily. I believe the NFS mount is from a NetApp. I found two files in my current directory.
One was last saved two days ago, and one was today. They are in the format of:
.nfs.xxxxxxxxxxx
When I try to open this file, I see some of my code here and there, splattered among unknown characters. Apparently, this must be a binary representation of the state of the file.
Is there any way to recover this file from this NFS mount? If there is a shortcut to recover this file in Emacs, I will switch to Emacs from vim!
So, I did find a way to recover the file. I found two ways, in fact. Since it was on a NetApp NFS mount, I was able to use the snapshot feature. When you are in a directory, just do
ls .snapshot
And this will pull up any snapshots that your system administrators have set. For us, we have hourly.0, hourly.1, nightly.0, and nightly.1 backups. So we can go back two days, and within the same day we can go back one hour (the current hour and the previous one).
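Pulling the file back is then an ordinary copy out of the snapshot directory; the snapshot and file names below are only examples:
# Copy the hour-old version back into the working directory.
cp .snapshot/hourly.0/my_algorithm.py ./my_algorithm.py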
The other way was to rename the file to a Vim swap file, like this:
mv .nfs.xxx my_vim_file.cpp.swp
vim my_vim_file.cpp.swp
Then open it in Vim; it should ask if you want to recover the swap file. Say yes, and your file should be back!
Apparently your NetApp uses NFS to mount its volumes (as opposed to iSCSI, for example). Generally, each VM is stored on a unique volume (aka datastore) on the NetApp filer. To find the volumes and snapshots, and then restore a snapshot, here are the commands to execute at the filer's command line:
# list all volumes, snapshots are taken of volumes
vol status
# list the snapshots available for a particular volume
snap list <vol_name>
# restore a snapshot, nightly.1 for example
snap restore <vol_name> nightly.1
That's it. All that's left is to turn the VM back on and see if you've restored far enough back. If not, do another "snap restore" with an older snapshot.
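If you only need a single file back rather than reverting the whole volume, I believe 7-mode ONTAP also supports single-file SnapRestore; something along these lines, with the path and snapshot name as placeholders (check the syntax on your ONTAP version):
# Restore one file from a snapshot instead of the entire volume.
snap restore -t file -s nightly.1 /vol/<vol_name>/home/user/my_algorithm.py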
Note that this procedure assumes your administrator didn't disable snapshots (NetApp has a snapshot schedule by default) and that the NetApp is licensed for snaprestore (use the "license" command to verify). This procedure can be simplified further if you have NetApp OnCommand System Manager, which is a GUI for managing the NetApp. Reverting a snapshot in the GUI is simple:
Go to Storage > Volumes > click on a volume > click on Snapshot Copies (at the bottom)
Choose a snapshot and restore

What is a good strategy for recovering from broken files?

My colleague and I are trying to implement a mechanism to recover from broken files on an embedded device.
This can happen under certain circumstances, e.g. the user removes the battery while a file is being written.
So far we have just one idea:
Create duplicate backup files, and copy them back if a risky file I/O operation does not finish properly.
This feels crude, because if the backup files are also broken, we are simply dead.
Do you have any suggestions or good articles on this?
Thanks in advance.
Read up on database logging and database journal files.
A database (like Oracle) has very, very robust file writing. Do not actually use Oracle; use its design pattern. You can borrow these ideas without using the actual product. The pattern goes something like this:
1. Your transaction (e.g., an insert) will fetch the block to be updated. Usually this is in the memory cache; if not, it is read from disk into the cache.
2. A "before image" (or rollback segment) copy is made of the block you're about to write.
3. You change the cache copy, write a journal entry, and queue up a DB write.
4. You commit the change, which makes the cache change visible to other transactions.
5. At some point, the DB writer will finalize the DB file change.
The journal is a simple circular queue file -- the records are just a history of changes with little structure to them. It can be replicated on multiple devices.
The DB files are more complex structures. They have a "transaction number" -- a simple sequential count of overall transactions. This is encoded in the block (two different ways) as well as written to the control file.
A good DBA assures that the control file is replicated across devices.
When Oracle starts up, it checks the control file(s) to find which one is likely to be correct. Others may be corrupted. Oracle checks the DB files to see which match the control file. It checks the journal to see if transactions need to be applied to get the files up to the correct transaction number.
Of course, if it crashes while writing all of the journal copies, that transaction will be lost -- not much can be done about that. However, if it crashes after the journal entry is written, it will probably recover cleanly with no problems.
If you lose media, and recover a backup, there's a chance that the journal file can be applied to the recovered backup file and bring it up to date. Otherwise, old journal files have to be replayed to get it up to date.
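Stripped down to something an embedded system can do with ordinary files, the core of the pattern is: record what you are about to do in an append-only journal, flush it to stable storage, make the change, then record that it completed. A rough shell sketch of the idea, with made-up file names (a real device would fsync the individual files rather than calling a global sync):
# 1. Journal the intent before touching the data file, and force it to disk.
echo "TXN 42 BEGIN counter=7" >> journal.log
sync
# 2. Write the new data to a temporary file and flush it.
echo 7 > counter.dat.tmp
sync
# 3. Publish the change with an atomic rename, then journal the commit.
mv counter.dat.tmp counter.dat
echo "TXN 42 COMMIT" >> journal.log
sync
On boot, a recovery step replays or discards any transaction whose BEGIN has no matching COMMIT, which is exactly the role the journal plays in the database design above.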
It depends on the OS, etc., but in most cases what you can do is copy to a temporary file name and, as the final step, rename the files to the correct name.
This means the (WOOPS) Window Of Opportunity Of Potential S****p is confined to the interval in which the renames take place.
If the OS supports a nice directory structure and you lay out the files intelligently, you can refine this further by copying the new files to a temp directory and renaming the directory, so the WOOPS becomes the interval between "rename target to save" and "rename temp to target".
This gets even better if the OS supports soft-linked directories: then you can "ln -s target temp". On most OSes, swapping a soft link can be an "atomic" operation that either works or doesn't, without any messy halfway states.
All these options depend on having enough storage to keep a complete old and new copy on the file system.
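A concrete sketch of the temp-directory-plus-symlink variant, with placeholder directory names; note that replacing a soft link with ln -sfn is really two steps under the hood, so the truly atomic move is to rename a fresh link over the old one (mv -T is the GNU coreutils spelling; a minimal embedded userland may need a direct rename() call instead):
# "current" is a symlink to whichever versioned directory is live.
mkdir data.v2
cp -a staging/. data.v2/
# Point a throwaway symlink at the new tree, then rename it over "current".
# rename(2) is atomic, so readers see either the old tree or the new one, never a mix.
ln -s data.v2 current.tmp
mv -T current.tmp current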

Automatic background sync while offline

I'm working on some documents on a laptop which is sometimes offline (it runs winXP).
I'd like to back up the documents automatically to a folder in a remote location, with the backup running in the background.
I want to edit the documents and forget about backups, and once online have it all backed up to the remote location, or even better, to an SVN server or something that supports versioning.
I want something which is:
1. free
2. does not overload the network too much, but only sends the diff
3. works 100% of the time
thanks in advance
DropBox does everything you're asking. http://www.getdropbox.com/
Plus it's fully cross-platform: Windows, Mac, Linux.
Free up to 2GB.
I like IDrive Online personally. It's Windows/Mac (no Linux), 2 GB free, and here's the important bit: it stores the last 30 revisions of files. And the history for a file isn't counted against your space either; only the most recent version counts. None of the other free online backup solutions I've seen handle versioning as well. It supports continuous backup too. Oh, and they handle being offline beautifully -- even losing connection in the middle of the backup process doesn't bother it.
Dropbox also keeps previous revisions, including of deleted files.