tracking file renaming/deleting with FSEvents on Lion - objective-c

I'm trying to use FSEvents to detect when files are added to or removed from a specific folder. For the moment, I have implemented a simple wrapper around FSEvents, and it works fine: I get all the events.
BUT the problem I have now is that when I rename a file in the Finder, I catch 2 distinct events: the first one of type "renamed" with the old file name, and another one of type "renamed" with the new file name. The event ids differ between the two calls.
So, how am I supposed to know which "renamed" event contains the old name, and which one contains the new name? I tried looking in the documentation, but unfortunately kFSEventStreamEventFlagItemRenamed is not documented... it seems to be new in Lion.
PS: the only way I could think of was this: on a renamed event, I check my UI to see if I have an item corresponding to the event path. If so, I flag it for renaming. If not, I check whether an item was flagged for renaming, and if so, I rename it to the new event path. But I really don't like this idea...
Edit: OK, I implemented something along the lines of my "PS": I noticed that when renaming something, the ids of the 2 events are consecutive, so with the id of the event containing the new name, I can find the event containing the old name. I simply use a little dictionary in my interface to store the ids and associated paths of "renamed" events.
Anyway, I can now catch rename events, and even move events: when you move a file, the FSEventStream catches a "renamed" event as well...
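To make that pairing logic concrete, here is a minimal sketch, assuming a stream created with kFSEventStreamCreateFlagFileEvents (so the callback receives per-file events as plain C strings); the pendingRenames dictionary and the function name are my own illustration, not the original sources:

#import <Foundation/Foundation.h>
#import <CoreServices/CoreServices.h>

static NSMutableDictionary *pendingRenames; // event id -> path of first half

static void fsCallback(ConstFSEventStreamRef stream, void *info,
                       size_t numEvents, void *eventPaths,
                       const FSEventStreamEventFlags flags[],
                       const FSEventStreamEventId ids[])
{
    char **paths = (char **)eventPaths;
    if (!pendingRenames)
        pendingRenames = [[NSMutableDictionary alloc] init];

    for (size_t i = 0; i < numEvents; i++) {
        if (!(flags[i] & kFSEventStreamEventFlagItemRenamed))
            continue;
        NSString *path = [NSString stringWithUTF8String:paths[i]];
        NSString *oldPath = pendingRenames[@(ids[i] - 1)];
        if (oldPath) {
            // Second event of the pair (id X): its sibling (id X-1) held the old name.
            [pendingRenames removeObjectForKey:@(ids[i] - 1)];
            NSLog(@"renamed: %@ -> %@", oldPath, path);
        } else {
            // First event of the pair: remember it until the next id arrives.
            pendingRenames[@(ids[i])] = path;
        }
    }
}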
But I still have one last problem: deleting. When I delete something, it's moved to the Trash, and I receive a "renamed" event. The problem is that I don't receive the second rename event, only a "modified" event on the .DS_Store file. I think this file is used by the Finder to know which files are in the Trash, etc. So I could watch for modifications to this file and combine that with the last "renamed" event to detect that a file was sent to the Trash. But I'm using TotalFinder, which uses Asepsis, which changes the way the Finder stores .DS_Store files: I no longer receive "modified" events for it.
To summarize: I can't detect when a file is sent to the Trash...
Any idea how I can do that? Maybe with something other than FSEvents to catch just this event?

Well, I didn't find the perfect answer to my question, but I found a solution that I was eventually really satisfied with, so I thought I might share ^^
As I said, when moving stuff to the Trash, if you're only watching one folder, you won't catch the event generated when the file lands in the Trash. So I decided to do the following:
I have a class which creates a stream on the root folder ("/") so that it catches all the events -> this solves the problem of files being sent to the Trash, and the like. This class then allows delegates to be registered for specific paths. So, instead of creating many streams, I create one big stream, filter events as needed, and create many delegates.
So all I have to do now when I want to watch events on a specific folder is the following:
[[FSEventsListener instance] addListener:self forPath:somePath];
I just have to create an instance of FSEventsListener at application start, and release it when the app stops.
And I just need to implement the following three methods, which are called automatically (a sketch of the overall interface follows below):
-(void)fileWasAdded:(NSString *)file;
-(void)fileWasRemoved:(NSString *)file;
-(void)fileWasRenamed:(NSString *)oldFile to:(NSString *)newFile;
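For reference, a sketch of what such an interface can look like. Only addListener:forPath: and the three callbacks above are from the post; the protocol name and singleton shape are my assumptions, and the real code lives in the linked sources:

#import <Foundation/Foundation.h>

@protocol FSEventsListenerDelegate <NSObject>
- (void)fileWasAdded:(NSString *)file;
- (void)fileWasRemoved:(NSString *)file;
- (void)fileWasRenamed:(NSString *)oldFile to:(NSString *)newFile;
@end

@interface FSEventsListener : NSObject
// Singleton owning the single FSEventStream rooted at "/".
+ (FSEventsListener *)instance;
// Register a delegate interested in events below the given path;
// dispatching is then a simple prefix test on each event path.
- (void)addListener:(id<FSEventsListenerDelegate>)listener
            forPath:(NSString *)path;
@end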
If you're interested in the source code of this little utility, you can check here: http://blog.pcitron.fr/tools/macosx-imageviewer/ (the utility was added in version 0.8)
I developed it as part of a little image viewer, to keep the UI synchronized with the disk content (it displays the number of images contained in each directory, etc.). The source code is available, and the utility is in Utils/FSEventsListener.h/.m.
And if by any chance someone actually downloads the application and takes a look at the sources, and you find anything useful (performance/feature improvements, whatever), feel free to drop a comment / mail ^^

You are actually raising two issues related to FSEvents and renames.
1. A file is renamed and both the old and new file names are within the directory trees being monitored.
2. A file is renamed and one of the names is not in the directory trees being monitored.
You have (almost) solved the first issue. You also need a way for your application to know which events are reported within the same FSEvents group. Your method of pairing two consecutively reported renames only works if they belong to the same group of events, reported within the same latency period. If two renames of type 2 occur one after the other but are not reported within the same latency group, they actually have nothing to do with each other, and you will mistakenly conclude that one file has been renamed to another.
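A sketch of that stricter pairing, assuming the per-file callback arguments of the FSEvents C API: two renames are combined only when they are adjacent in the same batch with consecutive ids. The function name is illustrative:

#include <stdio.h>
#import <CoreServices/CoreServices.h>

static void handleRenamesInBatch(size_t numEvents, char **paths,
                                 const FSEventStreamEventFlags flags[],
                                 const FSEventStreamEventId ids[])
{
    for (size_t i = 0; i < numEvents; i++) {
        if (!(flags[i] & kFSEventStreamEventFlagItemRenamed))
            continue;
        if (i + 1 < numEvents
            && (flags[i + 1] & kFSEventStreamEventFlagItemRenamed)
            && ids[i + 1] == ids[i] + 1) {
            // Both halves arrived in the same latency group: a real rename.
            printf("renamed: %s -> %s\n", paths[i], paths[i + 1]);
            i++; // consume the second half
        } else {
            // A lone rename: one end lies outside the watched tree
            // (see the stat() classification below).
        }
    }
}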
It is possible to handle the second type of rename by simply monitoring every directory in the system from the root, but this will flood you with many unnecessary events. Instead, you can determine whether a "partial" rename is the result of a file being moved out of, or into, the directory tree being monitored by doing a stat() on the path. If stat() fails with an errno of ENOENT (2), the file has been moved outside the monitored tree and can be treated as deleted. If stat() succeeds, the event can be treated as a file creation.
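That check is a few lines; a minimal sketch (the function name is illustrative):

#include <sys/stat.h>
#include <errno.h>

static void classifyPartialRename(const char *path)
{
    struct stat st;
    if (stat(path, &st) == 0) {
        // Path still exists: the file moved into the watched tree; treat as created.
    } else if (errno == ENOENT) {
        // errno 2: the file left the watched tree; treat as deleted.
    } else {
        // Other failure (permissions, etc.); handle separately.
    }
}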

Related

Detect old file name and new file name via FSEventsFramework

I am using FSEventsFramework to monitor directories for changes. I was wondering if my logic here is sound for detecting the old file name and the new file name after a rename.
I think on rename, both events are in the same callback.
Old file name will have event id X-1
New file name will have event id X
Is this true?
Thanks
In my experience, your assumption is not always true. To FSEventStreamCreate you pass a latency, as well as a flag, kFSEventStreamCreateFlagNoDefer, that controls how the latency is interpreted. So the two events may or may not arrive in the same callback. Furthermore, there are different ways a file can be renamed: some filesystem APIs, like mv, actually rename the file and keep its inode, while others, like NSDocument, create a new inode. Sometimes you will receive kFSEventStreamEventFlagItemRenamed in the callback, sometimes not.
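For context, this is where both knobs live; a sketch of the creation call, assuming a callback of type FSEventStreamCallback defined elsewhere:

#import <CoreServices/CoreServices.h>

static FSEventStreamRef makeStream(CFArrayRef pathsToWatch)
{
    CFTimeInterval latency = 1.0; // seconds over which events are coalesced
    FSEventStreamRef stream = FSEventStreamCreate(
        kCFAllocatorDefault,
        &callback,
        NULL,                                  // FSEventStreamContext
        pathsToWatch,
        kFSEventStreamEventIdSinceNow,
        latency,
        kFSEventStreamCreateFlagFileEvents     // per-file events (10.7+)
      | kFSEventStreamCreateFlagNoDefer);      // fire at the start of each latency window
    FSEventStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(),
                                     kCFRunLoopDefaultMode);
    FSEventStreamStart(stream);
    return stream;
}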
EDIT: Alternatives to FSEvents are Kernel Queues and NSFileCoordinator
FSEvents is not fully documented in the API docs. Have a look at the header file FSEvents.h; there is more to read there.
Then, to easily see what is going on during a rename, run your app and do some renaming in the Finder, with mv in Terminal, and in a document-based app using the small triangle next to the filename.

Move file in one AccuRev workspace that has been edited in another workspace

We have a need to refactor a code base. The thing is that this will be done by one person, and it would be desirable to avoid having the rest of the development team sit idle while this job takes place.
We therefore tried the following scenario to see if it is possible to work in parallel.
1. Created file test.txt in directory first in developer A's workspace.
2. Promoted this file.
3. Updated developer B's workspace, thereby getting file test.txt.
4. In A's workspace, moved file test.txt to directory second.
5. Promoted this move.
6. In B's workspace, edited file test.txt while it still resides in directory first (no update is made, thereby emulating work being done while the refactoring takes place).
7. Tried to promote and got a message saying that file test.txt had been modified (correct, the file has been moved).
8. Tried to merge but got an error message saying that AccuRev can't merge since the file is missing in directory second (where it has been moved).
9. Tried to update B's workspace, but that is not allowed since there is a modified file that needs to be merged first.
We are now stuck in a catch-22 situation.
We did try to place a fake file in directory second, but it is not recognized since it does not belong to the workspace.
Has anyone out there tried something like this and gotten it to work?
It is of course possible to copy files, but if there is a better way we would be grateful to hear about it, or to learn whether this is a known bug or limitation in the tool.
We will also contact AccuRev support, but I thought that I might be able to get some useful tips from the community.
Currently we are using AccuRev client 5.5.0.
Thanks for any suggestions on how to make the tool support this operation.
Referring to your steps 6 & 7: in AccuRev 5.5, after a file is edited and has (modified) status, you first have to keep it before you can promote.
At step 8 you could try doing the merge from the Browse Versions view of the file. That way you can select any node to merge with, including the one that has been moved.
Step 9: an AccuRev update will not run successfully if one of the files to be updated is (modified). This is by design. You can keep the file so it has (kept)(member) status, then run the update.
David Howland
After contact with AccuRev support, the answer is that the only option available is to copy the file to some temp directory, revert the changes, update the workspace, and copy the file into the new location in the workspace.
AccuRev will at least tell you which files you have to copy since they will be marked as modified.
I was able to experimentally verify David's remark on step 9 using AccuRev 5.5.
Let's assume that in the workspace of user A the file was moved and the move was promoted, while in the workspace of user B the file was modified and user B is about to promote his/her change.
Before the file is kept, user B can neither merge nor update. But after keeping the modified file, the update is possible: the file is first marked as overlap, then the merge succeeds in the new location. Basically, this avoids creating a copy of the file, reverting it and restoring it in the new location after an update, which can be quite cumbersome, as AccuRev does not easily reveal where the file has moved to.
If user B promotes the modification before user A promotes the move, all goes smoothly, i.e. on update the moved file appears as overlap, but easily merges into the moved file in the new location.
Similar results are obtained when the two users have workspaces connected to different streams and the overlap occurs on a common parent stream. An error can occur only if the file is unkept (i.e. only if the move is promoted before the change); then a simple keep allows the user to proceed as usual (update, merge, then promote).

Prevent file being overwritten

Imagine there are 3 or more independent locations where a file can be modified. These locations communicate with each other through email or physical mail (direct flash-drive restoration). Though there is plenty of room for error (simultaneous edits to the file that screw things up), this client won't change much: he would rather call everyone to say he is working on the latest update, or tell the other guys that he is waiting for the third guy's latest update. Anyway, at some point, after several exchanges, due to one participant's unintentional error, THE LATEST VERSION of the file eventually gets mixed up. From that point on, everyone searches for the latest version BY INSPECTING THE CONTENT of the file.
This client wants a central location (he actually has one: a folder on his PC) and to let everybody (including himself) copy any new or suspected-new file to this location, while preventing the file's latest version from being overwritten. From this location he must be able to easily copy, send, or open the file and work.
So, here is my concept (2 steps):
step 1: I made an add-in for the main application where this file is created or edited. This add-in prompts the user to give a version number to the file with every save command invoked from the editing application. In fact, the file can be re-saved multiple times without being considered modified (file attributes like creation and save dates do not mean much here). That said, the user can cancel my add-in and still save the file, without saving a new file version.
step 2: multiple solutions:
solution A: I'm thinking of having a folder/file watcher that prevents the last version of the file from being overwritten. As you know, FileSystemWatcher fires the change/delete etc. events AFTER the fact, so I would have to copy the overwritten file back after the fact (with some tricks).
solution B: have a database to store all versions of the files, and build in a shell extension to extract/view files from the database. Move all copied/pasted files to the database (my program folder) and restore the latest file in the working folder after the watcher fires a change/delete event.
solution C: find built-in Windows tools (APIs etc.) to rely on heavily, with some programming.
Any ideas?
Thanks in advance.

Watch folder for files being Read

I am trying to watch files in a directory to determine when files are opened/accessed. I thought FileSystemWatcher would do the trick using the Changed event.
The problem is that some applications do not take a lock on the file they open/access, nor do they change the date modified or date accessed (even after fsutil behavior set disablelastaccess 0). Notepad, for example: apparently it makes a copy of the file in memory and works with it there until you save, and it doesn't update the date accessed either.
How can I monitor a directory of files and be notified when a file is simply opened/accessed by any program (e.g. Notepad)? Files may be opened from another computer, not necessarily on the computer running the "watcher".
I found lots of similar questions but did not see one focusing on file "access".
This is quite normal. Updating an existing file in place is dangerous, since it can cause irretrievable data loss: a disk error (like disk full) while writing is very bad news. The common algorithm used is:
rename the original file
write a new file using the original name
no error: delete the renamed file
error: delete the new file, rename original file back
Clearly this doesn't cause a Changed event to be raised; no file was changed.
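In code, the algorithm above looks roughly like this; the function name and the writer callback are illustrative, not any particular editor's implementation:

#include <stdio.h>
#include <unistd.h>

// Returns 0 on success; on error the original file is restored.
static int safeSave(const char *path, const char *backup,
                    int (*writeNewFile)(const char *path))
{
    if (rename(path, backup) == -1)       // 1. rename the original file
        return -1;
    if (writeNewFile(path) == 0) {        // 2. write a new file using the original name
        unlink(backup);                   // 3. no error: delete the renamed file
        return 0;
    }
    unlink(path);                         // 4. error: delete the new file,
    rename(backup, path);                 //    rename the original file back
    return -1;
}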
Sorry, I didn't read the question well enough. There is no notification whatsoever for an app merely opening a file for reading; FSW can only detect changes to the file system. There is no ready alternative either: this requires a custom file system filter driver that snoops on driver requests, like the kind SysInternals' ProcMon utility uses. I'm not aware of such a driver ready for use in a C# program, and you can't write one in C# either. This just isn't a common requirement.

How to reliably handle files uploaded periodically by an external agent?

It's a very common scenario: some process wants to drop a file on a server every 30 minutes or so. Simple, right? Well, I can think of a bunch of ways this could go wrong.
For instance, processing a file may take more or less than 30 minutes, so it's possible for a new file to arrive before I'm done with the previous one. I don't want the source system to overwrite a file that I'm still processing.
On the other hand, the files are large, so it takes a few minutes to finish uploading them. I don't want to start processing a partial file. The files are just transferred with FTP or sftp (my preference), so OS-level locking isn't an option.
Finally, I do need to keep the files around for a while, in case I need to manually inspect one of them (for debugging) or reprocess one.
I've seen a lot of ad-hoc approaches to shuffling upload files around, swapping filenames, using datestamps, touching "indicator" files to assist in synchronization, and so on. What I haven't seen yet is a comprehensive "algorithm" for processing files that addresses concurrency, consistency, and completeness.
So, I'd like to tap into the wisdom of crowds here. Has anyone seen a really bulletproof way to juggle batch data files so they're never processed too early, never overwritten before done, and safely kept after processing?
The key is to do the initial juggling at the sending end. All the sender needs to do is:
Store the file with a unique filename.
As soon as the file has been sent, move it to a subdirectory called e.g. completed.
Assuming there is only a single receiver process, all the receiver needs to do is:
Periodically scan the completed directory for any files.
As soon as a file appears in completed, move it to a subdirectory called e.g. processed, and start working on it from there.
Optionally delete it when finished.
On any sane filesystem, file moves are atomic provided they occur within the same filesystem/volume. So there are no race conditions.
Multiple Receivers
If processing could take longer than the period between files being delivered, you'll build up a backlog unless you have multiple receiver processes. So, how to handle the multiple-receiver case?
Simple: each receiver process operates exactly as before. The key is that we attempt to move a file to processed before working on it: that, and the fact that same-filesystem file moves are atomic, means that even if multiple receivers see the same file in completed and try to move it, only one will succeed. All you need to do is check the return value of rename(), or whatever OS call you use to perform the move, and only proceed with processing if it succeeded. If the move failed, some other receiver got there first, so just go back and scan the completed directory again.
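A sketch of that claim-by-rename step, for one or many receivers alike; the directory names match the example above, and the function name and buffer sizes are illustrative:

#include <stdio.h>

// Every receiver attempts the same move; only the one whose
// rename() succeeds goes on to process the file.
static int tryClaim(const char *name)
{
    char src[1024], dst[1024];
    snprintf(src, sizeof src, "completed/%s", name);
    snprintf(dst, sizeof dst, "processed/%s", name);

    if (rename(src, dst) == -1)
        return 0;   // another receiver got there first; rescan completed/
    return 1;       // we own the file in processed/ now; safe to start work
}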
If the OS supports it, use file system hooks to intercept open and close file operations, something like Dazuko. Other operating systems may let you know about file operations in another way; for example, Novell Open Enterprise Server lets you define epochs and read the list of files modified during an epoch.
Just realized that on Linux you can use the inotify subsystem, or the utilities from the inotify-tools package.
File transfer is one of the classics of system integration. I'd recommend getting the Enterprise Integration Patterns book to build your own answer to these questions; to some extent, the answer depends on the technologies and platforms you are using for endpoint implementation and for file transfer. It's quite a comprehensive collection of workable patterns, and fairly well written.