Intercepting File Writes on OS X - objective-c

I have a program that generates information from the contents of files. However, I believe it would be more efficient to do this as the files are being written, rather than reading the contents back after some delay, since I can generate the data as the file is written to disk.
What method(s) are available for an application to hook into the file-write process, i.e. to process the data stream as it's being written to disk? Also, which of these (if any) are allowed for App Store apps?
I've been considering a Spotlight Importer, but this still involves reading the contents of a file after it has been written, in which case I'm relying on the file still being in the RAM cache to reduce disk access.

Related

Saving data so it is persistent in Objective-C

Across many different Objective-C iOS projects I have frequently run into the issue of keeping data accessible after I initially receive it.
For example, currently I am reading from the stackoverflow API. I do this with a session and get a dictionary back (my JSON response).
But outside the scope of the session, the dictionary is unavailable! I can't copy its contents into a different dictionary that I've defined globally, or anything else. It's as if it disappears outside the session.
So I am wondering: what's the best way to save this data for later use? From what I've been reading, it seems like NSUserDefaults or maybe a plist file, although admittedly I've had trouble with both options. If there is one method that is best for this, I can concentrate on that.
Thank you!
It depends on how persistent you want the data to be.
If you store this dictionary in a global variable, it lives in the part of the device's RAM that is reserved for the running app. When the app stops running (gets killed by the OS or by the user), or the device reboots, that memory is lost.
If you save the dictionary to the device's flash storage (its file system), it survives restarts and reboots.
Usually people combine the two approaches: when you get the data from the network, you keep it in a global variable and also save it to the file system. After an app restart, you try to load the data from the file system. The reason for not using the file system all the time is that it is much slower than RAM access. What I'm describing is, essentially, caching.
Note that you can implement caching manually (using plain data or text files, NSUserDefaults, Core Data or other libraries), but you can also use the built-in HTTP cache, NSURLCache. If you create a session with NSURLSession.sharedSession, it will use the default NSURLCache and respect the caching policy dictated by the server side.
For more control and full offline support, I'd recommend implementing caching manually; see the documentation on reading and writing plists and writeToFile:atomically:.
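For example, here is a minimal sketch of the manual route (assuming the response dictionary contains only property-list types, i.e. NSString, NSNumber, NSArray, NSDictionary, NSData, NSDate; the file name and helper names are my own):

    #import <Foundation/Foundation.h>

    // Hypothetical cache location; any writable directory works.
    static NSURL *CacheFileURL(void) {
        NSURL *dir = [[[NSFileManager defaultManager] URLsForDirectory:NSCachesDirectory
                                                             inDomains:NSUserDomainMask] firstObject];
        return [dir URLByAppendingPathComponent:@"apiResponse.plist"];
    }

    // Call this from the session's completion handler.
    static void SaveResponse(NSDictionary *response) {
        [response writeToURL:CacheFileURL() atomically:YES]; // plist types only
    }

    // Call this at launch to restore the last response.
    static NSDictionary *LoadResponse(void) {
        return [NSDictionary dictionaryWithContentsOfURL:CacheFileURL()];
    }

(Also note that the dictionary isn't really "disappearing": the session's completion handler runs asynchronously, so you have to copy or save the response inside that handler before trying to use it elsewhere.)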

Writing files safely (WinRT)

What approach do you use to write critical app files like settings, configuration files, user files in WinRT, or in general?
To illustrate my concern: in my app I save the list of user-selected data sources as a JSON file. When the user updates the list and saves it, I simply overwrite the current file with the newly serialized list. But if the app were killed from the task manager, or the computer lost power at the very moment the file was being written, the file would be left in an inconsistent state; that would probably prevent the app from launching, and the user would certainly lose data.
I've considered writing to a separate file and then swapping the two when finished. Is that the best solution available?
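The write-to-temp-then-swap idea is the standard answer. A minimal sketch of the pattern (shown in Objective-C for consistency with the other threads on this page; on WinRT the equivalent building blocks would be a temporary StorageFile plus StorageFile.MoveAndReplaceAsync):

    // Never overwrite the destination in place: write a temp file,
    // then atomically swap it in, so a crash mid-write at worst
    // leaves a stale-but-valid original behind.
    static BOOL SaveDataSafely(NSData *json, NSURL *destination, NSError **error) {
        NSURL *tmp = [destination URLByAppendingPathExtension:@"tmp"];
        if (![json writeToURL:tmp options:NSDataWritingAtomic error:error]) {
            return NO;
        }
        // Assumes destination already exists; a first-time save can write directly.
        return [[NSFileManager defaultManager] replaceItemAtURL:destination
                                                  withItemAtURL:tmp
                                                 backupItemName:nil
                                                        options:0
                                               resultingItemURL:NULL
                                                          error:error];
    }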

Reading file content from the MFT at runtime

I need to read the MFT of a running Windows system (XP or higher) and, through it, reach the disk sectors that hold the contents ($DATA) of a specific file on the machine.
The problem is that between reading the MFT and fetching and reading the relevant sectors, the file system structure can change and the locations may no longer be valid.
Is there a way to "freeze" the system for a certain time, or to guarantee that this file will not change? Can I lock a specific file so that it does not move between sectors (including moves caused by defragmentation or other indirect changes)?
Of course I would prefer not to copy the entire hard disk and work on a static image, since that is a slow operation that would prevent normal use of the system in the meantime. Needless to say, I don't want to use the OS's API functions or write a driver.
I'd simply open the file, requesting read/write access, with read-only share mode. If you succeed in opening the file, you're guaranteed that the data will not change until you close the handle. See https://msdn.microsoft.com/en-us/library/windows/desktop/hh449422%28v=vs.85%29.aspx
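For example (a plain C/Win32 sketch of that open call; the function name is mine and error handling is trimmed):

    #include <windows.h>

    // Request write access (so the open fails if a writer is active)
    // but share only reads, so nobody else can write while we hold it.
    HANDLE OpenFilePinned(LPCWSTR path) {
        HANDLE h = CreateFileW(path,
                               GENERIC_READ | GENERIC_WRITE,
                               FILE_SHARE_READ,      // readers OK, writers refused
                               NULL,
                               OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL,
                               NULL);
        // INVALID_HANDLE_VALUE means the file is already open elsewhere
        // with conflicting access; the data is only guaranteed stable
        // while this handle remains open.
        return h;
    }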
If you want to achieve this for files that are already opened and locked by other processes, that's an entirely different story, and I believe you'd have to write your own filter driver.
If the file's location on disk changes, the change will be reflected in the MFT. So instead of trying to stop all activity on the file, you can simply compare the MFT record before and after reading the file's data. Unless you are defragmenting or deleting contents of the file, its storage layout will not change; appends do not affect the consistency of the data you have already read. If that matches your scenario, you can go ahead with the method above.

How chunk file upload works

I am working on file upload and really wondering how chunked file upload actually works.
I understand that the client sends the data to the server in small chunks instead of the complete file at once, but I have a few questions:
For the browser to divide the whole file into chunks and send them, will it read the complete file into memory? If so, isn't there again a risk of excessive memory use and browser crashes for big files (say > 10 GB)?
How do cloud applications like Google Drive and Dropbox handle uploads of such big files?
If multiple files are selected for upload, all larger than 5-10 GB, does the browser keep all of the files in memory and then send them chunk by chunk?
Not sure if you're still looking for an answer; I was in your position recently, and here's what I've come up with, hope it helps: Deal chunk uploaded files in php
During uploading, if you print out the request on the backend, you will see three parameters: _chunkNumber, _totalSize and _chunkSize. With these parameters it's easy to decide whether a given chunk is the last piece; if it is, assembling all of the pieces into the whole file shouldn't be hard.
As for the JavaScript side, ng-file-upload has a setting named "resumeChunkSize" with which you can enable chunked mode and set the chunk size.
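On the memory question: a chunking client never needs the whole file in RAM, because it can read one slice at a time from disk and upload it before reading the next. A minimal sketch (in Objective-C, for consistency with the other threads on this page; the endpoint and the X-Chunk-* header names are made up, loosely mirroring the _chunkNumber/_totalSize/_chunkSize parameters above):

    static const unsigned long long kChunkSize = 1024 * 1024; // 1 MB, arbitrary

    // Memory use stays at roughly one chunk regardless of file size.
    void UploadFileInChunks(NSURL *fileURL, NSURL *endpoint) {
        NSFileHandle *fh = [NSFileHandle fileHandleForReadingFromURL:fileURL error:NULL];
        unsigned long long total = [[[NSFileManager defaultManager]
            attributesOfItemAtPath:fileURL.path error:NULL] fileSize];
        unsigned long long offset = 0;
        NSUInteger chunkNumber = 0;
        while (offset < total) {
            NSUInteger len = (NSUInteger)MIN(kChunkSize, total - offset);
            NSData *chunk = [fh readDataOfLength:len]; // only this slice is in memory
            NSMutableURLRequest *req = [NSMutableURLRequest requestWithURL:endpoint];
            req.HTTPMethod = @"POST";
            [req setValue:@(chunkNumber).stringValue forHTTPHeaderField:@"X-Chunk-Number"];
            [req setValue:@(total).stringValue forHTTPHeaderField:@"X-Total-Size"];
            [req setValue:@(len).stringValue forHTTPHeaderField:@"X-Chunk-Size"];
            // Fire-and-forget for brevity; a real client would chain the
            // uploads serially and handle errors and retries.
            [[[NSURLSession sharedSession] uploadTaskWithRequest:req
                                                        fromData:chunk
                                               completionHandler:^(NSData *d, NSURLResponse *r, NSError *e) {}] resume];
            offset += len;
            chunkNumber++;
        }
        [fh closeFile];
    }

Browsers work the same way with File.slice(): slicing a Blob is lazy, so only the chunk currently being sent has to be resident in memory.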

(OS X) Determine if file is being written to?

My app is monitoring a "hot" folder somewhere on the local filesystem for newly added files to push to a network location. I'm running into a problem when very large files are being written into the hot folder: the file system event notifying me of changes in the hot folder will fire well before the file completes writing. When my app tries to upload the file, it mis-reads the file size as the current number of copied bytes, not the eventual total number of bytes.
Things I've tried:
NSURL getResourceValue:forKey:error: to read NSURLAllocatedFileSizeKey (same value as NSURLFileSizeKey while the file is being written).
NSFileManager attributesOfItemAtPath:error: to look at NSFileBusy (always NO).
I can't seem to find any mechanism short of repeatedly polling a file for its size to determine if the file is finished copying and can be uploaded.
There aren't great ways to do this.
If you can be certain that the writer is using NSFileCoordinator, then you can also use that to coordinate your access to the file.
Likewise, if you're sure that the writer has opted in to advisory locking, you could try to open the file for shared access by calling open() with the O_SHLOCK and O_NONBLOCK flags. If you succeed, then there are no other descriptors open for exclusive access. You can either use the file descriptor you've got or close it and then use some other API to access the file.
However, if you can't be sure of any of those, then your best bet may be to set a timer to repeatedly check the file's metadata (size, date modified, etc.). Only when you see that it has stopped changing over a reasonable time interval (2 seconds, maybe) would you attempt to access it (and cancel the timer).
You might want to do all three: wait for the file's metadata to settle down, then use an NSFileCoordinator to read from the file, and when it calls your reader block, use open() with O_SHLOCK | O_NONBLOCK to make sure there are no other processes with exclusive access to it.
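A minimal sketch of that combined approach (the 2-second settle interval comes from above; the function name and everything else is my own choosing):

    #import <Foundation/Foundation.h>
    #include <fcntl.h>

    // Poll until size and modification date stop changing, then read
    // under an NSFileCoordinator, double-checking with a shared lock.
    void ReadWhenSettled(NSURL *fileURL, void (^handler)(NSData *)) {
        __block NSDictionary *previous = nil;
        [NSTimer scheduledTimerWithTimeInterval:2.0 repeats:YES block:^(NSTimer *timer) {
            NSDictionary *attrs = [[NSFileManager defaultManager]
                attributesOfItemAtPath:fileURL.path error:NULL];
            BOOL settled = previous != nil
                && [attrs[NSFileSize] isEqual:previous[NSFileSize]]
                && [attrs[NSFileModificationDate] isEqual:previous[NSFileModificationDate]];
            previous = attrs;
            if (!settled) return;
            [timer invalidate];
            NSFileCoordinator *fc = [[NSFileCoordinator alloc] initWithFilePresenter:nil];
            [fc coordinateReadingItemAtURL:fileURL options:0 error:NULL byAccessor:^(NSURL *url) {
                // O_NONBLOCK makes open() fail instead of blocking if
                // another process holds an exclusive advisory lock.
                int fd = open(url.fileSystemRepresentation, O_RDONLY | O_SHLOCK | O_NONBLOCK);
                if (fd < 0) return; // still locked; could reschedule the timer
                handler([NSData dataWithContentsOfURL:url]);
                close(fd);
            }];
        }];
    }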
You need some form of coordinated file locking. fcntl() and flock() are the usual functions for this. Read up on them first, then see what options you have. If you can control the code base of the other processes involved, all the better; a writer-side sketch follows below.
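To illustrate the writer side (a plain-C sketch, assuming you control the writing process and it opts in to advisory locking; cooperating readers would then take LOCK_SH, or open with O_SHLOCK, and either wait or fail fast until the write finishes):

    #include <fcntl.h>
    #include <sys/file.h>
    #include <unistd.h>

    // Hold an exclusive advisory lock for the duration of the write.
    int write_locked(const char *path, const void *buf, size_t len) {
        int fd = open(path, O_WRONLY | O_CREAT, 0644);
        if (fd < 0) return -1;
        if (flock(fd, LOCK_EX) != 0) { close(fd); return -1; }
        ftruncate(fd, 0); // truncate only after the lock is held
        ssize_t written = write(fd, buf, len);
        flock(fd, LOCK_UN); // also released automatically on close()
        close(fd);
        return written == (ssize_t)len ? 0 : -1;
    }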
The problem with really large files is that what has changed, or is changing, inside them is opaque, and it isn't always at the end.
Well-behaved writers should generally be doing atomic writes (write to a temp file, then swap it into place), but if these files are actually databases you will want to look at using the database's own server process for this sort of thing.
If the files are wrappers (packages) containing other files, it gets extra messy, as those contents might depend on one another to be in a usable state.