filesAdded and filesSubmitted events - flow-js

I'm using the flow.js library for file uploads in a project.
I'm confused about the files parameter received by the filesAdded and filesSubmitted events. Will it always contain the files from a single directory on disk (in the case of a directory upload) or a single file (in the case of a file upload), or can it also contain files from seemingly unrelated uploads?
To give you an example, consider the scenario where a user adds two files in sequential fashion.
Can the filesAdded and filesSubmitted events be triggered with the files parameter containing both these files even when they were part of unrelated uploads?
The problem arises when you have multiple upload sections on a page and use a single Flow instance to handle all of them. If files uploaded in distinct sections can appear together in events like filesAdded or filesSubmitted, files belonging to separate upload actions get mixed up. One solution is to create a new Flow instance for every upload section on the page, but I wanted to understand the behavior of filesAdded and filesSubmitted, and whether that behavior allows solving the problem with just one Flow instance.
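The one-instance-per-section workaround I mentioned would look roughly like the following minimal sketch (TypeScript, assuming the @flowjs/flow.js npm package, a /upload target, and some illustrative markup: a data-upload-section attribute and a .browse button in each section; none of those names come from flow.js itself):

import Flow from '@flowjs/flow.js';

document.querySelectorAll<HTMLElement>('[data-upload-section]').forEach((section) => {
  // One Flow instance per upload section: filesAdded/filesSubmitted fired on
  // this instance can only ever contain files picked or dropped in this section.
  const flow = new Flow({ target: '/upload' });
  flow.assignBrowse(section.querySelector('.browse') as HTMLElement);
  flow.assignDrop(section);

  flow.on('filesAdded', (files: unknown[]) => {
    console.log('files added in section', section.dataset.uploadSection, files.length);
  });
  flow.on('filesSubmitted', () => flow.upload());
});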

Related

Amazon S3: How to safely upload multiple files?

I have two client programs which are using S3 to communicate some information. That information is a list of files.
Let's call the clients the "uploader" and "downloader":
The uploader does something like this:
upload file A
upload file B
upload file C
upload a SUCCESS marker file
The downloader does something like this:
check for SUCCESS marker
if found, download A, B, C.
else, get data from somewhere else
and both of these programs are being run periodically. The uploader will populate a new directory when it is done, and the downloader will try to get the latest versions of A,B,C available.
Hopefully the intent is clear — I don't want the downloader to see a partial view, but rather get all of A,B,C or skip that directory.
However, I don't think that works, as written. Thanks to eventual consistency, the uploader's PUTs could be reordered into:
upload file B
upload a SUCCESS marker file
upload file A
...
And at this moment, the downloader might run, see the SUCCESS marker, and assume the directory is populated (which it is not).
So what's the right approach, here?
One idea is for the uploader to first upload A, B, C, then repeatedly check that the files are stored, and only after it sees all of them, finally write the SUCCESS marker.
Would that work?
Stumbled upon a similar issue in my project.
If the intention is to guarantee cross-file consistency (between files A, B, C), the only possible solution (purely within S3) is:
1) put them as NEW objects, and
2) do not explicitly check for their existence with a HEAD or GET request prior to the PUT.
These two constraints are required for fully consistent read-after-write behavior (https://aws.amazon.com/about-aws/whats-new/2015/08/amazon-s3-introduces-new-usability-enhancements/).
Each time you update the files, you need to generate a unique prefix (folder) name and put this name into your marker file (the manifest) which you are going to UPDATE.
The manifest will have a stable name but will be eventually consistent. Some clients may get the old version and some may get the new one.
The old manifest will point to the old "folder" and the new one will point to the new "folder". Thus each client will read either only old files or only new files, but never a mix, so cross-file consistency is achieved. Different clients may still end up with different versions, but if the clients keep polling the manifest and pick up changes, they will eventually become consistent too.
A possible solution for client inconsistency is to move the manifest metadata out of S3 into a strongly consistent database (such as DynamoDB).
A few obvious caveats with the pure S3 approach:
1) the full set of files must be uploaded each time (incremental updates are not possible)
2) old, obsolete folders eventually need to be cleaned up
3) clients need to keep polling the manifest to pick up updates
4) clients may be inconsistent with each other
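A minimal sketch of the uploader side of this scheme (AWS SDK for JavaScript v3; the bucket name, the batches/ key layout and the manifest.json key are illustrative assumptions, not anything S3 requires):

import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { randomUUID } from 'crypto';

const s3 = new S3Client({});
const bucket = 'my-bucket'; // illustrative

// Write A, B, C under a brand-new prefix, then overwrite the manifest.
export async function publish(files: Record<string, Buffer>): Promise<void> {
  // The prefix is never reused, so every object below it is a NEW object that
  // is never GET/HEAD-checked first, giving read-after-write consistency.
  const prefix = `batches/${Date.now()}-${randomUUID()}`;
  for (const [name, body] of Object.entries(files)) {
    await s3.send(new PutObjectCommand({ Bucket: bucket, Key: `${prefix}/${name}`, Body: body }));
  }
  // The manifest has a stable key and is only eventually consistent, but any
  // version of it points at a complete, immutable set of files.
  const manifest = JSON.stringify({ prefix, files: Object.keys(files) });
  await s3.send(new PutObjectCommand({ Bucket: bucket, Key: 'manifest.json', Body: manifest }));
}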
It is also possible to do this with single copies of the files in S3. Each file (A, B, C) has a unique hash or version code prepended to it (e.g. an md5sum generated from the concatenation of all three files).
In addition, the hash value is uploaded to the bucket as a separate object.
When consuming the files, first read the hash object and compare it to the last hash successfully consumed. If it has changed, read the files and check the hash value embedded in each. If they all match, the data is valid and may be used. If not, the downloaded files should be discarded and downloaded again (after a suitable delay).
This catches the occasional race condition between writes and reads across multiple objects.
It works because the hash is repeated in all objects. The hash object is actually optional, serving as a low-cost, fast shortcut for determining whether the data has been updated.
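The downloader side of that scheme might look like this sketch (AWS SDK for JavaScript v3; the latest.hash key and the convention that each data object starts with the hash on its first line are assumptions made for illustration):

import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({});
const bucket = 'my-bucket'; // illustrative
let lastConsumedHash = '';

async function getBody(key: string): Promise<string> {
  const res = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  return res.Body!.transformToString();
}

// Returns the file bodies if a complete, consistent update is available,
// or null if nothing changed or a partial update was detected (retry later).
export async function consumeIfUpdated(keys: string[]): Promise<string[] | null> {
  const hash = (await getBody('latest.hash')).trim();
  if (hash === lastConsumedHash) return null;
  const bodies = await Promise.all(keys.map(getBody));
  // Every object carries the same hash on its first line; a mismatch means we
  // raced with the writer and should discard and retry after a delay.
  if (!bodies.every((b) => b.startsWith(hash))) return null;
  lastConsumedHash = hash;
  return bodies;
}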

Writing files safely (WinRT)

What approach do you use to write critical app files like settings, configuration files, user files in WinRT, or in general?
To illustrate my concern: in my app I save the list of user-selected data sources as a JSON file. When the user updates the list and saves it, I just overwrite the current file with the newly serialized list. But if the app were killed from the task manager, or the computer lost power at the very moment the file was being written, the file would be left in an inconsistent state; it would probably prevent the app from launching, and the user would certainly lose data.
I have considered writing to a different file and then swapping the two when finished. Is this the best solution possible?
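For illustration, the write-then-swap pattern I have in mind looks roughly like this outside of WinRT (sketched with Node's fs; the WinRT equivalent would write to a temporary file in the same folder and then replace the original):

import { writeFileSync, renameSync } from 'fs';

// Write the complete new contents to a temporary file first, then swap it
// into place; a crash mid-write leaves the old file intact.
export function saveAtomically(path: string, contents: string): void {
  const tmp = `${path}.tmp`;
  writeFileSync(tmp, contents);
  // On the same volume the rename is atomic on typical file systems, so
  // readers see either the old file or the new one, never a half-written
  // mixture. (A fully robust version would also flush/fsync before renaming.)
  renameSync(tmp, path);
}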

Tracking file renaming/deleting with FSEvents on Lion

I'm trying to use FSEvents to detect when files are added to or removed from a specific folder. For the moment, I implemented a simple wrapper around FSEvents, and it works fine: I get all the events.
BUT the problem I have now is that when I rename a file in the Finder, I catch 2 distinct events: the first one of type "renamed" with the old file name, and another one, also "renamed", with the new file name. The event ids are different between the two calls.
So, how am I supposed to know which "renamed" event contains the old name, and which one contains the new one? I tried looking in the documentation, but unfortunately kFSEventStreamEventFlagItemRenamed is not documented... it seems new in Lion.
PS: the only way I could think of was: on a renamed event, I check my UI to see if I have an item corresponding to the event path. If so, I flag it for renaming. If not, I check if an item was flagged for renaming, and if so, I rename it to the new event path. But I really don't like this idea...
Edit: OK, I implemented something along the lines of my "PS": I noticed that when renaming something, the ids of the 2 events are consecutive, so with the id of the event containing the new name, I can get the event containing the old name. I simply use a little dictionary in my interface to store ids and associated paths in the case of a "renamed" event.
Anyway, I can now catch rename events, and even move events: when you move a file, a "renamed" event is caught by the FSEventStream...
But I still have one last problem: deleting. When I delete something, it's moved to the trash, and I receive a "renamed" event. The problem is that I don't receive the second rename event, only a "modified" event on the .DS_Store file. I think this file is used by the Finder to know which files are in the trash, etc. So I could watch for modifications to this file and use the last "renamed" event to detect that a file was sent to the trash. But I'm using TotalFinder with Asepsis, which changes the way the Finder stores .DS_Store files: I no longer receive "modified" events for it.
To summarize: I can't detect when a file is sent to the trash...
Any idea how I can do that? Maybe use something other than FSEvents to catch just this event?
Well, I didn't find the perfect answer to my question, but I found a solution I was eventually really satisfied with, so I thought I might share ^^
As I said, when moving stuff to the trash, if you're only watching one folder, you won't catch the event generated when the file lands in the trash. So I decided to do the following:
I have a class which creates a stream on the root folder ("/") so that it catches all events; this solves the problem of files being sent to the trash, and similar cases. This class then allows registering delegates for specific paths. So, instead of creating many streams, I create one big stream, filter events as needed, and create many delegates.
So all I have to do now when I want to watch events on a particular folder is the following:
[[FSEventsListener instance] addListener:self forPath:somePath];
I just have to create an instance of FSEventsListener at application start, and release it when the app stops.
And I just need to implement the following 3 methods, which will be called automatically:
-(void)fileWasAdded:(NSString *)file;
-(void)fileWasRemoved:(NSString *)file;
-(void)fileWasRenamed:(NSString *)oldFile to:(NSString *)newFile;
If you're interested in the source code of this little utility, you can check here: http://blog.pcitron.fr/tools/macosx-imageviewer/ (the utility was added in version 0.8).
I developed it as part of a little image viewer, to keep the UI synchronized with the disk content (it displays the number of images contained in each directory, etc.). The source code is available, and the utility is in Utils/FSEventsListener.h/.m.
And if by any chance someone actually downloads the application and takes a look at the sources, and you find anything useful (performance/feature improvements, whatever), feel free to drop a comment / mail ^^
You are actually raising two issues related to FSEvents and renames.
1. A file is renamed and both the old and new file names are within the directory trees being monitored.
2. A file is renamed and one of the names is not in the directory trees being monitored.
You have (almost) solved the first issue. It is also necessary to give your application a means of knowing which events are reported in the same FSEvents group. Your method of matching two consecutively reported renames only works if they fall within the same group of events, reported within the same latency period. If two rename events of type 2 occur one after another but are not within the same group/latency period, they actually have nothing to do with each other, and you will mistakenly think one file has been renamed to the other.
It is possible to handle the second type of rename by simply monitoring every directory in the system from the root, but this will flood you with many unnecessary events. You can determine whether a "partial" rename is the result of a file being moved out of or into the monitored directory tree by doing a stat() on the file. If stat() fails with an errno of 2 (ENOENT), the file has been moved outside the monitored tree and can be treated as deleted. If stat() succeeds, the event can be treated as a file creation.
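The check itself is tiny; roughly the following, sketched with Node's fs purely for illustration (natively it would be a stat() call inside your event callback):

import { stat } from 'fs/promises';

// Classify a "partial" rename: if the path still exists, the file was moved
// into the watched tree (treat as created); ENOENT (errno 2) means it was
// moved out of it (treat as deleted).
async function classifyPartialRename(path: string): Promise<'created' | 'deleted'> {
  try {
    await stat(path);
    return 'created';
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'ENOENT') return 'deleted';
    throw err;
  }
}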

Recommended document structure. File Wrappers? Roll my own?

I'm currently working out the best structure for a document I'm trying to create. The document is basically a Core Data document that uses SQLite as its store, but uses the Apple-provided NSPersistentDocument+FileWrapperSupport to enable file wrapper support.
The document makes heavy use of media, such as images, videos, audio files, etc. with potentially 1000s of files. So what I'm trying to do is create a structure similar to the following:
/myfile.ext/
/myfile.ext/store.sqlite
/myfile.ext/content/
/myfile.ext/content/images/*
/myfile.ext/content/videos/*
/myfile.ext/content/audio/*
Now, first of all I went down the route of creating a temporary directory and placing all of my media in there, basically creating the paths and file names ('/content/images/image1.jpg') as I wanted them to appear in the saved file wrapper; then, upon save, I attempted to copy these all into the file wrapper...
What I found was that the files were indeed copied into the wrapper with the file structure I wanted, but when the actual wrapper was saved, these files all magically disappeared.
Great.
So, I trashed my existing solution and tried to use file wrappers instead. This solution involved creating a content directory file wrapper when a new document was created, or loading one in when opening an existing document.
When an image was added/modified, I created the necessary directory wrappers inside this root content wrapper (i.e. an images directory wrapper if it didn't already exist, or any other intermediary directory wrappers that needed to be created) and then created a regular file wrapper for the media, removing any existing wrapper for that file name if one was there.
Saving the document was just a case of making sure the content file wrapper was added to the document file wrapper, and the document would save.
Well... it did. The first time. However, any subsequent changes (e.g. add an image and save, then replace the image and save again) did not behave as expected: only the image from the first save was shown.
So, my question is... first of all, which of the above approaches is the correct one, if either, and what am I doing wrong for them to fail?
And secondly, as I expect to be managing thousands of images, is using file wrappers the correct way to go about this at all?
With that much media in play, you should likely give your users control over whether the media itself resides in the document, or whether the document holds only a reference to it and the media resides elsewhere, such as in a library/repository managed by your application. They could then save out a (potentially many times larger) copy with all references resolved.
You might want to zip/unzip any directory so that users don't get confused trying to attach the document to an email. I believe iWork has been doing this with its document bundles for a while now.
As for what you are doing wrong, no one can say, as you haven't provided any code demonstrating what you are doing.
Why don't you create a one-off application that lets you select files on disk and saves those files in a document using a file wrapper? This would let you tackle this functionality without any interference from other issues in your application. Once you understand how to use file wrappers, you can port the code back or just write new code that works.

Storing uploaded content on a website

For the past 5 years, my typical solution for storing uploaded files (images, videos, documents, etc.) has been to throw everything into an "upload" folder and give each file a unique name.
I'm looking to refine my methods for storing uploaded content and I'm just wondering what other methods are used / preferred.
I've considered storing each item in its own folder (the folder name being the Id in the db) so I can preserve the uploaded file name. I've also considered uploading all media to a locked folder and then using a file handler: you pass it the Id of the file you want to download in the query string, and it reads the file and sends the bytes to the user. This is handy for checking access and restricting bandwidth for users.
I think the file handler method is a good way to handle files, as long as you know how to make good use of resources on your platform of choice. It is possible to do stupid things like read a 1 GB file into memory if you don't know what you are doing; streaming the file, as in the sketch below, avoids that.
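As an illustration of the file-handler idea, a minimal sketch in TypeScript with Express; the route, the lookup helper and the access check are stand-ins invented for the example, not part of any particular framework:

import express, { Request } from 'express';
import { createReadStream } from 'fs';

interface StoredFile { pathOnDisk: string; originalName: string; }

// Hypothetical stand-ins for a real database lookup and permission check.
async function lookupFileById(id: string): Promise<StoredFile | undefined> {
  return { pathOnDisk: `/var/uploads/${id}`, originalName: `${id}.bin` };
}
function userMayAccess(_req: Request, _file: StoredFile): boolean {
  return true;
}

const app = express();

app.get('/files/:id', async (req, res) => {
  const file = await lookupFileById(req.params.id);
  if (!file || !userMayAccess(req, file)) {
    res.sendStatus(404);
    return;
  }
  res.setHeader('Content-Disposition', `attachment; filename="${file.originalName}"`);
  // Stream from disk instead of reading the whole file into memory, so a
  // large file doesn't cost its full size in RAM.
  createReadStream(file.pathOnDisk).pipe(res);
});

app.listen(3000);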
In terms of storing the files on disk it is a question of how many, what are the access patterns, and what OS/platform you are using. For some people it can even be advantageous to store files in a database.
Creating a separate directory per upload seems like overkill unless you are doing some kind of versioning. My personal preference is to rename files as they are uploaded and store the original name. When a user downloads a file, I attach the original name again.
Consider a virtual file system such as SolFS. Here's how it can solve your task:
If you have returning visitors, you can have a separate container for each visitor (named after the visitor's login, for example). One of the benefits of this approach is that you can encrypt the container using the visitor's password.
If you have many, probably one-time, visitors, you can have one or several containers with files grouped by date of upload.
A virtual file system lets you keep the original filenames either as the actual filenames or as metadata for the stored files.
Finally, you can compress the data stored in the container.