Dropbox API - Are the File IDs unique?

We are using the Dropbox REST API to migrate files from a Dropbox repo to our platform. The problem is that different files have the same IDs. We use the IDs and paths to create the folder/file hierarchy and replicate it in our platform. Because of this issue, some files appear in different folders rather than in the expected folder. Is this a bug, or should we rely on the file paths entirely?

File IDs in the Dropbox API are indeed expected to be unique. That is, files at two different paths shouldn't simultaneously have the same ID.
If you are seeing that, that would be a bug, which you can report here:
https://www.dropbox.com/developers/contact
However, note that file IDs are case-sensitive, and it's possible for some file IDs to differ only by case, so make sure you're using case-sensitive comparisons.
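
To illustrate the case-sensitivity point, here is a minimal Python sketch. The ID strings and paths are made up for the example (they are not real Dropbox IDs), and `index_by_id` is a hypothetical helper, not part of any Dropbox SDK. It shows how two IDs that differ only by case stay distinct under exact comparison but merge into one if they are lower-cased first.

```python
def index_by_id(entries):
    """Map each file ID to its path, comparing IDs exactly (case-sensitively)."""
    index = {}
    for file_id, path in entries:
        if file_id in index and index[file_id] != path:
            # Same exact ID at two paths: this is the case worth reporting as a bug.
            raise ValueError(f"ID {file_id!r} seen at two different paths")
        index[file_id] = path
    return index


# Made-up IDs that differ only in letter case.
entries = [
    ("id:a4ayc_80_OEAAAAAAAAAXw", "/reports/q1.pdf"),
    ("id:a4ayc_80_oeaaaaaaaaaxw", "/archive/q1.pdf"),
]

index = index_by_id(entries)                            # two distinct keys
case_folded = {file_id.lower() for file_id, _ in entries}  # collapses to one key
```

If the migration stores IDs in a database column or a dictionary that folds case, the second file would silently overwrite or shadow the first, which would look like one file appearing in the wrong folder.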

Related

Handling large amounts of file uploads - Any limitations I should know of?

I'm building a website that will involve a lot of uploaded files. Hopefully, more than I intend for there to be.
I figured I'd have an uploaded files path and use a UUID as the filename. I was curious if there are any limitations on this? For instance, would storing thousands of files in the one folder on my server create problems?
There are quite a few issues that could appear, from file system limitations to backup problems.
I suggest using the first X characters of the UUID as the folder name, possibly multiple levels deep (first 4, second 4, third 4). This way you have one structure but can back up folders and move them to different servers if needed later (by using the folders as redirection points).
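
A rough Python sketch of that layout, assuming three levels of four hex characters each; the `sharded_path` name and the `uploads` root are placeholders chosen for the example.

```python
import uuid
from pathlib import PurePosixPath


def sharded_path(root, file_id, levels=3, width=4):
    """Build root/aaaa/bbbb/cccc/<full hex id> from the leading hex digits of a UUID."""
    hex_id = file_id.hex  # 32 hex characters, no dashes
    parts = [hex_id[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, hex_id)


# A new upload would use uuid.uuid4(); a fixed value keeps the example readable.
example_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
example_path = sharded_path("uploads", example_id)
```

The full ID is kept as the file name, so the path can always be recomputed from the ID alone and no lookup table is needed.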

Copy (or list) files of a remote folder, given the URL - using objective-c - is it possible?

I'd like to list all files from a remote folder (let's say www.mysite.com/folder, and this folder is already configured through .htaccess for directory listing).
After listing, i'll need to copy the remote files to a local folder.
For listing/copying only local files, I was using NSFileManager, but this doesn't work for the remote ones. I've been looking for some reference on it, but couldn't find so far...
While NSFileManager can in fact handle URLs, it's not going to download the Apache HTML page with the directory listing and parse it to do this; you'll have to do that yourself. This sounds like a strange thing to be doing, however, so you may want to explain the reasoning and we may be able to suggest better alternatives. WebDAV comes to mind.
UPDATE: Based on your comment, why not put the resources in a .zip (or similar) file and download that? Then it's a single download and you can just extract it locally. Sounds like it would save a lot of headaches and would make it much easier to do things like checksum validations on the download(s).
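
The question is about Objective-C, but the verify-then-extract flow is language-independent, so here is a minimal sketch in Python. It assumes the archive has already been downloaded and that the expected SHA-256 digest is published alongside it; both function names are hypothetical.

```python
import hashlib
import zipfile
from pathlib import Path


def sha256_of(path, chunk_size=64 * 1024):
    """Hash a file in chunks so large archives are never loaded into memory whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive_path, expected_sha256, dest_dir):
    """Extract the archive only if its checksum matches; return the member names."""
    actual = sha256_of(archive_path)
    if actual != expected_sha256.lower():
        raise ValueError(f"checksum mismatch: got {actual}")
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

One checksum covers every resource in the archive, which is what makes this simpler than validating many individual downloads.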
Maybe it's not the best way, but instead of getting a directory listing, we're going to keep a list of files that should be transferred (could be a .txt or .xml).
For downloading and tracking multiple requests, we're going to use ASINetworkQueues (more details can be found on http://allseeing-i.com/ASIHTTPRequest).
Another good suggestion, given by d11wqt (thank you for your help), is compressing the files and making a single request.

Asset Management: which is the better way to organise user generated files on a web server?

We are in the process of building a system which allows users to upload multiple images and videos to our servers.
The team I'm working with have decided to save all the assets belonging to a user in a folder named using the user's unique identifier. This folder in turn will be a sub-folder of our main assets folder on the file server.
The file structure they have proposed is as follows:
[asset_root]/userid1/assets1
[asset_root]/userid1/assets2
[asset_root]/userid2/assets1
[asset_root]/userid2/assets2
etc.
We are expecting to have thousands or possibly a million+ users in the life time of this system.
I always thought that it wasn't a good idea to have many sub-folders in a single location and suggested a year/month/day approach as follows:
[asset_root]/2010/11/04/userid1/assets1
[asset_root]/2010/11/04/userid1/assets2
[asset_root]/2010/11/04/userid2/assets1
[asset_root]/2010/11/04/userid2/assets2
etc.
Does anyone know which of the above approaches would be better suited for this many assets? Is there a better method to organize images/videos on a server?
The system in question will be a Windows IIS 7.5 server with a SAN.
Many thanks in advance.
In general you are correct, in that many file systems impose a limit on the number of files and folders which may be in one folder. If you hit that limit with the number of users you have, you're in trouble.
In general, I would simply use a uuid for each image, with some dimension of partitioning. e.g. A hash of ABCDEFGH would end up as [asset_root]/ABC/DEFGH. Using a hash gives you a greater degree of assurance about the number of files which will end up in each folder and prevents you from having to worry about, for example, not knowing which month an image you need was stored in.
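
As a sketch of the hash-partitioning idea, assuming SHA-256 over the asset's name and a three-character folder prefix (both are choices made for this example, not something the answer specifies):

```python
import hashlib
from collections import Counter
from pathlib import PurePosixPath


def digest_of(asset_name):
    """Hex digest used both for the folder prefix and for the stored file name."""
    return hashlib.sha256(asset_name.encode("utf-8")).hexdigest()


def partitioned_path(asset_root, asset_name, prefix_len=3):
    """Split the digest into <prefix>/<rest>, mirroring [asset_root]/ABC/DEFGH."""
    digest = digest_of(asset_name)
    return PurePosixPath(asset_root, digest[:prefix_len], digest[prefix_len:])


def bucket_counts(asset_names, prefix_len=3):
    """Count how many assets land in each prefix folder."""
    return Counter(digest_of(name)[:prefix_len] for name in asset_names)
```

With three hex characters there are at most 4096 prefix folders, and because the digest is close to uniformly distributed, `bucket_counts` over a large set of names should show roughly equal folder sizes. A date-based layout gives no such guarantee, since upload volume varies from day to day.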
I'm presuming your file system is NTFS? If so, you've got a limit of 4,294,967,295 files on the disk - the limit of files in a folder is the same. If you have on the order of millions of users you should be fine, though you might want to consider having only one folder per user instead of several as your example indicates.

Storing uploaded content on a website

For the past 5 years, my typical solution for storing uploaded files (images, videos, documents, etc) was to throw everything into an "upload" folder and give it a unique name.
I'm looking to refine my methods for storing uploaded content and I'm just wondering what other methods are used / preferred.
I've considered storing each item in their own folder (folder name is the Id in the db) so I can preserve the uploaded file name. I've also considered uploading all media to a locked folder, then using a file handler, which you pass the Id of the file you want to download in the querystring, it would then read the file and send the bytes to the user. This is handy for checking access, and restricting bandwidth for users.
I think the file handler method is a good way to handle files, as long as you know how to make good use of resources on your platform of choice. It is possible to do stupid things like read a 1GB file into memory if you don't know what you are doing.
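
To make the memory point concrete, here is a small Python sketch of the chunked-read pattern a file handler could use. The `iter_file_chunks` name and the 64 KiB chunk size are assumptions for the example, and how the chunks are written to the response depends on the web framework.

```python
def iter_file_chunks(path, chunk_size=64 * 1024):
    """Yield a file piece by piece so memory use stays near chunk_size, not file size."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

A handler would check the user's access first, then write each chunk to the response as it is produced. That loop is also a natural place to throttle bandwidth.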
In terms of storing the files on disk it is a question of how many, what are the access patterns, and what OS/platform you are using. For some people it can even be advantageous to store files in a database.
Creating a separate directory per upload seems like overkill unless you are doing some type of versioning. My personal preference is to rename files that are uploaded and store the original name. When a user downloads I attach the original name again.
Consider a virtual file system such as SolFS. Here's how it can solve your task:
If you have returning visitors, you can have a separate container for each visitor (and name it by visitor login, for example). One of the benefits of this approach is that you can encrypt the container using the visitor's password.
If you have many visitors who are probably one-time, you can have one or several containers with files grouped by date of upload.
A virtual file system lets you keep original filenames either as actual filenames, or as metadata for the files being stored.
Next, you can compress the data being stored in the container.

Vb.Net Document Storage

I am attempting to add a document storage module to our AR software.
I will be prompting the user to attach a doc/image to their account. I will then put a copy of this file into our folder so that we can reference it without having to rely on them keeping the file in its original place. This system is not using a database; instead it's using multiple flat files.
I am looking for guidance on how to handle these files once they have attached them to our system.
How should I store these attached files?
I was thinking I could copy the file over to a subdirectory and then rename it to an auto-generated number so that we do not have duplicates. The bad thing about this is that the contents of the folder can get rather large.
Anyone have a better way? Should I create directories and store them...?
This system is not using a database but instead its using multiple flat files.
This sounds like a multi-user system. How are you handling concurrent access issues? Your answer to that will greatly influence anything we tell you here.
Since you aren't doing anything special with your other files to handle concurrent access, what I would do is add a new folder under your main data folder specifically for document storage, and write your user files there. Additionally, you need to worry about name collisions. To handle that, I'd name each file there by appending the date and username to the original file name and taking the md5 or sha1 hash of that string. Then add a file to your other data files to map the hash values to original file names for users.
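
The question is about VB.Net, but the naming scheme is easy to show in a few lines of Python. This is only a sketch: the `|` separator, the function names, and the in-memory dictionary standing in for the flat mapping file are all assumptions made for the example.

```python
import hashlib
from datetime import date


def stored_name(original_name, username, day):
    """SHA-1 of 'original name + date + username', used as the on-disk file name."""
    key = f"{original_name}|{day.isoformat()}|{username}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def record_upload(mapping, original_name, username, day):
    """Register an upload and return the name to store the file under."""
    name = stored_name(original_name, username, day)
    mapping[name] = original_name  # in practice, persisted to one of the flat data files
    return name
```

Note that two uploads of the same file name by the same user on the same day would still map to the same hash, so a full timestamp instead of a date may be the safer choice.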
Given your constraints (and assuming a limited number of total users) I'd also be inclined to go with a "documents" folder -- plus a subfolder for each user. Each file name should include the date to prevent collisions. Over time, you'll have to deal with getting rid of old or outdated files either administratively or with a UI for users. Consider setting a maximum number of files or maximum byte count for each user. You'll also want to handle the files of departed users.