How to design a RESTful API for hierarchical entities

I'm designing a RESTful API for file storage and having trouble finding the best way to organize the URLs for actions.
Files can be grouped into folders, but it must also be possible to get all files regardless of folder.
Guidelines suggest using the following URL to get the files for a specific folder:
GET /folders/{folderName}/files
But what should be used to just get all files: GET /files or GET /folders/files?
Also, Google Drive has somewhat similar functionality and uses a different approach:
GET files/{folderName}/children

As you've noticed, this varies from one API designer to another.
If I were facing this problem, I would consider all use cases and figure out what works best.
It looks like the following would meet your needs:
GET / Retrieves all files and folders
GET /{folderId} Retrieves all contents of said folderId (folders and files)
GET /{fileId} Retrieves the file
GET /{folderId}/{folderId} Same as above, but for a nested folder
GET /{folderId}/{folderId}/{fileId} Retrieves the file
This pattern can continue for however deeply nested the file structure is (note that there is a limit on URL length).
Then, if you have a unique requirement such as getting all files, you just create a new API endpoint:
GET /files/ Retrieves all files
GET /files/?filter="*.txt" Retrieves all text files
So, to answer your exact question:
But what should be used to just get all files? GET /files or GET /folders/files?
I would lean towards /files instead of /folders/files; /folders/files does not make much sense to an API consumer.
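To make the shape of these routes concrete, here is a minimal sketch in Flask. The in-memory FILES list, the field names, and the filter query parameter are all hypothetical stand-ins for a real storage layer:

import fnmatch
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory data; a real service would query its storage.
FILES = [
    {"id": "f1", "name": "notes.txt", "folderId": "root"},
    {"id": "f2", "name": "photo.jpg", "folderId": "vacation"},
]

@app.get("/files")
def list_all_files():
    # GET /files              -> every file, regardless of folder
    # GET /files?filter=*.txt -> optionally narrowed by a glob pattern
    pattern = request.args.get("filter")
    files = FILES
    if pattern:
        files = [f for f in files if fnmatch.fnmatch(f["name"], pattern)]
    return jsonify(files)

@app.get("/folders/<folder_id>/files")
def list_folder_files(folder_id):
    # GET /folders/{folderId}/files -> files in one folder only
    return jsonify([f for f in FILES if f["folderId"] == folder_id])

The point of the design is that /files answers the "all files" use case directly, while /folders/{folderId}/files stays scoped to a single folder.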

Related

How to create a searchable central repository of code documentation using DocFx

I'm looking to create a central repository for all of our published API documentation using DocFx. I have documentation auto-generated via my build (using TFS) and published through my release (using Octopus) just fine for multiple individual sites. However, I want to pull it all together in one location. The thinking is that through a parent site you could filter content in any of the individual sites without having to drill down into them. Do you have a recommendation on how to do this?
Also, within this same documentation repository I want to provide the capability to search all of the metadata (project-level documentation) across the hundreds of projects in our portfolio. This will give our BA, DEV and QA teams easier access to what all our systems do. I like the "filtering" capability built into DocFx, but I want full-text search across all of the metadata. Do you have a recommendation for this functionality as well?
To change the location of the DocFX output, edit the docfx.json file and specify the dest value; by default it is "dest": "_site". For more guidance, see https://dotnet.github.io/docfx/tutorial/docfx.exe_user_manual.html.
Regarding full-text search, that is possible by simply ensuring the ExtractSearchIndex post-processor is invoked (in order to generate an index.json file of keywords) and that the global _enableSearch value is set to true in the docfx.json file. A snippet from that file would look like:
"postProcessors": [ "ExtractSearchIndex" ],
"globalMetadata": {
"_enableSearch": "true"
}
For your first question:
I think what you expect is something like the .NET API Browser. The source code behind that page is not public, so you would need to build such a page yourself by collecting the xrefmap.yml files from the multiple sites and extracting the needed data into the page.
For your second question:
DocFX uses lunr.js to search an index file called index.json, which is generated by scanning all the output files. In your case, you want to limit the search scope to just the metadata you defined, which DocFX does not support by default either. You can, however, use lunr in your central site to search this metadata: create a specific index.json for each project first, then have the central site collect them for its search page.
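As a rough illustration of the "collect them in the central place" step, here is a hedged Python sketch that merges the per-project index.json files produced by the ExtractSearchIndex post-processor. The site paths are hypothetical, and the index schema assumed here (entries keyed by href, each with an href field) should be verified against your own output:

import json
from pathlib import Path

# Hypothetical publish layout: one _site folder per project.
PROJECT_SITES = [Path("project-a/_site"), Path("project-b/_site")]
merged = {}

for site in PROJECT_SITES:
    index = json.loads((site / "index.json").read_text(encoding="utf-8"))
    prefix = site.parent.name  # e.g. "project-a"
    for key, entry in index.items():
        # Re-root each href so links resolve from the central site.
        entry["href"] = f"{prefix}/{entry['href']}"
        merged[f"{prefix}/{key}"] = entry

Path("central").mkdir(exist_ok=True)
Path("central/index.json").write_text(json.dumps(merged, indent=2), encoding="utf-8")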

Dropbox API searchFileNames

The current Dropbox API searchFileNames() allows searching for a query/substring, so if I want to look for filenames in a folder with a specific extension such as ".jpg", that works. But if I combine the query to look for ".jpg .png", I get nothing returned, since, as the documentation states, 'A file matches only if it contains all the substrings.'
Is there another API call that returns the union of the search terms rather than their intersection?
Thanks
No, the Dropbox API doesn't enable you to search for multiple file extensions at once like this, but I'll be sure to pass this along as a feature request.
As a workaround, you can split this into multiple API calls.
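For instance, a minimal sketch of that workaround with the official Dropbox Python SDK: run one search per extension and union the results client-side. The access token is a placeholder, and the exact result shape of files_search_v2 should be checked against the SDK version you use (pagination via files_search_continue_v2 is omitted for brevity):

import dropbox

dbx = dropbox.Dropbox("ACCESS_TOKEN")  # placeholder token

def paths_matching(query):
    # One search call per query string.
    result = dbx.files_search_v2(query)
    paths = set()
    for match in result.matches:
        metadata = match.metadata.get_metadata()
        paths.add(metadata.path_lower)
    return paths

# Union of the two single-extension searches.
images = paths_matching(".jpg") | paths_matching(".png")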

Get the directory tree of a user's Dropbox

I have an app which integrates with Dropbox, and I want the user to select a folder from their Dropbox.
I can call '/2/files/list_folder' (https://www.dropbox.com/developers/documentation/http/documentation#files-list_folder) with recursive set to true, and then keep calling it with the returned cursor. I then filter out any entries which aren't directories.
But this is a long, slow process, and unpredictable given the potential size of some users' directory trees on Dropbox.
I know there is a Dropbox file-select plugin (https://www.dropbox.com/developers/chooser), but I want a folder selector, with no option to select a file.
What I would like is one api call that returns a list of all directories for a user.
Does this exist with an API method I don't know about? Or is there another widget that allows folder selection?
I've seen this question, which just does recursive API calls too; that is not practical or efficient.
The Dropbox API v2 doesn't offer a way to list only folders like that, but we'll consider it a feature request.
Dropbox also doesn't offer a component like the Chooser that allows folder selection, but we'll consider that a feature request as well.
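For completeness, here is a sketch of the recursive-listing workaround the question already describes, using the Dropbox Python SDK (the access token is a placeholder): list everything once with recursive=True and keep only the folder entries.

import dropbox
from dropbox.files import FolderMetadata

dbx = dropbox.Dropbox("ACCESS_TOKEN")  # placeholder token

folders = []
result = dbx.files_list_folder("", recursive=True)  # "" = root
while True:
    folders.extend(entry.path_lower for entry in result.entries
                   if isinstance(entry, FolderMetadata))
    if not result.has_more:
        break
    result = dbx.files_list_folder_continue(result.cursor)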

How can I get a file's properties in OneDrive?

I am using the REST API for OneDrive. I have the name of a file in the user's storage and want to obtain the properties of this file. According to the documentation, a file's properties can be retrieved if you have the file ID (http://msdn.microsoft.com/en-us/library/dn659731.aspx). So I need the file ID, and the only way I see to obtain it is to search the whole storage, which is really unnecessary.
Is there a way to find the properties of a file (with a known name) with a single request to the service?
Ideally the API would support access by path, which would do what you require (assuming you have the full path and not just the name). Unfortunately, to my knowledge that isn't supported.
There is a heavy-handed approach that may work for you though: you can use the search capabilities of the API to find files with the name you specify:
GET /[userid]/skydrive/search?q=MyVideo.mp4
The documentation is available at the link below:
http://msdn.microsoft.com/en-us/library/dn631847.aspx
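Assuming the legacy Live Connect endpoint from that documentation (the base URL and the ACCESS_TOKEN placeholder below are assumptions to verify against your setup), the search call looks roughly like this in Python:

import requests

resp = requests.get(
    "https://apis.live.net/v5.0/me/skydrive/search",
    params={"q": "MyVideo.mp4", "access_token": "ACCESS_TOKEN"},
)
resp.raise_for_status()
# Collection responses wrap the matches in a "data" array.
for item in resp.json().get("data", []):
    print(item["id"], item["name"])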

Searching Inside an Amazon S3 Bucket

If I have a bucket with hundreds of thousands of images, is it OK to have to search for each image I want to display on my site via its ID, or is there a more efficient way (perhaps using multiple folders in a bucket)?
I was also thinking of giving each image a unique hash or something similar in order to stop duplicate names in the bucket. Does that seem like a good idea?
You just link to each image using normal URLs. For public files the URLs are in the format:
http://mybucket.s3.amazonaws.com/myimage.jpg
For private files, you need to generate a signed URL (which is easy using any of the SDKs) in the format:
http://mybucket.s3.amazonaws.com/myimage.jpg?AWSAccessKeyId=44CF9SAMPLEF252F707&Expires=1177363698&Signature=vjSAMPLENmGa%2ByT272YEAiv4%3D
There's nothing wrong with storing each file with a unique name. If you set the correct headers on the file, any downloads can still use the original name, e.g. Content-Disposition: attachment; filename=myimage.jpg;
For listing a bucket's contents you would use the API's GetBucket (list objects) command. I find it easier to use the SDKs for any access via the API.
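Both operations are only a few lines with boto3; the bucket and key names below are hypothetical:

import boto3

s3 = boto3.client("s3")

# Signed URL for a private object, valid for one hour.
url = s3.generate_presigned_url(
    "get_object",
    Params={
        "Bucket": "mybucket",
        "Key": "myimage.jpg",
        # Preserve a friendly download name even for hashed keys.
        "ResponseContentDisposition": "attachment; filename=myimage.jpg",
    },
    ExpiresIn=3600,
)

# Listing a bucket's contents (pages of up to 1000 keys each).
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="mybucket"):
    for obj in page.get("Contents", []):
        print(obj["Key"])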
It can be a pain to search or do things in parallel over bucket objects, as Amazon lists everything lexicographically (the only ordering currently supported). The problem with using random IDs is that all of them would be written to the same block storage, and you cannot parallelize searches to optimize.
Here is an interesting article on performance improvements. I use it for my work and see a significant difference under high load.
http://aws.typepad.com/aws/2012/03/amazon-s3-performance-tips-tricks-seattle-hiring-event.html