How to design a REStful API for a media analysis engine - api

I am new to Restful concept and have to design a simple API for a media analysis service I need to set up, to perform various tasks, e.g. face analysis, region detection, etc. on uploaded images and video.
Outline of my initial design is as follows:
Client POSTs a configuration XML file to http://manalysis.com/facerecognition. This creates a profile that can be used for multiple analysis sessions. Response XML includes a ProfileID to refer to this profile. Clients can skip this step to use the default config parameters
Client POSTs video data to be analyzed to http://manalysis.com/facerecognition (with ProfileID as a parameter, if it's set up). This creates an analysis session. Return XML has the SessionID.
Client can send a GET to http://manalysis.com/facerecognition/SessionID to receive the status of the session.
Am I on the right track? Specifically, I have the following questions:
Should I include facerecognition in the URL? Roy Fielding says that "a REST API must not define fixed resource names or hierarchies" Is this an instance of that mistake?
The analysis results can either be returned to the client in one large XML file or when each event is detected. How should I tell the analysis engine where to return the results?
Should I explicitly delete a profile when analysis is done, through a DELETE call?
Thanks,
C

You can fix the entry point url,
GET /facerecognition
<FaceRecognitionService>
<Profiles href="/facerecognition/profiles"/>
<AnalysisRequests href="/facerecognition/analysisrequests"/>
</FaceRecognitionService>
Create a new profile by posting the XML profile to the URL in the href attribute of the Profiles element
POST /facerecognition/profiles
201 - Created
Location: /facerecognition/profile/33
Initiate the analysis by creating a new Analysis Request. I would avoid using the term session as it is too generic and has lots of negative associations in the REST world.
POST /facerecognition/analysisrequests?profileId=33
201 - Created
Location: /facerecognition/analysisrequest/2103
Check the status of the process
GET /facerecognition/analysisrequest/2103
<AnalysisRequest>
<Status>Processing</Status>
<Cancel Method="DELETE" href="/facerecognition/analysisrequest/2103" />
</AnalysisRequest>
when the processing has finished, the same GET could return
<AnalysisRequest>
<Status>Completed</Status>
<Results href="/facerecognition/analysisrequest/2103/results" />
</AnalysisRequest>
The specific URLs that I have chosen are relatively arbitrary, you can use whatever is the clearest to you.

Related

REST API - drive logic by data provided or make separate endpoint/http method - best practice

I am building my own implementation for file upload for my REST backend service, and I have POST .../file endpoint which has function of file upload initilalization.
It accepts some parameters regarding to settings for upcoming upload request(s) and perzists some data, so that is the reason i choosed POST HTTP method, and important one parameter is file_id.
Currently implemented logic is:
if it is not provided, then new file is going to be uploaded (file_id will be obtained on return)
if file_id is provided then informations regarding to status of file is returned (last sucessfully uploaded part, uplodaded parts, errors...)
Is it considered good approach to have this two actions under one endpoint ? Or should I split logic for "new file" and "continue with next part" into two endpoints (or separate HTTP method)
It uses same DTOs for request/response, only some fields are selectivelly not filled/returned.
Generally, it would be better to have one endpoint per action. So two endpoints would be highly desirable:
POST: to upload file
GET: you can appeal status of file through this method by file_id
Why? Because:
Single responsibility principle. One method of API does just one action
great separation of concerns
clean and clear way of how to use of this API for consumers

Restful api - Is it a good practice, for the same upsert api, return different HttpStatusCode based on action

So i have one single http post API called UpsertPerson, where it does two things:
check if Person existed in DB, if it does, update the person , then return Http code 200
if not existed in DB, create the Person, then return http 201.
So is it a good practices by having the same api return different statusCode (200,201) based on different actions(update, create)?
This is what my company does currently , i just feel like its weird. i think we should have two individual api to handle the update and create.
ninja edit my answer doesn't make as much sense because I misread the question, I thought OP used PUT not POST.
Original answer
Yes, this is an excellent practice, The best method for creating new resources is PUT, because it's idempotent and has a very specific meaning (create/replace the resource at the target URI).
The reason many people use POST for creation is for 1 specific reason: In many cases the client cannot determine the target URI of the new resource. The standard example for this is if there's auto-incrementing database ids in the URL. PUT just doesn't really work for that.
So PUT should probably be your default for creation, and POST if the client doesn't control the namespace. In practice most APIs fall in the second category.
And returning 201/200/204 depending on if a resource was created or updated is also an excellent idea.
Revision
I think having a way to 'upsert' an item without knowing the URI can be a useful optimization. I think the general design I use for building APIs is that the standard plumbing should be in place (CRUD, 1 resource per item).
But if the situation demands optimizations, I typically layer those on top of these standards. I wouldn't avoid optimizations, but adopt them on an as-needed basis. It's still nice to know if every resource has a URI, and I have a URI I can just call PUT on it.
But a POST request that either creates or updates something that already exists based on its own body should:
Return 201 Created and a Location header if something new was created.
I would probably return 200 OK + The full resource body of what was updated + a Content-Location header of the existing resource if something was updated.
Alternatively this post endpoint could also return 303 See Other and a Location header pointing to the updated resource.
Alternatively I also like at the very least sending a Link: </updated-resource>; rel="invalidates" header to give a hint to the client that if they had a cache of the resource, that cache is now invalid.
So is it a good practices by having the same api return different statusCode (200,201) based on different actions(update, create)?
Yes, if... the key thing to keep in mind is that HTTP status codes are metadata of the transfer-of-documents-over-a-network domain. So it is appropriate to return a 201 when the result of processing a POST request include the creation of new resources on the web server, because that's what the current HTTP standard says that you should do (see RFC 9110).
i think we should have two individual api to handle the update and create.
"It depends". HTTP really wants you to send request that change documents to the documents that are changed (see RFC 9111). A way to think about it is that your HTTP request handlers are really just a facade that is supposed to make your service look like a general purpose document store (aka a web site).
Using the same resource identifier whether saving a new document or saving a revised document is a pretty normal thing to do.
It's absolutely what you would want to be doing with PUT semantics and an anemic document store.
POST can be a little bit weird, because the target URI for the request is not necessarily the same as the URI for the document that will be created (ie, in resource models where the server, rather than the client, is responsible for choosing the resource identifier). A common example would be to store new documents by sending a request to a collection resource, that updates itself and selects an identifier for the new item resource that you are creating.
(Note: sending requests that update an item to the collection is a weird choice.)

Creating API - general question about verbs

I decided to move my application to a new level by creating a RESTful API.
I think I understand the general principles, I have read some tutorials.
My model is pretty simple. I have Projects and Tasks.
So to get the lists of Tasks for a Project you call:
GET /project/:id/tasks
to get a single Task:
GET /task/:id
To create a Task in a Project
CREATE /task
payload: { projectId: :id }
To edit a Task
PATCH /task/:taskId
payload: { data to be changed }
etc...
So far, so good.
But now I want to implement an operation that moves a Task from one Project to another.
My first guess was to do:
PATCH /task/:taskId
payload: { projectId: :projectId }
but I do not feel comfortable with revealing the internal structure of my backend to the frontend.
Of course, it is just a convention and has nothing to do with security, but I would feel better with something like:
PATCH /task/:taskId
payload: { newProject: :projectId }
where there is no direct relation between the 'newProject' and the real column in the database.
But then, the next operation comes.
I want to copy ALL tasks from Project A to Project B with one API call.
PUT /task
payload: { fromProject: :projectA, toProject: :projectB }
Is it a correct RESTful approach? If not - what is the correct one?
What is missing here is "a second verb".
You can see that we are creating a new task(s) hence: 'PUT' but we also 'copy' which is implied by fromProject and toProject.
Is it a correct RESTful approach? If not - what is the correct one?
To begin, think about how you would do it in a web browser: the world wide web is the reference implementation for the REST architectural style.
One of the first things that you will notice: on the web, we are almost always using POST to make changes to the server. You fill in a form in a browser, submit the form, the browser takes information from the input controls of the form to create the HTTP request body, the server figures out how to do the work that is described.
What we have in HTTP is a standardized semantics for messages that manipulate individual documents ("resources"); doing useful work is a side effect of manipulating documents (see Webber 2011).
The trick of POST is that it is the method whose standardized meaning includes the case where "this method isn't worth standardizing" (see Fielding 2009).
POST /2cc3e500-77d5-4d6d-b3ac-e384fca9fb8d
Content-Type: text/plain
Bob,
Please copy all of the tasks from project A to project B
The request line and headers here are metadata in the transfer of documents over a network domain. That is to say, that's the information we are sharing with the general purpose HTTP application.
The actual underlying business semantics of the changes we are making to documents is not something that the HTTP application cares about -- that's the whole point, after all.
That said - if you are really trying to do manipulation of document hierarchies in general purpose and standardized way, then you should maybe see if your problem is a close match to the WebDAV specifications (RFC 2291, RFC 4918, RFC 3253, etc).
If the constraints described by those documents are acceptable to you, then you may find that a lot of the work has already been done.

Rest API and UUID

One of the reasons, and probably the main one, to use UUID is to avoid having a "centralized" point responsible for creating and assigning ids to resources.
That means that, for REST APIs, the clients could (and should) be able to generate, and give the UUID for a certain resource when they POST that specific resource for the first time. That would minimize problems related with successfully posting a resource for the first time but not getting the ID back as response (connectivity problems for example). That can result in a new post for some of the clients, generating duplicated resources.
My question are:
Having UUID generated by clients and being exposed in the REST API is a best practice?
Are there any other alternatives to that?
REST does not really care if the UUID is generated by the server or by the client. It just needs a unique resource-identifier in form of an URI.
What form the URI has, is not important to clients and servers - only that they are unique and may be obtained by clients (HATEOAS). You need of course also a resource on the server side which is able to create the sub-resource for you and understands that you want to provide the UUID instead of generating an own one. Instead of a UUID you could f.e. also use a url-encoded title of a blog-post or like this question a combination of hash-value and question-title 31584303/rest-api-and-uuid to uniquely identify a resource.
Generating a UUID by a client is in my opinion not used that ofen in practice, but I may be wrong on this matter. The actual question is rather, why should a client really provide an own UUID instead of letting the server create one? The client is, IMO, only interested in getting the data to the server and having some way to retrieve it at some later point, which will be provided through the location header returned in the response of a POST request.
If connection issues are an actual concern, you could let the client send an empty POST to create a resource and send back the location within the header. The content is than added via a PUT request.
There still can be some connection issues involved:
request does not reach the server
response does not reach the client
While the primer one is no issue for the client as well as for the server as no operation is executed and the request can simply be resent, the latter one will actually create a resource at the server side, though the link never reaches the client. Therefore the actual resource is in an unreferencable state unless you provide a way to iterate over all resources, which contains also the empty ones.
A server can have a cleanup thread in the back which removes empty resources after a given amount of time. If a client sends a further empty POST request but this time also receives the URI of the created resource, he can update the state of the resource via PUT. PUT is idempotent. If the server did not receive the request, the client can simply resent it. PUT has the semantic of updating existing or creating a new resource if it is not yet available. So, the server can create the resource in that case with the provided content. If the request did reach the server, a further update does not change the state of the resource.
one advantage of client-generated UUID is: the client knows the resource key even before creating the resource. no need to read the response of the POST/PUT except for errors

RESTful - when should I use POST and GET?

This is my WCF service, where user can find message for him.
Simple:
[OperationContract]
[WebGet(UriTemplate = "/GetMessages/{UserGLKNumber}/{UserPassword}/{SessionToken}")]
Messages GetMessages(string SessionToken, string UserPassword, string UserGLKNumber);
I have concerns about that line: {UserGLKNumber}/{UserPassword}/{SessionToken}
I have to authenticate user, before he get that messages. But with GET method, I cannot send objects, like in POST.
Is it consistent with REST pattern?
Please, clear up my doubts.
There are already posts & question about this, I am summarizing all of them
POST verb is used when are you creating a new resource (a file in your case) and repeated operations would create multiple resources on the server. This verb would make sense if uploading a file with the same name multiple times creates multiple files on the server.
PUT verb is used when you are updating an existing resource or creating a new resource with a predefined id. Multiple operations would recreate or update the same resource on the server. This verb would make sense if uploading a file with the same name for the second, third... time would overwrite the previously uploaded file.
POST everytime you are modifying some state on the server like database update, delete. GET for readonly fetching like database select.
GET: Get a collection of entries (as a feed document) or a single entry (as an entry document).
POST: Create a new entry from an entry document.
PUT: Update an existing entry with an entry document.
DELETE: Remove an entry.
Source:Difference between PUT and POST using WCF REST
Another Useful reads are:
What's the difference between a POST and a PUT HTTP REQUEST?
http://www.codeproject.com/Articles/105273/Create-RESTful-WCF-Service-API-Step-By-Step-Guide
http://msdn.microsoft.com/en-us/magazine/dd315413.aspx
http://social.msdn.microsoft.com/Forums/vstudio/en-US/643e0d8b-80bb-45eb-8a84-318ac8de4497/difference-between-the-rest-verbs-put-and-post?forum=wcf
In terms of Restful services...
Post :
1. Its a secure to use in application rather than get.
2. Its not configure proxy server.
3. Big length of data restricted by web server.
4. Its not cached on browser.
5. Its take input as xml
Get :
1. Its a not secure to use in application rather than get.
2. Its configure proxy server.
3. Its use url encoding technique.
4. Its cached on browser.
5. Its a default if you are not declaring anyone.
6 Its take input as a string an returned a formatted output.