REST API for multiple actions on a file


This is a continuation of my question on how to design a REST API for a media analysis server. As per Derrel's answer, in my current design I start the analysis of a media file using a POST /facerecognition/analysisrequests?profileId=33, which specifies that profile ID 33 (previously created on the server by another POST) should be used.
I have two short questions:
How can I extend this approach to have multiple analysis requests on the same file, e.g. perform face recognition, text detection, and ad detection on the given file? Is using a binary coding (e.g. each bit signifies an analysis) and e.g. doing POST http:[server URL]/00000011/analysisrequests?profileId=33 a good idea?
Is using a server-side DB (e.g. MySQL) the best way to keep track of all the profile and process IDs?
Thanks,
C

I'd put the types of analysis requested as parameters, rather than as part of the path. They could be POST parameters in the body of the request, or specified in the URL like profileId. Example: POST http://server/analysisrequest?profileId=33&analysisType=faceRecognition&analysisType=textDetection. It's perfectly OK to submit multiple values for a parameter.
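As a rough illustration of the repeated-parameter approach, here is a minimal Python sketch of how a server might read the query string from the example URL. The parameter names come from the example; the choice of Python's standard urllib.parse is an assumption, since the question does not say what the server is written in.

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://server/analysisrequest"
       "?profileId=33&analysisType=faceRecognition&analysisType=textDetection")

# parse_qs collects every value of a repeated key into a list.
params = parse_qs(urlsplit(url).query)

profile_id = int(params["profileId"][0])
analysis_types = params["analysisType"]
```

Most web frameworks expose repeated parameters in a similar list-valued form, so no custom encoding is needed.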
You could submit the binary encoding of the analysis type, but spelling it out is a lot clearer and more self-documenting. The binary encoding is also a bit fragile when adding a new analysis type: adding a new digit would affect the URLs of all requests, even those that don't use the new type.
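The fragility can be seen in a small sketch. The encode helper and the two type lists are hypothetical, made up only to show how the encoded path segment changes once a new analysis type is added.

```python
def encode(selected, known_types):
    """Return one '1' or '0' per known analysis type, in list order."""
    return "".join("1" if t in selected else "0" for t in known_types)

# Hypothetical type lists before and after a new analysis type is added.
types_v1 = ["adDetection", "textDetection", "faceRecognition"]
types_v2 = ["logoDetection"] + types_v1

# The client asks for the same two analyses in both cases.
wanted = {"faceRecognition", "textDetection"}
path_v1 = encode(wanted, types_v1)
path_v2 = encode(wanted, types_v2)
```

The request itself did not change, yet path_v1 and path_v2 differ, so every existing client would have to be updated.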
A server-side database is typical for this kind of web application and it's probably a good solution. You might also want to consider an in-process SQL database like SQLite or Derby to avoid the complexity of a separate database process.
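A minimal sketch of the in-process option, using Python's built-in sqlite3 module. The table and column names are assumptions for illustration; the question does not describe what a profile or an analysis request actually contains.

```python
import sqlite3

# ":memory:" keeps the example self-contained; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profiles (
        id INTEGER PRIMARY KEY,
        settings TEXT NOT NULL
    );
    CREATE TABLE analysis_requests (
        id INTEGER PRIMARY KEY,
        profile_id INTEGER NOT NULL REFERENCES profiles(id),
        analysis_type TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending'
    );
""")

def create_profile(settings):
    """Store a profile and return the ID the database assigned to it."""
    cur = conn.execute("INSERT INTO profiles (settings) VALUES (?)", (settings,))
    return cur.lastrowid

def create_request(profile_id, analysis_type):
    """Record one analysis request and return its ID."""
    cur = conn.execute(
        "INSERT INTO analysis_requests (profile_id, analysis_type) VALUES (?, ?)",
        (profile_id, analysis_type),
    )
    return cur.lastrowid

profile_id = create_profile('{"minFaceSize": 40}')  # hypothetical settings
request_ids = [create_request(profile_id, t)
               for t in ("faceRecognition", "textDetection")]
```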

I would recommend making more complete use of HTTP POST. Make all POST requests against the same URI: /analysisrequest. Use application/x-www-form-urlencoded to send the parameters.
So:
POST /analysisrequest HTTP/1.1
Host: yourserver.com
Accept: */*
Content-Length: 73
Content-Type: application/x-www-form-urlencoded

face_recognition=true&text_detection=true&ad_detection=true&profile_id=33
multipart/form-data would also allow you to send the file being analyzed in the same request as the operations to perform on it, assuming that's a desired scenario. It has the added advantage that you ought to be able to use the exact same API endpoint for both HTML forms and your REST API.
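For reference, a short Python sketch of how a client could produce the form-encoded body and its headers with the standard library. The field names are taken from the example request; nothing here is specific to any particular server.

```python
from urllib.parse import urlencode

fields = {
    "face_recognition": "true",
    "text_detection": "true",
    "ad_detection": "true",
    "profile_id": "33",
}

# urlencode joins the pairs with "&" and percent-escapes where needed.
body = urlencode(fields).encode("ascii")
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Content-Length": str(len(body)),
}
```

Computing Content-Length from the encoded body avoids hand-counting; most HTTP client libraries do this automatically.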

Related

Restful api - Is it a good practice, for the same upsert api, return different HttpStatusCode based on action

I have a single HTTP POST API called UpsertPerson, which does one of two things:
checks whether the Person exists in the DB; if it does, updates the Person and returns HTTP 200;
if it does not exist in the DB, creates the Person and returns HTTP 201.
Is it good practice to have the same API return different status codes (200, 201) based on different actions (update, create)?
This is what my company currently does, and it just feels weird to me. I think we should have two separate APIs to handle update and create.

Ninja edit: my answer doesn't make as much sense because I misread the question; I thought OP used PUT, not POST.
Original answer
Yes, this is an excellent practice. The best method for creating new resources is PUT, because it's idempotent and has a very specific meaning (create/replace the resource at the target URI).
Many people use POST for creation for one specific reason: in many cases the client cannot determine the target URI of the new resource. The standard example is auto-incrementing database IDs in the URL. PUT just doesn't really work for that.
So PUT should probably be your default for creation, and POST if the client doesn't control the namespace. In practice most APIs fall in the second category.
And returning 201/200/204 depending on if a resource was created or updated is also an excellent idea.
Revision
I think having a way to 'upsert' an item without knowing the URI can be a useful optimization. I think the general design I use for building APIs is that the standard plumbing should be in place (CRUD, 1 resource per item).
But if the situation demands optimizations, I typically layer those on top of these standards. I wouldn't avoid optimizations, but I'd adopt them on an as-needed basis. It's still nice to know that every resource has a URI, and that if I have a URI I can just call PUT on it.
But a POST request that either creates or updates something that already exists based on its own body should:
Return 201 Created and a Location header if something new was created.
I would probably return 200 OK + The full resource body of what was updated + a Content-Location header of the existing resource if something was updated.
Alternatively this post endpoint could also return 303 See Other and a Location header pointing to the updated resource.
Alternatively I also like at the very least sending a Link: </updated-resource>; rel="invalidates" header to give a hint to the client that if they had a cache of the resource, that cache is now invalid.
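The first two options above can be sketched as a framework-free Python function. The in-memory dict, the /people/{id} URI scheme, and the idea that the body carries its own id are all assumptions made for illustration; the question does not say how UpsertPerson identifies an existing Person.

```python
import json

# In-memory stand-in for the database, keyed by resource URI.
people = {}

def upsert_person(body):
    """Return (status, headers, response_body) for a POST that creates or updates."""
    person = json.loads(body)
    uri = "/people/" + str(person["id"])  # hypothetical URI scheme
    created = uri not in people
    people[uri] = person
    if created:
        # New resource: 201 Created plus where to find it.
        return 201, {"Location": uri}, ""
    # Existing resource: 200 OK, the updated representation, and its URI.
    return 200, {"Content-Location": uri}, json.dumps(person)

first = upsert_person('{"id": 7, "name": "Ada"}')
second = upsert_person('{"id": 7, "name": "Ada L."}')
```

Here first carries status 201 with a Location header, and second carries status 200 with Content-Location and the updated body.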

Is it good practice to have the same API return different status codes (200, 201) based on different actions (update, create)?
Yes, if... the key thing to keep in mind is that HTTP status codes are metadata of the transfer-of-documents-over-a-network domain. So it is appropriate to return a 201 when the result of processing a POST request includes the creation of new resources on the web server, because that's what the current HTTP standard says you should do (see RFC 9110).
I think we should have two separate APIs to handle update and create.
"It depends". HTTP really wants you to send requests that change documents to the documents that are changed (see RFC 9111). A way to think about it is that your HTTP request handlers are really just a facade that is supposed to make your service look like a general-purpose document store (aka a web site).
Using the same resource identifier whether saving a new document or saving a revised document is a pretty normal thing to do.
It's absolutely what you would want to be doing with PUT semantics and an anemic document store.
POST can be a little bit weird, because the target URI for the request is not necessarily the same as the URI for the document that will be created (ie, in resource models where the server, rather than the client, is responsible for choosing the resource identifier). A common example would be to store new documents by sending a request to a collection resource, that updates itself and selects an identifier for the new item resource that you are creating.
(Note: sending requests that update an item to the collection is a weird choice.)

Appropriate REST design for data export

What's the most appropriate way in REST to export something as PDF or other document type?
The next example explains my problem:
I have a resource called Banana.
I created all the canonical CRUD rest endpoint for that resource (i.e. GET /bananas; GET /bananas/{id}; POST /bananas/{id}; ...)
Now I need to create an endpoint which downloads a file (PDF, CSV, ..) which contains the representation of all the bananas.
First thing that came to my mind is GET /bananas/export, but in pure rest using verbs in url should not be allowed. Using a more appropriate httpMethod might be cool, something like EXPORT /bananas, but unfortunately this is not (yet?) possible.
Finally I thought about using the Accept header on the same GET /bananas endpoint, which based on the different media type (application/json, application/pdf, ..) returns the corresponding representation of the data (json, pdf, ..), but I'm not sure if I am misusing the Accept header in this way.
Any ideas?

in pure rest using verbs in url should not be allowed.
REST doesn't care what spelling conventions you use in your resource identifiers.
Example: https://www.merriam-webster.com/dictionary/post
Even though "post" is a verb (and worse, an HTTP method token!) that URI works just like every other resource identifier on the web.
The more interesting question, from a REST perspective, is whether the identifier should be the same that is used in some other context, or different.
REST cares a lot about caching (that's important to making the web "web scale"). In HTTP, caching is primarily about re-using prior responses.
The basic (but incomplete) idea being that we may be able to re-use a response that shares the same target URI.
HTTP also has built into it a general purpose mechanism for invalidating stored responses that is also focused on the target URI.
So here's one part of the riddle you need to think about: when someone sends a POST request to /bananas, should caches throw away the prior responses with the PDF representations?
If the answer is "no", then you need a different target URI. That can be anything that makes sense to you. /pdfs/bananas for example. (How many common path segments are used in the identifiers depends on how much convenience you will realize from relative references and dot segments.)
If the answer is "yes", then you may want to lean into using content negotiation.
In some cases, the answer might be "both" -- which is to say, to have multiple resources (each with its own identifier) that return the same representations.
That's a normal thing to do; we even have a mechanism for describing which resource is "preferred" (see RFC 6596).

REST does not care about this, but the HTTP standard does. Using the Accept header for the expected MIME type is the standard way of doing this, so you did the right thing. There is no need to move it to a separate endpoint if the data is the same and only the format differs.
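A minimal Python sketch of the Accept-header approach, not tied to any framework. The banana fields are invented, and CSV stands in for PDF so the example needs only the standard library. The Accept parsing is deliberately simplified: it honors q-values and */*, but not partial wildcards such as text/*.

```python
import csv
import io
import json

BANANAS = [{"id": 1, "ripeness": "green"}, {"id": 2, "ripeness": "ripe"}]  # sample data

def to_json(rows):
    return json.dumps(rows)

def to_csv(rows):
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id", "ripeness"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

RENDERERS = {"application/json": to_json, "text/csv": to_csv}

def preferred_types(accept):
    """Return media types from an Accept header, highest q-value first."""
    ranked = []
    for item in accept.split(","):
        media_type, *params = [p.strip() for p in item.split(";")]
        q = 1.0
        for p in params:
            if p.startswith("q="):
                q = float(p[2:])
        ranked.append((q, media_type))
    ranked.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep header order
    return [m for q, m in ranked if q > 0]

def get_bananas(accept="*/*"):
    """Return (status, content_type, body) for GET /bananas."""
    for media_type in preferred_types(accept):
        if media_type == "*/*":
            media_type = "application/json"  # server default
        if media_type in RENDERERS:
            return 200, media_type, RENDERERS[media_type](BANANAS)
    return 406, None, ""  # nothing the client accepts
```

A real response that varies by Accept would normally also carry a Vary: Accept header so that caches keep the representations apart.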

Media types are the best way to represent this, but there is a practical aspect of this in that people will browse a rest API using root nouns... I'd put some record-count limits on it, maybe GET /bananas/export/100 to get the first 100, and GET /bananas/export/all if they really want all of them.

How do I design a REST call that is just a data transformation?

I am designing my first REST API.
Suppose I have a (SOAP) web service that takes MyData1 and returns MyData2.
It is a pure function with no side effects, for example:
MyData2 myData2 = transform(MyData myData);
transform() does not change the state of the server. My question is, what REST call do I use? MyData can be large, so I will need to put it in the body of the request, so POST seems required. However, POST seems to be used only to change the server state and not return anything, which transform() is not doing. So POST might not be correct? Is there a specific REST technique to use for pure functions that take and return something, or should I just use POST, unload the response body, and not worry about it?

I think POST is the way to go here, because of the sheer fact that you need to pass data in the body. The GET method is used when you need to retrieve information (in the form of an entity), identified by the Request-URI. In short, that means that when processing a GET request, a server is only required to examine the Request-URI and Host header field, and nothing else.
See the pertinent section of the HTTP specification for details.
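As a sketch of the POST approach, here is a small WSGI application in Python that reads the request body, applies a pure function, and returns the result in the response body. The transform function (upper-casing bytes) is a placeholder for the real MyData to MyData2 conversion, and the call helper exists only to exercise the app in-process.

```python
import io

def transform(data: bytes) -> bytes:
    """Hypothetical pure function standing in for the real transformation."""
    return data.upper()

def app(environ, start_response):
    """WSGI app: POST a body, get the transformed body back. No state is kept."""
    if environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    result = transform(environ["wsgi.input"].read(length))
    start_response("200 OK", [("Content-Type", "application/octet-stream"),
                              ("Content-Length", str(len(result)))])
    return [result]

def call(method, body=b""):
    """Invoke the app directly, without opening a socket."""
    seen = {}
    environ = {"REQUEST_METHOD": method,
               "CONTENT_LENGTH": str(len(body)),
               "wsgi.input": io.BytesIO(body)}

    def start_response(status, headers):
        seen["status"] = status

    chunks = app(environ, start_response)
    return seen["status"], b"".join(chunks)

status, output = call("POST", b"banana")
```

The same app could be served with wsgiref.simple_server from the standard library or any other WSGI server.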

It is okay to use POST
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
It's not a great answer, but it's the right answer. The real issue here is that HTTP, which is a protocol for the transfer of documents over a network, isn't a great fit for document transformation.
If you imagine this idea on the web, how would it work? Well, you'd click through a bunch of links to get to some web form, and that web form would allow you to specify the source data (including perhaps attaching a file), and then submitting the form would send everything to the server, and you'd get the transformed representation back as the response.
But - because of the payload, you would end up using POST, which means that general purpose components wouldn't have the data available to tell them that the request was safe.
You could look into the WebDAV specifications to see if SEARCH or REPORT is a satisfactory fit -- every time I've looked into them for myself I've decided against using them (no, I don't want an HTTP file server).

REST API for sending files between services

I'm building a microservice, one of whose APIs expects a file and some parameters, which the API will process and return a response for.
I've searched and found some references, mostly pointing towards form-data (multipart), however they mostly refer to client to service and not service to service like in my case.
I'll be happy to know what is the best practice for this case for both the client (a service actually) and me.

I would also suggest performing a POST request (multipart) to a service endpoint that can process/accept a byte stream wrapped into the provided HTTP body part(s). A PUT request may also work in some cases.
Your main concern will be binding enough metadata to the request so that the remote service can handle it correctly. This includes in particular the following headers:
Content-Type: to provide the MIME type of the data being transferred and enable its proper processing.
Content-Disposition: to provide additional information about the body part such as the file name.
I personally believe that a single request is enough (in contrast to @Evert's suggestion), as it will result in less overhead overall and will keep things simple (and RESTful) by avoiding any linking (or state) between successive requests.
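To make the single-request option concrete, here is a minimal Python sketch that assembles a multipart/form-data body by hand, with Content-Disposition and Content-Type on the file part. The field names, file name, and file bytes are placeholders, and the helper does no escaping of quotes in names, which a production client library would handle.

```python
import uuid

def build_multipart(fields, file_field, filename, file_bytes, file_type):
    """Return (content_type, body) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    # One part per plain parameter.
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    # The file part carries its own media type and file name.
    lines += [
        delimiter,
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        f"Content-Type: {file_type}".encode(),
        b"",
        file_bytes,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)

content_type, body = build_multipart(
    {"profile_id": "33"}, "file", "clip.mp4", b"fake-bytes", "video/mp4")
```

The returned content_type goes in the request's Content-Type header so the receiving service knows which boundary to split on.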

I would not wrap data in form-data, because it just adds to the total body size. You can just put the entire raw file in the body of a PUT or POST request.
If you also need to send meta-data, I would suggest 2 requests. If you absolutely can't do 2 requests, form-data might still be the best option and it does work server-to-server.
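A sketch of the raw-body alternative using Python's urllib.request. The URL is a placeholder, and the request is only built, not sent, so the example runs without a network connection.

```python
import urllib.request

def build_upload(url, file_bytes, media_type):
    """Build (but do not send) a PUT whose body is the raw file, with no wrapping."""
    return urllib.request.Request(
        url,
        data=file_bytes,
        method="PUT",
        headers={"Content-Type": media_type},
    )

# Hypothetical target URI chosen by the client.
req = build_upload("http://files.example/uploads/clip.mp4", b"fake-bytes", "video/mp4")
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Any metadata would then travel in a second request, or in headers and query parameters if it is small enough.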

What is http multipart request?

I have been writing iPhone applications for some time now, sending data to a server and receiving data (via HTTP), without thinking too much about it. I am mostly familiar with the process in theory, but the part I am not so familiar with is the HTTP multipart request. I know its basic structure, but the core of it eludes me.
It seems that whenever I am sending something different than plain text (like photos, music), I have to use a multipart request. Can someone briefly explain to me why it is used and what are its advantages?
If I use it, why is it better way to send photos that way?

An HTTP multipart request is an HTTP request that HTTP clients construct to send files and data over to an HTTP Server. It is commonly used by browsers and HTTP clients to upload files to the server.
What it looks like
See Multipart Content-Type
See multipart/form-data

As the official specification says, "one or more different sets of data are combined in a single body". So when photos and music are handled as multipart messages as mentioned in the question, probably there is some plain text metadata associated as well, thus making the request containing different types of data (binary, text), which implies the usage of multipart.

I have found an excellent and relatively short explanation here.
A multipart request is a REST request containing several packed REST requests inside its entity.
