Multi-page document as images over REST (additional questions) - api

I read this question - Multi-page document as images over REST, which has a very nice answer. We have a similar scenario but with some additional requirements.
1. The request can either return the document as binary data or upload the PDF/JPEG(s) somewhere and return a URL to the uploaded file. How should the calls differ? What comes to mind is to use GET for the binary data response and POST when it uploads and returns a URL. When it is a POST, it will return 201 (Created) and the URL of the uploaded document.
2. There is also one more detail about the first point. When uploading and returning a URL, we can either upload for temporary use (and delete the file after some time) or make it permanent. Should we do this by adding a query parameter, like POST /documents/12345.pdf?permanent=true?
3. As in the original question, we also return a different JPEG for every page. But creating the document is time-consuming, so we want to get all pages in one request. What would be the proper way of doing this? I think the 201 response can only return one URL.
4. Is the GET /documents/12345.pdf approach a good alternative to the Accept header, or are there other approaches? Would using a slash instead of a dot be better, like GET /documents/12345/pdf? That way we wouldn't need to parse the "12345.pdf" string.
Thanks!

I've learned a thing or two in the past two years, so let's see if I can take a stab at this...
You can make the POST request always return the URI of the newly created resource. Do you ever need the POST to return the binary if you're the one providing it in the first place? If I'm tracking with you, in this case the GET would return the binary and the POST would create the resource and return the location (URI) of the newly created resource.
This isn't RESTful, as GETs should be safe (GETs don't modify resources). You could make it a PUT if you want to update the status after the fact, or when you POST you could add a flag in the request body to mark that it should be permanent. Info on POSTing an image and data can be found here
For #3, do you want to fetch or upload everything in one request? You can return whatever you want in the response. You could structure your request so that it returns a JSON array of URLs. For example, if you structure your request similarly to my example, then GET /documents/12345/1 may return page 1 of document 12345 while GET /documents/12345 may return all pages of document 12345.
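A minimal sketch of the JSON-array idea for #3. The base URL, the "pages" key, and the helper names page_urls and build_response_body are hypothetical and not from the question; the point is only that one response can list one URL per page.

```python
from __future__ import annotations

import json


def page_urls(base_url: str, document_id: str, page_count: int) -> list[str]:
    """Build one URL per page, e.g. <base>/documents/12345/1 (assumed layout)."""
    return [
        f"{base_url}/documents/{document_id}/{page}"
        for page in range(1, page_count + 1)
    ]


def build_response_body(base_url: str, document_id: str, page_count: int) -> str:
    """Serialize the page URLs as the JSON body returned to the client."""
    return json.dumps({"pages": page_urls(base_url, document_id, page_count)})


if __name__ == "__main__":
    print(build_response_body("https://example.com", "12345", 3))
```

The client can then fetch each page URL separately, or ignore the list and request the whole document.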
For #4, I wouldn't do /pdf, as that represents resource hierarchy. PDF is only the representation of the document, not the actual document. I would vote for the query string, since there may be other attributes. For example, imagine trying to do the following with slashes... GET /documents/12345?format=pdf&color=grayscale&quality=300dpi&layout=portrait


REST API: What HTTP return code for no data found? [duplicate]

If someone could please help settle this argument we might actually get this system finished LOL :^)
So, if you have a REST API.. for.. say.. returning patient details...
And you send in a request with a patient id...
But no patient with that patient id actually exists in the database..
What response should your API return?
1. a 404 ?
2. a 204 ?
3. a 200 with something in the body to indicate no patient found..
Thanks
Use a 404:
404 Not Found
The server cannot find the requested resource. In the browser, this means the URL is not recognized. In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send this response instead of 403 to hide the existence of a resource from an unauthorized client. This response code is probably the most famous one due to its frequent occurrence on the web.
From MDN Web docs https://developer.mozilla.org/en-US/docs/Web/HTTP/Status
What response should your API return?
It Depends.
Status codes are metadata in the transfer of documents over a network domain. The status code communicates the semantics of the HTTP response to general purpose components. For instance, it's the status code that announces to a cache whether the message body of the response is a "representation of the resource" or instead a representation of an error situation.
Rows in your database are an implementation detail; as far as REST is concerned, there doesn't have to be a database.
What REST cares about is resources, and in this case whether or not the resource has a current representation. REST doesn't tell you what the resource model should be, or how it is implemented. What REST does tell you (via its standardized messages constraint, which in this case means the HTTP standard) is how to describe what's happening in the resource model.
For example, if my resource is "things to do", and everything is done, then I would normally expect a GET request for "things to do" to return a 2xx status code with a representation announcing there is nothing to do (which could be a completely empty document, or it could be a web page with an empty list of items, or a JSON document.... you get the idea).
If instead the empty result set from the database indicates that there was a spelling error in the URI, then a 404 is appropriate.
It might help to consider a boring web server, and how retrieving an empty file differs from retrieving a file that doesn't exist.
But, as before, in some resource models it might make sense to return a "default" representation in the case where there is no file.
if you have a REST API.. for.. say.. returning patient details...
Is it reasonable in the resource model to have a document that says "we have no records for this patient"?
I'm not a specialist in the domain of medical documents, but it sounds pretty reasonable to me that we might get back a document with no information. "Here's a list of everything we've been told about this patient" and a blank list.
What response should your API return?
If you are returning a representation of an error, i.e. a document that explains that the document someone asked for is missing, then you should use a 404 Not Found status code (along with other metadata indicating how long that response can be cached, etc.).
If you are returning a document, you should use a 200 OK with a Content-Length header.
204 is specialized, and should not be used here. The key distinction between 204 and 200 with Content-Length 0 is the implications for navigation.
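The distinction above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory dict stands in for whatever backs the resource, and a missing key is taken to mean the resource has no current representation. A different resource model could justify a 200 with an empty document instead. The names PATIENTS and get_patient are hypothetical.

```python
from __future__ import annotations

import json

# Hypothetical in-memory store standing in for whatever backs the resource.
PATIENTS = {"42": {"id": "42", "notes": []}}


def get_patient(patient_id: str) -> tuple[int, dict, str]:
    """Return (status, headers, body) for GET /patients/<patient_id>."""
    patient = PATIENTS.get(patient_id)
    if patient is None:
        # The body is a representation of the error, not of the patient.
        status = 404
        body = json.dumps({"error": f"no patient with id {patient_id}"})
    else:
        # The body is a representation of the resource, even when the
        # list of notes it contains is empty.
        status = 200
        body = json.dumps(patient)
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return status, headers, body
```

Note that patient "42" has an empty list of notes and still gets a 200, because the document itself exists.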

Good practices for designing REST api relations

We are currently trying to design a REST API for our web services. We have two resources, 'record' and 'expedition'. As far as we know, a record can be associated with multiple expeditions, and an expedition can be associated with one record (but not necessarily).
When we create an expedition and want to "attach" it to a record, we have come up with two solutions:
POST /expeditions?recordId=xxx
POST /records/xxx/expeditions
and a POST /expeditions WS to create expeditions independently.
My colleague suggested the first approach, but I found the second the most usual way to do so. I have not found articles on the web presenting the first approach as a good or bad design.
So, which solution is the good one for you? Which kind of consideration can help us to choose?
Thank you.
Which kind of consideration can help us to choose?
Think about cache-invalidation.
HTTP is about document transfer. We obtain information from the server by asking for a copy of a document; that request might be handled by the server itself, or it might be handled by a cache that has a valid copy of the document.
We send information to a server by proposing edits to documents - POST being the most common method used to do that (via HTML forms).
When an edit is successful, it follows that the previously cached copies of the document are out of date, and we would really prefer that they be replaced by the updated copy.
General purpose cache invalidation is kind of limited; in particular, it doesn't support arbitrary invalidation of documents. Only the target-uri, Location, and Content-Location are invalidated.
Therefore, when we are designing our resource interactions, we want to consider this limitation.
That usually means that the request that we use to change a document should have the same target-uri as the request to read that same document.
(Yes, that means that if we are going to have different kinds of edits to the document, all of the different edits share the same target-uri, and we disambiguate the edit by looking at other parts of the request -- for instance by parsing the body.)
POST /records/xxx/expeditions and a POST /expeditions WS to create expeditions independently.
That's not required - the server is permitted to apply changes to more than one document; HTTP constrains the meaning of the request, but does not constrain the effects.
That said, general purpose caches won't magically know that both documents have been edited. To some degree, part of what you are choosing in your design is which document needs to be refreshed now, and which ones can be out of date for a time (typically until the cached representation reaches its max age).
For the special case where your response to the successful edit is going to be a copy of the updated representation of the resource, you have a little bit more freedom, because we can use the Content-Location header to identify which document we are returning in the response, and that header is automatically invalidated.
POST /foo/bar
...
200 OK
Content-Location: /foo
In this sequence, general-purpose caches will invalidate their cached copies of both /foo and /foo/bar.
(Of course, there are still issues, insofar as we don't have a mechanism to return both the updated copy of /foo and the updated copy of /foo/bar in a single response. So instead we need to look into other ideas, like server push.)
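The invalidation rule described above can be sketched as a small function. This is a simplified model of what a general-purpose cache does, not a real cache implementation: the name uris_to_invalidate is hypothetical, relative header values are resolved against the target URI, and the same-host check reflects the restriction in RFC 7234.

```python
from urllib.parse import urljoin, urlsplit

UNSAFE_METHODS = {"POST", "PUT", "DELETE", "PATCH"}


def uris_to_invalidate(method, target_uri, status, response_headers):
    """Return the set of URIs a cache would invalidate after this exchange."""
    # Only non-error (2xx/3xx) responses to unsafe methods trigger invalidation.
    if method.upper() not in UNSAFE_METHODS or not (200 <= status < 400):
        return set()

    invalidated = {target_uri}
    target_host = urlsplit(target_uri).netloc
    for name in ("Location", "Content-Location"):
        value = response_headers.get(name)
        if value is None:
            continue
        resolved = urljoin(target_uri, value)
        # URIs on a different host are left alone.
        if urlsplit(resolved).netloc == target_host:
            invalidated.add(resolved)
    return invalidated
```

Any other document the server happened to change is not in this set, which is why it can stay stale in caches until it expires.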
Design the URL paths in a way that the resources can easily be retrieved.
The query string parameter in the first approach is typically used to locate a resource, so using it on creation is a little counterintuitive to me.
The second approach would probably work, as you are creating an expedition under an associated record xxx, i.e. /records/xxx/expeditions. But it could get challenging in a scenario where an expedition is not related to any record.
Another alternative is to link the expedition and record through the payload, i.e. include the record id within the POST payload when creating the "expedition" resource. POST /expedition would return an expedition id in the response as the resource gets created. To retrieve the data, you could then use GET /expedition/XXX/record, where XXX is the expedition id, to retrieve the record corresponding to that expedition. You don't need to mention a record id in this case. You either get an associated record or you don't (in case there is no record tied to the expedition). To retrieve the expedition itself, the URL could be GET /expedition/XXX.

Rest API resources: composed for post but separate to get is ok?

I'm creating a set of RESTful API and I have this question: I have a resource that has some images attached and I'm creating my resources like this:
POST to /resource: create the object and save the images to the server.
GET to /resource returns the object but not the images; GET to /images returns the images.
My question is: is this compliant with REST or should I either make the resources completely separate or completely unified?
I choose this solution because when posting I for sure send the images, but I may need them or not when doing a get.
is this compliant with REST
It's fine - for instance, if you look at the way that Atom Publishing works, you'll see that when you post a media to a collection, two resources are created (the Media Resource and the Media Link Entry).
However, fine here means you are making tradeoffs. In particular, cache invalidation is more challenging when a POST to one resource may change the representation of a different resource.
HTTP caching semantics are optimized for the common case, which is that a given request changes only the target resource; no spooky action at a distance.
That doesn't mean that this is the Right Way, but rather that you should understand that this is the way that HTTP makes easy, and in other cases you need to pay attention to the details.
POST can basically mean anything, and unlike PUT and GET (for which there is a bit more of a direct relation), POST is allowed to result in all kinds of side-effects, including the creation of 0 or more resources.
POST is basically the 'anything goes' method.

How should I provide DocuSign with a PDF?

We're using Python and the Requests library to add PDFs to DocuSign envelopes using the Add document method of the REST API v2:
response = requests.put(
    '<base URL>/envelopes/<envelope ID>/documents/<next document ID>',
    files={'document': <the PDF file object>},  # <- added to the request's body
    headers=self._get_headers(
        {
            'Content-Disposition': 'document; filename="the-file.pdf";'
        }
    ),
    timeout=60
)
This has worked for us in most cases, except that about 1 in 100 PDFs isn't accepted via the API. When this problem occurs we tell our users to upload the PDFs directly through the DocuSign UI, which works. This prompted us (with the help of support) to look at the Document params link that appears above the example request on the Add document page linked above. The page shows a documentBase64 attribute, and a number of other fields. How can I supply the document in this format, with all the fields specified? Should I replace the file in my call above with files={'document': <JSON-encoded object>} or? I can't figure out how to add a document OTHER than the way we're currently doing it. Is there another way I'm missing?
It looks like there are now two different ways to add a document to a Draft Envelope with the REST API:
1. Use a multi-part request, where the first part contains the JSON body and each subsequent part contains a document's bytes, in un-encoded format. An example of this approach is shown on pages 136-137 of the REST API guide (http://www.docusign.com/sites/default/files/REST_API_Guide_v2.pdf).
2. Use a normal request (i.e., not a multi-part request), and supply document bytes in base64-encoded format as the value of the documentBase64 property for each document object in the request. (This looks to be new as of the recent December 2013 API release/update.)
Based on the info you've included in your question, I suspect you're currently using approach #1. As described above, the main difference between the two approaches is the general structure of the request, and ALSO -- approach #1 expects document bytes to be un-encoded, while approach #2 expects document bytes to be base64-encoded. I suspect your issue has to do with encoding of files. i.e., if you're using approach #1 and any of the files are encoded, you'll likely have issues.
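As a rough sketch of approach #2, the document bytes can be base64-encoded and placed in a JSON body. The documentBase64 property is the one named on the Document params page; the other keys (documentId, name), the enclosing "documents" list, and the helper build_documents_body are assumptions for illustration and should be checked against the API documentation.

```python
import base64
import json


def build_documents_body(pdf_bytes: bytes, document_id: str, name: str) -> str:
    """Build a JSON request body carrying one base64-encoded document."""
    # b64encode returns bytes; decode to str so it can be JSON-serialized.
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    document = {
        "documentId": document_id,  # assumed field name
        "name": name,               # assumed field name
        "documentBase64": encoded,
    }
    return json.dumps({"documents": [document]})
```

The resulting string would presumably be sent as a JSON request body (for example via the data= argument of requests with a Content-Type of application/json) instead of the files= argument used for the multi-part approach.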

Implementing complex search in RESTful service

We are trying to implement complex search functionality in a REST service using ColdFusion 10.
Something like projectid=1 and active=1 and (ManagerName contains John or ManagerName contains alfred)
One way of doing this is ?projectid=1&active=1&ManagerName=[John,Alfred]. However this does not serve my purpose as the ManagerName search will not return the required result. Also, as the number of search filter grows, the query string becomes tough to handle.
I tried to get an xml (with all search filter) as an input through HTTP Get Request, but that did not help, as GetHTTPRequestData() does not reflect the xml content.
Is there any means to pass an xml/json through an HTTP Get Request??
Will it be a bad practice, if xml is passed by a HTTP Post Request??
Are there any other options to pass complex filter param to a REST service??
I have gone through a lot of posts on the site for similar questions, but still could not find a solution to my problem.
GET should be safe and idempotent: it should not modify the state of the resource. Strictly limit GET usage to read operations.
POST triggers resource creation, i.e. along with a payload (XML/JSON). It is very bad practice to use POST for search.
You should also take care of Cache-Control, as your GET requests might get cached, and if your search is real-time you might get stale data.
You can take Stack Overflow itself as an example:
https://stackoverflow.com/questions/tagged/rest?sort=newest&pagesize=30
In the above URL, the path elements questions, tagged, and rest derive a subset of the question resources.
The query parameters filter that subset down to the items that meet the criteria.
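A minimal sketch of expressing the filters from the question as query parameters, using repeated ManagerName parameters instead of a bracketed list. How the server interprets repeated values (here, presumably as "contains John or contains Alfred") is an assumption, and build_search_query is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode


def build_search_query(filters: dict) -> str:
    """Encode filters as a query string; list values become repeated keys."""
    # doseq=True expands each list into one key=value pair per element.
    return urlencode(filters, doseq=True)


filters = {"projectid": 1, "active": 1, "ManagerName": ["John", "Alfred"]}
query = build_search_query(filters)
# query is "projectid=1&active=1&ManagerName=John&ManagerName=Alfred"

# On the receiving side, repeated parameters come back as a list of strings.
parsed = parse_qs(query)
```

This keeps the search on GET while letting each filter carry more than one value.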