RESTful way of getting a resource, but creating it if it doesn't exist yet - api

For a RESTful API that I'm creating, I need to have some functionality that get's a resource, but if it doesn't exist, creates it and then returns it. I don't think this should be the default behaviour of a GET request. I could enable this functionality on a certain parameter I give to the GET request, but it seems a little bit dirty.
The main point is that I want to do only one request for this, as these requests are gonna be done from mobile devices that potentially have a slow internet connection, so I want to limit the requests that need to be done as much as possible.
I'm not sure if this fits in the RESTful world, but if it doesn't, it will disappoint me, because it will mean I have to make a little hack on the REST idea.
Does anyone know of a RESTful way of doing this, or otherwise, a beatiful way that doesn't conflict with the REST idea?

Does the client need to provide any information as part of the creation? If so then you really need to separate out GET and POSTas otherwise you need to send that information with each GET and that will be very ugly.
If instead you are sending a GET without any additional information then there's no reason why the backend can't create the resource if it doesn't already exist prior to returning it. Depending on the amount of time it takes to create the resource you might want to think about going asynchronous and using 202 as per other answers, but that then means that your client has to handle (yet) another response code so it might be better off just waiting for the resource to be finalised and returned.

very simple:
Request: HEAD, examine response code: either 404 or 200. If you need the body, use GET.
It not available, perform a PUT or POST, the server should respond with 204 and the Location header with the URL of the newly created resource.

Related

GET vs POST API calls and cache issues

I know that GET is used to retrieve data from the server without modifying anything. Whereas POST is used to add data. I won't get into PUT/PATCH, and assume that POST is always used to update and replace data.
The theory is nice, but in practice I have encountered many situations where my GET calls need to be replaced with POST calls. This is because the response often gets incorrectly cached. Where I work there are proxy servers for security, caching, load balancing, etc., and often times the response for GET calls is directly cached to speed up the call, whereas POST calls never get fully cached.
So for my question, if I have an API call /api/get_orders/month. Theoretically, this should be a GET call, however, the number of orders might update any second. So if I call this API at any moment it may return for example 1000, and calling it just two seconds later should return 1001. However, because of the cache, and although adding a version flag such as ?v=<date_as_int> should ensure that the updated value is returned, there seems to be some caches in the proxy servers that might ignore this.
Basically, I don't feel safe enough using GET unless I know for certain that the data will not be modified or if I know for a fact that the response is always the updated data.
So, would you recommend using POST/GET in the case of retrieving daily/monthly number of orders. And if GET, with all the different and complex layers and server set-ups, how can one be certain that the data is always updated?
If you're doing multiple GET request and something is caching the data in between, you have no idea what it is or how to change it's behavior then POST is a valid workaround.
In any normal situation you would take the time what sits in between your browser and your server, and if there's something that's behaving in a way that doesn't make sense, I would try to investigate and fix that.
So you work at a place where some of that infrastructure exists. Maybe talk to the people that maintain it? But if that's not an option and you just want to find the 'ignore every convention and make my request work'-workaround, then you can use POST.

GET vs. POST when the request is some arbitrary size

I understand the semantics of GETting vs. POSTing, one endpoint should get data, the other should post it. The latter being a request that you may not wish the user to be able to easily replay.
That said, on the project I'm working on at the moment - the approach has been to POST to endpoints that are clearly responsible for responding with data, and these endpoints do not transform data in any way.
The reasoning behind this has been that the payloads are (potentially) of considerable size and seem more appropriate for a body as opposed to a query string.
Can anybody please shed light on which request would be right for a GET request that takes a large request payload? I'm not asking for opinion, I'm asking for what would be compliant to RESTful deisgn.
Further Context
The request is potentially large due to the fact it's a search DTO from the UI, where users may choose to pass any number of filters or search terms.
Can anybody please shed light on which request would be right for a GET request that takes a large request payload? I'm not asking for opinion, I'm asking for what would be compliant to RESTful deisgn.
Today's answer: It's OK to use POST.
For requests that are fundamentally read-only, we'd like to use standardized HTTP semantics to communicate that to general purpose components, so that they can themselves do intelligent things.
BUT: GET, while being both safe and ubiquitous, isn't an appropriate choice when you need to include a message-body in the request:
content received in a GET request has no generally defined semantics
So if you can't, for whatever reason, copy the information you need into a resource identifier, then GET is not an option.
Now, if your payloads are consistent with WebDAV, then you might be able to use one of the safe methods described in those specifications. But, as far as I can tell, they aren't really appropriate for general use.
Tomorrow's answer: the HTTP-WG accepted a proposal for a safe-method-with-body. So we should eventually expect to see a registered HTTP method that is safe and has defined semantic for the request content.
Then, depending on what those semantics are, we may be able to use it for requests like yours.

How do I design a REST call that is just a data transformation?

I am designing my first REST API.
Suppose I have a (SOAP) web service that takes MyData1 and returns MyData2.
It is a pure function with no side effects, for example:
MyData2 myData2 = transform(MyData myData);
transform() does not change the state of the server. My question is, what REST call do I use? MyData can be large, so I will need to put it in the body of the request, so POST seems required. However, POST seems to be used only to change the server state and not return anything, which transform() is not doing. So POST might not be correct? Is there a specific REST technique to use for pure functions that take and return something, or should I just use POST, unload the response body, and not worry about it?
I think POST is the way to go here, because of the sheer fact that you need to pass data in the body. The GET method is used when you need to retrieve information (in the form of an entity), identified by the Request-URI. In short, that means that when processing a GET request, a server is only required to examine the Request-URI and Host header field, and nothing else.
See the pertinent section of the HTTP specification for details.
It is okay to use POST
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
It's not a great answer, but it's the right answer. The real issue here is that HTTP, which is a protocol for the transfer of documents over a network, isn't a great fit for document transformation.
If you imagine this idea on the web, how would it work? well, you'd click of a bunch of links to get to some web form, and that web form would allow you to specify the source data (including perhaps attaching a file), and then submitting the form would send everything to the server, and you'd get the transformed representation back as the response.
But - because of the payload, you would end up using POST, which means that general purpose components wouldn't have the data available to tell them that the request was safe.
You could look into the WebDav specifications to see if SEARCH or REPORT is a satisfactory fit -- every time I've looked into them for myself I've decided against using them (no, I don't want an HTTP file server).

Is it REST if I pass the following URI /apps/{id}?control=start

I'm in the process of designing a REST API for our web app.
POST > /apps > Creates an app
PUT > /apps/{id} > Updates the app
I want to start the apps.
Is this REST and if not, how can I make it more RESTful?
POST > /apps/{id}?control=start
Sun Cloud API does this: http://kenai.com/projects/suncloudapis/pages/CloudAPISpecificationResourceModels
Or is it better to:
2. PUT /apps/{id} and include a status parameter in the response Json/XML?
3. POST /apps/{id} and include a status parameter in the response Json/xml?
4. POST /apps/start?app={id}
I think the right question here is more whether the HTTP verbs are being used as intended rather than whether the application is or is not as RESTful as possible. However, these days the two concepts are pretty much the same.
The thing about PUT is that whatever you PUT you should be able to immediately GET. In other words, PUT does a wholesale replacement of the resource. If the resource stored at apps/5 is something that has a "control" attribute as part of its state, then the control=start part should be part of the representation you put. If you want to send just the new piece of the resource, you are doing a PATCH, not a PUT.
PATCH is not widely supported, so IMHO you should use a POST. POST has no requirements of safety or idempotency; generally you can do whatever you want with a POST (more or less), including patching parts of a resource. After all that is what you do when you create a new item in a collection with a POST. Updating a portion of a resource is not really much different.
Generally though you POST new data in the request body, not as query parameters. Query parameters are used mostly for GETs, because you are, well, querying. :)
Does starting an app changes it state? (to "running", for example) If it does what you're actually doing is updating the state of the resource (application). That seems like a good use for the PUT operation. Although as Ray said, if control is part of the state of the resource, the body of the PUT request should contain the state you're updating. I believe a partial update would be possible (CouchDB uses this).
On the other hand, if starting an app means creating a new resource (representing the app execution, for example), the POST method would be a great fit. You could have something like this:
POST /app/1/start
Which would result in a HTTP/1.1 201 Created. Then, to access the information on the created execution, you could use a URL like this:
GET /app/1/execution/1
To me, this would seem like a good "Restful" approach. For more information, check out this article.
PUT apps/{id}
I would PUT the app to update it's status from off to on
I like to do something like,
POST /runningapps?url=/app/1

Confused about Http verbs

I get confused when and why should you use specific verbs in REST?
I know basic things like:
Get -> for retrieval
Post -> adding new entity
PUT -> updating
Delete -> for deleting
These attributes are to be used as per the operation I wrote above but I don't understand why?
What will happen if inside Get method in REST I add a new entity or inside POST I update an entity? or may be inside DELETE I add an entity. I know this may be a noob question but I need to understand it. It sounds very confusing to me.
#archil has an excellent explanation of the pitfalls of misusing the verbs, but I would point out that the rules are not quite as rigid as what you've described (at least as far as the protocol is concerned).
GET MUST be safe. That means that a GET request must not change the server state in any substantial way. (The server could do some extra work like logging the request, but will not update any data.)
PUT and DELETE MUST be idempotent. That means that multiple calls to the same URI will have the same effect as one call. So for example, if you want to change a person's name from "Jon" to "Jack" and you do it with a PUT request, that's OK because you could do it one time or 100 times and the person's name would still have been updated to "Jack".
POST makes no guarantees about safety or idempotency. That means you can technically do whatever you want with a POST request. However, you will lose any advantage that clients can take of those assumptions. For example, you could use POST to do a search, which is semantically more of a GET request. There won't be any problems, but browsers (or proxies or other agents) would never cache the results of that search because it can't assume that nothing changed as a result of the request. Further, web crawlers would never perform a POST request because it could not assume the operation was safe.
The entire HTML version of the world wide web gets along pretty well without PUT or DELETE and it's perfectly fine to do deletes or updates with POST, but if you can support PUT and DELETE for updates and deletes (and other idempotent operations) it's just a little better because agents can assume that the operation is idempotent.
See the official W3C documentation for the real nitty gritty on safety and idempotency.
Protocol is protocol. It is meant to define every rule related to it. Http is protocol too. All of above rules (including http verb rules) are defined by http protocol, and the usage is defined by http protocol. If you do not follow these rules, only you will understand what happens inside your service. It will not follow rules of the protocol and will be confusing for other users. There was an example, one time, about famous photo site (does not matter which) that did delete pictures with GET request. Once the user of that site installed the google desktop search program, that archieves the pages locally. As that program knew that GET operations are only used to get data, and should not affect anything, it made GET requests to every available url (including those GET-delete urls). As the user was logged in and the cookie was in browser, there were no authorization problems. And the result - all of the user photos were deleted on server, because of incorrect usage of http protocol and GET verb. That's why you should always follow the rules of protocol you are using. Although technically possible, it is not right to override defined rules.
Using GET to delete a resource would be like having a function named and documented to add something to an array that deletes something from the array under the hood. REST has only a few well defined methods (the HTTP verbs). Users of your service will expect that your service stick to these definition otherwise it's not a RESTful web service.
If you do so, you cannot claim that your interface is RESTful. The REST principle mandates that the specified verbs perform the actions that you have mentioned. If they don't, then it can't be called a RESTful interface.