How to handle revisions and subresources in an HTTP REST API?

How to handle revisions and subresources in an HTTP REST API? - api

Background
I'm in the process of designing a RESTful HTTP API for discussions. Each discussion consists of metadata and a collection of comments. The URLs could look like:
/discussions/:discussionId
/discussions/:discussionId/comments/:commentId
Discussions have a revision: if the metadata is modified or if a comment is added, modified or removed, the discussion's revision is updated. Furthermore, optimistic concurrency control is used to make sure that the client applies updates to the most recent revision of a discussion. For example,
PUT /discussions/42?revision=5ac79c
can only succeed if 5ac79c is the most recent revision on the server. If it succeeds, the new discussion revision is returned to the client in the representation. For non-nested resources, all operations can be handled in a clean way (analogous to the CouchDB API). The problem arises with subresources.
Question
How should the new revision of the discussion be communicated to the client when a comment is created, updated or deleted?
For example: should the request
PUT /discussions/42/comments/23?revision=5ac79c
return the new revision of the discussion in the representation? It feels wrong to include the state of the parent resource.
Any hints?
Bonus: maybe there's also a better way to pass the current revision of the discussion than appending the query string ?revision=5ac79c? It's clear that the ETag and If-Match headers can be used to specify a revision – however, they don't solve the issue that the revision of the discussion is used for a comment.

Related

Restful api - Is it a good practice, for the same upsert api, return different HttpStatusCode based on action

So i have one single http post API called UpsertPerson, where it does two things:
check if Person existed in DB, if it does, update the person , then return Http code 200
if not existed in DB, create the Person, then return http 201.
So is it a good practices by having the same api return different statusCode (200,201) based on different actions(update, create)?
This is what my company does currently , i just feel like its weird. i think we should have two individual api to handle the update and create.

ninja edit my answer doesn't make as much sense because I misread the question, I thought OP used PUT not POST.
Original answer
Yes, this is an excellent practice, The best method for creating new resources is PUT, because it's idempotent and has a very specific meaning (create/replace the resource at the target URI).
The reason many people use POST for creation is for 1 specific reason: In many cases the client cannot determine the target URI of the new resource. The standard example for this is if there's auto-incrementing database ids in the URL. PUT just doesn't really work for that.
So PUT should probably be your default for creation, and POST if the client doesn't control the namespace. In practice most APIs fall in the second category.
And returning 201/200/204 depending on if a resource was created or updated is also an excellent idea.
Revision
I think having a way to 'upsert' an item without knowing the URI can be a useful optimization. I think the general design I use for building APIs is that the standard plumbing should be in place (CRUD, 1 resource per item).
But if the situation demands optimizations, I typically layer those on top of these standards. I wouldn't avoid optimizations, but adopt them on an as-needed basis. It's still nice to know if every resource has a URI, and I have a URI I can just call PUT on it.
But a POST request that either creates or updates something that already exists based on its own body should:
Return 201 Created and a Location header if something new was created.
I would probably return 200 OK + The full resource body of what was updated + a Content-Location header of the existing resource if something was updated.
Alternatively this post endpoint could also return 303 See Other and a Location header pointing to the updated resource.
Alternatively I also like at the very least sending a Link: </updated-resource>; rel="invalidates" header to give a hint to the client that if they had a cache of the resource, that cache is now invalid.

So is it a good practices by having the same api return different statusCode (200,201) based on different actions(update, create)?
Yes, if... the key thing to keep in mind is that HTTP status codes are metadata of the transfer-of-documents-over-a-network domain. So it is appropriate to return a 201 when the result of processing a POST request include the creation of new resources on the web server, because that's what the current HTTP standard says that you should do (see RFC 9110).
i think we should have two individual api to handle the update and create.
"It depends". HTTP really wants you to send request that change documents to the documents that are changed (see RFC 9111). A way to think about it is that your HTTP request handlers are really just a facade that is supposed to make your service look like a general purpose document store (aka a web site).
Using the same resource identifier whether saving a new document or saving a revised document is a pretty normal thing to do.
It's absolutely what you would want to be doing with PUT semantics and an anemic document store.
POST can be a little bit weird, because the target URI for the request is not necessarily the same as the URI for the document that will be created (ie, in resource models where the server, rather than the client, is responsible for choosing the resource identifier). A common example would be to store new documents by sending a request to a collection resource, that updates itself and selects an identifier for the new item resource that you are creating.
(Note: sending requests that update an item to the collection is a weird choice.)

What HTTP method should I use for an endpoint that updates a status field of multiple entities

I like to use the correct HTTP methods when I'm creating an API. And usually it's very straightforward. POST for creating entities, PUT for updating them, GET for retrieving etc.
But I have a use-case here where I will create an endpoint that updates the status of multiple objects given 1 identifier.
e.g.:
/api/v1/entity/update-status
But note that I mentioned multiple objects. The initial thought of my team would be to use map it as POST, but it won't actually be creating anything, plus if you were to call the same endpoint multiple times with the same identifier, nothing would change after the first time. Making it idempotent.
With this in mind, my idea was to create it as a PUT or even PATCH endpoint.
What do you smart people think?

I imagine PATCH would be the most correct way. Although if you use a PUT it would also not be incorrect.
The difference between the PUT and PATCH requests is reflected in the
way the server processes the enclosed entity to modify the resource
identified by the Request-URI. In a PUT request, the enclosed entity
is considered to be a modified version of the resource stored on the
origin server, and the client is requesting that the stored version be
replaced. With PATCH, however, the enclosed entity contains a set of
instructions describing how a resource currently residing on the
origin server should be modified to produce a new version. The PATCH
method affects the resource identified by the Request-URI, and it also
MAY have side effects on other resources; i.e., new resources may be
created, or existing ones modified, by the application of a PATCH.

Whilst it is a convention in REST APIs that POST is used to create a resource it doesn't necessarily have to be constrained to this purpose.
Referring back to the definition of POST in RFC 7231:
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others):
Providing a block of data, such as the fields entered into an HTMl form, to a data-handling process
Posting a message to a bulletin board, newsgroup, mailing list, blog, or similar group of articles;
*Creating a new resource that has yet to be identified by the origin server; and *
Appending data to a resource's existing representation(s).
Clearly creation is only one of those purposes and updating existing resources is also legitimate.
The PUT operation is not appropriate for your intended operation because again, per RFC, a PUT is supposed to replace the content of the target resource (URL). The same also applies to PATCH but, since it is intended for partial updates of the target resource you can target it to the URL of the collection.
So I think your viable options are:
POST /api/v1/entity/update-status
PATCH /api/v1/entity
Personally, I would choose to go with the PATCH as I find it semantically more pleasing but the POST is not wrong. Using PATCH doesn't gain you anything in terms of communicating an idempotency guarantee to a consumer. Per RFC 5789: "PATCH is neither safe nor idempotent" which is the same as POST.

REST Api actions on resources outside standard

I am designing a HTTP API for a system and plan to adopt a REST style. Part of this system's functionality is to manage Windows Services across servers. This naturally suggests endpoints like GET /servers/:id/services to list known services on a server and GET /servers/:id/services/:name to detail one. In addition to being created, deleted and updated (typical POST, PUT, PATCH and DELETE), services can be stopped, started and restarted. It's these operations that I'm not sure exactly how to handle. There strikes me as being three reasonable solutions.
Provide reified commands, i.e. POST /servers/:id/services/:name/start, POST /servers/:id/services/:name/stop and POST /servers/:id/services/:name/stop, POST /servers/:id/services/:name/restart.
Treat the operational status as a sub resource, i.e. POST /servers/:id/services/:name/operation to start, DELETE /servers/:id/services/:name/operation to stop and POST /servers/:id/services/:name/restart_required to restart. (This last at least matches the internal behavior of the system - which does not immediately restart a service but posts to a db that the service is to be restarted.)
Make the operational status an attribute and use PATCH to perform these operations, i.e. PATCH /servers/:id/services/:name/ {"status": "started"}, PATCH /servers/:id/services/:name/ {"status": "stopped"} and PATCH /servers/:id/services/:name {"status": "restart_pending"}.
I'm not sure which is preferable. Option 1 is comprehensible, but it's not quite in the spirit of REST. Option 2 is conceptually a weird as it treats the attributes of a service as separate (if subordinate) entities. Option 3 is probably cleanest, the main difficulty is that it is not as obvious (would require reading the documentation for PATCH /servers/:id/services/:name more closely. Any ideas of which approach (including any I've missed) most cleanly exposes this functionality?

You shouldn't use action within resources paths. It's not really RESTful. I think the last option is the best one according REST principles.
In fact, patch allows to do partially updates of the resource state. You're free to put what you want in the request payload:
For example simply the new value of fields with something like that:
PATCH /servers/:id/services/:name/
Content-Type: application/json
{"status": "starting"}
and the response could be if successful:
200 OK
{"status": "started"}
There is a specification for PATCH content called JSON Patch for more advanced usage. For example, if you want to add (or remove, ...) something in the resource state. You could have a look at this post in the section "Implementing bulk updates" for a sample of use: http://restlet.com/blog/2015/05/18/implementing-bulk-updates-within-restful-services/.
Another option would be to use the method POST on the path /servers/:id/services/:name/ and provide in the content the operation to do. For example:
POST /servers/:id/services/:name/
Content-Type: application/json
{"command": "start"}
Hope it will help you,
Thierry

GET or PUT to reboot a remote resource?

I am struggling (in some sense) to determine which HTTP method is more appropriate for rebooting a remote resource: GET or PUT?
On one hand, it seems more semantic to call http://tools.serviceprovider.net/canopies/d34db33fc4f3?reboot=true because one might want to GET a representation of a freshly rebooted canopy.
On the other hand, a reboot is not 'safe' (nor is it necessarily idempotent, but then a canopy or modem is not just a row in a database) so it might seem more semantic to PUT the canopy into a state of rebooting, then have the server return a 202 to indicate that the reboot was initiated and is processing.
I have been reading up on HTTP/1.1, REST, HATEOAS, and other related concepts over the last week, so I am still putting the pieces together. Could a more seasoned developer please weigh in and confirm or dispel my hunch?

A GET doesn't seem appropriate because a GET is expected, like you said, to be "safe". i.e. no action other than retrieval.
A PUT doesn't seem appropriate because a PUT is expected to be idempotent. i.e. multiple identical operations cause same side-effects as as a single operation. Moreover, a PUT is usually used to replace the content at the request URI with the request body.
A POST appears most appropriate here. Because:
A POST need not be safe
A POST need not be idempotent
It also appears meaningful in that you are POSTing a request for a reboot (much like submitting a form, which also happens via POST), which can then be processed, possibly leading to a new URI containing reboot logs/results returned along with a 303 See Other status code.

Interestingly, Tim Bray wrote a blog post on this exact topic (which method to use to tell a resource representing a virtual machine to reboot itself), in which he also argued for POST. At the bottom of that post there are links to follow-ups on that topic, including one from none other than Roy Fielding himself, who concurs.

Rest is definitely not HTTP. But HTTP definitely does not have only four (or eight) methods. Any method is technically valid (even if as an extension method) and any method is RESTful when it is self describing — such as ‘LOCK’, ‘REBOOT’, ‘DELETE’, etc. Something like ‘MUSHROOM’, while valid as an HTTP extension, has no clear meaning or easily anticipated behavior, thus it would not be RESTful.
Fielding has stated that “The REST style doesn’t suggest that limiting the set of methods is a desirable goal. [..] In particular, REST encourages the creation of new methods for obscure operations” and that “it is more efficient in a true REST-based architecture for there to be a hundred different methods with distinct (non-duplicating), universal semantics.”
Sources:
http://xent.com/pipermail/fork/2001-August/003191.html
http://tech.groups.yahoo.com/group/rest-discuss/message/4732
With this all in mind I am going to be 'self descriptive' and use the REBOOT method.

Yes, you could effectively create a new command, REBOOT, using POST. But there is a perfectly idempotent way to do reboots using PUT.
Have a last_reboot field that contains the time at which the server was last rebooted. Make a PUT to that field with the current time cause a reboot if the incoming time is newer than the current time. If an intermediate server resends the PUT, no problem -- it has the same value as the first command, so it's a no-op.
You might want to get the current time from the server you're rebooting, unless you know that everyone is reasonably time-synced.
Or you could just use a times_rebooted count, eliminating the need for a clock. A PUT times_rebooted: 4 request will cause a reboot if times_rebooted is currently 3, but not if it's 4 or 5. If the current value is 2 and you PUT a 4, that's an error.
The only advantage to using time, if you have a clock, is that sometimes you care about when it happened. You could of course have BOTH a times_rebooted and a last_reboot_time, letting times_rebooted be the trigger.

In HTTP, does PUT and POST send data differently?

From what I know you can send JSON data via POST, but should PUT be specifically sending information in the URI or can you do both?
Thanks!

Both POST and PUT can be used for create and update operations in different situations. So what exactly is the difference between PUT and POST?
In a nutshell: use PUT if and only if you know both the URL where the resource will live, and the entirety of the contents of the resource. Otherwise, use POST.
POST is an incredibly general verb. Because it promises neither safety nor idempotence, and it has a relatively loosely-worded description in the RFC, you can use it for pretty much anything. In fact, you could make all of your requests POST requests because POST makes very few promises; it can behave like a GET, a PUT, or a DELETE if it wants to. It also can do some things that no other verb can do - it can create a new resource at a URL different from the URL in the HTTP request; and it can modify part of a resource without changing the whole thing (although the proposed but not widely-accepted PATCH method can do something similar).
PUT is a much more restrictive verb. It takes a complete resource and stores it at the given URL. If there was a resource there previously, it is replaced; if not, a new one is created. These properties support idempotence, which a naive create or update operation might not. I suspect this may be why PUT is defined the way it is; it's an idempotent operation which allows the client to send information to the server.
References:
RFC 2616 - HTTP 1.1
RFC 5789 - PATCH method for HTTP
Martin Fowler, the Richardson Maturity Model

From HTTP's point of view, the request format is the same.

You can send the request body the same way, it is just handled differently by your application code...
The POST verb is traditionally used to create a resource
The PUT verb is traditionally used to update a resource

PUT uploads a new resource on the server. If the resource already exists and is different, it is replaced; if it doesn't exist, it is created.
POST triggers an action on the server. It has side-effects and can be used to trigger an order, modify a database, post a message in a forum, or other actions.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas