Link relation granularity vs precision in a custom media type?

I am in the process of designing a custom media type for a RESTful API, and have researched the types and semantic meaning of some of the 'standard' link relations to give my design some steer.
To demonstrate the problem let's say that I have a resource that I can perform standard read, change, delete methods on and that I use the HTTP idioms of GET, PUT and DELETE respectively to implement those methods.
I could reasonably (re)use the "edit" link relation (from the IANA link registry) as defined in RFC5023 which states:
"...The value of "edit" specifies that the value of the href attribute
is the IRI of an editable Member Entry. When appearing within an
atom:entry, the href IRI can be used to retrieve, update, and delete
the Resource represented by that Entry...."
In this way, the user-agent can understand that a link with an "edit" relation allows the resource to be retrieved (GET), updated (PUT) and deleted (DELETE).
However, and herein lies the problem, if the resource state is edited such that the resource now supports only GET and DELETE operations, the "edit" relation is no longer precise.
In order to retain the precision I need to either i) OPTION A: specify another (compound) link relation that supports GET & DELETE only, or ii) OPTION B: specify individual links for each possible state transfer and use the appropriate ones to indicate the permitted state transfers. The latter approach offers precision but seems overly verbose.
Alternatively, (OPTION C) I could leave the "edit" relationship in place and accept the lack of precision i.e. the link would convey the GET, PUT, DELETE semantics but a user-agent attempting a PUT would be met with an HTTP error '405 - Method not allowed'. However, I'm not happy with this approach either as it implies to the client a state transition which is not supported.
In summary, the question is what is the most sensible way to balance link relation generality and precision?

After some serious investigation I conclude that I'm trying to solve the wrong problem. Rather than be concerned with the granularity of HTTP verb in the definition of the Link Relation, a more refined question is 'Should the HTTP idioms (verbs) be conflated into the Link Relation?'.
I had used AtomPub as a reference for how to do Link Relations (for REST) and it turns out that this was an error. In the AtomPub mail archive Roy Fielding advises that (in REST terms) the approach to 'edit' is wrong and concludes that it is unnecessary. The argument suggests that there are other (HTTP) mechanisms to convey such properties and that they therefore have no place in the 'rel' attribute.
The other mechanisms aren't made explicit in the mail archive, but I suspect they include the following options:
Let the user-agent try, and examine the response (2xx or 4xx), or
Use OPTIONS to ask the resource for the permitted operations, or
Include an 'Allow' header in responses to successful GET requests to convey the permitted resource operations to the user-agent.
Interestingly, Roy considers the 'Allow' header to be "a form of hypertext".
In summary, the answer to my own question is:
"Do not conflate HTTP operations into the meaning of 'rel' "
and
"Use the (provided) HTTP mechanisms to determine permitted resource operations"
Edit: I should add that there are some special uses of POST as a data sink where these rules need to be bent a little, but then they are a special case.
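To illustrate the 'Allow' approach, here is a minimal Python sketch of how a user-agent might decide whether an operation is currently permitted. The function names and the sample header value are made up for illustration; the header itself would come from an OPTIONS or GET response.

```python
def parse_allow(header_value: str) -> set:
    """Split an Allow header value such as "GET, DELETE" into a set of method names."""
    return {token.strip() for token in header_value.split(",") if token.strip()}


def can_perform(method: str, allow_header: str) -> bool:
    """Return True if the method is listed in the Allow header value."""
    return method in parse_allow(allow_header)


# Hypothetical Allow value from a resource that can no longer be updated.
allow = "GET, DELETE"
print(can_perform("PUT", allow))
print(can_perform("DELETE", allow))
```

With this, the "edit" relation (or any other) only needs to say what the link means, and the set of permitted methods is read from the response metadata.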

The WRML specification takes an approach where each "link" object can have a rel property.
GET /dogs/1
{
  "links" : {
    "self" : {
      "href" : "http://api.example.com/dogs/1",
      "rel" : "http://api.example.com/relations/self"
    }
  }
}
And the client can then follow the rel url
GET /relations/self
{
  "name" : "self",
  "description" : "A reference back to the same object you are currently interacting with",
  "method" : "GET"
}
The spec does recommend that each rel should have exactly one method specified. This has the benefit of being very explicit with your clients about what they should do, and limits the amount of out-of-band knowledge that is required. I personally go back and forth on this, because I think there is some value in saying that certain "rel" values provide multiple HTTP methods. Imagine a link for the owner of the dog:
GET /dogs/1
{
  "links" : {
    "self" : {
      "href" : "http://api.example.com/dogs/1",
      "rel" : "http://api.example.com/relations/self"
    },
    "owner" : {
      "href" : "http://api.example.com/owner/1",
      "rel" : "http://api.example.com/relations/owner"
    }
  }
}
It would be nice to let "owner" imply GET and PUT since those are both valid actions. The counter to that is that you should always do a GET before doing an update, so there is little value in giving that information before the resource has been retrieved.
So I guess all that said I would vote for OPTION B.
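As a rough sketch of the one-method-per-relation idea, a client could look up the method for a link like this. The RELATIONS dict is a hard-coded stand-in for the documents a client would fetch from each rel URL, and method_for is a hypothetical helper, not something defined by the spec.

```python
# Stand-in for the documents a client would fetch from each "rel" URL.
RELATIONS = {
    "http://api.example.com/relations/self": {"name": "self", "method": "GET"},
    "http://api.example.com/relations/owner": {"name": "owner", "method": "GET"},
}


def method_for(link: dict) -> str:
    """Return the single HTTP method documented for the link's relation."""
    return RELATIONS[link["rel"]]["method"]


dog = {
    "links": {
        "owner": {
            "href": "http://api.example.com/owner/1",
            "rel": "http://api.example.com/relations/owner",
        }
    }
}

owner_link = dog["links"]["owner"]
print(method_for(owner_link), owner_link["href"])
```

Allowing several methods per relation would mean returning a list here, which is exactly the extra knowledge the client would then have to handle.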

Another option would be to leave the "edit" relation in place, and allow a consumer who wants to know what they can currently perform on the resource to make a request with the OPTIONS HTTP method. The server can then return a response with an Allow header to indicate the allowed methods on the resource given its current state.
It doesn't give you the availability of the PUT operation without an extra request, but is fairly "clean" and lets you use a standard relation and HTTP mechanism.

Related

REST API Design: Can I use parameters to change the response structure?

I am trying to design a REST API that will return detailed information about an identifier that I pass in. For the sake of an example, I am passing in an identifier and returning information about a specific vehicle. The problem that I am facing is that there are many different kinds of vehicles, each with different unique properties. I am wondering if there is a way so that I can only return the relevant details with the REST API.
Currently I plan on having one endpoint /vehicles and passing in the identifier as a parameter.
My current request will consist of something like this GET /vehicles?id=123456
My current response structure will be something like this:
{
  "vehicleDetails" : {
    "color": "someColor",
    "make": "someMake",
    "model": "someModel",
    "year": "someYear",
    "carDetails": {
      // some unique car fields
    },
    "motorcycleDetails" : {
      // some unique motorcycle fields
    },
    "boatDetails" : {
      // some unique boat fields
    }
  }
}
As you can see, there are some fields that are common to all vehicle types, but there are also fields that are unique to a certain type of vehicle, for example boatDetails. As far as I understand, I will have to return the entire resource which will have many empty fields. For example, when I request information about a car, I will still have boat and motorcycle details returned as part of the JSON response, even though they will all be empty. My concern with this is that the response payloads will be rather large when only a small subset of the fields will actually be used by the consumer. Would it make sense to add another parameter to filter the fields that come back? Something like /vehicles?id=123456&type=Car? Then in my code I could manipulate the response structure based on the type parameter? I feel that this violates REST best practices. Any advice into how I could change the design of my API would be appreciated.
Note: I cannot use GraphQL for this and would appreciate input about how I can improve this REST API design

Sure, query parameters (as well as matrix and path parameters) are fine from a REST standpoint. You'll end up with a unique URI that identifies a resource. Responses will be cacheable regardless of what type of parameters you use. It is questionable, though, whether exposing the parameter as a query parameter has any advantage over exposing it directly as a path parameter, i.e. /vehicles/12345 in that case.
What IMO is more important than the actual form of the URI is the representation format you return. The proposed response format bears the potential of being treated as a typed resource instead of utilizing a general, multi-purpose format and proper content-type negotiation. Think of HTML, for example: it defines the syntax and semantics of certain hypermedia controls you can use, and any arbitrary Web client will be able to present it and operate on the response format. With custom formats, however, you will miss out on that feature.
If you only have one client, that moreover is under your control, the additional overhead you need to put in is probably not worth aiming for a REST environment in general, as it is easier to change the client once the server stuff changed. Though if you aim for a long-living environment that may be utilized by clients not under your control, this is for sure a thing to consider.

Can I use parameters to change the response structure?
Yes.
Example: https://www.merriam-webster.com/dictionary/JPEG
Notice that -- even though the URI says JPEG, the server is actually returning a text/html document, and the browser knows that from the Content-Type header in the HTTP response.
Identifiers are semantically opaque, in the same way that variable names are. We tend to choose spellings that make things easy for humans, but the machines don't care what spelling conventions we use.
So there's no constraint that says that /vehicles?id=1 and /vehicles?id=2 have to have the same "structure".
So you could have application/prs.rbanders-car, application/prs.rbanders-boat and application/prs.rbanders-rocketship, each with its own specific processing rules.
More likely, though, you'll want to piggyback on some other more common representation; so it's common to see a structured syntax suffix: application/prs.rbanders-car+json, etc. Effectively, you are promising a JSON document, but with a more specific schema.
Of course, there's nothing stopping you from creating an application/prs.rbanders-vehicle+json schema, and then describing some fields as optional that will only appear if some condition is met.
Different options, different trade offs.
I feel that this violates REST best practices.
Not really - the important ideas are to handle metadata in a standard way (so that general purpose components can understand what is going on) and to use common formats where it makes sense to do so (to leverage the libraries that are already available).
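As a minimal sketch of the per-type media type option, assuming the server stores each vehicle with a "type" and a "details" dict (both assumptions made up for this example, as is the represent function), the handler could build the body and Content-Type together:

```python
import json

# Fields shared by every vehicle type, taken from the question's example.
COMMON_FIELDS = ("color", "make", "model", "year")


def represent(vehicle: dict) -> tuple:
    """Return (content_type, body) holding only the details relevant to the vehicle's type."""
    kind = vehicle["type"]  # e.g. "car" or "boat"; assumed storage shape
    body = {name: vehicle[name] for name in COMMON_FIELDS}
    body[f"{kind}Details"] = vehicle.get("details", {})
    return f"application/prs.rbanders-{kind}+json", json.dumps(body)


car = {
    "type": "car",
    "color": "someColor",
    "make": "someMake",
    "model": "someModel",
    "year": "someYear",
    "details": {"doors": 4},
}
content_type, body = represent(car)
print(content_type)
print(body)
```

The client never sends a type parameter here; it learns which schema it received from the Content-Type header.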

With REST, do you use a body or query params when creating a Resource?

I'm working on a digital media library where users can create entries for a Media resource.
The media resource is made up of tons of properties, eg:
Media:{
id,
name,
type,
private,
}
the url users use to create a resource is
POST api/media
On the backend, we are creating the resource with a UID generated for them while defaulting name, type and private values. However, users can pass in name, type, private if they choose to.
RFC 7231, section 4.3.3 doesn't seem to have an opinion on whether to use query params or the POST body for this data.
So is it better to do this
api/media?type="audio"&name="Hopkins County Collective"&private=false
or with a body instead?
api/media
body{
name:
type:
private:
}
Although, after reading section 4.3.3 for POST here https://www.rfc-editor.org/rfc/rfc7231#section-4.3.3 I see this piece:
Providing a block of data, such as the fields entered into an HTML
form, to a data-handling process;
I'm leaning toward the post fields in the body but I'm still unsure.
Thanks

do you use a body or query params when creating a Resource?
The Body. But it can be more complicated than that.
HTTP gives us standardized message semantics - we all agree, by adopting the common standard, what a given message means. That doesn't necessarily constrain what we do with the message when we get it.
For example.
PUT /id=1 HTTP/?.?
Content-Type: text/plain
id=2
That message means that we want the resource identified by /id=1 to have the representation id=2. In other words, this is the future behavior intended by the client
GET /id=1 HTTP/?.?
200 OK
Content-Location: /id=1
Content-Type: text/plain
id=2
So the body describes what we want the representation to be, and the effective-uri identifies which document we are talking about.
The same basic pattern holds for POST and PATCH - the effective-uri tells us which resource we want to change, the body describes that change.
BUT...
You the server aren't actually required to do what the request asks you to do. You can reject the request (4xx), or you can do something similar to the request, and tell the client about that.
So you might, as part of the implementation hidden behind your REST facade, copy information from the effective-uri in addition to, or instead of, exactly applying the instructions provided by the client in the body of the request. (You have to be a little bit careful with the response metadata to ensure there's no ambiguity about what you did do).

Anecdotally, "just about everyone" seems to be using the body to represent what they want the created resource to look like, be, or contain.
Parameters are often not likely to be used at all, and if they are, only for, perhaps, controlling aspects of how that resource is to be created, not anything having to do with what the resource is to look like, be, or contain.
I say anecdotal, because I'm sure there are exceptions to this -- you're even contemplating it. That said, REST does not specifically say anything about parameters vs. body.
For the sake of conformity, and for the sake of "doing it like everyone else", go with body.
There are other considerations pointing away from parameters: 1) they are part of the URI, and URIs are used for identification purposes, 2) the query string length is highly constrained, so would prevent creating large objects, and 3) it would be a diagnostics/debugging nightmare parsing the query string in your head trying to make sense of it.
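For what it's worth, handling the body on the server side can be quite small. This is only a sketch: the default values and the create_media name are invented for the example, and a real handler would also validate types.

```python
import uuid

# Hypothetical defaults applied when the client omits a field.
DEFAULTS = {"name": "Untitled", "type": "audio", "private": True}


def create_media(body: dict) -> dict:
    """Build a new Media record from a POST body, filling in defaults for omitted fields."""
    unknown = set(body) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    # Later entries win, so client-supplied values override the defaults.
    return {"id": str(uuid.uuid4()), **DEFAULTS, **body}


media = create_media({"name": "Hopkins County Collective", "private": False})
print(media)
```

An empty body still works (everything is defaulted), which matches the behaviour described in the question.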

Efficiency of RESTful APIs

I'm currently creating an application (let's say, notes app, for instance - webapplication + mobile app). I wanted to use RESTful API here, so I read a lot about this topic and I found out there's a lot of ambiguity over there.
So let's start at the beginning. And the beginning in REST is that we have to first request the / (root), then it returns a list of paths we can further retrieve, etc, etc. Isn't this the first part where REST is completely wasteful? Instead of rigid paths, we have to obtain them each time we want to do something. Nah.
The second problem I encountered was bulk operations. How to implement them in REST? Let's say the user didn't have access to the internet for a while, made a few changes, and they all have to be made on the server as well. So, let's say the user modified 50 notes, added 30 and removed 20. We have to make 100 separate requests now. A way to make bulk operations would be very helpful - I saw this stackoverflow topic: Patterns for handling batch operations in REST web services? but I didn't find anything interesting there actually. Everything is okay as long as you want to do one type of operation on one type of resource.
Last, but not least - retrieving whole collection of items. When writing an example app I mentioned - notes app - you probably want to retrieve all collection of items (notes, tags, available notes colors, etc...) at once. With REST, you have to first retrieve list of item links, then fetch the items one by one. 100 notes = over 100 requests.
Since I'm currently learning all this REST stuff, I may be completely wrong at what I said here. Anyway, the more I read about it, the more gruesome it looks like for me. So my question finally is: where am I wrong and/or how to solve problems I mentioned?

It's all about resources. Resources that are obtained through a uniform interface (usually via URI and HTTP methods).
You do not have to navigate through root every time. A good interface keeps its URIs alive forever (if they go stale, they should return HTTP Moved or similar). Root offering pathways to navigate is a part of HATEOAS, one of Roy Fielding's defined architectural elements of REST.
Bulk operations are a thing the architectural style is not strong on. Basically nothing is stopping you from POSTing a payload containing multiple items to a specific resource. Again, it's all up to what resource you are using/offering and ultimately, how your server implementation handles requests. Your case of 100 requests: I would probably stick with 100 requests. It is clean to write and the overhead is not that huge.
Retrieving a collection: it's about which resources the API decides to offer. GET bookCollection/ vs GET book/1, GET book/2 ... or even GET book/all. Or maybe GET book/?includeDetails=true to return all books with the same detail as GETting them one-by-one.
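If you did want a single POST carrying multiple items, the server side could look roughly like this. The payload shape (a list of operations with "action", "id" and "note" keys) and the apply_batch name are assumptions made up for this sketch, not a standard.

```python
def apply_batch(notes: dict, operations: list) -> dict:
    """Apply a list of create/update/delete operations to a copy of the notes store."""
    result = dict(notes)
    for op in operations:
        action, note_id = op["action"], op["id"]
        if action in ("create", "update"):
            result[note_id] = op["note"]
        elif action == "delete":
            result.pop(note_id, None)
        else:
            raise ValueError(f"unknown action: {action}")
    return result


notes = {1: "buy milk", 2: "call mum"}
batch = [
    {"action": "update", "id": 1, "note": "buy oat milk"},
    {"action": "create", "id": 3, "note": "book dentist"},
    {"action": "delete", "id": 2},
]
synced = apply_batch(notes, batch)
print(synced)
```

The trade-off is that the response now has to report per-item success or failure, which individual requests give you for free.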

I think that this link could give you interesting hints to design a RESTful service: https://templth.wordpress.com/2014/12/15/designing-a-web-api/.
That said, here are my answers to your questions:
There is no need to implement a resource for the root path. With this, I think that you referred to HATEOAS. In addition, links within the payload are not required either. Otherwise, you can use available formats like Swagger or RAML to document your RESTful service. This shows your end users what is available.
Regarding bulk operations, you are free to use methods POST or PATCH to implement this on the list resource. I think that these two answers could be helpful to you:
REST API - Bulk Create or Update in single request
How to Update a REST Resource Collection
In fact, you are free to choose the content you return from your GET methods. This means the root element managed by the resources (list resource and element resource) can contain its own hints and also those of its dependencies. So you can have several levels in the returned content. For example, you can have something like this for an element Book that references a list of Author:
GET /books
{
  "title": "the title",
  (...)
  "authors": [
    {
      "firstName": "first name",
      "lastName": "last name"
    }, {
      (...)
    },
    (...)
  ]
}
Note that you can leverage query parameters to ask the RESTful service to return the expected level. For example, if you want only book hints, or book hints with the corresponding authors:
GET /books
{
  "title": "the title",
  (...)
}
GET /books?include=authors
{
  "title": "the title",
  (...)
  "authors": [
    {
      "firstName": "first name",
      "lastName": "last name"
    }, {
      (...)
    },
    (...)
  ]
}
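On the server, the include parameter from the two requests above could be handled along these lines. This is a sketch only: render_book and the REFERENCED set are names invented here, and the comma-separated include syntax is an assumption.

```python
# Fields treated as referenced data (only returned on request) in this sketch.
REFERENCED = {"authors"}


def render_book(book: dict, include: str = "") -> dict:
    """Return the book's own fields plus any referenced data named in `include`."""
    requested = {name for name in include.split(",") if name}
    return {
        key: value
        for key, value in book.items()
        if key not in REFERENCED or key in requested
    }


book = {
    "title": "the title",
    "authors": [{"firstName": "first name", "lastName": "last name"}],
}
print(render_book(book))
print(render_book(book, include="authors"))
```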
Note that there are two distinct concepts here:
The inner data (complex types, inner objects): data that are specific to the element and are embedded in the element itself
The referenced data: data that reference and correspond to other elements. In this case, you can have a link or the data itself embedded in the root element.
The OData specification addresses such issue with its feature "navigation links" and its query parameter expand. See the following links for more details:
http://www.odata.org/getting-started/basic-tutorial/ - Section "System Query Option $expand"
https://msdn.microsoft.com/en-us/library/ff478141.aspx - Section "Query and navigation"
Hope it helps you,
Thierry

Resolving an API's own hypermedia links

Let's say we have a RESTful API method:
POST /people
{
"name" : "John",
"_links" : {
"address" : {
"href" : "/addresses/2"
}
}
}
You can see that address has a link to another resource.
To resolve the address_id of that resource, should the server:
1. Break up the URL and identify the "id" part of the route
2. Or even make a curl request to itself, in order to get the address_id of that linked resource?

I feel, as you probably do, that #2 is the way it should be done, and #1 is just a hack based on internal knowledge. What if the client sent an href that was an absolute URI, such as http://yoursite.com/addresses/2? Would you still want to use #1, and try to detect the ID?
Let's presume your site sends address resource representations with a defined and documented MIME type. What if the client sent an href value pointing to a third-party URI, which also returned responses of that format, and you wanted to support that? You'd have to implement #2 anyway. In that instance there's little reason not to use it for your own (performance being the main one).
To be honest, what I would probably do is go with #1 until the need for #2 arose.
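If you do start with #1, a small sketch in Python could look like this. The function name, the /addresses/<id> route shape and the OWN_HOSTS value are assumptions for illustration; the host check is there so that a third-party URI is rejected rather than silently misread.

```python
from urllib.parse import urlsplit

# Hosts this API answers on; "" covers relative hrefs such as "/addresses/2".
OWN_HOSTS = ("", "yoursite.com")


def address_id_from_href(href: str) -> int:
    """Extract the address id from a relative or absolute href pointing at this API."""
    parts = urlsplit(href)
    if parts.netloc not in OWN_HOSTS:
        raise ValueError(f"not one of our own URIs: {href}")
    segments = parts.path.strip("/").split("/")
    if len(segments) != 2 or segments[0] != "addresses" or not segments[1].isdecimal():
        raise ValueError(f"not an address URI: {href}")
    return int(segments[1])


print(address_id_from_href("/addresses/2"))
print(address_id_from_href("http://yoursite.com/addresses/2"))
```

The point where this raises for a foreign host is where #2 (actually dereferencing the link) would have to take over.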

Confused about Http verbs

I get confused about when and why you should use specific verbs in REST.
I know basic things like:
Get -> for retrieval
Post -> adding new entity
PUT -> updating
Delete -> for deleting
These attributes are to be used as per the operation I wrote above but I don't understand why?
What will happen if inside a GET method in REST I add a new entity, or inside POST I update an entity? Or maybe inside DELETE I add an entity? I know this may be a noob question but I need to understand it. It sounds very confusing to me.

@archil has an excellent explanation of the pitfalls of misusing the verbs, but I would point out that the rules are not quite as rigid as what you've described (at least as far as the protocol is concerned).
GET MUST be safe. That means that a GET request must not change the server state in any substantial way. (The server could do some extra work like logging the request, but will not update any data.)
PUT and DELETE MUST be idempotent. That means that multiple calls to the same URI will have the same effect as one call. So for example, if you want to change a person's name from "Jon" to "Jack" and you do it with a PUT request, that's OK because you could do it one time or 100 times and the person's name would still have been updated to "Jack".
POST makes no guarantees about safety or idempotency. That means you can technically do whatever you want with a POST request. However, you will lose any advantage that clients can take of those assumptions. For example, you could use POST to do a search, which is semantically more of a GET request. There won't be any problems, but browsers (or proxies or other agents) would never cache the results of that search because it can't assume that nothing changed as a result of the request. Further, web crawlers would never perform a POST request because it could not assume the operation was safe.
The entire HTML version of the world wide web gets along pretty well without PUT or DELETE and it's perfectly fine to do deletes or updates with POST, but if you can support PUT and DELETE for updates and deletes (and other idempotent operations) it's just a little better because agents can assume that the operation is idempotent.
See the official W3C documentation for the real nitty gritty on safety and idempotency.
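To make the difference concrete, here is a small Python sketch using an in-memory dict as a stand-in for the server's data. The function names are made up; the table just restates the properties described above.

```python
# Properties described above for each method.
METHOD_PROPERTIES = {
    "GET": {"safe": True, "idempotent": True},
    "PUT": {"safe": False, "idempotent": True},
    "DELETE": {"safe": False, "idempotent": True},
    "POST": {"safe": False, "idempotent": False},
}


def put_name(people: dict, person_id: int, name: str) -> None:
    """PUT-style update: sets the full state, so repeating it changes nothing further."""
    people[person_id] = {"name": name}


def post_person(people: dict, name: str) -> int:
    """POST-style create: every call adds another entry, so repeats are not harmless."""
    new_id = max(people, default=0) + 1
    people[new_id] = {"name": name}
    return new_id


people = {1: {"name": "Jon"}}
for _ in range(100):
    put_name(people, 1, "Jack")
after_put = dict(people)  # still one person, now named "Jack"

post_person(people, "Jill")
post_person(people, "Jill")  # the repeat creates a second "Jill"
print(after_put, len(people))
```

This is why a client or proxy can safely retry a PUT after a timeout, but cannot do the same with a POST without risking duplicates.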

Protocol is protocol. It is meant to define every rule related to it. HTTP is a protocol too. All of the above rules (including the HTTP verb rules) are defined by the HTTP protocol, and so is their usage. If you do not follow these rules, only you will understand what happens inside your service. It will not follow the rules of the protocol and will be confusing for other users. There was an example, one time, about a famous photo site (it does not matter which) that deleted pictures with GET requests. A user of that site installed the Google desktop search program, which archives pages locally. As that program knew that GET operations are only used to get data and should not affect anything, it made GET requests to every available URL (including those GET-delete URLs). As the user was logged in and the cookie was in the browser, there were no authorization problems. The result: all of the user's photos were deleted on the server, because of incorrect usage of the HTTP protocol and the GET verb. That's why you should always follow the rules of the protocol you are using. Although it is technically possible, it is not right to override the defined rules.

Using GET to delete a resource would be like having a function named and documented to add something to an array that deletes something from the array under the hood. REST has only a few well-defined methods (the HTTP verbs). Users of your service will expect that your service sticks to these definitions; otherwise it's not a RESTful web service.

If you do so, you cannot claim that your interface is RESTful. The REST principle mandates that the specified verbs perform the actions that you have mentioned. If they don't, then it can't be called a RESTful interface.
