RESTfully handling sub-resources - api

I've been creating a RESTful application and am undecided over how I should handle requests that don't return all entities of a resource or return multiple resources (a GET /resource/all request). Please allow me a few moments to setup the situation (I'll try to generalize this as much as possible so it can apply to others besides me):
Let's say I'm creating a product API. For simplicity, let's say it returns JSON (after the proper accept headers are sent). Products can be accessed at /product/[id]. Products have reviews which can be accessed at /products/[id]/review/[id].
My first question lies in this sub-resource pattern. Since you may not always want the reviews when you GET a product, they are accessible by another URI. From what I read I should include the URI of the request that will return all review URI's for a product in the response for a product request. How should I go about this so that it abides to RESTful standards? Should it be a header like Reviews-URI: /product/123/review/all or should I include the URL in the response body like so:
{ 'name': 'Shamwow',
'price': '$14.99',
'reviews': '/product/123/review/all'
}
My second question is about how the /product/[id]/review/all request should function. I've heard that I should just send the URL's of all of the reviews and make the user GET each of them instead of packaging all of them into one request. How should I indicate this array of review URIs according to RESTful standards? Should I use a header or list the URIs in the response body like so:
{ 'reviews': [ '/product/123/review/1',
'/product/123/review/2',
'/product/123/review/3'
]
}

Your problem is you're not using Hypermedia. Hypermedia specifically has elements that hold links to other things.
You should consider HAL, as this is a Hypermedia content type that happens to also be in JSON.
Then you can leverage the links within HAL to provide references to your reviews.

As to your first question (header or body), definitely do not invent your own custom header. Some here will argue that you should use the Link header, but I think you'll find plenty of need for nested links and should keep them in the body.
How you indicate either the URI to the reviews/ resource, or the list of URI's within that, is entirely up to the media type you select to represent each resource. If you're using HTML, for example, you can use an anchor tag. If you're using plain JSON, which has no hypermedia syntax, you'll have to spend some time in the documentation for your API describing which values are URI's, either by nominating them with special keys, or wrapping them in special syntax like {"link": "reviews/123"}, or with a related schema document.
Take a look at Shoji, a JSON-based media type which was designed explicitly for this pattern of subresources.

The JSON Schema standard might help you here, in particular Hyper-Schemas.
It lets you define how to extract link URIs from your data, and what their "rel"s are - essentially turning your JSON data into hyper-media. So for your first bit of data, you might write a schema like:
{
"title": "Product",
"type": "object",
"properties": {...},
"links": [
{"rel": "reviews", "href": "{reviews}"}
]
}
The value of href is a URI Template - so for example, if your data included productId, then you could replace the value of href with "/product/{productId}/review/all".
For the second bit of example data (the list of reviews) you might have a schema like this:
{
"type": "object",
"properties": {
"reviews": {
"type": "array",
"items": {
"links": [
{"rel": "full", "href": "{$}"}
]
}
}
}
}
In the URI Template of href, the special value of {$} means "the value of the JSON node itself". So that Hyper-Schema specifies that each item in the reviews array should be replaced with the data at the specified URL (rel="full").

Related

Is having two dependent resources not compliance with the RESTFul approach?

Context
In our project, we need to represent resources defined by the users. That is, every user can have different resources, with different fields, different validations, etc. So we have two different things to represent in our API:
Resource definition: this is just a really similar thing to a json schema, it contains the fields definitions of the resource and its limitations (like min and max value for numeric fields). For instance, this could be the resource definition for a Person:
{
"$id": "https://example.com/person.schema.json",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Person",
"type": "object",
"properties": {
"firstName": {
"type": "string",
"description": "The person's first name."
},
"lastName": {
"type": "string",
"description": "The person's last name."
},
"age": {
"description": "Age in years which must be equal to or greater than zero.",
"type": "integer",
"minimum": 0
}
}
}
Resource instance: this is just an instance of the specified resource. For instance, for the Person resource definition, we can have the following instances:
[
{
"firstName": "Elena",
"lastName": "Gomez",
},
{
"firstName": "Elena2",
"lastName": "Gomez2",
},
]
First opinion
So, it seems this kind of presents some conflicts with the Restful API approach. In particular, I think it has some problems with the Uniform Interface. When you get a resource, you should be able to handle the resource without any additional information. With this design, you need to make an additional request to first get the resource definition. Let's see this with an example:
Suppose you are our web client. And you are logged in as an user with the Person resource. To show a person in the UI, you first need to know the structure of the Person resource, that is, you to do the following request: GET /resource_definitions/person. And then, you need to request the person object: GET /resource/person/123.
Second opinion
Others seem to think that this is not a problem and that the design is still RESTful. Every time you ask for something to an API, you need to know the format previously, is not self-documented in the API, so it makes sense for this endpoint to behave the same as the others.
Question
So what do you think? Is the proposed solution compliance with the RESTful approach to API design?
The simple solution is to add a link:
{
"_links": {
"describedby": {
"href": "https://example.com/person.schema.json",
"type": "application/schema+json"
}
},
"firstName": "Elena",
"lastName": "Gomez"
}
You could also put this in a header. This is semantically equivalent:
Link: <https://example.com/person.schema.json>; rel="describedby" type="application/schema+json"
It does not violate the uniform interface if there is no standard for this kind of stuff, but there is. RDF e.g. JSON-LD and schema.org vocab can handle most of these types. Even for REST there is an RDF vocab called Hydra, though the community is not that active nowadays.
As of the actual problem, I would look around, maybe RDF technologies or graph technologies are better for it, though I am not sure how much connection there is in your graph. If it is just a few types and instances, then I would probably stick to REST.
Ohh I see meanwhile, you used an actual JSON schema. Then that part is certainly uniform interface compatible. As of the instances you need to add something like type: "https://example.com/person.schema.json" and you are ok. Maybe a vendor specific JSON derived MIME type which describes what "type" means in this context if you want to be super precise or just use JSON-LD instead. https://www.w3.org/2019/wot/json-schema Or an alternative more common solution is using RDFS and XSD with JSON-LD instead of JSON Schema.

Can I compose two JSON Schemas in a third one?

I want to describe the JSON my API will return using JSON Schema, referencing the schemas in my OpenAPI configuration file.
I will need to have a different schema for each API method. Let’s say I support GET /people and GET /people/{id}. I know how to define the schema of a "person" once and reference it in both /people and /people/{id} using $ref.
[EDIT: See a (hopefully) clearer example at the end of the post]
What I don’t get is how to define and reuse the structure of my response, that is:
{
"success": true,
"result" : [results]
}
or
{
"success": false,
"message": [string]
}
Using anyOf (both for the success/error format check, and for the results, referencing various schemas (people-multi.json, people-single.json), I can define a "root schema" api-response.json, and I can check the general validity of the JSON response, but it doesn’t allow me to check that the /people call returns an array of people and not a single person, for instance.
How can I define an api-method-people.json that would include the general structure of the response (from an external schema of course, to keep it DRY) and inject another schema in result?
EDIT: A more concrete example (hopefully presented in a clearer way)
I have two JSON schemas describing the response format of my two API methods: method-1.json and method-2.json.
I could define them like this (not a schema here, I’m too lazy):
method-1.json :
{
success: (boolean),
result: { id: (integer), name: (string) }
}
method-2.json :
{
success: (boolean),
result: [ (integer), (integer), ... ]
}
But I don’t want to repeat the structure (first level of the JSON), so I want to extract it in a response-base.json that would be somehow (?) referenced in both method-1.json and method-2.json, instead of defining the success and result properties for every method.
In short, I guess I want some kind of composition or inheritance, as opposed to inclusion (permitted by $ref).
So JSON Schema doesn’t allow this kind of composition, at least in a simple way or before draft 2019-09 (thanks #Relequestual!).
However, I managed to make it work in my case. I first separated the two main cases ("result" vs. "error") in two base schemas api-result.json and api-error.json. (If I want to return an error, I just point to the api-error.json schema.)
In the case of a proper API result, I define a schema for a given operation using allOf and $ref to extend the base result schema, and then redefine the result property:
{
"$schema: "…",
"$id": "…/api-result-get-people.json",
"allOf": [{ "$ref": "api-result.json" }],
"properties": {
"result": {
…
}
}
}
(Edit: I was previously using just $ref at the top level, but it doesn’t seem to work)
This way I can point to this api-result-get-people.json, and check the general structure (success key with a value of true, and a required result key) as well as the specific form of the result for this "get people" API method.

Are resources state aware or static under hateoas/restful api

My question is about if resources should be aware of the state or statically defined. For example, I have an API that returns account information where the resource uri would be /api/accounts/2.
If I'm authenticated as user henk willemsa the resource would look like this:
{
"id": 2,
"firstname": "henk",
"lastname": "willemsa",
"birthday": "12-31-1980",
"email": "firstname.lastname#email.com",
"other": "other useless info",
"super-secret-info": "some super secret info"
}
Is it good practice to return the resource with stripped out data if you would be authenticated as another user? For instance, making a request to the same endpoint /api/accounts/2, but for a different user, jan smit, the returned response would be:
{
"id": 2,
"firstname": "henk",
"lastname": "willemsa"
"other": "other useless info"
}
The idea is that user jan smit is only allowed to see the public data, where henk willemsma sees the secret as well.
Would it be better for something like this be solved with 2 endpoints, where /api/accounts/2 would return a 403 for user jan smit and 200 for henk willemsa and another api endpoint /api/public-account/2 would return 200 for the both users? The later could give a response like:
{
"id": 2,
"firstname": "henk",
"lastname": "willemsa"
"other": "other useless info"
}
Having one endpoint and stripping out data would in my eyes be inconsistent, because the structure of the data-type/resource would change depending on who requests it and not because extra explicit data is sent, which changes the data-type/resource (like filter options).
But I can also see that splitting this out over multiple endpoints could cause for having lots and lots of different endpoints which basically do the same returning account information.
I also found this question, which somewhat describes what I'm looking for but is about collection calls. In my opinion, these are allowed to return different unique resource, but the data-types should always be the same. In my example, /api/accounts/ would always return a list of accounts, but depending on which user makes the request to the endpoint, while the size of the list could be different, it would always be a list of accounts.
What is the best approach?
The "best" approach can probably not be objectively defined. However, creating multiple resources for the same "thing" is probably not a good idea. Things should be identifiable by URI, so accounts should have a stable URI.
I would probably just omit the fields that the user can not see, if that is possible according to the data definitions/structure. If not, you could serve multiple 'representations', i.e. media-types, and let content-negotation handle the exchange. That means you create 2 media-types, one with the full data and one for the restricted view of the account, and serve both for the same resource URI. The server then can decide which representation you get based on your credentials. The client would also be able to easily see which representation it got, and inform the user if necessary that it has a restricted view of the account.
The client would have to ask with an 'Accept' header similar to this:
Accept: application/vnd.company.account-full; q=1.0, application/vnd.company.account-restricted; q=0.9,

How to construct intersection in REST Hypermedia API?

This question is language independent. Let's not worry about frameworks or implementation, let's just say everything can be implemented and let's look at REST API in an abstract way. In other words: I'm building a framework right now and I didn't see any solution to this problem anywhere.
Question
How one can construct REST URL endpoint for intersection of two independent REST paths which return collections? Short example: How to intersect /users/1/comments and /companies/6/comments?
Constraint
All endpoints should return single data model entity or collection of entities.
Imho this is a very reasonable constraint and all examples of Hypermedia APIs look like this, even in draft-kelly-json-hal-07.
If you think this is an invalid constraint or you know a better way please let me know.
Example
So let's say we have an application which has three data types: products, categories and companies. Each company can add some products to their profile page. While adding the product they must attach a category to the product. For example we can access this kind of data like this:
GET /categories will return collection of all categories
GET /categories/9 will return category of id 9
GET /categories/9/products will return all products inside category of id 9
GET /companies/7/products will return all products added to profile page of company of id 7
I've omitted _links hypermedia part on purpose because it is straightforward, for example / gives _links to /categories and /companies etc. We just need to remember that by using hypermedia we are traversing relations graph.
How to write URL that will return: all products that are from company(7) and are of category(9)? In otherwords how to intersect /categories/9/products and /companies/7/products?
Assuming that all endpoints should represent data model resource or collection of them I believe this is a fundamental problem of REST Hypermedia API, because in traversing hypermedia api we are traversing relational graph going down one path so it is impossible to describe such intersection because it is a cross-section of two independent graph paths.
In other words I think we cannot represent two independent paths with only one path. Normally we traverse one path like A->B->C, but if we have X->Y and Z->Y and we want all Ys that come from X and Z then we have a problem.
So far my proposition is to use query strings: /categories/9/products?intersect=/companies/9 but can we do better?
Why do I want this?
Because I'm building a framework which will auto-generate REST Hypermedia API based on SQL database relations. You could think of it as a trans compiler of URLs to SELECT ... JOIN ... WHERE queries, but the client of the API only sees Hypermedia and the client would like to have a nice way of doing intersections, like in the example.
I don't think you should always look at REST as database representation, this case looks more of a kind of specific functionality to me. I think I'd go with something like this:
/intersection/comments?company=9&product=5
I've been digging after I wrote it and this is what I've found (http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api):
Sometimes you really have no way to map the action to a sensible RESTful structure. For example, a multi-resource search doesn't really make sense to be applied to a specific resource's endpoint. In this case, /search would make the most sense even though it isn't a resource. This is OK - just do what's right from the perspective of the API consumer and make sure it's documented clearly to avoid confusion.
What You want to do is to filter products in one of the categories ... so following Your example if we have:
GET /categories/9/products
Above will return all products in category 9, so to filter out products for company 7 I would use something like this
GET /categories/9/products?company=7
You should treat URI as link to fetch all data (just like simple select query in SQL) and query parameters as where, limit, desc etc.
Using this approach You can build complex and readable queries fe.
GET /categories/9/products?company=7&order=name,asc&offset=10&limit=20
All endpoints should return single data model entity or collection of
entities.
This is NOT a REST constraint. If you want to read about REST constraints, then read the Fielding dissertation.
Because I'm building a framework which will auto-generate REST
Hypermedia API based on SQL database relations.
This is a wrong approach and has nothing to do with REST.
By REST you describe possible resource state transitions (or operation call templates) by sending hyperlinks in the response. These hyperlinks consist of a HTTP methods and URIs (and other data which is not relevant now) if you build the uniform interface using the HTTP and URI standards, and we usually do so. The URIs are not (necessarily) database entity and collection identifiers and if you apply such a constraint you will end up with a CRUD API, not with a REST API.
If you cannot describe an operation with the combination of HTTP methods and already existing resources, then you need a new resource.
In your case you want to aggregate the GET /users/1/comments and GET /companies/6/comments responses, so you need to define a link with GET and a third resource:
GET /comments/?users=1&companies=6
GET /intersection/users:1/companies:6/comments
GET /intersection/users/1/companies/6/comments
etc...
RESTful architecture is about returning resources that contain hypermedia controls that offer state transitions. What i see here is a multistep process of state transitions. Let's assume you have a root resource and somehow navigate over to /categories/9/products using the available hypermedia controls. I'd bet the results would look something like this in hal:
{
_links : {
self : { href : "/categories/9/products"}
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
If you want your client to be able to intersect this with another collection you need to provide to them the mechanism to perform this. You have to give them a hypermedia control. HAL only has links, templated links, and embedded as control types. let's go with links..change the response to:
{
_links : {
self : { href : "/categories/9/products"},
x:intersect-with : [
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 1",
title : "Company 6 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 2",
title : "Company 5 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 3",
title : "Company 7 products"
}
]
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
Now the client just picks the right hypermedia control (aka link) based on the title field of the link.
That's the simplest solution. But you'll probably say there's 1000's of companies i don't want 1000's of links...well ok if that;s REALLY the case...you just offer a state transition in the middle of the two we have:
{
_links : {
self : { href : "/categories/9/products"},
x:intersect-options : { href : "URL to a Paged collection of all intersect options"},
x:intersect-with : [
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 1",
title : "Company 6 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 2",
title : "Company 5 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 3",
title : "Company 7 products"
}
]
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
See what i did there? an extra control for an extra state transition. JUST LIKE YOU WOULD DO IF YOU HAD A WEBPAGE. You'd probably put it in a pop up, well that's what the client of your app can do too with the result of that control.
It's really that simple...just think how you'd do it in HTML and do the same.
The big benefit here is that the client NEVER EVER needed to know a company or category id or ever plug that in to some template. The id's are implementation details, the client never knows they exist, they just executed Hypermedia controls..and that is RESTful.

RESTful web service: create new resource by combining other resources: provide IDs or URIs?

When obtaining a collection of items from a RESTful web service (via GET), the representation of each single item (e.g. in JSON) usually contains the item's resource identifier. This can either be the ID of the resource or the entire URI which usually contains the ID.
This identifier (ID or URI) is required in case the client needs to further interact with the remote resource representing the single item. Many people seem to consider it good practice to provide the entire URI and not only the ID, so that the client has nothing to do with URI construction (for example, this is what Miguel Grinberg writes in this article).
But what should be done in case multiple items are to be combined in order to create a new resource? Then the client needs to tell the server which items are to be combined. And eventually, the server requires a list of IDs for processing the request. Assuming that the client retrieved URIs for each item in the first place -- where would you perform the URI parsing in order to extract the raw IDs again: in the client or in the server?
Example: the client retrieved a collection of pages in a GET request. Each page item identifies itself with an URI (containing the ID):
{
"pages": [
{
"content": "bla bla",
"uri": "/pages/1"
},
{
"content": "that is no interesting content",
"uri": "/pages/2"
},
...
]
}
Now assume that the client instructs the server to create a new resource combining multiple pages: a book, built by pages 1 and 2. The POST request body can either contain IDs or URIs:
{
"title": "A Boring Book",
"pages": [1, 2]
}
or
{
"title": "A Boring Book",
"pages": ["/pages/1", "/pages/2"]
}
In the first case, the clients needs to know the structure of the URI and extract the ID before sending the request. In the second case the server needs to extract the ID from the URI.
On the one hand, I like the idea of resources being represented on the client side by URIs only. On the other hand, I also like to keep things simple and pragmatic and why should we send entire URIs to the server when the context is clear and only the IDs are needed (the book creation does not directly act on page resources)?
What would you prefer and why? Or do you think that this is really not too important?
Do you think the following approach would be a good compromise? Client-side extraction of the ID from the URI by parsing the URI from right to left and extracting the number after the rightmost slash, i.e. assuming a certain URI structure without the need to hardcode the entire path.
I think that clients should receive absolute URLs from the server and only use these without any kind of modification. Therefore, I would even go one step further beyond your last example:
{
"title" : "A Boring Book",
"pages" : [ "http://.../pages/1", "http://.../pages/2" ]
}
Only the server should be responsible to extract Ids from URLs if necessary.