REST API Set of possible values: strings or integers?

I'm designing a REST API and have run across this issue:
How should a set of values be defined?
Say I have a Picture object that is going to be requested at http://myserver.com/api/getPicture/1
so the server responds:
{
"url": "http://myserver.com/pictures/1.jpg",
"taken_at": "1/1/2012"
}
Now, say I wanted to add a color_depth field.
Two possible choices to do this are:
color_depth : "BLACK_WHITE" or "COLOR" or "GRAYSCALE"
color_depth : "0" OR "1" OR "2" //would need to map these to their meaning somewhere
Is there a standard for what to do in this situation?

For JSON, there isn't any de facto or official standard. JSON Schema tries to solve this, but the specs aren't recommended yet and even implementations aren't popular.
Using XML, XML Schema is the standard solution. For RDF, there is also RDFS that solves this problem.
For every format, the choice is yours. Depending on integer identifiers (1, 2, 3) and translating them without a schema means that your requests are far less self-contained than with strings that express what they mean, like "COLOR". It is a core concept of RESTful API design that requests should be self-contained. This loosely relates to the visibility property of RESTful architectures described in Roy Fielding's dissertation.
I would go for full strings.
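To make the string option concrete, here is a minimal Python sketch. The ColorDepth enum and both helper functions are made up for illustration; only the field names and the three values come from the question.

```python
import json
from enum import Enum


class ColorDepth(str, Enum):
    # Hypothetical enum: the member values are the strings sent over the wire.
    BLACK_WHITE = "BLACK_WHITE"
    GRAYSCALE = "GRAYSCALE"
    COLOR = "COLOR"


def picture_to_json(url, taken_at, depth):
    """Serialize a picture; the enum is written as its readable string value."""
    return json.dumps({"url": url, "taken_at": taken_at, "color_depth": depth.value})


def parse_color_depth(raw):
    """Look up the enum member by its string value; unknown strings raise ValueError."""
    return ColorDepth(raw)
```

A client reading "color_depth": "COLOR" needs no lookup table, whereas "2" would only make sense with extra documentation or a schema.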

Related

API modelling: Two resources with the same name

I have an API modelling question I need help with.
Let's assume I have a Product representation that consists of:
Attribute | Type
id | string
names | MultilingualName List
price | decimal
MultilingualName is represented by:
Attribute | Type
locale | string
name | string
This is to support multiple languages, so that a store owner user is able to create products (and maintain/CRUD) within a product shelf that supports multiple languages.
Our POST API (/products) may look like this:
{
"id": "123abcxyz",
"names": [
{
"locale": "en-CA",
"name": "Pencil"
},
{
"locale": "fr-CA",
"name": "Crayon"
}
],
"price": 1.99
}
The problem I am having is with the consumer-facing GET endpoint for Product.
I would like to model the resource without having the MultilingualName language complexity. So for example, the GET endpoint would look like this:
Attribute | Type
id | string
name | string
price | decimal
The end point would return the product name based on the user's preferred language, which is already known.
But clearly I have a problem. I have two Products resources now: CRUD operations (GET, POST, PUT, DELETE) geared to the store owner that maintains their product shelf (with the language complexity), and one for the consumer user (GET only, without the language complexity).
How should I approach this model? Is this a naming problem or am I approaching the modelling incorrectly?
Thanks for your help.
Edit: adding a UML diagram as suggested by #qwerty_so.

Remark: answer provided when question had an UML tag (last part still valid though)
Multiple problems in your diagram:
An actor is a classifier that is external to the system. A class can therefore not be associated with an actor.
A «use» dependency should be noted with a dashed line arrow, and not a plain line arrow.
It is not clear what the links between your «Representation» classes and your «DB table» represent: is it again a dependency? Is it a navigable association?
You have multiple definitions of a name without any possibility to disambiguate.
How to do it in UML?
In UML, having two classes with the same name in the same namespace is not valid, since it would not be clear which class is meant.
Nevertheless, you could adjust your model and separate the different families of classes, enclosing them in distinct packages, e.g. Database, CRUD, Consumer.
A package defines a namespace. In consequence, all references to a Product in the CRUD package would refer to the relevant CRUD class.
For relations between classes of different packages, you may disambiguate with a qualified name (i.e. including the package names), or, in case there would be no conflict, by importing a package into another.
Is it the best way to design your API?
Managing conflicting names in endpoints is not ideal. It's an easy source of confusion. You can of course find technical solutions to address this, but still, why bother?
Why not leave the client side tailoring/simplifying data as needed (e.g. using the user's language by default, unless something else is required)?
Why not let the endpoint query parameters provide an optional language code to filter the set of returned languages?
Or why not just use different names to call different things differently? I was thinking about ProductLocalized or ProductShort.

Is this a naming problem or am I approaching the modelling incorrectly?
It sounds to me like a naming problem.
It's perfectly reasonable to have more than one resource whose representations are derived from the same underlying data.
But each resource does need to have an identifier of its own. The machines don't particularly care what the identifier is, so you could use...
/b6d5cc6a-e4b3-4bc9-90d0-723f1d8ee22a
/5952d730-d447-4537-9bf0-74cdc2f9f79a
But human beings generally cope better with identifiers that are human readable.
Any spelling consistent with your local conventions is fine.
One possibility would be to just encode the name of the audience into the identifier itself
/storeOwner/products/123
/shopper/products/123
or you might instead choose names that are more closely aligned with the business capability
/inventory/products/123
/sales/products/123
You've got the freedom to choose which human beings you want to optimize for (domain experts? operators reading access logs? tech writers documenting the api? remote developers consuming it? and so on).
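As a rough sketch of "more than one resource derived from the same underlying data", the two functions below build the store-owner and shopper representations from one stored product. The function names, the in-memory PRODUCT record and the en-CA fallback are assumptions for illustration, not part of the question.

```python
# Hypothetical stored record, shaped like the POST body in the question.
PRODUCT = {
    "id": "123abcxyz",
    "names": [
        {"locale": "en-CA", "name": "Pencil"},
        {"locale": "fr-CA", "name": "Crayon"},
    ],
    "price": 1.99,
}


def inventory_view(product):
    """Representation for /inventory/products/{id}: all multilingual names."""
    return {"id": product["id"], "names": list(product["names"]), "price": product["price"]}


def sales_view(product, locale, fallback="en-CA"):
    """Representation for /sales/products/{id}: one name picked by locale.

    Falling back to en-CA for unknown locales is an assumption of this sketch.
    """
    by_locale = {entry["locale"]: entry["name"] for entry in product["names"]}
    name = by_locale.get(locale, by_locale.get(fallback))
    return {"id": product["id"], "name": name, "price": product["price"]}
```

Both views read the same record; only the identifier and the shape of the representation differ.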

REST API filter operator best practice

I am building a REST API that uses a filter parameter to control search results. E.g., one could search for a user by calling:
GET /users/?filter=name%3Dfoo
Now, my API should allow many different filter operators: numeric operators such as equals, greater than and less than; string operators like contains, begins with or ends with; and date operators such as year of or timediff. Moreover, AND and OR combinations should be possible.
Basically, I want to support a subset of the underlying MySQL database operators.
I found a lot of different implementations (two good examples are Google Analytics and LongJump) that seem to use custom syntax.
Looking at my requirements, I would probably design a custom syntax pretty similar to the MySQL operator syntax.
However, I was wondering if there are any best practices established that I should follow and whether I should consider anything else. Thanks!

You need an already existing query language, don't try to reinvent the wheel! With REST this is a complicated and not fully solved issue. There are some REST constraints your application must fulfill:
uniform interface / hypermedia as the engine of application state:
You have to send hypermedia responses to your clients, and they have to follow the hyperlinks given in those responses, instead of building the requests on their own. So you can decouple the clients from the structure of the URI.
uniform interface / self-descriptive messages:
You have to send messages annotated with semantics. So you can decouple the clients from the data structure. The best solution to do this is RDF with, for example, open linked data vocabs. If you don't want to use RDF, then the second best solution is to use a vendor-specific MIME type, so your messages will be self-descriptive, but the clients need to know how to parse your custom MIME type.
To describe simple search links, you can use URI templates, for example GET /users/{?name} will expect a name parameter in the query string. You can use the hydra:IriTemplateMapping from the hydra vocab to add semantics to the parameters like name.
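For illustration, this is roughly what a client does with such a template. The sketch below expands only the form-style query expression ({?a,b}) for plain string values, which is a small subset of RFC 6570; the function name is made up, and a real client would more likely use an existing URI template library.

```python
import re
from urllib.parse import quote


def expand_query_template(template, variables):
    """Expand {?a,b} expressions; undefined variables are simply skipped."""

    def replace(match):
        names = match.group(1).split(",")
        pairs = [
            f"{name}={quote(str(variables[name]), safe='')}"
            for name in names
            if variables.get(name) is not None
        ]
        # An expression with no defined variables expands to nothing.
        return "?" + "&".join(pairs) if pairs else ""

    return re.sub(r"\{\?([^}]+)\}", replace, template)
```

The point is that the client only fills in variables named by the server, so the server can change the URI structure without breaking it.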
Describing ad-hoc queries is a hard task. You have to describe somehow what your query can contain.
You can choose a URI query language and stick with URI templates and probably hydra annotation. There are many already existing URI query languages, like HTSQL, OData query (ppl don't like that one), etc...
You can choose an existing query language and send it in a single URI param. This can be anything you want, for example SQL, SPARQL, etc... You have to teach your client to generate that param. You can create your own vocab to describe the constraints of the actual query. If you don't need complicated things, this should not be a problem. I don't know of already existing vocabs for describing query structures, but I never looked for them...
You can choose an existing query language and send it in the body of a SEARCH request. Afaik SEARCH is not cached or supported by recent HTTP clients. It was defined by WebDAV. You can describe your query with the proper MIME type, and you can use the same vocab as in the previous solution.
You can use an RDF query solution, for example a SPARQL endpoint, or triple pattern fragments, etc... So your queries will contain the semantic metadata, and not your link description. With SPARQL you don't necessarily need a triple data storage, you can translate the queries on the server side to SQL, or whatever you use. You can probably use SPIN to describe query constraints and query templates, but that is new for me too. There might be other solutions to describe SPARQL query structures...
So to summarize: if you want a real REST solution, you have to describe to your clients how they can construct the queries and what parameters and logical operators they can use. Without query descriptions they won't be able to generate, for example, an HTML form for the user. If you don't want a REST solution, then pick a query language, write a builder on the client, write a parser on the server, and that's all.

The Open Data Protocol (OData)
You can check BreezeJs too and see how this protocol is implemented for node.js + mongodb with the breeze-mongodb module, and for a .NET project using Web API and EntityFramework with the Breeze.ContextProvider dll.

By embracing a set of common, accepted delimiters, equality comparison can be implemented in a straightforward fashion. Setting the value of the filter query-string parameter to a string using those delimiters creates a list of name/value pairs which can be parsed easily on the server side and used to enhance database queries as needed. You can use the delimiters of your choice, say a vertical bar ("|") to separate individual filter phrases for OR, an ampersand ("&") to separate individual filter phrases for AND, and a double colon ("::") to separate the names and values.
This provides a unique-enough set of delimiters to support the majority of use cases and creates a user-readable query-string parameter. A simple example will serve to clarify the technique. Suppose we want to request users with the name "Todd" who live in "Denver" and have the title of "Grand Poobah". The request URI, complete with query string, might look like this:
GET http://www.example.com/users?filter="name::todd&city::denver&title::grand poobah"
Note that a literal "&" inside the filter value has to be percent-encoded as %26 in a real request, otherwise the server will read it as the start of a new query-string parameter.
The double-colon delimiter ("::") separates the property name from the comparison value, which lets the comparison value contain spaces and makes it easier to parse the delimiter from the value on the server.
Note that the property names in the name/value pairs match the names of the properties that would be returned by the service in the payload.
Case sensitivity is certainly up for debate on a case-by-case basis, but in general, filtering works best when case is ignored. You can also offer wildcards as needed using the asterisk ("*") as the value portion of the name/value pair.
For queries that require more than simple equality or wildcard comparisons, the introduction of operators is necessary. In this case, the operators themselves should be part of the value and parsed on the server side, rather than part of the property name. When complex query-language-style functionality is needed, consider introducing query concepts from the Open Data Protocol (OData) Filter System Query Option specification (http://www.odata.org/documentation/odata-version-4-0/).
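A minimal server-side sketch of the parsing step could look like this in Python. It assumes the "&" delimiters arrive percent-encoded (%26) inside the filter value and that "&" binds tighter than "|"; neither the function names nor that precedence come from the text above.

```python
from urllib.parse import parse_qs, urlsplit


def filter_from_url(url):
    """Return the decoded value of the 'filter' query parameter."""
    return parse_qs(urlsplit(url).query)["filter"][0]


def parse_filter(value):
    """Parse 'a::1&b::2|c::3' into OR-groups of AND-ed (name, value) pairs.

    Assumption: '&' binds tighter than '|'.
    """
    groups = []
    for alternative in value.split("|"):
        pairs = []
        for phrase in alternative.split("&"):
            name, sep, val = phrase.partition("::")
            if not sep or not name:
                raise ValueError(f"malformed filter phrase: {phrase!r}")
            pairs.append((name.strip(), val.strip()))
        groups.append(pairs)
    return groups
```

Each (name, value) pair can then be bound as a parameter of a prepared statement; the raw strings should never be concatenated into SQL.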

There seem to be a lot of standards (like OData), but many are quite complicated in that they introduce new syntax.
For simple multi-filtering, the following format avoids polluting the parameter namespace while still standing on top of existing web technology:
GET /users?filter[name]=John&filter[title]=Manager
It's easily readable and on the backend languages like PHP will receive it as an array of filters to apply.
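Languages without PHP's automatic array handling need a few lines to collect the bracketed keys. A possible Python sketch (the function name is made up, and only one level of brackets is handled):

```python
import re
from urllib.parse import parse_qsl

# Matches keys of the form filter[<field>] and captures <field>.
_FILTER_KEY = re.compile(r"^filter\[([^\]]+)\]$")


def extract_filters(query_string):
    """Collect filter[<field>]=<value> pairs into a dict; other params are ignored."""
    filters = {}
    for key, value in parse_qsl(query_string):
        match = _FILTER_KEY.match(key)
        if match:
            filters[match.group(1)] = value
    return filters
```

For the request above this yields {"name": "John", "title": "Manager"}, while unrelated parameters such as a page number stay outside the filter namespace.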

A possible standard would be SCIM, which is adopted by some commercial products. But it's not distinguished by brevity. For a pet project I used this:
= equal
! not equal
* like
< smaller
> greater
& bitwise and 
| bitwise or
^ bitwise xor
~ in comma separated value list
Examples
So GET /user?name=*An* would get all users whose name starts with An, and GET /user?name=~Anna,Bertha would get those two users.
Not yet a standard but who knows...
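A sketch of how the server might split such a value into operator and operand, in Python. Treating a value without a known prefix as plain equality is an assumption here, and the function name is made up.

```python
# Operator prefixes from the list above, mapped to readable names.
OPERATORS = {
    "=": "equal",
    "!": "not equal",
    "*": "like",
    "<": "smaller",
    ">": "greater",
    "&": "bitwise and",
    "|": "bitwise or",
    "^": "bitwise xor",
    "~": "in list",
}


def parse_condition(raw_value):
    """Split a decoded query value into (operator name, operand)."""
    if raw_value and raw_value[0] in OPERATORS:
        op, operand = OPERATORS[raw_value[0]], raw_value[1:]
    else:
        # Assumption: no recognised prefix means plain equality.
        op, operand = "equal", raw_value
    if op == "in list":
        return op, operand.split(",")
    return op, operand
```

With this reading, name=*An* is parsed as "like" with the pattern An*, which matches the "starts with An" example.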

How to design a REST API with LIKE criteria?

I'm designing a REST API and have an entity for "people":
GET http://localhost/api/people
Returns a list of all the people in the system
GET http://localhost/api/people/1
Returns the person with id 1.
GET http://localhost/api/people?forename=john&surname=smith
Returns all the people with matching forenames and surnames.
But I have a further requirement: what is the cleanest / best-practice way of allowing API consumers to retrieve all the people whose forename starts with "jo", for example?
I've seen some APIs do this like:
GET http://localhost/api/people?forename=jo~&surname=smith
where the tilde signifies a "fuzzy" match. On the other hand I've seen it implemented with a totally different criteria e.g.
GET http://localhost/api/people?forename-startswith=jo&surname=smith
which seems a bit cumbersome considering I might have -endswith, -contains, -soundslike (for some sort of soundex match).
Can anyone suggest from experience which works better and also any examples of well designed REST APIs that have similar functionality.

IMHO it does not matter whether you have fuzzy matches or -endswith, -contains, etc. What matters is whether your REST API permits easy parsing of such parameters, so that you can define functions to fetch data from your data source (DB, XML file, etc.) accordingly.
If you are using PHP...from my experience, SlimFramework is a great light weight, easy-to-get-started solution.
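To show what "easy parsing" can mean for the suffix style, here is a small sketch. It is in Python rather than PHP, and the function name and the fixed suffix list are assumptions taken from the question's examples.

```python
from urllib.parse import parse_qsl

# Suffixes mentioned in the question; anything else is treated as equality.
SUFFIXES = ("startswith", "endswith", "contains", "soundslike")


def parse_criteria(query_string):
    """Turn 'forename-startswith=jo' into ('forename', 'startswith', 'jo')."""
    criteria = []
    for key, value in parse_qsl(query_string):
        field, sep, suffix = key.rpartition("-")
        if sep and suffix in SUFFIXES:
            criteria.append((field, suffix, value))
        else:
            criteria.append((key, "equals", value))
    return criteria
```

Each resulting triple can then be dispatched to a function that builds the matching query against the data source.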

I would recommend the OData protocol, which provides query string options. What you did is ok and follows REST conventions.
But, the OData protocol describes a $expand parameter and even a $filter parameter. This $ prefix denotes "System Query Options" and you will be interested in the last one because it allows you to write the following URI:
http://services.odata.org/Northwind/Northwind.svc/Customers?$filter=tolower(CompanyName) eq 'foobar'&$select=FirstName,LastName&$orderby=Name desc
It allows you to pass SQL-like expressions, so it can be a nice alternative to what you described (both solutions are fine, it's just a matter of taste).
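When building such a URI in code, the spaces in the expressions have to be percent-encoded. A minimal Python sketch using only the standard library follows; the helper name is made up, and the property names are copied from the example above without checking them against the actual Northwind service.

```python
from urllib.parse import quote, urlencode


def build_odata_url(base, options):
    """Append OData system query options, keeping $ ( ) , ' readable."""
    query = urlencode(options, quote_via=quote, safe="$(),'")
    return f"{base}?{query}"


url = build_odata_url(
    "http://services.odata.org/Northwind/Northwind.svc/Customers",
    {
        "$filter": "tolower(CompanyName) eq 'foobar'",
        "$select": "FirstName,LastName",
        "$orderby": "Name desc",
    },
)
```

Only the spaces are rewritten (as %20); the rest of the expression stays human readable in the URL.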

AFAIK, none of the above are quite RESTful. Both of them rely on a priori knowledge on the client's part of how to invoke queries (in the first case a query pattern, and in the second one a query DSL). In the second example, in fact, the API is reduced to a mere wrapper around the data store. As such, the API does not define a server domain - it is a data provider. This is in contrast to the client-server constraint of REST.
If you need to expose a full-blown data store with all its various querying capabilities, you had better stick to known standards, of which we have OData. OData has been sold as REST, but many REST-heads have problems with it. Anyhow, at the end of the day it works, and REST discussions can commonly lead to analysis paralysis.
If I was doing this, I would probably constrain the API to a common use case, so something more like the second one without defining a query DSL (hence forenameStartsWith rather than forename-startswith).
Having said that, if you need to query based on many fields and various conditions, I would use OData.

Both examples use query parameters for filtering. I don't think it matters what these query parameters are called or if some wildcard syntax is used.
Both approaches are equally RESTful.

Design RESTful URI

I am in the process of creating a RESTful API. I read
http://microformats.org/wiki/rest/urls
but this site doesn't give me enough "good" practice on designing my API.
Specifically I will write an API (only GET methods so far) which will provide functions to convert geo-coordinates.
Example:
A geohash is a single value representation of a coordinate, thus
/convert/geohash/u09tvkx0.json?outputformat=latlong
makes sense. On the other hand
/convert/latlong.xml?lat=65&long=13&outputformat=UTC requires two input values.
See my "question"? What makes a good API which requires more than one input parameter?
(Tried to "identify" good practice by "analysing" twitter & FF but failed)

In terms of being considered a "technically" correct REST URI, there is no difference between using query string parameters or not. RFC 3986 states:
The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource
That's why you're having a difficult time finding a definitive "best practice". Having said that, many REST APIs do embed multiple parameters in the URI without using query strings. For example, to identify the make and model of a car, you'll see websites with URIs like this: cars.com/honda/civic. In that case the relationship between the two is very obvious, and so having everything in the URI is "hackable". It's also much easier to stick with a non-query-string approach when you only have one parameter which uniquely identifies the resource; but if it's something like a search query, then I'd probably keep it in the query string. This SO question has an interesting discussion about the different approaches as well.
In your example above, I would stick with the query string parameters. Although REST typically has more intuitive URLs, that's really not what REST is about. REST is more about hypermedia and HATEOAS.

There are a few REST best practices to be followed when designing a REST API:
Abstract vs Concrete
CRUD Operations
Error Handling
API Versioning
Filtering
Security
Analytics
Documentation
Stability and Consistency
URL Structure

Does my API design violate RESTful principles?

I'm currently trying to design a RESTful API for a social network, but I'm not sure whether my current approach still accords with RESTful principles. I'd be glad if some brighter heads could give me some tips.
Suppose the following URI represents the name field of a user account:
people/{UserID}/profile/fields/name
But there are almost a hundred possible fields. So I want the client to create its own field views or use predefined ones. Let's suppose that the following URI represents a predefined field view that includes the fields "name", "age", "gender":
utils/views/field-views/myFieldView
And because field views are kind of higher logic I don't want to mix support for field views into the "people/{UserID}/profile/fields" resource. Instead I want to do the following:
utils/views/field-views/myFieldView/{UserID}
Another example
Suppose we want to perform some set operations (I hope that this is the right name for it in English). We have the following URIs, each of which points to a list of persons, namely that user's friends:
GET people/exampleUID-1/relationships/friends
GET people/exampleUID-2/relationships/friends
And now we want to find out which of their friends are also friends of mine. So we do this:
GET people/myUID/relationships/intersections/{Value-1};{Value-2}
Whereas "{Value-1/2}" are the url encoded values of "people/exampleUID-1/friends" and "people/exampleUID-2/friends". And then we get back a representation of all people which are friends of all three persons.
Since Leonard Richardson & Sam Ruby state in their book "RESTful Web Services" that a RESTful design is somehow like an "extreme object oriented" approach, I think that my approach is object oriented and therefore accords with RESTful principles. Or am I wrong?
If not: Are such "object oriented" approaches generally encouraged when used with care and in order to avoid query-based REST-RPC hybrids?
Thanks for your feedback in advance,
peta

I've never worked with REST, but I'd have assumed that GETting a profile resource at /people/{UserId}/profile would yield a document, in XML or JSON or something, that includes all the fields. Client-side I'd then ignore the fields I'm not interested in. Isn't that much nicer than having to (a) configure a personalised view on the server or (b) make lots of requests to fetch each field?

Hi peta,
I'm still reading through RESTful Web Services myself, but I'd suggest a slightly different approach than the proposed one.
Regarding the first part of your post:
utils/views/field-views/myFieldView/{UserID}
I don't think that this is RESTful, as utils is not a resource. Defining custom views is OK, however these views should be (imho) a natural part of your API's URI scheme. To incorporate the above into your first URI example, I would propose one of the following examples instead of creating a special view for it:
people/{UserID}/profile/fields/name,age,gender/
people/{UserID}/profile/?fields=name,age,gender
The latter example considers fields as an input value for your algorithm. This might be a better approach than having fields in the URI, as it is not a resource itself - it just puts constraints on the existing view of people/{UserID}/profile/. Technically, it's very similar to pagination, where you would limit a view by default and allow clients to browse through resources by using ?page=1, ?page=2 and so on.
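On the server, the ?fields= variant boils down to a projection of the full profile. A minimal Python sketch (the function name is made up, and silently ignoring unknown field names is an assumption; an API could equally answer with a 400):

```python
def select_fields(profile, fields_param):
    """Return only the requested fields; no parameter means the full profile."""
    if not fields_param:
        return dict(profile)
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    # Unknown field names are skipped in this sketch.
    return {name: profile[name] for name in wanted if name in profile}
```

The resource stays people/{UserID}/profile/ in every case; the parameter only narrows what is returned.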
Regarding the second part of your post:
This is a more difficult one to crack.
First:
Having intersection in the URI breaks your URI scheme a bit. It's not a resource by itself and also it sits on the same level as friends, whereas it would be more suitable one level below or as an input value for your algorithm, i.e.
GET people/{UserID}/relationships/friends/intersections/{Value-1};{Value-2}
GET people/{UserID}/relationships/friends/?intersections={Value-1};{Value-2}
I'm again personally inclined to the latter because, as in the first case, you are just constraining the existing view of people/{UserID}/relationships/friends/
Secondly, regarding:
Whereas "{Value-1/2}" are the url encoded values of "people/exampleUID-1/friends" and "people/exampleUID-2/friends"
If you meant that {Value-1/2} contain the whole encoded response of the mentioned GET requests, then I would avoid that - I don't think that's the RESTful way. Since friends is a resource by itself, you may want to expose it and access it directly, i.e.:
GET friends/{UserID-1};{UserID-2};{UserID-3}
One important thing to note here - I've used ; between user IDs in the previous example, whereas I used , in the fields example above. The reasoning is that both represent a different operator. In the first case we needed OR (,) in order to get all three fields, while in the last example above we had to use AND (;) in order to get an intersection.
Usage of two types of operators can over-complicate the API design, but it should provide more flexibility in the end.
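The two operators map naturally onto set operations. A small Python sketch with made-up in-memory data; the function name is hypothetical, and rejecting a mix of both operators in one segment is a simplification of this sketch.

```python
# Hypothetical friend lists keyed by user ID.
FRIENDS = {
    "u1": {"a", "b", "c"},
    "u2": {"b", "c", "d"},
    "u3": {"c", "e"},
}


def resolve_friends(segment):
    """';' (AND) intersects the friend lists, ',' (OR) unites them."""
    if ";" in segment and "," in segment:
        raise ValueError("mixing ';' and ',' is not supported in this sketch")
    if ";" in segment:
        return set.intersection(*(FRIENDS[uid] for uid in segment.split(";")))
    return set.union(*(FRIENDS[uid] for uid in segment.split(",")))
```

With this data, the segment u1;u2;u3 yields only the friend shared by all three users, while u1,u3 yields everyone who is a friend of either.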

Thanks for your clarifying answers. They are exactly what I was asking for. Unfortunately I haven't had the time to read "RESTful Web Services" from cover to cover, but I will catch up on it as soon as possible. :-)
Regarding the first part of my post:
You're right. I incline to your first example, and without fields. I think that I don't need it at all (at the moment). Why do you suggest the use of OR (,) instead of AND (;)? Intuitively I'd use the AND operator because I want all three of them and not just the first one existing. (Like the colorpairs example on page 121.)
Regarding the second part:
With {Value-1/2} I meant only the url-encoded value of the URIs -- not their response data. :) Here I incline to your second example. Here it should be obvious that under the hood an algorithm is involved when calculating intersecting friends. And besides that, I'm probably going to add some further operations to it.
peta
