REST API and GET with structured data

I'm designing a REST API where given an address, I will return the timezone corresponding to the address. I started along this path,
GET /api/TimeZone?address=
Now this address is a free form address field that I need to parse on the server.
To avoid errors, I'd really like to have the user send in a record like {"city": "", "state": "", "country": "" }
But I can only do this using a POST or a PUT endpoint, which semantically implies that the data is changing on the server, when it actually is not.
What would be a good way to address this?

You can have multiple fields in your query:
TimeZone?city=London&country=uk
Or you could use hierarchical URLs such as:
TimeZone/UK
TimeZone/UK/London
I would expect the first of these to return a list of cities, such as those found in most clock apps. For the UK, London would be the only suggestion, as the whole of the UK is on "London time"; for America, there would be many suggestions.
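A minimal sketch of the multi-field query approach, in Python with only the standard library. The field names (city, state, country) come from the question; the helper names build_timezone_url and parse_address_query are hypothetical and only illustrate that a structured record fits into a plain GET.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ADDRESS_FIELDS = ("city", "state", "country")  # fields named in the question


def build_timezone_url(base, **address):
    """Client side: encode the structured address as query parameters."""
    params = {k: v for k, v in address.items() if k in ADDRESS_FIELDS and v}
    return f"{base}?{urlencode(params)}"


def parse_address_query(url):
    """Server side: rebuild the address record from the query string."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; keep the first value of each known field.
    return {k: query[k][0] for k in ADDRESS_FIELDS if k in query}


url = build_timezone_url("/api/TimeZone", city="London", country="uk")
print(url)                       # /api/TimeZone?city=London&country=uk
print(parse_address_query(url))  # {'city': 'London', 'country': 'uk'}
```

Because the request stays a GET, it keeps its safe, read-only semantics, and the server still receives separate fields instead of one free-form string to parse.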

Related

REST GET method: Can return a list of enriched resources?

I have a doubt when I'm designing a REST API.
Consider I have a Resource "Customer" with two elements in my server, like this:
[
{
name : "Mary",
description : "An imaginary woman very tall."
},
{
name : "John",
description : "Just a guy."
}
]
And I want to make an endpoint that accepts a GET request with a query. The query provides a parameter whose value is a piece of text, and an algorithm counts how many times that text occurs across all the fields of each customer.
So if we throw this request:
GET {baseURL}/customers?letters=ry
I should get something like
[
{
name : "Mary",
description : "An imaginary woman very tall.",
count : 3
},
{
name : "John",
description : "Just a guy.",
count : 0
}
]
The count field cannot be included in the resource schema, as it depends on the value provided in the query, so the response objects have to be enriched.
I'm not getting a list of my resource but a list of modified resources.
Although it keeps the idempotent condition for the GET method, I feel it strays from the REST architecture concept (even from REST beyond CRUD).
Is it still a valid endpoint in a RESTful API? or should I create something like a new resource called "ratedCustomer"?

REST GET method: Can return a list of enriched resources?
TL;DR: yes.
Longer answer...
A successful GET request returns a representation of a single resource, identified by the request-target.
The fact that the information used to create the representation of the resource comes from multiple entities in your domain model, or multiple rows in your database, or from reports produced by other services... these are all implementation details. The HTTP transfer of documents over a network application doesn't care.
That also means that we can have multiple resources that include the same information in their representations. Think "pages in wikipedia" that duplicate each others' information.
Resource identifiers on the web are semantically opaque. All three of these identifiers are understood to be different resources:
/A
/A?enriched
/B
We human beings looking at these identifiers might expect /A?enriched to be semantically closer to /A than /B, but the machines don't make that assumption.
It's perfectly reasonable for /A?enriched to produce representations using a different schema, or even a different content-type (as far as the HTTP application is concerned, it's perfectly reasonable that /A be an HTML document and /A?enriched be an image).
Because the machines don't care, you've got additional degrees of freedom in how you design both your resources and your resource identifiers, which you can use to enjoy additional benefits, including designing a model that's easy to implement, or easy to document, or easy to interface with, or easy to monitor, or ....
Design is what we do to get more of what we want than we would get by just doing it.
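The enriched listing from the question could be sketched like this in Python. The function names are hypothetical, and counting non-overlapping matches with str.count is an assumption about what "occurrences" means; the point is only that the count is computed per request and never stored on the customer.

```python
def count_occurrences(text, needle):
    """Count non-overlapping occurrences of needle in text."""
    return text.count(needle) if needle else 0


def enrich_customers(customers, letters):
    """Return copies of the customers with a query-dependent 'count' field."""
    return [
        {**c, "count": sum(count_occurrences(str(v), letters) for v in c.values())}
        for c in customers
    ]


customers = [
    {"name": "Mary", "description": "An imaginary woman very tall."},
    {"name": "John", "description": "Just a guy."},
]

for row in enrich_customers(customers, "ry"):
    print(row["name"], row["count"])  # Mary 3, then John 0
```

The stored customers are left untouched, which matches the answer's point that /customers and /customers?letters=ry are simply two different resources with their own representations.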

Are resources state aware or static under hateoas/restful api

My question is about if resources should be aware of the state or statically defined. For example, I have an API that returns account information where the resource uri would be /api/accounts/2.
If I'm authenticated as user henk willemsa the resource would look like this:
{
"id": 2,
"firstname": "henk",
"lastname": "willemsa",
"birthday": "12-31-1980",
"email": "firstname.lastname@email.com",
"other": "other useless info",
"super-secret-info": "some super secret info"
}
Is it good practice to return the resource with stripped out data if you would be authenticated as another user? For instance, making a request to the same endpoint /api/accounts/2, but for a different user, jan smit, the returned response would be:
{
"id": 2,
"firstname": "henk",
"lastname": "willemsa",
"other": "other useless info"
}
The idea is that user jan smit is only allowed to see the public data, whereas henk willemsa sees the secret as well.
Would it be better to solve something like this with 2 endpoints, where /api/accounts/2 would return a 403 for user jan smit and a 200 for henk willemsa, and another API endpoint /api/public-account/2 would return 200 for both users? The latter could give a response like:
{
"id": 2,
"firstname": "henk",
"lastname": "willemsa",
"other": "other useless info"
}
Having one endpoint and stripping out data would in my eyes be inconsistent, because the structure of the data-type/resource would change depending on who requests it and not because extra explicit data is sent, which changes the data-type/resource (like filter options).
But I can also see that splitting this out over multiple endpoints could lead to lots and lots of different endpoints which basically do the same thing: return account information.
I also found this question, which somewhat describes what I'm looking for but is about collection calls. In my opinion, these are allowed to return different unique resources, but the data types should always be the same. In my example, /api/accounts/ would always return a list of accounts; depending on which user makes the request to the endpoint, the size of the list could be different, but it would always be a list of accounts.
What is the best approach?

The "best" approach can probably not be objectively defined. However, creating multiple resources for the same "thing" is probably not a good idea. Things should be identifiable by URI, so accounts should have a stable URI.
I would probably just omit the fields that the user cannot see, if that is possible according to the data definitions/structure. If not, you could serve multiple 'representations', i.e. media types, and let content negotiation handle the exchange. That means you create two media types, one with the full data and one for the restricted view of the account, and serve both for the same resource URI. The server can then decide which representation you get based on your credentials. The client would also be able to easily see which representation it got, and inform the user if necessary that it has a restricted view of the account.
The client would have to ask with an 'Accept' header similar to this:
Accept: application/vnd.company.account-full; q=1.0, application/vnd.company.account-restricted; q=0.9
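A rough Python sketch of the "one URI, two representations" idea. The two media type names are the ones from the Accept header above; the rule that only the account owner gets the full view, the PUBLIC_FIELDS list, and the function name represent_account are assumptions made for illustration.

```python
FULL = "application/vnd.company.account-full"
RESTRICTED = "application/vnd.company.account-restricted"
# Assumption: these are the fields any authenticated user may see.
PUBLIC_FIELDS = ("id", "firstname", "lastname", "other")


def represent_account(account, viewer_id):
    """Choose a media type and body for one account URI, based on the viewer."""
    if viewer_id == account["id"]:
        return FULL, dict(account)
    return RESTRICTED, {k: account[k] for k in PUBLIC_FIELDS if k in account}


account = {
    "id": 2,
    "firstname": "henk",
    "lastname": "willemsa",
    "other": "other useless info",
    "super-secret-info": "some super secret info",
}

media_type, body = represent_account(account, viewer_id=7)
print(media_type, sorted(body))
```

The returned media type would go into the Content-Type header of the response, so a client can tell which view of /api/accounts/2 it received without inspecting the fields.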

Validating Shipping Addresses

scenario:
a customer shops and creates an e-commerce order. there is a customer db table, and a shipping table. a customer can have more than one shipping address. if the customer logs in to do another order, the shipping address(es) are pulled from the shipping table based on customer id.
the shipment is boxed, the store admin orders a shipping label and the address is sent to be "verified". when the shipping address is Verified, a new version of the shipping address is returned. There are 4 possibilities here:
1) there is no change at all to the Verified address except the letters might have been changed to upper case.
2) there is some slight change to the Verified address that doesn't affect the original version. maybe Ave was added to a street name field or a 5 digit zip code is updated to the 9 digit version.
3) the original submitted shipping address had a minor error - but the Verification was able to correct it. for example the zip code had a wrong digit.
4) the original submitted address has a major error that cannot be resolved by the Verification. either it can be figured out by the admin and resubmitted or the customer must be contacted.
so the questions are:
A) do we always update shipping tables with the (new) Verified address?
B) or do we do some kind of check and compare the original and verified to see if there is a change and then only update if the address has been changed?
C) or should we update the address AND keep a backup copy of the original address?
choice A seems like the simplest but am curious how people are dealing with this. note that this is probably most relevant for shipping with USPS postal in terms of the rigor of their verification.
====== edit
validating the shipping to address when the customer enters it in is obviously the most optimal but e-commerce merchants can get orders from different "channels" that the merchant has no control over. so validation at the time of creating the shipping label is still required.

D) Interaction with the customer at point of entry.
At that point of creating an eCommerce order - is that where they enter the address? That is the point that you should be validating the details. That is the point that you can ask for more information and clarity.
For example, if you validate afterwards then simple corrections (or changes over time) can be applied, but if there are major changes or the customer did not enter their apartment information, there is no way to guess that afterwards, and it can be a costly and time-consuming task to correct (calling the customer back, etc.).
There are a few ways you can do this, I'm not sure what you use at the moment. If you are using third party solutions (such as my company's offering: https://www.edq.com/address-verification) they probably have services you can use which provide the interaction and prompts required. A solution such as this has the advantage that it can speed the capture of an address and ease the check out process but also makes sure that the addresses are validated and corrected at the point of capture.
If you do the above, then you only need to store the one, correct, customer accepted address. If they enter it correctly first time then it goes straight through with a lovely user experience, if there are problems then minor ones are fixed with extra interaction (appending Zip +4 perhaps) or they are prompted and guided to correct their address.

The USPS address validation only corrects addresses if they're positive that it does not actually change the intended address. Whenever the address validation is ambiguous, the address validation service returns the original address (or an error) and usually includes a message such as "ZIP and city are a match, but street address is invalid".
Thus you should be able to go with options A or B. That being said, most carrier APIs (including address validation) are unfortunately not 100% reliable, thus it might be reasonable to go with option C to have a backup.
If you need access to an easy-to-use address validation interface, my company Shippo offers an easy-to-use address validation REST API.
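Options B and C from the question can be combined in a few lines. This is a minimal Python sketch, not any carrier's actual API: the column names in ADDRESS_KEYS and the function names are hypothetical, and the comparison simply ignores letter case and extra whitespace.

```python
ADDRESS_KEYS = ("street", "city", "state", "zip")  # hypothetical column names


def normalize(address):
    """Case- and whitespace-insensitive view of an address for comparison."""
    return {k: " ".join(str(address.get(k, "")).upper().split()) for k in ADDRESS_KEYS}


def apply_verified(original, verified):
    """Return (address_to_store, backup_or_None) for a verified address."""
    if normalize(original) == normalize(verified):
        # Case 1 in the question: only letter case changed, nothing to update.
        return original, None
    # Cases 2 and 3: store the verified version, keep the original as a backup.
    return verified, original


original = {"street": "13 milk st", "city": "boston", "state": "ma", "zip": "02109"}
verified = {"street": "13 MILK ST", "city": "BOSTON", "state": "MA", "zip": "02109-5402"}

stored, backup = apply_verified(original, verified)
print(stored["zip"], backup is not None)
```

Case 4 (an address the verification cannot resolve) is deliberately left out, since it needs a human decision rather than a table update.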

You can try the Pitney Bowes "IdentifyAddress" API, available at https://identify.pitneybowes.com/
The service analyses and compares the input addresses against the known address databases around the world to output a standardized detail. It corrects addresses, adds missing postal information and formats it using the format preferred by the applicable postal authority. It also uses additional address databases so it can provide enhanced detail, including address quality, type of address, transliteration (such as from Chinese Kanji to Latin characters) and whether an address is validated to the premise/house number, street, or city level of reference information.
You will find a lot of samples and SDKs available on the site, and I found it extremely easy to integrate.
Here is the small sample -
JSON Request -
{
"options": {
"OutputCasing": "M"
},
"Input": {
"Row": [
{
"AddressLine1": "13 milk st",
"AddressLine2": "",
"City": "boston",
"Country": "us",
"StateProvince": "ma",
"PostalCode": "",
"user_fields":[
{
"name": "User1",
"value": "Value1"
}
]
}
]
}
}
JSON Response
{"Output": [{
"Status": "F",
"Status.Code": "UnableToDPVConfirm",
"Status.Description": "Address Not Deliverable",
"AddressLine1": "13 Milk St",
"City": "Boston",
"StateProvince": "MA",
"PostalCode": "02109-5402",
"Country": "United States of America",
"BlockAddress": "13 Milk St Boston MA 02109-5402 United States of America",
"PostalCode.Base": "02109",
"PostalCode.AddOn": "5402",
"user_fields": [ {
"name": "User1",
"value": "Value1"
}]
}]}

Is this a valid REST API?

I am designing an API.
There's the user profile, accessible at
http://example.org/api/v1/users (resp. http://example.org/api/v1/users/:id)
Now, the user's profile will be dynamic.
So we will allow an API function to add a new profile attribute.
Is the following a valid REST API URL for this?
POST http://example.org/api/v1/users/attributes
Indeed, to retrieve a specific user, the user's id would be appended to the .../users/ URL.
Now if I use the "attributes" element after /users/, would that somehow break the user id pattern for the URL?
I'd like to keep the base URL to be api/v1/users though, because logically I am modifying the users profile still...
EDIT: The attributes would be added valid for all profiles, it's independent of a user. Say the profile has "name", "surname", "email", and I want to add "address" to all profiles (Of course I know that users with a missing "address" field would not get the new attribute)
What is a good practice to address such an issue?

I think the id should be kept in the URL because you are adding the attributes to a specific user, right?

It is an acceptable solution to use /api/v1/users/attributes as long as the :id cannot be the text "attributes". However, I recommend creating your own media type, microformat, or microdata for the attributes, because it is a type rather than a resource.
I think you should check these links:
http://alps.io/spec/index.html
http://www.markus-lanthaler.com/hydra/spec/latest/core/
http://schema.org/
http://microformats.org/wiki/microformats2
http://amundsen.com/media-types/maze/
If the user can set what attributes she can have, only then should you use a resource for attributes. But then each user should have one. But I don't think using resources will be necessary, microdata and microformats both contain more than enough person description attributes...
Some update after 5 months:
Now if I use the "attributes" element after /users/, would that
somehow break the user id pattern for the URL?
From the perspective of the client that "id pattern" does not exist. The client follows links by checking the semantics annotated to them. So REST clients are completely decoupled from the URI structure of the actual REST API (aka. uniform interface constraint). If your pattern breaks, then it is completely a server side, link generation and routing issue, which is not a client side concern.
Say the profile has "name", "surname", "email", and I want to add
"address" to all profiles. What is a good practice to address such an
issue?
Address is an optional field in this case and probably a sub-resource, because it can have further fields, like city, postal code, street, etc... You can add address separately, for example with PUT /users/123/address {city: "", street: "", ...} or you can add those fields to your user form, and add a partial update to the user, like PATCH /users/123 {address: {city: "", street: "", ...}} if only the address changes.
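The answer does not say which patch format it has in mind. One common convention for such partial updates is JSON Merge Patch (RFC 7386), and the Python sketch below follows that convention as an assumption; the function name merge_patch and the sample user are illustrative only.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch style document to target and return the result."""
    if not isinstance(patch, dict):
        return patch  # non-object patches replace the target wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


user = {"name": "Jeff", "surname": "Atwood", "email": "jeff@example.org"}

# Body of a hypothetical PATCH /users/123 that only touches the address.
patched = merge_patch(user, {"address": {"city": "London", "street": "Baker St"}})
print(patched["address"])
```

Fields that are absent from the patch (name, surname, email) are left as they were, which is what makes PATCH suitable when only the address changes.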

In case you want to update every resource in the entire collection I would send a PATCH request to /users.

While it is a valid URI, I would suggest avoiding POST http://example.org/api/v1/users/attributes. In my opinion, it violates the principle of least surprise when a collection endpoint has a child node which is not a member of the collection. If you want to track user attributes as shared by all users, then that's a separate collection, perhaps /user-attributes.
POST /user-attributes
{
"name": "Email Address",
"type": "String",
...
}
GET /user-attributes would return all the possible attributes, and GET /user-attributes/{id} would return all the metadata around an attribute.
If there's no metadata, then @inf3rno's suggestion to just PUT the attribute up and let the server deal with it is definitely worth considering.
This all presupposes you need to manage attributes through the API. If not, I agree with @inf3rno that media types are the way to go. Of course, in that case you may want a media type for the user-attributes resource.

RESTful API - Correct behaviour when spurious/not requested parameters are passed in the request

We are developing a RESTful api that accepts query parameters in the request in the form of JSON encoded data.
We were wondering what is the correct behaviour when non requested/not expected parameters are passed along with the required ones.
For example, we may require that a PUT request on a given endpoint has to provide exactly two values, for the keys name and surname:
{
"name": "Jeff",
"surname": "Atwood"
}
What if a spurious key is passed too, like color in the example below?
{
"name": "Jeff",
"surname": "Atwood",
"color": "red"
}
The value for color is neither expected nor documented.
Should we ignore it or reject the request with a 400 Bad Request status?
We can assert that the request is bad because it doesn't conform to the documentation. And the API user should probably be warned about it (she passed the value, so she'll expect something for that).
But we can assert too that the request can be accepted because, as the required parameters are all provided, it can be fulfilled.

Having used RESTful APIs from numerous vendors over the years, let me give you a "users" perspective.
A lot of times documentation is simply bad or out of date. Maybe a parameter name changed, maybe you enforce exact casing on the property names, maybe you have used the wrong font in your documentation and have an I which looks exactly like an l - yes, those are different letters.
Do not ignore it. Instead, send an error message back stating the property name with an easy to understand message. For example "Unknown property name: color".
This one little thing will go a long way towards limiting support requests around consumption of your API.
If you simply ignore the parameters then a dev might think that valid values are being passed in while cussing your API because obviously the API is not working right.
If you throw a generic error message then you'll have devs pulling their hair out trying to figure out what's going on and flooding your forum, this site or your phone with calls asking why your servers don't work. (I recently went through this problem with a vendor that just didn't understand that a 404 message was not a valid response to an incorrect parameter and that the documentation should reflect the actual parameter names used...)
Now, by the same token I would expect you to also give a good error message when a required parameter is missing. For example "Required property: Name is missing".
Essentially you want to be as helpful as possible so the consumers of your API can be as self sufficient as possible. As you can tell I wholeheartedly disagree with a "gracious" vs "stern" breakdown. The more "gracious" you are, the more likely the consumers of your API are going to run into issues where they think they are doing the right thing but are getting unexpected behaviors out of your API. You can't think of all possible ways people are going to screw up so enforcing a strict adherence with relevant error messages will help out tremendously.
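The strict behaviour argued for here fits in a few lines of Python. The required keys are the ones from the question's PUT example and the message wording follows the examples in this answer; the function name validate_strict is hypothetical.

```python
REQUIRED = {"name", "surname"}  # keys from the question's PUT example


def validate_strict(payload):
    """Return a list of error messages; an empty list means the payload is valid."""
    keys = set(payload)
    errors = [f"Required property: {key} is missing" for key in sorted(REQUIRED - keys)]
    errors += [f"Unknown property name: {key}" for key in sorted(keys - REQUIRED)]
    return errors


print(validate_strict({"name": "Jeff", "surname": "Atwood", "color": "red"}))
# ['Unknown property name: color']
```

A server following this approach would return the messages in the body of a 400 response, so the caller sees exactly which property was rejected.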

If you design an API you can follow one of two paths: "stern" or "gracious".
Stern means: If you do anything I didn't expect I will be mad at you.
Gracious means: If I know what you want and can fulfil it I will do it.
REST allows for a wonderful gracious API design and I would try to follow this path as long as possible and expect the same of my clients. If my API evolves I might have to add additional parameters in my responses that are only relevant for specific clients. If my clients are gracious to me they will be able to handle this.
Having said that, I want to add that there is a place for stern API design: when you are designing in a sensitive domain (e.g. cash transactions) and you don't want to leave room for any misunderstanding between the client and server. Imagine the following POST request (valid for your /account/{no}/transaction/ API):
{ amount: "-100", currency : "USD" }
What would you do with the following (invalid API request)?
{ amount: "100", currency : "USD", type : "withdrawal" }
If you just ignore the "type" attribute, you will deposit 100 USD instead of withdrawing them. In such a domain I would follow a stern approach and show no grace whatsoever.
Be gracious if you can, be stern if you must.
Update:
I totally agree with @Chris Lively's answer that the user should be informed. I disagree that it should always be an error case, even if the message is unambiguous for the referenced resource. Always rejecting such requests would hinder reuse of resource representations and require repackaging of semantically identical information.

It depends on your documentation and how strict you want to be. But generally speaking, just ignore it. Most other servers also ignore request parameters they don't understand.
Example taken from my previous post
Extra Query parameters in the REST API Url
"""Google ignore my two extra parameters here https://www.google.com/#q=search+for+something&invalid=param&more=stuff"""

Imagine I have the following JSON schema:
{
"frequency": "YEARLY",
"date": 23,
"month": "MAY"
}
The frequency attribute accepts the values "WEEKLY", "MONTHLY" and "YEARLY".
The expected payload for "WEEKLY" frequency value is:
{
"frequency": "WEEKLY",
"day": "MONDAY"
}
And the expected payload for "MONTHLY" frequency value is:
{
"frequency": "MONTHLY",
"date": 23
}
Given the above JSON schema, I will typically need a POJO containing frequency, day, date, and month fields for deserialization.
If the received payload is:
{
"frequency": "MONTHLY",
"day": "MONDAY",
"date": 23,
"year": 2018
}
I will throw an error on "day" attribute because I will never know the intention of the sender:
frequency: "WEEKLY" and day: "MONDAY" (incorrect frequency value entered), or
frequency: "MONTHLY" and date: 23
For the "year" attribute, I don't really have choice.
Even if I wish to throw an error for that attribute, I may not be able to.
It's ignored by the JSON serialization/deserialization library, as my POJO has no such attribute. This is the behavior of GSON, and it makes sense given its design decision.
Navigating the Json tree or the target Type Tree while deserializing
When you are deserializing a Json string into an object of desired type, you can either navigate the tree of the input, or the type tree of the desired type. Gson uses the latter approach of navigating the type of the target object. This keeps you in tight control of instantiating only the type of objects that you are expecting (essentially validating the input against the expected "schema"). By doing this, you also ignore any extra fields that the Json input has but were not expected.
As part of Gson, we wrote a general purpose ObjectNavigator that can take any object and navigate through its fields calling a visitor of your choice.
Extracted from GSON Design Document
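The mixed policy described in this answer (reject a known field that conflicts with the frequency, silently drop a field the model does not know) can be sketched in Python. This mirrors the described behaviour only; it is not GSON, and the names EXPECTED, KNOWN and check_schedule are hypothetical.

```python
EXPECTED = {
    "WEEKLY": {"day"},
    "MONTHLY": {"date"},
    "YEARLY": {"date", "month"},
}
KNOWN = {"frequency", "day", "date", "month"}  # fields of the POJO in the answer


def check_schedule(payload):
    """Reject known-but-conflicting fields; silently drop unknown ones."""
    frequency = payload.get("frequency")
    if frequency not in EXPECTED:
        raise ValueError(f"Unsupported frequency: {frequency!r}")
    expected = EXPECTED[frequency]
    missing = expected - set(payload)
    if missing:
        raise ValueError(f"Missing for {frequency}: {sorted(missing)}")
    conflicting = (set(payload) & KNOWN) - expected - {"frequency"}
    if conflicting:
        raise ValueError(f"Not allowed for {frequency}: {sorted(conflicting)}")
    # Unknown keys such as "year" fall through here, as they would with a POJO.
    return {key: payload[key] for key in ("frequency", *sorted(expected))}


print(check_schedule({"frequency": "MONTHLY", "date": 23, "year": 2018}))
# {'frequency': 'MONTHLY', 'date': 23}
```

With the payload from the answer, which also carries "day": "MONDAY", the function raises a ValueError naming day, while year alone is dropped without complaint.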

Just ignore them.
Do not give the user any chance to reverse engineer your RESTful API through your error messages.
Give the developers the neatest, clearest, most comprehensive documentation and parse only parameters your API need and support.

I would suggest that you ignore the extra parameters. Reusing an API is a game changer in the integration world. What if the same API can be used by another integration, but with a few extra parameters?
Application A expecting:
{
"name": "Jeff",
"surname": "Atwood"
}
Application B expecting:
{
"name": "Jeff",
"surname": "Atwood",
"color": "red"
}
Simply getting application A to ignore "color" will do the job, rather than having two different APIs to handle that.
