I am designing a REST service for my company. No one here has much experience with REST, so I read through a few books on the subject, but I am stuck on how the resource design for a POST should relate to the resource design for a GET of the same data, particularly where foreign key relationships are involved.
For instance I have a class PurchaseRequest which represents a request to purchase some fixed asset. Behind the scenes my service is an interface to a relational DB. There is a PURCHASE_REQUEST table which has a foreign key to an ASSET table (Defining which of a fixed list of assets are being requested) and a PERSON table (Defining which of the users is doing the requesting). Currently in my service when a GET command is issued for a purchase request, the service returns the whole thing: An XML representation of the PURCHASE_REQUEST table entry along with a list of asset entries like so:
<PurchaseRequest>
  <ID></ID>
  <RequestDate></RequestDate>
  <Requestor href="/requestors/requestorID">
    <RequestorID></RequestorID>
    <FirstName></FirstName>
    <LastName></LastName>
  </Requestor>
  <RequestedAssets>
    <RequestedAsset href="/assets/AssetNumber">
      <AssetNumber></AssetNumber>
      <Year></Year>
      <Make></Make>
      <Model></Model>
      <Cost></Cost>
    </RequestedAsset>
    <RequestedAsset href="/assets/AssetNumber">
      <AssetNumber></AssetNumber>
      <Year></Year>
      <Make></Make>
      <Model></Model>
      <Cost></Cost>
    </RequestedAsset>
    <RequestedAsset href="/assets/AssetNumber">
      <AssetNumber></AssetNumber>
      <Year></Year>
      <Make></Make>
      <Model></Model>
      <Cost></Cost>
    </RequestedAsset>
  </RequestedAssets>
</PurchaseRequest>
This works pretty efficiently. The consuming application makes a single request and gets the whole thing, plus links to the full requestor resource or asset resource if they need them.
The problem comes on a POST. My gut tells me to try to use the same resource layout for POSTing a new purchase request as I used to retrieve an existing one. This is what all the examples in the books I have read do anyway. I don’t need to know anything more than the Asset Number and Requestor ID to fulfill the POST. That means the rest of the data is unnecessary, but the inefficiency alone is not what bothers me. The main thing is that you should not be able to edit the year, make or model of an asset when creating a purchase request; those fields are pre-defined. You also should not be able to create a new asset definition when creating a purchase request. Similarly, you should not be able to update or create a person's details when creating a purchase request. There are separate services for creating and updating people and assets.
The only thing I can think of is to define a different DataContract class for the POST which has the minimum info to identify an asset or a person and does not expose those fields which cannot be updated. I really don’t love this option because it is going to create a large number of DataContract classes (nearly all of the tables in my DB have foreign relationships; this is not isolated to one request or I would not be worrying about it). However, I really don’t love my current design either, because REST does not have read-only fields. The burden is now on the consumers of my service to constantly be checking, "does it save this field… what about this one?..." Has anyone else run into this issue? Is it common to have to define a separate DataContract class for POSTing and GETting? It seems like a pretty basic design question, but I don’t see a lot of posts out there on the subject, so I am hoping I just missed something. Any help is appreciated.
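For illustration only, the separate-contract idea described above might look roughly like the following Python sketch. Every name here (AssetView, PurchaseRequestView, PurchaseRequestCreate, create_purchase_request, the catalog dictionary) is hypothetical and not taken from the service in question; the point is just that the POST contract carries references while the GET contract carries the expanded data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AssetView:
    """Read model: full asset details embedded in a GET response."""
    asset_number: str
    year: int
    make: str
    model: str
    cost: float


@dataclass(frozen=True)
class PurchaseRequestView:
    """Read model returned by GET: the request plus its expanded assets."""
    id: int
    request_date: str
    requestor_id: str
    assets: List[AssetView]


@dataclass(frozen=True)
class PurchaseRequestCreate:
    """Write model accepted by POST: references only, no asset or person fields."""
    requestor_id: str
    asset_numbers: List[str]


def create_purchase_request(
    cmd: PurchaseRequestCreate,
    catalog: Dict[str, AssetView],
    new_id: int,
    request_date: str,
) -> PurchaseRequestView:
    """Resolve asset references against the fixed catalog; unknown assets are rejected."""
    unknown = [n for n in cmd.asset_numbers if n not in catalog]
    if unknown:
        raise ValueError(f"unknown asset numbers: {unknown}")
    assets = [catalog[n] for n in cmd.asset_numbers]
    return PurchaseRequestView(new_id, request_date, cmd.requestor_id, assets)
```

With this split, a client cannot even express an edit to the year, make or model, so there is nothing for it to guess about which fields get saved.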
A person can have many reviews. My endpoint to CREATE a new review is:
post /person/{id}/reviews
How about the endpoint to UPDATE a review? I see two options:
1. Stick to the parent resource: patch /person/{person_id}/reviews/{id}
2. Only have reviews in the URI: patch /reviews/{id}
I could be sold on using either of them:
1. It's consistent with the previously defined endpoint, but {person_id} is not needed.
2. It's 'efficient' as we're not specifying a parameter ({person_id}) that is not really needed. However, it breaks the API convention.
Which one is preferable and why?
The client shouldn't have to know about ids at all. After a client creates the review, the response should include the URI to the new review like this:
HTTP/1.1 201 Created
Location: /person/4/reviews/5
The client now has the full URL to the review, so it is irrelevant to the client what the URL looks like and what information it contains.
Don't forget that the URL itself is a system for creating globally unique IDs that embed not just their own unique identity but also information on how to access the data. If you introduce separate 'id' and 'person_id' fields, you are not taking advantage of how the web is supposed to work.
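As a rough sketch of that idea in Python (the function names and the dictionary-shaped response are assumptions made for illustration, not any particular framework's API), the server puts the new review's URL in the Location header and the client simply resolves and stores it without parsing any IDs out of it:

```python
from urllib.parse import urljoin


def created_response(person_id: int, review_id: int) -> dict:
    """Server side: a minimal 201 response whose Location header names the new review."""
    return {
        "status": 201,
        "headers": {"Location": f"/person/{person_id}/reviews/{review_id}"},
    }


def review_url_from_response(base_url: str, response: dict) -> str:
    """Client side: resolve Location against the base URL and treat it as opaque."""
    return urljoin(base_url, response["headers"]["Location"])
```

Because the client only ever follows the URL it was given, the server could later switch between /person/{person_id}/reviews/{id} and /reviews/{id} without the client noticing.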
In terms of API design, without knowing too much detail about OP's situation I'd walk along these guideposts:
Only have reviews in the URI: patch /reviews/{id}
It's 'efficient' as we're not specifying a parameter ({person_id})
that is not really needed. However, it breaks the API convention
The "efficiency" allows for a more flexible design. There's no existing API convention broken at this point. Moreover, this approach means you don't need the parent resource ID every time you display or address your items.
Stick to the parent resource: patch /person/{person_id}/reviews/{id}
It's consistent with the previously defined endpoint, but {person_id}
is not needed.
The consistency aspect here can be neglected. It's not beneficial to design endpoints similarly to other endpoints just because the previous ones were designed in a certain way.
The key when deciding one way or the other is the intent you communicate and the following restrictions that are put on the endpoint.
The crucial question here is:
Can the reviews ever exist on their own or will they always have a parent person?
If you don't know for sure, go for the more flexible design: PATCH /reviews/{id}
If you do know for sure that it always will be bound to a particular person and never can have a null value for person_id in the database, then you could embed it right into your endpoint design with: PATCH /person/{person_id}/reviews/{id}
By the way, the same is true for other endpoints, like the creation endpoint POST /person/{person_id}/reviews. Having an endpoint like this removes the flexibility of creating reviews without a person, which may or may not be desirable.
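To make the difference in restrictions concrete, here is a minimal Python sketch with a hypothetical in-memory store and a hypothetical patch_review handler. It assumes reviews have their own unique IDs; the nested form then only adds a check that the review really belongs to the person named in the URI.

```python
from typing import Optional

# Hypothetical in-memory store: each review has its own ID and an owning person.
REVIEWS = {5: {"person_id": 4, "text": "ok"}}


def patch_review(review_id: int, changes: dict, person_id: Optional[int] = None) -> dict:
    """Serve PATCH /reviews/{id} (person_id=None) or PATCH /person/{person_id}/reviews/{id}."""
    review = REVIEWS.get(review_id)
    if review is None:
        raise KeyError(f"review {review_id} not found")
    # The nested endpoint carries one extra restriction: the parent must match.
    if person_id is not None and review["person_id"] != person_id:
        raise KeyError(f"review {review_id} not found under person {person_id}")
    review.update(changes)
    return review
```

The flat form needs nothing beyond the review's own ID, while the nested form encodes the "always has a parent" assumption in the URI itself.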
I am building a RESTful API where users may create resources on my server using post requests, and later reference them via get requests, etc. One thing I've had trouble deciding on is what IDs the clients should have. I know that there are many ways to do what I'm trying to accomplish, but I'd like to go with a design which follows industry conventions and best design practices.
Should my API decide on the ID for each newly created resource (it would most likely be the primary key for the resource assigned by the database)? Or should I allow users to assign their own reference numbers to their resources?
If I do assign a reference number to each new resource, how should this be returned to the client? The API has some endpoints which allow for bulk item creation, so I would need to list out all of the newly created resources on every response?
I'm conflicted because allowing the user to specify their own IDs is obviously a can of worms - I'd need to verify each ID hasn't been taken, makes database queries a lot weirder as I'd be joining on reference# and userID rather than foreign key. On the other hand, if I assign IDs to each resource it requires clients to have to build some type of response parser and forces them to follow my imposed conventions.
Why not do both? Let the users create their own reference and you create your own UID. If the users have to log in, then you can use their reference plus the user ID as a unique key. I would also return the UID you created; if it is not needed, the client can ignore it.
It wasn't practical (for me) to develop both of the above methods into my application, so I took a leap of faith and allowed the user to choose their own IDs. I quickly found that this complicated development so much that it would have added weeks to my development time, and resulted in much more complex and slow DB queries. So, early on in the project I went back and made it so that I just assign IDs for all created resources.
Life is simple now.
Other popular APIs that I looked at, such as the Instagram API, also assign IDs to certain created resources, which is especially important if you have millions of users who can interact with each other's resources.
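For the bulk-creation case raised in the question, one possible shape (a sketch only; ResourceStore, create_many and the /items/... paths are made-up names, and a real service would take IDs from the database) is to return one entry per created resource, in the same order as the submitted payloads:

```python
import itertools


class ResourceStore:
    """Hypothetical in-memory store that assigns its own IDs, like a DB primary key."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._items = {}

    def create_many(self, owner_id, payloads):
        """Create one resource per payload; return IDs and links in request order."""
        created = []
        for payload in payloads:
            new_id = next(self._next_id)
            self._items[new_id] = {"owner_id": owner_id, **payload}
            created.append({"id": new_id, "href": f"/items/{new_id}"})
        return created
```

Because the response preserves request order, a client can match each returned ID to the item it sent without supplying a reference number of its own.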
Say I have a relational database with 100+ tables. Each table models some sort of entity (person, address, vehicle, dog, etc etc). Say I also have a restful API and a bunch of people who want to POST data into this database. Many times this data comes in as an XML package or POST data from a web form or something of that nature. Sometimes we need to post to all the tables of the database, sometimes most, sometimes some, sometimes one.
Now requiring our clients to post clumps of multi resource data into a 100+ table persistence via the restful way of
POST /person
POST /email
POST /vehicle
POST /insurance
is insane! So we could have a resource instead that is
POST /auto-record
{ post body of key values for all the tables needed to make an 'auto-record' }
and it would be connected to some sort of business logic that knows to make inserts into the many tables of the database needed. Okay great. But now that I'm thinking about it, does this design abide by the open/closed principle? If we ever needed to update/add/remove to what an 'auto-record' is then we screw up our clients.
How can RESTful APIs deal with resource groupings? Or do they simply not? Are there alternatives?
You can implement multiple versions of your RESTful API resource /auto-record. For now, change your resource URI to /v1/auto-record. When a feature change is requested, you simply provide your customers with a new resource, /v2/auto-record. The old functionality is preserved at /v1/auto-record, and new users get the functionality they need at /v2/auto-record.
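A minimal Python sketch of that versioning scheme follows. The handlers, the ROUTES table and the required sections are invented for illustration (the original post does not say what an 'auto-record' contains); the sketch only shows that v1 keeps accepting the old body after v2 starts requiring more.

```python
def _require(body: dict, keys) -> None:
    """Raise if any required top-level section is missing from the body."""
    missing = [k for k in keys if k not in body]
    if missing:
        raise ValueError(f"missing sections: {missing}")


def create_auto_record_v1(body: dict) -> dict:
    """Original contract (assumed): person and vehicle sections only."""
    _require(body, ("person", "vehicle"))
    return {"version": 1, "tables": ["person", "vehicle"]}


def create_auto_record_v2(body: dict) -> dict:
    """Newer contract (assumed): additionally requires an insurance section."""
    _require(body, ("person", "vehicle", "insurance"))
    return {"version": 2, "tables": ["person", "vehicle", "insurance"]}


# Both versions stay routable side by side, so existing clients are not broken.
ROUTES = {
    ("POST", "/v1/auto-record"): create_auto_record_v1,
    ("POST", "/v2/auto-record"): create_auto_record_v2,
}


def dispatch(method: str, path: str, body: dict) -> dict:
    """Look up the handler for the versioned path and call it."""
    return ROUTES[(method, path)](body)
```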
I'm trying to wrap my head around how to design a RESTful API for creating object graphs. For example, think of an eCommerce API, where resources have the following relationships:
Order (the main object)
Has-many Addresses
Has-many Order Line items (what does the order consist of)
Has-many Payments
Has-many Contact Info
The Order resource usually makes sense only along with its associations. In isolation, it's just a dumb container with no business significance. However, each of the associated objects has a life of its own and may need to be manipulated independently, e.g. editing the shipping address of an order, changing the contact info against an order, removing a line item from an order after it has been placed, etc.
There are two options for designing the API:
The Order API endpoint intelligently creates itself AND its associated resources by processing "nested resource" in the content sent to POST /orders
The Order resource only creates itself and the client has to make follow-up POST requests to newly created endpoints, like POST /orders/123/addresses, PUT /orders/123/line-items/987, etc.
While the second option is simpler to implement at the server-side, it makes the client do extra work for 80% of the use-cases.
The first option has the following open questions:
How does one communicate the URL for the newly created resource? The Location header can communicate only one URL, however the server would've potentially created multiple resources.
How does one deal with errors? What if one of the associations has an error? Do we reject the entire object graph? How is that error communicated to the client?
What's the RESTful + pragmatic way of dealing with this?
I handle this the first way. You should not assume that a client will make all the requests it needs to. Create all the entities in the one request.
Depending on your use case you may also want to enforce an 'all-or-nothing' approach in creating the entities; i.e., if something fails, everything rolls back. You can do this by using a transaction on your database (which you can't do if everything is done through separate requests). Whether this is the behavior you want is very specific to your situation. For instance, if you are creating an order statement you may wish to employ this (you don't want to create an order that's missing items); however, if you are uploading photos it may be fine if some of them fail.
For returning the links to the client, I always return a JSON object. You could easily populate this object with links to each of the resources created. This way the client can determine how to behave after a successful post.
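A minimal sketch of the all-or-nothing approach, using Python's built-in sqlite3 module with an in-memory database. The table layout, the create_order and make_db helpers and the URL shapes are assumptions made up for this example; the CHECK constraint exists only to give the sketch a way to fail partway through.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory schema (hypothetical tables for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE line_items (id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL, "
        "sku TEXT NOT NULL, qty INTEGER NOT NULL CHECK (qty > 0))"
    )
    return conn


def create_order(conn: sqlite3.Connection, order: dict) -> dict:
    """Insert an order and its line items atomically; return links to what was created."""
    with conn:  # commits on success, rolls back if an exception escapes the block
        cur = conn.execute("INSERT INTO orders (customer) VALUES (?)", (order["customer"],))
        order_id = cur.lastrowid
        links = {"self": f"/orders/{order_id}", "line_items": []}
        for item in order["line_items"]:
            cur = conn.execute(
                "INSERT INTO line_items (order_id, sku, qty) VALUES (?, ?, ?)",
                (order_id, item["sku"], item["qty"]),
            )
            links["line_items"].append(f"/orders/{order_id}/line-items/{cur.lastrowid}")
    return links
```

The returned dictionary is the kind of JSON body that can carry a link for every created resource, while the Location header can still point at the order itself.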
Both options can be implemented RESTfully. You ask:
How does one communicate the URL for the newly created resource? The Location header can communicate only one URL, however the server would've potentially created multiple resources.
This would be done the same way you communicate links to other resources in the GET case. Use link elements, or whatever your method is, to embed the URL of a resource into a representation.
I am a Rails programmer who is on to his third project now (still new, of course). I am looking for an answer to a general question about RESTful architecture. I am sure I am doing something that already has a well-established answer.
In the RESTful approach we expose resources, but sometimes this approach feels a little user-unfriendly. For example, I can expose a product via a show method, and then I have another resource called sales that I can expose via a product/:id/sales show template to show all sales for a product. But I am taking the user through an extra click here. The ideal would be to show the product and all its associated sales on one page. But that is a violation of the RESTful rule.
I just wanted to ask: are these rules generally broken to make a site user friendly? Being a newcomer, I don't want to adopt ways that are not ideal, so I thought I should ask this question.
Thanks in advance.
Adding in the sales for a particular product would not be breaking any constraints from the RESTful architecture. You have the product ID in the HTTP request so you can just also get the sales for that product. Your separation of concerns should not be affected and you don't need to store a state to get this information. Just extend the model that you return with the view.
It seems like you are more concerned with straying from the convention over configuration that Rails promotes. This extension means that your model will not correlate with only one table in your database, but that is fine. The conventions are meant to reduce the configuration work that you need to do, not restrict your functionality.