HTTP: Is it acceptable to use the COPY method where the Destination header is not a URI?

Background
I'm building an API that allows clients to manipulate geospatial objects. These objects contain a location on the world (in latitude/longitude), and a good bit of metadata. The actual API is rather large, so I present a simplified version here.
Current API
Consider an API with two objects, features and attributes.
The feature endpoint is /api/feature and looks like:
{
id: 5,
name: "My super cool feature",
geometry: {
type: "Point",
coordinates: [
-88.043355345726,
43.293055846667315
]
}
}
The attribute endpoint is /api/attribute. An attribute looks like:
{
id: 3,
feature_id: 5,
name: "attr-name",
value: "value"
}
You can interact with these objects by issuing HTTP requests to their endpoints using different HTTP methods, like you might expect:
GET /api/feature/5 reads the feature with id 5.
PUT /api/feature/5 updates the feature with id 5.
POST /api/feature creates a new feature.
DELETE /api/feature/5 deletes the feature with id 5.
Same goes for attributes.
Attributes are related to features by foreign key (commonly expressed as "features have many attributes").
The Problem
It would be useful to be able to make a copy of a feature and all its metadata (all the attributes that belong to it). The use case is more or less, "I just made this feature and gave it a bunch of attributes, now I want the same thing... but over there." So the only difference between the two features would be their geometries.
Solution #1: Make the client do it.
My first thought was to just have the client do it. Create a new feature with the same name at a new location, then iterate through all the attributes on the source feature, issuing POST requests to make copies of them on the new feature. This, however, suffers from a few problems. First, it isn't atomic. Should the client's Internet connection flake out during this process, you'd be left with an incomplete copy, which is lame. Second, it'd probably be slow, especially for features with many attributes. Anyway, this is a bad idea.
Solution #2: Add copy functionality to the API.
Doing the copy server-side, in a single API call, would be the better approach. This leads me to https://www.rfc-editor.org/rfc/rfc2518#section-8.8 and the COPY method. Being able to do a deep copy of a feature in a single COPY /api/feature/5 request seems ideal.
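To make the server-side idea concrete, here is a minimal sketch of what a deep copy might do behind a single call. It assumes an in-memory store in place of the real database; the names `copy_feature`, `features`, `attributes` and the id generator are hypothetical and not part of the API described above. All copies are built first and committed together, so a failure while building leaves the store untouched.

```python
import copy
import itertools

# Hypothetical in-memory tables standing in for the real database.
features = {5: {"id": 5, "name": "My super cool feature",
                "geometry": {"type": "Point",
                             "coordinates": [-88.043355345726, 43.293055846667315]}}}
attributes = {3: {"id": 3, "feature_id": 5, "name": "attr-name", "value": "value"}}
_next_id = itertools.count(100)  # assumed id generator


def copy_feature(source_id, new_geometry):
    """Deep-copy a feature and its attributes to a new geometry."""
    source = features[source_id]  # raises KeyError if the source is missing
    new_feature = copy.deepcopy(source)
    new_feature["id"] = next(_next_id)
    new_feature["geometry"] = copy.deepcopy(new_geometry)

    # Build every attribute copy before touching the store.
    new_attrs = []
    for attr in attributes.values():
        if attr["feature_id"] == source_id:
            new_attrs.append(dict(attr, id=next(_next_id),
                                  feature_id=new_feature["id"]))

    # Commit step: nothing above mutated shared state.
    features[new_feature["id"]] = new_feature
    for clone in new_attrs:
        attributes[clone["id"]] = clone
    return new_feature
```

With a real database the commit step would presumably be a single transaction; the sketch only shows the shape of the operation.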
The Question
My issue, here, is the semantics of COPY don't quite fit the use I envision for it. Issuing a COPY request on a resource executes a copy of that resource to the destination specified in the Destination header. According to the RFC, Destination must be present, and it must be a URI specifying where the copied resource will end up. In my case, the destination for the copied feature is a geometry, which is decidedly not a URI.
So, my questions are: Would stuffing JSON for the geometry into the Destination header of a COPY request be a perversion of the spec? Is COPY even the right thing to use here? If not, what alternatives are there? I just want to be sure I'm implementing this in the most HTTP-kosher way.

Well, you'll need a way to make the Destination a URI then (why is that a problem?). If you're using the Destination header field for something else, you're not using COPY per spec. (And, BTW, the current specification is RFC 4918.)

Related

Appropriate REST design for data export

What's the most appropriate way in REST to export something as PDF or other document type?
The next example explains my problem:
I have a resource called Banana.
I created all the canonical CRUD REST endpoints for that resource (i.e. GET /bananas; GET /bananas/{id}; POST /bananas/{id}; ...)
Now I need to create an endpoint which downloads a file (PDF, CSV, ..) which contains the representation of all the bananas.
The first thing that came to my mind is GET /bananas/export, but in pure REST, using verbs in the URL should not be allowed. Using a more appropriate HTTP method might be cool, something like EXPORT /bananas, but unfortunately this is not (yet?) possible.
Finally, I thought about using the Accept header on the same GET /bananas endpoint, which, based on the requested media type (application/json, application/pdf, ..), returns the corresponding representation of the data (JSON, PDF, ..), but I'm not sure whether I am misusing the Accept header in this way.
Any ideas?
in pure rest using verbs in url should not be allowed.
REST doesn't care what spelling conventions you use in your resource identifiers.
Example: https://www.merriam-webster.com/dictionary/post
Even though "post" is a verb (and worse, an HTTP method token!) that URI works just like every other resource identifier on the web.
The more interesting question, from a REST perspective, is whether the identifier should be the same that is used in some other context, or different.
REST cares a lot about caching (that's important to making the web "web scale"). In HTTP, caching is primarily about re-using prior responses.
The basic (but incomplete) idea being that we may be able to re-use a response that shares the same target URI.
HTTP also has built into it a general purpose mechanism for invalidating stored responses that is also focused on the target URI.
So here's one part of the riddle you need to think about: when someone sends a POST request to /bananas, should caches throw away the prior responses with the PDF representations?
If the answer is "no", then you need a different target URI. That can be anything that makes sense to you. /pdfs/bananas for example. (How many common path segments are used in the identifiers depends on how much convenience you will realize from relative references and dot segments.)
If the answer is "yes", then you may want to lean into using content negotiation.
In some cases, the answer might be "both" -- which is to say, to have multiple resources (each with its own identifier) that return the same representations.
That's a normal thing to do; we even have a mechanism for describing which resource is "preferred" (see RFC 6596).
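The invalidation question above can be illustrated with a toy cache. This is only a sketch: `TinyCache` is a hypothetical class keyed by target URI alone, whereas real HTTP caches also take freshness, Vary and other things into account. It shows that a successful POST to /bananas discards what is stored for /bananas but leaves /pdfs/bananas alone.

```python
class TinyCache:
    """Toy cache keyed by target URI only (a simplification)."""

    def __init__(self):
        self._stored = {}

    def store(self, uri, response):
        self._stored[uri] = response

    def lookup(self, uri):
        return self._stored.get(uri)

    def on_response(self, method, uri, status):
        # A non-error response to an unsafe method invalidates what is
        # stored for the same target URI.
        unsafe = method not in ("GET", "HEAD", "OPTIONS", "TRACE")
        if unsafe and 200 <= status < 400:
            self._stored.pop(uri, None)


cache = TinyCache()
cache.store("/bananas", "json list")
cache.store("/pdfs/bananas", "pdf bytes")
cache.on_response("POST", "/bananas", 201)
print(cache.lookup("/bananas"), cache.lookup("/pdfs/bananas"))
```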
REST does not care about this, but the HTTP standard does. Using the Accept header to request the expected MIME type is the standard way of doing this, so you did the right thing. There is no need to move it to a separate endpoint if the data is the same and only the format differs.
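As a rough sketch of how a server might pick a representation from the Accept header: the functions `parse_accept` and `choose_media_type` are hypothetical helpers, and the matching is deliberately simplified (it honours q-values and wildcards but ignores the specificity rules and media type parameters that a full implementation would handle).

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        q = 1.0  # default quality when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so ties keep their order in the header
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_media_type(accept_header, available):
    """Return the first available type the client accepts, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for candidate in available:
            if media_type in ("*/*", candidate):
                return candidate
            main, _, sub = media_type.partition("/")
            if sub == "*" and candidate.startswith(main + "/"):
                return candidate
    return None


print(choose_media_type("application/pdf", ["application/json", "application/pdf"]))
```

When nothing matches, a server would typically answer 406 Not Acceptable or fall back to a default representation. Responses selected this way should also carry Vary: Accept so that caches keep the variants apart.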
Media types are the best way to represent this, but there is a practical aspect of this in that people will browse a rest API using root nouns... I'd put some record-count limits on it, maybe GET /bananas/export/100 to get the first 100, and GET /bananas/export/all if they really want all of them.

Creating API - general question about verbs

I decided to move my application to a new level by creating a RESTful API.
I think I understand the general principles, I have read some tutorials.
My model is pretty simple. I have Projects and Tasks.
So to get the lists of Tasks for a Project you call:
GET /project/:id/tasks
to get a single Task:
GET /task/:id
To create a Task in a Project
CREATE /task
payload: { projectId: :id }
To edit a Task
PATCH /task/:taskId
payload: { data to be changed }
etc...
So far, so good.
But now I want to implement an operation that moves a Task from one Project to another.
My first guess was to do:
PATCH /task/:taskId
payload: { projectId: :projectId }
but I do not feel comfortable with revealing the internal structure of my backend to the frontend.
Of course, it is just a convention and has nothing to do with security, but I would feel better with something like:
PATCH /task/:taskId
payload: { newProject: :projectId }
where there is no direct relation between the 'newProject' and the real column in the database.
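One way to keep the payload names independent of the database is an explicit mapping on the server. The following is a sketch only; `PUBLIC_TO_COLUMN`, `apply_task_patch` and the column names are made up for illustration.

```python
# Public payload field -> internal column (both sides are illustrative names).
PUBLIC_TO_COLUMN = {
    "newProject": "project_id",
    "title": "title",
}


def apply_task_patch(task_row, payload):
    """Return an updated copy of task_row; reject fields the API does not expose."""
    unknown = set(payload) - set(PUBLIC_TO_COLUMN)
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    updated = dict(task_row)
    for field, value in payload.items():
        updated[PUBLIC_TO_COLUMN[field]] = value
    return updated


row = {"id": 7, "project_id": 1, "title": "Write docs"}
print(apply_task_patch(row, {"newProject": 2}))
```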
But then, the next operation comes.
I want to copy ALL tasks from Project A to Project B with one API call.
PUT /task
payload: { fromProject: :projectA, toProject: :projectB }
Is it a correct RESTful approach? If not - what is the correct one?
What is missing here is "a second verb".
You can see that we are creating a new task(s) hence: 'PUT' but we also 'copy' which is implied by fromProject and toProject.
Is it a correct RESTful approach? If not - what is the correct one?
To begin, think about how you would do it in a web browser: the world wide web is the reference implementation for the REST architectural style.
One of the first things that you will notice: on the web, we are almost always using POST to make changes to the server. You fill in a form in a browser, submit the form, the browser takes information from the input controls of the form to create the HTTP request body, the server figures out how to do the work that is described.
What we have in HTTP is a standardized semantics for messages that manipulate individual documents ("resources"); doing useful work is a side effect of manipulating documents (see Webber 2011).
The trick of POST is that it is the method whose standardized meaning includes the case where "this method isn't worth standardizing" (see Fielding 2009).
POST /2cc3e500-77d5-4d6d-b3ac-e384fca9fb8d
Content-Type: text/plain
Bob,
Please copy all of the tasks from project A to project B
The request line and headers here are metadata in the transfer of documents over a network domain. That is to say, that's the information we are sharing with the general purpose HTTP application.
The actual underlying business semantics of the changes we are making to documents is not something that the HTTP application cares about -- that's the whole point, after all.
That said - if you are really trying to do manipulation of document hierarchies in general purpose and standardized way, then you should maybe see if your problem is a close match to the WebDAV specifications (RFC 2291, RFC 4918, RFC 3253, etc).
If the constraints described by those documents are acceptable to you, then you may find that a lot of the work has already been done.
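For illustration, a POST handler for such a request might look roughly like the sketch below. It assumes the body is a JSON document with the fromProject and toProject fields from the question rather than free text; `handle_copy_request` and the shape of the task records are hypothetical.

```python
import json


def handle_copy_request(body, tasks):
    """Process a POSTed JSON document asking to copy tasks between projects."""
    command = json.loads(body)
    source, target = command["fromProject"], command["toProject"]
    next_id = max((t["id"] for t in tasks), default=0) + 1
    created = []
    # Snapshot the matching tasks first so new copies are not re-copied.
    for task in [t for t in tasks if t["projectId"] == source]:
        created.append(dict(task, id=next_id, projectId=target))
        next_id += 1
    tasks.extend(created)
    return {"status": 200, "created": [t["id"] for t in created]}
```

HTTP only sees a POST with a document attached; what "copy" means is entirely up to the resource that processes it.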

What HTTP method should I use for an endpoint that updates a status field of multiple entities

I like to use the correct HTTP methods when I'm creating an API. And usually it's very straightforward. POST for creating entities, PUT for updating them, GET for retrieving etc.
But I have a use-case here where I will create an endpoint that updates the status of multiple objects given 1 identifier.
e.g.:
/api/v1/entity/update-status
But note that I mentioned multiple objects. My team's initial thought was to map it as POST, but it won't actually be creating anything. Plus, if you were to call the same endpoint multiple times with the same identifier, nothing would change after the first time, which makes it idempotent.
With this in mind, my idea was to create it as a PUT or even PATCH endpoint.
What do you smart people think?
I imagine PATCH would be the most correct way. Although if you use a PUT it would also not be incorrect.
The difference between the PUT and PATCH requests is reflected in the
way the server processes the enclosed entity to modify the resource
identified by the Request-URI. In a PUT request, the enclosed entity
is considered to be a modified version of the resource stored on the
origin server, and the client is requesting that the stored version be
replaced. With PATCH, however, the enclosed entity contains a set of
instructions describing how a resource currently residing on the
origin server should be modified to produce a new version. The PATCH
method affects the resource identified by the Request-URI, and it also
MAY have side effects on other resources; i.e., new resources may be
created, or existing ones modified, by the application of a PATCH.
Whilst it is a convention in REST APIs that POST is used to create a resource it doesn't necessarily have to be constrained to this purpose.
Referring back to the definition of POST in RFC 7231:
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others):
Providing a block of data, such as the fields entered into an HTML form, to a data-handling process;
Posting a message to a bulletin board, newsgroup, mailing list, blog, or similar group of articles;
Creating a new resource that has yet to be identified by the origin server; and
Appending data to a resource's existing representation(s).
Clearly creation is only one of those purposes and updating existing resources is also legitimate.
The PUT operation is not appropriate for your intended operation because again, per RFC, a PUT is supposed to replace the content of the target resource (URL). The same also applies to PATCH but, since it is intended for partial updates of the target resource you can target it to the URL of the collection.
So I think your viable options are:
POST /api/v1/entity/update-status
PATCH /api/v1/entity
Personally, I would choose to go with the PATCH as I find it semantically more pleasing but the POST is not wrong. Using PATCH doesn't gain you anything in terms of communicating an idempotency guarantee to a consumer. Per RFC 5789: "PATCH is neither safe nor idempotent" which is the same as POST.
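Whichever method is chosen, the idempotency described in the question is a property of the server-side operation itself, not something the method name guarantees. A small sketch, where `update_status` and the `group_id` field are assumptions made for the example:

```python
def update_status(entities, group_id, new_status):
    """Set the status of every entity in the group; return how many actually changed."""
    changed = 0
    for entity in entities:
        if entity["group_id"] == group_id and entity["status"] != new_status:
            entity["status"] = new_status
            changed += 1
    return changed


entities = [
    {"id": 1, "group_id": "g1", "status": "open"},
    {"id": 2, "group_id": "g1", "status": "open"},
    {"id": 3, "group_id": "g2", "status": "open"},
]
print(update_status(entities, "g1", "closed"))  # first call changes the group
print(update_status(entities, "g1", "closed"))  # repeat call leaves state as-is
```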

REST API Design: Can I use parameters to change the response structure?

I am trying to design a REST API that will return detailed information about an identifier that I pass in. For the sake of an example, I am passing in an identifier and returning information about a specific vehicle. The problem that I am facing is that there are many different kinds of vehicles, each with different unique properties. I am wondering if there is a way so that I can only return the relevant details with the REST API.
Currently I plan on having one endpoint /vehicles and passing in the identifier as a parameter.
My current request will consist of something like this GET /vehicles?id=123456
My current response structure will be something like this:
{
"vehicleDetails" : {
"color": "someColor",
"make": "someMake",
"model": "someModel",
"year": "someYear",
"carDetails": {
// some unique car fields
},
"motorcycleDetails" : {
// some unique motorcycle fields
},
"boatDetails" : {
// some unique boat fields
}
}
}
As you can see, there are some fields that are common to all vehicle types, but there are also fields that are unique to a certain type of vehicle, for example boatDetails. As far as I understand, I will have to return the entire resource which will have many empty fields. For example, when I request information about a car, I will still have boat and motorcycle details returned as part of the JSON response, even though they will all be empty. My concern with this is that the response payloads will be rather large when only a small subset of the fields will actually be used by the consumer. Would it make sense to add another parameter to filter the fields that come back? Something like /vehicles?id=123456&type=Car? Then in my code I could manipulate the response structure based on the type parameter? I feel that this violates REST best practices. Any advice into how I could change the design of my API would be appreciated.
Note: I cannot use GraphQL for this and would appreciate input about how I can improve this REST API design
Sure, query parameters (as well as matrix and path parameters) are fine from a REST standpoint. You'll end up with a unique URI that identifies a resource. Responses will be cacheable regardless of what type of parameters you use. It is questionable, though, whether exposing the parameter as a query parameter has any advantage over exposing it directly as a path parameter, i.e. /vehicles/12345 in that case.
What IMO is more important than the actual form of the URI is the representation format you return. The proposed response format bears the potential of being treated as a typed resource instead of utilizing a general, multi-purpose format and proper content-type negotiation. Think of HTML, for example: it defines the syntax and semantics of certain hypermedia controls you can use, and any arbitrary Web client will be able to present it and operate on the response format. With custom formats, however, you will miss out on that feature.
If you only have one client, that moreover is under your control, the additional overhead you need to put in is probably not worth aiming for a REST environment in general, as it is easier to change the client once the server stuff changed. Though if you aim for a long-living environment that may be utilized by clients not under your control, this is for sure a thing to consider.
Can I use parameters to change the response structure?
Yes.
Example: https://www.merriam-webster.com/dictionary/JPEG
Notice that -- even though the URI says JPEG, the server is actually returning a text/html document, and the browser knows that from the Content-Type header in the HTTP response.
Identifiers are semantically opaque, in the same way that variable names are. We tend to choose spellings that make things easy for humans, but the machines don't care what spelling conventions we use.
So there's no constraint that says that /vehicles?id=1 and /vehicles?id=2 have to have the same "structure".
So you could have application/prs.rbanders-car, application/prs.rbanders-boat and application/prs.rbanders-rocketship, each with its own specific processing rules.
More likely, though, you'll want to piggyback on some other more common representation; so it's common to see a structured syntax suffix: application/prs.rbanders-car+json, etc. Effectively, you are promising a JSON document, but with a more specific schema.
Of course, there's nothing stopping you from creating an application/prs.rbanders-vehicle+json schema, and then describing some fields as optional that will only appear if some condition is met.
Different options, different trade offs.
I feel that this violates REST best practices.
Not really - the important ideas are to handle metadata in a standard way (so that general purpose components can understand what is going on) and to use common formats where it makes sense to do so (to leverage the libraries that are already available).
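As a sketch of the "one media type per kind of vehicle" option: the function below returns only the common fields plus the detail block that matches the vehicle's type, together with a type-specific media type. `build_vehicle_response` and the shape of the input record (`type`, `details`) are assumptions for the example; the media type names follow the ones used above.

```python
def build_vehicle_response(vehicle):
    """Return (media_type, body) with common fields plus only the matching detail block."""
    kind = vehicle["type"]  # e.g. "car", "motorcycle", "boat"
    body = {
        "vehicleDetails": {
            "color": vehicle["color"],
            "make": vehicle["make"],
            "model": vehicle["model"],
            "year": vehicle["year"],
            f"{kind}Details": vehicle.get("details", {}),
        }
    }
    media_type = f"application/prs.rbanders-{kind}+json"
    return media_type, body


boat = {"type": "boat", "color": "white", "make": "someMake",
        "model": "someModel", "year": "someYear", "details": {"hullLength": 7}}
print(build_vehicle_response(boat))
```

The media type would go into the Content-Type header of the response, so a client can tell which schema it received without inspecting the body.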

Efficiency of RESTful APIs

I'm currently creating an application (let's say, notes app, for instance - webapplication + mobile app). I wanted to use RESTful API here, so I read a lot about this topic and I found out there's a lot of ambiguity over there.
So let's start at the beginning. And the beginning in REST is that we have to first request the / (root), then it returns a list of paths we can further retrieve, etc. Isn't this the first part where REST is completely wasteful? Instead of rigid paths, we have to obtain them each time we want to do something. Nah.
The second problem I encountered was bulk operations. How to implement them in REST? Let's say the user didn't have access to the internet for a while, made a few changes, and they all have to be made on the server as well. So, let's say the user modified 50 notes, added 30 and removed 20. We have to make 100 separate requests now. A way to make bulk operations would be very helpful. I saw this Stack Overflow topic: Patterns for handling batch operations in REST web services? but I didn't find anything interesting there, actually. Everything is okay as long as you want to do one type of operation on one type of resource.
Last, but not least - retrieving whole collection of items. When writing an example app I mentioned - notes app - you probably want to retrieve all collection of items (notes, tags, available notes colors, etc...) at once. With REST, you have to first retrieve list of item links, then fetch the items one by one. 100 notes = over 100 requests.
Since I'm currently learning all this REST stuff, I may be completely wrong at what I said here. Anyway, the more I read about it, the more gruesome it looks like for me. So my question finally is: where am I wrong and/or how to solve problems I mentioned?
It's all about resources. Resources that are obtained through a uniform interface (usually via URI and HTTP methods).
You do not have to navigate through root every time. A good interface keeps its URIs alive forever (if they go stale, they should return HTTP Moved or similar). Root offering pathways to navigate is a part of HATEOAS, one of the architectural elements of REST defined by Roy Fielding.
Bulk operations are a thing the architectural style is not strong on. Basically, nothing stops you from POSTing a payload containing multiple items to a specific resource. Again, it's all up to what resource you are using/offering and, ultimately, how your server implementation handles requests. Your case of 100 requests: I would probably stick with 100 requests. It is clean to write and the overhead is not that huge.
Retrieving a collection: it's about what resources the API decides to offer. GET bookCollection/ vs GET book/1, GET book/2 ... or even GET book/all. Or maybe GET book/?includeDetails=true to return all books with the same detail as GETting them one by one.
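If you do decide to POST a payload containing multiple items, the server side could be sketched like this. The operation format (`action`, `id`, `data`) and the function `apply_batch` are invented for the example; there is no standard batch format implied here.

```python
def apply_batch(notes, operations):
    """Apply create/update/delete operations to a dict of notes keyed by id.

    Works on a copy and returns it, so a bad operation leaves `notes` untouched.
    """
    result = {note_id: dict(note) for note_id, note in notes.items()}
    for op in operations:
        action, note_id = op["action"], op["id"]
        if action == "create":
            if note_id in result:
                raise ValueError(f"note {note_id} already exists")
            result[note_id] = dict(op["data"])
        elif action == "update":
            result[note_id].update(op["data"])  # KeyError if the note is missing
        elif action == "delete":
            del result[note_id]  # KeyError if the note is missing
        else:
            raise ValueError(f"unknown action: {action}")
    return result


notes = {1: {"text": "a"}, 2: {"text": "b"}}
ops = [
    {"action": "update", "id": 1, "data": {"text": "a2"}},
    {"action": "create", "id": 3, "data": {"text": "c"}},
    {"action": "delete", "id": 2},
]
print(apply_batch(notes, ops))
```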
I think that this link could give you interesting hints to design a RESTful service: https://templth.wordpress.com/2014/12/15/designing-a-web-api/.
That said, here are my answers to your questions:
There is no need to implement a resource for the root path. With this, I think that you referred to HATEOAS. In addition, links within the payload are not required either. Instead, you can use available formats like Swagger or RAML to document your RESTful service. This shows your end users what is available.
Regarding bulk operations, you are free to use methods POST or PATCH to implement this on the list resource. I think that these two answers could be helpful to you:
REST API - Bulk Create or Update in single request
How to Update a REST Resource Collection
In fact, you are free to choose the content you return from your GET methods. This means the root element managed by the resources (list resource and element resource) can contain its own hints and also those of its dependencies. So you can have several levels in the returned content. For example, you can have something like this for an element Book that references a list of Author:
GET /books
{
"title": "the title",
(...)
"authors": [
{
"firstName": "first name",
"lastName": "last name"
}, {
(...)
},
(...)
]
}
Note that you can use query parameters to ask the RESTful service for the level of detail you expect. For example, if you want only book hints, or book hints with the corresponding authors:
GET /books
{
"title": "the title",
(...)
}
GET /books?include=authors
{
"title": "the title",
(...)
"authors": [
{
"firstName": "first name",
"lastName": "last name"
}, {
(...)
},
(...)
]
}
Note that there are two distinct concepts here:
The inner data (complex types, inner objects): data that are specific to the element and are embedded in the element itself
The referenced data: data that reference and correspond to other elements. In this case, you can have a link or the data itself embedded in the root element.
The OData specification addresses this issue with its "navigation links" feature and its query parameter expand. See the following links for more details:
http://www.odata.org/getting-started/basic-tutorial/ - Section "System Query Option $expand"
https://msdn.microsoft.com/en-us/library/ff478141.aspx - Section "Query and navigation"
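To illustrate the include query parameter from the examples above, here is a rough sketch of the server-side logic. The function `serialize_book` is hypothetical, and accepting a comma-separated list of includes is an assumption of the sketch, not something the examples require.

```python
from urllib.parse import urlparse, parse_qs


def serialize_book(book, url):
    """Return the book representation, embedding authors only when ?include=authors is set."""
    query = parse_qs(urlparse(url).query)
    includes = set()
    for value in query.get("include", []):
        includes.update(part.strip() for part in value.split(","))
    body = {"title": book["title"]}
    if "authors" in includes:
        body["authors"] = [dict(author) for author in book["authors"]]
    return body


book = {"title": "the title",
        "authors": [{"firstName": "first name", "lastName": "last name"}]}
print(serialize_book(book, "/books"))
print(serialize_book(book, "/books?include=authors"))
```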
Hope it helps you,
Thierry