REST API design for cloning a resource

I am writing a YAML document using Swagger to design a RESTful API method for cloning a resource. I have a few options and don't know which would be best. Can someone please advise?
Options:
1. Relinquishing responsibility for cloning the resource to the consumer, who would assign values from the source object to a new object and then create it. The process would consist of two requests to the API: a GET against the resource for the source object, then a POST to that resource to create the new one. This feels like the consumer has too much responsibility.
2. Using the WebDAV HTTP extensions, which provide a COPY method. It would appear that this is exactly what I'd like for cloning. However, I would like to stick to the standard methods as much as possible.
3. POSTing to /{resource}?resourceIdToClone={id}, where resourceIdToClone is an optional parameter. This would conflict with an API path that I already have for creating the resource, where I add a schema to the POST body. It would mean using a POST to /{resource}/ for both creating and cloning, and that would violate SRP.
4. Adding a new resource called 'CloneableResource' and performing a POST to /CloneableResource/{resource_type}/{resource_source_id}. For the example of cloning a sheep, you'd POST to /CloneableResource/Sheep/10. This way it's possible to stick to the standard HTTP methods and there's no conflict with any other resource path (or SRP violation). However, I would be adding a new and potentially superfluous type to the domain. I also can't think of a scenario where a consumer would want to perform anything other than a POST to this resource, so it seems like a code smell to me.
5. A GET against /resource/{id}?method=clone. One advantage here is that no additional resource is required and the behaviour is selected by a simple optional querystring parameter. I'm aware of the risk that it's dangerous to provide POST- or DELETE-like capabilities through a GET, since a URL in a web page may be crawled by a search engine.
Thanks for any help!

Most of these options are perfectly good choices. A lot of it is just a style choice in the end. Here are my comments on each of your options.
Relinquishing the responsibility of cloning the resource object to the consumer
In general I don't really have a problem with this solution. It is very straightforward for a user to understand and implement, and it might be better than coming up with some proprietary cloning functionality that your users have to learn how to use.
Using the WebDAV HTTP extensions which provides a COPY method
I like to stick to the standard methods as well. I would not use COPY, but I wouldn't be appalled if you did.
POSTing to /{resource}?resourceIdToClone={id}
This is a perfectly good solution. From a REST standpoint, you don't really have a conflict with the rest of your API. The URI with a query parameter identifies a different resource than the URI without the query parameter. Query parameters are a URI feature for identifying resources that cannot be referenced hierarchically. However, it might be difficult to separate these in your code because of the way most REST frameworks work. You could do something similar to this, except with a hierarchical URI such as /{resource}/clone. You would POST to this URI and pass the resource_source_id in the body.
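For what it's worth, here is a minimal sketch of that hierarchical variant as a Spring MVC controller, using the sheep example; Sheep, SheepRepository, and copyWithoutId are all invented names for illustration, not anything your API must use:

import java.net.URI;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.server.ResponseStatusException;

// Hypothetical request body carrying the id of the resource to clone.
record CloneRequest(long sourceId) {}

@RestController
@RequestMapping("/sheep")
class SheepCloneController {

    private final SheepRepository repository; // assumed Spring Data repository

    SheepCloneController(SheepRepository repository) {
        this.repository = repository;
    }

    // POST /sheep/clone -- the source id travels in the body, so it doesn't
    // collide with the plain "create" POST to /sheep.
    @PostMapping("/clone")
    ResponseEntity<Sheep> clone(@RequestBody CloneRequest request) {
        Sheep source = repository.findById(request.sourceId())
                .orElseThrow(() -> new ResponseStatusException(HttpStatus.NOT_FOUND));
        Sheep copy = repository.save(source.copyWithoutId()); // new identity, same data
        return ResponseEntity.created(URI.create("/sheep/" + copy.getId())).body(copy);
    }
}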
Adding a new resource called 'CloneableResource' and performing a POST to /CloneableResource/{resource_type}/{resource_source_id}
There is nothing wrong with this approach from a REST standpoint, but I think adding a new type is both unnecessary and clutters the API. However, I disagree with your intuition that there could be a problem with having a resource that supports only a POST operation. It happens; in the real world, not everything fits nicely into GET, PUT, or DELETE.
A GET against /resource/{id}?method=clone
This is the only one of the five options that I cannot condone. It seems from your description that you already understand why this is a bad idea, so I'm not sure why you are considering it. However, all you have to do to make this a good solution is change the GET to a POST; it then becomes very similar to option 3. The URI could also be hierarchical instead of using a query parameter: POST /resource/{id}/clone would work just as well.
I hope this was helpful. Good luck with your decision.

If you want to COPY a resource, then, yes, COPY is an obvious choice.
(and yes, it would be good to pull the definitions of COPY and MOVE out of RFC 4918 to untangle them from WebDAV).

Influenced by project requirements and the range of preferences among the members of my team, option 1 will serve us best at this stage.
Conforming to the standard HTTP methods will simplify and clarify my API.
There will be a single, consistent approach to cloning a resource. This outweighs the issue I have with delegating the cloning work to the consumer.


Why does HATEOAS not specify a schema for the request body

A question about this already exists, but it is more tech-focused and doesn't have answers: Representing a request body on HATEOAS link
I like HATEOAS. I love using it in my frontend to check if I can perform some actions by checking if a link exists instead of having business logic.
But what I do not understand is how HATEOAS can truly be useful in other scenarios. What if you have an "AddItemToBasket" link which needs a request body with some properties in it? The frontend would still need to know what this request body looks like, but HATEOAS doesn't tell you this.
This means you still have a dependency on API knowledge. I think lots of applications solve this problem with generated API clients/graphql, but that makes HATEOAS a hard sell.
Why use HATEOAS if we can't rely on just the URL and HTTP method? It doesn't offer the full picture.
REST builds on standards (the uniform interface constraint), and currently there is no standard way to do this. There is a Hydra W3C Community Group writing a standard for describing hypermedia APIs. They use RDF and standard vocabularies like schema.org, and you can write your API-specific vocabulary, which they call documentation. As far as I understand their model, you can give parameters in the documentation for operations represented by hyperlinks, and you can use, for example, XSD to add constraints (numbers, etc.) to the parameters. Writing this kind of formal documentation takes a lot more effort than usual, and as far as I know there are currently no general REST clients that could profit from it, so it does not make much sense to write such an API today, but it is possible if you want to.
As for why to use HATEOAS: it makes your API flexible and backward compatible. For example, if somebody does not have permission for an operation, you simply don't send the hyperlink for it in the response. You can always add new operations, and existing clients don't have to support them; they can focus on what they already know, and they won't break because something extra was added. They don't have to know about the URI structures and the methods, which can freely change if the only thing they depend on is the operation type and the parameters.
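To make the permission example concrete, here is a rough sketch using Spring HATEOAS; Basket, BasketController, User, and the canModify check are invented for the example, not part of any real API:

import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.linkTo;
import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.methodOn;
import org.springframework.hateoas.EntityModel;

class BasketModelAssembler {

    // Only advertise the "addItem" operation when the caller is allowed to use
    // it; clients check for the link instead of re-implementing the rule.
    EntityModel<Basket> toModel(Basket basket, User caller) {
        EntityModel<Basket> model = EntityModel.of(basket,
                linkTo(methodOn(BasketController.class).get(basket.getId())).withSelfRel());
        if (caller.canModify(basket)) { // invented permission check
            model.add(linkTo(methodOn(BasketController.class)
                    .addItem(basket.getId(), null)).withRel("addItem"));
        }
        return model;
    }
}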

DDD + CQRS pattern with multiple inherited Aggregate Roots

Disclaimer: I know that DDD/CQRS might not be a perfect choice for the use case below, but it has worked great so far for other tasks in the project and I want to stick with it for learning purposes (e.g. learn the hard way where not to use it ;).
Let's assume I'm building a Blogging Platform and have 3 types of posts: LifestylePost, RecipePost and ReviewPost. They have the same abstract base class PostBase which is an Aggregate Root and all of them share some properties like Author or implement methods like Publish, Delete etc. which change their Status and verify whether the state is valid. Each created post starts as a Draft which barely requires any validation, but switching to Published verifies nearly all properties.
Implementing commands/handlers like PublishPost or DeletePost was a breeze, but the problem starts when I'm thinking about CreatePost or UpdatePost. These are some dilemmas I'm currently facing:
Create/update commands/handlers for each post types or one for all
Having them separate for each post type gives me more control over the logic and allows me to tailor each command model precisely, but it makes my API more complex for the client. With the second option, I can resolve the post type from some discriminator in the command and check whether all the properties necessary for that particular type were provided.
Create empty post draft first or already filled with the data
When a user creates a new post (draft with an initial state) I can already call the API and add an empty post to the database, which would then be updated or I can wait until user inputs any data and clicks save. It's basically a matter of the arguments in the constructors of the posts.
Handling post updates in general
This is the point where I'm having the biggest issues now. Since there are quite a few properties that may or may not be changed by the user, it's hard to split them into particular methods on the Aggregate Root, as opposed to just one Update method with a huge number of nullable arguments, where null means the property was not provided by the client and therefore not changed. Also, adding the Status property here would mean that the proper method for validating the state would have to be resolved. This solution somehow doesn't feel like proper DDD design...
Which decisions would you make in each points and why?
Create/update commands/handlers for each post types or one for all
Depends on whether the difference in post types has influence over the API or not. If your clients have differentiated interactions with your API depending on the post type, then I would have specific commands for each type. If your API is post-type agnostic, then this is an internal business concern. You should then ensure that the domain implementation really is sufficiently different from type to type, because a polymorphic domain is a lot more work than a simple "category" property.
Create empty post draft first or already filled with the data
Depends on whether knowing that a user has started working on a draft has any value for your system. If it's a personal blog, probably not. If you're writing software for a newspaper, it could be valuable to someone. Ask your domain experts.
Handling post updates in general
This is an interface design question rather than a business logic one, in my opinion. If your users want to make lots of small changes, you should consider providing PATCH routes in your API, following a standard like JSON Patch. Depending on your implementation technology, you may benefit from libraries that do the object patching for you, saving a lot of code.
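A tiny sketch of what that can look like with the standard jakarta.json JSON Patch API (the post fields here are invented):

import jakarta.json.Json;
import jakarta.json.JsonObject;
import jakarta.json.JsonPatch;

class PatchDemo {
    public static void main(String[] args) {
        // A post as it currently stands.
        JsonObject post = Json.createObjectBuilder()
                .add("title", "Draft title")
                .add("status", "Draft")
                .build();

        // Two small, explicit edits instead of one giant Update command.
        JsonPatch patch = Json.createPatchBuilder()
                .replace("/title", "Final title")
                .replace("/status", "Published")
                .build();

        JsonObject updated = patch.apply(post);
        System.out.println(updated); // {"title":"Final title","status":"Published"}
    }
}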
Is there really a difference in behaviour between the post types? Don't they all have Draft(), Publish(), Cancel() behaviours?
Given that inheritance means X is a Y, you're basically saying they are all the same. This feels to me like a single Post aggregate with the possibility of a "PostType" value somewhere that might be part of an invariant (e.g. if you had a business rule that says "Review posts cannot be published until a cooling-off period has elapsed").
This would mean a single set of application services to invoke those methods (and validate the invariants they implement).
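As a rough sketch of that shape (the names and the 24-hour rule are invented for illustration):

import java.time.Duration;
import java.time.Instant;

enum PostType { LIFESTYLE, RECIPE, REVIEW }
enum PostStatus { DRAFT, PUBLISHED, DELETED }

// One Post aggregate; the type is plain state that participates in invariants
// rather than a reason for an inheritance hierarchy.
class Post {
    private final PostType type;
    private final Instant createdAt;
    private PostStatus status = PostStatus.DRAFT;

    Post(PostType type, Instant createdAt) {
        this.type = type;
        this.createdAt = createdAt;
    }

    void publish(Instant now) {
        // Invented rule: review posts wait out a 24h cooling-off period.
        if (type == PostType.REVIEW
                && Duration.between(createdAt, now).compareTo(Duration.ofHours(24)) < 0) {
            throw new IllegalStateException("Review posts need a cooling-off period");
        }
        status = PostStatus.PUBLISHED;
    }

    PostStatus status() { return status; }
}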

In the Diode library for scalajs, what is the distinction between an Action, AsyncAction, and PotAction, and which is appropriate for authentication?

In the Scala and Scala.js library Diode, I have used but not entirely understood the PotAction class, and I only recently discovered the AsyncAction class; both seem to be favored in situations involving, well, asynchronous requests. While I understand that, I don't entirely understand the design decisions and the naming choices, which seem to suggest a narrower use case.
Specifically, both AsyncAction and PotAction require an initialModel and a next, as though both are modeling an asynchronous request for some kind of refreshable, updateable content rather than a command in the sense of CQRS. I have a somewhat-related question open regarding synchronous actions on form inputs by the way.
I have a few specific use cases in mind. I'd like to know a sketch (not asking for implementation, just the concept) of how you use something like PotAction in conjunction with any of:
Username/password authentication in a conventional flow
OpenAuth-style authentication with a third-party involved and a redirect
Token or cookie authentication behind the scenes
Server-side validation of form inputs
Submission of a command for a remote shell
All of these seem to be a bit different in nature from what I've seen using PotAction, but I really want to use it because it has already been helpful when I am, say, rendering something based on the current state of the Pot.
Historically speaking, PotAction came first and then at a later time AsyncAction was generalized out of it (to support PotMap and PotVector), which may explain their relationship a bit. Both provide abstraction and state handling for processing async actions that retrieve remote data. So they were created for a very specific (and common) use case.
I wouldn't, however, use them for authentication, as that is typically something you do even before your application is loaded or any data is requested from the server.
Form validation is usually a synchronous thing; you don't do it in the background while the user is doing something else, so again Async/PotAction are not a very good match, nor do they provide much added value.
Finally for the remote command use case PotAction might be a good fit, assuming you want to show the results of the command to the user when they are ready. Perhaps PotStream would be even better, depending on whether the command is producing a steady stream of data or just a single message.
In most cases you should use the various Pot structures for what they were meant for, that is, fetching and updating remote data, and maybe apply some of the ideas or internal models (such as the retry mechanism) to other request types.
All the Pot stuff was separated from Diode core into its own module to emphasize that they are just convenient helpers for working with Diode. Developers should feel free to create their own helpers (and contribute back to Diode!) for new use cases.

Spring Data Rest Without HATEOAS

I really like all the boilerplate code Spring Data Rest writes for you, but I'd rather have just a 'regular?' REST server without all the HATEOAS stuff. The main reason is that I use Dojo Toolkit on the client side, and all of its widgets and stores are set up such that the JSON returned is just a straight array of items, without all the links and things like that. Does anyone know how to configure this with Java config so that I get all the MVC code written for me, but without all the HATEOAS stuff?
If, after reading Oliver's comment (which I agree with), you still want to remove HATEOAS from Spring Boot:
Add this above the declaration of the class containing your main method:
@SpringBootApplication(exclude = RepositoryRestMvcAutoConfiguration.class)
As pointed out by Zack in the comments, you also need to create a controller which exposes the required REST methods (findAll, save, findById, etc).
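A minimal sketch of such a controller on top of a Spring Data repository (Company and CompanyRepository are placeholder names, assuming CompanyRepository extends JpaRepository<Company, Long>):

import java.util.List;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Plain JSON over HTTP, no hypermedia: responses are straight arrays/objects,
// which is what stores like Dojo's expect.
@RestController
@RequestMapping("/companies")
class CompanyController {

    private final CompanyRepository repository;

    CompanyController(CompanyRepository repository) {
        this.repository = repository;
    }

    @GetMapping
    List<Company> findAll() {
        return repository.findAll();
    }

    @GetMapping("/{id}")
    Company findById(@PathVariable long id) {
        return repository.findById(id).orElseThrow();
    }

    @PostMapping
    Company save(@RequestBody Company company) {
        return repository.save(company);
    }
}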
So you want REST without the things that make up REST? :) I think trying to alter (read: dumb down) a RESTful server to satisfy a poorly designed client library is a bad start to begin with. But here's the rationale for why hypermedia elements are necessary for this kind of tooling (besides the probably familiar general rationale).
Exposing domain objects to the web has always been viewed critically by most of the REST community, mostly because the boundaries of a domain object are not necessarily the boundaries you want to give your resources. However, frameworks providing scaffolding functionality (Rails, Grails etc.) have become hugely popular in the last couple of years. So Spring Data REST is trying to address that space while at the same time being a good citizen in terms of RESTfulness.
So if you start with a plain data model in the first place (objects without too many relationships) and only want to read them, there's in fact no need for something like Spring Data REST. The Spring controller you need to write is roughly 10 lines of code on top of a Spring Data repository. When things get more challenging, the story becomes more interesting:
How do you write a client without hard-coding URIs (if it does, it isn't particularly RESTful)?
How do you handle relationships between resources? How do you let clients create them, update them etc.?
How does the client discover which query resources are available? How does it find out about the parameters to pass etc.?
If your answer to these questions is "My client doesn't need that / is not capable of doing that", then Spring Data REST is probably the wrong library to begin with. What you're basically building is JSON over HTTP, but nothing really RESTful. That is totally fine if it serves your purpose, but shoehorning a library with clear design constraints into something arbitrarily different (albeit apparently similar) that effectively wants to ignore exactly those design aspects is the wrong approach in the first place.

What are the benefits of routers when the URI can be parsed dynamically?

I'm trying to make an architectural decision and I'm worried that I'm missing something big about URL routing / mapping when it comes to designing a basic REST API / framework.
Creating routing classes, as typically seen in REST API frameworks, that require one to manually map a URL to a class and a class method (action) kind of seems like a failure to encapsulate the problem, when it could all be determined by parsing the URL dynamically and having an automatic router or front controller.
GET https://api.example.com/companies/
Collection resource that gets a list of all companies.
GET https://api.example.com/companies/1
Fetches a single company by ID.
All of which seems to follow the template: https://api.example.com/<controller>/<parameter>/
Benefit 1: URL Decoupling and Abstraction
I assume one of the on-paper benefits of having a typical routing class is that you can decouple or abstract a URL from a resource / physical class. So you could have an arbitrary URL like GET https://api.example.com/poo/, rather than GET https://api.example.com/companies/, fetch all the companies if you felt like it.
But in almost every example and use-case I've seen, the desire is to have a URL that matches the desired controller, action and parameters, 1 : 1.
Another possible benefit is that collection resources within a resource, or nested resources, might be easier to achieve with URL mapping and typical routers. For example:
GET https://api.example.com/companies/1/users/
OR
GET https://api.example.com/companies/1/users/1/
It could be quite challenging to come up with a paradigm that can dynamically parse this and know which controller to call to get the data, what parameters to use, and where to use them. But I think I have come up with a standard way that could make this work dynamically.
Whereas manually mapping this would be easy.
I could just re-route GET https://api.example.com/companies/1/users/ to the users controller instead of the companies controller, bypassing it, and simply set the parameter "1" as the company id for the WHERE clause.
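To make that concrete, here is a toy sketch of the convention I have in mind (nothing framework-specific; all names are invented): path segments are read in pairs, the last controller segment wins, and earlier pairs become WHERE-clause filters.

import java.util.LinkedHashMap;
import java.util.Map;

// Toy dynamic router: /companies/1/users/ routes to the "users" controller
// with the filter companies_id = 1; /companies/1 routes to "companies", id 1.
class DynamicRouter {

    record RouteResult(String controller, String id, Map<String, String> filters) {}

    RouteResult route(String path) {
        String[] segments = path.replaceAll("^/+|/+$", "").split("/");
        String controller = null;
        String id = null;
        Map<String, String> filters = new LinkedHashMap<>();

        for (int i = 0; i < segments.length; i += 2) {
            String name = segments[i];
            String param = (i + 1 < segments.length) ? segments[i + 1] : null;
            if (i + 2 >= segments.length) { // last pair: the target controller
                controller = name;
                id = param;
            } else {                        // earlier pairs become filters
                filters.put(name + "_id", param);
            }
        }
        return new RouteResult(controller, id, filters);
    }
}

So route("/companies/1/users/") hands control to the users controller with the filter companies_id = 1, which is exactly the bypass described above.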
Benefit 1.1: No Ties to Physical Paths
An addendum to benefit 1 is that a developer could completely change the URL scheme and folder structure without affecting the API, because everything is mapped abstractly. If I choose to move files, folders, or classes, or rename them, it should just be a matter of changing the mapping / routing.
But I still don't really get this benefit either, because even if you had to move your entire API to another location, a trivial change in .htaccess would fix this immediately.
So this:
GET https://api.example.com/companies/
TO
GET https://api.example.com/v1/companies/
Would not impact code, even in the slightest. Even with a dynamic router.
Benefit 2: Control Over What Functionality is Exposed
Another benefit I imagine a typical router class gives you, over a dynamic router that just interprets and parses the URL, is control over exactly what functionality you want to expose to the API consumer. If you just do everything dynamically, you're kind of dropping your pants, automatically giving your consumer access to the entire system.
I see this as a possible benefit of the dynamic router, as you then wouldn't have to manually define and map all your routes to resources: it's all there, automatically. To solve the exposure problem, I would probably do the opposite and define a blacklist of the functionality the API consumer shouldn't be allowed to use. It might be more time-effective to define a blacklist than to define each and every usable resource with mapping. Then again, it's riskier too, I suppose. You could even do a whitelist... which is similar to a typical router, but you wouldn't need any extended logic at all. It would just be a list of URLs that the system checks before passing the URL to the dynamic router. Or it could just be a private property of the dynamic router class.
Benefit 3: When HTTP Methods Don't Quite Fit the Bill
One case where I see typical routers shining is where you need to execute an action that conflicts with an existing resource. Let me explain.
Say you want to authenticate a user by running the login function within your user class. But you can't execute POST https://api.example.com/users/ with credentials, because that is reserved for adding a new user. Instead, you need to somehow run the login method in your user class. You don't want to use POST https://api.example.com/users/login/ either, because then you're using verbs other than the HTTP methods. However, with a typical router, you can just map this directly, as said before. Easy:
url => "https://api.example.com/tenant/"
Controller => "users"
Action => "login"
Params => "api_key, api_secret"
But, once again, I see a plausible alternative. I could just create another controller, called login or tenant, that instantiates my user controller and runs the login function. So a consumer could just POST https://api.example.com/tenant/ with credentials and, blam, authentication.
Although, to get this alternative to work, I would have to physically create another controller, whereas with a URL mapper I wouldn't need to. But this separation of concerns, functionality and resources is quite nice too. Maybe that's the main trade-off: would you rather just define a URL route, or have to create new classes for each nuance you encounter?
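For comparison, the router side of that trade-off is just one more entry in a mapping table. A sketch (the Route record and both entries are invented):

import java.util.Map;

class RouteTable {

    record Route(String controller, String action) {}

    // Explicit mappings: "login" doesn't fit a resource/HTTP-method pair,
    // so it gets wired up by hand instead of via a new controller class.
    static final Map<String, Route> ROUTES = Map.of(
            "POST /companies", new Route("companies", "create"),
            "POST /tenant", new Route("users", "login"));
}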
What am I not seeing, or understanding? Am I missing a core concept here and just ignorant? Are there more benefits to having typical URL mapping and routing classes and functionality, that I'm just not aware of, or have I pretty much got this?
A lot of the benefits to routing you describe are correct, and some of what you say about physical mappings is also true. I'd like to throw in some experience / practical information that colored my opinion on routers over the last few years.
First of all, dynamic parsing of URLs works well (most of the time) when you architect your application according to the MVC design pattern. For example, I once built a very large application using Kohana, a hierarchical MVC framework that allows you to extend controllers and models for the sake of making nested URLs. In general, this makes a lot of sense. But there were a lot of times where it simply didn't make sense to build a whole class and model system around the need for a one-off URL to make the application more functional. There are also times where MVC is not the design pattern you're using, and thus not the defining feature of your API, and routing is beautiful in that scenario. One can easily see this at work by playing with frameworks that have a lot of structural freedom, such as the Slim framework or Express.js.
More often than people think, a fully functional API will have an element of RPC-ness to it in addition to the primarily RESTful endpoints available for consumption. Those additional functionalities make more sense to a consumer when they decorate existing resource mappings. This tends to happen after you've built out most of your application and covered most of your bases, and then you realize there are a couple of little features you'd like to add in relation to a resource that doesn't cleanly fit into the CREATE / READ / UPDATE / DELETE categories. You'll know it when you see it.
It really cannot be overstated: it is much safer not to go hacking on the actual structure of the controllers and models, adding, removing, and changing things for the sole purpose of adding an endpoint that doesn't inherently follow the same rules as the other controller methods (API endpoints).
Another very important thing is that your API endpoints are actually more malleable than we often realize. What I mean is, you can be OK with the structure of your endpoints on Monday, and then on Friday you get a task sent down from above saying you need to change all of these API calls to some other structure, and that's fine. But if you have a large application, this requires a very, very significant amount of file renaming, class renaming, re-linking, and all sorts of very breakable code when the framework you're using has strict rules for class naming, file naming, physical file path structure and the like. Just imagine changing a class name to make it work with the new structure: now you've got to hunt down every line of code that instantiated the old class and change it. Furthermore, in that scenario, it could be said that the problem is that your code is tightly coupled with the URL structure of your API, and that is not very maintainable should your URL needs change.
Either way, you really ought to decide what's best for the particular application. But the more you use routers, the more you'll see why they're so useful.