web service data type (contract) - wcf

i have a general design question.
we have a fairly big data model that represents a clinical object, the object itself has 200+ child attributes in the hierarchy.
and we have a SetObject operation, and a GetObject operation. my question is, best practice wise, would it make sense to use that single data model in both operations or different data model for each? Because the Get operation will return much more details than what's needed for Set.
an example of what i mean: the data model has say ProviderId, and ProviderName attributes, in the Get operation, both the ProviderId, and ProviderName would need to be returned. However, in the Set operation, only the ProviderId is needed, and ProviderName is ignored by the service since system has that information already. In this case, if the Get and Set operations use the same data model, the ProviderName is exposed even for Set operation, does that confuse the consuming developer?

I would say: it depends :-)
No seriously. How do you edit / work on the object? I assume your software is calling the WCF service to retrieve an object, using an ID or a search term or something.
So you get back the object with 200+ attributes. How do you work on it, how much of it do you typically change?
If you typically only change a handful of attributes - then maybe having a generic SetProperty method on the service that would take the object ID, a property name, and a new value, might make sense. But think about how this is going to work:
1. the server side code will get the ID for the object
2. it will load the object from the database
3. it will then set a single property to a new value
4. it will save the object back to the database
What if you update four properties? You'd go through 4 of those cycles. Or: you could extend the SetProperty method to include a dictionary of (property name, value) pairs.
So I guess it depends on how many of those 200 properties are you changing at any given time? If you change 10%, 20% of those properties - wouldn't it be easier to just pass back the whole, modified object?

This looks like a good candidate for using your clinical object as canonical model and providing a restful style service interface. You can then provide different views, or representations of your data object with only the fields required based on the usage model. Your verbs (get, set) will become the http standard Get, Put.
There are a number of open source Rest frameworks that you can use to make this easier to get started. Restlet is one that I have used successfully.

Related

Patterns when designing REST POST endpoint when resource has a computed property

I have a resource, as an example a 'book'.
I want to create a REST POST endpoint to allow consumers to create a new book.
However, some of the properties are required and computed by API, and others were actually taken as they are
Book
{
name,
color,
author # computed
}
Let's say the author is somehow calculated in API based on the book name.
I can think of these solutions each has its drawbacks:
1. enforce consumer to provide the author and just filter it (do not take into account as an input) # bad because it is very unpredictable why the author was changed
2. allow the user to provide author # same problem
3. do not allow the user to provide an author and show an exception if the user provides it
The last solution seems to be the most obvious one. The main problem I can see is that it is inconsistent and can be bizarre for consumers to see the author later on GET request.
I want my POST endpoint to be as expressive as possible. So the POST and GET data transfer objects will look almost the same.
Are there any simple, expressive, and predictable patterns to consider?
Personally I'm a big fan of using the same format for a GET request as well as a PUT.
This makes it possible for a client to do a GET request, add a property to the object they received and immediately PUT again. If your API and clients follow this pattern, it also means it can easily add new properties to GET requests and not break clients.
However, while this is a nice pattern I don't really think that same expectation exists as much for 'creation'. There's usually many things that make less sense to require as a property when creating new items (think 'id' for example), so I usually:
Define a schema for PUT and GET.
Define a separate schema for POST that only contains the relevant properties for creation.
If users supply properties not in the schema, always error with a 422.
some of the properties are required and computed by API
Computed properties are neither required nor optional, by definition. No reason to ask consumers to pass such properties.
do not allow the user to provide an author and show an exception if the user provides it
Indeed, DTO should not contain author-property. Consumers can send over network whatever they want, however it is the responsibility of the API-provider to publish contract (DTO) for consumers to use properly. API-provider controls over what properties to consider, and no exception should be thrown, as the number of "bad" properties that can be sent by consumers is endless.
So the POST and GET data transfer objects will look almost the same
Making DTOs of the same resource look the same is not a goal. In many cases, get-operation exposes a lot more properties than post-operation for the same resource, especially when designing domain-driven APIs.
Are there any simple, expressive, and predictable patterns to consider?
If you want your API to express the fact that author is computed, you can have the following endpoints:
POST http://.../author-computed-books
GET http://.../books/1
Personally, I wouldn't implement that way since it does not look natural, however you can get the idea.
I want my POST endpoint to be as expressive as possible. So the POST
and GET data transfer objects will look almost the same.
Maybe just document it instead of relying on explicit stuff like it must be almost the same as the GET endpoint.
E.g. my POST endpoint is POST /number "1011" and my GET endpoint is GET /number -> 11. If I don't document that I expect binary and I serve decimal, then nobody will know and they would guess for example decimal for both. Beyond documentation another way of doing this and to be more explicit is changing the response for GET to include the base {"base":10, value:"11"} or changing the GET endpoint GET /number/decimal -> 11.
As of the computed author I don't understand how you would compute it. I mean either a book is registered and the consumer shouldn't register it again or you don't know much about the author of it. If the latter, then you can guess e.g. based on google results for the title, but it will be a guess, not necessarily true. The same with consumer data, but at least that is what the consumers provided. There is no certainty. So for me it would be a complex property not just a primitive one if the source of the information matters. Something like "author": {name: "John Wayne", "source": "consumer/service"} normally it is complex too, because authors tend to have ids, names, other books, etc.
Another thought that if it is weird for the consumers instead of expected, then I have no idea why it is a feature at all. If author guessing is a service, then a possible solution is making the property mandatory and adding a guessing service GET /author?by-book-name={book-name}, so they can use the service if they want to. Or the same with a completely optional property. This way you give back the control to the consumers on whether they want to use this service or not.
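The two policies the answers above disagree on (reject unknown properties with a 422, or silently ignore them) can share one creation schema that never contains the computed property. The following is an added example, not code from any of the posters; it is written in Python as a language-neutral sketch, and `CREATE_FIELDS`, `compute_author`, `parse_create` and `create_book` are hypothetical names.

```python
from dataclasses import dataclass

# Properties a client may send on POST /books. "author" is computed by the
# API, so it is deliberately absent from the creation contract.
CREATE_FIELDS = {"name": str, "color": str}


@dataclass(frozen=True)
class Book:
    id: int
    name: str
    color: str
    author: str  # computed server-side, never taken from the request


def compute_author(name):
    # Hypothetical stand-in for however the API derives the author.
    return {"Dune": "Frank Herbert"}.get(name, "unknown")


def parse_create(payload, strict=True):
    """Return (status, body). strict=True answers 422 for properties outside
    the schema; strict=False drops them. Only CREATE_FIELDS are ever bound."""
    if not isinstance(payload, dict):
        return 422, {"errors": ["body must be a JSON object"]}
    errors = []
    if strict:
        unknown = sorted(set(payload) - set(CREATE_FIELDS))
        errors += [f"unexpected property: {key}" for key in unknown]
    for field, expected_type in CREATE_FIELDS.items():
        if field not in payload:
            errors.append(f"missing property: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"wrong type for property: {field}")
    if errors:
        return 422, {"errors": errors}
    return 201, {field: payload[field] for field in CREATE_FIELDS}


def create_book(payload, next_id, strict=True):
    """Create the resource; the author always comes from compute_author."""
    status, body = parse_create(payload, strict)
    if status != 201:
        return status, body
    return 201, Book(id=next_id, author=compute_author(body["name"]), **body)


if __name__ == "__main__":
    # Constructed request body in which the client tries to supply the author.
    request = {"name": "Dune", "color": "red", "author": "Mallory"}
    print(create_book(request, 1))                 # strict: 422 with reason
    print(create_book(request, 1, strict=False))   # lenient: author recomputed
```

In strict mode the client learns why its `author` value was not used; in lenient mode the value is dropped without notice, which is the unpredictability the question worries about.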

What is naming convention for DTOs in a webservice

I'm designing a restful web service and I was wondering what should I name my DTOs. Can I use suffixes like Request and Response for them? for example for addUser service, there will be 2 DTOs named: AddUserRequest and AddUserResponse.
Does your organization already have a schema that describes a canonical user that you pass in? If that's what you're using, of course you would use the name from that schema. Otherwise, describe them just as you would any class or schema element.
Note that since a DTO doesn't contain its own methods, you probably would not give it a name with an action verb.
However, consider calling them AddUserRequest and AddUserResponse, especially if the method requires more info than just your regular user DTO. This fits with the Interface Segregation Principle in that your interface parameters should be specifically tailored to the request itself (it shouldn't require elements that are unrelated to the request; and you shouldn't have function-type parameters that change the request, those should be extracted into their own calls.) The AddUserRequest might then contain an element called User that holds the user-specific data, and another element holding the set of other associated data on the request, perhaps groups or access permissions, that sort of thing.
DTOs (Data Transfer Object) are like POJOs(Plain Old Java Objects). It should only have getters and setters and not any business logic.
From Wikipedia:
A data transfer object is an object that carries data between
processes. The motivation for its use is that communication between
processes is usually done resorting to remote interfaces (e.g., web
services), where each call is an expensive operation. Because the
majority of the cost of each call is related to the round-trip time
between the client and the server, one way of reducing the number of
calls is to use an object (the DTO) that aggregates the data that
would have been transferred by the several calls, but that is served
by one call only.
The difference between data transfer objects and business objects or
data access objects is that a DTO does not have any behavior except
for storage and retrieval of its own data (mutators and accessors).
DTOs are simple objects that should not contain any business logic
that would require testing.
This pattern is often incorrectly used outside of remote interfaces.
This has triggered a response from its author[3] where he reiterates
that the whole purpose of DTOs is to shift data in expensive remote
calls.
So ideally for those actions you should create some helpers or you can add those as controllers.
Since it is a RESTful service, ideally the user addition/creation request should send back 201 Created HTTP status code, with userId in Location header and no response body. For the request, you could name it like UserDetails or UserData or simply User. Refer https://pontus.ullgren.com/view/Return_Location_header_after_resource_creation

DDD object validation

We are building a real-estate portal. We have Services, Mappers and Entities. At the stage we are allowing users to either
Create a property via a form.
Upload a batch file containing 1 or more properties.
So if he creates a property via the form we can validate the form and if it's a valid property, we can add it into our system. But if he uploads via a batch file, we think that the responsibility of the form is
to validate that the user provided a file
the file type is valid
and the file size is within the allowed limits.
After this it should hand over the file to the controller or service.
Now the pending tasks are
Process the file and retrieve the contents
Validate the contents
If validated, save the properties or display an error.
So which part(s) are responsible for the above tasks?
I am thinking that the controller should do the initial file processing and pass the data to the service. This means that we will create/fetch the form object in the controller and validate the form within the controller.
Now the next section is to validate the contents, which is actually a collection of entities. So we have following ideas for this stage
1. Service will validate the data and create the entities, it will save them.
2. Or service will create the entity with the provided data and then call the validation function of the entity.
3. Or the service will try to create an entity with the provided data (send the data to the entity constructor), and if the data is valid, the entity will be created or will generate an error etc.
The possible issues I can think about above approaches are
If the service is validating the data, it means the service will know the inner structure of the entity, so if down the road we need to update the entity structure, we have to update the service as well. Which will introduce some sort of dependency.
In the 2nd approach, I don't think that an entity should be created at first place if it isn't valid.
In the 3rd approach, we are creating a functionality within entity's constructor, so making the entity dependent on the data. So when we need to fetch the entity from persistent, we need to provide some stub data.
Or am I over-thinking??
Now the next section is to validate the contents, which is actually a collection of entities.
The Contents, that Controller sends to Service, is a graph of objects / a structure / a plain string in the simplest case, but never a collection of business entities.
If the service is validating the data, it means the service will know the inner structure of the entity
What exactly is Service validating?
Service is validating the data means that Service ensures invariant of every structure / object that it receives.
For example, if F(T) is service method and T is structure with properties { A, B, C } that represents a triangle with three edges, then Service has to ensure the invariants (the length of each side is greater than zero and the sum of the lengths of any two sides must be greater than the length of the third side) of this structure after this structure has been deserialized.
This validation has to be done because deserializer doesn't use constructors to ensure invariants during deserialization.
When these validations are done, all objects passed to Service are valid and can be freely used in business layer directly or converted to objects (for example, entities) known to business layer.
if down the road we need to update the entity structure, we have to update the service as well. Which will introduce some sort of dependency.
This dependency is unavoidable. Since Transfer Objects and Entity Objects are separated, there always exists mapper that knows how to convert them.
Service will validate the data and create the entities, it will save them.
I'd go with this. Service validates data, converts into business layer objects, invokes business layer functions, persists changes.
It depends on what kind of constraints you're validating.
1. parameter validation like notEmpty property name or max length etc.
In this case you could extract the validation logic to a Validator object. This is useful when you have multiple property creating form(web form, file uploading), the validator may be invoked by multiple "client", but the validation logic keeps in one object.
2. business rule validation.
I prefer using domain models, you may have a look at the PhoneNumber example in this presentation
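The triangle case from the first answer above, worked through as an added example (not code from any poster, and written in Python as a language-neutral sketch): a deserializer that fills fields directly never runs the constructor, so the service re-checks the invariants on every row of an uploaded batch. The names `triangle_problems`, `deserialize` and `service_import`, and the all-or-nothing batch policy, are assumptions.

```python
import json


def triangle_problems(a, b, c):
    """Return the list of violated invariants (an empty list means valid)."""
    sides = {"A": a, "B": b, "C": c}
    problems = [f"{name} must be a number" for name, value in sides.items()
                if isinstance(value, bool) or not isinstance(value, (int, float))]
    if problems:
        return problems
    problems = [f"{name} must be greater than zero"
                for name, value in sides.items() if value <= 0]
    if not problems and not (a + b > c and a + c > b and b + c > a):
        problems.append("sum of any two sides must exceed the third")
    return problems


class Triangle:
    """Structure { A, B, C }; the constructor enforces the invariants."""

    def __init__(self, a, b, c):
        problems = triangle_problems(a, b, c)
        if problems:
            raise ValueError("; ".join(problems))
        self.a, self.b, self.c = a, b, c


def deserialize(raw):
    """Mimics a serializer that populates fields and skips __init__."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        data = {}
    obj = Triangle.__new__(Triangle)  # no constructor call, no checks
    obj.__dict__.update(a=data.get("A"), b=data.get("B"), c=data.get("C"))
    return obj


def service_import(batch):
    """Service boundary: validate every deserialized object and accept the
    batch only if all rows pass. Errors are keyed by 1-based row number."""
    accepted, errors = [], {}
    for row, raw in enumerate(batch, start=1):
        item = deserialize(raw)
        problems = triangle_problems(item.a, item.b, item.c)
        if problems:
            errors[row] = problems
        else:
            accepted.append(item)
    return ([], errors) if errors else (accepted, {})


# Constructed sample upload: row 1 is valid, rows 2 and 3 break an invariant.
SAMPLE_BATCH = [
    '{"A": 3, "B": 4, "C": 5}',
    '{"A": 1, "B": 2, "C": 10}',
    '{"A": -1, "B": 2, "C": 2}',
]

if __name__ == "__main__":
    print(service_import(SAMPLE_BATCH))
```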

Entity Framework Code First DTO or Model to the UI?

I am creating a brand new application, including the database, and I'm going to use Entity Framework Code First. This will also use WCF for services which also opens it up for multiple UI's for different devices, as well as making the services API usable from other unknown apps.
I have seen this batted around in several posts here on SO but I don't see direct questions or answers pertaining to Code First, although there are a few mentioning POCOs. I am going to ask the question again so here it goes - do I really need DTOs with Entity Framework Code First or can I use the model as a set of common entities for all boundaries? I am really trying to follow the YAGNI train of thought so while I have a clean sheet of paper I figured that I would get this out of the way first.
Thanks,
Paul Speranza
There is no definite answer to this problem and it is also the reason why you didn't find any.
Are you going to build services providing CRUD operations? It generally means that your services will be able to return, insert, update and delete entities as they are = you will always expose whole entity or single exactly defined serializable part of the entity to all clients. But once you do this it is probably worth checking WCF Data Services.
Are you going to expose business facade working with entities? The facade will provide real business methods instead of just CRUD operations. These business methods will get some data object and decompose it to multiple entities in wrapped business logic. Here it makes sense to use specific DTO for every operation. DTO will transfer only data needed for the operation and return only data allowed to the client.
Very simple example. Suppose that your entities keep information like LastModifiedBy. This is probably information you want to pass back to the client. In the first scenario you have single serializable set so you will pass it back to the client and client pass it modified back to the service. Now you must verify that client didn't change the field because he probably didn't have permissions to do that. You must do it with every single field which client didn't have permission to change. In the second scenario your DTO with updated data will simply not include this property (= specialized DTO for your operation) so client will not be able to send you a new value at all.
It can be somehow related to the way how you want to work with data and where your real logic will be applied. Will it be on the service or on the client? How will you ensure that client will not post invalid data? Do you want to restrict passing invalid data by logic or by specific transferred objects?
I strongly recommend a dedicated view model.
Doing this means:
You can design the UI (and iterate on it) without having to wait to design the data model first.
There is less friction when you want to change the UI.
You can avoid security problems with auto-mapping/model binding "accidentally" updating fields which shouldn't be editable by the user -- just don't put them in the view model.
However, with a WCF Data Service, it's hard to ignore the advantage of being able to write the service in essentially one line when you expose entities directly. So that might make the most sense for the WCF/server side.
But when it comes to UI, you're "gonna need it."
do I really need DTOs with Entity Framework Code First or can I use the model as a set of common entities for all boundaries?
Yes, the same set of POCOs / entities can be used for all boundaries.
But a set of mappers / converters / configurators will be needed to adapt entities to some generic structures of each layer.
For example, when entities are configured with DataContract and DataMember attributes, WCF is able to transfer domain objects' state without creating any special classes.
Similarly, when entities are mapped using Entity Framework fluent mapping api, EF is able to persist domain objects' state in database without creating any special classes.
The same way, entities can be configured to be used in any layer by means of the layer infrastructure without creating any special classes.
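The LastModifiedBy scenario and the model-binding risk described in the answers above, side by side as an added example (not code from any poster, and written in Python as a language-neutral sketch rather than WCF/EF code). `Customer`, `UpdateCustomerRequest` and the function names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """Stand-in for a persisted entity shared across all boundaries."""
    id: int
    name: str
    phone: str
    last_modified_by: str  # server-owned audit field


@dataclass(frozen=True)
class UpdateCustomerRequest:
    """Operation-specific DTO: only what the client is allowed to change."""
    name: str
    phone: str


def update_by_entity_binding(entity, payload):
    """Shared-entity approach: every incoming key that matches an attribute
    is copied, so server-owned fields are writable unless each one is
    verified by hand."""
    for key, value in payload.items():
        if hasattr(entity, key):
            setattr(entity, key, value)
    return entity


def bind_update_request(payload):
    """Build the DTO from its declared fields; other keys never bind."""
    allowed = [f.name for f in fields(UpdateCustomerRequest)]
    missing = [name for name in allowed if name not in payload]
    if missing:
        raise ValueError(f"missing properties: {missing}")
    return UpdateCustomerRequest(**{name: payload[name] for name in allowed})


def update_by_dto(entity, payload, principal):
    """Specialized-DTO approach: copy the DTO's fields and take the audit
    value from the authenticated caller, not from the request body."""
    dto = bind_update_request(payload)
    entity.name, entity.phone = dto.name, dto.phone
    entity.last_modified_by = principal
    return entity


# Constructed request body carrying two fields the client must not control.
SAMPLE_PAYLOAD = {"name": "New Name", "phone": "555-0100",
                  "last_modified_by": "admin", "id": 999}

if __name__ == "__main__":
    shared = Customer(1, "Old Name", "555-0000", "system")
    print(update_by_entity_binding(shared, SAMPLE_PAYLOAD))
    separate = Customer(1, "Old Name", "555-0000", "system")
    print(update_by_dto(separate, SAMPLE_PAYLOAD, principal="alice"))
```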

Beans, methods, access and change? What is the recommended practice for handling them (i.e. in ColdFusion)?

I am new to programming (6 weeks now). i am reading a lot of books, sites and blogs right now and i learn something new every day.
Right now i am using coldfusion (job). I have read many of the oop and cf related articles on the web and i am planning to get into mxunit next and after that to look at some frameworks.
One thing bothers me and i am not able to find a satisfactory answer. Beans are sometimes described as DataTransferObjects, they hold Data from one or many sources.
What is the recommended practice to handle this data?
Should i use a separate Object that reads the data, mutates it and then writes it back to the bean, so that the bean is just a storage for data (accessible through getters) or should i implement the methods to manipulate the data in the bean.
I see two options.
1. The bean is only storage, other objects have to do something with its data.
2. The bean is storage and logic, other objects tell it to do something with its data.
The second option seems to me to adhere more to encapsulation while the first seems to be the way that beans are used.
I am sure both options fit someone's need and are recommended in a specific context but what is recommended in general, especially when someone does not know enough about the greater application picture and is a beginner?
Example:
I have created a bean that holds an Item from a database with the item id, a name, and an 1d-array. Every array element is a struct that holds a user with its id, its name and its amount of the item. Through a getter i output the data in a table in which i can also change the amount for each user or check a user for deletion from this item.
Where do i put the logic to handle the application users input?
Do i tell the bean to change its array according to the user input?
Or do i create an object that changes the array and writes that new array into the bean?
(All database access (CreateReadUpdateDelete) is handled through a DataAccessObject that gets the bean as an argument. The DAO also contains a gateway method to read more than one record from the database. I use this method to get a table of items, which i can click to create the bean and its data.)
You're observing something known as "anemic domain model". Yes, it's very common, and no, it's not good OO design. Generally, logic should be with the data it operates on.
However, there's also the matter of separation of concerns - you don't want to stuff everything into the domain model. For example, database access is often considered a technically separate layer and not something the domain models themselves should be doing - it seems you already have that separated. What exactly should and should not be part of the domain model depends on the concrete case - good design can't really be expressed in absolute rules.
Another concern is models that get transferred over the network, e.g. between an app server and a web frontend. You want these to contain only the data itself to reduce bandwidth usage and latency. But that doesn't mean they can't contain logic, since methods are not part of the serialized objects. Derived fields and caches are - but they can usually be marked as transient in some way so that they are not transferred.
Your bean should contain both your data and logic.
Data Transfer Objects are used to transfer objects over the network, such as from ColdFusion to a Flex application in the browser. DTOs only contain relevant fields of an object's data.
Where possible you should try to minimise exposing the internal implementation of your bean, (such as the array of user structs) to other objects. To change the array you should just call mutator functions directly on your bean, such as yourBean.addUser(user) which appends the user struct to the internal array.
No need to create a separate DAO with a composed Gateway object for your data access. Just put all of your database access methods (CRUD plus table queries) into a single Gateway object.