I’ve been persuaded by Eric Evans’ book and am integrating DDD into my framework. All basic elements (services, repositories, bounded contexts, etc) have been implemented and now I’m looking for feedback on how to correctly integrate this.
I have some business logic which has to be performed when an entity is created or modified. This example is a very simple one. Most business logic will become much more complex.
This business logic can be split up into the following actions:
Update calculated fields;
Update a child record inside the aggregate root. When creating the aggregate root this entails creating a default child record. When updating the aggregate root this entails removing the existing child record and creating a new one if a specific field on the aggregate root has changed;
Propagate start and end date of the aggregate root to the start and end date of the child records inside the aggregate root. These must be kept in sync under certain circumstances;
Propagate a field of the aggregate root to a different aggregate root.
My first attempt is to put all of this on the aggregate root, but I feel this is not going to work. I have the following problems integrating this logic:
All these actions have to be completed as a single whole and should not be made available as separate actions. This has the result that this is going to be very difficult to test (TDD);
I am not clear on whether any of these actions can be moved out to a service. The reason for this is that they make no sense outside of the aggregate root, but it would make TDD a lot easier;
Some logic changes depending on whether a new entity is created or an existing one is modified. Should I put these two branches inside the update logic or should I make two entirely different paths that share the business code that does not differentiate based create/modify.
Any help on the above issues would be greatly appreciated and other feedback in general.
The algorithm you've described should remain in the aggregate root, elsewise you end up with an anemic domain model, excepting propagating a field to another aggregate root where I will describe what I think you should do later.
As far as TDD is concerned, a method with "package" access on the aggregate root (e.g. "calculate()", should coordinate the entire action, which either the service or repository object would normally call. This is what tests should exercise in conjunction with setting different combinations of instance variables. The aggregate root should expose its instance variables, the children collection, and each child should expose its instance variables, through getters - this allows tests to validate their state. In all cases if you need to hide information make these getters package or private access and use your unit testing framework to make them public for the purpose of testing.
For your testing environment consider mocking the repository objects (you're using dependency injection right?) to return hard coded values. Short of this consider using something like dbunit to work with a database in a known state.
As far as logic changes are concerned create vs. modify, are you referring to how to persist or is there an actual algorithm to consider? If the former, I would make the repository responsible, if the latter I would make two separate methods (e.g. "calculateCreate()" & "calculateUpdate()") which calculate() would delegate as appropriate.
Also, there's a concurrency issue to think about as well because it sounds as if calculated values rely on mutable fields. So either need to have careful locking or aggregate roots that can only be used by a client once at a time. This also applies to propagating a field across aggregates - I would probably use the repository for this purpose - but you need to think carefully on how this should or should not impact other clients who are using the repository object.
Related
I'm building a Repository for an Aggragate. We've got 3 different Entities that it's constructed out of, one of them is root.
Data for all 3 is persisted in a SQL database. Each has it's own table.
Let's consider simple case of getting full list of those Aggregates. I need to fetch data from all 3 tables. Should I build one optimised query to fetch this data set or rather encapsulate logic for each Entity in it's own Repository and assemble it the Aggragate's repo? (Aggregate repo would then call respective repos and assemble it)
I'm leaning twords the first solution, however it's stronger coupleing. The later seems nicer from OOP point of view, but seems to be overcomplicated and potentialy casue problems with cache invalidation for subsequent sets of data etc.
For each type of object requiring global access, create an object that provides the illusion of all objects of this type stored in memory. Configure access through the global interface. [..] Define methods for adding and removing objects. [..] Define repositories only for aggregates. ~ Evans, about repositories
You should create one repository for Aggregate only. There is no reason to create seperate repositories. What is more, creating seperate repository would cause some additional problems as you mentioned.
I'm leaning twords the first solution, however it's stronger
coupleing.
To answer that, please take a look at Aggregate definition from Martin Fowler:
Aggregate is a pattern in Domain-Driven Design. A DDD aggregate is a
cluster of domain objects that can be treated as a single unit. An
example may be an order and its line-items, these will be separate
objects, but it's useful to treat the order (together with its line
items) as a single aggregate.
Aggregate is coupling Entities that it is constructed out of by definition.
We want to write an API (Python Library) which provides information about few systems in our company. We really aren't sure what is the best OOP approach to implement what we want, so I hope you'll have an idea.
The API will expose a series of tests for each system. Each system will be presented as a Class (with properties and methods) and all systems will inherit from a base class (GenericSystem) which will contain basic, generic info regarding the system (I.E dateOfCreation, authors, systemType, name, technology, owner, etc.) Each system has many instances and each instance has a unique ID. Data about each system instance is stored in different databases, so the API will be a place where all users can find info regarding those systems at once. These are the requirements:
We want each user to be able to create an instance of a system (SystemName Class for example) and to be able to get some info about it.
We want each user to be able to create multiple instances of a system (or of GenericSystem) and to be able to get info about all of them at once. (It must be efficient. One query only, not one for each instance). So we thought that we may need to create MultipleSystemNames class which will implement all those plural-approach methods. This is the most challenging requirement, as it seems.
We want that data will be populated and cached to the instances properties and methods. So if I create a SystemName instance and calls systemNameInstance.propertyName, it will run needed queries and populate the data into propertyName. Next time the user will call this property, the data will be immediately returned.
Last one, a single system class approach must be preserved. Each system must be presented as a sole system. We can later create MultiSystem class if needed (For requirement 2) but at it's most basic form, each system must be represented singly (I hope you understand what I mean).
The second and the fourth (2,4) requirements are the ones that we really struggle to figure out.
Should we use MultiSystemNames class for each class and also for GenericSystem (MultiGenericSystems)? We don't want to complicate the user and ourselves.
Do you know any OOP (or another) best practice clean and simplified way? Have we missed something?
I'm sorry if I added some unnecessary information but I really wanted to give you a feel about how we want things to be.
If you've reach so far or not, thank you!
System and instance represents exactly the same think but are used in different contexts. It doesn't matter how you store or retrieve them. So if you need a collection of System you just use native collection data structure (e.g List, Queue, Map in java). The operations related to System/List must be decoupled from POJOs. That means you implement them in services, repositories,etc.
How you store and retrieve the data must not have impact on how you design your data structures. You achieve performance by applying different techniques and/or using proper technologies e.g caching, using key-value stores or nosql databases, denormalize relational database tables and/or using indexes,etc
In DDD, repositories are used to perform serialization and de-serialization of aggregates, e.g. by reading and writing to a database. That way, aggregates can contain purer business logic, and won't be coupled to non-domain-specific persistence strategies.
However, I wonder why repositories are always described as being used for aggregates specifically. Isn't it equally motivated to use it for all entities?
(If this is only a matter of the fact that all plain entities can be seen as aggregate roots with zero children, please notify me of this, and the question can be buried.)
I wonder why repositories are always described as being used for aggregates specifically. Isn't it equally motivated to use it for all entities?
Because aggregates are the consistency boundaries exposed to the application layer.
Which is to say that, yes, the repositories are responsible for taking the snapshot of state from the data store, and building from it the graph of entities and values that make up the aggregate.
The API of the repository only exposes an aggregate root, because that defines the consistency boundary. Instead of allowing the application to reach into an arbitrary location in the graph and make changes, we force the application to communicate with the root object exclusively. With this constraint in place, we only need to look in one place to ensure that all changes satisfy the business invariant.
So there's no need to develop a repository for each type of entity in your model, because the application isn't allowed to interact directly with the model on that fine a grain.
Put another way, the entities within the aggregate are private data structures. We don't allow the client code to manipulate the entities directly for the same reason that we don't implement lists that allow the clients to reach past the api and manipulate the pointers directly.
In cqrs, you do see "repositories" that are used for things other than aggregates -- repositories can also be used to look up cached views of the state of the model. The trick is that the views don't support modification. In the approach that Evans describes, each entity has one single representation that fulfills all of its roles. In CQRS, and entity may have different representations in each role, but typically only a single role that supports modifying the entity.
In DDD there are two kind of entities: Aggregate roots and nested entities. As #VoiceOfUnreason answered, you are not allowed to modify the nested entities from outside an Aggregate so there is no need to have a repository for them (by "repository" I'm refering to an interface for load and persist an entities state). If you would be allowed, it would break the Aggregate's encapsulation, one if the most important things in OOP. Encapsulation helps in rich domains, with lots and lots of models where DDD is a perfect fit.
Our current application uses a smart object style for working with the database. We are looking at the feasibility of moving to PetaPoco instead. Looking over the features I notice you can add attributes to make it easier to CRUD objects. Does adding these attributes have any negative side effects that I should be aware of?
Has anyone found a reason NOT to use these decorators?
Directly to the use of the POCO object instance itself? None.
At least not that I would be aware of. Jon Skeet should be able to provide more info because he knows compiler inner workings through and through, so he knows exactly what happens with this metadata after it's been compiled.
Other implications indirectly related to these
There are of course implications when accessing these declarative attributes, because they're read using reflection which is normally a slow process.
But there's nothing to worry here, because PetaPoco is a smart library and reads these only once then compiles & caches these things, so you only get penalized once then you get blazing performance afterwards. Because it uses compiled code.
Non-performance related implications
By putting attributes (any) on your classes/properties/methods you somehow bind your code to particular engine that will use this class, because they're directives for this particular engine to understand your code.
In case of PetaPoco attributes this means that your class can be used with PetaPoco but not with some other DAL (ie. EF) unless you add attributes of that one as well (EF Code First uses the very same approach with attributes).
The second implication is related to back-end database. In case you rename a table, column or any other part that is provided in your PetaPoco attribute as a constant magic string, you will subsequently have to change this string as well. This just means that you have to be thorough when doing database changes...
One downside is that it breaks the separation between the "domain" layer and the "data" layer, since it introduces the PetaPoco file (which contains data logic) to domain classes that should really not have any knowledge or dependency on the data layer.
If you're doing a single-project MVC app or something then it's okay to just use the Models directory for both, but for non-trivial and separated apps you'll have to have two PetaPoco files or play around with abstracting portions of the file in order to annotate your models without making them "know too much" about the underlying data, or else have you specify the table and/or primary key name all over the place.
Take this simple, contrived example:
UserRepository.GetAllUsers();
UserRepository.GetUserById();
Inevitably, I will have more complex "queries", such as:
//returns users where active=true, deleted=false, and confirmed = true
GetActiveUsers();
I'm having trouble determining where the responsibility of the repository ends. GetActiveUsers() represents a simple "query". Does it belong in the repository?
How about something that involves a bit of logic, such as:
//activate the user, set the activationCode to "used", etc.
ActivateUser(string activationCode);
Repositories are responsible for the application-specific handling of sets of objects. This naturally covers queries as well as set modifications (insert/delete).
ActivateUser operates on a single object. That object needs to be retrieved, then modified. The repository is responsible for retrieving the object from the set; another class would be responsible for invoking the query and using the object.
These are all excellent questions to be asking. Being able to determine which of these you should use comes down to your experience and the problem you are working on.
I would suggest reading a book such as Fowler's patterns of enterprise architecture. In this book he discusses the patterns you mention. Most importantly though he assigns each pattern a responsibility. For instance domain logic can be put in either the Service or Domain layers. There are pros and cons associated with each.
If I decide to use a Service layer I assign the layer the role of handling Transactions and Authorization. I like to keep it 'thin' and have no domain logic in there. It becomes an API for my application. I keep all business logic with the domain objects. This includes algorithms and validation for the object. The repository retrieves and persists the domain objects. This may be a one to one mapping between database columns and domain properties for simple systems.
I think GetAtcitveUsers is ok for the Repository. You wouldnt want to retrieve all users from the database and figure out which ones are active in the application as this would lead to poor performance. If ActivateUser has business logic as you suggest, then that logic belongs in the domain object. Persisting the change is the responsibility of the Repository layer.
Hope this helps.
When building DDD projects I like to differentiate two responsibilities: a Repository and a Finder.
A Repository is responsible for storing aggregate roots and for retrieving them, but only for usage in command processing. By command processing I meant executing any action a user invoked.
A Finder is responsible for querying domain objects for purposes of UI, like grid views and details views.
I don't consider finders to be a part of domain model. The particular IXxxFinder interfaces are placed in presentation layer, not in the domain layer. Implementation of both IXxxRepository and IXxxFinder are placed in data access layer, possibly even in the same class.