Relations between Repositories for Aggregates in DDD

I'm building a Repository for an Aggregate. It's constructed out of 3 different Entities, one of which is the root.
Data for all 3 is persisted in a SQL database. Each has its own table.
Let's consider the simple case of getting the full list of those Aggregates. I need to fetch data from all 3 tables. Should I build one optimised query to fetch this data set, or rather encapsulate the logic for each Entity in its own Repository and assemble the result in the Aggregate's repo? (The Aggregate repo would then call the respective repos and assemble the Aggregate.)
I'm leaning towards the first solution, however it means stronger coupling. The latter seems nicer from an OOP point of view, but it seems overcomplicated and could potentially cause problems with cache invalidation for subsequent sets of data, etc.

For each type of object requiring global access, create an object that provides the illusion of all objects of this type stored in memory. Configure access through the global interface. [..] Define methods for adding and removing objects. [..] Define repositories only for aggregates. ~ Evans, about repositories
You should create only one repository, for the Aggregate. There is no reason to create separate repositories. What is more, creating separate repositories would cause additional problems, as you mentioned.
"I'm leaning towards the first solution, however it's stronger coupling."
To answer that, please take a look at the definition of an Aggregate from Martin Fowler:
Aggregate is a pattern in Domain-Driven Design. A DDD aggregate is a cluster of domain objects that can be treated as a single unit. An example may be an order and its line-items, these will be separate objects, but it's useful to treat the order (together with its line items) as a single aggregate.
By definition, an Aggregate couples the Entities it is constructed from.
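To make the single-repository option concrete, here is a minimal sketch in Python using the standard sqlite3 module. The Order/Address/OrderLine entities, the table layout and the make_demo_connection helper are all assumptions made up for illustration; the question does not say what the three Entities are.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Address:  # child entity (hypothetical)
    city: str


@dataclass
class OrderLine:  # child entity (hypothetical)
    sku: str
    quantity: int


@dataclass
class Order:  # aggregate root (hypothetical)
    id: int
    address: Address
    lines: list[OrderLine] = field(default_factory=list)


class OrderRepository:
    """One repository for the whole aggregate; the child tables are an internal detail."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def all(self) -> list[Order]:
        # One joined query fetches the root and both child tables together.
        rows = self._conn.execute(
            """
            SELECT o.id, a.city, l.sku, l.quantity
            FROM orders o
            JOIN addresses a ON a.order_id = o.id
            LEFT JOIN order_lines l ON l.order_id = o.id
            ORDER BY o.id, l.id
            """
        ).fetchall()
        orders: dict[int, Order] = {}
        for order_id, city, sku, quantity in rows:
            order = orders.setdefault(order_id, Order(order_id, Address(city)))
            if sku is not None:  # LEFT JOIN yields NULLs for orders without lines
                order.lines.append(OrderLine(sku, quantity))
        return list(orders.values())


def make_demo_connection() -> sqlite3.Connection:
    """Builds an in-memory database with sample rows (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY);
        CREATE TABLE addresses (order_id INTEGER, city TEXT);
        CREATE TABLE order_lines (
            id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT, quantity INTEGER
        );
        INSERT INTO orders VALUES (1), (2);
        INSERT INTO addresses VALUES (1, 'Oslo'), (2, 'Lima');
        INSERT INTO order_lines (order_id, sku, quantity)
            VALUES (1, 'A', 2), (1, 'B', 1);
        """
    )
    return conn
```

Callers only ever see fully assembled Order objects, so the coupling between the three tables stays inside the repository, which is where the Aggregate definition already puts it.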

Related

What should repositories in DDD return

I researched repositories in DDD and found too many different things. Everyone says something different about repositories, and that confused me.
I want to know:
What methods should repositories contain?
What should repositories definitely (or something close to that) return?
Thanks.

For each aggregate root (AR) you should have a repository. As a minimum the repository would probably have a void Save(Aggregate aggregate) and an Aggregate Get(Guid id) method. The returned aggregate would always be fully constituted.
I sometimes add methods for specific use cases in order to update only certain bits of data. For instance, something like void Activate(Guid id) or some such. This is simply to avoid manipulating more data than is necessary.
Querying on a repository is usually problematic since you should typically avoid querying your domain. For such scenarios my recommendation is to use a query mechanism that is closer to the data and in a more raw format than a domain object or object graph. The query mechanism would more than likely return primitives, such as int Count(Query.Specification specification), or perhaps return a list of read model instances.
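A rough sketch of that split, written in Python with an in-memory store. Customer, CustomerRepository and CustomerQueries are hypothetical names; the methods mirror the Save, Get, Activate and Count ideas above rather than any particular framework.

```python
import copy
import uuid
from dataclasses import dataclass


@dataclass
class Customer:  # hypothetical aggregate root
    id: uuid.UUID
    name: str
    active: bool = False


class CustomerRepository:
    """Minimal contract: save and get a fully constituted aggregate."""

    def __init__(self) -> None:
        self._rows: dict[uuid.UUID, Customer] = {}

    def save(self, customer: Customer) -> None:
        # Store a copy so later changes to the caller's object are not persisted by accident.
        self._rows[customer.id] = copy.deepcopy(customer)

    def get(self, customer_id: uuid.UUID) -> Customer:
        return copy.deepcopy(self._rows[customer_id])

    def activate(self, customer_id: uuid.UUID) -> None:
        # Use-case-specific method: touches one field instead of rewriting the aggregate.
        self._rows[customer_id].active = True


class CustomerQueries:
    """Separate read side: returns primitives, not domain objects."""

    def __init__(self, repository: CustomerRepository) -> None:
        self._rows = repository._rows  # shares the store only for this demo

    def count_active(self) -> int:
        return sum(1 for c in self._rows.values() if c.active)
```

In a real system the query side would more likely read from the database directly; sharing the dictionary here only keeps the example self-contained.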

You are right, a repository has different meanings in different contexts, and many authors have their own interpretation. The way I understand them is from multiple perspectives:
They abstract away the underlying storage type
They can introduce an interface closer to the domain model
They represent a collection of objects and thus serve as in-memory storage for aggregates (collections of related objects)
They represent a transaction boundary for related objects
They can't contain duplicates, like sets
It is valid for a repository to contain only one object, without complex relations internally
So to answer your questions, repositories should contain collection-related methods like add, remove, addAll and findByCriteria, instead of save, update and delete. They can return a whole aggregate, parts of an aggregate, or some internal aggregate relation; it depends on your domain model and the way you want to represent objects.
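The collection-style interface described above could look roughly like this in Python. Product and ProductRepository are made-up names, and the in-memory dictionary merely stands in for whatever storage sits behind the interface.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Product:  # hypothetical aggregate, identified by its sku
    sku: str
    price: int


class ProductRepository:
    """Behaves like a set keyed on identity: adding the same sku twice keeps one entry."""

    def __init__(self) -> None:
        self._items: dict[str, Product] = {}

    def add(self, product: Product) -> None:
        self._items[product.sku] = product

    def add_all(self, products: Iterable[Product]) -> None:
        for product in products:
            self.add(product)

    def remove(self, product: Product) -> None:
        self._items.pop(product.sku, None)

    def find_by_criteria(self, criteria: Callable[[Product], bool]) -> list[Product]:
        return [p for p in self._items.values() if criteria(p)]

    def __len__(self) -> int:
        return len(self._items)
```

Note that there is no save or update here: re-adding an object with an existing identity replaces the stored one, which is how the "no duplicates" property shows up.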

Eric Evans coined "domain driven design" in 2003; so the right starting point for any definitions in that context is his book. He defines the repository pattern in chapter 6 ("Lifecycle of a Domain Object").
A REPOSITORY represents all objects of a type as a conceptual set (usually emulated). It acts like a collection, except with more elaborate querying ability. Objects of the appropriate type are added and removed, and the machinery behind the repository inserts them or deletes them from the database.
...
For each type of object that requires global access, create an object that can provide the illusion of an in-memory collection of all objects of that type.
The primary use case of a repository: given a key, return the correct root entity. The repository implementation acts as a module, which hides your choice of persistence strategy (see: Parnas 1971).

Why are repositories only used for aggregates in Domain-Driven Design?

In DDD, repositories are used to perform serialization and de-serialization of aggregates, e.g. by reading and writing to a database. That way, aggregates can contain purer business logic, and won't be coupled to non-domain-specific persistence strategies.
However, I wonder why repositories are always described as being used for aggregates specifically. Isn't it equally motivated to use it for all entities?
(If this is only a matter of the fact that all plain entities can be seen as aggregate roots with zero children, please notify me of this, and the question can be buried.)

I wonder why repositories are always described as being used for aggregates specifically. Isn't it equally motivated to use it for all entities?
Because aggregates are the consistency boundaries exposed to the application layer.
Which is to say that, yes, the repositories are responsible for taking the snapshot of state from the data store, and building from it the graph of entities and values that make up the aggregate.
The API of the repository only exposes an aggregate root, because that defines the consistency boundary. Instead of allowing the application to reach into an arbitrary location in the graph and make changes, we force the application to communicate with the root object exclusively. With this constraint in place, we only need to look in one place to ensure that all changes satisfy the business invariant.
So there's no need to develop a repository for each type of entity in your model, because the application isn't allowed to interact directly with the model on that fine a grain.
Put another way, the entities within the aggregate are private data structures. We don't allow the client code to manipulate the entities directly for the same reason that we don't implement lists that allow the clients to reach past the API and manipulate the pointers directly.
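As an illustration of the root acting as the single place where changes are checked, here is a small Python sketch. The Order and LineItem classes and the MAX_TOTAL rule are invented for the example; the point is only that the nested entities are reachable through the root and nowhere else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:  # nested entity, never handed out for modification
    sku: str
    quantity: int
    unit_price: int


class Order:
    """Aggregate root: the only entry point for changing line items."""

    MAX_TOTAL = 1000  # hypothetical business invariant

    def __init__(self, order_id: int) -> None:
        self.id = order_id
        self._items: list[LineItem] = []  # private to the aggregate

    def add_item(self, sku: str, quantity: int, unit_price: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if self.total() + quantity * unit_price > self.MAX_TOTAL:
            raise ValueError("order total would exceed the limit")
        self._items.append(LineItem(sku, quantity, unit_price))

    def total(self) -> int:
        return sum(i.quantity * i.unit_price for i in self._items)

    @property
    def items(self) -> tuple[LineItem, ...]:
        # Read-only view: callers cannot append to or replace the internal list.
        return tuple(self._items)
```

A repository for LineItem would let callers bypass add_item and its checks, which is exactly what the constraint is meant to prevent.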
In CQRS, you do see "repositories" that are used for things other than aggregates: repositories can also be used to look up cached views of the state of the model. The trick is that the views don't support modification. In the approach that Evans describes, each entity has one single representation that fulfills all of its roles. In CQRS, an entity may have different representations in each role, but typically only a single role that supports modifying the entity.

In DDD there are two kinds of entities: Aggregate roots and nested entities. As @VoiceOfUnreason answered, you are not allowed to modify the nested entities from outside an Aggregate, so there is no need to have a repository for them (by "repository" I'm referring to an interface for loading and persisting an entity's state). If you were allowed to, it would break the Aggregate's encapsulation, one of the most important things in OOP. Encapsulation helps in rich domains, with lots and lots of models, where DDD is a perfect fit.

Should I return IQueryable<T> from a repository in DDD

Recently I have been refactoring my DDD project. When I looked at my Repository Layer, I found that my repositories return IQueryable<T>. I am puzzled: should a repository in the Repository Layer of a DDD project return IQueryable<T>? I usually return an IQueryable type in my repository designs, but today I found the opposite advice in this article.
I can't figure it out!

If you return IQueryable you let domain knowledge leak from the Domain layer to the consuming layers. It increases the risk that your Domain objects will become anemic and all behavior will move to other layers.
Although it seems very handy to return an IQueryable and you think that your code becomes simpler, that is just an illusion; as the project grows, that IQueryable will turn your code into a big ball of mud, with domain code scattered everywhere. You won't be able to optimize your repository or to swap one persistence mechanism for another (e.g. from SQL to NoSQL).

You probably shouldn't do that.
The repository's job is not only to abstract away persistence details, but also to provide an explicit query contract defining the queries that are needed to process commands [1] in your domain.
If you feel the need to further filter/transform what's returned from your repository then you most likely failed to capture an explicit query that should be part of the repository's contract.
Having such a contract lets the querying clients express their intent and allows for easier optimizations.
[1] Nowadays, it's quite common to apply some CQRS principles and bypass the domain model entirely for queries. Given this scenario, the only queries that would go through the repository are the ones needed to process commands. You are however not forced in any way to use this approach, so your repository could fulfill reporting queries as well if you wish.
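To show what an explicit query contract might look like, here is a small sketch in Python rather than C#. Invoice, InvoiceRepository and the overdue query are hypothetical; the idea is that the repository names the queries it supports instead of handing out something callers can filter arbitrarily.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Invoice:  # hypothetical aggregate root
    id: int
    due: date
    paid: bool = False


class InvoiceRepository:
    """Exposes named queries rather than an open-ended queryable."""

    def __init__(self, invoices: list[Invoice]) -> None:
        self._by_id = {inv.id: inv for inv in invoices}

    def get(self, invoice_id: int) -> Invoice:
        return self._by_id[invoice_id]

    def overdue(self, as_of: date) -> list[Invoice]:
        # The filter lives here, so it can later become an indexed SQL query
        # or move to a different store without touching any caller.
        return [i for i in self._by_id.values() if not i.paid and i.due < as_of]
```

A caller that needs a new kind of filtering asks for a new method on the contract, which keeps the set of queries visible and optimizable in one place.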

Where does my DDD logic belong?

I’ve been persuaded by Eric Evans’ book and am integrating DDD into my framework. All basic elements (services, repositories, bounded contexts, etc) have been implemented and now I’m looking for feedback on how to correctly integrate this.
I have some business logic which has to be performed when an entity is created or modified. This example is a very simple one. Most business logic will become much more complex.
This business logic can be split up into the following actions:
Update calculated fields;
Update a child record inside the aggregate root. When creating the aggregate root this entails creating a default child record. When updating the aggregate root this entails removing the existing child record and creating a new one if a specific field on the aggregate root has changed;
Propagate start and end date of the aggregate root to the start and end date of the child records inside the aggregate root. These must be kept in sync under certain circumstances;
Propagate a field of the aggregate root to a different aggregate root.
My first attempt is to put all of this on the aggregate root, but I feel this is not going to work. I have the following problems integrating this logic:
All these actions have to be completed as a single whole and should not be made available as separate actions. This has the result that this is going to be very difficult to test (TDD);
I am not clear on whether any of these actions can be moved out to a service. The reason for this is that they make no sense outside of the aggregate root, but it would make TDD a lot easier;
Some logic changes depending on whether a new entity is created or an existing one is modified. Should I put these two branches inside the update logic, or should I make two entirely different paths that share the business code that does not differentiate based on create/modify?
Any help on the above issues would be greatly appreciated and other feedback in general.

The algorithm you've described should remain in the aggregate root, otherwise you end up with an anemic domain model. The exception is propagating a field to another aggregate root; I describe what I think you should do about that later.
As far as TDD is concerned, a method with "package" access on the aggregate root (e.g. "calculate()") should coordinate the entire action; normally either the service or the repository object would call it. This is what tests should exercise, in conjunction with setting different combinations of instance variables. The aggregate root should expose its instance variables and the children collection, and each child should expose its instance variables, through getters; this allows tests to validate their state. In all cases, if you need to hide information, make these getters package or private access and use your unit testing framework to make them public for the purpose of testing.
For your testing environment consider mocking the repository objects (you're using dependency injection, right?) to return hard-coded values. Short of this, consider using something like DbUnit to work with a database in a known state.
As far as the logic changes between create and modify are concerned, are you referring to how to persist, or is there an actual algorithm to consider? If the former, I would make the repository responsible; if the latter, I would make two separate methods (e.g. "calculateCreate()" and "calculateUpdate()") to which calculate() would delegate as appropriate.
Also, there's a concurrency issue to think about, because it sounds as if the calculated values rely on mutable fields. So you need either careful locking or aggregate roots that can only be used by one client at a time. This also applies to propagating a field across aggregates. I would probably use the repository for this purpose, but you need to think carefully about how this should or should not impact other clients who are using the repository object.
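A rough Python sketch of the calculate() coordination described above. The Contract and Child classes, the kind field that triggers child replacement, and the _persisted_kind marker for telling create from update are all assumptions for illustration; the question does not name its entities or fields, and cross-aggregate propagation and locking are left out.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Child:  # hypothetical child record inside the aggregate
    kind: str
    start: date
    end: date


@dataclass
class Contract:  # hypothetical aggregate root
    kind: str
    start: date
    end: date
    duration_days: int = 0  # calculated field
    children: list[Child] = field(default_factory=list)
    _persisted_kind: Optional[str] = None  # None means "not yet created"

    def calculate(self) -> None:
        """Single entry point; the individual steps are not offered separately."""
        self.duration_days = (self.end - self.start).days
        if self._persisted_kind is None:
            self._calculate_create()
        else:
            self._calculate_update()
        for child in self.children:  # keep child dates in sync with the root
            child.start, child.end = self.start, self.end
        self._persisted_kind = self.kind

    def _calculate_create(self) -> None:
        # Creating the root entails creating a default child record.
        self.children = [Child(self.kind, self.start, self.end)]

    def _calculate_update(self) -> None:
        # Replace the child only if the specific field on the root has changed.
        if self.kind != self._persisted_kind:
            self.children = [Child(self.kind, self.start, self.end)]
```

Tests can then set up different field combinations, call calculate(), and inspect duration_days and children, without the individual steps ever being part of the public surface.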

How can I gradually transition to NHibernate persistence logic from existing ADO.NET persistence logic?

The application uses ADO.NET to invoke sprocs for nearly every database operation. Some of these sprocs also contain a fair amount of domain logic. The data access logic for each domain entity resides in the domain class itself, i.e. there is no decoupling between domain logic and data access logic.
I'm looking to accomplish the following:
decouple the domain logic from the data access logic
make the domain model persistence ignorant
implement the transition to NHibernate gradually across releases, refactoring individual portions of the DAL (if you can call it that) at a time
Here's my approach for transitioning a single class to NHibernate persistence
1. create a mapping for the domain class
2. create a repository for the domain class (basic CRUD operations inherited from a generic base repository)
3. create a method in the repository for each sproc used by the old DAL (doing some refactoring along the way to pull out the domain logic)
4. modify consumers to use the repository rather than the data access logic in the class itself
5. remove the old data access logic and the sprocs
The issues I have are with #1 and #4.
(#1) How can I map properties of a type with no NHibernate mapping?
Consider a Person class with an Address property (Address being a domain object without an NH mapping and Person being the class I'm mapping). How can I include Address in the Person mapping without creating an entire mapping for Address?
(#4) How should I manage the dependencies on old data access logic during the transition?
Classes in the domain model utilize the old data access logic that I'm looking to remove. Consider an Order class with a CustomerId property. When the Order needs info on the Customer it invokes the ADO.NET data access logic that resides in the Customer class. What options are there other than maintaining the old data access logic until the dependent classes are mapped themselves?

I would approach it like this:
1. Refactor and move the data access logic out of the domain classes into a data layer.
2. Refactor and move the domain logic out of the sprocs into a data layer. (This step is optional, but doing it will definitely make the transition smoother and easier.)
3. You don't need a repository, but you can certainly create one if you want.
4. Create an NHibernate mapping for every domain class (there are tools that do this).
5. Create an NHibernate-oriented data access API that slowly replaces the sproc data layer.
Steps 1 & 2 are the hardest part as it sounds like you have tight coupling that ideally never would have happened. Neither of these first two steps involve NHibernate at all. You are strictly moving to a more maintainable architecture before trying to swap out your data layer.
While it may be possible to create NHibernate mappings one by one and utilize them without the full object graph being available, that seems like asking for unnecessary pain. You need to proceed very cautiously if you choose that path and I just wouldn't recommend it. To do so, you may leave a foreign key mapped as a plain int/guid instead of as a relation to another domain class, but you have to be very careful you don't corrupt your data by half committing to NHibernate in that way. Automated unit/integration tests are your friend.
Swapping out a data layer is hard. It is easier if you have a solid lowest common denominator data layer architecture, but I wouldn't actually recommend creating an architecture using a lowest common denominator approach. Loose coupling is good, but you can go too far.
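The first step above, moving data access out of the domain classes and behind a data layer, can be sketched as follows. This is Python rather than C#, and it does not use NHibernate or ADO.NET at all: CustomerStore, LegacySprocStore, OrmStore and order_summary are hypothetical stand-ins that only show how consumers can depend on an interface while the implementation behind it is replaced one class at a time.

```python
from dataclasses import dataclass
from typing import Protocol


class CustomerStore(Protocol):
    """Data-layer interface that consumers depend on (hypothetical)."""

    def load_name(self, customer_id: int) -> str: ...


class LegacySprocStore:
    """Stands in for the existing stored-procedure calls (hypothetical)."""

    def load_name(self, customer_id: int) -> str:
        return f"legacy-{customer_id}"


class OrmStore:
    """Stands in for the replacement ORM-based access (hypothetical)."""

    def load_name(self, customer_id: int) -> str:
        return f"orm-{customer_id}"


@dataclass
class Order:
    """Domain class: holds only data and rules, no data access code."""

    id: int
    customer_id: int


def order_summary(order: Order, customers: CustomerStore) -> str:
    # Application-level code receives the store; the domain class never sees it.
    return f"Order {order.id} for {customers.load_name(order.customer_id)}"
```

Until Customer has been migrated, callers pass LegacySprocStore; afterwards they pass OrmStore, and neither Order nor order_summary has to change.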

Search the internet for NHibernate e-books.
