Domain driven design and aggregate references - repository

I am designing the domain model, but there is something that doesn't seem to be ok.
I start with a main aggregate. It has references to other aggregates, and those aggregates reference more aggregates in turn. I can traverse the whole domain model starting from the main aggregate.
The problem I see is that I will be holding all instances of aggregates in memory.
Is that a good design? I can solve the memory problem with lazy loading but I think that I have a deeper problem.
I have another question regarding aggregate references. Should I load the references to other aggregates lazily? If that is the case, I would almost never use their repositories. Is that ok?

Having direct references between aggregate roots (ARs) can lead to problems that cannot be solved by lazy loading. Moreover, it forces all connected ARs to be in the same database and makes it more difficult to reason about and enforce invariants, which is the primary purpose of an AR in the first place. It is better to limit or eliminate direct references between ARs. A great resource to learn about aggregate design is a series of articles by Vaughn Vernon. The basic idea is to make your ARs lean and focused, all while keeping in mind their function: enforcing business constraints and forging a boundary around the root entity. If an AR needs data from another AR to perform its work, this data can be provided to it by an application service via a repository. Also, if references are only needed to fulfill UI requirements, then consider using the read-model pattern.
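To make the "application service via a repository" idea a bit more concrete, here is a minimal sketch in Python. All names (`Order`, `Customer`, `InMemoryRepository`, `add_line_to_order`, the credit-limit rule) are made up for illustration and are not from any particular framework. The `Order` holds only the identity of the customer, and the application service fetches whatever the order needs from the other aggregate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerId:
    value: str


@dataclass
class Customer:
    id: CustomerId
    credit_limit: int


@dataclass
class Order:
    id: str
    customer_id: CustomerId  # identity only, not a Customer object
    total: int = 0

    def add_line(self, price: int, credit_limit: int) -> None:
        # The invariant is enforced inside the aggregate; the data it needs
        # from the other aggregate is passed in as a plain value.
        if self.total + price > credit_limit:
            raise ValueError("order would exceed the customer's credit limit")
        self.total += price


class InMemoryRepository:
    """Hypothetical repository; a real one would talk to a database."""

    def __init__(self):
        self._items = {}

    def add(self, key, item):
        self._items[key] = item

    def get(self, key):
        return self._items[key]


def add_line_to_order(orders, customers, order_id, price):
    """Application service: loads each aggregate through its own repository."""
    order = orders.get(order_id)
    customer = customers.get(order.customer_id)
    order.add_line(price, customer.credit_limit)
    return order
```

With this shape nothing is traversed lazily: each aggregate is loaded explicitly, only when a use case needs it.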

Related

DDD: Aggregates and Deletes

It's taken a while, but I feel I've started to build a good understanding of aggregates in DDD. Keep them small (an Entity with Value Objects whenever possible), and when they contain multiple entities, ensure their reason to exist is to enforce some (transactional) invariant.
Where I come a bit undone is when it comes to the Remove or Delete side of things. Imagine a:
Thread, with Posts
For a long time I would mistake the 'has-a' relationship for an aggregate, but...
The requirement that a Post must have a Thread can be enforced via a factory method on the Thread to add a Post.
Then in lieu of any business rules that require it, they can be separate aggregates. For instance, if you were loading a list of threads, it doesn't make much sense to have to also load all the posts for each thread as well.
What about deleting a Thread though? It makes sense that removing a Thread means the Posts for that thread should go as well. But enforcing that a Post must be removed when its Thread is removed leads to them becoming a single Aggregate with Thread as the aggregate root.
This is just a representative example, but in many cases a 'has-a' relationship implies something like the above, i.e. the child should no longer exist if the parent is removed.
So, any advice for a situation where the only apparent reason to need an aggregate relationship between two entities is delete/remove?
My thinking at the moment?
You don't really delete a Thread. You make it inactive.
When a thread is made inactive, you obviously can't add any new posts (enforced through the factory method). Any posts that belong to the now inactive thread are also made inactive through eventual consistency?
Any other pearls of wisdom learned to ensure not mixing up a 'has-a' relationship with an aggregate root / child entity aggregate?
You don't really delete a Thread. You make it inactive.
See also Don't Delete, Just Don't.
Any other pearls of wisdom learned to ensure not mixing up a 'has-a' relationship with an aggregate root / child entity aggregate?
I'd say that the most important lesson is this: if two pieces of information have to be kept immediately consistent with each other, then they have to be stored together, that is, in the same database. In other words, the need for immediate consistency puts constraints not only on your domain model, but also on your data model.
In business systems, "have to be consistent" is less frequent than you might expect, because the key motivation for "have to be" is "what is the cost to the business if they are not?"
The classic example used here is orders vs inventory; we don't need to have reservable stock on the floor in order to accept a new order -- "backorder" is a real thing in the domain, and is often a better way of doing business than keeping everything immediately consistent.
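Going back to the Thread/Post example from the question, the "make it inactive" approach could look roughly like this. It is only a sketch under assumptions: the class names, the `deactivate` method and the `deactivate_posts_of` handler are invented for illustration. Thread and Post stay separate aggregates, the factory method on Thread guards creation, and the posts catch up later.

```python
from dataclasses import dataclass


@dataclass
class Post:
    thread_id: str
    body: str
    active: bool = True


@dataclass
class Thread:
    id: str
    active: bool = True

    def deactivate(self) -> None:
        self.active = False

    def new_post(self, body: str) -> Post:
        # Factory method: a Post is only created through a Thread,
        # and only while that Thread is active.
        if not self.active:
            raise ValueError("cannot post to an inactive thread")
        return Post(thread_id=self.id, body=body)


def deactivate_posts_of(thread_id: str, posts: list) -> int:
    """Hypothetical handler that runs after the thread was deactivated,
    possibly in a later transaction (eventual consistency)."""
    changed = 0
    for post in posts:
        if post.thread_id == thread_id and post.active:
            post.active = False
            changed += 1
    return changed
```

Between the two steps some posts are still marked active although their thread is not; whether that window is acceptable is exactly the "what is the cost to the business" question.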

Domain Driven Design - Creating general purpose entities vs. Context specific Entities

Situation
Suppose you have Orders and Clients as entities in your application. In one aggregate, the Order entity is considered to be the root but you also want to make use of the Client entity for simple things. In another the Client is the root entity and the Order entity is touched ever so lightly.
An example:
Let's say that in the Order aggregate I use the Client only to read details like name, address, build order history and not to make the client do client specific business logic. (like persistence, passwords resets and back flips..).
On the other hand, in the Client aggregate I use the Order entity to report on the client's buying habits, order totals, order counting, without requiring advanced order functionality like order processing, updating, status changes, etc.
Possible solution
I believe the better solution is to create the entities for each aggregate specific to the aggregate context, because making them full featured (general purpose) and ready for any situation and usage seems like overkill and could potentially become a maintenance nightmare. (and potentially memory intensive)
Question
What is the DDD recommended way of handling this situation?
What is your take on the matter?
The basic driver for these decisions should be the ubiquitous language, and consequently the real world domain you're modeling. If both works in a specific domain, I'd favor separation over god-classes for maintainability reasons.
Apart from separating behavior into different aggregates, you should also take care that you don't mix different bounded contexts. Depending on the requirements of your domain, it could make sense to separate the Purchase Context from the Reporting Context (to extend on your example).
To decide on a context design, context maps are a helpful tool.
You are on the right track. In DDD, entities are not merely containers encapsulating all attributes related to a "subject" (for example: a customer, or an order). This is a very important concept that eludes a lot of people. An entity in DDD represents an operation boundary, thus only the data necessary to perform the operation is considered to be a part of the entity. Exactly which data to include in an entity can be difficult to decide, because some data is relevant in several different use cases. Here are some tips when analyzing data:
Analyze invariants. Things that must be considered when applying validation rules, and that cannot be out of sync, should be in the same aggregate.
Drop the database-thinking. Normalization is not a concern of DDD.
Just because things look the same, it doesn't mean that they are. For example: the current shipping address registered on a customer is different from the shipping address which a specific order was shipped to.
Don't look at reads. Reading, like creating a report or populating a viewmodel/dto/whatever, has nothing to do with operation boundaries and can typically be a 360 deg view of the data. In fact, don't even use your domain model when returning reads; use a different architectural stack.
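The shipping address tip can be sketched like this (Python, with invented names; `place_order` and `move_to` are assumptions for the example, not part of any framework). The order keeps the address it was shipped to as its own immutable value, so a later change to the customer does not rewrite history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    # Value object: immutable, compared by value.
    street: str
    city: str


@dataclass
class Customer:
    name: str
    shipping_address: Address  # the *current* address

    def move_to(self, address: Address) -> None:
        self.shipping_address = address


@dataclass
class Order:
    order_id: str
    shipped_to: Address  # the address *this order* was shipped to


def place_order(order_id: str, customer: Customer) -> Order:
    # Because Address is immutable, handing the same value to the order is
    # safe: the customer can only replace its address, never mutate it.
    return Order(order_id, customer.shipping_address)
```

The two fields look the same in a schema, but they answer different questions and change for different reasons.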

Should the rule "one transaction per aggregate" be taken into consideration when modeling the domain?

Taking into consideration the domain events pattern and this post, why do people recommend keeping to one aggregate per transaction? There are good cases where one aggregate could change the state of another one. Even removing an aggregate (or altering its identity) will lead to altering the state of other aggregates that reference it. Some people say that keeping one transaction per aggregate helps scalability (keeping one aggregate per server). But doesn't this type of thinking break a fundamental characteristic of DDD: being technology agnostic?
So based on the statements above and on your experience, is it bad to design aggregates and domain events that lead to changes in other aggregates, and thus to 2 or more aggregates per transaction (e.g., when a new order is placed with 100 items, change the customer's state from normal to V.I.P.)?
There are several things at play here and even more trade-offs to be made.
First and foremost, you are right, you should think about the model first. After all, the interplay of language, model and domain is what we're doing this all for: coming up with carefully designed abstractions as a solution to a problem.
The tactical patterns - from the DDD book - are a means to an end. In that respect we shouldn't overemphasize them, even though they have served us well (and caused major headaches for others). They help us find "units of consistency" in the model, things that change together, a transactional boundary. And therein lies the problem, I'm afraid. When something happens and when the side effects of it happening should be visible are two different things. Yet all too often they are treated as one, and thus cause this uncomfortable feeling, to which we respond by trying to squeeze everything within the boundary, without questioning. Still, we're left with that uncomfortable feeling. There are a lot of things that logically can be treated as a "whole change", whereas physically there are multiple small changes. It takes skill and experience, or even plain trial and error, to know when that is the case. Not everything can be solved this way, mind you.
To scale or not to scale, that is often the question. If you don't need to scale, keep things on one box, be content with a certain backup/restore strategy, you can bend the rules and affect multiple aggregates in one go. But you have to be aware you're doing just that and not take it as a given, because inevitably change is going to come and it might mess with this particular way of handling things. So, fair warning. More subtle is the question as to why you're changing multiple aggregates in one go. People often respond to that with the "your aggregate boundaries are wrong" answer. In reality it means you have more domain and model exploration to do, to uncover the true motivation for those synchronous, multi-aggregate changes. Often a UI or service is the one that has this "unreasonable" expectation. But there might be other reasons and all it might take is a different set of abstractions to solve the same problem. This is a pretty essential aspect of DDD.
The example you gave seems like something I could handle as two separate transactions: an order was placed, and as a reaction to that, because the order was placed with 100 items, the customer was made a VIP. As MikeSW hinted at in his answer (I started writing mine after he posted his), the question is when, who, how, and why should this customer status change be observed. Basically it's the "next" behavior that dictates the consistency requirements of the previous behavior(s).
An aggregate groups related business objects while an aggregate root (AR) is the 'representative' of that aggregate. The AR itself is an entity modeling a (bigger, more complex) domain concept. In DDD a model is always relative to a context (the bounded context - BC), i.e. that model is valid only in that BC.
This allows you to define a model representative of the specific business context, and you don't need to shove everything into one model. An Order is an AR in one context, while in another it is just an id.
Since an AR pretty much encapsulates all the lower concepts and business rules, it acts as a whole i.e as a transaction/unit of work. A repository always works with AR because 1) a repo always deals with business objects and 2) the AR represents the business object for a given context.
When you have a use case involving 2 or more AR the business workflow and the correct modelling of that use case is paramount. In a lot of cases those AR can be modified independently (one doesn't care about other) or an AR changes as a result of other AR behaviour.
In your example, it's pretty trivial: when the customer places an order for 100 items, a domain event is generated and published. Then you have a handler which will check if the order complies with the customer promotions rules and if it does, a command is issued which will have the result of changing the client state to VIP.
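A rough sketch of that flow in Python follows. The `EventBus` here is a hypothetical in-process stand-in for whatever messaging you actually use, and the threshold of 100 items is simply taken from the question's example. The point is that placing the order and promoting the customer happen in two separate steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    customer_id: str
    item_count: int


@dataclass
class Customer:
    id: str
    status: str = "normal"

    def promote_to_vip(self) -> None:
        self.status = "VIP"


class EventBus:
    """Hypothetical in-process bus: events are queued on publish and
    handled later, which stands in for eventual consistency."""

    def __init__(self):
        self._handlers = {}
        self._pending = []

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event):
        self._pending.append(event)  # queued, not handled yet

    def dispatch_pending(self):
        while self._pending:
            event = self._pending.pop(0)
            for handler in self._handlers.get(type(event), []):
                handler(event)


VIP_THRESHOLD = 100  # assumed promotion rule, taken from the question


def make_vip_handler(customers):
    # `customers` is a plain dict standing in for a customer repository.
    def handle(event):
        if event.item_count >= VIP_THRESHOLD:
            customers[event.customer_id].promote_to_vip()
    return handle
```

Right after `publish` the customer is still "normal"; the status only changes once `dispatch_pending` runs, and that can be its own transaction touching only the Customer aggregate.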
Domain events are very powerful and allow you to implement transactions, but in an eventually consistent environment. The old db transaction is an implementation detail, and it's usually used when persisting one AR (remember that an AR is treated as a logical unit, but persisting one may involve multiple tables, hence the db transaction).
Eventual consistency is a 'feature' of domain events which naturally fits a rich domain (and the real world, actually). For some cases you might need instant consistency; however, those are particular cases, and they are related to the UI rather than to how the Domain works. Of course, it really depends on the domain. In your example, the customer won't mind becoming a VIP 2 seconds or 2 minutes after the order was placed instead of in the same millisecond.

Is structure (graph) of objects an Aggregate Root worthy of a Repository?

Philosophical DDD question here...
I've seen a lot of Entity vs. Value Object discussions here, but mine is slightly different. Forgive me if this has been covered before.
I'm working in the financial domain at the moment. We have funds (hedge variety). Those funds often invest into other funds. This results in a tree structure of sorts with one fund at the top anchoring it all together.
Obviously, a fund is an Entity (Aggregate Root, even). Things like trades and positions are most likely Value Objects.
My question is: Should the tree structure itself be considered an Aggregate Root?
Some thoughts:
The tree structure is stored in the DB by storing the components and the positions they have in each other. We currently have no coded concept of the tree. The domain is very weak.
The tree structure has no "uniqueness" or identifier.
There is logic needed in many places to "walk" the tree to find the relationships to each other, either top-down, or sometimes bottom-up. This logic needs to be encapsulated somewhere.
There is lots of logic to compute leverage, exposure, etc... and roll it up the tree.
Is it good enough to treat the Fund as a Composite Fund object and that is the Aggregate Root with in-built Invariants? Or is a more formal tree structure useful in this case?
I usually take a more functional/domain approach to designing my aggregates and aggregate roots.
This results in a tree structure of sorts
Maybe you can talk with your domain expert to see if that notion deserves to be a first-class citizen with a name of its own in the ubiquitous language (FundTree, FundComposition... ?)
Once that is done, making it an aggregate root will basically depend on whether you consider the entity to be one of the main entry points in the application, i.e. will you sometimes need a reference to a FundTree before even having any reference to a Fund, or if you can afford to obtain it only by traversal of a Fund.
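Just to illustrate what a first-class `FundTree` might encapsulate, here is a sketch in Python. Everything in it is an assumption: the `Position` shape, the idea that a position carries a fraction of the target fund, and the simplistic roll-up rule (own exposure plus the held fraction of each child's rolled-up exposure). Your real leverage and exposure calculations will certainly differ. It also assumes the structure has no cycles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    investor_id: str
    target_id: str
    fraction: float  # assumed: share of the target fund held by the investor


class FundTree:
    """Keeps the 'walking' logic in one place instead of scattering it."""

    def __init__(self, direct_exposure, positions):
        self._direct = dict(direct_exposure)  # fund id -> own exposure
        self._positions = list(positions)

    def children_of(self, fund_id):
        # Top-down step: positions this fund holds in other funds.
        return [p for p in self._positions if p.investor_id == fund_id]

    def parents_of(self, fund_id):
        # Bottom-up step: funds that invest into this fund.
        return [p.investor_id for p in self._positions if p.target_id == fund_id]

    def rolled_up_exposure(self, fund_id):
        # Own exposure plus the held fraction of each child's rolled-up value.
        total = self._direct.get(fund_id, 0.0)
        for position in self.children_of(fund_id):
            total += position.fraction * self.rolled_up_exposure(position.target_id)
        return total
```

Whether something like this gets its own repository then comes down to the entry-point question above: do you ever need the tree before you have a Fund in hand?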
This is more a decision of if you want to load full trees at all times really.
If you are overly strict about what you define as an aggregate root, then you will find a lot of bloat, as you will be loading full object trees any time you load them.
There is no one size fits all approach to this, but in my opinion, you should have your relationships all mapped to your aggregate roots where possible, but in some cases a part of that tree can be treated as an aggregate root when needed.
If you're in a web environment, this is a different decision to a desktop application.
In the web, you are starting again on every page load, so I tend to have a good MODEL to map the relationships and a repository for pretty much every entity (as I always need to save just a small part of something from some popup somewhere) and pull it together with services that are done per aggregate root. It makes the code predictable and stops those... "umm.... is this a root" moments or repositories that become unmanageable.
Then I will have mappers that can give me summary and/or listitem views of large trees as needed and when needed.
On a desktop app, you keep things in memory a lot more, so you will write less code by just working out what your aggregate roots are and loading them when you need them.
There is no right or wrong to this. I doubt you could build a big app of any sort without making compromises on what is considered an aggregate root, and you'll always end up in a situation where 2 roots end up joining each other somewhere.

Practical usage of the Unit Of Work & Repository patterns

I'm building an ORM, and I'm trying to find out what the exact responsibilities of each pattern are. Let's say I want to transfer money between two accounts, using the Unit Of Work to manage the updates in a single database transaction.
Is the following approach correct?
Get them from the Repository
Attach them to my Unit Of Work
Do the business transaction & commit?
Example:
from = accountRepository.find(fromAccountId);
to = accountRepository.find(toAccountId);
unitOfWork.attach(from);
unitOfWork.attach(to);
unitOfWork.begin();
from.withdraw(amount);
to.deposit(amount);
unitOfWork.commit();
Should, as in this example, the Unit Of Work and the Repository be used independently, or:
Should the Unit Of Work use internally a Repository and have the ability to load objects?
... or should the Repository use internally a Unit Of Work and automatically attach any loaded entity?
All comments are welcome!
The short answer would be that the Repository would be using the UoW in some way, but I think the relationship between these patterns is less concrete than it would initially seem. The goal of the Unit Of Work is to create a way to essentially lump a group of database related functions together so they can be executed as an atomic unit. There is often a relationship between the boundaries created when using UoW and the boundaries created by transactions, but this relationship is more of a coincidence.
The Repository pattern, on the other hand, is a way to create an abstraction resembling a collection over an Aggregate Root. More often than not the sorts of things you see in a repository are related to querying or finding instances of the Aggregate Root. A more interesting question (and one which doesn't have a single answer) is whether it makes sense to add methods that deal with something other than querying for Aggregates. On the one hand there could be some valid cases where you have operations that would apply to multiple Aggregates. On the other it could be argued that if you're performing operations on more than one Aggregate you are actually performing a single action on another Aggregate. If you are only querying data I don't know if you really need to create the boundaries implied by the UoW. It all comes down to the domain and how it is modeled.
The two patterns operate at very different levels of abstraction, and the involvement of the Unit Of Work is going to depend on how the Aggregates are modeled as well. The Aggregates may want to delegate work related to persistence to the Entities they are managing, or there could be another layer of abstraction between the Aggregates and the actual ORM. If your Aggregates/Entities are dealing with persistence themselves, then it may be appropriate for the Repositories to also manage that persistence. If not, then it doesn't make sense to include UoW in your Repository.
If you want to create something for general public consumption outside of your organization, then I would suggest creating your Repository interfaces/base implementations in a way that would allow them to interact directly with your ORM or not, depending on the needs of the user of your ORM. If this is internal, and you are doing the persistence work in your Aggregates/Entities, then it makes sense for your Repository to make use of your UoW. For a generic Repository it would make sense to provide access to the UoW object from within Repository implementations that can make sure it is initialized and disposed of appropriately. On that note, there will also be times when you would likely want to use multiple Repositories within what would be a single UoW boundary, so you would want to be able to pass in an already primed UoW to the Repository in that case.
I recommend the approach where the repository uses the UoW internally. This approach has some advantages, especially for web applications.
In a web application, the recommended pattern is one Unit of Work (session) per HTTP request. So if your repositories share a UoW, you will be able to use the 1st level cache (an identity map) for objects that were requested by other repositories (like data dictionaries that are referenced by multiple aggregates). You will also have to commit only one transaction instead of several, so it will perform much better.
You could take a look at the source code of Hibernate/NHibernate, which are mature ORMs in the Java/.NET world.
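Here is a small sketch of that arrangement in Python, applied to the money transfer from the question. It is deliberately simplified and all names are invented: the "database" is a plain dict of account id to balance, and `commit` is a simple write pass where a real ORM would use an actual database transaction.

```python
from dataclasses import dataclass


@dataclass
class Account:
    id: str
    balance: int

    def withdraw(self, amount: int) -> None:
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def deposit(self, amount: int) -> None:
        self.balance += amount


class UnitOfWork:
    def __init__(self, store):
        self._store = store      # stand-in for a database: id -> balance
        self._identity_map = {}  # first-level cache of loaded entities

    def get_or_load(self, account_id):
        # Every repository sharing this unit of work gets the same instance.
        if account_id not in self._identity_map:
            balance = self._store[account_id]
            self._identity_map[account_id] = Account(account_id, balance)
        return self._identity_map[account_id]

    def commit(self):
        # One write pass for everything loaded in this unit of work.
        for account in self._identity_map.values():
            self._store[account.id] = account.balance
        self._identity_map.clear()


class AccountRepository:
    """Uses the unit of work internally, so loaded entities are attached
    automatically instead of by an explicit attach() call."""

    def __init__(self, uow):
        self._uow = uow

    def find(self, account_id):
        return self._uow.get_or_load(account_id)
```

The calling code then shrinks to find, withdraw, deposit, commit. With one UoW per HTTP request, the commit would typically sit at the end of the request.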
Good Question!
Depends on what your work boundaries are going to be. If they are going to span multiple repositories then you might have to create another abstraction to ensure that multiple repositories are covered. It would be like a small "service" layer that is defined in Domain Driven Design.
If your unit of work is going to be pretty much per Repository then I would go with the second option.
My question to you, however, would be: why worry about repositories when writing an ORM? They are going to be defined and used by the consumers of your Unit of Work, right? If so, you have no option but to just provide a Unit of Work; your consumers will have to enlist their repositories with it and will also be responsible for controlling the boundaries of the unit of work. Isn't that so?