Domain Driven Design - Creating general purpose entities vs. Context specific Entities - oop

Situation
Suppose you have Orders and Clients as entities in your application. In one aggregate, the Order entity is considered to be the root but you also want to make use of the Client entity for simple things. In another the Client is the root entity and the Order entity is touched ever so lightly.
An example:
Let's say that in the Order aggregate I use the Client only to read details like name, address, build order history and not to make the client do client specific business logic. (like persistence, passwords resets and back flips..).
On the other hand, in the Client aggregate I use the Order entity to report on the client's buying habbits, order totals, order counting, without requiring advanced order functionality like order processing, updating, status changes, etc.
Possible solution
I believe the better solution is to create the entities for each aggregate specific to the aggregate context, because making them full featured (general purpose) and ready for any situation and usage seems like overkill and could potentially become a maintenance nightmare. (and potentially memory intensive)
Question
What is the DDD recommended way of handling this situation?
What is your take on the matter?

The basic driver for these decisions should be the ubiquitous language, and consequently the real world domain you're modeling. If both works in a specific domain, I'd favor separation over god-classes for maintainability reasons.
Apart from separating behavior into different aggregates, you should also take care that you don't mix different bounded contexts. Depending on the requirements of your domain, it could make sense to separate the Purchase Context from the Reporting Context (to extend on your example).
To decide on a context design, context maps are a helpful tool.

You are one the right track. In DDD, entities are not merely containers encapsulating all attributes related to a "subject" (for example: a customer, or an order). This is a very important concept that eludes a lot of people. An entity in DDD represents an operation boundary, thus only the data necessary to perform the operation is considered to be a part of the entity. Exactly which data to include in an entity can be difficult to consider because some data is relevant in a different use-cases. Here are some tips when analyzing data:
Analyze invariants, things that must be considered when applying validation rules and that can not be out of sync should be in the same aggregate.
Drop the database-thinking, normalization is not a concern of DDD
Just because things look the same, it doesn't mean that they are. For example: the current shipping address registered on a customer is different from the shipping address which a specific order was shipped to.
Don't look at reads. Reading, like creating a report or populating av viewmodel/dto/whatever has nothing to do with operation boundaries and can typically be a 360 deg view of the data. In fact don't event use your domain model when returning reads, use a different architectural stack.

Related

DDD: How many aggregates should have a single bounded context?

How many aggregates should have a single bounded context?
I'm asking this question due the reason, that the information from books and other resources are too broad/abstract.
I suppose, that it depends on certain domain model and its structure. How many bounded contexts do have a domain model? How many entities there are in each bounded context. I suppose, that all these questions make dependency on that fact, how many aggregates should be in a single bounded context.
Also, if to recall the SOLID principles and the common idea to have the small loosely coupled pieces of code. I suppose, that it's fine to have maximum 3-4 aggregates per single bounded context. If there are more aggregates in single bounded context, then there are probably some issues with the software design.
I'm reading the Vernon's book right now about DDD, but it's rather difficult to understand how to design certainly such things.
The trite answer is “just enough, but not too many”. There is no real guidance on how many aggregates to put in a bounded context.
The thing that drives aggregates and entities is the Ubiquitous Language that is used to describe the context. The Ubiquitous Language is different for each context, and the entities and aggregate roots needed in the context can be found in the nouns used in the language. Once you have the domain fully described by the language, count up the nouns that have a special meaning in that language and you have a count of the entities necessary.
Bear in mind, though, that I've rarely come across a bounded context that was "fully described". The goal is "described fully enough for this release". Therefore for any release the number of entities won't be "enough" and you'll probably have plans of adding more. Whether those plans ever rise to the top of the priority queue is another question.
How many aggregates should have a single bounded context?
All aggregates should have a single bounded context. You can almost work that out backwards - an aggregate is going to be stored in a single database, a database is going to belong to a single (micro) service, a service is going to serve a single bounded context; therefore it follows that an aggregate is going to belong to a single bounded context.
Where things can get messy: it's easy to take some broad business concept, like "order", and try to create a single representation for order that works for every bounded context. That's not the goal though -- the goal is for each context to have a representation of order that works in that context.
Common example: sales, billing, fulfillment may all care about "order", but the information that they need to share is largely just the order id, which acts as a correlation identifier so that they can coordinate conversations.
See Mauro Servienti: All of Our Aggregates Are Wrong

What is the use of single responsibility principle?

I am trying to understand the Single Responsibility principle but I have tough time in grasping the concept. I am reading the book "Design Patterns and Best Practices in Java by Lucian-Paul Torje; Adrian Ianculescu; Kamalmeet Singh ."
In this book I am reading Single responsibility principle chapter ,
where they have a car class as shown below:
They said Car has both Car logic and database operations. In future if we want to change database then we need to change database logic and might need to change also car logic. And vice versa...
The solution would be to create two classes as shown below:
My question is even if we create two classes , let’s consider we are adding a new property called ‘price’ to the class CAR [Or changing the property ‘model’ to ‘carModel’ ] then don’t you think we also need to update CarDAO class like changing the SQL or so on.
So What is the use of SRP here?
Great question.
First, keep in mind that this is a simplistic example in the book. It's up to the reader to expand on this a little and imagine more complex scenarios. In all of these scenarios, further imagine that you are not the only developer on the team; instead, you are working in a large team, and communication between developers often take the form of negotiating class interfaces i.e. APIs, public methods, public attributes, database schemas. In addition, you often will have to worry about rollbacks, backwards compatibility, and synchronizing releases and deploys.
Suppose, for example, that you want to swap out the database, say, from MySQL to PostgreSQL. With SRP, you will reimplement CarDAO, change whatever dialect-specific SQL was used, and leave the Car logic intact. However, you may have to make a small change, possibly in configuration, to tell Car to use the new PostgreSQL DAO. A reasonable DI framework would make this simple.
Suppose, in another example, that you want to delegate CarDAO to another developer to integrate with memcached, so that reads, while eventually consistent, are fast. Again, this developer would not need to know anything about the business logic in Car. Instead, they only need to operate behind the CRUD methods of CarDAO, and possibly declare a few more methods in the CarDAO API with different consistency guarantees.
Suppose, in yet another example, your team hires a database engineer specializing in compliance law. In preparation for the upcoming IPO, the database engineer is tasked with keeping an audit log of all changes across all tables in the company's 35 databases. With SRP, our intrepid DBA would not have to worry about any of the business logic using any of our tables; instead, their mutation tracking magic can be deftly injected into DAOs all over, using decorators or other aspect programming techniques. (This could also be done of the other side of the SQL interface, by the way.)
Alright one last one - suppose now that a systems engineer is brought onto the team, and is tasked with sharding this data across multiple regions (data centers) in AWS. This engineer could take SRP even further and add a component whose only role is to tell us, for each ID, the home region of each entity. Each time we do a cross-region read, the new component bumps a counter; each week, an automated tool migrates data frequently read across regions into a new home region to reduce latency.
Now, let's take our imagination even further, and assume that business is booming - suddenly, you are working for a Fortune 500 company with multiple departments spanning multiple countries. Business Analysts from the Finance Department want to use your table to plot quarterly growth in auto sales in their post-IPO investor reports. Instead of giving them access to Car (because the logic used for reporting might be different from the logic used to prepare data for rendering on a web UI), you could, potentially, create a read-only interface for CarDAO with a short list of carefully curated public attributes that you now have to maintain across department boundaries. God forbid you have to rename one of these attributes: be prepared for a 3-month sunset plan and many many sad dashboards and late-night escalations. (And please don't give them direct access to the actual SQL table, because the implicit assumption will be that the entire table is the public interface.) Oops, my scars may be showing.
A corollary is that, if you need to change the business logic in Car (say, add a method that computes the lower sale price of each Tesla after an embarrassing recall), you wouldn't touch the CarDAO, since if car.brand == 'Tesla; price = price * 0.6 has nothing to do with data access.
Additional Reading: CQRS
For adding new property you need to change both classes only if that property should be saved to database. If it is a property used in business logic then you do not need to change DAO. Also if you change your database from one vendor to another or from SQL to NoSQL you will have to make changes only in DAO class. And if you need to change some business logic then you need to change only Car class.
Single responsibility principle as stated by Robert C. Martin means that
A class should have only one reason to change.
Keeping this principle in mind will generally lead to smaller and highly cohesive classes, which in turn means that less people need to work on these classes simultaneously, and the code becomes more robust.
In your example, keeping data access and business logic (price calculation) logic separate means that you are less likely to break the other when making changes.

Aggregate - Correct Usage (DDD)

I have been trying to get started on Domain Driven Design (DDD) and therefore I've been studying it for a while now. I have a problem and I seek help around how I can solve it in a DDD fashion.
I have a Client class, which contains a hell lot of attributes - some of them are simple attributes, such as string contactName whereas others are complex ones, such as list addresses, list websites, etc.
DDD advocates that Client should be an Entity and it should also be an Aggregate root - ie, the client code should manipulate only the Client object itself and it's down to the Client object to perform operations on its inner objects (addresses, websites, names, etc.).
Here's the point where I get confused:
There are tons of business rules in the application that depend on the Client's inner objects - for instance:
Depending on the Client's country of birth or resident and her address, some FATCA (an US regulation) restrictions may be applicable.
I need to enrich some inner objects with data that comes from other systems, both internal to my organisation as well as external.
The application has to decide whether a Client is allowed to perform an operation and to that end, the app needs to scrutinize a lot of client details and make a decision - also, as the app scrutinizes the Client it needs to update many of its attributes to keep track of what led the application to that decision.
I could list hundreds of rules here - but you get the idea. My point is that I need to update many of the Client's inner attributes. From the domain perspective, the root is the Client - it's the Client that the user searches for in the GUI. The user cares only about the Client as a whole. Say, an isolated address is meaningless - it only exists if it's part of a Client.
Having said all that, my question is:
Eric Evans says it's OK for the root to return transient references to inner objects, preferably VOs (keyword here: VO) - but any manipulation on the inner objects should be performed by the root itself.
I have hundreds of manipulations that I need to perform on my clients - if I move all of them to the root, the root is going to become huge - it will have at least 10K lines of code!
According to Eric, a VO should be immutable - so if my root returns VOs, the client code won't be allowed to change them. So doing something like this would be unacceptable in a service: client.getExternalInfo().update(getDataFromExternalSystem())
So my question boils down to how on Earth I should update the inner objects without breaking the DDD rules?
I don't see any easy way out.
UPDATE I:
I've just come across Specifications, which seems to be the ideal DDD concept to my problem.
I'm still reading about it but I have decided to post this update anyway.
I have been studying DDD for awhile myself and am struggling to master it.
First, you're right: Specification is a fine pattern to use for validation or business rules in general, assuming the rules you are applying fit well with a predicate tree.
Of course, I don't know the specifics of your design, but I do wonder about the model itself. You mention that your Client class has "a hell lot of attributes". Are you sure your model is not somewhat anemic? Could your design benefit from some more analysis, perhaps breaking it out into other Aggregates? Is this a single Bounded Context? Should it be?
Specifications is definitely the way to go for complex business logic.
One question though - are you modeling the inner entities like addresses and names as ValueObjects? The best rule of thumb I can think of for those is if you can say they're equal, without an ID, they're likely value objects. Does your domain consider names to have a state?
If you're looking at a problem where few entities take in many types of change AND need an audit trail, you might want to also explore EventSourcing. Ideally the entity declares its reaction to an event, but you can also have the mutating code be held in the event for easy extensibility. There's pros and cons in that approach, of course.

Should the rule "one transaction per aggregate" be taken into consideration when modeling the domain?

Taking into consideration the domain events pattern and this post , why do people recomend keeping one aggregate per transaction model ? There are good cases when one aggregate could change the state of another one . Even by removing an aggregate (or altering it's identity) will lead to altering the state of other aggregates that reference it. Some people say that keeping one transaction per aggregates help scalability (keeping one aggregate per server) . But doesn't this type of thinking break the fundamental characteristic about DDD : technology agnostic ?
So based on the statements above and on your experience, is it bad to design aggregates, domain events, that lead to changes in other aggregates and this will lead to having 2 or more aggregates per transaction (ex. : when a new order is placed with 100 items change the customer's state from normal to V.I.P. )?
There are several things at play here and even more trade-offs to be made.
First and foremost, you are right, you should think about the model first. Afterall, the interplay of language, model and domain is what we're doing this all for: coming up with carefully designed abstractions as a solution to a problem.
The tactical patterns - from the DDD book - are a means to an end. In that respect we shouldn't overemphasize them, eventhough they have served us well (and caused major headaches for others). They help us find "units of consistency" in the model, things that change together, a transactional boundary. And therein lies the problem, I'm afraid. When something happens and when the side effects of it happening should be visible are two different things. Yet all too often they are treated as one, and thus cause this uncomfortable feeling, to which we respond by trying to squeeze everything within the boundary, without questioning. Still, we're left with that uncomfortable feeling. There are a lot of things that logically can be treated as a "whole change", whereas physically there are multiple small changes. It takes skill and experience, or even blunt trying to know when that is the case. Not everything can be solved this way mind you.
To scale or not to scale, that is often the question. If you don't need to scale, keep things on one box, be content with a certain backup/restore strategy, you can bend the rules and affect multiple aggregates in one go. But you have to be aware you're doing just that and not take it as a given, because inevitably change is going to come and it might mess with this particular way of handling things. So, fair warning. More subtle is the question as to why you're changing multiple aggregates in one go. People often respond to that with the "your aggregate boundaries are wrong" answer. In reality it means you have more domain and model exploration to do, to uncover the true motivation for those synchronous, multi-aggregate changes. Often a UI or service is the one that has this "unreasonable" expectation. But there might be other reasons and all it might take is a different set of abstractions to solve the same problem. This is a pretty essential aspect of DDD.
The example you gave seems like something I could handle as two separate transactions: an order was placed, and as a reaction to that, because the order was placed with a 100 items, the customer was made a VIP. As MikeSW hinted at in his answer (I started writing mine after he posted his), the question is when, who, how, and why should this customer status change be observed. Basically it's the "next" behavior that dictates the consistency requirements of the previous behavior(s).
An aggregate groups related business objects while an aggregate root (AR) is the 'representative' of that aggregate. Th AR itself is an entity modeling a (bigger, more complex) domain concept. In DDD a model is always relative to a context (the bounded context - BC) i.e that model is valid only in that BC.
This allows you to define a model representative of the specific business context and you don't need to shove everything in one model only. An Order is an AR in one context, while in another is just an id.
Since an AR pretty much encapsulates all the lower concepts and business rules, it acts as a whole i.e as a transaction/unit of work. A repository always works with AR because 1) a repo always deals with business objects and 2) the AR represents the business object for a given context.
When you have a use case involving 2 or more AR the business workflow and the correct modelling of that use case is paramount. In a lot of cases those AR can be modified independently (one doesn't care about other) or an AR changes as a result of other AR behaviour.
In your example, it's pretty trivial: when the customer places an order for 100 items, a domain event is generated and published. Then you have a handler which will check if the order complies with the customer promotions rules and if it does, a command is issued which will have the result of changing the client state to VIP.
Domain events are very powerful and allows you to implement transactions but in an eventual consistent environment. The old db transaction is an implementation detail and it's usually used when persisting one AR (remember AR are treated as a logical unit but persisting one may involve multiple tables hence db transaction).
Eventual consistency is a 'feature' of domain events which fits naturally a rich domain (and the real world actually). For some cases you might need instant consistency however those are particular cases and they are related to UI rather than how Domain works. Of course, it really depends from one domain to another. In your example, the customer won't mind it became a VIP 2 seconds or 2 minutes after the order was placed instead of the same milisecond.

What are the principles behind, and benefits of, the "party model"?

The "party model" is a "pattern" for relational database design. At least part of it involves finding commonality between many entities, such as Customer, Employee, Partner, etc., and factoring that into some more "abstract" database tables.
I'd like to find out your thoughts on the following:
What are the core principles and motivating forces behind the party model?
What does it prescribe you do to your data model? (My bit above is pretty high level and quite possibly incorrect in some ways. I've been on a project that used it, but I was working with a separate team focused on other issues).
What has your experience led you to feel about it? Did you use it, and if so, would you do so again? What were the pros and cons?
Did the party model limit your choice of ORMs? For example, did you have to eliminate certain ORMs because they didn't allow for enough of an "abstraction layer" between your domain objects and your physical data model?
I'm sure every response won't address every one of those questions ... but anything touching on one or more of them is going to help me make some decisions I'm facing.
Thanks.
What are the core principles and motivating forces behind the party
model?
To the extent that I've used it, it's mostly about code reuse and flexibility. We've used it before in the guest / user / admin model and it certainly proves its value when you need to move a user from one group to another. Extend this to having organizations and companies represented with users under them, and it's really providing a form of abstraction that isn't particularly inherent in SQL.
What does it prescribe you do to your data model? (My bit above is
pretty high level and quite possibly
incorrect in some ways. I've been on a
project that used it, but I was
working with a separate team focused
on other issues).
You're pretty correct in your bit above, though it needs some more detail. You can imagine a situation where an entity in the database (call it a Party) contracts out to another Party, which may in turn subcontract work out. A party might be an Employee, a Contractor, or a Company, all subclasses of Party. From my understanding, you would have a Party table and then more specific tables for each subclass, which could then be further subclassed (Party -> Person -> Contractor).
What has your experience led you to feel about it? Did you use it, and if
so, would you do so again? What were
the pros and cons?
It has its benefits if you need flexibly to add new types to your system and create relationships between types that you didn't expect at the beginning and architect in (users moving to a new level, companies hiring other companies, etc). It also gives you the benefit of running a single query and retrieving data for multiple types of parties (Companies,Employees,Contractors). On the flip side, you're adding additional layers of abstraction to get to the data you actually need and are increasing load (or at least the number of joins) on the database when you're querying for a specific type. If your abstraction goes too far, you'll likely need to run multiple queries to retrieve the data as the complexity would start to become detrimental to readability and database load.
Did the party model limit your choice of ORMs? For example, did you
have to eliminate certain ORMs because
they didn't allow for enough of an
"abstraction layer" between your
domain objects and your physical data
model?
This is an area that I'm admittedly a bit weak in, but I've found that using views and mirrored abstraction in the application layer haven't made this too much of a problem. The real problem for me has always been a "where is piece of data X living" when I want to read the data source directly (it's not always intuitive for new developers on the system either).
The idea behind the party models (aka entity schema) is to define a database that leverages some of the scalability benefits of schema-free databases. The party model does that by defining its entities as party type records, as opposed to one table per entity. The result is an extremely normalized database with very few tables and very little knowledge about the semantic meaning of the data it stores. All that knowledge is pushed to the data access in code. Database upgrades using the party model are minimal to none, since the schema never changes. It’s essentially a glorified key-value pair data model structure with some fancy names and a couple of extra attributes.
Pros:
Kick-ass horizontal scalability. Once your 5-6 tables are defined in your entity model, you can go to the beach and sip margaritas. You can virtually scale this database out as much as you want with minimum efforts.
The database supports any data structure you throw at it. You can also change data structures and party/entities definitions on the fly without affecting your application. This is very very powerful.
You can model any arbitrary data entity by adding records, not changing the schema. Meaning you can say goodbye to schema migration scripts.
This is programmers’ paradise, since the code they write will define the actual entities they use in code, and there are no mappings from Objects to Tables or anything like that. You can think of the Party table as the base object of your framework of choice (System.Object for .NET)
Cons:
Party/Entity models never play well with ORMs, so forget about using EF or NHibernate to get semantically meaningful entities out of your entity database.
Lots of joins. Performance tuning challenges. This ‘con’ is relative to the practices you use to define your entities, but is safe to say that you’ll be doing a lot more of those mind-bending queries that will bring you nightmares at night.
Harder to consume. Developers and DB pros unfamiliar with your business will have a harder time to get used to the entities exposed by these models. Since everything is abstract, there no diagram or visualization you can build on top of your database to explain what is stored to someone else.
Heavy data access models or business rules engines will be needed. Basically you have to do the work of understanding what the heck you want out of your database at some point, and your database model is not going to help you this time around.
If you are considering a party or entity schema in a relational database, you should probably take a look at other solutions like a NoSql data store, BigTable or KV Stores. There are some great products out there with massive deployments and traction such as MongoDB, DynamoDB, and Cassandra that pioneered this movement.
This is a vast topic, I would recommend reading The Data Model Resource Book Volume 3 - Universal Patterns for Data Modeling by Len Silverston and Paul Agnew.
I've just received my copy and it's pretty good - It provides you with an overlook for many approaches to data modeling, including hybrid contextual role patterns and so on. It has detailed PROs and CONs for every approach.
There is a pletheora of ways to model party relationships and roles all with their benefits and disadvantages. The question that was accepted as an answer covers just one instance of a 'party model'.
For instance, in many approaches, notions like "Employee", "Project Manager" etc. are roles that a party can play within a certain context. I will try to give you a better breakdown once I get home.
When I was part of a team implementing these ideas in the early 1980's, it did not limit our choice of ORM's because those hadn't been invented yet.
I'd fall back on those ideas any time, as that particular project was one of the most convincing proofs-of-concept I have ever seen of a "revolutionary" idea (which it certainly was at the time).
It forces you to nothing. And it doesn't stop you from anything (from any mistake, I mean). The one defining your own information model is you.
All parties have lots of properties in common. The fact that they have a name and such (we called those "signaletics"). The fact that they have principal/primary locations called "addresses". The fact that they all are involved, in some sense, in the business' contracts.
as a simple talk from my understanding: Party modeling gives the flexibility and needs more effort (like T-sql join and ...) to be implemented.
I also wanna point that, "using Party modeling (serialization/generalization) gives you the ability to have FK-Relation to other tables". for example: think of different types of users (admin, user, ...) which generalized into User table, and you can have UserID in your Authorization table.
I'm not sure, but the party model sounds like a particular case of the generalization-specialization pattern. A search on "generalization specialization relational modeling" finds some interesting articles.