Aggregate - Correct Usage (DDD) - oop

I have been trying to get started on Domain Driven Design (DDD) and therefore I've been studying it for a while now. I have a problem and I seek help around how I can solve it in a DDD fashion.
I have a Client class, which contains a hell lot of attributes - some of them are simple attributes, such as string contactName whereas others are complex ones, such as list addresses, list websites, etc.
DDD advocates that Client should be an Entity and it should also be an Aggregate root - ie, the client code should manipulate only the Client object itself and it's down to the Client object to perform operations on its inner objects (addresses, websites, names, etc.).
Here's the point where I get confused:
There are tons of business rules in the application that depend on the Client's inner objects - for instance:
Depending on the Client's country of birth or residence and her address, some FATCA (a US regulation) restrictions may be applicable.
I need to enrich some inner objects with data that comes from other systems, both internal to my organisation as well as external.
The application has to decide whether a Client is allowed to perform an operation and to that end, the app needs to scrutinize a lot of client details and make a decision - also, as the app scrutinizes the Client it needs to update many of its attributes to keep track of what led the application to that decision.
I could list hundreds of rules here - but you get the idea. My point is that I need to update many of the Client's inner attributes. From the domain perspective, the root is the Client - it's the Client that the user searches for in the GUI. The user cares only about the Client as a whole. Say, an isolated address is meaningless - it only exists if it's part of a Client.
Having said all that, my question is:
Eric Evans says it's OK for the root to return transient references to inner objects, preferably VOs (keyword here: VO) - but any manipulation on the inner objects should be performed by the root itself.
I have hundreds of manipulations that I need to perform on my clients - if I move all of them to the root, the root is going to become huge - it will have at least 10K lines of code!
According to Eric, a VO should be immutable - so if my root returns VOs, the client code won't be allowed to change them. So doing something like this would be unacceptable in a service: client.getExternalInfo().update(getDataFromExternalSystem())
So my question boils down to how on Earth I should update the inner objects without breaking the DDD rules?
I don't see any easy way out.
UPDATE I:
I've just come across Specifications, which seems to be the ideal DDD concept to my problem.
I'm still reading about it but I have decided to post this update anyway.

I have been studying DDD for a while myself and am struggling to master it.
First, you're right: Specification is a fine pattern to use for validation or business rules in general, assuming the rules you are applying fit well with a predicate tree.
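To make the "predicate tree" idea concrete, here is a minimal sketch of composable specifications. The Client fields and the rule itself are made up for illustration; this is not the real FATCA logic, only the shape of the pattern.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Client:
    # Hypothetical, heavily reduced stand-in for the real Client aggregate.
    country_of_birth: str
    country_of_residence: str


class Specification:
    """A predicate over a Client that can be combined into a tree."""

    def __init__(self, predicate: Callable[[Client], bool]):
        self._predicate = predicate

    def is_satisfied_by(self, candidate: Client) -> bool:
        return self._predicate(candidate)

    def __and__(self, other: "Specification") -> "Specification":
        return Specification(
            lambda c: self.is_satisfied_by(c) and other.is_satisfied_by(c)
        )

    def __or__(self, other: "Specification") -> "Specification":
        return Specification(
            lambda c: self.is_satisfied_by(c) or other.is_satisfied_by(c)
        )

    def __invert__(self) -> "Specification":
        return Specification(lambda c: not self.is_satisfied_by(c))


# Leaf rules (illustrative only).
born_in_us = Specification(lambda c: c.country_of_birth == "US")
resident_in_us = Specification(lambda c: c.country_of_residence == "US")

# A composed rule: the tree is (born_in_us OR resident_in_us).
fatca_may_apply = born_in_us | resident_in_us
```

Each rule lives in its own small object instead of inside the root, which is one way to keep the Client class from growing to thousands of lines.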
Of course, I don't know the specifics of your design, but I do wonder about the model itself. You mention that your Client class has "a hell lot of attributes". Are you sure your model is not somewhat anemic? Could your design benefit from some more analysis, perhaps breaking it out into other Aggregates? Is this a single Bounded Context? Should it be?

Specifications are definitely the way to go for complex business logic.
One question though - are you modeling the inner entities like addresses and names as ValueObjects? The best rule of thumb I can think of for those is if you can say they're equal, without an ID, they're likely value objects. Does your domain consider names to have a state?
If you're looking at a problem where few entities take in many types of change AND need an audit trail, you might want to also explore EventSourcing. Ideally the entity declares its reaction to an event, but you can also have the mutating code be held in the event for easy extensibility. There are pros and cons to that approach, of course.
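A rough sketch of the "entity declares its reaction to an event" variant, assuming two invented event types. The stored event list doubles as the audit trail, and the current state can be rebuilt by replaying it.

```python
from dataclasses import dataclass, field


# Hypothetical events; a real model would have many more.
@dataclass(frozen=True)
class AddressChanged:
    new_address: str


@dataclass(frozen=True)
class OperationDenied:
    reason: str


@dataclass
class Client:
    address: str = ""
    denial_reasons: list = field(default_factory=list)
    history: list = field(default_factory=list)  # applied events = audit trail

    def apply(self, event) -> None:
        # The entity declares how it reacts to each kind of event.
        if isinstance(event, AddressChanged):
            self.address = event.new_address
        elif isinstance(event, OperationDenied):
            self.denial_reasons.append(event.reason)
        else:
            raise TypeError(f"unknown event: {event!r}")
        self.history.append(event)

    @classmethod
    def replay(cls, events) -> "Client":
        # Rebuild the current state from the recorded events.
        client = cls()
        for event in events:
            client.apply(event)
        return client


client = Client.replay(
    [AddressChanged("1 Main St"), OperationDenied("missing tax form")]
)
```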

Related

Domain Driven Design - Creating general purpose entities vs. Context specific Entities

Situation
Suppose you have Orders and Clients as entities in your application. In one aggregate, the Order entity is considered to be the root but you also want to make use of the Client entity for simple things. In another the Client is the root entity and the Order entity is touched ever so lightly.
An example:
Let's say that in the Order aggregate I use the Client only to read details like name and address and to build order history, and not to make the client do client specific business logic (like persistence, password resets and back flips).
On the other hand, in the Client aggregate I use the Order entity to report on the client's buying habits, order totals, order counting, without requiring advanced order functionality like order processing, updating, status changes, etc.
Possible solution
I believe the better solution is to create the entities for each aggregate specific to the aggregate context, because making them full featured (general purpose) and ready for any situation and usage seems like overkill and could potentially become a maintenance nightmare. (and potentially memory intensive)
Question
What is the DDD recommended way of handling this situation?
What is your take on the matter?
The basic driver for these decisions should be the ubiquitous language, and consequently the real world domain you're modeling. If both work in a specific domain, I'd favor separation over god-classes for maintainability reasons.
Apart from separating behavior into different aggregates, you should also take care that you don't mix different bounded contexts. Depending on the requirements of your domain, it could make sense to separate the Purchase Context from the Reporting Context (to extend on your example).
To decide on a context design, context maps are a helpful tool.
You are on the right track. In DDD, entities are not merely containers encapsulating all attributes related to a "subject" (for example: a customer, or an order). This is a very important concept that eludes a lot of people. An entity in DDD represents an operation boundary, thus only the data necessary to perform the operation is considered to be a part of the entity. Exactly which data to include in an entity can be difficult to decide because some data is relevant in several different use cases. Here are some tips when analyzing data:
Analyze invariants: things that must be considered when applying validation rules, and that cannot be out of sync, should be in the same aggregate.
Drop the database-thinking; normalization is not a concern of DDD.
Just because things look the same, it doesn't mean that they are. For example: the current shipping address registered on a customer is different from the shipping address which a specific order was shipped to.
Don't look at reads. Reading, like creating a report or populating a viewmodel/DTO/whatever, has nothing to do with operation boundaries and can typically be a 360-degree view of the data. In fact, don't even use your domain model when returning reads; use a different architectural stack.
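The shipping address tip above can be sketched like this. All class and field names are invented for illustration; the point is only that the order keeps its own copy of the address and references the customer by id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    street: str
    city: str


@dataclass
class Customer:
    customer_id: int
    shipping_address: Address  # the *current* address, may change over time

    def move_to(self, new_address: Address) -> None:
        self.shipping_address = new_address


@dataclass(frozen=True)
class Order:
    order_id: int
    customer_id: int     # reference to the other aggregate by identity only
    shipped_to: Address  # snapshot taken when the order was placed


def place_order(order_id: int, customer: Customer) -> Order:
    # Copy the address into the order; it is a different concept from
    # "the customer's current shipping address" even though it looks the same.
    return Order(order_id, customer.customer_id, customer.shipping_address)


customer = Customer(1, Address("Old St 1", "Oslo"))
order = place_order(10, customer)
customer.move_to(Address("New St 2", "Bergen"))
```

After the move, the order still points at the old address, which is what you would normally want for an order that has already shipped.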

Should the rule "one transaction per aggregate" be taken into consideration when modeling the domain?

Taking into consideration the domain events pattern and this post, why do people recommend keeping to one aggregate per transaction? There are good cases where one aggregate could change the state of another one. Even removing an aggregate (or altering its identity) will lead to altering the state of other aggregates that reference it. Some people say that keeping one transaction per aggregate helps scalability (keeping one aggregate per server). But doesn't this type of thinking break a fundamental characteristic of DDD: being technology agnostic?
So based on the statements above and on your experience, is it bad to design aggregates and domain events that lead to changes in other aggregates, which in turn leads to having 2 or more aggregates per transaction (e.g. when a new order is placed with 100 items, change the customer's state from normal to VIP)?
There are several things at play here and even more trade-offs to be made.
First and foremost, you are right, you should think about the model first. After all, the interplay of language, model and domain is what we're doing this all for: coming up with carefully designed abstractions as a solution to a problem.
The tactical patterns - from the DDD book - are a means to an end. In that respect we shouldn't overemphasize them, even though they have served us well (and caused major headaches for others). They help us find "units of consistency" in the model, things that change together, a transactional boundary. And therein lies the problem, I'm afraid. When something happens and when the side effects of it happening should be visible are two different things. Yet all too often they are treated as one, and thus cause this uncomfortable feeling, to which we respond by trying to squeeze everything within the boundary, without questioning. Still, we're left with that uncomfortable feeling. There are a lot of things that logically can be treated as a "whole change", whereas physically there are multiple small changes. It takes skill and experience, or even plain trial and error, to know when that is the case. Not everything can be solved this way, mind you.
To scale or not to scale, that is often the question. If you don't need to scale, keep things on one box, be content with a certain backup/restore strategy, you can bend the rules and affect multiple aggregates in one go. But you have to be aware you're doing just that and not take it as a given, because inevitably change is going to come and it might mess with this particular way of handling things. So, fair warning. More subtle is the question as to why you're changing multiple aggregates in one go. People often respond to that with the "your aggregate boundaries are wrong" answer. In reality it means you have more domain and model exploration to do, to uncover the true motivation for those synchronous, multi-aggregate changes. Often a UI or service is the one that has this "unreasonable" expectation. But there might be other reasons and all it might take is a different set of abstractions to solve the same problem. This is a pretty essential aspect of DDD.
The example you gave seems like something I could handle as two separate transactions: an order was placed, and as a reaction to that, because the order was placed with 100 items, the customer was made a VIP. As MikeSW hinted at in his answer (I started writing mine after he posted his), the question is when, who, how, and why should this customer status change be observed. Basically it's the "next" behavior that dictates the consistency requirements of the previous behavior(s).
An aggregate groups related business objects while an aggregate root (AR) is the 'representative' of that aggregate. The AR itself is an entity modeling a (bigger, more complex) domain concept. In DDD a model is always relative to a context (the bounded context - BC), i.e. that model is valid only in that BC.
This allows you to define a model representative of the specific business context and you don't need to shove everything in one model only. An Order is an AR in one context, while in another it is just an id.
Since an AR pretty much encapsulates all the lower concepts and business rules, it acts as a whole i.e as a transaction/unit of work. A repository always works with AR because 1) a repo always deals with business objects and 2) the AR represents the business object for a given context.
When you have a use case involving 2 or more AR the business workflow and the correct modelling of that use case is paramount. In a lot of cases those AR can be modified independently (one doesn't care about other) or an AR changes as a result of other AR behaviour.
In your example, it's pretty trivial: when the customer places an order for 100 items, a domain event is generated and published. Then you have a handler which will check if the order complies with the customer promotions rules and if it does, a command is issued which will have the result of changing the client state to VIP.
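A minimal in-process sketch of that flow. The bus, the handler and the threshold are all assumptions made for illustration; in a real system the handler would typically run in its own transaction, possibly some time after the order was saved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    customer_id: int
    item_count: int


@dataclass
class Customer:
    customer_id: int
    status: str = "normal"

    def promote_to_vip(self) -> None:
        self.status = "VIP"


class EventBus:
    """Toy synchronous publisher; stands in for real messaging."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, event_type, handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event) -> None:
        for handler in self._handlers.get(type(event), []):
            handler(event)


VIP_ITEM_THRESHOLD = 100  # hypothetical promotion rule taken from the example


def make_vip_handler(customers):
    # `customers` is a dict standing in for a customer repository.
    def handle(event: OrderPlaced) -> None:
        if event.item_count >= VIP_ITEM_THRESHOLD:
            customers[event.customer_id].promote_to_vip()
    return handle


customers = {1: Customer(1), 2: Customer(2)}
bus = EventBus()
bus.subscribe(OrderPlaced, make_vip_handler(customers))

# The Order side only publishes the fact; it never touches the Customer.
bus.publish(OrderPlaced(customer_id=1, item_count=100))
bus.publish(OrderPlaced(customer_id=2, item_count=3))
```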
Domain events are very powerful and allow you to implement transactions, but in an eventually consistent environment. The old db transaction is an implementation detail and it's usually used when persisting one AR (remember, ARs are treated as a logical unit, but persisting one may involve multiple tables, hence the db transaction).
Eventual consistency is a 'feature' of domain events which fits a rich domain naturally (and the real world, actually). For some cases you might need instant consistency; however, those are particular cases and they are related to the UI rather than to how the Domain works. Of course, it really depends from one domain to another. In your example, the customer won't mind becoming a VIP 2 seconds or 2 minutes after the order was placed instead of in the same millisecond.

How to handle complex availability of information in OOP from a RESTful API

My issue is that I'm dealing with a RESTful API that returns information about objects, and when writing classes to represent them, I'm not sure how best to handle all the possibilities of the status of each variable's availability. From what I can tell, there are 5 possibilities: The information
is available
has not been requested
is currently being requested (asynchronously)
is unavailable
is not applicable
So with these, having an object represent its data with a value or null doesn't cut it. To give a more concrete example, I'm working with an API about the United States Congress, so the problem goes as thus:
I request information about a bill, and it contains a stub about the sponsoring legislator.
I eventually need to request all the information about that legislator. Not all the legislators will have all the information. Those in the House of Representatives won't have a senate class (Senators' six-year terms are staggered so a third expire every two years, the House is entirely re-elected every two years). Some won't have a twitter id, just because they don't have one. And, of course, if I have already requested information, I shouldn't try to request it again.
There are a couple of options I see:
I can create a Legislator object and fill it with what information I have, but then I have to have some mechanism of tracking information availability with the getters and setters. This is kind of what I'm doing right now, but it requires a lot of repeated code.
I could create a separate class for abbreviated objects and replace them when I get more with immutable "complete" objects, but then I have to be really careful about replacing all references to them and also go through a bunch of hoops for unavailable, and especially, not applicable information.
So, I'm just wondering what other people's take on this issue is. Are there other (better?) ways of handling this complexity? What are the advantages and drawbacks of different approaches? What should I consider about what I'm trying to do in choosing an approach?
[Note: I'm working in Objective-C, but this isn't necessarily specific to that language.]
If you want to treat those remote resources as objects on the client side, then do yourself a huge favour and forget about the REST buzzword. You will drive yourself crazy. Just accept that you are doing HTTP RPC and move on as you would doing any other RPC project.
However, if you really want to do REST, you need to understand what is meant by the "State Transfer" part of the REST acronym and you need to read about HATEOAS. It is a huge mental shift for building clients, but it does have a bunch of benefits. But maybe you don't need those particular benefits.
What I do know is that if you are trying to use a "REST API" to retrieve objects over the wire, you are going to come to the conclusion that REST is a load of crap.
It's an interesting question, but I think you're probably overthinking this a bit.
Firstly, I think you're considering the possible states of information a bit too much; consider the more basic consideration that you either have the information or you don't. WHY you have the information doesn't really matter, except in one case. Let me explain; if the information about a certain bill or legislator or anything is not applicable, you shouldn't be requesting it / needing it. That "state" is irrelevant. Similarly, if the information is in the process of being requested, then it is simply not yet available; the only state you really care about is whether you have the information or if you do not yet have the information.
If you start worrying about further depths of the request process, you risk getting into a deep, endless cycle of managing state; has the information changed between when I got it and now? All you can know about the information is if you've been told what it is. This is fundamental to the REST process; you're getting a REPRESENTATION of the underlying data, but make no mistake about it; the representation is NOT the underlying data, any more than a congressman's name is the congressman himself.
Second, don't worry about information availability. If an object has a subobject, when you query the object, query for the subobject. If you get back data, great. If you get back that the data isn't available, that too is a representation of the subobject's data; it's just a different representation than you were hoping for, but it's equally valid. I'd represent that as an object with a null value; the object exists (was instantiated because it belonged to the parent), but you have no valid data about it (the representation returned was empty due to some reason; lack of availability, server down, data changed; whatever).
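One way to reconcile the five states from the question with the "you either have it or you don't" view above is a small wrapper type. This is only a sketch in Python (the question is about Objective-C), and every name in it is invented.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class Availability(Enum):
    NOT_REQUESTED = auto()
    REQUESTING = auto()
    AVAILABLE = auto()
    UNAVAILABLE = auto()
    NOT_APPLICABLE = auto()


@dataclass(frozen=True)
class RemoteValue(Generic[T]):
    """A value plus the reason it may be missing."""

    state: Availability = Availability.NOT_REQUESTED
    value: Optional[T] = None

    @classmethod
    def of(cls, value: T) -> "RemoteValue[T]":
        return cls(Availability.AVAILABLE, value)

    @property
    def has_value(self) -> bool:
        # The only question most callers need answered.
        return self.state is Availability.AVAILABLE

    @property
    def should_request(self) -> bool:
        # Avoids asking again for data that is in flight, missing, or N/A.
        return self.state is Availability.NOT_REQUESTED


# Example: a House member with a twitter id and no senate class.
twitter_id = RemoteValue.of("@example")
senate_class = RemoteValue(Availability.NOT_APPLICABLE)
biography = RemoteValue()  # nothing known yet
```

The tracking logic is written once in the wrapper rather than repeated in every getter and setter.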
Finally, the real key here is that you need to remember that a RESTful structure is driven by hypermedia; a request to an object that does not return the full object's data should return a URI for requesting the subobject's data; and so forth. The key here is that those structures aren't static, like your object structure seems to be hoping to treat them; they're dynamic, and it's up to the server to determine the representation (i.e., the interrelationship). Attempting to define that in stone with a concrete object representation ahead of time means that you're dealing with the system in a way that REST was never meant to be dealt with.

DDD/NHibernate Use of Aggregate root and impact on web design - ex. Editing children of aggregate root

Hopefully, this fictitious example will illustrate my problem:
Suppose you are writing a system which tracks complaints for a software product, as well as many other attributes about the product. In this case the SoftwareProduct is our aggregate root and Complaints are entities that only can exist as a child of the product. In other words, if the software product is removed from the system, so shall the complaints.
In the system, there is a dashboard like web page which displays many different aspects of a single SoftwareProduct. One section in the dashboard, displays a list of Complaints in a grid like fashion, showing only some very high level information for each complaint. When an admin type user chooses one of these complaints, they are directed to an edit screen which allows them to edit the detail of a single Complaint.
The question is: what is the best way for the edit screen to retrieve the single Complaint, so that it can be displayed for editing purposes? Keep in mind we have already established the SoftwareProduct as an aggregate root, therefore direct access to a Complaint should not be allowed. Also, the system is using NHibernate, so eager loading is an option, but my understanding is that even if a single Complaint is eager loaded via the SoftwareProduct, as soon as the Complaints collection is accessed the rest of the collection is loaded. So, how do you get the single Complaint through the SoftwareProduct without incurring the overhead of loading the entire Complaints collection?
This gets a bit into semantics and religiosity, but within the context of editing a complaint, the complaint is the root object. When you are editing a complaint, the parent object (software product) is unimportant. It is obviously an entity with a unique identity. Therefore you would have a service/repository devoted to saving the updated complaint, etc.
Also, I think you're being a bit too negative. Complaints? How about "Comments"? Or "ConstructiveCriticisms"?
@Josh,
I don't agree with what you are saying even though I have noticed some people design their "Web" applications this way just for the sake of performance, and not based on the domain model itself.
I'm not a DDD expert either, but I'm sure you have read the traditional Order and OrderItem example. All DDD books say OrderItem belongs to the Order aggregate with Order being the aggregate root.
Based on what you are saying, OrderItem doesn't belong to the Order aggregate anymore since the user may want to directly edit an OrderItem with Order being unimportant (just like editing a Complaint with its parent SoftwareProduct being unimportant). And you know that if this approach is followed, none of the Order invariants could be enforced, which are extremely important when it comes to e-commerce systems.
Does anyone have any better approaches?
Mosh
To answer your question:
Aggregates are used for the purpose of consistency. For example, if adding/modifying/deleting a child object from a parent (aggregate root) causes an invariant to break, then you need an aggregate there.
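For contrast, here is a sketch of a case where an aggregate is needed, using the Order/OrderItem example mentioned earlier in this thread. The invariant (a maximum order total) is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderItem:
    sku: str
    quantity: int
    unit_price_cents: int

    @property
    def subtotal(self) -> int:
        return self.quantity * self.unit_price_cents


class Order:
    """Aggregate root: every change to the items goes through here."""

    MAX_TOTAL_CENTS = 100_000  # hypothetical business invariant

    def __init__(self):
        self._items = []

    @property
    def items(self):
        # Hand out a read-only view so callers cannot bypass the root.
        return tuple(self._items)

    @property
    def total(self) -> int:
        return sum(item.subtotal for item in self._items)

    def add_item(self, item: OrderItem) -> None:
        # The invariant spans all children, so only the root can check it.
        if self.total + item.subtotal > self.MAX_TOTAL_CENTS:
            raise ValueError("order total would exceed the limit")
        self._items.append(item)


order = Order()
order.add_item(OrderItem("A", 2, 40_000))
try:
    order.add_item(OrderItem("B", 1, 30_000))
    rejected = False
except ValueError:
    rejected = True
```

If adding a child never needs a check like this, that is a hint the child may not belong inside the aggregate at all.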
However, in your problem, I believe SoftwareProduct and Complaint belong to two separate aggregates, each being the root of its own aggregate. You don't need to load the SoftwareProduct and all N Complaints assigned to it, just to add a new Complaint. To me, it doesn't seem that you have any business rules to be evaluated when adding a new Complaint.
So, in summary, create 2 different Repositories: SoftwareProductRepository and ComplaintRepository.
Also, when you delete a SoftwareProduct, you can use database relationships to cascade deletes and remove the associated Complaints. This should be done in your database for the purpose of data integrity. You don't need to control that in your Domain Model, unless you had other invariants apart from deleting linked objects.
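A small sketch of the two-repository idea with the cascade handled by the database. It uses Python's built-in sqlite3 module as a stand-in for the real database and NHibernate mappings; table, column and method names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
CREATE TABLE software_product (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE complaint (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL
               REFERENCES software_product(id) ON DELETE CASCADE,
    detail     TEXT NOT NULL
);
""")


class SoftwareProductRepository:
    def __init__(self, conn):
        self._conn = conn

    def add(self, name):
        cur = self._conn.execute(
            "INSERT INTO software_product (name) VALUES (?)", (name,))
        return cur.lastrowid

    def remove(self, product_id):
        # The database removes the dependent complaints via the cascade.
        self._conn.execute(
            "DELETE FROM software_product WHERE id = ?", (product_id,))


class ComplaintRepository:
    def __init__(self, conn):
        self._conn = conn

    def add(self, product_id, detail):
        cur = self._conn.execute(
            "INSERT INTO complaint (product_id, detail) VALUES (?, ?)",
            (product_id, detail))
        return cur.lastrowid

    def get(self, complaint_id):
        # Loads one complaint directly, without touching its siblings.
        return self._conn.execute(
            "SELECT id, product_id, detail FROM complaint WHERE id = ?",
            (complaint_id,)).fetchone()


products = SoftwareProductRepository(conn)
complaints = ComplaintRepository(conn)
product_id = products.add("Editor")
complaint_id = complaints.add(product_id, "Crashes on save")
```

The edit screen would call complaints.get(complaint_id) and never load the product's whole Complaints collection.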
Hope this helps,
Mosh
I am using NH for another business context, but with entity relationships similar to yours. I do not understand why you say:
"Keep in mind we have already established the SoftwareProduct as an aggregate root, therefore direct access to a Complaint should not be allowed"
I have this in mine: Article and Publisher entities. If a Publisher ceases to exist, so do all the dependent Article entities. I allow myself to have direct access to the Article collections of each Publisher and to individual entities. In the DB/mapping of the Article class, Publisher is one of the members and cannot accept Null.
Care to elaborate the difference between yours and mine?
Sorry this is not a direct answer but too long to be added as a comment.
I agree with Mosh. Each of these two entities is the root of its own aggregate. Let me explain with a real-life example. Suppose that a company has developed a piece of software. There are some bugs in this software that annoy you. You go to the company to make them aware of the problem, and the company gives you a form to fill in.
This form has a field (a section) that indicates the software name and description. Additionally, it has some parts for your complaint. Is this form the same as the software manual? No. It is a form related to the software. It is not the software. Does this form have an ID? Yes, it does. In other words, you can call the company the next day and ask the operator about your letter of complaint. Obviously the operator will ask you for the ID.
This shows that the form is an entity in its own right and should not be confused with the software itself. A relationship between two different entities does not mean that one of them belongs to the other.

What are the principles behind, and benefits of, the "party model"?

The "party model" is a "pattern" for relational database design. At least part of it involves finding commonality between many entities, such as Customer, Employee, Partner, etc., and factoring that into some more "abstract" database tables.
I'd like to find out your thoughts on the following:
What are the core principles and motivating forces behind the party model?
What does it prescribe you do to your data model? (My bit above is pretty high level and quite possibly incorrect in some ways. I've been on a project that used it, but I was working with a separate team focused on other issues).
What has your experience led you to feel about it? Did you use it, and if so, would you do so again? What were the pros and cons?
Did the party model limit your choice of ORMs? For example, did you have to eliminate certain ORMs because they didn't allow for enough of an "abstraction layer" between your domain objects and your physical data model?
I'm sure every response won't address every one of those questions ... but anything touching on one or more of them is going to help me make some decisions I'm facing.
Thanks.
What are the core principles and motivating forces behind the party model?
To the extent that I've used it, it's mostly about code reuse and flexibility. We've used it before in the guest / user / admin model and it certainly proves its value when you need to move a user from one group to another. Extend this to having organizations and companies represented with users under them, and it's really providing a form of abstraction that isn't particularly inherent in SQL.
What does it prescribe you do to your data model? (My bit above is pretty high level and quite possibly incorrect in some ways. I've been on a project that used it, but I was working with a separate team focused on other issues).
You're pretty correct in your bit above, though it needs some more detail. You can imagine a situation where an entity in the database (call it a Party) contracts out to another Party, which may in turn subcontract work out. A party might be an Employee, a Contractor, or a Company, all subclasses of Party. From my understanding, you would have a Party table and then more specific tables for each subclass, which could then be further subclassed (Party -> Person -> Contractor).
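A rough sketch of the Party -> Person -> Contractor layout described above, using Python's built-in sqlite3 module. The extra columns (birth_year, daily_rate, tax_number) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE party (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Each subtype table shares its primary key with its parent table.
CREATE TABLE person (
    party_id   INTEGER PRIMARY KEY REFERENCES party(id),
    birth_year INTEGER
);
CREATE TABLE contractor (
    party_id   INTEGER PRIMARY KEY REFERENCES person(party_id),
    daily_rate INTEGER
);
CREATE TABLE company (
    party_id   INTEGER PRIMARY KEY REFERENCES party(id),
    tax_number TEXT
);
""")


def add_company(name, tax_number):
    party_id = conn.execute(
        "INSERT INTO party (name) VALUES (?)", (name,)).lastrowid
    conn.execute("INSERT INTO company VALUES (?, ?)", (party_id, tax_number))
    return party_id


def add_contractor(name, birth_year, daily_rate):
    party_id = conn.execute(
        "INSERT INTO party (name) VALUES (?)", (name,)).lastrowid
    conn.execute("INSERT INTO person VALUES (?, ?)", (party_id, birth_year))
    conn.execute("INSERT INTO contractor VALUES (?, ?)", (party_id, daily_rate))
    return party_id


def all_party_names():
    # One query covers every kind of party.
    return [row[0] for row in conn.execute("SELECT name FROM party ORDER BY id")]


def contractor_details(party_id):
    # A specific subtype costs one join per level of the hierarchy.
    return conn.execute("""
        SELECT party.name, person.birth_year, contractor.daily_rate
        FROM contractor
        JOIN person ON person.party_id = contractor.party_id
        JOIN party  ON party.id = person.party_id
        WHERE contractor.party_id = ?
    """, (party_id,)).fetchone()


acme = add_company("Acme", "TX-1")
bob = add_contractor("Bob", 1980, 500)
```

The two query helpers show both sides of the trade-off: a cheap query across all parties, and extra joins when you want one specific type.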
What has your experience led you to feel about it? Did you use it, and if so, would you do so again? What were the pros and cons?
It has its benefits if you need the flexibility to add new types to your system and create relationships between types that you didn't expect and architect in at the beginning (users moving to a new level, companies hiring other companies, etc). It also gives you the benefit of running a single query and retrieving data for multiple types of parties (Companies, Employees, Contractors). On the flip side, you're adding additional layers of abstraction to get to the data you actually need and are increasing load (or at least the number of joins) on the database when you're querying for a specific type. If your abstraction goes too far, you'll likely need to run multiple queries to retrieve the data as the complexity would start to become detrimental to readability and database load.
Did the party model limit your choice of ORMs? For example, did you have to eliminate certain ORMs because they didn't allow for enough of an "abstraction layer" between your domain objects and your physical data model?
This is an area that I'm admittedly a bit weak in, but I've found that with views and a mirrored abstraction in the application layer this hasn't been too much of a problem. The real problem for me has always been "where is piece of data X living?" when I want to read the data source directly (it's not always intuitive for new developers on the system either).
The idea behind the party models (aka entity schema) is to define a database that leverages some of the scalability benefits of schema-free databases. The party model does that by defining its entities as party type records, as opposed to one table per entity. The result is an extremely normalized database with very few tables and very little knowledge about the semantic meaning of the data it stores. All that knowledge is pushed to the data access in code. Database upgrades using the party model are minimal to none, since the schema never changes. It’s essentially a glorified key-value pair data model structure with some fancy names and a couple of extra attributes.
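To illustrate the "glorified key-value pair" description, here is a minimal sketch with two tables, using Python's built-in sqlite3 module. The table and function names are assumptions, and real entity schemas usually carry a few more tables and attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE party (
    id         INTEGER PRIMARY KEY,
    party_type TEXT NOT NULL
);
CREATE TABLE party_attribute (
    party_id INTEGER NOT NULL REFERENCES party(id),
    name     TEXT NOT NULL,
    value    TEXT,
    PRIMARY KEY (party_id, name)
);
""")


def save_party(party_type, attributes):
    # A new kind of entity is just new rows; the schema never changes.
    party_id = conn.execute(
        "INSERT INTO party (party_type) VALUES (?)", (party_type,)).lastrowid
    conn.executemany(
        "INSERT INTO party_attribute (party_id, name, value) VALUES (?, ?, ?)",
        [(party_id, name, value) for name, value in attributes.items()])
    return party_id


def load_party(party_id):
    # All knowledge of what the attributes mean lives in the calling code.
    row = conn.execute(
        "SELECT party_type FROM party WHERE id = ?", (party_id,)).fetchone()
    if row is None:
        return None
    attributes = dict(conn.execute(
        "SELECT name, value FROM party_attribute WHERE party_id = ?",
        (party_id,)).fetchall())
    return {"type": row[0], **attributes}


customer_id = save_party("customer", {"name": "Ada", "email": "ada@example.com"})
employee_id = save_party("employee", {"name": "Bob", "badge": "42"})
```

Note that every value is stored as text here, so typing and validation have to happen in code, which is the "knowledge pushed to the data access" part.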
Pros:
Kick-ass horizontal scalability. Once your 5-6 tables are defined in your entity model, you can go to the beach and sip margaritas. You can virtually scale this database out as much as you want with minimum efforts.
The database supports any data structure you throw at it. You can also change data structures and party/entities definitions on the fly without affecting your application. This is very very powerful.
You can model any arbitrary data entity by adding records, not changing the schema. Meaning you can say goodbye to schema migration scripts.
This is programmers’ paradise, since the code they write will define the actual entities they use in code, and there are no mappings from Objects to Tables or anything like that. You can think of the Party table as the base object of your framework of choice (System.Object for .NET)
Cons:
Party/Entity models never play well with ORMs, so forget about using EF or NHibernate to get semantically meaningful entities out of your entity database.
Lots of joins. Performance tuning challenges. This ‘con’ is relative to the practices you use to define your entities, but it is safe to say that you’ll be doing a lot more of those mind-bending queries that will give you nightmares.
Harder to consume. Developers and DB pros unfamiliar with your business will have a harder time getting used to the entities exposed by these models. Since everything is abstract, there is no diagram or visualization you can build on top of your database to explain what is stored to someone else.
Heavy data access models or business rules engines will be needed. Basically you have to do the work of understanding what the heck you want out of your database at some point, and your database model is not going to help you this time around.
If you are considering a party or entity schema in a relational database, you should probably take a look at other solutions like a NoSql data store, BigTable or KV Stores. There are some great products out there with massive deployments and traction such as MongoDB, DynamoDB, and Cassandra that pioneered this movement.
This is a vast topic, I would recommend reading The Data Model Resource Book Volume 3 - Universal Patterns for Data Modeling by Len Silverston and Paul Agnew.
I've just received my copy and it's pretty good - it provides you with an overview of many approaches to data modeling, including hybrid contextual role patterns and so on. It has detailed pros and cons for every approach.
There is a plethora of ways to model party relationships and roles, all with their benefits and disadvantages. The answer that was accepted covers just one instance of a 'party model'.
For instance, in many approaches, notions like "Employee", "Project Manager" etc. are roles that a party can play within a certain context. I will try to give you a better breakdown once I get home.
When I was part of a team implementing these ideas in the early 1980's, it did not limit our choice of ORM's because those hadn't been invented yet.
I'd fall back on those ideas any time, as that particular project was one of the most convincing proofs-of-concept I have ever seen of a "revolutionary" idea (which it certainly was at the time).
It forces you to nothing. And it doesn't stop you from anything (from any mistake, I mean). The one defining your own information model is you.
All parties have lots of properties in common. The fact that they have a name and such (we called those "signaletics"). The fact that they have principal/primary locations called "addresses". The fact that they all are involved, in some sense, in the business' contracts.
Put simply, as I understand it: party modeling gives you flexibility, but takes more effort to implement (T-SQL joins and so on).
I also want to point out that using party modeling (specialization/generalization) lets you have FK relations to other tables. For example, think of different types of users (admin, user, ...) generalized into a User table; you can then have a UserID column in your Authorization table.
I'm not sure, but the party model sounds like a particular case of the generalization-specialization pattern. A search on "generalization specialization relational modeling" finds some interesting articles.