Is it OK to compose an aggregate with immutable data from another aggregate?

If an aggregate needs some read-only data that doesn't belong to it in order to perform an operation, is there any negative consequence to letting the repository query data from another aggregate when creating the aggregate?
In detail:
I have a BC with two aggregates, say A and B. B needs a bit of data from A to perform some operation but won't modify it in any way. The data fits better in A, since that is where the rules for modifying it live.
Reading IDDD and PPP of DDD, it seems acceptable to pass a transient reference to an aggregate (or a sub-entity of it) to another one, or to pass a read-only view as a value object to the other aggregate.
In my example, B doesn't need the whole A aggregate but only some specific data, so a value object seems like a good approach in this case. A could create the VO, acting as a factory; the VO will conform to the UL, and B doesn't need to be aware of A at all. A business use case in the application layer can reconstitute A and B from their repositories, tell A to create the VO, and perform the operation on B, passing it the VO.
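To make that flow concrete, here is a minimal sketch, assuming Python and purely hypothetical names (PricingPolicy for the VO, get() for the repository lookup):

from dataclasses import dataclass

@dataclass(frozen=True)
class PricingPolicy:
    """Read-only view of A's data, expressed in the ubiquitous language."""
    base_rate: float
    discount: float

class A:
    def __init__(self, base_rate: float, discount: float):
        self._base_rate = base_rate
        self._discount = discount

    def pricing_policy(self) -> PricingPolicy:
        # A acts as the factory for the value object it knows how to build.
        return PricingPolicy(self._base_rate, self._discount)

class B:
    def quote(self, amount: float, policy: PricingPolicy) -> float:
        # B depends only on the value object, never on A itself.
        return amount * policy.base_rate * (1 - policy.discount)

def quote_use_case(a_repo, b_repo, a_id, b_id, amount: float) -> float:
    a = a_repo.get(a_id)    # reconstitute A from its repository
    b = b_repo.get(b_id)    # reconstitute B from its repository
    return b.quote(amount, a.pricing_policy())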
Let's suppose now that reconstituting A is expensive, or there is another reason why it is not desirable to load the whole of A just to create the VO from a bit of its information (maybe the data doesn't come from one instance of A but is aggregated from a list of them, or whatever). Here a simple solution could be to let the repository of A create the VO directly from the data store. I feel comfortable with this, and it seems to be a common pattern.
But now I'm thinking of a case where the operation on B is performed many times, or is perhaps part of a bigger calculation on B that many other operations need. I could keep a reference to the VO with the needed data (as a private, read-only property of B or somewhere in its graph) and let the repository of B fetch the data needed to create the VO and reconstitute B with it. Now B will always have the data locally to perform its operations. The data taken from A cannot be modified; saving B through its repository will simply discard that data (or perhaps use it to detect a conflicting concurrent update). A and B will not be consistent at all times, but that's OK, and reloading B from its repository will query the data again to refresh the view inside B in case of a conflict.
This approach seems OK to me since, as I understand it, the domain model is decoupled from the data model, with the repository acting as a sort of ACL between the two. There is also still a single source of truth for the data inside A, since the copy inside B is immutable and eventually consistent. The drawbacks I see are that the repository will have more logic (though not business logic) and that it could be unclear where exactly the data is coming from, since the dependency of B on A is now hidden inside infrastructural code.
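A sketch of that cached variant, again assuming Python and hypothetical names; the data-access calls (fetch_b, fetch_policy_for, store_b) are stand-ins for whatever the data store offers, not a real API:

from dataclasses import dataclass

@dataclass(frozen=True)
class PricingPolicy:    # same value object as in the earlier sketch
    base_rate: float
    discount: float

class B:
    def __init__(self, own_state: dict, policy: PricingPolicy):
        self.own_state = own_state
        self._policy = policy    # read-only copy of data owned by A

    def calculate(self) -> float:
        # B can run this repeatedly without ever touching A.
        return self.own_state["amount"] * self._policy.base_rate

class BRepository:
    def __init__(self, db):
        self._db = db    # hypothetical data-access object

    def get(self, b_id) -> B:
        b_row = self._db.fetch_b(b_id)
        # The repository also queries the data owned by A and folds it into
        # an immutable value object that travels with B.
        a_row = self._db.fetch_policy_for(b_id)
        return B(b_row, PricingPolicy(a_row["base_rate"], a_row["discount"]))

    def save(self, b: B) -> None:
        # Only B's own state is written back; the cached view of A's data is
        # discarded (or compared against the store to detect a concurrent update).
        self._db.store_b(b.own_state)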
So the questions are:
Is this a not-so-good approach after all?
Is there another drawback I am not seeing?
Did you or someone do something like this so I can learn from that experience?
I know the example is very thin, since in DDD everything is about context, but this is a question I have run into many times in different situations. I also know that a valid concern is whether the aggregate boundaries are well defined, but let's say they look good for the problem at hand.

is it acceptable to let the repository query some data from another aggregate to create the aggregate?
Acceptable is kind of weakly defined. A better question to ask might be "are there negative consequences?"
In this example, the usual consideration is whether or not the system becomes harder to change. Take a look at Adam Ralph's talk on service boundaries to get a sense for what happens when you don't control the coupling between components.
These days, if B needs a copy of A's data, then we usually introduce into our design of B a cache of A's data. Store the copy of the data with B, and work out explicitly how and when updates to A are communicated to B. The cache becomes part of B's data model.
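As a rough illustration (not taken from the talk or the paper, just a hedged Python sketch with an assumed event shape), the cache could be as simple as:

class BCache:
    """B's local, possibly stale copy of the A data it needs."""

    def __init__(self):
        self._prices = {}    # product_id -> price, owned by B's data model

    def apply(self, event: dict) -> None:
        # Called whenever A publishes a change; this is the explicit,
        # agreed-upon channel through which updates to A reach B.
        if event["type"] == "PriceChanged":
            self._prices[event["product_id"]] = event["price"]

    def price_of(self, product_id) -> float:
        return self._prices[product_id]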
See also Pat Helland's paper: Data on the Outside versus Data on the Inside.

Related

What is the best way to pass certain attribute values from one object to another in OOP?

Consider a class A that instantiates a class B. Class B requires certain values from the instance of class A. What is best practice, in terms of SOLID or other design principles, for passing these values from A to B:
by passing the whole instance of A to the constructor of B
by just passing the necessary attribute values from an instance of A to the constructor of B
depends on the situation?
In case of (3), which criteria would favour one or the other solution?
(I do not know if this has an effect on the possible answers, but I am coding in Python)
This is kind of an open question, and I would say it depends on the situation. Since you specifically say that B will be instantiated from within A, I assume dependency injection (and with it the D in SOLID, dependency inversion) is out of the question. I would then focus on the single-responsibility principle, in addition to the general goal of keeping code as clean as possible.
A couple of things to consider:
How many "values" will B actually require? If B only requires one or two simple values, then I would pass them in as parameters to keep it clean and simple. If it needs more than that, then you could either pass the whole of A (or a reference to it) in as a parameter, and let B pick what it needs from there, or create another class (C?) that contains exactly those values that are needed. That will keep the constructor signature of B fairly clean, and make it less burdensome to add / change the data passed into it (it will probably be easier and less messy to modify C later than to modify the signature of B, and all calls to it).
Where will B be used? If it will only ever be used by A, then it might make sense to let B fetch data directly from A as needed. If it will be used other places and by other classes, then it might be better to minimize any dependency on A, and make sure it has everything it needs from the get-go. This also applies if there is a chance B will need to be serialized and stored or sent over a network, etc., as it might then not have access to A.
The optimal solution will vary depending on context. You might for instance think of the difference between two tightly coupled classes dealing with the GUI of a Windows Desktop application, and two classes in a service architecture, where one or both might contain data to be transferred or stored in a database.
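As an illustration of the "create another class (C)" option above, here is a hedged Python sketch with made-up field names:

from dataclasses import dataclass

@dataclass
class ReportSettings:
    # The "C" mentioned above: exactly the values B needs, nothing else.
    title: str
    page_size: int
    locale: str

class B:
    def __init__(self, settings: ReportSettings):
        self._settings = settings    # B never sees A, only the data it needs

class A:
    def __init__(self):
        self.title = "Monthly report"
        self.page_size = 50
        self.locale = "en"
        self.lots_of_other_state = {}    # things B should not depend on

    def make_b(self) -> B:
        # A picks out exactly what B needs; adding a value later means
        # changing ReportSettings, not B's constructor signature.
        return B(ReportSettings(self.title, self.page_size, self.locale))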
I think you missed one -- and the one that is actually most common.
You pass a reference to the whole object to B, which either holds the reference, or copies the values it needs during construction.
Depending on the relationship between the objects, either is acceptable.
Your solution #2 is common for those cases in which instantiating B does not require A -- that is, B could be instantiated by some C which has equivalent values, or even just programmatically from values that are input or computed at run time.
For whatever it's worth, I usually use #2, but if I commonly construct B from an A instance, I will create a special constructor facade that accepts A and harvests the values the primary constructor needs.
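A small Python sketch of that facade-constructor idea, with hypothetical fields (from_a is just an illustrative name):

class A:
    def __init__(self, width: float, height: float, other_state=None):
        self.width = width
        self.height = height
        self.other_state = other_state

class B:
    def __init__(self, width: float, height: float):
        # Primary constructor: plain values, no dependency on A.
        self.width = width
        self.height = height

    @classmethod
    def from_a(cls, a: A) -> "B":
        # Facade constructor: accepts a whole A and harvests the values
        # the primary constructor needs.
        return cls(a.width, a.height)

b1 = B(3.0, 4.0)              # constructed from plain values
b2 = B.from_a(A(3.0, 4.0))    # constructed from an A instance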

Method requires specific subtype but collection is of base abstract type. What is wrong?

Recently I found myself in a situation like this. I'm generalizing the problem because I think it relates more to the structural design than to the specific problem.
General problem
There is a hierarchy of classes: an abstract base class Base and some concrete classes D1, D2, D3 that inherit from it. The class A contains a collection of objects of type Base. A requires a computation from some service class B, but B's process() method accepts only a collection of type D1. Let's say that matters because if the input collection contains any other type, the value returned is just wrong.
A has an interface that allows clients to add elements to the internal collection, which is not exposed in any other way. The classes in the hierarchy can be constructed by those same clients and passed to A; A doesn't have enough context to construct them itself.
Attempts, questions and thoughts
The major concern for me was the need to determine at runtime the type of each element in A's collection, so the right ones can be filtered out and passed to B.process(). Even if that is possible (it is in my particular problem, more on that later), it just seems wrong! I think an object that holds references to the abstract base class shouldn't have to know about the concrete types it contains.
I tried to:
Change the parameter type to B.process(c: Base[]) so A doesn't have to downcast the type, but it doesn't solve anything: A still needs to filter the elements or the computation will be wrong.
Pass the complete Base[] collection to B.process(), but that just defers the problem of selection/downcasting to B.
Put a process() method in Base so D1 can override the behavior (plain old polymorphism). The problem here is that a process() returning a SomeValue type only makes sense for D1.
Split the interface that adds elements, so a more specific A.addD1Element(e: D1) method could put D1 objects in a separate collection and pass that to B. It should work, but it also looks... I don't know, weird. If method overloading based on parameter type is possible, at least the process won't be so cumbersome for clients of the class.
Just pull the D1 class out of the hierarchy. This is a more aggressive variation of the previous one. The issue is that D1 is related to the whole hierarchy in every way except for the specific requirements of B.
Those were some of my thoughts on the problem.
For instance, the language I'm using has support for checking the type of an object at runtime (instanceof), and it is easy to filter the collection based on that check. But as I said, my question is more about the paradigm. What about a language, say C++, where such a check is less convenient?
So what could be a solution to this kind of problem? What kind of refactoring or design pattern could be applied so the problem is easy to treat with or simply fades away?
This question looks related, but I believe mine is more general (although I provide a more specific context). The most upvoted answer there suggests splitting into different collections. That is also something I'm considering, but it forces a change to A's implementation every time a new type is added.
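For reference, this is roughly what the runtime-check filtering described above looks like; a hedged Python sketch using isinstance in place of instanceof:

class Base: ...
class D1(Base): ...
class D2(Base): ...

class B:
    def process(self, items: list) -> float:
        # Only meaningful if every element is a D1.
        return float(len(items))

class A:
    def __init__(self):
        self._elements = []    # holds any Base

    def add_element(self, e: Base) -> None:
        self._elements.append(e)

    def compute(self, b: B) -> float:
        # The runtime check the question is uneasy about: A has to know
        # about the concrete D1 type in order to select the right elements.
        d1_only = [e for e in self._elements if isinstance(e, D1)]
        return b.process(d1_only)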
Context (problem in action)
I'm asking in a general way because the general problem is what intrigues me, but I know that most of the time a design can only be analyzed in the context of the particular problem it tries to solve.
The problem at hand is similar to this:
A is a class (some kind of entity, like a DDD entity) that models a sort of agreement or debt a customer incurs for a service. It has different costs, including a monthly payment. Base and the related classes are Payments of different types. They share a lot in common, although most of it is data (date, amount, interest, etc.); but there is at least one type of payment that has different, additional information: the monthly payment (D1). Those payments need to be analyzed carefully, so a different class (B) is responsible for that, using more contextual information and all the payments of that type at once. The service needs the additional data that is specific to those payments, so it cannot receive an abstract Payment type (at least not in that design). Other payments don't have the specific information MonthlyPayment does, so they cannot produce the values that the business requires and B generates (those values make no sense for other payment types).
All payments are stored in the same collection so other methods of the class can process all payments in a generic way.
This is mostly the context. I think the design is not the best, but I fail to see a better one.
Maybe separating only MonthlyPayment (D1) into a different collection, as described earlier? But it is not the only payment that requires additional processing (it is the most complex one, though), so I could end up with a different collection for every payment type and no hierarchy at all. Right now there are four payment types and two of them require additional, specific analysis, but more types can be added later, and the issue of having to modify the implementation every time a new type is added persists.
Is this more discrete approach of different collections by type a better one here? The abstract base class Payment can still be used for payments that can be manipulated through the common interface. I could also use a layer supertype or something like that to allow reuse of common functionality (the language allows a kind of mixin as well) and stop using the base class as the root of a hierarchy.
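A rough sketch of that separate-collection variant, in Python with hypothetical names (Agreement standing in for A):

class Payment: ...                    # common data: date, amount, ...
class MonthlyPayment(Payment): ...    # carries the extra data B needs

class B:
    def process(self, payments: list) -> float:
        # Stands in for the analysis service; only ever sees MonthlyPayment.
        return float(len(payments))

class Agreement:
    # The "A" entity: keeps monthly payments in their own typed collection.
    def __init__(self):
        self._payments = []    # everything, for the generic operations
        self._monthly = []     # only MonthlyPayment, for B

    def add_payment(self, p: Payment) -> None:
        self._payments.append(p)

    def add_monthly_payment(self, p: MonthlyPayment) -> None:
        self._monthly.append(p)
        self._payments.append(p)    # still visible to the generic operations

    def analyze(self, b: B) -> float:
        # No downcasting needed: the specific collection is already typed.
        return b.process(self._monthly)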
Phew. I am sorry for the length of the text. I hope it is at least readable and clear. Thank you very much in advance.

How to express requirements between components?

I have two components, A and B.
Component B requires that A has a certain state.
I can write this as part of B's code,
or I can write this as part of A's code (and maybe add assertions to B)
What should I take into consideration when making such a decision?
Edit
In this scenario there might be several B-type components.
It's also assumed that I can't avoid this situation
Edit 2
This often happens when working with frameworks. I usually have some sort of "global settings", and components that require those settings to have certain values.
Possibilities:
Have A implement an interface that B will check
Make A create B whenever it has such a state.
Generally, the first solution is used, because ALL Bs refer to A, but A doesn't actually have to know about ALL Bs (you said there were many). In theory, every object should do what it is supposed to do, ignoring that anything else exists, unless it's a controller object.
With the first solution, B checks what A has.
With the second solution, A becomes the controller of all Bs.
I would say that it's better to have Bs check for A on creation, but in special cases, like when A is your main controller class, it MIGHT be preferable to have A create B.
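A small sketch of the first possibility, assuming Python and an invented is_ready() method as the "certain state" B requires:

from abc import ABC, abstractmethod

class Configured(ABC):
    """The interface A exposes; B depends only on this."""
    @abstractmethod
    def is_ready(self) -> bool: ...

class A(Configured):
    def __init__(self, settings: dict):
        self._settings = settings

    def is_ready(self) -> bool:
        return "api_key" in self._settings

class B:
    def __init__(self, source: Configured):
        # B verifies the required state up front and fails loudly otherwise.
        if not source.is_ready():
            raise ValueError("A must be configured before creating B")
        self._source = source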
Edit as response to Edit 2 by OP
Yes, in this case it's almost always better to have B check the global settings. Global settings are there so you can check them! The only exception is if A is also the owner of all other components (such as the Game class in XNA)... Even there it would be difficult to choose, and just to keep the architecture intact I'd still make B check inside A; it's simply cleaner and healthier.
I'm not sure that coupling is too high (at least not from your description of the problem alone).
I think the general answer to your general question comes from the concept of ownership/responsibility that so pervades OO in general. If B needs A to be in some state before doing something, then B must make sure A is in that state before doing it. Responsibility lies with B - put the code in B.
Presumably A has its own life independent of B. Let it be A, man.

DDD: Should everything fit into either Entity or Value Object?

I'm trying to follow DDD, or a least my limited understanding of it.
I'm having trouble fitting a few things into the DDD boxes though.
An example: I have a User Entity. This user Entity has a reference to a UserPreferencesInfo object - this is just a class which contains a bunch of properties regarding user preferences. These properties are fairly unrelated, other than the fact that they are all user preferences (unlike say an Address VO, where all the properties form a meaningful whole).
Question is - what is this UserPreferencesInfo object?
1) Obviously it's not an Entity (I'm just storing it as a 'component' in Fluent NHibernate speak, i.e. in the same DB table as the User entity).
2) A VO? I understand that Value Objects are supposed to be immutable (so you can't change them, just new them up). This makes complete sense when the object is an address, for instance (the address properties form a meaningful 'whole'). But in the case of UserPreferencesInfo I don't think it makes sense. There could be 100 properties (realistically there could be maybe 20 properties on this object) - why would I want to discard and recreate the object whenever I needed to change one property?
I feel like I need to break the rules here to get what I need, but I don't really like the idea of that (it's a slippery slope!). Am I missing something here?
Thanks
Answer 1 (the practical one)
I'm a huge proponent of DDD, but don't force it. You've already recognised that immutable VOs add more work than is required. DDD is designed to harness complexity, but in this case there is very little complexity to manage.
I would simply treat UserPreferencesInfo as an Entity, and reference it from the User aggregate. Whether you store it as a Component or in a separate table is your choice.
IMHO, the whole Entity vs VO debate can be rendered moot. It's highly unlikely that in 6 months time, another developer will look at your code and say "WTF! He's not using immutable VOs! What the heck was he thinking!!".
Answer 2 (the DDD purist)
Is UserPreferencesInfo actually part of the business domain? Others have mentioned dissecting this object. But if you stick to pure DDD, you might need to determine which preferences belong to which Bounded Context.
This in turn could lead to adding Service Layers, and before you know it, you've over-engineered the solution for a very simple problem...
Here's my two cents. Short answer: UserPreferenceInfo is a value object because it describes the characteristics of an object. It's not an entity because there's no need to track an object instance over time.
Longer answer: an object with 100+ properties which are not related is not very DDD-ish. Try to group related properties together to form new VOs or you might discover new entities as well.
Another DDD smell is having a lot of property setters in the first place. Try to capture the essence of the action instead of only setting a value. Example:
// not ddd
employee.Salary = newSalary;
// more ddd
employee.GiveRaise(newSalary);
On the other hand you may very well have legitimate reasons to have a bunch of properties that are no more than getters and setters. But then there are probably simpler approaches than DDD to solve the problem. There's nothing wrong with taking the best patterns and ideas from DDD while relaxing a little on all the "rules", especially for simpler domains.
I'd say a UserPreferenceInfo is actually a part of the User aggregate root. It should be the responsibility of the UserRepository to persist the User Aggregate Root.
Value objects only need to be newed up (in your object model) when their values are shared. A sample scenario for that would be if you check for a similar UserPreferenceInfo and associate the User with that instead of inserting a new one every time. Sharing value objects makes sense if the value object tables would otherwise get too large and raise speed/storage concerns. The price for sharing is paid on insert.
It is reasonable to abstract this procedure in the DAL.
If you are not sharing value objects, there is nothing wrong with updating them.
As far as I understand, UserPreferenceInfo is a part of User entity. Ergo User entity is an Aggregate root which is retrieved or saved using UserRepository as a whole, along with UserPreferenceInfo and other objects.
Personally, I think that UserPreferenceInfo is an entity type, since it has identity - it can be changed, saved and retrieved from the repository and still be regarded as the same object. But it depends on how you use it.
IMHO it doesn't matter how the object is represented in the DAL - whether it is stored in a separate table or as part of another table. One of the benefits of DDD is persistence ignorance, which is usually a good thing.
Of course, I may be wrong, I am new to DDD too.
Question is - what is this UserPreferencesInfo object?
I don't know how this case is supported by NHibernate, but some ORMs have special concepts for it. For example, DataObjects.Net includes the concept of Structures. It seems you need something like this in NH.
First time ever posting on a blog. Hope I do it right.
Anyway, since you haven't showed us the UserPreferencesInfo object, I am not sure how it's constructed such that you can have a variable number of things in it.
If it were me, I'd make a single class called UserPreference, with id, userid, key, value, displaytype, and whatever other fields you may need in it. This is an entity: it has an id and is tied to a certain user.
Then in your user entity (the root, I am assuming), have an ISet<UserPreference>.
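A rough sketch of that shape (in Python rather than C#, with hypothetical field names):

from dataclasses import dataclass

@dataclass(eq=False)    # identity-based equality, as befits an entity
class UserPreference:
    id: int
    user_id: int
    key: str
    value: str
    display_type: str = "text"

class User:
    def __init__(self, user_id: int):
        self.id = user_id
        self.preferences = set()    # the ISet<UserPreference> mentioned above

    def set_preference(self, pref: UserPreference) -> None:
        self.preferences.add(pref)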
100 properties sounds like a lot.
Try breaking UserPreferenceInfo up into smaller (more cohesive) types, which likely/hopefully are manageable as VOs.
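For example, a hedged Python sketch with invented groupings; the point is only that each small VO stays cheap to replace:

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class NotificationPreferences:
    email_enabled: bool = True
    sms_enabled: bool = False

@dataclass(frozen=True)
class DisplayPreferences:
    theme: str = "light"
    language: str = "en"

class User:
    def __init__(self):
        self.notifications = NotificationPreferences()
        self.display = DisplayPreferences()

    def switch_theme(self, theme: str) -> None:
        # Changing one preference rebuilds only the small VO it belongs to,
        # not one huge 100-property object.
        self.display = replace(self.display, theme=theme)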

Aggregate Objects

If you have a class A that is an aggregate of class B and C, is it better for A
to store ID's for B and C
to load and store the entire objects for B and C (edit: store by reference to objects B/C, i.e. instantiate objects B and C as opposed to storing IDs for them)
store the ID's and provide methods to pull methods B and C
I'm assuming this varies depending on performance requirements and other requirements, but I'm just looking for any general guidelines or thoughts.
In a typical program running in memory, objects will almost always be stored by reference as pointers, so you ARE storing IDs for B and C, it's just that you don't deal with the details yourself, the language hides them from you.
Loading and storing the "Entire Object" is a questionable concept. I know you are trying to be language independent, but one of the first things that really helped me "Get" OO is that nearly every object should have a lifecycle of its own.
If you have object A that "Contains" object B, and you pass a reference to object B to object C, then Object A has to know something about object C, this is completely NOT OK. Freeing up object B's lifecycle so that object A knows nothing of object C is one of the core concepts that makes OO work.
So if that's what you meant by storing the entire object, then no--never do that.
And that's true of Databases and other storage as well. Even if one object is responsible for destroying another, it should rarely contain the other objects' data.
And (although I think you meant to say "pull objects B and C", not "methods") the concept of being able to pull an object from another is also very useful, and there is generally nothing wrong with it, with one caveat:
Remember that an object has no control over what goes on outside itself. It could be passed around, its methods called in a semi-random order, etc. Therefore it's helpful to keep your object as safe as possible. If something is called in the wrong order, or an invalid variable is passed in, or you find that somehow you've entered an invalid state, fail early and fail LOUD so that the programmer who made the mistake in calling it finds out.
You also want to make it as difficult as possible to get into an illegal state--this means keep your object small and simple, make variables final whenever possible and try not to have too many places where parameter call order matters.
I tend to load and store the entire objects (and their subobjects) as my default approach.
Sometimes this will cause long load times, a large memory footprint, or both. Then you'll need to determine if all the loaded objects are in fact used or if many are created and never accessed.
If all the objects are used, a more creative approach will be needed to load a subset, process those, dispose them, and then load the next subset to fit everything into memory - or simply buy more memory and make it available to your app.
If many of the objects are not used, the best approach is to lazily load the sub-objects as they are needed.
This depends on the situation. If the objects stay in memory, it's more OO (and easier) to have A contain B and C. I've found that if the objects need to be persisted, though, it makes things easier and more efficient for A to store IDs for B and C. (That way if you need data directly in A but not B and C, you won't have to pull B and C out of the database, file, etc.)
It depends.
If B and C are heavy and expensive to load and construct, it might be worthwhile to defer loading them until you are sure they are needed (lazy initialization).
If they are simple and lightweight, maybe you just want to construct them whenever you get the IDs.
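A minimal Python sketch of that lazy-initialization idea, with a hypothetical repository:

class A:
    def __init__(self, b_id, c_id, repository):
        self._b_id = b_id
        self._c_id = c_id
        self._repository = repository
        self._b = None    # not loaded yet

    @property
    def b(self):
        # Defer the expensive load until (and unless) B is actually needed.
        if self._b is None:
            self._b = self._repository.load_b(self._b_id)
        return self._b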