I do not use lazy loading. My root aggregate has child entities (collection navigation properties). I want my aggregate to be self-contained and responsible for itself, to follow the Single Responsibility Principle (SRP), and to have high cohesion and low coupling.
The problem is that the code that retrieves the root aggregate needs to include certain child entities depending on how it wants to interact with the aggregate.
Example:
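public class Blog // My root aggregate
{
    public ICollection<Author> Authors { get; set; }
    public ICollection<Post> Posts { get; set; }

    // Fails with a NullReferenceException if Authors was not loaded
    public void AddAuthor(Author author)
    {
        Authors.Add(author);
    }

    // Fails with a NullReferenceException if Posts was not loaded
    public void AddPost(Post post)
    {
        Posts.Add(post);
    }
}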
If I want to add an author, I have to do:
var blog = _context.Blogs.Include(x => x.Authors).Single(x => x.BlogId == 1);
blog.AddAuthor(/* ... */);
And if I want to add a post, I would have to do:
var blog = _context.Blogs.Include(x => x.Posts).Single(x => x.BlogId == 1);
blog.AddPost(/* ... */);
But I feel this breaks encapsulation, because now my Blog aggregate is not self-contained: its functionality depends on how the caller has retrieved the aggregate from the DbContext (or the repository). If the caller did not include the necessary dependent entities, then the operation on the aggregate would fail (since the property would be null).
I would like to avoid lazy loading because it is less suitable for web applications and performs worse due to executing multiple queries. I feel that having a repository with methods such as GetBlogWithAuthors and GetBlogWithPosts would be ugly. Do I have to create a repository method such as GetBlog which always includes all child entities? (This would be a big, slow query that could time out.)
Are there any solutions to this problem?
I realize it is probably a practice domain but I think an important point that is not talked about enough is that strict DDD should not always be applied. DDD brings a certain amount of complexity to minimize the explosion of complexity. If there is little complexity to start with, it is not worth the added upfront complexity.
As was mentioned in the comments, an Aggregate is a consistency boundary. Since there does not seem to be any consistency being enforced, you can split it. Blog can have a collection of PostRef (or something similar) so that it need not pull back ALL Post data; PostRef might hold just the Id and Title.
Then Post is its own aggregate. I am guessing that Post has an Author. It is recommended not to reference entities in other aggregates that are not the aggregate root so now it seems like Authors should not be in Blog.
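A minimal sketch of that split, written in Java for illustration rather than the question's C#. The PostRef shape (id plus title), the registerPost and toRef methods, and the blogId/authorId fields are all assumptions made for this example, not something the question's model defines:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical lightweight reference: just enough of a Post for the Blog to list it.
final class PostRef {
    final long postId;
    final String title;

    PostRef(long postId, String title) {
        this.postId = postId;
        this.title = title;
    }
}

// Blog aggregate: keeps references only, never the full Post entities.
final class Blog {
    private final long id;
    private final List<PostRef> postRefs = new ArrayList<>();

    Blog(long id) {
        this.id = id;
    }

    long id() {
        return id;
    }

    void registerPost(PostRef ref) {
        postRefs.add(ref);
    }

    List<PostRef> postRefs() {
        return Collections.unmodifiableList(postRefs);
    }
}

// Post is its own aggregate; it points at its Blog and Author by identity only.
final class Post {
    final long id;
    final long blogId;
    final long authorId;
    final String title;
    final String body;

    Post(long id, long blogId, long authorId, String title, String body) {
        this.id = id;
        this.blogId = blogId;
        this.authorId = authorId;
        this.title = title;
        this.body = body;
    }

    PostRef toRef() {
        return new PostRef(id, title);
    }
}
```

With this shape, loading a Blog never requires loading post bodies, and a Post can be loaded and saved without touching the Blog at all.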
When your starting point is an ORM, my experience is that your model will fight the DDD recommendations. Create your model and then see how to persist your aggregates. My experience, and that of many others, is that at that point an ORM just isn't worth the yak shaving that it brings throughout the project. It is also far too easy for someone who does not understand the constraints to add a reference that should not be there.
To address performance concerns: remember that your read and write models do not have to be the same. You optimize your write model for enforcing constraints. If the two are separate, you can then optimize your read model for query performance. If this sounds like CQRS to you, then you are correct. Again though, the number of moving parts increases, and it should solve more problems than it introduces. Again, your ORM will fight you on this.
Lastly, if you do have consistency constraints that require really large amounts of data, you need to ask the question of whether they really need to be enforced in real-time? When you start modeling time, some new options emerge.
SubmittedPost -> RejectedPost OR AcceptedPost -> PublishedPost. If this happens as a background process, the amount of data that needs to be pulled will not affect UX. If this sounds interesting I suggest you take a look at the great book Domain Modeling made Functional.
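One way to read that chain of states is as a set of distinct types, so that an operation is only available on a post in the right state. The following is a rough Java sketch under that assumption; the field names, the PostReview class, and the "blank body is rejected" rule are invented for illustration:

```java
// Submitted by an author, not yet reviewed.
final class SubmittedPost {
    final String title;
    final String body;

    SubmittedPost(String title, String body) {
        this.title = title;
        this.body = body;
    }
}

// Marker for the two possible outcomes of a review.
interface ReviewedPost {}

final class RejectedPost implements ReviewedPost {
    final String title;
    final String reason;

    RejectedPost(String title, String reason) {
        this.title = title;
        this.reason = reason;
    }
}

final class AcceptedPost implements ReviewedPost {
    final String title;
    final String body;

    AcceptedPost(String title, String body) {
        this.title = title;
        this.body = body;
    }

    // Only an accepted post can be published; the other states have no such method.
    PublishedPost publish() {
        return new PublishedPost(title, body);
    }
}

final class PublishedPost {
    final String title;
    final String body;

    PublishedPost(String title, String body) {
        this.title = title;
        this.body = body;
    }
}

final class PostReview {
    // Hypothetical rule for illustration: a post with a blank body is rejected.
    static ReviewedPost review(SubmittedPost post) {
        if (post.body.trim().isEmpty()) {
            return new RejectedPost(post.title, "body is empty");
        }
        return new AcceptedPost(post.title, post.body);
    }
}
```

Because the review step is an ordinary function from one state to the next, it could run in a background process without the caller waiting on it.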
Some other resources:
Shameless plug: Functional modeling
Nick has an example of relaxing invariant business rules when accepting input
A discussion on aggregates... it went deep fast. This question was asked and I don't think we answered it well, so hopefully I did that better here.
What are the technical reasons that bidirectional relations between entities are not recommended? Does it impact an ORM's performance? (If so, why?)
Source:
http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/best-practices.html#constrain-relationships-as-much-as-possible
https://ocramius.github.io/doctrine-best-practices/#/86
The first source you refer to mentions three reasons:
This has several benefits:
Reduced coupling in your domain model
Simpler code in your domain model (no need to maintain bidirectionality properly)
Less work for Doctrine
In the second:
BI-DIRECTIONAL ASSOCIATIONS ARE OVERHEAD
I assume those are the whys. "Less work for Doctrine" and "are overhead" most likely mean that it impacts performance; I wouldn't know how else to interpret that...
Makes sense since the ORM needs to update both sides whenever you change something in a bi-directional relationship.
As well as the reasons mentioned in the source you provided (and Wilt's answer), having a lot of relationships between entities makes it easier to violate single responsibility and can make your code more complex.
Take this example: I want to update a user's phone number from a certain part of the code. I currently only have access to an organization the user belongs to. If I have a full path of connections between entities, I can do this:
foreach ($organization->getDepartments() as $department) {
    if ($department->getName() == 'sales') {
        foreach ($department->getMembers() as $member) {
            if ($member->getName() == 'Kevin') {
                // Pass the number as a string: a leading zero would make PHP read it as an octal literal
                $member->setPhoneNumber('012343929394');
            }
        }
    }
}
It's a personal preference but I think that making this sort of thing hard to do is a good idea. Instead you would fetch the member based on name from the database in a dedicated service for editing user info. This means your logic is more encapsulated. A new developer working on the code will be more likely to look for the UserEditService if they don't have access to everything from everywhere.
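A dedicated service of that kind might look roughly like the following. This is a Java sketch rather than PHP, and the MemberRepository interface, its in-memory implementation, and the changePhoneNumber method are hypothetical names chosen for the example; only UserEditService comes from the text above:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class Member {
    private final String name;
    private String phoneNumber;

    Member(String name, String phoneNumber) {
        this.name = name;
        this.phoneNumber = phoneNumber;
    }

    String name() {
        return name;
    }

    String phoneNumber() {
        return phoneNumber;
    }

    void setPhoneNumber(String phoneNumber) {
        this.phoneNumber = phoneNumber;
    }
}

// Hypothetical lookup abstraction; a real implementation would query the database.
interface MemberRepository {
    Optional<Member> findByName(String name);
}

// In-memory stand-in so the sketch runs without a database.
final class InMemoryMemberRepository implements MemberRepository {
    private final Map<String, Member> byName = new HashMap<>();

    void add(Member member) {
        byName.put(member.name(), member);
    }

    @Override
    public Optional<Member> findByName(String name) {
        return Optional.ofNullable(byName.get(name));
    }
}

// The single place where user details are edited.
final class UserEditService {
    private final MemberRepository members;

    UserEditService(MemberRepository members) {
        this.members = members;
    }

    // Returns false when no member with that name exists.
    boolean changePhoneNumber(String memberName, String newNumber) {
        Optional<Member> member = members.findByName(memberName);
        member.ifPresent(m -> m.setPhoneNumber(newNumber));
        return member.isPresent();
    }
}
```

The caller no longer walks from organization to department to member; it asks the service directly, so the organization does not need a navigable path down to every member.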
I'm not sure how to name data store classes when designing a program's data access layer (DAL).
(By data store class, I mean a class that is responsible to read a persisted object into memory, or to persist an in-memory object.)
It seems reasonable to name a data store class according to two things:
what kinds of objects it handles;
whether it loads and/or persists such objects.
⇒ A class that loads Banana objects might be called e.g. BananaSource.
I don't know how to go about the second point (i.e. the Source bit in the example). I've seen different nouns apparently used for just that purpose:
repository: this sounds very general. Does this denote something read-/write-accessible?
store: this sounds like something that potentially allows write access.
context: sounds very abstract. I've seen this with LINQ and object-relational mappers (ORMs).
P.S. (several months later): This is probably appropriate for containers that contain "active" or otherwise supervised objects (the Unit of Work pattern comes to mind).
retriever: sounds like something read-only.
source & sink: probably not appropriate for object persistence; a better fit with data streams?
reader / writer: quite clear in its intention, but sounds too technical to me.
Are these names arbitrary, or are there widely accepted meanings / semantic differences behind each? More specifically, I wonder:
What names would be appropriate for read-only data stores?
What names would be appropriate for write-only data stores?
What names would be appropriate for mostly read-only data stores that are occasionally updated?
What names would be appropriate for mostly write-only data stores that are occasionally read?
Does one name fit all scenarios equally well?
As no one has yet answered the question, I'll post what I have decided in the meantime.
Just for the record, I have pretty much decided on calling most data store classes repositories. First, it appears to be the most neutral, non-technical term from the list I suggested, and it seems to be well in line with the Repository pattern.
Generally, "repository" seems to fit well where data retrieval/persistence interfaces are something similar to the following:
public interface IRepository<TResource, TId>
{
    int Count { get; }
    TResource GetById(TId id);
    IEnumerable<TResource> GetManyBySomeCriteria(...);
    TId Add(TResource resource);
    void Remove(TId id);
    void Remove(TResource resource);
    ...
}
Another term I have decided on using is provider, which I'll be preferring over "repository" whenever objects are generated on-the-fly instead of being retrieved from a persistence store, or when access to a persistence store happens in a purely read-only manner. (Factory would also be appropriate, but sounds more technical, and I have decided against technical terms for most uses.)
P.S.: Some time has gone by since writing this answer, and I've had several opportunities at work to review someone else's code. One term I've thus added to my vocabulary is Service, which I am reserving for SOA scenarios: I might publish a FooService that is backed by a private Foo repository or provider. The "service" is basically just a thin public-facing layer above these that takes care of things like authentication, authorization, or aggregating / batching DTOs for proper "chunkiness" of service responses.
Well, to add something to your conclusion:
A repository: is meant to care about only one entity and follows certain patterns, as yours does.
A store: is allowed to do a bit more, including working with other entities.
A reader/writer: is split in two so that you can expose and inject only the reading or only the writing functionality into other classes. This comes from the CQRS pattern.
A context: is more or less bound to an ORM, as you mentioned, and is usually used under the hood of a repository or store. Some use it directly instead of putting a repository on top, but it is harder to abstract.
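To make the reader/writer distinction concrete, here is a small Java sketch. The Banana type echoes the question's example; BananaReader, BananaWriter, InMemoryBananaStore and BananaLabelPrinter are names assumed for illustration. The point is only that a consumer declared against the reader interface cannot write:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class Banana {
    final long id;
    final String variety;

    Banana(long id, String variety) {
        this.id = id;
        this.variety = variety;
    }
}

// Read side only.
interface BananaReader {
    Optional<Banana> findById(long id);
}

// Write side only.
interface BananaWriter {
    void save(Banana banana);
}

// One class may implement both; consumers still see just the half they ask for.
final class InMemoryBananaStore implements BananaReader, BananaWriter {
    private final Map<Long, Banana> bananas = new HashMap<>();

    @Override
    public Optional<Banana> findById(long id) {
        return Optional.ofNullable(bananas.get(id));
    }

    @Override
    public void save(Banana banana) {
        bananas.put(banana.id, banana);
    }
}

// Depends on BananaReader, so it has no way to modify the store.
final class BananaLabelPrinter {
    private final BananaReader reader;

    BananaLabelPrinter(BananaReader reader) {
        this.reader = reader;
    }

    String labelFor(long id) {
        return reader.findById(id)
                .map(b -> "Banana #" + b.id + " (" + b.variety + ")")
                .orElse("unknown banana");
    }
}
```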
Our architect has spoken about using SOA techniques throughout our codebase, even on interfaces that are not actually hosted as a service. One of his requests is that we design our interface methods so that we make no assumptions about the actual implementation. So if we have a method that takes in an object and needs to update a property on that object, we explicitly need to return the object from the method. Otherwise we would be relying on the fact that Something is a reference type and C# allows us to update properties on a reference type by default.
So:
public void SaveSomething(Something something)
{
    //save to database
    something.SomethingID = 42;
}
becomes:
public Something SaveSomething(Something something)
{
    //save to database
    return new Something
    {
        //all properties here including new primary key from db
    };
}
I can't really get my head around the benefits of this approach and was wondering if anyone could help?
Is this a common approach?
I think your architect is trying to get your code to have fewer side effects. In your specific example, there isn't a benefit. In many, many cases, your architect would be right, and you can design large parts of your application without side effects, but one place this cannot happen is during operations against a database.
What you need to do is get familiar with functional programming, and prepare for your conversations about cases like these with your architect. Remember his/her intentions are most likely good, but specific cases are YOUR domain. In this case, the side effect is the point, and you would most likely want a return type of bool to indicate success, but returning a new type doesn't make sense.
Show your architect that you understand limiting side effects, but certain side effects must be allowed (database, UI, network access, et cetera), and you will likely find that he or she agrees with you. Find a way to isolate the desired side effects and make them clear to him or her, and it will help your case. Your architect will probably appreciate it if you do this in the spirit of collaboration (not trying to shoot holes in his or her plan).
A couple resources for FP:
A great tutorial on Functional Programming
Wikipedia's entry on Functional programming
Good luck, I hope this helps.
I'm in a project that takes the Single Responsibility Principle pretty seriously. We have a lot of small classes and things are quite simple. However, we have an anemic domain model: there is no behaviour in any of our model classes; they are just property bags. This isn't a complaint about our design. It actually seems to work quite well.
During design reviews, SRP is brought out whenever new behaviour is added to the system, and so new behaviour typically ends up in a new class. This keeps things very easily unit testable, but I am perplexed sometimes because it feels like pulling behaviour out of the place where it's relevant.
I'm trying to improve my understanding of how to apply SRP properly. It seems to me that SRP is in opposition to adding business modelling behaviour that shares the same context to one object, because the object inevitably ends up either doing more than one related thing, or doing one thing but knowing multiple business rules that change the shape of its outputs.
If that is so, then it feels like the end result is an Anemic Domain Model, which is certainly the case in our project. Yet the Anemic Domain Model is an anti-pattern.
Can these two ideas coexist?
EDIT: A couple of context related links:
SRP - http://www.objectmentor.com/resources/articles/srp.pdf
Anemic Domain Model - http://martinfowler.com/bliki/AnemicDomainModel.html
I'm not the kind of developer who just likes to find a prophet and follow what they say as gospel. So I don't provide links to these as a way of stating "these are the rules", just as a source of definition of the two concepts.
Rich Domain Model (RDM) and Single Responsibility Principle (SRP) are not necessarily at odds. RDM is more at odds with a very specialised subclass of SRP: the model advocating "data beans + all business logic in controller classes" (DBABLICC).
If you read Martin's SRP chapter, you'll see his modem example is entirely in the domain layer, but abstracting the DataChannel and Connection concepts as separate classes. He keeps the Modem itself as a wrapper, since that is useful abstraction for client code. It's much more about proper (re)factoring than mere layering. Cohesion and coupling are still the base principles of design.
Finally, three issues:
As Martin notes himself, it's not always easy to see the different 'reasons for change'. The very concepts of YAGNI, Agile, etc. discourage the anticipation of future reasons for change, so we shouldn't invent ones where they aren't immediately obvious. I see 'premature, anticipated reasons for change' as a real risk in applying SRP and should be managed by the developer.
Further to the previous, even correct (but unnecessarily anal) application of SRP may result in unwanted complexity. Always think about the next poor sod who has to maintain your class: will the diligent abstraction of trivial behaviour into its own interfaces, base classes and one-line implementations really aid his understanding of what should simply have been a single class?
Software design is often about getting the best compromise between competing forces. For example, a layered architecture is mostly a good application of SRP, but what about the fact that, for example, the change of a property of a business class from, say, a boolean to an enum has a ripple effect across all the layers - from db through domain, facades, web service, to GUI? Does this point to bad design? Not necessarily: it points to the fact that your design favours one aspect of change to another.
I'd have to say "yes", but you have to do your SRP properly. If the same operation applies to only one class, it belongs in that class, wouldn't you say? How about if the same operation applies to multiple classes? In that case, if you want to follow the OO model of combining data and behavior, you'd put the operation into a base class, no?
I suspect that from your description, you're ending up with classes which are basically bags of operations, so you've essentially recreated the C-style of coding: structs and modules.
From the linked SRP paper:
"The SRP is one of the simplest of the principles, and one of the hardest to get right."
The quote from the SRP paper is very correct; SRP is hard to get right. This one and OCP are the two elements of SOLID that simply must be relaxed to at least some degree in order to actually get a project done. Overzealous application of either will very quickly produce ravioli code.
SRP can indeed be taken to ridiculous lengths, if the "reasons for change" are too specific. Even a POCO/POJO "data bag" can be thought of as violating SRP, if you consider the type of a field changing as a "change". You'd think common sense would tell you that a field's type changing is a necessary allowance for "change", but I've seen domain layers with wrappers for built-in value types; a hell that makes ADM look like Utopia.
It's often good to ground yourself with some realistic goal, based on readability or a desired cohesion level. When you say, "I want this class to do one thing", it should have no more or less than what is necessary to do it. You can maintain at least procedural cohesion with this basic philosophy. "I want this class to maintain all the data for an invoice" will generally allow SOME business logic, even summing subtotals or calculating sales tax, based on the object's responsibility to know how to give you an accurate, internally-consistent value for any field it contains.
I personally do not have a big problem with a "lightweight" domain. Just having the one role of being the "data expert" makes the domain object the keeper of every field/property pertinent to the class, as well as all calculated field logic, any explicit/implicit data type conversions, and possibly the simpler validation rules (i.e. required fields, value limits, things that would break the instance internally if allowed). If a calculation algorithm, perhaps for a weighted or rolling average, is likely to change, encapsulate the algorithm and refer to it in the calculated field (that's just good OCP/PV).
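As a rough illustration of that "data expert" role, here is a Java sketch of an invoice that owns its calculated fields and a simple validation rule while delegating the part that is likely to change. Invoice, TaxPolicy and FlatRateTax are hypothetical names, and the flat-rate tax is only a stand-in for whatever algorithm actually applies:

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// The part likely to change is hidden behind an interface.
interface TaxPolicy {
    BigDecimal taxOn(BigDecimal subtotal);
}

// Illustrative policy: a single flat rate applied to the subtotal.
final class FlatRateTax implements TaxPolicy {
    private final BigDecimal rate;

    FlatRateTax(BigDecimal rate) {
        this.rate = rate;
    }

    @Override
    public BigDecimal taxOn(BigDecimal subtotal) {
        return subtotal.multiply(rate);
    }
}

// The invoice keeps its own data consistent and knows how to compute its derived values.
final class Invoice {
    private final List<BigDecimal> lineAmounts = new ArrayList<>();
    private final TaxPolicy taxPolicy;

    Invoice(TaxPolicy taxPolicy) {
        this.taxPolicy = taxPolicy;
    }

    // Simple validation that protects the instance's internal state.
    void addLine(BigDecimal amount) {
        if (amount.signum() < 0) {
            throw new IllegalArgumentException("line amount must not be negative");
        }
        lineAmounts.add(amount);
    }

    BigDecimal subtotal() {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal amount : lineAmounts) {
            sum = sum.add(amount);
        }
        return sum;
    }

    BigDecimal tax() {
        return taxPolicy.taxOn(subtotal());
    }

    BigDecimal total() {
        return subtotal().add(tax());
    }
}
```

Swapping the tax calculation means passing a different TaxPolicy; the Invoice class itself does not change.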
I don't consider such a domain object to be "anemic". My perception of that term is a "data bag", a collection of fields that has no concept whatsoever of the outside world or even the relation between its fields other than that it contains them. I've seen that too, and it's not fun tracking down inconsistencies in object state that the object never knew was a problem. Overzealous SRP will lead to this by stating that a data object is not responsible for any business logic, but common sense would generally intervene first and say that the object, as the data expert, must be responsible for maintaining a consistent internal state.
Again, personal opinion, I prefer the Repository pattern to Active Record. One object, with one responsibility, and very little if anything else in the system above that layer has to know anything about how it works. Active Record requires the domain layer to know at least some specific details about the persistence method or framework (whether that be the names of stored procedures used to read/write each class, framework-specific object references, or attributes decorating the fields with ORM information), and thus injects a second reason to change into every domain class by default.
My $0.02.
I've found that following the SOLID principles did in fact lead me away from DDD's rich domain model; in the end, I found I didn't care. More to the point, I found that the logical concept of a domain model and a class in whatever language weren't mapped 1:1, unless we were talking about a facade of some sort.
I wouldn't say this is exactly a C-style of programming where you have structs and modules; rather, you'll probably end up with something more functional. I realise the styles are similar, but the details make a big difference. I found my class instances end up behaving like higher-order functions, partial function application, lazily evaluated functions, or some combination of the above. It's somewhat ineffable for me, but that's the feeling I get from writing code following TDD + SOLID: it ended up behaving like a hybrid OO/functional style.
As for inheritance being a bad word, I think that's more due to the fact that inheritance isn't fine-grained enough in languages like Java/C#. In other languages, it's less of an issue, and more useful.
I like the definition of SRP as:
"A class has only one business reason to change"
So, as long as behaviours can be grouped into single "business reasons" then there is no reason for them not to co-exist in the same class. Of course, what defines a "business reason" is open to debate (and should be debated by all stakeholders).
Before I get into my rant, here's my opinion in a nutshell: somewhere everything has got to come together... and then a river runs through it.
I am haunted by coding.
=======
Anemic data model and me... well, we pal around a lot. Maybe it's just the nature of small to medium sized applications with very little business logic built into them. Maybe I am just a bit 'tarded.
However, here's my 2 cents:
Couldn't you just factor out the code in the entities and tie it up to an interface?
public class Object1
{
    public string Property1 { get; set; }
    public string Property2 { get; set; }
    private IAction1 action1;

    public Object1(IAction1 action1)
    {
        this.action1 = action1;
    }

    public void DoAction1()
    {
        action1.Do(Property1);
    }
}

public interface IAction1
{
    void Do(string input1);
}
Does this somehow violate the principles of SRP?
Furthermore, isn't having a bunch of classes sitting around not tied to each other by anything but the consuming code actually a larger violation of SRP, but pushed up a layer?
Imagine the guy writing the client code sitting there trying to figure out how to do something related to Object1. If he has to work with your model he will be working with Object1, the data bag, and a bunch of 'services' each with a single responsibility. It'll be his job to make sure all those things interact properly. So now his code becomes a transaction script, and that script will itself contain every responsibility necessary to properly complete that particular transaction (or unit of work).
Furthermore, you could say, "no brah, all he needs to do is access the service layer. It's like Object1Service.DoActionX(Object1). Piece of cake." Well then, where's the logic now? All in that one method? You're still just pushing code around, and no matter what, you'll end up with data and the logic being separated.
So in this scenario, why not expose to the client code that particular Object1Service and have its DoActionX() basically just be another hook for your domain model? By this I mean:
public class Object1Service
{
    private Object1Repository repository;

    public Object1Service(Object1Repository repository)
    {
        this.repository = repository;
    }

    // Tie in your Unit of Work Aspect'ing stuff or whatever if need be
    public void DoAction1(Object1DTO object1DTO)
    {
        Object1 object1 = repository.GetById(object1DTO.Id);
        object1.DoAction1();
        repository.Save(object1);
    }
}
You still have factored out the actual code for Action1 from Object1 but, for all intents and purposes, have a non-anemic Object1.
Say you need Action1 to represent 2 (or more) different operations that you would like to make atomic and separated into their own classes. Just create an interface for each atomic operation and hook it up inside of DoAction1.
That's how I might approach this situation. But then again, I don't really know what SRP is all about.
Convert your plain domain objects to ActiveRecord pattern with a common base class to all domain objects. Put common behaviour in the base class and override the behaviour in derived classes wherever necessary or define the new behaviour wherever required.
Information-Expert, Tell-Don't-Ask, and SRP are often mentioned together as best practices. But I think they are at odds. Here is what I'm talking about.
Code that favors SRP but violates Tell-Don't-Ask & Info-Expert:
Customer bob = ...;
// TransferObjectFactory has to use Customer's accessors to do its work,
// violates Tell Don't Ask
CustomerDTO dto = TransferObjectFactory.createFrom(bob);
Code that favors Tell-Don't-Ask & Info-Expert but violates SRP:
Customer bob = ...;
// Now Customer is doing more than just representing the domain concept of Customer,
// violates SRP
CustomerDTO dto = bob.toDTO();
Please fill me in on how these practices can co-exist peacefully.
Definitions of the terms,
Information Expert: objects that have the data needed for an operation should host the operation.
Tell Don't Ask: don't ask objects for data in order to do work; tell the objects to do the work.
Single Responsibility Principle: each object should have a narrowly defined responsibility.
I don't think that they are so much at odds as they are emphasizing different things that will cause you pain. One is about structuring code to make it clear where particular responsibilities are and reducing coupling, the other is about reducing the reasons to modify a class.
We all have to make decisions each and every day about how to structure code and what dependencies we are willing to introduce into designs.
We have built up a lot of useful guidelines, maxims and patterns that can help us to make the decisions.
Each of these is useful to detect different kinds of problems that could be present in our designs. For any specific problem that you may be looking at there will be a sweet spot somewhere.
The different guidelines do contradict each other. Just applying every piece of guidance you have heard or read will not make your design better.
For the specific problem you are looking at today you need to decide what the most important factors that are likely to cause you pain are.
"Tell Don't Ask" applies when you ask for an object's state in order to then tell that object to do something.
In your first example, TransferObjectFactory.createFrom is just a converter. It doesn't tell the Customer object to do something after inspecting its state.
I think the first example is correct.
Those classes are not at odds. The DTO is simply serving as a conduit of data from storage that is intended to be used as a dumb container. It certainly doesn't violate the SRP.
On the other hand, the .toDTO method is questionable: why should Customer have this responsibility? For "purity's" sake I would have another class whose job it was to create DTOs from business objects like Customer.
Don't forget these principles are principles, and when you can get away with simpler solutions until changing requirements force the issue, then do so. Needless complexity is definitely something to avoid.
I highly recommend, BTW, Robert C. Martin's Agile Software Development, Principles, Patterns, and Practices for a much more in-depth treatment of this subject.
DTOs with a sister class (like you have) violate all three principles you stated, and encapsulation, which is why you're having problems here.
What are you using this CustomerDTO for, and why can't you simply use Customer, and have the DTO's data inside the customer? If you're not careful, the CustomerDTO will need a Customer, and a Customer will need a CustomerDTO.
TellDontAsk says that if you are basing a decision on the state of one object (e.g. a customer), then that decision should be performed inside the customer class itself.
An example is if you want to remind the customer to pay any outstanding bills, so you call
List<Bill> bills = customer.GetOutstandingBills();
PaymentReminder.RemindCustomer(customer, bills);
This is a violation. Instead you want to do
customer.RemindAboutOutstandingBills();
(and of course you will need to pass in the PaymentReminder as a dependency upon construction of the customer).
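A rough Java sketch of the "tell" version follows. The Bill fields, the PaymentReminder interface and its remind signature are assumptions made for this example; the answer only names the types:

```java
import java.util.ArrayList;
import java.util.List;

final class Bill {
    final String reference;
    private boolean paid;

    Bill(String reference, boolean paid) {
        this.reference = reference;
        this.paid = paid;
    }

    boolean isPaid() {
        return paid;
    }
}

// Hypothetical collaborator; a real one might send an email or a letter.
interface PaymentReminder {
    void remind(String customerName, List<Bill> outstandingBills);
}

final class Customer {
    private final String name;
    private final List<Bill> bills = new ArrayList<>();
    private final PaymentReminder reminder;

    // The reminder is passed in on construction, as the answer suggests.
    Customer(String name, PaymentReminder reminder) {
        this.name = name;
        this.reminder = reminder;
    }

    void addBill(Bill bill) {
        bills.add(bill);
    }

    // The decision "which bills are outstanding" stays inside Customer.
    void remindAboutOutstandingBills() {
        List<Bill> outstanding = new ArrayList<>();
        for (Bill bill : bills) {
            if (!bill.isPaid()) {
                outstanding.add(bill);
            }
        }
        if (!outstanding.isEmpty()) {
            reminder.remind(name, outstanding);
        }
    }
}
```

The caller never sees the bill list; it only tells the customer to act, and the customer decides whether a reminder is needed.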
Information Expert says the same thing pretty much.
Single Responsibility Principle can be easily misunderstood - it says that the customer class should have one responsibility, but also that the responsibility of grouping data, methods, and other classes aligned with the 'Customer' concept should be encapsulated by only one class. What constitutes a single responsibility is extremely hard to define exactly and I would recommend more reading on the matter.
Craig Larman discussed this when he introduced GRASP in Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development (2004):
In some situations, a solution suggested by Expert is undesirable, usually because of problems in coupling and cohesion (these principles are discussed later in this chapter).
For example, who should be responsible for saving a Sale in a database? Certainly, much of the information to be saved is in the Sale object, and thus Expert could argue that the responsibility lies in the Sale class. And, by logical extension of this decision, each class would have its own services to save itself in a database. But acting on that reasoning leads to problems in cohesion, coupling, and duplication. For example, the Sale class must now contain logic related to database handling, such as that related to SQL and JDBC (Java Database Connectivity). The class no longer focuses on just the pure application logic of “being a sale.” Now other kinds of responsibilities lower its cohesion. The class must be coupled to the technical database services of another subsystem, such as JDBC services, rather than just being coupled to other objects in the domain layer of software objects, so its coupling increases. And it is likely that similar database logic would be duplicated in many persistent classes.
All these problems indicate violation of a basic architectural principle: design for a separation of major system concerns. Keep application logic in one place (such as the domain software objects), keep database logic in another place (such as a separate persistence services subsystem), and so forth, rather than intermingling different system concerns in the same component.[11]
Supporting a separation of major concerns improves coupling and cohesion in a design. Thus, even though by Expert we could find some justification for putting the responsibility for database services in the Sale class, for other reasons (usually cohesion and coupling), we'd end up with a poor design.
Thus the SRP generally trumps Information Expert.
However, the Dependency Inversion Principle can combine well with Expert. The argument here would be that Customer should not have a dependency on CustomerDTO (general to detail), but the other way around. This would mean that CustomerDTO is the Expert and should know how to build itself given a Customer:
CustomerDTO dto = new CustomerDTO(bob);
If you're allergic to new, you could go static:
CustomerDTO dto = CustomerDTO.buildFor(bob);
Or, if you hate both, we come back around to an AbstractFactory:
public abstract class DTOFactory<D, E> {
    public abstract D createDTO(E entity);
}

public class CustomerDTOFactory extends DTOFactory<CustomerDTO, Customer> {
    @Override
    public CustomerDTO createDTO(Customer entity) {
        return new CustomerDTO(entity);
    }
}
I don't 100% agree w/ your two examples as being representative, but from a general perspective you seem to be reasoning from the assumption of two objects and only two objects.
If you separate the problem out further and create one (or more) specialized objects to take on the individual responsibilities you have, and then have the controlling object pass instances of the other objects it is using to the specialized objects you have carved off, you should be able to observe a happy compromise between SRP (each responsibility is handled by a specialized object) and Tell Don't Ask (the controlling object tells the specialized objects it is composing together to do whatever it is that they do, to each other).
It's a composition solution that relies on a controller of some sort to coordinate and delegate between other objects without getting mired in their internal details.