What should repositories in DDD return - repository

I researched about repositories in DDD and found too much different thing. Everyone says different things about repositories and that made me confused.
I want to know:
What methods should repositories contain?
What should repositories definitely (or closer that) return?
Thanks.

For each aggregate root (AR) you should have a repository. As a minimum the repository would probably have a void Save(Aggregate aggregate) and a Aggregate Get(Guid id) method. The returned aggregate would always be fully constituted.
I sometimes add methods for specific use cases in order to update only certain bits of data. For instance, something like void Activate(Guid id) or some such. This is simply to avoid manipulating more data than is necessary.
Querying on a repository is usually problematic since you should typically avoid querying your domain. For such scenarios my recommendation is to use a query mechanism that is closer to the data and in a more raw format than a domain object or object graph. The query mechanism would more-then-likely return primitives, such as int Count(Query.Specification specification) or perhaps return a list of read model instances.

You are right, a repository has different meanings in different contexts - and many authors have their own interpretation. The way I understand them is from multiple perspectives:
They abstract away underline storage type
They can introduce interface closer to the domain model
They represent a collection of objects and thus serve as aggregate
in-memory storage(collection of related objects)
They represent a transaction boundary for related objects.
They can't contain duplicates - like sets.
It is valid for the repository to contain only one object, without
complex relations internally
So to answer your questions, repositories should contain collection related methods like add, remove, addAll, findByCriteria - instead of save, update, delete. They can return whole aggregate or parts of aggregates or some internal aggregate relation - it is dependent on your domain model and the way you want to represent objects

Eric Evans coined "domain driven design" in 2003; so the right starting point for any definitions in that context is his book. He defines the repository pattern in chapter 6 ("Lifecycle of a Domain Object").
A REPOSITORY represents all objects of a type as a conceptual set (usually emulated). It acts like a collection, except with more elaborate querying ability. Objects of the appropriate type are added and removed, and the machinery behind the repository inserts them or deletes them from the database.
...
For each type of object that requires global access, create an object that can provide the illusion of an in-memory collection of all objects of that type.
The primary use case of a repository: given a key, return the correct root entity. The repository implementation acts as a module, which hides your choice of persistence strategy (see: Parnas 1971).

Related

Relations between Repositories for Aggragates in DDD

I'm building a Repository for an Aggragate. We've got 3 different Entities that it's constructed out of, one of them is root.
Data for all 3 is persisted in a SQL database. Each has it's own table.
Let's consider simple case of getting full list of those Aggregates. I need to fetch data from all 3 tables. Should I build one optimised query to fetch this data set or rather encapsulate logic for each Entity in it's own Repository and assemble it the Aggragate's repo? (Aggregate repo would then call respective repos and assemble it)
I'm leaning twords the first solution, however it's stronger coupleing. The later seems nicer from OOP point of view, but seems to be overcomplicated and potentialy casue problems with cache invalidation for subsequent sets of data etc.
For each type of object requiring global access, create an object that provides the illusion of all objects of this type stored in memory. Configure access through the global interface. [..] Define methods for adding and removing objects. [..] Define repositories only for aggregates. ~ Evans, about repositories
You should create one repository for Aggregate only. There is no reason to create seperate repositories. What is more, creating seperate repository would cause some additional problems as you mentioned.
I'm leaning twords the first solution, however it's stronger
coupleing.
To answer that, please take a look at Aggregate definition from Martin Fowler:
Aggregate is a pattern in Domain-Driven Design. A DDD aggregate is a
cluster of domain objects that can be treated as a single unit. An
example may be an order and its line-items, these will be separate
objects, but it's useful to treat the order (together with its line
items) as a single aggregate.
Aggregate is coupling Entities that it is constructed out of by definition.

Use structures to group individual attributes or not?

I'm in doubt of how to get the best of ABAP structures and class attributes.
Let's say that I have the object Operation with 4 fields: operation id, type, description and date.
Now I can create a class with this 4 attributes, but then if I want to have a constructor, I need either 4 individual parameters or a structure than needs to be mapped to each attribute. The same happens if I want to get all this object data in one structure, for instance to return via RFC. Then a method get_operation_details( ) will need to map all of them one by one.
If I use a structure type ty_operation_details as a single class attribute, then when I add a field to the structure would also keep the constructor valid and the get_operation_details( ) method would also be always OK. However it seems wrong to have something like Operation->get_details( )-operationID, instead of operation->operation_ID if I had the attribute directly in the public section with READ-ONLY. I guess the first approach is more correct in the OO world, but we lose some of the ABAP benefits.
What do you recommend to use? Maybe one thing it could allow the first option and use structures at the same time would be a CORRESPONDING statement able to map class attributes to a flat structure, but I don't think this is possible.
Like most things, your design should follow your usage. If you primarily use a set of attributes together, consider grouping them in a structure. If you primarily use them individually, or in varying recombinations, keep them separate.
Some considerations:
Grouping makes calls shorter if you always create/update/delete a set of attributes together. You already identified this advantage.
Grouping reveals logical relations between fields, that are not clear when keeping the fields separate. For example, this could reveal that one part of your parameters is mandatory, while the rest forms several optional sets.
Grouping simplifies features that operate on state, such as the Memento or the Flyweight pattern, in that it allows to extract, store, and restore the object's state as a single structure.
Also, like many other things, there may be benefit in turning this either-or question into a I'll simply use both. For example, if your class has four individual properties, why not still offer a method that sets or gets them as a structure; of course, this will add some mapping, but the mapping would remain encapsulated within your own class, while consumer get an easy-to-consume interface.

Why are repositories only used for aggregates in Domain-Driven Design?

In DDD, repositories are used to perform serialization and de-serialization of aggregates, e.g. by reading and writing to a database. That way, aggregates can contain purer business logic, and won't be coupled to non-domain-specific persistence strategies.
However, I wonder why repositories are always described as being used for aggregates specifically. Isn't it equally motivated to use it for all entities?
(If this is only a matter of the fact that all plain entities can be seen as aggregate roots with zero children, please notify me of this, and the question can be buried.)
I wonder why repositories are always described as being used for aggregates specifically. Isn't it equally motivated to use it for all entities?
Because aggregates are the consistency boundaries exposed to the application layer.
Which is to say that, yes, the repositories are responsible for taking the snapshot of state from the data store, and building from it the graph of entities and values that make up the aggregate.
The API of the repository only exposes an aggregate root, because that defines the consistency boundary. Instead of allowing the application to reach into an arbitrary location in the graph and make changes, we force the application to communicate with the root object exclusively. With this constraint in place, we only need to look in one place to ensure that all changes satisfy the business invariant.
So there's no need to develop a repository for each type of entity in your model, because the application isn't allowed to interact directly with the model on that fine a grain.
Put another way, the entities within the aggregate are private data structures. We don't allow the client code to manipulate the entities directly for the same reason that we don't implement lists that allow the clients to reach past the api and manipulate the pointers directly.
In cqrs, you do see "repositories" that are used for things other than aggregates -- repositories can also be used to look up cached views of the state of the model. The trick is that the views don't support modification. In the approach that Evans describes, each entity has one single representation that fulfills all of its roles. In CQRS, and entity may have different representations in each role, but typically only a single role that supports modifying the entity.
In DDD there are two kind of entities: Aggregate roots and nested entities. As #VoiceOfUnreason answered, you are not allowed to modify the nested entities from outside an Aggregate so there is no need to have a repository for them (by "repository" I'm refering to an interface for load and persist an entities state). If you would be allowed, it would break the Aggregate's encapsulation, one if the most important things in OOP. Encapsulation helps in rich domains, with lots and lots of models where DDD is a perfect fit.

Object persistence terminology: 'repository' vs. 'store' vs. 'context' vs. 'retriever' vs. (...)

I'm not sure how to name data store classes when designing a program's data access layer (DAL).
(By data store class, I mean a class that is responsible to read a persisted object into memory, or to persist an in-memory object.)
It seems reasonable to name a data store class according to two things:
what kinds of objects it handles;
whether it loads and/or persists such objects.
⇒ A class that loads Banana objects might be called e.g. BananaSource.
I don't know how to go about the second point (ie. the Source bit in the example). I've seen different nouns apparently used for just that purpose:
repository: this sounds very general. Does this denote something read-/write-accessible?
store: this sounds like something that potentially allows write access.
context: sounds very abstract. I've seen this with LINQ and object-relational mappers (ORMs).
P.S. (several months later): This is probably appropriate for containers that contain "active" or otherwise supervised objects (the Unit of Work pattern comes to mind).
retriever: sounds like something read-only.
source & sink: probably not appropriate for object persistence; a better fit with data streams?
reader / writer: quite clear in its intention, but sounds too technical to me.
Are these names arbitrary, or are there widely accepted meanings / semantic differences behind each? More specifically, I wonder:
What names would be appropriate for read-only data stores?
What names would be appropriate for write-only data stores?
What names would be appropriate for mostly read-only data stores that are occasionally updated?
What names would be appropriate for mostly write-only data stores that are occasionally read?
Does one name fit all scenarios equally well?
As noone has yet answered the question, I'll post on what I have decided in the meantime.
Just for the record, I have pretty much decided on calling most data store classes repositories. First, it appears to be the most neutral, non-technical term from the list I suggested, and it seems to be well in line with the Repository pattern.
Generally, "repository" seems to fit well where data retrieval/persistence interfaces are something similar to the following:
public interface IRepository<TResource, TId>
{
int Count { get; }
TResource GetById(TId id);
IEnumerable<TResource> GetManyBySomeCriteria(...);
TId Add(TResource resource);
void Remove(TId id);
void Remove(TResource resource);
...
}
Another term I have decided on using is provider, which I'll be preferring over "repository" whenever objects are generated on-the-fly instead of being retrieved from a persistence store, or when access to a persistence store happens in a purely read-only manner. (Factory would also be appropriate, but sounds more technical, and I have decided against technical terms for most uses.)
P.S.: Some time has gone by since writing this answer, and I've had several opportunities at work to review someone else's code. One term I've thus added to my vocabulary is Service, which I am reserving for SOA scenarios: I might publish a FooService that is backed by a private Foo repository or provider. The "service" is basically just a thin public-facing layer above these that takes care of things like authentication, authorization, or aggregating / batching DTOs for proper "chunkiness" of service responses.
Well so to add something to you conclusion:
A repository: is meant to only care about one entity and has certain patterns like you did.
A store: is allowed to do a bit more, also working with other entities.
A reader/writer: is separated to allow semantically show and inject only reading and wrting functionality into other classes. It's coming from the CQRS pattern.
A context: is more or less bound to a ORM mapper as you mentioned and is usually used under the hood of a repository or store, some use it directly instead of making a repository on top. But it's harder to abstract.

how to model value object relationships?

context:
I have an entity Book. A book can have one or more Descriptions. Descriptions are value objects.
problem:
A description can be more specific than another description. Eg if a description contains the content of the book and how the cover looks it is more specific than a description that only discusses how the cover looks. I don't know how to model this and how to have the repository save it. It is not the responsibility of the book nor of the book description to know these relationships. Some other object can handle this and then ask the repository to save the relationships. But BookRepository.addMoreSpecificDescription(Description, MoreSpecificDescription) seems difficult for the repository to save.
How is such a thing handled in DDD?
The other two answers are one direction (+1 btw). I am coming in after your edit to the original question, so here are my two cents...
I define a Value Object as an object with two or more properties that can (and is) shared amongst other entities. They can be shared only within a single Aggregate Root, that's fine too. Just the fact that they can (and are) shared.
To use your example, you define a "Description" as a Value Object. That tells me that "Description" with multiple properties can be shared amongst several Books. In the real-world, that does not make sense as we all know each book has unique descriptions written by the master of who authored or published the book. Hehe. So, I would argue that Descriptions aren't really Value Objects, but themselves are additional Entity objects within your Book Aggregate Root Entity boundery (you can have multiple entities within a single aggregate root's entity). Even books that are re-released, a newer revision, etc have slightly different descriptions describing that slight change.
I believe that answers your question - make the descriptions entity objects and protect them behind your main Book Entity Aggregate Root (e.g. Book.GetDescriptions()...). The rest of this answer addresses how I handle Value Objects in Repositories, for others reading this post...
For storing Value Objects in a repository, and retrieving them, we start to encroach onto the same territory I wrestled with myself when I went switched from a "Database-first" modeling approach to a DDD approach. I myself wreslted with this one, on how to store a Value Object in the DB, and retrieve it without an Identity. Until I stepped back and realized what i was doing...
In Domain Driven Design, you are modeling the Value Objects in your domain - not your data store. That is the key phrase. It means you are not designing the Value Objects to be stored as independant objects in the data store, you can store them however you like!
Let's take the common DDD example of Value Objects, that being an Address(). DDD presents that an Mailing Address is the perfect Value Object example, as the definition of a Value Object is an object of who's properties sum up to create the uniqueness of the object. If a property changes, it will be a different Value Object. And the same Value Object 9teh sum of its properties) can be shared amongst other Entities.
A Mailing Address is a location, a long/lat of a specific location on the planet. Multiple people can live at the address, and when someone moves, the new people to occupy the same Mailing Address now use the same Value Object.
So, I have a Person() object with a MailingAddress() object that has the address information in it. It is protected behind my Person() aggregate root with get/update/create methods/services.
Now, how do we store that and share it amongst the people in the same household? Ah, there lies DDD - you aren't modeling your data store straight from your DDD (even though, that would be nice). With that said, you simple create a single Table that presents your Person object, and it has the columns for your mailing address within it. It is the job of your Repository to re-hydrate that information back into your Person() and MailingAddress() object from the data store, and to split it up during the Create/Update operations.
Yep, you'd have duplicate data now in your data store. Three Person() entities with the same mailing address all now have three seperate copies of that Value Object data - and that is ok! Value Objects are meant to be copied and destoyed quite easily. "Copy" is the optimum word there in the DDD playbook.
So to sum up, Domain Drive Design is about modeling your Domain to represent your actual business use of the objects. You model a Person() entity and a MailingAddress Value Object seperately, as they are represented differently in your application. You persist them a copied-data, that being additional columns in the same table as your Person table.
All of the above is strict-DDD. But, DDD is meant to be just "suggestions", not rules to live by. That's why you are free to do what myself and many others have done, kind of a loose-DDD style. If you don't like the copied data, your only option is that being you can create a seperate table for MailingAddress() and stick an Identity column on it, and update your MailingAddress() object to have now have that identity on it - knowing you only use that identity to link it to other Person() objects that share it (I personally like a 3rd many-to-many relationship table, to keep the speed of the queries up). You would mask that Idenity (i.e. internal modifier) from being exposed outside of your Aggregate Root/Domain, so other layers (such as the Application or UI) do not know of the Identity column of the MailingAddress, if possible. Also, I would create a dedicated Repository just for MailingAddress, and use your PersonService layer to combine them into the correct object, Person.MailingAddress().
Sorry for the rant... :)
First, I think that reviews should be entities.
Second, why are you trying to model relationships between reviews? I don't see a natural relationship between them. "More specific than" is too vague to be useful as a relationship.
If you're having difficulty modeling the situation, that suggests that maybe there is no relationship.
I agree with Jason. I don't know what your rationale is for making reviews value objects.
I would expect a BookReview to have BookReviewContentItems so that you could have a method on the BookReview to call to decide if it is specific enough, where the method decides based on querying its collection of content items.