Should repositories have update/save/delete methods?

I want to implement the Repository design pattern in my project, but it's not clear to me whether repositories should expose CRUD operations. Some resources say you shouldn't have update/save/delete methods because the repository is only for saving objects in memory, and you should use services for other actions.
Which one is the best way?
Thanks.

A summary of Martin Fowler’s definition of the Repository pattern:
Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.
So if we have both add and update methods, I could claim it’s not a collection-like interface, right? I shouldn’t need to bother checking if an object’s already there when adding to a set-like collection.
There are two common approaches about add/update:
Collection-oriented repositories try to mimic an in-memory collection, so you shouldn’t need to re-add an object if it was updated and already in the collection. The repository (or layers hidden below it, such as an ORM) should handle the changes to an entity and track them. You just add an object when you first create it and then no more methods are needed after the entity is changed.
Persistence-oriented repositories are aware that an object needs to be explicitly “saved” after any changes, so you call the repository's save(entity) method whenever an object is created or modified.
(Those are my interpretations of the definitions by Vaughn Vernon in Implementing Domain-Driven Design.)
delete is fine, but perhaps remove would be a better name.
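To make the difference between the two approaches concrete, here is a minimal sketch in Python. It is not taken from Vernon's book; the Customer class, both repository classes and their method names are assumptions made up for illustration. The collection-oriented version keeps object references, so the tracking a real ORM would do comes for free, while the persistence-oriented version stores copies to imitate a database that only sees what is explicitly saved.

```python
import copy
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    name: str


class CollectionOrientedRepository:
    """Behaves like a set: add once, later changes need no further call.

    The same object reference is kept here, so "change tracking" is trivial.
    With a real database, an ORM or Unit of Work would do that job.
    """

    def __init__(self):
        self._items = {}

    def add(self, customer):
        self._items[customer.id] = customer

    def remove(self, customer):
        self._items.pop(customer.id, None)

    def get(self, customer_id):
        return self._items.get(customer_id)


class PersistenceOrientedRepository:
    """Stores copies, so every change must be written back with save()."""

    def __init__(self):
        self._rows = {}

    def save(self, customer):
        # Used for both new and modified objects.
        self._rows[customer.id] = copy.deepcopy(customer)

    def remove(self, customer):
        self._rows.pop(customer.id, None)

    def get(self, customer_id):
        row = self._rows.get(customer_id)
        return copy.deepcopy(row) if row is not None else None


collection_repo = CollectionOrientedRepository()
alice = Customer(1, "Alice")
collection_repo.add(alice)
alice.name = "Alice Smith"  # no second call needed

persistence_repo = PersistenceOrientedRepository()
bob = Customer(2, "Bob")
persistence_repo.save(bob)
bob.name = "Bob Jones"  # not stored until save() is called again
```

At this point collection_repo.get(1) already sees the new name, while persistence_repo.get(2) still returns the old one until persistence_repo.save(bob) is called a second time.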

Related

Repository design for complex objects?

What is the best way to design repositories for complex objects, assuming use of an ORM such as NHibernate or Entity Framework?
I am creating an app using Entity Framework 4. The app uses complex objects--a Foo object contains a collection of Bar objects in a Foo.Bars property, and so on. In the past, I would have created a FooRepository, and then a BarRepository, and I would inject a reference to the BarRepository into the FooRepository constructor.
When a query is passed to the FooRepository, it would call on the BarRepository as needed to construct the Foo.Bars property for each Foo object. And when a Foo object is passed to the FooRepository for persistence, the repository would call the BarRepository to persist the objects in the Foo.Bars property.
My question is pretty simple: Is that a generally accepted way to set up the repositories? Is there a better approach? Thanks for your help.
In domain-driven design, there is the concept of an "aggregate root" object. The accepted answer to a related question has good information on what it is and how you would use it in your design. I don't know about Entity Framework, but NHibernate does not require the usage pattern you are describing. As long as all the nested objects and their relationships are properly mapped to tables, saving the aggregate root will also save all its child objects. The exception is when a nested object has specific business logic that needs to be performed as part of its access or persistence. In that case, you would need to pass in the "child" repositories so you are not duplicating that business logic.
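As a rough illustration of saving through the aggregate root, here is a small Python sketch. It does not use NHibernate or Entity Framework; two dictionaries stand in for the mapped tables, and the save/get method names are assumptions. The point is only that FooRepository writes and rebuilds the Bars itself, with no BarRepository involved.

```python
from dataclasses import dataclass, field


@dataclass
class Bar:
    label: str


@dataclass
class Foo:
    id: int
    bars: list = field(default_factory=list)


class FooRepository:
    """Persists the whole Foo aggregate; there is no BarRepository."""

    def __init__(self):
        # Two dicts stand in for two mapped tables.
        self._foo_rows = {}
        self._bar_rows = {}

    def save(self, foo):
        self._foo_rows[foo.id] = {"id": foo.id}
        # Children are written as part of the root, like an ORM cascade.
        self._bar_rows[foo.id] = [{"label": bar.label} for bar in foo.bars]

    def get(self, foo_id):
        if foo_id not in self._foo_rows:
            return None
        bars = [Bar(row["label"]) for row in self._bar_rows.get(foo_id, [])]
        return Foo(id=foo_id, bars=bars)


repo = FooRepository()
repo.save(Foo(id=1, bars=[Bar("a"), Bar("b")]))
loaded = repo.get(1)
```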
The Repository pattern helps group business transactions among related entities. That is, if you have two domain objects, foo and bar, that share common operations such as GetList() and Update(), then a common repository such as FoobarRepository can be created. You can even abstract that into an interface called IFoobarRepository to keep the application loosely coupled.

DDD: Repositories are in-memory collections of objects?

I've noticed Repository is usually implemented in either of the following ways:
Method 1
void Add(object obj);
void Remove(object obj);
object GetBy(int id);
Method 2
void Save(object obj); // Used both for Insert and Update scenarios
void Remove(object obj);
object GetBy(int id);
Method 1 has collection semantics (which is how repositories are defined). We can get an object from a repository and modify it. But we don't tell the collection to update it. Implementing a repository this way requires another mechanism for persisting the changes made to an in-memory object. As far as I know, this is done using Unit of Work. However, some argue that UoW is only required when you need transaction control in your system.
Method 2 eliminates the need to have UoW. You can call the Save() method and it determines if the object is new and should be Inserted or is modified and should be Updated. It then uses the data mappers to persist the changes to the database. Whilst this makes life much easier, a repository modeled this way doesn't have collection semantics. This model has DAO semantics.
I'm really confused about this. If repositories mimic an in-memory collection of objects, then we should model them according to Method 1.
What are your thoughts on this?
Mosh
I personally have no issue with the Unit of Work pattern being a part of the solution. Obviously, you only need it for the CUD in CRUD. The fact that you are implementing a UoW pattern, though, does nothing more than dictate that you have a set of operations that need to go as a batch. That is slightly different from saying it needs to be a part of a transaction. If you abstract your repositories well enough, your UoW implementation can be agnostic to the backing mechanism that you are using, whether it is a database, XML, etc.
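A minimal Python sketch of that idea, under the assumption of a hypothetical backend object exposing insert/update/delete (all class and method names here are invented for illustration). The Unit of Work only collects the CUD requests and replays them as one batch on commit; it knows nothing about what the backend is, and it is deliberately not transactional (there is no rollback).

```python
class Order:
    def __init__(self, order_id, status):
        self.id = order_id
        self.status = status


class InMemoryBackend:
    """Hypothetical storage; an XML or database backend could replace it."""

    def __init__(self):
        self.rows = {}

    def insert(self, order):
        self.rows[order.id] = order.status

    def update(self, order):
        self.rows[order.id] = order.status

    def delete(self, order):
        self.rows.pop(order.id, None)


class UnitOfWork:
    """Collects create/update/delete requests and applies them as one batch."""

    def __init__(self, backend):
        self._backend = backend
        self._new = []
        self._dirty = []
        self._removed = []

    def register_new(self, order):
        self._new.append(order)

    def register_dirty(self, order):
        # A new object is written in full on commit, so it needs no update.
        if order not in self._new and order not in self._dirty:
            self._dirty.append(order)

    def register_removed(self, order):
        self._removed.append(order)

    def commit(self):
        for order in self._new:
            self._backend.insert(order)
        for order in self._dirty:
            self._backend.update(order)
        for order in self._removed:
            self._backend.delete(order)
        self._new, self._dirty, self._removed = [], [], []


backend = InMemoryBackend()
uow = UnitOfWork(backend)
order = Order(1, "open")
uow.register_new(order)
order.status = "paid"
uow.register_dirty(order)  # ignored: the order is still pending insert
rows_before_commit = dict(backend.rows)  # nothing has been written yet
uow.commit()
```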
As to the specific question, I think the difference between method one and method two is trivial, if for no other reason than that most instances of method two contain a check to see if the identifier is set. If set, treat as update; otherwise, treat as insert. This logic is often built into the repository and is more for simplification of the exposed interface, in my opinion. The repository's purpose is to broker objects between a consumer and a data source and to remove the need for direct knowledge of the data source. I go with method two, because I trust the simple logic of detecting an identifier more than having to rely on tracking object states all over the application.
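The identifier check described above can be sketched in a few lines of Python. The User class, the in-memory dictionary and the log attribute are assumptions for illustration only; a real repository would hand the insert or update to a data mapper instead.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    id: Optional[int] = None  # None means "not stored yet"


class UserRepository:
    def __init__(self):
        self._rows = {}
        self._next_id = itertools.count(1)
        self.log = []  # records which branch ran, for illustration only

    def save(self, user):
        if user.id is None:
            # No identifier yet: treat as insert and assign one.
            user.id = next(self._next_id)
            self.log.append("insert")
        else:
            # Identifier already set: treat as update.
            self.log.append("update")
        self._rows[user.id] = user.name

    def get_by(self, user_id):
        name = self._rows.get(user_id)
        return User(name=name, id=user_id) if name is not None else None


repo = UserRepository()
user = User("ann")
repo.save(user)   # first call inserts
user.name = "Ann"
repo.save(user)   # second call updates
```

The caller never has to say which of the two operations it wants, which is the simplification of the exposed interface mentioned above.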
The fact that the terminology for repository usage is so similar to that of both data access and object collections adds to the confusion. I just treat them as their own first-class citizens and do what is best for the domain. ;-)
Maybe you want to have:
T Persist(T entityToPersist);
void Remove(T entityToRemove);
"Persist" being the same as "Save Or Update" or "Add Or Update" - ie. the Repo encapsulates creating new identities (the db may do this) but always returns the new instance with the identity reference.

Repository, Service or Domain object - where does logic belong?

Take this simple, contrived example:
UserRepository.GetAllUsers();
UserRepository.GetUserById();
Inevitably, I will have more complex "queries", such as:
//returns users where active=true, deleted=false, and confirmed = true
GetActiveUsers();
I'm having trouble determining where the responsibility of the repository ends. GetActiveUsers() represents a simple "query". Does it belong in the repository?
How about something that involves a bit of logic, such as:
//activate the user, set the activationCode to "used", etc.
ActivateUser(string activationCode);
Repositories are responsible for the application-specific handling of sets of objects. This naturally covers queries as well as set modifications (insert/delete).
ActivateUser operates on a single object. That object needs to be retrieved, then modified. The repository is responsible for retrieving the object from the set; another class would be responsible for invoking the query and using the object.
These are all excellent questions to be asking. Being able to determine which of these you should use comes down to your experience and the problem you are working on.
I would suggest reading a book such as Fowler's Patterns of Enterprise Application Architecture. In this book he discusses the patterns you mention. Most importantly, though, he assigns each pattern a responsibility. For instance, domain logic can be put in either the Service or Domain layers. There are pros and cons associated with each.
If I decide to use a Service layer I assign the layer the role of handling Transactions and Authorization. I like to keep it 'thin' and have no domain logic in there. It becomes an API for my application. I keep all business logic with the domain objects. This includes algorithms and validation for the object. The repository retrieves and persists the domain objects. This may be a one to one mapping between database columns and domain properties for simple systems.
I think GetActiveUsers is OK for the Repository. You wouldn't want to retrieve all users from the database and figure out which ones are active in the application, as this would lead to poor performance. If ActivateUser has business logic as you suggest, then that logic belongs in the domain object. Persisting the change is the responsibility of the Repository layer.
Hope this helps.
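Here is one way the split described in this answer could look, as a Python sketch. Everything in it is an assumption built from the question's contrived example: the field names, the get_by_activation_code lookup and the UserService class are hypothetical. The repository answers the "active users" query, the domain object owns the activation rule, and a thin service only fetches, delegates and persists.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    activation_code: str
    active: bool = False
    deleted: bool = False
    confirmed: bool = False
    code_used: bool = False

    def activate(self, code):
        # The business rule lives on the domain object.
        if self.code_used or code != self.activation_code:
            raise ValueError("invalid or already used activation code")
        self.active = True
        self.confirmed = True
        self.code_used = True


class UserRepository:
    def __init__(self, users):
        self._users = {u.id: u for u in users}

    def get_active_users(self):
        # With a real database this filter would run in the query itself.
        return [u for u in self._users.values()
                if u.active and not u.deleted and u.confirmed]

    def get_by_activation_code(self, code):
        return next((u for u in self._users.values()
                     if u.activation_code == code), None)

    def save(self, user):
        self._users[user.id] = user


class UserService:
    """Thin layer: fetch, delegate to the domain object, persist."""

    def __init__(self, repository):
        self._repository = repository

    def activate_user(self, code):
        user = self._repository.get_by_activation_code(code)
        if user is None:
            raise LookupError("unknown activation code")
        user.activate(code)
        self._repository.save(user)
        return user


repo = UserRepository([User(1, "abc"), User(2, "xyz", deleted=True)])
service = UserService(repo)
service.activate_user("abc")
active_ids = [u.id for u in repo.get_active_users()]
```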
When building DDD projects I like to differentiate two responsibilities: a Repository and a Finder.
A Repository is responsible for storing aggregate roots and for retrieving them, but only for use in command processing. By command processing I mean executing any action a user invoked.
A Finder is responsible for querying domain objects for purposes of UI, like grid views and details views.
I don't consider finders to be part of the domain model. The particular IXxxFinder interfaces are placed in the presentation layer, not in the domain layer. Implementations of both IXxxRepository and IXxxFinder are placed in the data access layer, possibly even in the same class.
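A small Python sketch of that separation, using typing.Protocol in place of the IXxxRepository and IXxxFinder interfaces. The Order class and all method names are hypothetical. One data access class satisfies both interfaces, but command handlers would only be handed the repository view and the UI only the finder view.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Order:
    id: int
    customer: str
    total: int


class OrderRepository(Protocol):
    """Domain-layer interface: used only when processing commands."""

    def get(self, order_id: int) -> Order: ...

    def store(self, order: Order) -> None: ...


class OrderFinder(Protocol):
    """Presentation-layer interface: read-only rows for grids and details."""

    def list_rows(self) -> list: ...


class OrderDataAccess:
    """Data access class that happens to satisfy both interfaces."""

    def __init__(self):
        self._orders = {}

    def get(self, order_id):
        return self._orders[order_id]

    def store(self, order):
        self._orders[order.id] = order

    def list_rows(self):
        # Returns plain dicts, not domain objects, for display purposes.
        return [{"id": o.id, "customer": o.customer, "total": o.total}
                for o in sorted(self._orders.values(), key=lambda o: o.id)]


data_access = OrderDataAccess()
repository: OrderRepository = data_access  # what command handlers would see
finder: OrderFinder = data_access          # what the UI would see

repository.store(Order(2, "bob", 30))
repository.store(Order(1, "ann", 50))
rows = finder.list_rows()
```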

What functionality to build into business objects?

What functionality do you think should be built into a persistable business object at bare minimum?
For example:
validation
a way to compare to another object of the same type
undo capability (the ability to roll-back changes)
The functionality dictated by the domain & business.
Read Domain Driven Design.
A persistable business object should consist of the following:
Data
New
Save
Delete
Serialization
Deserialization
Often, you'll abstract the functionality to retrieve them into a repository that supports:
GetByID
GetAll
GetByXYZCriteria
You could also wrap this type of functionality into collection classes (e.g. BusinessObjectTypeCollection); however, there's a lot of movement towards using the Repository Pattern in Domain Driven Design to provide these types of accessors (e.g. InvoicingRepository.GetAllCustomers, InvoicingRepository.GetAllInvoices).
You could put the business rules in the New, Save, Update, Delete ... but sometimes you could have an external business rules engine that you pass off the objects to.
This is just one piece of an answer, but I would say that you need a way to get to all objects with which this object has a relationship. In the beginning you may try to be smart and only include one-way navigability for some relationships, but I have found that this is usually more trouble than it's worth.
All persistence frameworks also include finders, ways to do cascading deletes, sorts, and so on.
Once you start modeling, all business objects should know how to manage themselves. Whenever you find another class referring TO your business object too much, it's usually time to push that behavior into the business object itself.
Of the three things noted in the question, I would say that validation is the only one that is truly required. The others depend on the overall architecture of the application.
Also, the business rules should be in the business objects.
Whether an object should do its own serialization is an interesting question. I have had great success in the past by having each object handle its own serialization, but I can also see merit in having a serialization module load and save the business objects just the same way as the GUI writes to and reads from the objects. Then your validation will protect against errors in the database or files too.
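The second option, a separate serialization module that goes through the object's public interface, might look like the following Python sketch. The Invoice class, its two validation rules and the JSON format are all assumptions chosen for illustration. Because loading goes through the constructor, bad data in a file is rejected by the same validation as bad input from the GUI.

```python
import json


class Invoice:
    """Business object that validates itself on construction."""

    def __init__(self, number, amount):
        self.number = number
        self.amount = amount
        self.validate()

    def validate(self):
        if not self.number:
            raise ValueError("invoice number is required")
        if self.amount < 0:
            raise ValueError("amount must not be negative")


def dump_invoice(invoice):
    """External serializer: reads public attributes, as a GUI would."""
    return json.dumps({"number": invoice.number, "amount": invoice.amount})


def load_invoice(text):
    """External deserializer: goes through the constructor, so bad stored
    data is rejected by the same validation as bad user input."""
    data = json.loads(text)
    return Invoice(data["number"], data["amount"])


restored = load_invoice(dump_invoice(Invoice("INV-1", 250)))
```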
I can't think of anything else that is required in general.

Where IdentityMap belongs: UnitOfWork or Repository?

If I implement some simple OR/M tool, where do I put identity map? Obviously, each Repository should have access to its own identity map, so it can register loaded objects (or maybe DataMapper is the one who registers objects in IdentityMap?).
And when I commit the unit of work, I also need to access the identity map to see which entities are dirty and which are clean (or am I wrong again, and there is some outer object which calls the RegisterClean/RegisterDirty methods of my UnitOfWork class? Then what object does this?).
Does this mean that I should implement IdentityMap as a completely independent object which contains inner IdentityMaps for each entity type?
Really confused about how IdentityMap, Repository and UnitOfWork work all together.
With our .NET O/R mapper, LightSpeed, we placed the identity map inside the unit of work class. This has worked very well for us and feels quite natural, as it effectively acts as a level 1 cache for querying purposes during the unit of work's life.
Generally, inject or somehow provide a UoW for your Repository class so that you have an effective scope and gateway to querying.
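To show how the pieces could fit together, here is a minimal Python sketch. It is not LightSpeed's API; the class and method names are hypothetical, and a plain dictionary stands in for the database table. The unit of work owns a single identity map keyed by (entity type, id), and the repository receives the unit of work through its constructor and consults the map before loading.

```python
class UnitOfWork:
    """Owns the identity map, which acts as a first-level cache."""

    def __init__(self):
        self._identity_map = {}  # (entity type, id) -> loaded object

    def lookup(self, entity_type, entity_id):
        return self._identity_map.get((entity_type, entity_id))

    def register_clean(self, entity_type, entity_id, entity):
        self._identity_map[(entity_type, entity_id)] = entity


class Customer:
    def __init__(self, customer_id, name):
        self.id = customer_id
        self.name = name


class CustomerRepository:
    def __init__(self, unit_of_work, rows):
        self._uow = unit_of_work  # injected, as suggested above
        self._rows = rows         # dict standing in for a table
        self.loads = 0            # counts "database" reads

    def get(self, customer_id):
        cached = self._uow.lookup(Customer, customer_id)
        if cached is not None:
            return cached
        self.loads += 1
        customer = Customer(customer_id, self._rows[customer_id])
        self._uow.register_clean(Customer, customer_id, customer)
        return customer


uow = UnitOfWork()
repo = CustomerRepository(uow, {1: "Ann"})
first = repo.get(1)
second = repo.get(1)  # served from the identity map, no second load
```

Keying the single map by type and id is one possible answer to the "inner IdentityMaps for each entity type" question; a dictionary of per-type maps would work equally well.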
I hope that helps.