What layer should contain Queries in DDD - repository

I have a simple DDD service, with Article Aggregate root. I use MediatR and CQRS for separation of commands and queries. In DDD domain should not have dependencies on application and infrastructure layers. I have a repository IArticleRepository for composing some data from articles database. I have a rest endpoint for getting articles by some kind of filters so that I create
ArticleQuery : IRequest<ArticleDto(or Article)>
And when this query object should be? I have a repository per aggregate, so in Domain layer I have IArticleRepository. And I need to specify the input parameter type. If I put query in Infrastructure or Application layer I get the dependency from domain pointing to infrastructure or application. If I put query in Domain it violates DDD, because the query has no relations to business. If I will not putting an object, and just fields as a parameter to the repository, there will be about 10-15 parameters - this is a code smell.
It needed because in Query handler also appear SearchEngine logic, so I decided to encapsulate SQL logic from search engine logic in infrastructure via the repository or something like that.

I usually go for a query layer of sorts. Just as I would have an ICustomerRepository that handles my aggregates I would have an ICustomerQuery that interacts directly with my data store.
A repository really should be only for aggregates and, therefore, only for data modification. The only retrieval should be retrieving an entire aggregate in order to effect some form of change on that aggregate.
The query layer (more concern than layer, really) is infrastructure. I usually also namespace any read model in a Query namespace in order to distinguish between my domain Customer, say, and my Query.Customer.

I don't understand your question entirely, but there seems to be some confusion on how to use repositories. Answering that may help you find the right way.
Let me answer your question in two parts: where do repositories fit in, and how to use queries represent domain concepts.
Repositories are not part of the Domain layer. They belong outside in the Application layer.
A typical transaction flow would be something like this:
UI sends a request to API
API Controller gathers request params and
invokes the Application Service
Application Service gathers Repositories (Applications typically inject repositories at runtime based on configuration)
Application Service loads Aggregates (domain objects) based on the request params with the help of Repositories
Application Service invokes the methods on Aggregates to perform changes, if necessary
Application Service persists the aggregates with the help of Repositories
Application Service formats response and returns data to API Controller
So, you see, Application Service deals with repositories and aggregates. Aggregates, being in the domain layer, do not ever have to deal with Repositories.
A Query is best placed within the Repository because it is the responsibility of the Repository to interact with underlying data stores.
However, you should ensure that each Query represents a concept in the domain. It is generally not recommended to use filter params directly, because you don't capture the importance of the Query from the domain's point of view.
For example, if you are querying for, say, people who are adults (age > 21), then you should have a Query object called Adults which holds this filter within it. If you are querying for, say, people are who are senior citizens (age > 60), you should have a different Query object called Senior Citizen and so on.
For this purpose, you could use the Specification pattern to expose one GET API, but translate it into a Domain Specification Object before passing it on to the Repository for querying. You typically do this transformation in your Controller, before invoking the Application Service.
Martin Fowler and Eric Evans have published an excellent paper on using Specifications: https://martinfowler.com/apsupp/spec.pdf
As the paper states, The central idea of Specification is to separate the statement of how to match a candidate, from the candidate object that it is matched against.
Note:
Use the specification pattern for the Query side, but avoid reusing it in different contexts. Unless the Query represents the same domain concept, you should be creating a different specification object for each need. Also, DO NOT use a specification object on both the query side and command side, if you are using CQRS. You will be creating a central dependency between two parts, that NEED to be kept separate.
One way to get the underlying domain concept is to evaluate your queries (getByAandB and getByAandC) and draw out the question you are asking to the domain (For ex., ask your domain expert to describe the data she is trying to fetch).
Repository Organization:
Apologies if this confuses you a bit, but the code is in Python. But it almost reads like pseudocode, so you should be able to understand easily.
Say, we have this code structure:
application
main.py
infrastructure
repositories
user
mongo_repository.py
postgres_repository.py
...
...
domain
model
article
aggregate.py
domain_service.py
repository.py
user
...
The repository.py file under article will be an abstract repository, with important but completely empty methods. The methods represent domain concepts, but they need to implemented concretely (I think this is what you are referring to in your comments).
class ArticleRepository:
def get_all_active_articles(...):
raise NotImplementedError
def get_articles_by_followers(...):
raise NotImplementedError
def get_article_by_slug(...):
raise NotImplementedError
And in postgres_repository.py:
# import SQLAlchemy classes
...
# This class is required by the ORM used for Postgres
class Article(Base):
__tablename__ = 'articles'
id = Column(Integer, primary_key=True)
title = Column(String)
And this is a possible concrete implementation of the Factory, in the same file:
# This is the concrete repository implementation for Postgres
class ArticlePostgresRepository(ArticleRepository):
def __init__(self):
# Initialize SQL Alchemy session
self.session = Session()
def get_all_active_articles(self, ...):
return self.session.query(Article).all()
def get_article_by_slug(self, slug, ...):
return self.session.query(Article).filter(Article.slug == slug).all()
def get_articles_by_followers(self, ...):
return self.session.query(Article).filter(followee_id__in=...).all()
So in effect, the aggregate still does not know anything about the repository itself. Application services or configuration choose what kind of repository is to be used for a given environment dynamically (Maybe Postgres in Test and Mongo in Production, for example).

Related

How to structure the query-side of a CQRS application?

I am working on two different services:
The first one handles all of the write operations through a REST API, it contains all of the required business logic to maintain data in a consistent state, and it persists entities on a database. It also publishes events to a message broker when an entity is changed (creation, update, deletion, etc). It's structured in a DDD fashion.
The second one only handles reads, also with a REST API. It subscribes to the same message broker in order to process the events published by the first service, then it saves the received data to an in memory database for fast reads.
Nothing fancy, just CQRS with eventual consistency.
For the first service, I had a clear mind on how to structure the application:
I have the domain package with subpackages for each different aggregate. Each aggregate has its own domain objects, and its own repository interface.
I have the application package with different application services, and they basically just orchestrate the domain objects and call repositories to persist/update data, and the event publisher to publish domain events. The event publisher interface is also in this package.
I have the infrastructure package, which includes a persistence package, where the repository implementations reside, and a messaging package, where the event publisher implementation resides.
Finally, the interfaces package is where I keep the controllers/handlers for the REST API.
For the second service, I'm very unsure on how to structure it. My doubts are the following:
Should I use the repository pattern? To be fair it seems redundant and not very useful in this scenario. There are no domain objects nor rules here, cause the data to be saved/updated is already validated by the first service.
If I avoid using the repository pattern, I suppose I'd have to inject the database client in my application service, and access the data directly. Is this a good practice? If yes, where would the returned objects fit? Would they also be part of the application layer?
Would it make sense to skip the application service entirely and inject the database client straight up in the controller/handler? What if the queries are a bit complicated? This would pollute the controllers with a lot of db logic, making it harder to switch implementations (there would be no interface in this case).
What do you think?
The Query side will only contain the methods for getting data, so it can/should be really simple.
You are right, an abstraction on top of your persistence like a repository pattern can feel redundant.
You can actually call the database in your controller. Even when it comes to testing, on the query side you only need basically integration tests that test the actual database. Having unit tests won't test much.
On the other hand, it can make sense to wrap the database calling logic in a query service similar to a repository. You would inject only that query service interface in your controller, which should use your ubiquitous language! You would have all the db logic in this query service and keep the db complexity there, while keeping the controller really simple.
You can avoid complex queries by having multiple read models based on your events depending on your needs.

What creates an aggregate that isn't stored in a database?

In my DDD design, a Command Handler is asked to create a conversation. This is done by calling a third party API.
I use conversation as an abstraction, as today it is a phone call, tomorrow could be something else. I will represent that conversation as a Conversation aggregate in my domain model.
Since it's not a CRUD thing from a database that I am retrieving/updating, do I continue to use a Repository or is there another pattern I should use in it's place? Or should I simply inject my adapter (IConversationAPIAdapter, not depicted in the diagram) into my Command Handler and let it create and return the aggregate back to the handler?
UML Design
If the 3rd party is creating the Conversation, then you don't really have an aggregate. You have at best a facade (the command handler) which just relays the command.
This is actually good, because it simplifies things. However, you should be aware that the returned object is not really an aggregate (as far as DDD is concerned), but a simple business (domain) object (in lack of another term).
You can still use the Repository as a pattern and send that object to be persisted if your app needs it.
Repository is a commonly pattern used in DDD.
The ConversationRepository (an interface) should be in your Domain Layer.
You can totally inject your implementation/adapter (from your Infrastructure Layer) into your ConversationRepository to retrieve your Conversation aggregates from the BDD and return it to your handler (your "Application Service").

What is the difference between a cornice.Service and cornice.resource in Cornice?

I have read through the documentation many times over and search all over for the answer to this question but have come up short. Specifically I've looked at Defining the service and the Cornice API for a service and at Defining resource for resource.
I'm currently building a REST API that will have a similar structure to this:
GET /clients # Gets a list of clients
GET /clients/{id} # Gets a specific client
GET /clients/{id}/users # Gets a specific clients users
What would be the best way to go about this? Should I use a service or a resource or both? And, if both, how?
Resources are high-level convenience, services offer lower level control.
I am just learning cornice myself. Looking at the source code, a Resource creates Services internally, one for the item and one for the collection (if a collection path is specified). The resource also adds views to the services for each method defined with an http verb as the name or in the form collection_[verb].
So there is little difference except that resources are a neat, structured way to define services.
The resource decorator uses a url for the collection, as well as a url pattern for an object.
collection_path=/rest/users
path=/rest/users/{id}
The resource decorator is best used in view classes where you can use get/put/post/delete methods on the objects, as well as collection_get, collection_put, etc. on the collection. I have some examples here:
https://github.com/umeboshi2/trumpet/blob/master/trumpet/views/rest/users.py
Since I make heavy use of the resource decorator and view classes, I haven't found a need for the service function, but it allows you to create get,put,post decorators that wrap view callable functions.
If you are using backbone.js on the client side, the resource decorator and url examples work well with the Backbone collections and models.

Entity Framework Code First DTO or Model to the UI?

I am creating a brand new application, including the database, and I'm going to use Entity Framework Code First. This will also use WCF for services which also opens it up for multiple UI's for different devices, as well as making the services API usable from other unknown apps.
I have seen this batted around in several posts here on SO but I don't see direct questions or answers pertaining to Code First, although there are a few mentioning POCOs. I am going to ask the question again so here it goes - do I really need DTOs with Entity Framework Code First or can I use the model as a set of common entities for all boundaries? I am really trying to follow the YAGNI train of thought so while I have a clean sheet of paper I figured that I would get this out of the way first.
Thanks,
Paul Speranza
There is no definite answer to this problem and it is also the reason why you didn't find any.
Are you going to build services providing CRUD operations? It generally means that your services will be able to return, insert, update and delete entities as they are = you will always expose whole entity or single exactly defined serializable part of the entity to all clients. But once you do this it probably worth to check WCF Data Services.
Are you going to expose business facade working with entities? The facade will provide real business methods instead of just CRUD operations. These buisness methods will get some data object and decompose it to multiple entities in wrapped business logic. Here it makes sense to use specific DTO for every operation. DTO will transfer only data needed for the operation and return only date allowed to the client.
Very simple example. Suppose that your entities keep information like LastModifiedBy. This is probably information you want to pass back to the client. In the first scenario you have single serializable set so you will pass it back to the client and client pass it modified back to the service. Now you must verify that client didn't change the field because he probably didn't have permissions to do that. You must do it with every single field which client didn't have permission to change. In the second scenario your DTO with updated data will simply not include this property (= specialized DTO for your operation) so client will not be able to send you a new value at all.
It can be somehow related to the way how you want to work with data and where your real logic will be applied. Will it be on the service or on the client? How will you ensure that client will not post invalid data? Do you want to restrict passing invalid data by logic or by specific transferred objects?
I strongly recommend a dedicated view model.
Doing this means:
You can design the UI (and iterate on it) without having to wait to design the data model first.
There is less friction when you want to change the UI.
You can avoid security problems with auto-mapping/model binding "accidentally" updating fields which shouldn't be editable by the user -- just don't put them in the view model.
However, with a WCF Data Service, it's hard to ignore the advantage of being able to write the service in essentially one line when you expose entities directly. So that might make the most sense for the WCF/server side.
But when it comes to UI, you're "gonna need it."
do I really need DTOs with Entity Framework Code First or can I use the model as a set of common entities for all boundaries?
Yes, the same set of POCOs / entities can be used for all boundaries.
But a set of mappers / converters / configurators will be needed to adapt entities to some generic structures of each layer.
For example, when entities are configured with DataContract and DataMember attributes, WCF is able to transfer domain objects' state without creating any special classes.
Similarly, when entities are mapped using Entity Framework fluent mapping api, EF is able to persist domain objects' state in database without creating any special classes.
The same way, entities can be configured to be used in any layer by means of the layer infrastructure without creating any special classes.

NHibernate. DTO -> Domain

I've got SOA which processing data for diff clients(asp,sl). The base of this design is domains of my business model. For transporting,showing it to clients I use DTO. For mapping domain to DTO I use AutoMapper. Now I should persist new entities from clients. I want use my DTO's at this scenario too. So i've got some questions as I'm not much familiar with this design
1) Is it a good practice build DTO on client and send it to web-service on the wire? MayBe i should pass my domains?
2) Is it possible have several DTO's to one domain (one show at grid, and one to save). For saving I need set all nonprimitive props at client.
3) DTO -> to Domain. If I've got int can I use AutoMapper to generate NHibernate Proxy for this ID, or I should do i manually.
Your expierence and practice are very interesting.
Thanks for answer!!!
A good practice to use is screen and command specific DTOs.
An example of this would be when the user is looking at a customer display screen there is a single DTO which contains all (or most if you need to lazy load some stuff) the information for that customer.
The value of this technique is that the data can come from multiple sources which allows you to model your domain as makes sense to you as opposed to how your screens are setup. It also allows you to change your domain without worrying about your screens as you just need to update the mappings.
Depending on your programming language there may be tools such a AutoMapper (for C#) which allow you to easily create the mappings between domain and DTOs.
Your architecture gets more flexible using DTOs over the wire, in stead of domain model entities. You can have several DTOs per domain.