How to structure the query side of a CQRS application?

I am working on two different services:
The first one handles all of the write operations through a REST API: it contains all of the required business logic to keep data in a consistent state, and it persists entities to a database. It also publishes events to a message broker whenever an entity changes (creation, update, deletion, etc.). It's structured in a DDD fashion.
The second one only handles reads, also with a REST API. It subscribes to the same message broker in order to process the events published by the first service, then saves the received data to an in-memory database for fast reads.
Nothing fancy, just CQRS with eventual consistency.
For the first service, I had a clear mind on how to structure the application:
I have the domain package with subpackages for each different aggregate. Each aggregate has its own domain objects, and its own repository interface.
I have the application package with different application services, and they basically just orchestrate the domain objects and call repositories to persist/update data, and the event publisher to publish domain events. The event publisher interface is also in this package.
I have the infrastructure package, which includes a persistence package, where the repository implementations reside, and a messaging package, where the event publisher implementation resides.
Finally, the interfaces package is where I keep the controllers/handlers for the REST API.
For the second service, I'm very unsure on how to structure it. My doubts are the following:
Should I use the repository pattern? To be fair, it seems redundant and not very useful in this scenario. There are no domain objects or rules here, because the data to be saved/updated has already been validated by the first service.
If I avoid the repository pattern, I suppose I'd have to inject the database client into my application service and access the data directly. Is this good practice? If so, where would the returned objects fit? Would they also be part of the application layer?
Would it make sense to skip the application service entirely and inject the database client directly into the controller/handler? What if the queries are a bit complicated? This would pollute the controllers with a lot of DB logic, making it harder to switch implementations (there would be no interface in this case).
What do you think?

The Query side will only contain the methods for getting data, so it can/should be really simple.
You are right, an abstraction on top of your persistence like a repository pattern can feel redundant.
You can actually call the database in your controller. Even when it comes to testing, on the query side you basically only need integration tests that exercise the actual database; unit tests won't test much here.
On the other hand, it can make sense to wrap the database-calling logic in a query service similar to a repository. You would inject only that query service interface into your controller, and it should use your ubiquitous language! All the DB logic and complexity stays in the query service, while the controller stays really simple.
You can avoid complex queries by having multiple read models based on your events depending on your needs.
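As a rough sketch of that query-service idea in Java (OrderSummary, OrderQueryService and the key-value store abstraction are illustrative names, not taken from the question): the controller only sees a small interface expressed in the ubiquitous language, and all storage details stay behind it.

import java.util.List;
import java.util.Optional;

// Read model returned to the controller: plain data, no domain rules.
record OrderSummary(String orderId, String customerId, String status) {}

// Query-side service named after the ubiquitous language, not the storage.
interface OrderQueryService {
    Optional<OrderSummary> findOrder(String orderId);
    List<OrderSummary> findOrdersForCustomer(String customerId);
}

// Hypothetical abstraction over the in-memory database (Redis, Hazelcast, ...).
interface KeyValueStore {
    <T> Optional<T> get(String key, Class<T> type);
    <T> List<T> getList(String key, Class<T> type);
}

// All database details live here, so the controller/handler only depends on
// OrderQueryService and stays trivial; swapping the store touches one class.
class InMemoryOrderQueryService implements OrderQueryService {
    private final KeyValueStore store;

    InMemoryOrderQueryService(KeyValueStore store) {
        this.store = store;
    }

    @Override
    public Optional<OrderSummary> findOrder(String orderId) {
        return store.get("order:" + orderId, OrderSummary.class);
    }

    @Override
    public List<OrderSummary> findOrdersForCustomer(String customerId) {
        return store.getList("orders-by-customer:" + customerId, OrderSummary.class);
    }
}

The second key ("orders-by-customer:") also hints at the multiple-read-models point: the event handlers can maintain one denormalized entry per query you need to serve, so reads never require complex joins.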

Related

Grouping data together in Microservices

Let's suppose I have two services in a microservices architecture:
OrderService
CustomerService
Each service has its own database. My frontend makes a request to /api/order/25; along with the order I want to return the CustomerName as well, but CustomerName lives in a different database. So what's the best, most microservices-friendly approach for fetching the CustomerName along with the Order data?
In this case, I suggest creating an aggregator service or a backend-for-frontend (BFF).
This service will fetch data from both services and aggregate it for your frontend (or for your client).
This way, your order service doesn't need to depend on the customer service just to get a name, and your client doesn't need to make another call to fetch the data and do the aggregation itself.
With a BFF:
You can add an API tailored to the needs of each client, removing a lot of the bloat caused by keeping it all in one place.
Frontend requirements will be separated from backend concerns, which makes maintenance easier.
The client application will know less about your APIs’ structure, which will make it more resilient to changes in those APIs.
However, all these microservice patterns come with trade-offs, and BFFs are no exception.
So always keep in mind:
A BFF is a translation/aggregation layer between the client and the services. Its purpose is to aggregate the data returned from the service APIs and transform it into the shape the client application expects.
Avoid over-dependence on the BFF and don't add application logic to this layer. As said above, it's just a translator.
Implement a resilient design with timeouts, since this aggregator calls other services to get its data. If one or more service calls take too long, it should time out and return a partial set of data. Consider how your application will handle this scenario.
Monitor your aggregator and its child service calls. Implement distributed tracing using correlation IDs to track each call.
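A minimal Java sketch of such an aggregator (the client interfaces, record shapes and two-second timeout are assumptions for illustration), showing the timeout-and-partial-data behaviour described above:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical clients for the two downstream services.
interface OrderClient { Order getOrder(String orderId); }
interface CustomerClient { String getCustomerName(String customerId); }

record Order(String orderId, String customerId, double total) {}

// The shape the frontend actually wants: order plus customer name.
record OrderView(Order order, String customerName) {}

// BFF/aggregator: fetches from both services and merges the result.
class OrderAggregator {
    private final OrderClient orders;
    private final CustomerClient customers;
    private final ExecutorService pool = Executors.newCachedThreadPool();

    OrderAggregator(OrderClient orders, CustomerClient customers) {
        this.orders = orders;
        this.customers = customers;
    }

    OrderView getOrderView(String orderId) {
        Order order = orders.getOrder(orderId);
        String customerName;
        try {
            // Bound the downstream call; the customer name is "nice to have".
            customerName = pool
                    .submit(() -> customers.getCustomerName(order.customerId()))
                    .get(2, TimeUnit.SECONDS);
        } catch (Exception e) {
            customerName = null; // degrade to partial data instead of failing the request
        }
        return new OrderView(order, customerName);
    }
}

In a real BFF you would typically also attach a correlation ID as a header to both downstream calls so the aggregated request can be traced end to end.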
You have multiple options here. For example:
Duplicate some Customer data in the OrderService database. You could, for example, save a small subset of the entity as a separate table in the OrderService DB, e.g. CustomerId and CustomerName. This works very well for use cases where you have a lot of traffic on the /api/order/id endpoint but the number of Customers is small, so duplicating some of the Customer data in your OrderService DB is not very expensive. This option works particularly well if you have a messaging queue for communication between microservices that publishes event messages when the Customer data changes in the CustomerService; that way you can update the Customer data in the OrderService DB and stay up to date with the source of truth (see the sketch after option 3). If you do not have a messaging queue, this option is not so great, because you would need to periodically check the CustomerService for changes, which creates additional load on it, and the Customer data in the OrderService DB would be inconsistent for some period. The good thing about this option is that even if you have a lot of calls to /api/order/id, the CustomerService is not affected by that load at all, because the data is already in the OrderService DB.
Call the CustomerService from the OrderService when you need that data. In this option, every time you need the CustomerName or other CustomerService data, you call the CustomerService over some API: when someone calls your /api/order/id endpoint, you call the CustomerService API from your OrderService. This option is very good if you do not have a lot of calls to this endpoint and the load on the CustomerService is generally not big, so the additional calls are not very problematic. It becomes a problem if you have a lot of calls to /api/order/id and that load is delegated to the CustomerService as well.
Create a third microservice which aggregates data from both the CustomerService and the OrderService. In this option you have another, read-only microservice with its own database, containing data from both services. This can be useful if your system has to handle a lot of load, in particular on /api/order/id or other endpoints which aggregate multiple entities from multiple microservices. If you want to separate this and scale it independently, you can create something like a customer-order-microservice and duplicate data from the CustomerService and the OrderService; as in option 1, you only duplicate the particular fields/properties of the entities that you need. With this option you could even use a denormalized data structure and store the entities in one table/collection (if you need to), or pick a data storage technology which fits your queries better, like Elasticsearch if you have full-text search requirements. This special service can be shaped to fit your needs; it is essentially the query/read side of CQRS. It is an option for a specific use case, so carefully review your business requirements and adjust it to your needs.
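To make option 1 concrete, here is a compact Java sketch (the event shape, snapshot table and repository are assumed, not prescribed): a subscriber in the OrderService keeps a local copy of just the customer fields it needs, updated from CustomerService's change events.

// Event published by CustomerService whenever customer data changes (assumed shape).
record CustomerChangedEvent(String customerId, String customerName) {}

// Local copy stored in the OrderService database: only the fields orders need.
record CustomerSnapshot(String customerId, String customerName) {}

// Hypothetical persistence for the local copy.
interface CustomerSnapshotRepository {
    void upsert(CustomerSnapshot snapshot);
    void deleteById(String customerId);
}

// Message-broker subscriber inside OrderService keeping the duplicate in sync.
class CustomerEventsSubscriber {
    private final CustomerSnapshotRepository snapshots;

    CustomerEventsSubscriber(CustomerSnapshotRepository snapshots) {
        this.snapshots = snapshots;
    }

    void onCustomerChanged(CustomerChangedEvent event) {
        snapshots.upsert(new CustomerSnapshot(event.customerId(), event.customerName()));
    }

    void onCustomerDeleted(String customerId) {
        snapshots.deleteById(customerId);
    }
}

With this in place, /api/order/id can join orders with the local snapshot table and never call the CustomerService on the read path.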
Conclusion
Usually, in most cases, you will be fine with option 1 or option 2, but it is good to know that you can also go with option 3 and have a special read-only/query-only service which serves your specific needs. Keep in mind that the way services communicate with each other also has an impact on your options. You can read more about microservice-to-microservice communication here.

Resolving design dependency between microservices

I am designing an e-commerce application with a microservice approach, using an ORM (JPA) for data persistence in one of the microservices, named OrderService. OrderService owns the functionality related to persisting and reporting orders, which essentially includes customer and product information. Customer and product functionality is managed by different microservices.
My question is: at the ORM layer, OrderService needs POJOs which belong to ProductService and CustomerService. What is the best way to deal with this dependency between services? Should the application be designed in a different way?
There are a few things to take into consideration when trying to find a solution:
1. You cannot access the database of other services; you have to make a call.
2. You should try not to keep data from other services in yours. Data duplication leads to an inconsistent state and should be avoided if you can.
3. You should have a means to query data from other services when asked for it.
With those points in mind, I would mostly restrict data from other services to some reference IDs (which should be immutable). At the ORM layer I would just store the reference IDs and hydrate them by making an API call to the concerned services (in the business layer).
You may realize that you are making far too many calls, say to get a customer name from the customer service using a customer ID. If that is the case, you may consider saving some of this information in your system, but be cautious: the data you save should not be volatile, and make sure you have done due diligence before making that call.
Recently, I have gone through many microservice design principles and realized that CQRS/ES and data replication with eventual consistency are the best solution to this issue. Make communication asynchronous as much as possible, and use point-to-point synchronous communication between microservices only when necessary.
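A small Java sketch of the reference-ID approach described above (the entity, repository and CustomerService client are illustrative assumptions): the ORM layer only stores the immutable customer reference ID, and the business layer hydrates it by calling the owning service.

import jakarta.persistence.Entity;
import jakarta.persistence.Id;

// OrderService only persists the reference id, not the customer's data.
@Entity
class OrderEntity {
    @Id Long id;
    String customerId; // immutable reference owned by CustomerService
    double total;
}

// Hypothetical JPA-style repository for orders.
interface OrderRepository {
    OrderEntity findById(Long id);
}

// Hypothetical API client for CustomerService (used in the business layer).
interface CustomerServiceClient {
    CustomerDto getCustomer(String customerId);
}

record CustomerDto(String customerId, String name) {}

// What the business layer returns after enriching the order with customer data.
record OrderDetails(Long orderId, double total, String customerName) {}

class OrderQueryHandler {
    private final OrderRepository orders;
    private final CustomerServiceClient customers;

    OrderQueryHandler(OrderRepository orders, CustomerServiceClient customers) {
        this.orders = orders;
        this.customers = customers;
    }

    OrderDetails getOrder(Long id) {
        OrderEntity order = orders.findById(id);
        CustomerDto customer = customers.getCustomer(order.customerId);
        return new OrderDetails(order.id, order.total, customer.name());
    }
}

If these extra calls become too chatty, that is exactly the point at which the answer above suggests caching a small, non-volatile subset of the customer data locally.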
This is a fairly common situation when designing microservices. Most microservices will require access to data available through another microservice or an external provider.
The best way to deal with this is to design each microservice as a "separate" application and think of all other microservices as external to it.
So, the developer of Microservice #1 (M1) would check the Microservice #2 (M2) spec and write simple POJO classes for the data he fetches from there, just like he would if he were using some external API like Facebook.
Do note that M1 will always talk to M2 (via REST, for example) and never to M2's DB directly for the data it needs.
Ideally, each microservice would have its own database (or part clone of a central database)

Entity Framework Code First DTO or Model to the UI?

I am creating a brand new application, including the database, and I'm going to use Entity Framework Code First. This will also use WCF for services, which opens it up to multiple UIs for different devices, as well as making the services API usable by other, unknown apps.
I have seen this batted around in several posts here on SO but I don't see direct questions or answers pertaining to Code First, although there are a few mentioning POCOs. I am going to ask the question again so here it goes - do I really need DTOs with Entity Framework Code First or can I use the model as a set of common entities for all boundaries? I am really trying to follow the YAGNI train of thought so while I have a clean sheet of paper I figured that I would get this out of the way first.
Thanks,
Paul Speranza
There is no definite answer to this problem, which is also the reason why you didn't find one.
Are you going to build services providing CRUD operations? That generally means your services will return, insert, update and delete entities as they are = you will always expose the whole entity, or a single exactly defined serializable part of it, to all clients. If so, it is probably worth checking out WCF Data Services.
Are you going to expose a business facade working with entities? The facade will provide real business methods instead of just CRUD operations. These business methods will take some data object and decompose it into multiple entities inside the wrapped business logic. Here it makes sense to use a specific DTO for every operation. The DTO transfers only the data needed for the operation and returns only the data the client is allowed to see.
A very simple example: suppose your entities keep information like LastModifiedBy. This is probably information you want to pass back to the client. In the first scenario you have a single serializable set, so you pass it to the client and the client passes it, possibly modified, back to the service. Now you must verify that the client didn't change the field, because he probably didn't have permission to do that, and you must do this for every single field the client isn't allowed to change. In the second scenario your DTO with the updated data simply doesn't include this property (= a specialized DTO for your operation), so the client cannot send you a new value at all.
It is somewhat related to how you want to work with data and where your real logic will be applied. Will it be on the service or on the client? How will you ensure that the client will not post invalid data? Do you want to restrict invalid data by logic or by the specific transferred objects?
I strongly recommend a dedicated view model.
Doing this means:
You can design the UI (and iterate on it) without having to wait to design the data model first.
There is less friction when you want to change the UI.
You can avoid security problems with auto-mapping/model binding "accidentally" updating fields which shouldn't be editable by the user -- just don't put them in the view model.
However, with a WCF Data Service, it's hard to ignore the advantage of being able to write the service in essentially one line when you expose entities directly. So that might make the most sense for the WCF/server side.
But when it comes to UI, you're "gonna need it."
do I really need DTOs with Entity Framework Code First or can I use the model as a set of common entities for all boundaries?
Yes, the same set of POCOs / entities can be used for all boundaries.
But a set of mappers / converters / configurators will be needed to adapt entities to some generic structures of each layer.
For example, when entities are configured with DataContract and DataMember attributes, WCF is able to transfer the domain objects' state without creating any special classes.
Similarly, when entities are mapped using the Entity Framework fluent mapping API, EF is able to persist the domain objects' state in the database without creating any special classes.
In the same way, entities can be configured for use in any layer by means of that layer's infrastructure, without creating any special classes.

How would I know if I should use Self-Tracking Entities or DTOs/POCOs?

What are some questions I can ask myself about our design to identify if we should use DTOs or Self-Tracking Entities in our application?
Here's some things I know of to take into consideration:
We have a standard n-tier application with a WPF/MVVM client, WCF server, and MS SQL Database.
Users can define their own interface, so the data needed from the WCF service changes based on what interface the user has defined for themselves
Models are used on both the client-side and server-side for validation. We would not be binding directly to the DTO or STE
Some Models contain properties that get lazy-loaded from the WCF service if needed
The Database layer spans multiple servers/databases
There are permission checks on the server-side which affect how the data is returned. For example, some data is either partially or fully masked based on the user's role
Our resources are limited (time, manpower, etc)
So, how can I determine what is right for us? I have never used EF before so I really don't know if STEs are right for us or not.
I've seen people suggest starting with STEs and only implementing DTOs if they become a problem; however, we currently have DTOs in place and are trying to decide whether using STEs would make life easier. We're early enough in the process that switching would not take too long, but I don't want to switch to STEs only to find out they don't work for us and have to switch everything back.
If I understand your architecture correctly, I think it is not a good fit for STEs, because:
Models are used on both the client-side and server-side for validation. We would not be binding directly to the DTO or STE
The main advantage (and the only advantage) of STEs is their tracking ability, but the tracking ability only works if STEs are used on both sides:
The client queries the server for data.
The server queries EF, receives a set of STEs, and returns them to the client.
The client works with the STEs, modifies them, and sends them back to the server.
The server receives the STEs and applies the transferred changes to EF => the database.
In short: There are no additional models on client or server side. To fully use STEs they must be:
Server side model (= no separate model)
Transferred data in WCF (= no DTOs)
Client-side model (= no separate model, binding directly to STEs). Otherwise you will be duplicating tracking logic when handling change events on bound objects and modifying STEs. (The client and the server share the assembly with the STEs.)
Any other scenario simply means that you don't take advantage of the self-tracking ability, and you don't need them.
What about your other requirements?
Users can define their own interface, so the data needed from the WCF service changes based on what interface the user has defined for them.
This should be possible, but make sure that each "lazy-loaded" part is a separate structure - do not build a complex model on the client side. I've already seen questions where people had to send a whole entity graph back for updates, which is not what you always want. Because of that, I think you should not connect the loaded parts into a single entity graph.
There are permission checks on the server-side which affect how the data is returned. For example, some data is either partially or fully masked based on the user's role
I'm not sure how you actually want to achieve this. STEs don't use projections, so you must null out fields directly in the entities. Be aware that you must do this while the entity is not in a tracking state, or your masking will be saved to the database.
The Database layer spans multiple servers/databases
This is not a problem for STEs; the server simply has to use the correct EF context to load and save the data.
STEs are an implementation of the change set pattern. If you want to use them, you should follow their rules to take full advantage of the pattern. They can save some time if used correctly, but this speed-up comes at the cost of some architectural decisions. Like any other technology they are not perfect, and you can sometimes find them hard to use (just follow the self-tracking-entities tag to see the questions). They also have some serious disadvantages, but in a .NET WPF client you will not run into them.
You can opt for STEs in the given scenario:
All STEs are POCOs; .NET dynamically adds a layer to them for change tracking.
Use T4 templates to generate the STEs; it will save you time.
Using tools like AutoMapper will save you the time spent manually converting the WCF-returned data contracts into entities or DTOs.
Pros for STE -
You don't have to manually track the changes.
In the case of WCF you just have to call ApplyChanges and it will automatically refresh the entity
Cons for STE -
STEs are heavier than plain POCOs because of the dynamic tracking
Pros for POCO -
Light weight
Can easily be bridged with EF or NHibernate
Cons for POCO -
You need to manually track the changes with EF (painful).
POCOs are dynamically proxied and don't play nicely on the wire; see this MSDN article for the workaround, though. So they can be made to work, but IMO you're better off going with STEs, as I believe they align nicely with WPF/MVVM development.

OData WCF Data Service with NHibernate and corporate business logic

Let me first apologise for the length of the entire topic. It will be fairly long, but I wish to be sure that the message comes over clearly without errors.
Here at the company, we have an existing ASP.NET WebApplication. Written in C# ASP.NET on the .NET Framework 3.5 SP1. Some time ago an initial API was developed for this web application using WCF and SOAP to allow external parties to communicate with the application without relying on the browsers.
This API survived for some time, but eventually the request came to create a new API that was RESTful and relied on newer technologies. I was given this assignment, and I created the initial API using the Microsoft MVC 2 framework, running inside our ASP.NET web application. It initially took quite some time to get it running properly, but at the moment we're able to make REST calls on the application to receive XML detailing our resources.
I attended a Microsoft WebCamp and was immediately sold on the OData concept. It was very similar to what we were doing, but as a protocol supported by more players instead of our own implementation. Currently I'm working on a PoC (proof of concept) to recreate the API I developed, using the OData protocol and the WCF Data Services technology.
After searching the Internet for getting NHibernate 2 to work with the Data Services, I succeeded in creating a ReadOnly version of the API that allows us to read out the entities from the internal business layer by mapping the incoming query requests to our Business layer.
However, we wish to have a functional API that also allows the creation of entities using the OData protocol, so now I'm a bit stuck on how to proceed. I've been reading the following article: http://weblogs.asp.net/cibrax/default.aspx?PageIndex=3
The above article nicely explains how to map a custom DataService to the NHibernate layer. I've used this as a base to continue from, but I have the "problem" that I don't want to map my requests directly to the database using NHibernate; I wish to map them to our business layer (a separate DLL) that performs a large batch of checks, constraints and updates based upon access rights, privileges and triggers.
So what I want to ask is: if I, for example, create my own NhibernateContext class as in the above article, but rely on our business layer instead of NHibernate sessions, could it work? I'd probably have to rely on reflection a lot to figure out the type of object I'm working with at runtime and call the correct business classes to perform the updates and deletes.
To demonstrate with a small ASCII picture:
*----------------*
*    Database    *
*----------------*
*-------------------------*
* DAL (Data Access Layer) *
*-------------------------*
*----------------------*
* BUL (Business Layer) *
*----------------------*
*----------------*   *--------------*
* My OData stuff *   * Internal API *
*----------------*   *--------------*
*-----------------*
* Web Application *
*-----------------*
So, would this work, or would the performance make it useless?
Or am I just missing the ball here?
The idea is that I wish to reuse whatever logic is stored in the BUL & DAL layer from the OData WCF DataService.
I was thinking about creating new classes that inherit from the EntityModel classes in the Data.Services namespace and creating a new DataService object that maps all calls to the BUL & DAL & API layers. I'm however not sure where or how to intercept the requests for creating and deleting resources.
I hope it's a bit clear what I'm trying to explain, and I hope someone can help me on this.
The devil is in the details, but it sounds like the design you're proposing should work.
The DataService class is where you get to define the access rights applicable to everyone, configuration settings, and custom operations. In this scenario, I think you will be focusing more on the data context instead (the 'T' in DataService).
For the context, there are really two interesting paths: reads and writes. Reads happen through the IQueryable entry points. Writing a LINQ provider is a good chunk of work, but NHibernate already supports this, although it would return what I imagine we're calling DAL entities. You can use query interceptors to do access checks here if you can express those in terms that the database would understand.
The update path is from what I understand where you are trying to run more business logic (you mentioned validation, extra updates, etc). To do this, you'll want to focus on the IUpdatable implementation (IDataServiceUpdateProvider if you're using the latest version). Here you can use whichever objects you want - they could be DAL objects or business objects. You can do everything in the DAL and then run validation on SaveChanges(), or do everything on business objects if they validate as they go.
There are two places where you might 'jump' from one kind of object to another. One is in the GetResource() API, where you get an IQueryable, presumably in terms of DAL entities. The other is in ResolveResource(), where the runtime is asking for an object to serialize, just like it would get from an IQueryable, so it's presumably also a DAL entity.
Hope this helps - doing uniform access over non-uniform APIs can be hard, but often well worth it!