I read this excellent tutorial (http://blogs.planbsoftware.co.nz/?p=247) about NserviceBus Sagas, but still I don't understand what is the advantage of this model (sagas), over using database or business layer transactions?
The main benefit of the saga model is that it allows you to take logic and data that would otherwise be spread out across a system (and various batch jobs), and pull that all into a single class, better following the single responsibility principle. Once you have that, you get all the other benefits that come from good software practices - better testability, maintainability, etc.
To show you real benefit of Saga model I'l show you two examples.
Imagine you have Services Oriented Architecture with hundreds of distributed hosts. Customer makes an Order that starts one or more sagas. Each saga have some related business logic. Handler for each given saga can be shared between different hosts and you don't need to check order state handling each message, NServiceBus implicitly checks saga state matching it by order id or other attributes and if it is still opened you'll get it in your data context.
You can also use this model as pattern without NServiceBus usage. Imagine you develop a video game and want to track some user combos. Each time player hits jump you open saga and add bonus points handling other rapid input. Once player delays for some time between inputs and saga closes itself saving total score for combo.
What are the benefits of Saga?
1) Your business logic is encapsulated in one place - saga.
2) You can extend it easily adding additional saga or removing them. You can also move them to other handlers or hosts.
3) You don't need to know what data in database are required in case of migration, you just need to migrate sagas which contain all necessary info
Related
After recently reading about event-based architecture, I wanted to change my architecture into one making use of such strengths.
I have two services that expose an API (crud, graphql), each based around a different entity and using a different database.
However, now whenever someone deletes a certain type of row in service A, i need to delete a coupled row in Service B.
So I added Kafka to my design, and whenever I delete the entity in service A, it publishes a notification message into Kafka.
In service B I am currently consuming the same topic so whenever a new message is received the Service will also handle the deletion of the matching entity, because it already has access that table because the same service already exposes the CRUD API to users.
What i'm not sure about is whether putting the Kafka Consumer and the API together in the same service is a good design. It contradicts the point of single responsibility in micro services, and whether there is an issue in one part of the service, it will likely affect the second.
However, creating a new service will also cause me issues - i will have 2 different services accessing the same table, and i will have to make sure i always maintain them together, whenever making changes to the table or database.
What is the best practice in a incident such as this? Is it inevitable to have different services have data coupling or is it not so bad to use the same service for two, similiar usages.
There is nothing wrong with using Kafka... You could do the same with point-to-point service communication, however (JSON-RPC / gRPC), however.
The real problem you seem to be asking about is dual-writes or race-conditions leading to data inconsistency.
While you could use a single consumer group and one topic-partition to preserve order and locking across consumers interested in those events, that does not lock out other consumer-groups from interacting with the database to perform the same action. Therefore, Kafka itself won't help with this problem.
You'll need external, distributed locks (e.g. Zookeeper can be used here) that fence off your database clients while you are performing actions against it.
To the original question, Kafka Connect offers an API and is also a Producer and Consumer client (and would be recommended for database interactions). So is Confluent Schema Registry, KSQLdb, etc.
I believe that the consumer of your service B would not be considered "a service" or part of the "service", as in that it is not called as part the code which services requests. Yet it does provide functionality that is required for the domain function of your microservice. So yes I would consider the consumer part of the Microservice in terms of team/domain responsibility.
There may be different opinions on if the consumer code should share the same code base/repo as the "service" code. Some people believe that it is better to limit the repo scope to a single "executable", others believe it is beneficial to keep the domain scope and have everything in a single repo. I probably belong to the latter group but do not have a very strong opinion on it. I would argue it is more important to have a central documentation / wiki for the domain that will point to the repos involved etc.
We have a situation where several of our services are shared across our system. For example one that tracks stock movements. Whenever the stock level of an article changes an event is raised.
The problem we run in to is that while sometimes another service may be interested in ALL stock change events (for example to do some aggregation), in most cases only stock changes that are the result of a specific action are interesting.
The problem we now face is this. Say have an IArticleStockChangedEvent event that contains the article number, the stock change and a ProcessId that requested the change. This event is raised for every change in the article stock.
Now some external service has a saga to change 10 articles and commands the stock service to make it so. It also implements IHandleMessages to keep track of the progress. This works well in theory, but in practise this means that the service containing this saga will be flooded with unrelated IArticleStockChangedEvent message for which it will be unable to find a corresponding saga instance. While not technically breaking anything it causes unnecessary delays in the system.
I'm not really looking forward to creating a new kind of IArticleStockChangedEvent for every saga that can possibly cause a stock change. What is the recommended approach to handle this issue?
Thanks
The knowledge about which IArticleStockChangedEvent events you need to be delivered to your service lives inside your "external" service and changes dynamically, so it's not possible (or is complex and non-scalable) to make a filter in either Stock service or at a transport level (Ex. Service Bus subscription filter).
To make an optimization, namely avoid deserialization of the IArticleStockChangedEvent, you might consider custom Behavior<IIncomingPhysicalMessageContext> where you read the Stock item's Id from message header and lookup db to see if there is any saga for that stock item and if not, short circuit the message processing.
Better solution might be to use Reply and reply with a message from Stock service.
We are into the lead business. We capture leads and pass it on to the clients based on some rules. integration to each client very in nature like nature of the API and in some cases, data mapping is also required. We perform the following steps in order to route leads to the client.
Select the client
Check if any client-specific mapping(master data) is required.
Send Lead to nearest available dealer(optional step)
Call client api to send lead
Update push status of the lead to database
Note that some of the steps can be optional.
Which design pattern would be suitable to solve this problem. The motive is to simplify integration to each client.
You'll want to isolate (and preferably externalize) the aspects that differ between clients, like the data mapping and API, and generalize as much as possible. One possible force to consider is how easily new clients and their APIs can be accommodated in the future.
I assume you have a lot of clients, and a database or other persistent mechanism that holds this client list, so data-driven routing logic that maps leads to clients shouldn't be a problem. The application itself should be as "dumb" as possible.
Data mapping is often easily described with meta-data, and also easily data-driven. Mapping meta-data is client specific, so it could easily be kept in your database associated with each client in XML or some other format. If the transformations to leads necessary to conform to specific APIs are very complex, the logic could be isolated through the use of a strategy pattern, with the specific strategy selected according to the target client. If an extremely large number of clients and APIs need to be accommodated, I'd bend over backwards to make the API data-driven as well. If you have just a few client types (say less than 20), I'd employ some distributed asynchronicity, and just have my application publish the lead and client info to a topic corresponding to client-type, and have subscribed external processors specific for each client-type do their thing and publish the results on another single queue. A consumer listing to the results queue would update the database.
I will divide your problem statement into three parts mentioned below:
1) Integration of API with different clients.
2) Perfom some steps in order to route leads to the client.
3) Update push status of the lead to database.
Design patterns involved in above three parts:
1) Integration of API with different clients - Integration to each client vary in nature like the nature of the API. It seems you have incompitable type of interface so, you should design this section by using "Adapter Design Pattern".
2) Perform some steps in order to route leads to the client- You have different steps of execution. Next step is based on the previous steps. So, you should design this section by using "State Design Pattern".
3) Update push status of the lead to database: This statement shows that you want to notify your database whenever push status of the lead happens so that information will be updated into database. So, you should design this section by using "Observer Design Pattern".
Sounds like this falls in the workflow realm.
If you're on Amazon Web Services, there's SWF, otherwise, there's a lot of workflow solutions out there for your favorite programming language.
I'm struggling to understand how to implement Eventual Consistency with the exposed example of BacklogItems and Tasks from Vaughn Vernon. The statement I've understood so far is (considering the case where he splits BacklogItem and Task into separate aggregate roots):
A BacklogItem can contain one or more tasks. When all remaining hours from a the tasks of a BacklogItem are 0, the status of the BacklogItem should change to "DONE"
I'm aware about the rule that says that you should not update two aggregate roots in the same transaction, and that you should accomplish that with eventual consistency.
Once a Domain Service updates the amount of hours of a Task, a TaskRemainingHoursUpdated event should be published to a DomainEventPublisher which lives in the same thread as the executing code. And here it is where I'm at a loss with the following questions:
I suppose that there should be a subscriber (also living in the same thread I guess) that should react to TaskRemainingHoursUpdated events. At which point in your Desktop/Web application you perform this subscription to the Bus? At the very initialization of your app? In the application code? Is there any reasoning to place domain subscriptors in a specific place?
Should that subscriptor (in the same thread) call a BacklogItem repository and perform the update? (But that would be a violation of the rule of not updating two aggregates in the same transaction since this would happen synchronously, right?).
If you want to achieve eventual consistency to fulfil the previously mentioned rule, do I really need a Message Broker like RabbitMQ even though both BacklogItem and Task live inside the same Bounded Context?
If I use this message broker, should I have a background thread or something that just consumes events from a RabbitMQ queue and then dispatches the event to update the product?
I'd appreciate if someone can shed some clear light over this since it is quite complex to picture in its completeness.
So to start with, you need to recognize that, if the BacklogItem is the authority for whether or not it is "Done", then it needs to have all of the information to compute that for itself.
So somewhere within the BacklogItem is data that is tracking which Tasks it knows about, and the known state of those tasks. In other words, the BacklogItem has a stale copy of information about the task.
That's the "eventually consistent" bit; we're trying to arrange the system so that the cached copy of the data in the BacklogItem boundary includes the new changes to the task state.
That in turn means we need to send a command to the BacklogItem advising it of the changes to the task.
From the point of view of the backlog item, we don't really care where the command comes from. We could, for example, make it a manual process "After you complete the task, click this button here to inform the backlog item".
But for the sanity of our users, we're more likely to arrange an event handler to be running: when you see the output from the task, forward it to the corresponding backlog item.
At which point in your Desktop/Web application you perform this subscription to the Bus? At the very initialization of your app?
That seems pretty reasonable.
Should that subscriptor (in the same thread) call a BacklogItem repository and perform the update? (But that would be a violation of the rule of not updating two aggregates in the same transaction since this would happen synchronously, right?).
Same thread and same transaction are not necessarily coincident. It can all be coordinated in the same thread; but it probably makes more sense to let the consequences happen in the background. At their core, events and commands are just messages - write the message, put it into an inbox, and let the next thread worry about processing.
If you want to achieve eventual consistency to fulfil the previously mentioned rule, do I really need a Message Broker like RabbitMQ even though both BacklogItem and Task live inside the same Bounded Context?
No; the mechanics of the plumbing matter not at all.
What is the best way to achieve DB consistency in microservice-based systems?
At the GOTO in Berlin, Martin Fowler was talking about microservices and one "rule" he mentioned was to keep "per-service" databases, which means that services cannot directly connect to a DB "owned" by another service.
This is super-nice and elegant but in practice it becomes a bit tricky. Suppose that you have a few services:
a frontend
an order-management service
a loyalty-program service
Now, a customer make a purchase on your frontend, which will call the order management service, which will save everything in the DB -- no problem. At this point, there will also be a call to the loyalty-program service so that it credits / debits points from your account.
Now, when everything is on the same DB / DB server it all becomes easy since you can run everything in one transaction: if the loyalty program service fails to write to the DB we can roll the whole thing back.
When we do DB operations throughout multiple services this isn't possible, as we don't rely on one connection / take advantage of running a single transaction.
What are the best patterns to keep things consistent and live a happy life?
I'm quite eager to hear your suggestions!..and thanks in advance!
This is super-nice and elegant but in practice it becomes a bit tricky
What it means "in practice" is that you need to design your microservices in such a way that the necessary business consistency is fulfilled when following the rule:
that services cannot directly connect to a DB "owned" by another service.
In other words - don't make any assumptions about their responsibilities and change the boundaries as needed until you can find a way to make that work.
Now, to your question:
What are the best patterns to keep things consistent and live a happy life?
For things that don't require immediate consistency, and updating loyalty points seems to fall in that category, you could use a reliable pub/sub pattern to dispatch events from one microservice to be processed by others. The reliable bit is that you'd want good retries, rollback, and idempotence (or transactionality) for the event processing stuff.
If you're running on .NET some examples of infrastructure that support this kind of reliability include NServiceBus and MassTransit. Full disclosure - I'm the founder of NServiceBus.
Update: Following comments regarding concerns about the loyalty points: "if balance updates are processed with delay, a customer may actually be able to order more items than they have points for".
Many people struggle with these kinds of requirements for strong consistency. The thing is that these kinds of scenarios can usually be dealt with by introducing additional rules, like if a user ends up with negative loyalty points notify them. If T goes by without the loyalty points being sorted out, notify the user that they will be charged M based on some conversion rate. This policy should be visible to customers when they use points to purchase stuff.
I don’t usually deal with microservices, and this might not be a good way of doing things, but here’s an idea:
To restate the problem, the system consists of three independent-but-communicating parts: the frontend, the order-management backend, and the loyalty-program backend. The frontend wants to make sure some state is saved in both the order-management backend and the loyalty-program backend.
One possible solution would be to implement some type of two-phase commit:
First, the frontend places a record in its own database with all the data. Call this the frontend record.
The frontend asks the order-management backend for a transaction ID, and passes it whatever data it would need to complete the action. The order-management backend stores this data in a staging area, associating with it a fresh transaction ID and returning that to the frontend.
The order-management transaction ID is stored as part of the frontend record.
The frontend asks the loyalty-program backend for a transaction ID, and passes it whatever data it would need to complete the action. The loyalty-program backend stores this data in a staging area, associating with it a fresh transaction ID and returning that to the frontend.
The loyalty-program transaction ID is stored as part of the frontend record.
The frontend tells the order-management backend to finalize the transaction associated with the transaction ID the frontend stored.
The frontend tells the loyalty-program backend to finalize the transaction associated with the transaction ID the frontend stored.
The frontend deletes its frontend record.
If this is implemented, the changes will not necessarily be atomic, but it will be eventually consistent. Let’s think of the places it could fail:
If it fails in the first step, no data will change.
If it fails in the second, third, fourth, or fifth, when the system comes back online it can scan through all frontend records, looking for records without an associated transaction ID (of either type). If it comes across any such record, it can replay beginning at step 2. (If there is a failure in step 3 or 5, there will be some abandoned records left in the backends, but it is never moved out of the staging area so it is OK.)
If it fails in the sixth, seventh, or eighth step, when the system comes back online it can look for all frontend records with both transaction IDs filled in. It can then query the backends to see the state of these transactions—committed or uncommitted. Depending on which have been committed, it can resume from the appropriate step.
I agree with what #Udi Dahan said. Just want to add to his answer.
I think you need to persist the request to the loyalty program so that if it fails it can be done at some other point. There are various ways to word/do this.
1) Make the loyalty program API failure recoverable. That is to say it can persist requests so that they do not get lost and can be recovered (re-executed) at some later point.
2) Execute the loyalty program requests asynchronously. That is to say, persist the request somewhere first then allow the service to read it from this persisted store. Only remove from the persisted store when successfully executed.
3) Do what Udi said, and place it on a good queue (pub/sub pattern to be exact). This usually requires that the subscriber do one of two things... either persist the request before removing from the queue (goto 1) --OR-- first borrow the request from the queue, then after successfully processing the request, have the request removed from the queue (this is my preference).
All three accomplish the same thing. They move the request to a persisted place where it can be worked on till successful completion. The request is never lost, and retried if necessary till a satisfactory state is reached.
I like to use the example of a relay race. Each service or piece of code must take hold and ownership of the request before allowing the previous piece of code to let go of it. Once it's handed off, the current owner must not lose the request till it gets processed or handed off to some other piece of code.
Even for distributed transactions you can get into "transaction in doubt status" if one of the participants crashes in the midst of the transaction. If you design the services as idempotent operation then life becomes a bit easier. One can write programs to fulfill business conditions without XA. Pat Helland has written excellent paper on this called "Life Beyond XA". Basically the approach is to make as minimum assumptions about remote entities as possible. He also illustrated an approach called Open Nested Transactions (http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper142.pdf) to model business processes. In this specific case, Purchase transaction would be top level flow and loyalty and order management will be next level flows. The trick is to crate granular services as idempotent services with compensation logic. So if any thing fails anywhere in the flow, individual services can compensate for it. So e.g. if order fails for some reason, loyalty can deduct the accrued point for that purchase.
Other approach is to model using eventual consistency using CALM or CRDTs. I've written a blog to highlight using CALM in real life - http://shripad-agashe.github.io/2015/08/Art-Of-Disorderly-Programming May be it will help you.