I am studying Lagom and trying to understand how persistent entities work.
I've read the following description:
Every PersistentEntity has a fixed identifier (primary key) that can
be used to fetch the current state and at any time only one instance
(as a “singleton”) is kept in memory.
Makes sense.
Then there is the following example to create a customer:
@Override
public ServiceCall<CreateCustomerMessage, Done> createCustomer() {
    return request -> {
        log.info("===> Create or update customer {}", request.toString());
        PersistentEntityRef<CustomerCommand> ref = persistentEntityRegistry.refFor(CustomerEntity.class, request.userEmail);
        return ref.ask(new CustomerCommand.AddCustomer(request.firstName, request.lastName, request.birthDate, request.comment));
    };
}
This confuses me:
Does that mean that the persistentEntityRegistry contains multiple singleton persistentEntities? How exactly does the persistentEntityRegistry get filled and what is in it? Say we have 10k users that are created, does the registry contain 10k persistentEntities, or just 1?
In this case we want to create a new customer. So when we request a persistentEntity using persistentEntityRegistry.refFor(CustomerEntity.class, request.userEmail);, this shouldn't return anything from the registry since the customer doesn't exist yet (?).
Can you shine a light on how this works?
The documentation is good, but there are a few holes in my understanding that I haven't been able to fill.
Great questions. I'm not sure how far you are along with concepts relating to persistent entities that aren't mentioned here, so I'll start from the beginning.
When doing event sourcing, generally, for a given entity (eg, a single customer), you need a single writer. This is because reading and then writing to the event log is generally not done in a single transaction: you read some events to load your state, validate an incoming command, and then emit one or more new events to be persisted. If two operations came in for the same entity at the same time, they would both be validated against the same state, and neither would take into account the state change that the other might persist first. Hence event sourcing relies on the single writer principle: only one operation is handled at a time, so there is only one writer.
In Lagom, this is implemented using actors. Each entity (ie each instance of a customer) is loaded and managed by an actor. An actor has a mailbox (ie, a queue), where commands are placed, and it processes them one at a time, in order. For each entity, there is a singleton actor managing it (so, one actor per customer, many actors for many customers). Due to the single writer principle, it's very important that this is true.
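To make the "one mailbox per entity" idea concrete, here is a rough plain-Java sketch. It is not Lagom or Akka code: SingleWriterSketch, its ask method and the integer counter state are all made up for illustration, and a single-threaded executor stands in for the actor's mailbox.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustration only: a single-threaded executor plays the role of an actor mailbox.
final class SingleWriterSketch {
    private static final class Entity {
        // Daemon thread so an idle mailbox never keeps the JVM alive.
        final ExecutorService mailbox = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true);
            return thread;
        });
        int counter; // only ever touched by this entity's mailbox thread
    }

    // One Entity (and therefore one mailbox) per entity id.
    private final Map<String, Entity> entities = new ConcurrentHashMap<>();

    // Enqueues a command; commands for the same id run one at a time, in order.
    CompletableFuture<Integer> ask(String entityId, int delta) {
        Entity entity = entities.computeIfAbsent(entityId, id -> new Entity());
        return CompletableFuture.supplyAsync(() -> {
            entity.counter += delta; // read, then write, with no competing writer
            return entity.counter;
        }, entity.mailbox);
    }

    void shutdown() {
        entities.values().forEach(entity -> entity.mailbox.shutdown());
    }
}
```

Asking for "alice@example.com" a thousand times and for "bob@example.com" once creates two mailboxes, not a thousand and one, and every command for Alice sees the effect of the previous one.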
But, how does a system like this scale? What happens if you have multiple nodes, do you then have multiple instances of each entity? No. Lagom uses Akka clustering with Akka cluster sharding to shard your entities across many nodes, ensuring that across all of your deployed nodes, you only have one of each entity. So when a command comes in to a node, the entity may live on the same node, in which case it just gets sent straight to the local actor for it to be processed, or it may live on a different node, in which case it gets serialised, sent to the node it lives on, and processed there, with the response being serialised and sent back.
This is one of the reasons why it's a PersistentEntityRef. Because of location transparency (you don't know where the entity lives), you can't hold onto the entity directly; you can only have a reference to it. The same terminology is used for actors: you have the actual Actor that implements the behaviour, and an ActorRef is used to communicate with it.
Now, logically, when you get a reference for a customer that according to the domain model of your system doesn't exist yet (eg, they haven't registered), they don't exist. But, the persistent entity for them can, and must exist. There is actually no concept in Lagom of a persistent entity not existing, you can always instantiate a persistent entity of any id, it will load. It's just that there might be no events yet for that entity, in which case, when it loads, it will just have its initialState, with no events applied. For a customer, the state of the customer might be Optional<Customer>. So, when the entity is first created before any events are emitted for a customer, the state will be Optional.empty(). The first event emitted for the customer will be a CustomerRegistered event, and this will cause the state to change to an Optional.of(someCustomer).
To understand why logically this must be so, you don't want the same customer to be able to register themselves twice, right? You want to ensure that there is only one CustomerRegistered event for each customer. To do that, you need to have a state for the customer in their unregistered state, to validate that they are not already registered before they do register.
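Here is a minimal plain-Java sketch of that reasoning. It deliberately does not use the Lagom PersistentEntity API; CustomerEntitySketch, its methods and the string-encoded events are hypothetical stand-ins. The point is only that an entity with no events still loads (with an empty state), and that the empty state is what lets the register command be validated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Illustration only; this is NOT the Lagom PersistentEntity API.
final class CustomerEntitySketch {
    // Stand-in for the persisted event log of one entity id.
    private final List<String> journal = new ArrayList<>();
    // Initial state: no events applied yet, so there is no customer.
    private Optional<String> state = Optional.empty();

    // Loading always succeeds; with no stored events the state stays empty.
    static CustomerEntitySketch load(List<String> storedEvents) {
        CustomerEntitySketch entity = new CustomerEntitySketch();
        for (String event : storedEvents) {
            entity.apply(event);
        }
        return entity;
    }

    // Command handler: validate against the current state, then emit an event.
    boolean register(String name) {
        if (state.isPresent()) {
            return false; // already registered, so no second CustomerRegistered
        }
        apply("CustomerRegistered:" + name);
        return true;
    }

    // Event handler: record the event and derive the new state from it.
    private void apply(String event) {
        journal.add(event);
        state = Optional.of(event.substring(event.indexOf(':') + 1));
    }

    Optional<String> state() {
        return state;
    }

    List<String> journal() {
        return journal;
    }
}
```

Loading with an empty event list gives Optional.empty(); the first register call succeeds and the second is rejected, so the journal never holds more than one registration event.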
So, to make the answer to your first question clear: if you have 10K users, then there will be 10K persistent entity instances, one for each user. That is only logically, though (physically, there will be events for 10K different users in the database). In memory, which entities are actually loaded depends on which entities are active. When an entity goes, by default, 2 minutes without receiving a command, Lagom will passivate it, meaning it is dropped from memory, so the next time a command comes in for it, it has to be loaded again by reading its events from the database. This helps to ensure that you don't run out of memory by holding all your data in memory if you don't want to.
I have one question about akka-persistence and event migration. I have read the "Schema Evolution for Event Sourced Actors" chapter. However, it does not answer my question.
Say I have a persistent actor, ChildActor, that produces a Created event. Later we discover that ChildActor should be a child of ParentActor, and that ParentActor has to update its state based on the creation of a ChildActor (to maintain a collection of children).
We can add a new command CreateChild for ParentActor that will create the ChildActor. However, the parent will never receive the Created event emitted by its child. Thus it will not be able to update its state. Of course, ParentActor can create a ChildCreated event for itself.
But, what about the Created events already persisted by FirstActor?
How can we "send" (and, ideally adapt) them to the ParentActor?
So, my question is:
Can we "route" persisted events from one actor to another?
Thanks
It is possible to watch the events persisted by a given persistence ID with the events by persistence ID query. Since this query is very much like what Akka Persistence must do in replaying events to rebuild a persistent actor's state, it's available in all the commonly used plugins: you'll need to check the documentation for your plugin for how to summon a ReadJournal. Once summoned, assuming that the ReadJournal is further an instance of EventsByPersistenceIdQuery, you would use (Scala):
readJournal.eventsByPersistenceId(childActorPersistenceId, fromOffset, Long.MaxValue)
which would give you an Akka Streams Source of events, in order, starting at fromOffset. Your subscribing actor may (probably will) want to save the last-seen sequence number as part of its state, so that if it resumes it doesn't see events it has already processed (ideally the event updating the sequence number would be in the same batch as, or otherwise atomically part of, the state update).
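As a rough illustration of the last-seen bookkeeping, here is a plain-Java sketch with no Akka dependency. ParentStateSketch and its method names are hypothetical, and it assumes sequence numbers start at 1 and that the lower bound passed to the query is inclusive; check both against your plugin.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: the parent's state, including what it has already seen.
final class ParentStateSketch {
    // Last sequence number processed per child persistence id (0 = nothing yet).
    private final Map<String, Long> lastSeenSequenceNr = new HashMap<>();
    private final Set<String> children = new LinkedHashSet<>();

    // Applies a child's Created event once; replayed duplicates are ignored.
    boolean onChildCreated(String childPersistenceId, long sequenceNr) {
        long lastSeen = lastSeenSequenceNr.getOrDefault(childPersistenceId, 0L);
        if (sequenceNr <= lastSeen) {
            return false; // already processed before a restart or resubscribe
        }
        // Both updates belong to the same state change, so they stay in step.
        children.add(childPersistenceId);
        lastSeenSequenceNr.put(childPersistenceId, sequenceNr);
        return true;
    }

    // Lower bound to pass to the query when resubscribing.
    long resumeFrom(String childPersistenceId) {
        return lastSeenSequenceNr.getOrDefault(childPersistenceId, 0L) + 1;
    }

    Set<String> children() {
        return children;
    }
}
```

After a restart the parent would call resumeFrom for each child it tracks and use the result as the starting sequence number, so a redelivered Created event cannot add the same child twice.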
Note that there will be an observable delay from persisting the event to when ParentActor sees the event, though many of the recent iterations of plugins (e.g. Cassandra or R2DBC) can directly propagate the event or at least the notification that there's an event for the persistence ID to the query.
We have a situation where several of our services are shared across our system. For example one that tracks stock movements. Whenever the stock level of an article changes an event is raised.
The problem we run in to is that while sometimes another service may be interested in ALL stock change events (for example to do some aggregation), in most cases only stock changes that are the result of a specific action are interesting.
The problem we now face is this. Say we have an IArticleStockChangedEvent event that contains the article number, the stock change and the ProcessId of whatever requested the change. This event is raised for every change in the article stock.
Now some external service has a saga to change 10 articles and commands the stock service to make it so. It also implements IHandleMessages to keep track of the progress. This works well in theory, but in practice it means that the service containing this saga will be flooded with unrelated IArticleStockChangedEvent messages for which it will be unable to find a corresponding saga instance. While not technically breaking anything, it causes unnecessary delays in the system.
I'm not really looking forward to creating a new kind of IArticleStockChangedEvent for every saga that can possibly cause a stock change. What is the recommended approach to handle this issue?
Thanks
The knowledge about which IArticleStockChangedEvent events need to be delivered to your service lives inside your "external" service and changes dynamically, so it's not possible (or it is complex and non-scalable) to build a filter in either the Stock service or at the transport level (e.g. a Service Bus subscription filter).
As an optimization, namely to avoid deserializing the IArticleStockChangedEvent, you might consider a custom Behavior<IIncomingPhysicalMessageContext> in which you read the stock item's Id from a message header and look up in the db whether there is any saga for that stock item; if not, short-circuit the message processing.
A better solution might be to use Reply and reply with a message from the Stock service.
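The header check itself is small. Below is a language-neutral sketch of the idea written in Java, not NServiceBus code: SagaInterestFilterSketch, the "ArticleNumber" header name and the in-memory set (standing in for the db lookup) are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: decide from headers alone whether a message is worth deserializing.
final class SagaInterestFilterSketch {
    // Hypothetical header name; the publisher would have to set it.
    static final String ARTICLE_HEADER = "ArticleNumber";

    // Stand-in for "look up in the db whether any saga cares about this article".
    private final Set<String> articlesWithActiveSaga = ConcurrentHashMap.newKeySet();

    void sagaStarted(String articleNumber) {
        articlesWithActiveSaga.add(articleNumber);
    }

    void sagaCompleted(String articleNumber) {
        articlesWithActiveSaga.remove(articleNumber);
    }

    // False means: short-circuit, skip deserialization and the saga lookup entirely.
    boolean shouldProcess(Map<String, String> headers) {
        String articleNumber = headers.get(ARTICLE_HEADER);
        return articleNumber != null && articlesWithActiveSaga.contains(articleNumber);
    }
}
```

The trade-off is that the filter needs its own cheap view of which sagas are active, and that view has to be kept in step with saga start and completion.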
I have two queues that both have distinct data types that affect one another as they're being processed by my application, therefore processing messages from the two queues asynchronously would cause a data integrity issue.
I'm curious as to the best practice for making sure only one consumer is consuming at any given time. Here is a summary of what I have so far:
EventMessages receive information about external events that may or may not have an impact on the enqueued/existing PurchaseOrderMessages.
Since we anticipate we'll be consuming more PurchaseOrderMessage than EventMessage, maybe we should just ensure the EventMessage Queue is empty (via the API) before we process anything in PurchaseOrderMessage Queue - but that gets into the question of wait times, etc. and this all needs to happen as close to real time as possible.
If there's a way to simply pause a Consumer A until Consumer B is at rest that might be the simplest solution, I'm just not quite sure which direction I need to go in.
UPDATE
To provide some additional context, a PurchaseOrderMessage will contain an Origin and a Destination.
An EventMessage also contains location data.
Each time a PurchaseOrderMessage is processed, it will query the current EventMessage records for any Event locations that match the Origin and Destination of that PurchaseOrder and create an association.
Each time an EventMessage is processed, it will query the current PurchaseOrderMessage records for any Origins or Destinations that match that Event and create an association.
If synchronous queues aren't a good solution, what's an alternative that would ensure none of the associations are missed when EventMessages and PurchaseOrderMessages are getting published to the app at the same time?
UPDATE 2
Ultimately this data will serve a UI which will have a list of PurchaseOrders and the events that might be affecting their delivery dates. It would be too slow to do the "Event Check" as the PurchaseOrder data was being rendered/retrieved by the end user which is why we're wanting to do it as they're processed/consumed.
Let me begin with the bottom line up front - on the face of it, what you are asking doesn't make sense.
Queues should never require synchronization. The very thought of doing so entirely defeats the purpose of having a queue. For some background, visit this answer.
Let's consider some common places from real life where we encounter multiple queues:
Movie theaters (box office, concession counter, usher)
Theme parks (snack bars, major attractions)
Manufacturing floors (each station may have a queue waiting to process)
In each of these examples, from the point of view of the object in the queue, it can only wait in one at a time. It cannot wait in one line while it is waiting in another; such a thing is physically impossible.
Your example seems to take two completely unrelated things and merge them together. You have a queue for PurchaseOrder objects - but what is the queue for? That's the equivalent of going to Disney World and waiting in the Customer queue - what is the purpose of such a queue? If the purpose is not clear, it's not a real queue.
Addressing your issue
This particular issue needs to be addressed first by clearly defining the various operations that are being done to a PurchaseOrder, then creating queues for each of those operations. If these operations are truly synchronous, then your business logic should be coded to wait for one operation to complete before starting another. In this circumstance, it would be considered an exception if a PurchaseOrder got to the head of one queue without fulfilling a pre-requisite.
Please remember that a message queue typically serves a stateless operation. Good design dictates that messages in the queue contain all the information needed for the processor to process the message. If you don't adhere to this, then your database becomes a single point of contention for your system - and while this is not an insurmountable problem, it does make the design more complex.
Waiting in Multiple Queues
Now, if you've ever been to Disney World, you'll also know that they have something called a FastPass+ (FP+), which allows the holder to skip the line at the designated attraction. Disney allocates a certain number of slots per hour for each major attraction at the park, and guests are able to request up to three FP+s during each day. FP+ times are allocated for one hour blocks, and guests cannot have two overlapping FP+ time blocks. Once all FP+ slots have been issued for the ride, no more are made available. The FP+ system ensures these rules are enforced, independently of the standby queues for each ride. Essentially, by using FastPass+, guests can wait in multiple lines virtually and experience more attractions during their visit.
If you are unable to analyze your design and come up with an alternative, perhaps the FastPass+ approach could help alleviate some of the bottlenecks.
Disclaimer: I don't work for Disney, but I do go multiple times per month, always getting my FastPass first
What is the best way to achieve DB consistency in microservice-based systems?
At the GOTO in Berlin, Martin Fowler was talking about microservices and one "rule" he mentioned was to keep "per-service" databases, which means that services cannot directly connect to a DB "owned" by another service.
This is super-nice and elegant but in practice it becomes a bit tricky. Suppose that you have a few services:
a frontend
an order-management service
a loyalty-program service
Now, a customer makes a purchase on your frontend, which will call the order management service, which will save everything in the DB -- no problem. At this point, there will also be a call to the loyalty-program service so that it credits / debits points from your account.
Now, when everything is on the same DB / DB server it all becomes easy since you can run everything in one transaction: if the loyalty program service fails to write to the DB we can roll the whole thing back.
When we do DB operations throughout multiple services this isn't possible, as we don't rely on one connection / take advantage of running a single transaction.
What are the best patterns to keep things consistent and live a happy life?
I'm quite eager to hear your suggestions, and thanks in advance!
This is super-nice and elegant but in practice it becomes a bit tricky
What it means "in practice" is that you need to design your microservices in such a way that the necessary business consistency is fulfilled when following the rule:
that services cannot directly connect to a DB "owned" by another service.
In other words - don't make any assumptions about their responsibilities and change the boundaries as needed until you can find a way to make that work.
Now, to your question:
What are the best patterns to keep things consistent and live a happy life?
For things that don't require immediate consistency, and updating loyalty points seems to fall in that category, you could use a reliable pub/sub pattern to dispatch events from one microservice to be processed by others. The reliable bit is that you'd want good retries, rollback, and idempotence (or transactionality) for the event processing stuff.
If you're running on .NET some examples of infrastructure that support this kind of reliability include NServiceBus and MassTransit. Full disclosure - I'm the founder of NServiceBus.
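To show what idempotence buys you on the consuming side, here is a rough plain-Java sketch. It is not NServiceBus or MassTransit code; LoyaltyHandlerSketch, the event id and the in-memory collections are hypothetical, and a real handler would keep the processed ids and the balances in the same durable store.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: a subscriber that tolerates redelivery of the same event.
final class LoyaltyHandlerSketch {
    // Ids of events already applied; durable in a real system.
    private final Set<String> processedEventIds = new HashSet<>();
    private final Map<String, Integer> pointsByCustomer = new HashMap<>();

    // Returns true if the event changed the balance, false if it was a duplicate.
    boolean handle(String eventId, String customerId, int pointsDelta) {
        if (!processedEventIds.add(eventId)) {
            return false; // a retry delivered this event again, so ignore it
        }
        pointsByCustomer.merge(customerId, pointsDelta, Integer::sum);
        return true;
    }

    int balance(String customerId) {
        return pointsByCustomer.getOrDefault(customerId, 0);
    }
}
```

With this in place the publisher and the transport are free to retry as often as they like: delivering "order-1" twice credits the points once.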
Update: Following comments regarding concerns about the loyalty points: "if balance updates are processed with delay, a customer may actually be able to order more items than they have points for".
Many people struggle with these kinds of requirements for strong consistency. The thing is that these kinds of scenarios can usually be dealt with by introducing additional rules, like if a user ends up with negative loyalty points notify them. If T goes by without the loyalty points being sorted out, notify the user that they will be charged M based on some conversion rate. This policy should be visible to customers when they use points to purchase stuff.
I don’t usually deal with microservices, and this might not be a good way of doing things, but here’s an idea:
To restate the problem, the system consists of three independent-but-communicating parts: the frontend, the order-management backend, and the loyalty-program backend. The frontend wants to make sure some state is saved in both the order-management backend and the loyalty-program backend.
One possible solution would be to implement some type of two-phase commit:
1) First, the frontend places a record in its own database with all the data. Call this the frontend record.
2) The frontend asks the order-management backend for a transaction ID, and passes it whatever data it would need to complete the action. The order-management backend stores this data in a staging area, associating with it a fresh transaction ID and returning that to the frontend.
3) The order-management transaction ID is stored as part of the frontend record.
4) The frontend asks the loyalty-program backend for a transaction ID, and passes it whatever data it would need to complete the action. The loyalty-program backend stores this data in a staging area, associating with it a fresh transaction ID and returning that to the frontend.
5) The loyalty-program transaction ID is stored as part of the frontend record.
6) The frontend tells the order-management backend to finalize the transaction associated with the transaction ID the frontend stored.
7) The frontend tells the loyalty-program backend to finalize the transaction associated with the transaction ID the frontend stored.
8) The frontend deletes its frontend record.
If this is implemented, the changes will not necessarily be atomic, but it will be eventually consistent. Let’s think of the places it could fail:
If it fails in the first step, no data will change.
If it fails in the second, third, fourth, or fifth, when the system comes back online it can scan through all frontend records, looking for records without an associated transaction ID (of either type). If it comes across any such record, it can replay beginning at step 2. (If there is a failure in step 3 or 5, there will be some abandoned records left in the backends, but they are never moved out of the staging area, so that is OK.)
If it fails in the sixth, seventh, or eighth step, when the system comes back online it can look for all frontend records with both transaction IDs filled in. It can then query the backends to see the state of these transactions—committed or uncommitted. Depending on which have been committed, it can resume from the appropriate step.
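The backend half of this can be sketched in a few lines of plain Java. This is only an illustration of the staging idea under my own assumptions: StagingBackendSketch and its method names are hypothetical, and the maps stand in for the backend's database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustration only: one backend (order management or loyalty) with a staging area.
final class StagingBackendSketch {
    private final Map<String, String> stagingArea = new HashMap<>();
    private final Map<String, String> committed = new HashMap<>();

    // Stores the data without applying it and hands back a fresh transaction ID.
    String stage(String data) {
        String transactionId = UUID.randomUUID().toString();
        stagingArea.put(transactionId, data);
        return transactionId;
    }

    // Moves staged data into the committed set; repeating the call is harmless.
    void finalizeTransaction(String transactionId) {
        String data = stagingArea.remove(transactionId);
        if (data != null) {
            committed.put(transactionId, data);
        }
    }

    // Lets the frontend ask, after a crash, how far a transaction got.
    boolean isCommitted(String transactionId) {
        return committed.containsKey(transactionId);
    }

    int stagedCount() {
        return stagingArea.size();
    }
}
```

Because finalizeTransaction is safe to call twice, the frontend can simply repeat the finalize steps during recovery, and staged entries that are never finalized stay inert in the staging area.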
I agree with what @Udi Dahan said. Just want to add to his answer.
I think you need to persist the request to the loyalty program so that if it fails it can be done at some other point. There are various ways to word/do this.
1) Make the loyalty program API failure recoverable. That is to say it can persist requests so that they do not get lost and can be recovered (re-executed) at some later point.
2) Execute the loyalty program requests asynchronously. That is to say, persist the request somewhere first then allow the service to read it from this persisted store. Only remove from the persisted store when successfully executed.
3) Do what Udi said, and place it on a good queue (pub/sub pattern to be exact). This usually requires that the subscriber do one of two things... either persist the request before removing from the queue (goto 1) --OR-- first borrow the request from the queue, then after successfully processing the request, have the request removed from the queue (this is my preference).
All three accomplish the same thing. They move the request to a persisted place where it can be worked on till successful completion. The request is never lost, and retried if necessary till a satisfactory state is reached.
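For the "borrow first, remove on success" variant, a rough in-memory sketch in plain Java might look like the following. BorrowingQueueSketch and its receipt numbers are hypothetical; a real broker does this for you with acknowledgements or locks, and persists the requests.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustration only: a queue where taking a request does not yet delete it.
final class BorrowingQueueSketch {
    private final Deque<String> ready = new ArrayDeque<>();
    private final Map<Integer, String> borrowed = new HashMap<>();
    private int nextReceipt = 1;

    void publish(String request) {
        ready.addLast(request);
    }

    // Hands out the next request and returns a receipt, or empty if none is waiting.
    Optional<Integer> borrow() {
        String request = ready.pollFirst();
        if (request == null) {
            return Optional.empty();
        }
        int receipt = nextReceipt++;
        borrowed.put(receipt, request);
        return Optional.of(receipt);
    }

    String payload(int receipt) {
        return borrowed.get(receipt);
    }

    // Processing succeeded: only now is the request gone for good.
    void acknowledge(int receipt) {
        borrowed.remove(receipt);
    }

    // Processing failed: put the request back at the front so it is retried.
    void release(int receipt) {
        String request = borrowed.remove(receipt);
        if (request != null) {
            ready.addFirst(request);
        }
    }

    // Requests that still exist somewhere, waiting or being worked on.
    int outstanding() {
        return ready.size() + borrowed.size();
    }
}
```

At no point between publish and acknowledge does outstanding() drop to zero for an unprocessed request, which is the whole point: the request always has an owner.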
I like to use the example of a relay race. Each service or piece of code must take hold and ownership of the request before allowing the previous piece of code to let go of it. Once it's handed off, the current owner must not lose the request till it gets processed or handed off to some other piece of code.
Even with distributed transactions you can end up in a "transaction in doubt" status if one of the participants crashes in the midst of the transaction. If you design the services as idempotent operations then life becomes a bit easier. One can write programs to fulfill business conditions without XA. Pat Helland has written an excellent paper on this called "Life Beyond XA". Basically the approach is to make as few assumptions about remote entities as possible. He also illustrated an approach called Open Nested Transactions (http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper142.pdf) to model business processes. In this specific case, the Purchase transaction would be the top-level flow, and loyalty and order management would be next-level flows. The trick is to create granular services as idempotent services with compensation logic, so if anything fails anywhere in the flow, individual services can compensate for it. E.g. if the order fails for some reason, loyalty can deduct the points accrued for that purchase.
Another approach is to model using eventual consistency with CALM or CRDTs. I've written a blog post highlighting the use of CALM in real life - http://shripad-agashe.github.io/2015/08/Art-Of-Disorderly-Programming Maybe it will help you.
We are currently starting to broadcast events from one central application to other possibly interested consumer applications, and we have different opinions among members of our team about how much we should put in our published messages.
The general idea/architecture is the following :
In the producer application :
the user interacts with some entities (Aggregate Roots in the DDD sense) that can be created/modified/deleted
Based on what is happening, Domain Events are raised (ex : EntityXCreated, EntityYDeleted, EntityZTransferred etc ... i.e. not only CRUD, but mostly )
Raised events are translated/converted into messages that we send to a RabbitMQ Exchange
in RabbitMQ (we are using RabbitMQ but I believe the question is actually technology-independent):
we define a queue for each consuming application
bindings connect the exchange to the consumer queues (possibly with message filtering)
In the consuming application(s)
application consumes and process messages from its queue
Based on Enterprise Integration Patterns we are trying to define the Canonical format for our published messages, and are hesitating between 2 approaches :
Minimalist messages / event-store-ish : for each event published by the Domain Model, generate a message that contains only the parts of the Aggregate Root that are relevant (for instance, when an update is done, only publish information about the updated section of the aggregate root, more or less matching the process the end-user goes through when using our application)
Pros
small message size
very specialized message types
close to the "Domain Events"
Cons
problematic if delivery order is not guaranteed (i.e. what if Update message is received before Create message ? )
consumers need to know which message types to subscribe to (possibly a big list / domain knowledge is needed)
what if consumer state and producer state get out of sync ?
how to handle new consumer that registers in the future, but does not have knowledge of all the past events
Fully-contained idempotent-ish messages : for each event published by the Domain Model, generate a message that contains a full snapshot of the Aggregate Root at that point in time, hence handling in reality only 2 kinds of messages, "Create or Update" and "Delete" (+metadata with more specific info if necessary)
Pros
idempotent (declarative messages stating "this is what the truth is like, synchronize yourself however you can")
lower number of message formats to maintain/handle
allows consumers' synchronization errors to be corrected progressively
consumers automagically handle new Domain Events as long as the resulting message follows the canonical data model
Cons
bigger message payload
less pure
Would you recommend an approach over the other ?
Is there another approach we should consider ?
Is there another approach we should consider ?
You might also consider not leaking information out of the service acting as the technical authority for that part of the business
Which roughly means that your events carry identifiers, so that interested parties can know that an entity of interest has changed, and can query the authority for updates to the state.
for each event published by the Domain Model, generate a message that contains a full snapshot of the Aggregate Root at that point in time
This also has the additional Con that any change to the representation of the aggregate also implies a change to the message schema, which is part of the API. So internal changes to aggregates start rippling out across your service boundaries. If the aggregates you are implementing represent a competitive advantage to your business, you are likely to want to be able to adapt quickly; the ripples add friction that will slow your ability to change.
what if consumer state and producer state get out of sync ?
As best I can tell, this problem indicates a design error. If a consumer needs state, which is to say a view built from the history of an aggregate, then it should be fetching that view from the producer, rather than trying to assemble it from a collection of observed messages.
That is to say, if you need state, you need history (complete, ordered). All a single event really tells you is that the history has changed, and you can evict your previously cached history.
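On the consumer side, "the event only tells you the history changed" can be sketched roughly like this in plain Java. NotifiedViewCacheSketch is hypothetical, and the Function passed to it stands in for whatever call the consumer uses to query the authority.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustration only: events carry an identifier; the view always comes from the producer.
final class NotifiedViewCacheSketch {
    // Hypothetical "ask the authority for its current view of this entity" call.
    private final Function<String, String> queryAuthority;
    private final Map<String, String> cachedViews = new HashMap<>();
    private int fetches = 0;

    NotifiedViewCacheSketch(Function<String, String> queryAuthority) {
        this.queryAuthority = queryAuthority;
    }

    // An event arrived for this id: all we learn is that our cached copy is stale.
    void onEntityChanged(String entityId) {
        cachedViews.remove(entityId);
    }

    // Serves the cached view, fetching from the authority only when there is none.
    String view(String entityId) {
        return cachedViews.computeIfAbsent(entityId, id -> {
            fetches++;
            return queryAuthority.apply(id);
        });
    }

    int fetches() {
        return fetches;
    }
}
```

The consumer never interprets the producer's internal state from message payloads, so a change to the aggregate's representation only affects the query response, not every subscriber's event handling.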
Again, responsiveness to change: if you change the implementation of the producer, and consumers are also trying to cobble together their own copy of the history, then your changes are rippling across the service boundaries.