CQRS Command how to store and query entities that are not persisted to data store immediately

In CQRS, we separate Commands and Queries. As I understand it, Commands raise Domain Events that may modify Entity states, while Queries return view-specific DTOs directly from a data store. According to this article, the UI makes commands through a Command Bus which creates Commands that are handled by their respective CommandHandlers, which then orchestrate the Domain Logic to determine the occurrence of Domain Events and persist/publish any state changes to a Repository (optionally using Event Sourcing). After being persisted, state changes are available through Queries.
Now, what if a Command creates an Entity that is not persisted/published immediately? Firstly, where is that not-yet-persisted Entity held? Is it in the Command Bus, the Command Handler, the Repository, or should a new thin application layer hold it? How should a Query gain access to it?
The problem here is that it seems like any Queries for unpersisted Entities differ significantly from those of persisted Entities, unless CQRS demands that ALL Entities be persisted upon creation, which IMO is not necessarily compatible with all Domains.
Specifically, I'm trying to build software to record training information for various Training Sessions. However, I would like it if Training Sessions were persisted manually by a Save Session button as opposed to always upon creation. I don't know where a StartNewTrainingSessionCommand would store the new Training Session so that it can be Queried, if not in the data store.

I think you have misunderstood a few things: a command is sent via a service bus to a command handler, which uses the business objects to do the work. Domain events should be generated by the business (domain) objects, but sometimes the command handler does that too.
I don't see a reason for a created entity not to be saved. In your particular case, if the domain allows it, you can have a default, empty TrainingSession saved automatically and then updated when the user presses the Save button.
If this approach is not feasible, then simply store the input data (pretty much the view models) in a temporary place (session, db) and issue the command only when the user clicks the button.
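A minimal sketch of that second option, assuming an in-memory dictionary stands in for the "temporary place". All names here (`TrainingSessionDraft`, `DraftStore`, `SaveTrainingSessionCommand`, `press_save`) are hypothetical and only illustrate the idea: the draft lives outside the domain, and the command is created only when the user presses Save.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TrainingSessionDraft:
    """View-model-like input data held outside the domain (hypothetical)."""
    draft_id: str
    exercises: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class SaveTrainingSessionCommand:
    """Issued only when the user presses Save (hypothetical command)."""
    session_id: str
    exercises: Tuple[str, ...]


class DraftStore:
    """Temporary holding place; stands in for a web session or scratch table."""

    def __init__(self) -> None:
        self._drafts: Dict[str, TrainingSessionDraft] = {}

    def start(self) -> TrainingSessionDraft:
        draft = TrainingSessionDraft(draft_id=str(uuid.uuid4()))
        self._drafts[draft.draft_id] = draft
        return draft

    def get(self, draft_id: str) -> TrainingSessionDraft:
        return self._drafts[draft_id]

    def discard(self, draft_id: str) -> None:
        self._drafts.pop(draft_id, None)


class TrainingSessionRepository:
    """In-memory stand-in for the real write-side data store."""

    def __init__(self) -> None:
        self.saved: Dict[str, Tuple[str, ...]] = {}

    def add(self, session_id: str, exercises: Tuple[str, ...]) -> None:
        self.saved[session_id] = exercises


def handle_save(cmd: SaveTrainingSessionCommand,
                repo: TrainingSessionRepository) -> None:
    # Command handler: the only place that touches the write-side store.
    repo.add(cmd.session_id, cmd.exercises)


def press_save(draft_id: str, drafts: DraftStore,
               repo: TrainingSessionRepository) -> SaveTrainingSessionCommand:
    # UI layer: turn the draft into a command only when Save is clicked.
    draft = drafts.get(draft_id)
    cmd = SaveTrainingSessionCommand(draft.draft_id, tuple(draft.exercises))
    handle_save(cmd, repo)
    drafts.discard(draft_id)
    return cmd
```

Until `press_save` runs, the UI reads the draft straight from the `DraftStore`; the query side only ever sees sessions that were actually saved.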

Related

CQRS: Read model projection update in an API

I would like to have a simple CQRS implementation on an API.
In short:
Separate routes for Command and Query.
Separate DB tables (on the same DB at the moment). Normalized one for Command and a de-normalized one for Query.
Asynchronous event-driven update of Query Read Model, using existing external Event Bus.
After the Command is executed, naturally I need to raise an event and pass it to the Event Bus.
The Event Bus would process the event and pass it to its subscriber(s).
In this case the subscriber is the Read Model, which needs to be updated.
So I need a callback route on the API which gets the event from the Event Bus and updates the Read Model projection (i.e. updates the de-normalized DB table which is used for Queries).
The problem is that the update of the Read Model projection is neither a Command (we do not execute any Domain Logic) nor a Query.
The question is:
How should this async Read Model update work in order to be compliant both with CQRS and DDD?

How should this async Read Model update work in order to be compliant both with CQRS and DDD?
I normally think of the flow of information as a triangle.
We copy information from the outside world into our "write model", via commands
We copy information from the write model into our "read model"
We copy information from the read model to the outside world, via queries.
Common language for that middle step is "projection".
So the projection (typically) runs asynchronously, querying the "write model" and updating the "read model".
In the architecture you outlined, it would be the projection that is subscribed to the bus. When the bus signals that the write model has changed, we wake up the projection, and let it run so that it can update the read model.
(Note the flow of information: the signal we get from the bus triggers the projection to run, but the projection copies data from the write model, not from the event bus message. This isn't the only way to arrange things, but it is simple, and therefore easy to reason about when things start going pear-shaped.)
It is often the case that the projection will store some of its own metadata when it updates the read model, so as to not repeat work.
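A rough sketch of that arrangement, under stated assumptions: the write model is an in-memory list with a monotonically increasing version per row, and the names (`WriteModel`, `OrderRow`, `OrderSummaryProjection`, `on_bus_signal`) are invented for illustration. The bus message is treated purely as a wake-up call, and `last_seen_version` is the projection's own metadata that keeps it from repeating work.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class OrderRow:
    """A normalized write-model row (hypothetical shape)."""
    version: int      # monotonically increasing change counter
    order_id: str
    customer: str
    total: float


class WriteModel:
    """In-memory stand-in for the normalized command-side tables."""

    def __init__(self) -> None:
        self._rows: List[OrderRow] = []

    def append(self, order_id: str, customer: str, total: float) -> None:
        version = len(self._rows) + 1
        self._rows.append(OrderRow(version, order_id, customer, total))

    def changes_since(self, version: int) -> List[OrderRow]:
        return [row for row in self._rows if row.version > version]


class OrderSummaryProjection:
    """Subscribed to the bus; copies data from the write model on each signal."""

    def __init__(self, write_model: WriteModel) -> None:
        self._write_model = write_model
        self.read_model: Dict[str, str] = {}   # de-normalized query table
        self.last_seen_version = 0             # projection metadata

    def on_bus_signal(self, _message: Optional[dict] = None) -> int:
        # The message content is ignored: it only tells us to wake up.
        changes = self._write_model.changes_since(self.last_seen_version)
        for row in changes:
            self.read_model[row.order_id] = f"{row.customer}: {row.total:.2f}"
            self.last_seen_version = row.version
        return len(changes)
```

Because the checkpoint is stored with the projection, a duplicate or late signal from the bus is harmless: the second call simply finds nothing new to copy.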

Avoid two-phase commits in an event-sourced application saving BLOB data

Let's assume we have an Aggregate User which has a UserPortraitImage and a Contract as a PDF file. I want to store files in a dedicated document-based store and just hold process-relevant data in the event (with a link to the BLOB data).
But how do I avoid a two-phase commit when I have to store the files and store the new event?
At first I'd store the documents and then the event; if the first transaction fails it doesn't matter, the command failed. If the second transaction fails it also doesn't matter even if we generated some dead files in the store, the command fails; we could even apply a rollback.
But could there be an additional problem?
The next question is how to design the aggregate and the event. If the aggregate only holds a reference to the BLOB storage, what is the process after a SignUp command got called?
SignUpCommand ==> Store documents (UserPortraitImage and Contract) ==> Create new User aggregate with the given BLOB storage references and store it?
Is there a better design which unburdens the aggregate of knowing that BLOB data is saved in another store? And who is responsible for storing BLOB data and forwarding the reference to the aggregate?

Sounds like you are working with something analogous to an AtomPub media-entry/media-link-entry pair. The blob goes into your data store, and the metadata gets copied into the aggregate history.
But how do I avoid a two-phase commit when I have to store the files and store the new event?
In practice, you probably don't.
That is to say, if the blob store and the aggregate store happen to be the same database, then you can update both in the same transaction. That couples the two stores, and adds some pretty strong constraints to your choice of storage, but it is doable.
Another possibility is that you accept that the two changes that you are making are isolated from one another, and therefore that for some period of time the two stores are not consistent with each other.
In this second case, the saga pattern is what you are looking for, and it is exactly what you describe; you pair the first action with a compensating action to take if the second action fails. So "manual" rollback.
Or not - in a sense, the git object database uses a two-phase commit; an object gets copied into the object store, and then the trees get updated, and then the commit... garbage collection comes along later to discard the objects that you don't need.
who is responsible for storing BLOB data and forwarding the reference to the aggregate?
Well, ultimately it is an infrastructure concern; does your model actually need to interact with the document, or is it just carrying a claim check that can be redeemed later?
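To make the "manual rollback" concrete, here is a small sketch that pairs the blob write with a compensating delete. It assumes two independent in-memory stores; `BlobStore`, `EventStore`, `sign_up` and the event shape are hypothetical and not taken from any particular framework.

```python
import uuid
from typing import Dict, List


class BlobStore:
    """In-memory stand-in for the document store."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        ref = str(uuid.uuid4())
        self.blobs[ref] = data
        return ref

    def delete(self, ref: str) -> None:
        self.blobs.pop(ref, None)


class EventStore:
    """In-memory stand-in for the event store; can simulate an outage."""

    def __init__(self, fail: bool = False) -> None:
        self.events: List[dict] = []
        self.fail = fail

    def append(self, event: dict) -> None:
        if self.fail:
            raise IOError("event store unavailable")
        self.events.append(event)


def sign_up(user_id: str, portrait: bytes, contract: bytes,
            blobs: BlobStore, events: EventStore) -> None:
    stored_refs: List[str] = []
    try:
        # Step 1: store the documents and remember what was written.
        portrait_ref = blobs.put(portrait)
        stored_refs.append(portrait_ref)
        contract_ref = blobs.put(contract)
        stored_refs.append(contract_ref)
        # Step 2: store the event, which carries only the references.
        events.append({
            "type": "UserSignedUp",
            "user_id": user_id,
            "portrait_ref": portrait_ref,
            "contract_ref": contract_ref,
        })
    except Exception:
        # Compensating action: remove whatever step 1 left behind.
        for ref in stored_refs:
            blobs.delete(ref)
        raise
```

If the process dies between the two steps, the compensation never runs, so an out-of-band cleanup of unreferenced blobs (the garbage collection idea above) is still worth having.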

At first I'd store the documents and then the event; if the first transaction fails it doesn't matter, the command failed. If the second transaction fails it also doesn't matter even if we generated some dead files in the store, the command fails; we could even apply a rollback. But could there be an additional problem?
Not that I can think of, aside from wasted disk space. That's what I typically do when I want to avoid distributed transactions or when they're not available across the two types of data stores. Oftentimes, one of the two operations is less important and you can afford to let it complete even if the master operation fails later.
Cleaning up botched attempts can be done during exception handling, as an out-of-band process, or as part of a Saga as @VoiceOfUnreason explained.
SignUpCommand ==> Store documents (UserPortraitImage and Contract) ==> Create new User aggregate with the given BLOB storage references and store it?
Yes. Usually the Application layer component (the Command handler in your case) acts as a coordinator between the different data stores and gets back all it needs to know from one store before talking to the other or to the Domain.
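One way this coordination could look, as a sketch only: the handler receives two injected callables (`store_document` and `save_events`, both hypothetical), and the aggregate only ever sees an opaque `DocumentRef`, so it does not need to know that the documents live in a separate store.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class DocumentRef:
    """Opaque claim check; the domain never dereferences it."""
    value: str


@dataclass(frozen=True)
class UserSignedUp:
    user_id: str
    portrait: DocumentRef
    contract: DocumentRef


class User:
    """Aggregate: holds references only, knows nothing about the BLOB store."""

    def __init__(self, user_id: str, portrait: DocumentRef,
                 contract: DocumentRef) -> None:
        self.user_id = user_id
        self.portrait = portrait
        self.contract = contract
        self.pending_events: List[UserSignedUp] = []

    @classmethod
    def sign_up(cls, user_id: str, portrait: DocumentRef,
                contract: DocumentRef) -> "User":
        user = cls(user_id, portrait, contract)
        user.pending_events.append(UserSignedUp(user_id, portrait, contract))
        return user


class SignUpHandler:
    """Application-layer coordinator between the two stores and the domain."""

    def __init__(self, store_document: Callable[[bytes], str],
                 save_events: Callable[[List[UserSignedUp]], None]) -> None:
        self._store_document = store_document
        self._save_events = save_events

    def handle(self, user_id: str, portrait_bytes: bytes,
               contract_bytes: bytes) -> User:
        # Talk to the document store first and collect the references.
        portrait = DocumentRef(self._store_document(portrait_bytes))
        contract = DocumentRef(self._store_document(contract_bytes))
        # Then hand only the references to the domain.
        user = User.sign_up(user_id, portrait, contract)
        # Finally persist the resulting events.
        self._save_events(user.pending_events)
        return user
```

Swapping the document store then only means passing a different `store_document` callable; the `User` aggregate and its event stay unchanged.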

Asynchronous Object Construction in Core Data

I'm currently working on an app with a reasonably complex Core Data model. The data model currently has 10 tables in it, with a bunch of relationships set between them. The data for the model is obtained piecemeal from a remote server. In order to minimize the amount of traffic to/from the server, the server API passes object IDs first, giving me a chance to discover whether I have already stored the objects. If not, then I can ask the server for the full objects and store them. However, those objects can have references to other objects, for which I will need to follow the same process: check if I have the object(s) and, if not, grab the objects from the server. The Core Data model includes fields for the server IDs, which I use to validate and construct Core Data's object graph.
This creates a situation where objects will have been instantiated in Core Data, but won't have been completely constructed as they may be waiting for referenced objects to be returned by the server (which may, in turn, need to wait for their own reference objects).
So my first attempt to deal with this was to create a semaphore that would not allow the object context to be saved (I only save the context in one place) until all objects are downloaded and the object graph is constructed. The problem I ran into was that the context was being saved anyway, without me asking. This results in a ton of changes propagating through NSFetchedResultsController as objects are downloaded from the server and the object graph is being constructed. Moreover, the propagated objects may not be complete.
Has anyone dealt with anything like this? I think this could all work if I could explicitly control when Core Data saves, but that does not appear to be possible. Or am I missing something?
UPDATE
I was missing something. I was under the impression that NSFetchedResultsController received updates when the Context is saved. This is not true. It receives updates whenever processPendingChanges is called in the context, which occurs at the end of an event cycle. In the past, I've always used two contexts to keep updates separate from the UI, but this project had a deadline and existing code that kept me from refactoring. Given this new information, I think the separate context will fix my problem.

That is an extremely expensive way to sync with a server. Is there a reason your server can't respond to "changed since X" calls and give you everything? In your current design you are spending more time opening and closing sockets than you are receiving data.
Be that as it may, you want to do all of this processing in a secondary context that is connected directly to the NSPersistentStoreCoordinator. When it saves you want to capture the NSManagedObjectContextDidSaveNotification and then have your UI context consume that notification. That will update your UI when your server sync is complete.
This will keep your syncing 100% isolated from the UI and allow the UI to save or do whatever else it needs to do while you are working with the server. I would not use a parent/child design here. There is no reason to.

You access a Core Data database via the NSManagedObjectContext class.
Each context object must belong to a single thread, and any NSManagedObjects that context creates belong to the same thread.
Do not read or write any managed object from a thread other than the one that created it. If you do, you'll end up with unpredictable and impossible to debug data corruption problems.
However, you can have multiple NSManagedObjectContext instances for a single Core Data database, each one on a different thread, and you can merge any changes made to the context in one thread over to a context on another thread.
So, basically, you have a "main" NSManagedObjectContext which is used on the main thread, and used for almost all your operations. And then when you need to do something on another thread you create a "child" context for that thread, make all your changes, then merge those changes back to the main context on the main thread.
You can find specific details how to implement this from Apple's official documentation. Start reading here:
https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/CoreData/Articles/cdConcurrency.html#//apple_ref/doc/uid/TP40003385-SW1

Core Data: how to undo operations once managed objects are saved with context

I am trying to implement downloading of bulk data from several tables on the server.
In my case there are 16 tables. For all these tables I will be firing 10 requests to the server. I have grouped related tables logically, but all the tables are inter-related with each other through one relationship or another.
I need to consider three cases while downloading:
Saving data to each local table.
Managing relationships between inserted objects.
Handling the situation where one of the requests fails during download, say the 8th request.
I will be following this approach for each response:
Inserting data in managed object context.
Managing relationships by using NSPredicate to find and associate the related objects.
Saving the context.
In case of a response failure, I have two options:
Next time continue from the failed response.
Revert all saved data to its previous state.
The 1st approach may lead to some data inconsistency, so I am going with the 2nd approach.
I know that if a managed object context is not saved, we can revert the changes, but is it possible to revert the changes if the managed object context is saved?
I require some useful answers from the community.
Please suggest.

Is it possible to revert the changes, if the managed object context is saved?
After saving? Maybe, but it could be tricky. If you set up a separate managed object context for your network operations, and give it an NSUndoManager, you could later on tell the undo manager to roll everything back to the previous state.
It would be simpler to just not save changes until you're finished, though. Using an undo manager doesn't really help much; the memory needed to store up all the undo actions will at least match the memory use from keeping all of the unsaved changes around until you're finished. If you're working on a separate managed object context (whether a child context or a completely separate context), handling the error case is as simple as letting the MOC get deallocated without saving changes first.

NSManagedObjectContext confusion

I am learning about Core Data. Obviously, one of the main classes you encounter is NSManagedObjectContext. I am unclear about the exact role of this. From the articles I've read, it seems that you can have multiple NSManagedObjectContexts. Does this mean that NSManagedObjectContext is basically a copy of the backend?
How would this resolve into a consistent backend when there are multiple different copies lying around?
So, 2 questions basically:
Is NSManagedObjectContext a copy of the backend database?
and...
For example, say I make a change in context A and make some other change in context B. Then I call save on A first, then B. Will B prevail?
Thanks

The NSManagedObjectContext is not a copy of the backend database. The documentation describes it as a scratch pad:
An instance of NSManagedObjectContext represents a single “object space” or scratch pad in an application. Its primary responsibility is to manage a collection of managed objects. These objects form a group of related model objects that represent an internally consistent view of one or more persistent stores. A single managed object instance exists in one and only one context, but multiple copies of an object can exist in different contexts. Thus object uniquing is scoped to a particular context.
The NSManagedObjectContext is just a temporary place to make changes to your managed objects in a transactional way. When you make changes to objects in a context, it does not affect the backend database unless and until you save the context. As you know, you can have multiple contexts to make changes in, which is really important for concurrency.
For question number 2, the answer for who prevails will depend on the merge policy you set for your context and on which one saves last, which would be B. Here are the merge policies that can be set and that will affect the second context to be saved.
NSErrorMergePolicyType
Specifies a policy that causes a save to fail if there are any merge conflicts.
NSMergeByPropertyStoreTrumpMergePolicyType
Specifies a policy that merges conflicts between the persistent store’s version of the object and the current in-memory version, giving priority to external changes.
NSMergeByPropertyObjectTrumpMergePolicyType
Specifies a policy that merges conflicts between the persistent store’s version of the object and the current in-memory version, giving priority to in-memory changes.
NSOverwriteMergePolicyType
Specifies a policy that overwrites state in the persistent store for the changed objects in conflict.
NSRollbackMergePolicyType
Specifies a policy that discards in-memory state changes for objects in conflict.
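To see how the choice of policy changes the outcome of "save A, then save B", here is a toy model in Python. It is not Core Data code and does not use any Core Data API; `Store`, `Context`, `MergePolicy` and `save_a_then_b` are invented names that only mimic the behaviour described in the list above, for a single record with two properties.

```python
from enum import Enum
from typing import Dict


class MergePolicy(Enum):
    ERROR = "error"                # fail the save on any conflict
    STORE_TRUMP = "store_trump"    # per property, external changes win
    OBJECT_TRUMP = "object_trump"  # per property, in-memory changes win
    OVERWRITE = "overwrite"        # in-memory object replaces the stored one
    ROLLBACK = "rollback"          # in-memory changes are discarded


class MergeConflict(Exception):
    pass


class Store:
    """Toy persistent store holding one record plus a version counter."""

    def __init__(self, record: Dict[str, str]) -> None:
        self.record = dict(record)
        self.version = 0


class Context:
    """Toy scratch pad: edits stay local until save() is called."""

    def __init__(self, store: Store,
                 policy: MergePolicy = MergePolicy.ERROR) -> None:
        self.store = store
        self.policy = policy
        self.snapshot = dict(store.record)       # state as last fetched
        self.snapshot_version = store.version
        self.working = dict(store.record)        # state being edited

    def save(self) -> None:
        changed = {k: v for k, v in self.working.items()
                   if self.snapshot.get(k) != v}
        if self.store.version != self.snapshot_version:   # conflict
            if self.policy is MergePolicy.ERROR:
                raise MergeConflict("store changed since this context fetched")
            if self.policy is MergePolicy.ROLLBACK:
                changed = {}
            elif self.policy is MergePolicy.STORE_TRUMP:
                # Drop our edits to properties that also changed externally.
                changed = {k: v for k, v in changed.items()
                           if self.store.record.get(k) == self.snapshot.get(k)}
            elif self.policy is MergePolicy.OVERWRITE:
                changed = dict(self.working)
            # OBJECT_TRUMP keeps `changed` as it is.
        self.store.record.update(changed)
        self.store.version += 1
        self.snapshot = dict(self.store.record)
        self.snapshot_version = self.store.version
        self.working = dict(self.store.record)


def save_a_then_b(policy_for_b: MergePolicy) -> Dict[str, str]:
    store = Store({"name": "Ann", "city": "Oslo"})
    a = Context(store)
    b = Context(store, policy_for_b)
    a.working.update(name="Bea", city="Rome")
    b.working["name"] = "Cy"
    a.save()
    b.save()
    return dict(store.record)
```

In this model B "prevails" completely only with the overwrite policy (B's stale city replaces A's), partially with object-trump (B's name, A's city), and not at all with store-trump or rollback; with the error policy B's save raises instead.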

An NSManagedObjectContext is a specific representation of your data model. Each context maintains its own state, so changes in one context will not directly affect other contexts. When you work with multiple contexts, it is your responsibility to keep them consistent by merging changes when a context saves its changes to the store.
Your question is regarding this process and may also involve merge conflicts. Whenever you save a context its changes are committed to the store and a merge policy is used to resolve conflicts.
When you save a context, it will post various notifications regarding progress. In your case, if [contextA save:&error] succeeds, the context will post the notification NSManagedObjectContextDidSaveNotification. When you have multiple contexts, you typically observe this notification and call:
[contextB mergeChangesFromContextDidSaveNotification:notification];
This will merge the changes saved on contextA into contextB.
EDIT: removed the 'thread-safe' comment. NSManagedObjectContext is not thread safe.
