NHibernate Cascade Behavior for Same Session Updates

Documentation on NHibernate cascade settings discusses them in the context of calling the Save(), Update() and Delete() methods. But I can find no discussion of cascade behavior in the context of the implicit update that occurs when one loads, modifies and saves entities on the same session. In this case an explicit call to Update() is not needed, so what happens regarding cascade settings?
This may seem a dumb question, but the reason I bring this up is that I am trying to figure out how NHibernate supports the concept of Aggregate Boundaries in the context of Domain Driven Design. Let me give an example that will illustrate what I am trying to get at.
Suppose I have the canonical invoice application with the entities Invoice, Buyer and LineItem. Invoice is an aggregate root and LineItem is in the same aggregate but Buyer is its own aggregate root.
I want to model this in NHibernate by configuring my mapping such that the cascade from Invoice to LineItem is all-delete-orphan and the one from Invoice to Buyer is none.
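For illustration, a rough sketch of such a mapping might look like this (assuming Fluent NHibernate, since the question doesn't show a mapping style; the Id property is a placeholder):

using FluentNHibernate.Mapping;

// Hypothetical mapping: LineItems are inside the Invoice aggregate, Buyer is not.
public class InvoiceMap : ClassMap<Invoice>
{
    public InvoiceMap()
    {
        Id(x => x.Id);
        Map(x => x.ShippedDate);

        // LineItems belong to the Invoice aggregate: cascade everything,
        // including removal of orphaned items.
        HasMany(x => x.LineItems).Cascade.AllDeleteOrphan();

        // Buyer is its own aggregate root: never cascade from Invoice.
        References(x => x.Buyer).Cascade.None();
    }
}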
Based on the documentation I have read, using my desired cascade settings, if I am working with disconnected entities and I do the following, only the Invoice and LineItems will save:
disconnectedInvoice.ShippedDate = DateTime.Today;
disconnectedInvoice.LineItems[2].Backordered = true;
disconnectedInvoice.Buyer.Address = buyersNewAddress;
session.Update(disconnectedInvoice);
session.Flush();
What I don't see discussed anywhere is what happens when one retrieves the invoice, makes the same updates and flushes the session in a connected manner like so:
var invoice = session.Get<Invoice>(invoiceNumber);
invoice.ShippedDate = DateTime.Today;
invoice.LineItems[2].Backordered = true;
invoice.Buyer.Address = buyersNewAddress;
session.Flush();
The NHibernate documentation says that flush persists the changes for dirty entities associated with the session. Based on this one would presume that the updates to the Invoice, the Buyer and the LineItems will all be persisted.
However, this would seem to violate the concept behind the cascade rule. It would seem to me that for the purposes of deciding what entities to update upon flush, the session should look at those entities that were directly loaded (only the invoice in this case) and include indirectly loaded entities (the LineItems and the Buyer in this case) only if the cascade setting indicates they should be persisted.
I admit that this example represents bad DDD. If the Buyer isn't part of the aggregate, then it should not be updated at this time, or at least not through the Invoice aggregate. However, DDD aside, the point I am actually more interested in is whether cascade rules are honored for updates in the same-session scenario the same way they are in the disconnected scenario.

The NHibernate documentation says that flush persists the changes for dirty entities associated with the session.
The main issue is the difference between disconnected and connected entities. Cascading behaves the same for both; it's the implicit updates that are different. For session-loaded entities, there is no need to cascade the save to the Buyer since it would be redundant. For a disconnected entity, you need the cascade since there will be no implicit update to that Buyer, because it was never explicitly merged into the session.

Related

How to use database triggers in a real world project?

I've learned a lot about triggers and active databases in the last weeks, but I have some questions about real-world examples of these.
At work we use Entity Framework with ASP.NET and an MS SQL Server. We just use the auto-generated constraints and no triggers.
When I heard about triggers I asked myself the following questions:
Which tasks can be performed by triggers?
e.g.: Generation of reporting data: currently the data for the reports is created in VB, but I think a trigger could handle this as well. The creation in VB takes a lot of time and the user should not need to wait for it, because it's not necessary for his work.
Is this an example for a perfect task for a trigger?
How do OR mappers handle trigger-manipulated data?
e.g.: Does an OR mapper recognize when a trigger has manipulated data? Entity Framework seems to cache a lot of data, so I'm not sure whether it reads the updated data if a trigger manipulates it after the insert/update/delete from the framework is processed.
How much constraint handling should be within the database?
e.g.: Sometimes constraints in the database seem much easier and faster than in the layer above (VB.NET, ...), but how do you throw exceptions to the upper layer so they can be handled by the OR mapper?
Is there a good solution for handling SQL exceptions (from triggers) in any OR mapper?
Thanks in advance
When you hear about a new tool or feature, it doesn't mean you have to use it everywhere. You should think about the design of your application.
Triggers are used a lot when the logic lives in the database, but if you build an ORM layer on top of your database you want the logic in the business layer, using your ORM. That doesn't mean you should not use triggers; it means you should use them with an ORM the same way you use stored procedures or database functions - only when it makes sense or when it improves performance. If you push a lot of logic into the database you can throw away the ORM and perhaps your whole business layer and use a two-layered architecture where the UI talks directly to the database, which does everything you need - such an architecture is considered "old".
When using an ORM, a trigger can be helpful for some DB-generated data such as audit columns or custom sequences of primary key values.
Current ORMs mostly don't play well with triggers - they can only react to changes to the currently processed record, so if, for example, you save an Order record and your update trigger modifies all of its ordered items, there is no automatic way to let the ORM know about that - you must reload the data manually. In EF, any data modified or generated in the database must be marked with StoreGeneratedPattern.Identity or StoreGeneratedPattern.Computed - EF strictly follows the pattern where logic is either in the database or in the application. Once you declare that a value is assigned in the database, you cannot change it in the application (the change will not be persisted).
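As a rough illustration of that manual reload, here is a minimal sketch (assuming a recent DbContext-style EF API and a hypothetical Order/OrderItem model; none of this is code from the question):

var orderId = 42; // example key

using (var context = new ShopContext()) // hypothetical DbContext
{
    var order = context.Orders.Find(orderId);
    order.Status = "Shipped";
    context.SaveChanges(); // an UPDATE trigger may now have modified the OrderItems rows

    // EF has no idea the trigger changed anything, so re-read the affected entities.
    foreach (var item in order.Items)
    {
        context.Entry(item).Reload();
    }
}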
Your application logic should be responsible for data validation and call persistence only if validation passes. You should avoid unnecessary transactions and roundtrips to database when you can know upfront that transaction will fail.
I use triggers for two main purposes: auditing and updating modification/insertion times. When auditing, the triggers push data to related audit tables. This doesn't affect the ORM in any way as those tables are not typically mapped in the main data context (there's a separate auditing data context used when needed to look at audit data).
When recording/modifying insert/modification times, I typically mark those properties in the model as [DatabaseGenerated(DatabaseGeneratedOption.Computed)]. This prevents any values set in the data layer from being persisted back to the DB and allows the trigger to enforce setting the DateTime fields properly.
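For instance, a minimal model sketch might look like the following (assuming EF Code First with the attribute in System.ComponentModel.DataAnnotations.Schema; the class and column names are placeholders):

using System;
using System.ComponentModel.DataAnnotations.Schema;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }

    // The database trigger sets these; EF reads them back but never writes them.
    [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
    public DateTime CreatedOn { get; set; }

    [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
    public DateTime ModifiedOn { get; set; }
}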
It's not a hard and fast rule that I manage auditing and these dates in this way. Sometimes I need more auditing information than is available in the database itself and handle auditing in the data layer instead. Sometimes I want to force the application to update dates/times (since they may need to be the same over several rows/tables updated at the same time). In those cases I might make the field nullable, but [Required] in the model to force a date/time to be set before the model can be persisted.
The old Infomodeler/Visiomodeler ORM (not what you think - it was Object Role Modeling) provided an alternative when generating the physical model. It would provide all the referential integrity with triggers. For two reasons:
Some DBMSes (notably Sybase/SQL Server) didn't have declarative RI yet, and
It could provide much more finely grained integrity - e.g. "no more than two children" or "sons or daughters but not both" or "mandatory son or daughter but not both".
So trigger logic related to the model in the same way that any RI constraint does. In SQL Server it handled violations with RAISERROR.
A conceptual issue with triggers is that they are essentially context-free - they always fire regardless of context (at least without great pain), so you might do better to include their logic with the rest of the context-specific logic. So global domain constraints are the only place I find them useful - which I guess is another general way of identifying "referential integrity".
Triggers are used to maintain integrity and consistency of data (by using constraints), help the database designer ensure certain actions are completed and create database change logs.
For example, given numeric input, if you want the value to be constrained to, say, less than 100, you could write a trigger that fires for every row on update or insert and raises an application error if the value of that column does not meet that constraint.
Suppose you want to log historical changes to a table. You could create a trigger that fires AFTER each INSERT, UPDATE, and DELETE and also inserts the data into a logging table. If you need to execute custom logic, then triggers may appeal to you.

NHibernate: What are child sessions and why and when should I use them?

In the comments on Ayende's blog post about auditing in NHibernate there is a mention of the need to use a child session: session.GetSession(EntityMode.Poco).
As far as I understand it, it has something to do with the order of the SQL operations which session.Flush will emit. (For example: if I wanted to perform some delete operation in the pre-insert event but the session was already done with deleting operations, I would need some way to inject them in.)
However I did not find documentation about this feature and behavior.
Questions:
Is my understanding of child sessions correct?
How and in which scenarios should I use them?
Are they documented somewhere?
Could they be used for session "scoping"?
(For example: I open the master session which will hold some data and then I create 2 child sessions from the master one. I'd expect that the two child scopes will be separate but that they will share objects from the master session's cache. Is this the case?)
Are they first-class citizens in NHibernate or are they just a hack to support some edge-case scenarios?
Thanks in advance for any info.
Stefando,
NHibernate has no knowledge of child sessions; you can reuse an existing one or open a new one.
For instance, you will get an exception if you try to load the same entity into two different sessions.
The reason it is mentioned in the blog is that in the pre-update and pre-insert events you cannot load more objects into the session: you can change an already loaded instance, but you may not, for instance, navigate to a relationship property.
So in the blog a new session is needed just because we want to add a new audit-log entity. In the end it's the transaction (unit of work) that manages the data.
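For reference, the pattern from the blog comments looks roughly like this (a minimal sketch, assuming NHibernate 2.x/3.x event listeners and a hypothetical AuditLogEntry entity; the event's session object is assumed to also implement ISession at runtime, so the cast succeeds):

using NHibernate;
using NHibernate.Event;

public class AuditEventListener : IPreInsertEventListener
{
    public bool OnPreInsert(PreInsertEvent @event)
    {
        // A child session shares the parent's connection and transaction but lets us
        // save a new entity without disturbing the flush that is already in progress.
        var parentSession = (ISession)@event.Session;
        using (ISession childSession = parentSession.GetSession(EntityMode.Poco))
        {
            childSession.Save(new AuditLogEntry
            {
                EntityName = @event.Entity.GetType().Name,
                Action = "Insert"
            });
            childSession.Flush();
        }
        return false; // do not veto the insert
    }
}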

How to manage multiple versions of the same record

I am doing short-term contract work for a company that is trying to implement a check-in/check-out type of workflow for their database records.
Here's how it should work...
A user creates a new entity within the application. There are about 20 related tables that will be populated in addition to the main entity table.
Once the entity is created the user will mark it as the master.
Another user can make changes to the master only by "checking out" the entity. Multiple users can check out the entity at the same time.
Once the user has made all the necessary changes to the entity, they put it in a "needs approval" status.
After an authorized user reviews the entity, they can promote it to master which will put the original record in a tombstoned status.
The way they are currently accomplishing the "check out" is by duplicating the entity records in all the tables. The primary keys include EntityID + EntityDate, so they duplicate the entity records in all related tables with the same EntityID and an updated EntityDate and give it a status of "checked out". When the record is put into the next state (needs approval), the duplication occurs again. Eventually it will be promoted to master at which time the final record is marked as master and the original master is marked as dead.
This design seems hideous to me, but I understand why they've done it. When someone looks up an entity from within the application, they need to see all current versions of that entity. This was a very straightforward way of making that happen. But the fact that they are representing the same entity multiple times within the same table(s) doesn't sit well with me, nor does the fact that they are duplicating EVERY piece of data rather than only storing deltas.
I would be interested in hearing your reaction to the design, whether positive or negative.
I would also be grateful for any resources you can point me to that might be useful for seeing how someone else has implemented such a mechanism.
Thanks!
Darvis
I've worked on a system like this which supported the static data for trading at a very large bank. The static data in this case is things like the details of counterparties, standard settlement instructions, currencies (not FX rates) etc. Every entity in the database was versioned, and changing an entity involved creating a new version, changing that version and getting the version approved. They did not however let multiple people create versions at the same time.
This led to a horribly complex database, with every join having to take version and approval state into account. In fact the software I wrote for them was middleware that abstracted this complex, versioned data into something that end-user applications could actually use.
The only thing that could have made it any worse was to store deltas instead of complete versioned objects. So the point of this answer is - don't try to implement deltas!
This looks like an example of a temporal database schema. Often, in cases like this, a distinction is made between an entity's key (EntityID, in your case) and the row's primary key in the database (in your case {EntityID, EntityDate}, but often a simple integer). You have to accept that the same entity is represented multiple times in the database, at different points in its history. Every database row still has a unique ID; it's just that your database is tracking versions, rather than entities.
You can manage data like that, and it can be very good at tracking changes to data, and providing accountability, if that is required, but it makes all of your queries quite a bit more complex.
You can read about the rationale behind, and the design of, temporal databases on Wikipedia.
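To make the entity-key vs. row-key distinction concrete, a version row might look something like this (a sketch only; the names are illustrative, not taken from the question's schema):

using System;

// One row per version of an entity. The entity is identified by EntityId;
// a single row is identified by RowId (or by the pair EntityId + Version).
public class EntityVersion
{
    public long RowId { get; set; }        // surrogate primary key for the row
    public Guid EntityId { get; set; }     // stable key shared by all versions
    public int Version { get; set; }       // increases with every change
    public string Status { get; set; }     // e.g. "Master", "CheckedOut", "NeedsApproval", "Dead"
    public DateTime ValidFrom { get; set; }
    public DateTime? ValidTo { get; set; } // null while this version is current
}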
You are describing a homebrew Content Management System which was probably hacked together over time, is - for the reasons you state - redundant and inefficient, and given the nature of such systems in firms is unlikely to be displaced without massive organizational effort.

NHibernate, ActiveRecord, Transaction database locks and when Commits are flushed

This is a common question, but the explanations found so far and observed behaviour are some way apart.
We want the following NHibernate strategy in our MVC website:
A SessionScope for the request (to track changes)
An ActiveRecord.TransactionScope to wrap our inserts only (to enable rollback/commit of the batch)
Selects to be outside a Transaction (to reduce extent of locks)
Delayed Flush of inserts (so that our insert/updates occur as a UoW at end of session)
Now currently we:
Don't get the implied transaction from the SessionScope (with FlushAction Auto or Never)
If we use ActiveRecord.TransactionScope there is no delayed flush and any contained selects are also caught up in a long-running transaction.
I'm wondering if it's because we have an old version of NHibernate (it was from trunk, very near 2.0).
We just can't get the expected NHibernate behaviour, and performance sucks (using NHProf and SQL Profiler to monitor db locks).
Here's what we have tried since:
Written our own TransactionScope (inherits from ITransactionScope) that:
Opens an ActiveRecord.TransactionScope on Commit, not in the ctor (delays the transaction until needed)
Opens a 'SessionScope' in the ctor if none are available (as a guard)
Converted our ids to Guid from identity
This stopped the auto flush of insert/update outside of the Transaction (!)
Now we have the following application behaviour:
Request from MVC
SELECTs needed by services are fired, all outside a transaction
Repository.Add calls do not hit the db until scope.Commit is called in our Controllers
All INSERTs / UPDATEs occur wrapped inside a transaction as an atomic unit, with no SELECTs contained.
... But for some reason NHProf now != SQL Profiler (selects seem to happen in the db before NHProf reports them).
NOTE
Before I get flamed I realise the issues here, and know that the SELECTs aren't in the Transaction. That's the design. Some of our operations will contain the SELECTs (we now have a couple of our own TransactionScope implementations) in serialised transactions. The vast majority of our code does not need up-to-the-minute live data, and we have serialised workloads with individual operators.
ALSO
If anyone knows how to get an identity column (non-PK) refreshed post-insert without needing to manually refresh the entity, and in particular by using ActiveRecord markup (I think it's possible in NHibernate mapping files using a 'generated' attribute), please let me know!!

Undo in a transactional database

I do not know how to implement the undo feature of user-friendly interfaces on top of a transactional database.
On one hand, it is advisable that the user have multilevel (infinite) undo, as stated here in the answer. Patterns that may help with this problem are Memento and Command.
However, with a complex database involving triggers, ever-growing sequence numbers, and non-invertible procedures, it is hard to imagine how an undo action could work at points other than transaction boundaries.
In other words, undo to a point when the transaction committed for the last time is simply a rollback, but how is it possible to go back to different moments?
UPDATE (based on the answers so far): I do not necessarily need the undo to work after the modification has been committed; I would focus on a running application with an open transaction. Whenever the user clicks save it means a commit, but before save - during the same transaction - the undo should work. I know that using the database as a persistence layer is just an implementation detail and the user should not have to care about it. But if we take the view that "The idea of undo in a database and in a GUI are fundamentally different things" and we do not use undo with a database, then infinite undo is just a buzzword.
I know that "rollback is ... not a user-undo".
So how to implement a client-level undo given the "cascading effects as the result of any change" inside the same transaction?
The idea of undo in a database and in a GUI are fundamentally different things; the GUI is going to be a single user application, with low levels of interaction with other components; a database is a multi-user application where changes can have cascading effects as the result of any change.
The thing to do is allow the user to try and apply the previous state as a new transaction, which may or may not work; or alternatively just don't have undos after a commit (similar to no undos after a save, which is an option in many applications).
Some (all?) DBMSs support savepoints, which allow partial rollbacks:
savepoint s1;
insert into mytable (id) values (1);
savepoint s2;
insert into mytable (id) values (2);
savepoint s3;
insert into mytable (id) values (3);
rollback to s2;
commit;
In the above example, only the first insert would remain, the other two would have been undone.
I don't think it is practical in general to attempt undo after commit, for the reasons you gave and probably others. If it is essential in some scenario then you will have to build a lot of code to do it, and take into account the effects of triggers etc.
I don't see any problem with ever-increasing sequences though?
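The same idea is reachable from .NET code; here is a minimal sketch using ADO.NET's SqlTransaction savepoints (the connection string is a placeholder, and mytable mirrors the SQL above):

using System.Data.SqlClient;

using (var connection = new SqlConnection("...your connection string..."))
{
    connection.Open();
    using (SqlTransaction transaction = connection.BeginTransaction())
    {
        void Insert(int id)
        {
            using (var cmd = new SqlCommand(
                "insert into mytable (id) values (@id)", connection, transaction))
            {
                cmd.Parameters.AddWithValue("@id", id);
                cmd.ExecuteNonQuery();
            }
        }

        Insert(1);
        transaction.Save("s2");     // savepoint after the first insert
        Insert(2);
        Insert(3);

        transaction.Rollback("s2"); // undoes inserts 2 and 3 only
        transaction.Commit();       // insert 1 is persisted
    }
}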
We developed such a capability in our database by keeping track of all transactions applied to the data (not really all, just the ones less than 3 months old). The basic idea was to be able to see who did what and when. Each database record, uniquely identified by its GUID, can then be considered the result of one INSERT, multiple UPDATE statements, and finally one DELETE statement. As we keep track of all these SQL statements, and as INSERTs are full INSERTs (every field's value is kept in the INSERT statement), it is then possible to:
Know who modified which field and when (Paul inserted a new line in the proforma invoice, Bill renegotiated the unit price of the item, Pat modified the final ordered quantity, etc.)
'undo' all previous transactions with the following rules:
an 'INSERT' undo is a 'DELETE' based on the unique identifier
an 'UPDATE' undo is equivalent to the previous 'UPDATE'
a 'DELETE' undo is equivalent to the first INSERT followed by all updates
As we do not keep track of transactions older than 3 months, UNDOs are not always available.
Access to these functionalities is strictly limited to database managers only, as other users are not allowed to do any data updates outside of the business rules (example: what would be the meaning of an 'undo' on a Purchase Order line once the Purchase Order has been agreed to by the supplier?). To tell you the truth, we use this option very rarely (a few times a year?).
It's nearly the same as William's post (which I actually voted up), but I'll try to explain in a little more detail why it is necessary to implement a user undo (vs. using a database rollback).
It would be helpful to know more about your application, but I think for a user(-friendly) Undo/Redo the database is not the adequate layer to implement the feature.
The user wants to undo the actions he did, independent of whether these led to zero, one, or more database transactions
The user wants to undo the actions HE did (not anyone else's)
The database, from my point of view, is an implementation detail, a tool you use as a programmer for storing data. The rollback is a kind of undo that helps YOU in doing so; it's not a user undo. Using the rollback would mean involving the user in things he doesn't want to know about and does not understand (and doesn't have to), which is never a good idea.
As William posted, you need an implementation inside the client or on the server side as part of a session, which tracks the steps of what you define as a user transaction and is able to undo them. If database transactions were made during those user transactions, you need other database transactions to revoke them (if possible). Make sure to give useful feedback if the undo is not possible, which again means explaining it business-wise, not database-wise.
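A rough sketch of that kind of client-side tracking, using the Command pattern mentioned in the question (all names here are illustrative):

using System.Collections.Generic;

// Each user-level action knows how to apply itself and how to revoke itself.
public interface IUserCommand
{
    void Execute();
    void Undo();
}

public class UndoStack
{
    private readonly Stack<IUserCommand> _done = new Stack<IUserCommand>();

    public void Run(IUserCommand command)
    {
        command.Execute();   // e.g. issue the corresponding database statements
        _done.Push(command);
    }

    public void UndoLast()
    {
        if (_done.Count == 0) return;
        _done.Pop().Undo();  // e.g. issue the compensating database statements
    }
}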
To maintain arbitrary rollback-to-previous semantics you will need to implement logical deletion in your database. This works as follows:
Each record has a 'Deleted' flag, a version number and/or a 'Current Indicator' flag, depending on how clever you need your reconstruction to be. In addition, you need a per-entity key across all versions of that entity so you know which records actually refer to which specific entity. If you need to know the times a version was applicable, you can also have 'From' and 'To' columns.
When you delete a record, you mark it as 'deleted'. When you change it, you create a new row and update the old row to reflect its obsolescence. With the version number, you can just find the previous number to roll back to.
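A minimal sketch of those rules, assuming a simple versioned row as described above (all names hypothetical):

using System;

public class VersionedRow
{
    public Guid EntityKey { get; set; }   // same across all versions of the entity
    public int Version { get; set; }
    public bool Deleted { get; set; }
    public bool IsCurrent { get; set; }
    public string Payload { get; set; }   // the actual data, simplified to one field
}

public static class VersionedUpdates
{
    // "Changing" a record means closing the current row and inserting a successor.
    public static VersionedRow ApplyChange(VersionedRow current, string newPayload)
    {
        current.IsCurrent = false;        // the old row stays behind for rollback

        return new VersionedRow
        {
            EntityKey = current.EntityKey,
            Version = current.Version + 1,
            Deleted = false,
            IsCurrent = true,
            Payload = newPayload
        };
    }

    // "Deleting" just marks the current row; nothing is physically removed.
    public static void ApplyDelete(VersionedRow current)
    {
        current.Deleted = true;
    }
}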
If you need referential integrity (which you probably do, even if you think you don't) and can cope with the extra I/O you should also have a parent table with the key as a placeholder for the record across all versions.
On Oracle, clustered tables are useful for this; both the parent and version tables can be co-located, minimising the overhead for the I/O. On SQL Server, a covering index (possibly clustered on the entity key) including the key will reduce the extra I/O.