Is there a difference between Session.Save and Transaction.Commit?
When should I use which?
It seems that sometimes Session.Save must be used in conjunction with Transaction.Commit, and sometimes not. Can anyone tell me why this is so?
They're different: Session.Save saves a single object, while Transaction.Commit commits a batch of work (multiple Gets, Loads, Saves, Updates, etc.).
You'll want to use both. Here's a quick explanation with a link for more reading. The NHibernate documentation says the following:
In an ISession, every database operation occurs inside a transaction
that isolates the database operations (even read-only operations).
If you don't explicitly define your transaction, one will be created implicitly every time you read from or write to the database. Not very efficient. So even if you're just reading, you'll want to put everything inside a transaction and commit the transaction when you're done. Ayende Rahien explains further in this blog post.
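As a concrete illustration, here is a minimal sketch of that pattern (the sessionFactory wiring and the Customer entity are assumptions for the example, not from the original post):

    // Sketch of explicit transaction use; sessionFactory and Customer are
    // illustrative assumptions.
    using (ISession session = sessionFactory.OpenSession())
    using (ITransaction tx = session.BeginTransaction())
    {
        var customer = session.Get<Customer>(42);          // even reads run inside the transaction
        customer.Name = "Renamed";                         // dirty-checked, flushed on commit
        session.Save(new Customer { Name = "Brand new" }); // Save registers the new object
        tx.Commit(); // Commit flushes the session and commits all the work as one unit
    }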
When you look at some code samples out there, it may seem like people aren't using transactions, but they may just be beginning/committing the transaction outside of the code you're looking at. In my ASP.NET MVC app, for example, I use an action filter (TransactionAttribute) to handle the transaction outside of my controller actions.
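For readers who want the shape of such a filter, a hedged sketch (how the request's current ISession is located, SessionManager.Current here, is an assumption; only the begin/commit-or-rollback pattern matters):

    // Hedged sketch of a transaction-per-action filter for ASP.NET MVC.
    public class TransactionAttribute : ActionFilterAttribute
    {
        public override void OnActionExecuting(ActionExecutingContext filterContext)
        {
            // SessionManager.Current is an illustrative accessor for the request's ISession.
            SessionManager.Current.BeginTransaction();
        }

        public override void OnActionExecuted(ActionExecutedContext filterContext)
        {
            var tx = SessionManager.Current.Transaction;
            if (filterContext.Exception == null && tx.IsActive)
                tx.Commit();   // action succeeded: commit the unit of work
            else if (tx.IsActive)
                tx.Rollback(); // action threw: undo everything
        }
    }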
Our users go through several steps of a workflow; the further they go, the more objects we create. We also allow users to go back to Step #1 and change one of the existing objects, which may cause inconsistencies, so we must update/delete some of the objects at Step #2. I see two options:
Update/delete objects from Step#2 right away. This leads to:
An operation that's supposed to be a simple PATCH of an entity field becomes complicated. And since the object is shared between multiple workflows, we'd have to add if-statements and do different things depending on the workflow.
Circular dependencies. Operations on Step#1 have to know about objects/operations on Step#2.
On each request in Step #1 we'd have to load data for Step #2 in order to determine whether Step #2 really needs to be updated, which slows down operations on Step #1. So to change one record in the DB we'd have to load hundreds (or even thousands) of records for Step #2.
Many actions on Step #1 may need to fix state at Step #2, so we have to ensure we don't forget anything, today and in the future.
Fix Step #2 lazily, when the user goes there (our current approach). Step #2 will recognize that objects are inconsistent and fix them. This leaves just one place where we need to care, but:
Until the user opens Step #2, the DB will contain inconsistent objects. This hasn't caused any problems so far, but I can imagine it complicating future SQL migrations.
We update DB state on a GET request. This doesn't seem like that big of a deal since the GET stays idempotent anyway, but it still feels awkward.
Does anyone know a better approach? Or maybe improvements to these two?
Update
I haven't found a perfect solution, but eventually we implemented an improved version of option 1. When updating state on Step #1 we also set a "needs to rebuild Step #2" flag; when the UI opens Step #2 it first checks this flag and issues a PUT to rebuild the state, and only then does it GET Step #2.
This still means that the DB state is inconsistent for some period of time, but at least we'll know this for sure from the flag in the DB, and if needed we could write migrations that take this flag into account. It also allows us (if needed in the future) to create an async job to fix the state.
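To make the idea concrete, a rough sketch of the rebuild endpoint (WorkflowState, NeedsStep2Rebuild, and step2Builder are illustrative names, not our real code):

    // Hedged sketch of the "needs rebuild" flag approach.
    [HttpPut]
    public ActionResult RebuildStep2(int workflowId)
    {
        var state = repository.Get<WorkflowState>(workflowId);
        if (state.NeedsStep2Rebuild)
        {
            step2Builder.Rebuild(state);     // fix the inconsistent Step #2 objects
            state.NeedsStep2Rebuild = false; // the DB is consistent again
            repository.Save(state);
        }
        return new HttpStatusCodeResult(204); // the UI GETs Step #2 only after this succeeds
    }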
I think it is more flexible to separate the workflow state from the context in which the objects are stored. Creating a new object at any step is then accompanied by preserving the invariants and consistency of the context.
There are separate rules for states (rules for transitioning from one state to another, and which objects are available for creation) and separate rules for the context (rules for its consistency, which are enforced every time it changes).
What about asynchronous cleanup of dirty data?
1. Whenever the user goes back to Step #1 and changes something, mark all related data as "dirty" (e.g. add links to it in a "DirtyData" table) and be done for now.
2. Have a DataCleanup worker (e.g. a separate thread or something similar) that constantly looks for data to be cleaned up.
3. Before editing data for Step #2, check that the data is not dirty.
Depending on your logic, step 3 might result in a user-facing error (e.g. the user would need to repeat Step #2). If the DataCleanup worker has enough resources (i.e. it processes the DirtyData table almost instantaneously), that should happen only on very rare occasions. If that is not OK, you could opt for checking for dirty data on every fetch, but that could be expensive.
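A minimal sketch of such a worker, assuming a DirtyDataEntry entity mapped to the DirtyData table (all names here are illustrative):

    // Hedged sketch of the DataCleanup worker; CleanUp stands in for the
    // application-specific reconciliation logic.
    public class DataCleanupWorker
    {
        private readonly ISessionFactory sessionFactory;

        public DataCleanupWorker(ISessionFactory sessionFactory)
        {
            this.sessionFactory = sessionFactory;
        }

        public void Run(CancellationToken token)
        {
            while (!token.IsCancellationRequested)
            {
                using (var session = sessionFactory.OpenSession())
                using (var tx = session.BeginTransaction())
                {
                    var batch = session.CreateCriteria<DirtyDataEntry>()
                                       .SetMaxResults(100)
                                       .List<DirtyDataEntry>();
                    foreach (var entry in batch)
                    {
                        CleanUp(session, entry); // fix the data the entry points at
                        session.Delete(entry);   // it is no longer dirty
                    }
                    tx.Commit();
                }
                Thread.Sleep(TimeSpan.FromSeconds(5)); // simple polling interval
            }
        }

        private void CleanUp(ISession session, DirtyDataEntry entry) { /* app-specific */ }
    }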
It sounds like you're familiar with the HTTP spec regarding GET requests, but for future readers:
Why shouldn't a GET request change data on the server?
Why is using a HTTP GET to update state on the server in a RESTful call incorrect?
For the other bullet under 2, we probably don't need a specification to agree that persisting valid data is preferable to persisting invalid data.
So what can we do for the bullets under 1 to avoid complex branching logic in a particular step and also circular dependencies? My suggestion is an event-driven design. When step #2 changes it should fire a change event. In this scenario, step #2 has no knowledge of the concrete listener(s) who may receive its events, so it remains decoupled from any complex handling logic.
There's probably no way to guarantee you don't forget anything in the future; but if every step in the workflow is defined as a listener, it forces you to consider change events to some extent every time you implement a new step.
One side note on granularity: if a step has many changes, it can batch up its events rather than fire each one individually. You can adjust the batch size for efficiency.
In summary, I would strongly consider the Observer design pattern.
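To sketch what that could look like with plain C# events (the step and event names are illustrative, not from the question):

    // Hedged sketch of the event-driven / Observer idea.
    public class StepChangedEventArgs : EventArgs
    {
        public IReadOnlyList<object> ChangedObjects { get; set; } // batched changes
    }

    public class StepOne
    {
        public event EventHandler<StepChangedEventArgs> Changed;

        public void ApplyUserEdit(object edited)
        {
            // ... persist the Step #1 change ...
            // Fire one batched event; StepOne knows nothing about its listeners.
            Changed?.Invoke(this, new StepChangedEventArgs { ChangedObjects = new[] { edited } });
        }
    }

    public class StepTwo
    {
        public void Subscribe(StepOne stepOne) => stepOne.Changed += OnUpstreamChanged;

        private void OnUpstreamChanged(object sender, StepChangedEventArgs e)
        {
            // Reconcile Step #2's objects with the upstream change.
        }
    }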
Is it good practice to nest two transactions? For example, wrapping my NHibernate transaction in a TransactionScope for the benefit of tests (making sure that the DB rolls back all the changes that were made in the test).
The other option is to keep the entities that I insert into the DB in memory and delete them at the end of the test.
Which one is better?
First of all, NHibernate doesn't support nested transactions!
TransactionScope, on the other hand, will not create a new transaction if one is already open. If you use only a TransactionScope, it will create a new transaction for the connection.
If you then open a transaction within the scope, this will still work with NHibernate.
Back to your question: it pretty much depends on the number of objects you create within the TransactionScope. If it becomes too many, you will simply spam the transaction log of your database. Apart from that, the concept is perfectly fine, I would say.
One important thing to mention: if you use TransactionScope and create multiple sessions/transactions with NHibernate, the scope might escalate to a distributed transaction, which requires MSDTC to be running on the target server; otherwise it will simply fail.
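Putting the pieces together, a hedged sketch of the test pattern (assuming the session's connection enlists in the ambient System.Transactions transaction; sessionFactory and Customer are illustrative):

    // Hedged sketch: the outer TransactionScope rolls the test back, while the
    // inner NHibernate transaction still works as usual.
    [TestFixture]
    public class RepositoryTests
    {
        private ISessionFactory sessionFactory; // assumed configured elsewhere
        private TransactionScope scope;
        private ISession session;

        [SetUp]
        public void SetUp()
        {
            scope = new TransactionScope();         // ambient transaction for the test
            session = sessionFactory.OpenSession(); // its connection enlists in the scope
        }

        [Test]
        public void Insert_Works()
        {
            using (var tx = session.BeginTransaction()) // NHibernate transaction inside the scope
            {
                session.Save(new Customer { Name = "test" });
                tx.Commit(); // commits the inner work; the outer scope still decides the outcome
            }
        }

        [TearDown]
        public void TearDown()
        {
            session.Dispose();
            scope.Dispose(); // Complete() is never called, so everything rolls back
        }
    }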
We're building a large application with NHibernate as the ORM layer. We've tried to apply as many best practices as possible, among them setting FlushMode to Never. However, this is giving us pain; for example, in the following scenario:
There is a table with an end date column. From this table, we delete the last (by end date) record:
The record is deleted;
After the delete, we do a (repository) query for the last record (by end date);
This last record is updated because it's the new active record.
This is a very simple scenario, of which many exist. The problem here is that when we do the query, we get the deleted record back, which of course is not correct. This roughly means that we cannot run queries in business logic that may touch an entity being inserted or deleted, because it is, respectively, not there yet or still there.
How can I work with this scenario? Are there ways to work around it without reverting the FlushMode setting, or should I just give up on the FlushMode setting altogether?
How can I work with this scenario? Are there ways to work around this
without reverting the FlushMode setting
FlushMode.Never does not prevent you from manually calling Flush() when you want to deal with up-to-date data. I guess that is the way to handle this scenario without changing the FlushMode.
or should I just give up on the FlushMode setting altogether?
Could you provide some reference for FlushMode.Never being a good practice in the general case? It seems FlushMode.Never is a fit when dealing with large, mostly read-only sets of objects.
http://jroller.com/tfenne/entry/hibernate_understand_flushmode_never
FlushMode.Never is a best practice only when you absolutely require fine-grained control. FlushMode.Auto will cover 99.99% of cases without a problem. That said, decorating your CUD operations with an ISession.Flush() will not hurt, as it only incurs a database round trip if there are any CUD actions in the internal action queue.
FlushMode.Never means NHibernate will never flush the session; it's up to you to do that. So session.Delete() will not actually delete the record from the database, it just marks the object for deletion in the session's cache. You can force a flush by calling session.Flush() after calling session.Delete().
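In the asker's scenario that looks roughly like this (Record and its properties are illustrative names):

    // Hedged sketch: delete, flush manually, then requery under FlushMode.Never.
    using (var tx = session.BeginTransaction())
    {
        var last = session.CreateCriteria<Record>()
                          .AddOrder(Order.Desc("EndDate"))
                          .SetMaxResults(1)
                          .UniqueResult<Record>();

        session.Delete(last);
        session.Flush(); // push the delete to the database now

        // This query no longer returns the deleted record.
        var newLast = session.CreateCriteria<Record>()
                             .AddOrder(Order.Desc("EndDate"))
                             .SetMaxResults(1)
                             .UniqueResult<Record>();
        newLast.IsActive = true; // promote the new active record

        session.Flush();
        tx.Commit();
    }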
That said, I think Auto is the better option: with Auto, NHibernate will flush the session automatically before querying for data.
In the comments on Ayende's blog post about auditing in NHibernate there is a mention of the need to use a child session: session.GetSession(EntityMode.Poco).
As far as I understand it, this has something to do with the order of the SQL operations that session.Flush will emit. (For example: if I wanted to perform some delete operation in the pre-insert event but the session was already done with delete operations, I would need some way to inject them in.)
However I did not find documentation about this feature and behavior.
Questions:
Is my understanding of child sessions correct?
How and in which scenarios should I use them?
Are they documented somewhere?
Could they be used for session "scoping"?
(For example: I open a master session which will hold some data, and then I create two child sessions from the master one. I'd expect the two child scopes to be separate, but that they will share objects from the master session's cache. Is this the case?)
Are they first-class citizens in NHibernate, or are they just a hack to support some edge-case scenarios?
Thanks in advance for any info.
Stefando,
NHibernate has no knowledge of child sessions; you can reuse an existing one or open a new one.
For instance, you will get an exception if you try to load the same entity into two different sessions.
The reason it is mentioned in the blog is that in pre-update and pre-insert you cannot load more objects into the session: you can change an already loaded instance, but you may not, for instance, navigate to a relationship property.
So in the blog a new session is needed just because we want to add a new audit-log entity. In the end it's the transaction (unit of work) that manages the data.
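For reference, a hedged sketch of that trick as a pre-insert listener (the AuditLog entity and its properties are illustrative; the GetSession(EntityMode.Poco) call is the one mentioned in the blog comments):

    // Hedged sketch: save an audit row through a child session, because the
    // parent session must not load/save new entities inside the event.
    public class AuditEventListener : IPreInsertEventListener
    {
        public bool OnPreInsert(PreInsertEvent @event)
        {
            using (var child = @event.Session.GetSession(EntityMode.Poco))
            {
                child.Save(new AuditLog
                {
                    EntityName = @event.Entity.GetType().Name,
                    At = DateTime.UtcNow
                });
                child.Flush(); // the child shares the parent's connection and transaction
            }
            return false; // do not veto the insert
        }
    }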
This is a common question, but the explanations found so far and observed behaviour are some way apart.
We want the following NHibernate strategy in our MVC website:
A SessionScope for the request (to track changes)
An ActiveRecord.TransactionScope to wrap our inserts only (to enable rollback/commit of the batch)
Selects to be outside a transaction (to reduce the extent of locks)
Delayed flush of inserts (so that our inserts/updates occur as a unit of work at the end of the session)
Now currently we:
Don't get the implied transaction from the SessionScope (with FlushAction Auto or Never)
If we use ActiveRecord.TransactionScope there is no delayed flush and any contained selects are also caught up in a long-running transaction.
I'm wondering if it's because we have an old version of NHibernate (it was from trunk, very near 2.0).
We just can't get the expected NHibernate behaviour, and performance sucks (we're using NHProf and SqlProfiler to monitor DB locks).
Here's what we have tried since:
Written our own TransactionScope (inherits from ITransactionScope) that:
Opens an ActiveRecord.TransactionScope on Commit, not in the ctor (delays the transaction until needed)
Opens a 'SessionScope' in the ctor if none are available (as a guard)
Converted our IDs from identity to Guid
This stopped the auto flush of insert/update outside of the Transaction (!)
Now we have the following application behaviour:
Request from MVC
SELECTs needed by services are fired, all outside a transaction
Repository.Add calls do not hit the DB until scope.Commit is called in our controllers
All INSERTs / UPDATEs occur wrapped inside a transaction as an atomic unit, with no SELECTs contained.
... But for some reason NHProf now disagrees with SqlProfiler (SELECTs seem to happen in the DB before NHProf reports them).
NOTE
Before I get flamed: I realise the issues here, and know that the SELECTs aren't in the transaction. That's the design. Some of our operations will contain the SELECTs in serialised transactions (we now have a couple of our own TransactionScope implementations). The vast majority of our code does not need up-to-the-minute live data, and we have serialised workloads with individual operators.
ALSO
If anyone knows how to get an identity column (non-PK) refreshed post-insert without needing to manually refresh the entity, and in particular by using ActiveRecord markup (I think it's possible in NHibernate mapping files using a 'generated' attribute), please let me know!
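For the NHibernate-mapping half of that question, a hedged sketch of what I believe the 'generated' attribute looks like in an hbm.xml file (the property name is illustrative; I'm not sure whether ActiveRecord attributes expose an equivalent):

    <!-- Hedged sketch: tell NHibernate the column is DB-generated on insert,
         so it re-reads the value after inserting instead of writing it. -->
    <property name="LegacyNumber" generated="insert" insert="false" update="false" />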