We are trying to implement retry logic to recover from transient errors in Azure environment.
We are using long-running sessions to keep track and commit the whole bunch of changes at the end of application transaction (which may spread over several web-requests). Along the way we need to get additional data from database. Our main problem is that we can't easily recover from db error because we can't "replay" all user actions.
So far we used straightforward recovery algorithm:
Try to perform operation in long-running session
In case of error, close the session, open a new one and merge entities into it
Retry the operation
It's very expensive approach in terms of time (merge is really long for big entity hierarchies). So we'd like to optimize things a little.
We'd like to perform query operations in separate session (to keep long running one untouched and safe) and on success, merge results back to the long-running session. Retry is relatively simple here - we just need to open new session and run query once more. However, with this approach we have an issue with initializing lazy properties/collections:
If we do this in separate session, we need to merge results back (a lot of entities) but merge could fail and break the long-running session
We tried different ways of "moving" original entity to different session, loading details and returning it back, but without success (evict, replicate, etc.)
There is known statement that session should be discarded in case of exception. However, the example shows write operation. Is it still true for read ones? I mean if I guarantee that no data is written back to the database, can I reuse the same session to run query again?
Do you have any other suggestions about retry logic with long-running sessions?
IMO there's no way to solve your issue. It's gonna take a lot of time to commit everything or you will have to do a lot of work to break it up into smaller sessions and handle every error that can occur while merging.
To answer your question about using the session after an exception: you cannot trust ANYTHING anymore inside this session, not even loaded entities.
Read this paragraph from Ayende's article about building a simple todo app with a recoveryplan in case of an exception in the session:
Then there is the problem of error handling. If you get an exception
(such as StaleObjectStateException, because of concurrency conflict),
your session and its loaded entities are toast, because with
NHibernate, an exception thrown from a session moves that session into
an undefined state. You can no longer use that session or any loaded
entities. If you have only a single global session, it means that you
probably need to restart the application, which is probably not a good
idea.
Related
I am trying to implement downloading of bulk data from several tables on the server.
In my case there are 16 tables. For all these tables I will be firing 10 requests to the server. This means I have done a bit of logical groupings for related tables, but it is like all tables are inter-related with each other through one or the other relationship.
I need to consider three cases while doing downloading:
Saving data to each table at local.
Managing relationships between inserted objects.
Handling situation when one of the requests fails during download, say 8th request failed.
I will be following this approach for each response:
Inserting data in managed object context.
Managing relationships by firing NSPredicate and associating the related objects.
Saving the context.
In case of a response failure, I have two options:
Next time continue from the failed response.
Revert all saved data to its previous state.
1st approach may lead to some data inconsistency, so I am going with 2nd approach.
I know that if a managed object context is not saved, we can revert the changes, but
is it possible to revert the changes, if the managed object context is
saved?
I require some useful answers from the community.
Please suggest.
Is it possible to revert the changes, if the managed object context is saved?
After saving? Maybe, but it could be tricky. If you set up a separate managed object context for your network operations, and give it an NSUndoManager, you could later on tell the undo manager to roll everything back to the previous state.
It would be simpler to just not save changes until you're finished, though. Using an undo manager doesn't really help much-- the memory needed to store up all the undo actions will at least match the memory use from keeping all of the unsaved changes around until you're finished. If you're working on a separate managed object context (whether a child context or a completely separate context), handling the error case is as simple as letting the MOC get deallocated without saving changes first.
I have been having deadlock issues. I've been on working some retry approaches. My retry code is currently just a 'for' statement that tries 5 times. I understand i need to use the 'Evit' nhibernate method to clear the session. I am using a session factory and use a transaction for each request.
In the below example if i experience a deadlock on the first retry will the orderNote property remain the same on the second retry?
private ActionResult OrderDetails(int id)
{
var order = _orderRepository.Get(id);
order.OrderNote = "will this text remain";
Retry.Times(5).Do(() => _orderRepository.Update(order));
return View();
}
Edit
1) Finding it hard to trace the cause. I'm getting about 10 locks a day all over my application. Just set up a profiler. Are there any other useful methods for tracing
http://msdn.microsoft.com/en-us/library/ms190465.aspx
I think the main issue is that i'm using auto increament. I'm in the process of moving to hilo.
2) Using a different transation mode. I'm not defining any at the moment. What is recommended.
5) Long running operations. Yes i do. And i think because i'm using auto increament lazy loading is ignored. Does that sound correct?
In my opinion your code is trying to fix the symptoms instead of the cause.
You will be better off doing some of the following things:
Find out why you are getting deadlocks and fix the core issue
Use a different transaction mode to read past locks
Look at delegating the update into a queue structure to be background processed
Understand the update execution plan and perhaps add indexing to speed up queries
Do you have any "long" running operations in your Controller action which is keeping the transaction open for longer than it should be?
Even if the operation did deadlock, why don't you return an friendly error back to the calling page and let them manually retry.
Update:
1.) Useful methods for tracing
I have used this method for tracing deadlocks which should give you an idea of the resources which are in contention: Tracing Deadlocks
You can also look at the concurreny models available to you: NHibernate Concurrency
2.) Transaction Isolation Levels
Depending on your DB this Question has some useful information: Transaction Isolation Mode
3.) Long Running Operations
I have to use Identity Columns as my primary keys in NHibernate and I don't think these are going to be source of your problem in an update scenario as the Id/PK is already set by this point. Try to minimise the long running operations which will shorten the amount of time your transaction is held open.
We use one (read-only) session which we disconnect as soon as we retrieve the data from the database. The data retrieved, often has lazy-loaded properties which are not initialized yet.
When we try to access the properties, the following exception gets thrown:
NHibernate.LazyInitializationException
Initializing[NHibernateTest.AppUser#16]-failed to lazily initialize a collection of role: NHibernateTest.AppUser.Permissions, session is disconnected
Is there a way (interceptor) to automatically detect that the application is trying to access an uninitialized property, so that the interceptor can quickly open the connection and close it after the unit of work?
Fetching everything at once would nullify the usage of laziness.
There is no efficient way to do that. The idea is that you keep the session open until your done with the session. There should be one session per unit of work. (a session is kind of unit of work actually).
Fetching everything your need in one query is more efficient than fetching everything you need in multiple queries, so I don't agree with your last statement. Lazy loading is useful for lazy programmers (like me) but is never more efficient than eager loading. Lazy loading can save you some programming time, but you still have to watch out for to many queries being executed (select N+1)
This is a common question, but the explanations found so far and observed behaviour are some way apart.
We have want the following nHibernate strategy in our MVC website:
A SessionScope for the request (to track changes)
An ActiveRecord.TransactonScope to wrap our inserts only (to enable rollback/commit of batch)
Selects to be outside a Transaction (to reduce extent of locks)
Delayed Flush of inserts (so that our insert/updates occur as a UoW at end of session)
Now currently we:
Don't get the implied transaction from the SessionScope (with FlushAction Auto or Never)
If we use ActiveRecord.TransactionScope there is no delayed flush and any contained selects are also caught up in a long-running transaction.
I'm wondering if it's because we have an old version of nHibernate (it was from trunk very near 2.0).
We just can't get the expected nHibernate behaviour, and performance sucks (using NHProf and SqlProfiler to monitor db locks).
Here's what we have tried since:
Written our own TransactionScope (inherits from ITransactionScope) that:
Opens a ActiveRecord.TransactionScope on the Commit, not in the ctor (delays transaction until needed)
Opens a 'SessionScope' in the ctor if none are available (as a guard)
Converted our ids to Guid from identity
This stopped the auto flush of insert/update outside of the Transaction (!)
Now we have the following application behaviour:
Request from MVC
SELECTs needed by services are fired, all outside a transaction
Repository.Add calls do not hit the db until scope.Commit is called in our Controllers
All INSERTs / UPDATEs occur wrapped inside a transaction as an atomic unit, with no SELECTs contained.
... But for some reason nHProf now != sqlProfiler (selects seems to happen in the db before nHProf reports it).
NOTE
Before I get flamed I realise the issues here, and know that the SELECTs aren't in the Transaction. That's the design. Some of our operations will contain the SELECTs (we now have a couple of our own TransactionScope implementations) in serialised transactions. The vast majority of our code does not need up-to-the-minute live data, and we have serialised workloads with individual operators.
ALSO
If anyone knows how to get an identity column (non-PK) refreshed post-insert without a need to manually refresh the entity, and in particular by using ActiveRecord markup (I think it's possible in nHibernate mapping files using a 'generated' attribute) please let me know!!
Do you know of any ORM tool that offers deadlock recovery? I know deadlocks are a bad thing but sometimes any system will suffer from it given the right amount of load. In Sql Server, the deadlock message says "Rerun the transaction" so I would suspect that rerunning a deadlock statement is a desirable feature on ORM's.
I don't know of any special ORM tool support for automatically rerunning transactions that failed because of deadlocks. However I don't think that a ORM makes dealing with locking/deadlocking issues very different. Firstly, you should analyze the root cause for your deadlocks, then redesign your transactions and queries in a way that deadlocks are avoided or at least reduced. There are lots of options for improvement, like choosing the right isolation level for (parts) of your transactions, using lock hints etc. This depends much more on your database system then on your ORM. Of course it helps if your ORM allows you to use stored procedures for some fine-tuned command etc.
If this doesn't help to avoid deadlocks completely, or you don't have the time to implement and test the real fix now, of course you could simply place a try/catch around your save/commit/persist or whatever call, check catched exceptions if they indicate that the failed transaction is a "deadlock victim", and then simply recall save/commit/persist after a few seconds sleeping. Waiting a few seconds is a good idea since deadlocks are often an indication that there is a temporary peak of transactions competing for the same resources, and rerunning the same transaction quickly again and again would probably make things even worse.
For the same reason you probably would wont to make sure that you only try once to rerun the same transaction.
In a real world scenario we once implemented this kind of workaround, and about 80% of the "deadlock victims" succeeded on the second go. But I strongly recommend to digg deeper to fix the actual reason for the deadlocking, because these problems usually increase exponentially with the number of users. Hope that helps.
Deadlocks are to be expected, and SQL Server seems to be worse off in this front than other database servers. First, you should try to minimize your deadlocks. Try using the SQL Server Profiler to figure out why its happening and what you can do about it. Next, configure your ORM to not read after making an update in the same transaction, if possible. Finally, after you've done that, if you happen to use Spring and Hibernate together, you can put in an interceptor to watch for this situation. Extend MethodInterceptor and place it in your Spring bean under interceptorNames. When the interceptor is run, use invocation.proceed() to execute the transaction. Catch any exceptions, and define a number of times you want to retry.
An o/r mapper can't detect this, as the deadlock is always occuring inside the DBMS, which could be caused by locks set by other threads or other apps even.
To be sure a piece of code doesn't create a deadlock, always use these rules:
- do fetching outside the transaction. So first fetch, then perform processing then perform DML statements like insert, delete and update
- every action inside a method or series of methods which contain / work with a transaction have to use the same connection to the database. This is required because for example write locks are ignored by statements executed over the same connection (as that same connection set the locks ;)).
Often, deadlocks occur because either code fetches data inside a transaction which causes a NEW connection to be opened (which has to wait for locks) or uses different connections for the statements in a transaction.
I had a quick look (no doubt you have too) and couldn't find anything suggesting that hibernate at least offers this. This is probably because ORMs consider this outside of the scope of the problem they are trying to solve.
If you are having issues with deadlocks certainly follow some of the suggestions posted here to try and resolve them. After that you just need to make sure all your database access code gets wrapped with something which can detect a deadlock and retry the transaction.
One system I worked on was based on “commands” that were then committed to the database when the user pressed save, it worked like this:
While(true)
start a database transaction
Foreach command to process
read data the command need into objects
update the object by calling the command.run method
EndForeach
Save the objects to the database
If not deadlock
commit the database transaction
we are done
Else
abort the database transaction
log deadlock and try again
EndIf
EndWhile
You may be able to do something like with any ORM; we used an in house data access system, as ORM were too new at the time.
We run the commands outside of a transaction while the user was interacting with the system. Then rerun them as above (when you use did a "save") to cope with changes other people have made. As we already had a good ideal of the rows the command would change, we could even use locking hints or “select for update” to take out all the write locks we needed at the start of the transaction. (We shorted the set of rows to be updated to reduce the number of deadlocks even more)