I am currently facing transaction deadlock issues in Nhibernate data layer. The scenario is :
I have a large transaction table T1. This table undergoes frequent write/update operations. Also, a service frequently reads this table(based on filter) and merges the data with client cache(on client machine). The frequency of read is very high.
At times (does not follow pattern) , the deadlock issue surfaces.
How can I trace this issue?(I have dbo role). Is there any Nhibernate setting that can help in this context?
As your service only reads the table, you could change your application to use an optimistic lock approach.
More about optimistic locking with NH: http://www.google.com.br/url?sa=t&source=web&cd=5&ved=0CDMQFjAE&url=http%3A%2F%2Fknol.google.com%2Fk%2Ffabio-maulo%2Fnhibernate-chapter-5-basic-o-r-mapping%2F1nr4enxv3dpeq%2F8&ei=biB9TLjMNIL88AbQ4smNBw&usg=AFQjCNHpZ80cpxa6IqzxILfQU9XACQjbYA&sig2=md9f4mYYnvFPowB-RZbuag.
Filipe
Related
Program to insert into my 2 tables is written through Entity Framework and to SELECT the data is through a STORED PROC at SQL SERVER level. There is a point when SELECT and INSERT is getting done at the same time simultaneously. And when hitting that point, I got the below error:
Transaction (Process ID) was deadlocked on resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
How can I get rid of this DEADLOCK problem here? Need the best way to solve it.
Option 1: Implementing NOLOCK? What would be the PROS and CONS here for it?
Option 2: IF there is any way to exceed the DEADLOCK wait time so that it can wait for the resource for a longer time than usually it does? If yes, then HOW?
Option 3: Suggest Me?
Thanks,
Rahuul Dutta
A deadlock cannot be cured by increasing lock timeout. The resources are locked in such a way that it cannot be resolved by itself, regardless of how much time you can give it. A special background process in SQL Server, a deadlock monitor, periodically (rather often, actually) runs and if it identifies a deadlock it kills the 'lighter' transaction immediatelly.
The deadlocks are usually dealt with in one of several ways: by providing an alternative data access path for the SELECT query (ie adding a mnnclustered index), minimizing the transaction duration (by better indexing, again), or using one of snapshot isolation levels.
The least effort solution here will be setting the read committed snapshot isolation level. This way the SELECT query will not issue any shared locks on data, but still read only the committed data, which is a huge plus over using the NOLOCK hint (or read uncommitted isolation level).
You can change your transaction isolation level. Best option for deadlocks would be snapshot isolation i think. If you cannot turn this option on in your server or if you run into I/O issues, read committed should still prevent deadlocks from read/write dependencies. Make sure that you don't run into anomalies, read committed will allow non-repeatable reads and phantom reads.
First of all, thanks a lot for your precious answers!
With the help of your answers, some research and a call with Microsoft DBA team, I have got the following solution.
Solution: To implement this solution we have to change the database property to Read Committed Snapshot. This will help the Select statements in avoiding the blocks in case of locks by other sessions on the same table.
- To cater this solution the database will create a snapshot of the data in tempdb. Therefore we must have sufficient space in tempdb. Also if possible we must shift the tempdb to a new disk to split the I/O. This will improve the performance.
The following kb article helps in enabling Read Committed Snapshot property of the database:
http://technet.microsoft.com/en-us/library/ms175095(v=SQL.105).aspx
Alternately we can change this property through SSMS by right clicking the database---options---Miscellaneous----Is Read Committed Snapshot On. We have to change the value of this property to TRUE.
We do not have to restart the server to enable this property however we must note that 'When setting the READ_COMMITTED_SNAPSHOT option, only the connection executing the ALTER DATABASE command is allowed in the database. There must be no other open connection in the database until ALTER DATABASE is complete. The database does not have to be in single-user mode.'
This means we need a small amount of downtime from the application side.
Hope the MOM above would help you all too. :)
Thanks,
Rahuul Dutta
My app needs to batch process 10M rows, the result of a complex SQL query that join tables.
I'll plan to be iterating a resultset, reading a hundred per iteration.
To run this on a busy OLTP production DB and avoid locks, I figured I'll query with a READ UNCOMMITTED isolation level.
Would that get the query out of the way of any DB writes? avoiding any rows/table locks?
My main concern is my query blocking any other DB activity, I'm far less concerned with the other way around.
Side Notes:
1. I'll be reading historical data, so I'm unlikely to meet uncommitted data. It's OK if I do.
2. The iteration process could take hours. The DB connection would remain open through this process.
3. I'll have two such concurrent batch instances at most.
4. I can tolerate dup rows. (by product of read uncommitted).
5. DB2 is the target DB, but I want a solution that fits other DBs vendors as well.
6. Will snapshot isolation level help me clear out server memory?
Have you actually encountered any real locks on read?
As far as I'm concerned, the only reason that READ UNCOMMITED existed in SQL standard was to allow non-locking reads. So I don't know DB2, but I blindly bet that it does not lock data during read in READ UNCOMMITED mode. Most modern RDBMS systems however don't lock data at all during read (*). So READ UNCOMMITED is either not available (in Oracle, for example) or is silently promoted to READ COMMITED (PostgreSQL).
If you can freely choose the engine, either check DB2 transaction isolation level handling or go for Oracle/PostgreSQL/other.
(*) More precisely, they don't exclusively lock the data. Some shared locks can be placed on queried tables so no DDL alters them during read.
My answer applies to SQL Server.
Read committed releases lock after every row read (approximately). Locking is probably not your problem.
I recommend you use the safer READ COMMITTED. Better yet, use snapshot isolation. That removes many locking problems. There are disadvantages as well, sou you better read a little about it.
My main concern is my query blocking any other DB activity
Snapshot isolation makes all locking concerns go away for read-only transactions. No blocking either way, full data consistency. Be aware that long-running transactions can cause TempDB to fill with snapshot versions.
The DB connection would remain open through this process.
That's a problem because a network hiccup, app deployment or mirroring failover would kill your batch process.
Be aware, that read uncommitted can cause queries to sporadically fail outright. You need retry logic or tolerate failed jobs.
In sql server Transaction isolation level Read uncommitted cause no lock on table.
I am dealing with a strange issue related to NHibernate and distributed transactions in a WCF service. See Deadlocks causing 'Server failed to resume the transaction' with NHibernate and distributed transactions for more details.
One thing that seems to solve my problem is using NHibernate's AdoNetTransactionFactory, instead of AdoNetWithDistributedTransactionsFactory.
I believe that the AdoNetWithDistributedTransactionsFactory is involved with making NHibernate's second-level caching mechanism work right, but we're not using that. What (if any) other problems exist with using AdoNetTransactionFactory with distributed transactions?
Thanks for your time!
I notice that you mentioned from your other question/answer:
SqlConnection class is not thread-safe, and that includes closing the connection
on a separate thread. Based on this response we have filed a
bug report for NHibernate.
However, from NHibernate's documentation:
11.2. Threads and connections
You should observe the following practices when creating NHibernate Sessions:
Never create more than one concurrent ISession or ITransaction instance per database connection.
Be extremely careful when creating more than one ISession per database per transaction. The ISession itself keeps track of updates made to loaded objects, so a different ISession might see stale data.
The ISession is not threadsafe! Never access the same ISession in two concurrent threads. An ISession is usually only a single unit-of-work!
If you are trying to multi-thread the connection with NHibernate perhaps it is just not going to work. Have you considered a different ORM such as Entity Framework?
No matter what ORM you choose though, the database connection will not be thread safe. This is universal.
"many DB drivers are not thread safe. Using a singleton means that if you have many threads, they will all share the same connection. The singleton pattern does not give you thread saftey. It merely allows many threads to easily share a "global" instance." - https://stackoverflow.com/a/6507820/1026459
Using AdoNetTransactionFactory with distributed system transactions will cause those transaction to be ignored by NHibernate, which has the following consequences:
ConnectionReleaseMode.AfterTransaction will not be honored. Instead, NHibernate will release the connection after each statement, and so will re-acquire a connection from the pool for the next one. Depending on your data provider, this may trigger escalation of the transaction to distributed.
FlushMode.Commit will not be honored. Explicit flushes will be required instead. (Auto flushes before queries may still occur.)
Works needing to be isolated from current system transaction will still be included inside it. (Unless the connection string Enlist property is false.) Such works may include id generators queries such as retrieving the next high value for a table hilo generator. If the transaction gets roll-backed, NHibernate may then use conflicting ids.
The NHibernate session will not be able to correctly track locks it holds on entities. Considering itself outside of a transaction, it will consider it has no lock on them. So it may try (on user code request by example) to re-lock them with lower lock level than the one the transaction already holds on them in database. Not sure what outcome could result of that. (At best, ignored, at worst...)
Second level cache will be disabled as soon as you start modifying data. NHibernate sort of "invalidate" cache entries in such situation, and re-enable them only on transaction completion, updated. But since it will not be aware of transactions...
Some extensions (maybe Envers) may rely on NHibernate transaction events, and will no more work as expected.
I strongly recommend upgrading to nhibernate 3.2(or a version close to it). Why? Since 2.1, there has been significant improvements (read rewrite) to the AdoNetWithDistributedTransactionFactory. Matter of fact, it now handles TransactionScopes/ambient-transactions and the like correctly. When we ran 2.1 in production we encounter many issues related to distributed transactions. We pretty much had to fix a ton of stuff ourselves and recompile NHibernate. 3.2 seems to have fixed many issues around the subject.
I don't have the source near me but, if memory doesn't fail me, the AdoNetTransactionFactory doesn't check/handle ambient transactions. So, you are down to NHibernate booting transactions when one is not present in the session(by means of ISession.BeginTransaction()).
I have several threads executing some SQL select queries with serializable isolation level. I am not sure which implementation to choose. This:
_repository.Select(...)
or this
lock (_lockObject)
{
_repository.Select(...);
}
In other words, is it possible several transactions will start executing at the same time and partially block records inside Select operation range.
P. S. I am using MySQL but I guess it is a more general question.
Transactions performing SELECT queries place a shared lock on the rows, permitting other transactions to read those rows, but preventing them from making changes to the rows (including inserting new records into the gaps)
Locking in the application is doing something else, it will not allow other threads to enter the code block which fetches the data from the repository, This approach can lead to very bad performance for a few reasons:
If any of the rows are locked by another transaction (outside the application) via a exclusive lock, the lock in the application will not help.
Multiple transactions will not be able to perform reads even on rows that are not locked in exclusive mode (not being updated).
The lock will not be released until all the data is fetched and returned to the client. This includes the network latency and any other overhead that it takes converting the MySql result set to a code object.
Most importantly, Enforcing data integrity & atomicity is the databases job, it knows how to handle it very well, how to detect potential deadlocks. When to perform record locks, and when to add Index gap locks. It is what databases are for, and MySql is ACID complaint and is proven to handle these situations
I suggest you read through Section 13.2.8. The InnoDB Transaction Model and Locking of the MySql docs, it will give you a great insight how locking in InnoDB is performed.
From this post. One obvious problem is scalability/performance. What are the other problems that transactions use will provoke?
Could you say there are two sets of problems, one for long running transactions and one for short running ones? If yes, how would you define them?
EDIT: Deadlock is another problem, but data inconsistency might be worse, depending on the application domain. Assuming a transaction-worthy domain (banking, to use the canonical example), deadlock possibility is more like a cost to pay for ensuring data consistency, rather than a problem with transactions use, or you would disagree? If so, what other solutions would you use to ensure data consistency which are deadlock free?
It depends a lot on the transactional implementation inside your database and may also depend on the transaction isolation level you use. I'm assuming "repeatable read" or higher here. Holding transactions open for a long time (even ones which haven't modified anything) forces the database to hold on to deleted or updated rows of frequently-changing tables (just in case you decide to read them) which could otherwise be thrown away.
Also, rolling back transactions can be really expensive. I know that in MySQL's InnoDB engine, rolling back a big transaction can take FAR longer than committing it (we've seen a rollback take 30 minutes).
Another problem is to do with database connection state. In a distributed, fault-tolerant application, you can't ever really know what state a database connection is in. Stateful database connections can't be maintained easily as they could fail at any moment (the application needs to remember what it was in the middle of doing it and redo it). Stateless ones can just be reconnected and have the (atomic) command re-issued without (in most cases) breaking state.
You can get deadlocks even without using explicit transactions. For one thing, most relational databases will apply an implicit transaction to each statement you execute.
Deadlocks are fundamentally caused by acquiring multiple locks, and any activity that involves acquiring more than one lock can deadlock with any other activity that involves acquiring at least two of the same locks as the first activity. In a database transaction, some of the acquired locks may be held longer than they would otherwise be held -- to the end of the transaction, in fact. The longer locks are held, the greater the chance for a deadlock. This is why a longer-running transaction has a greater chance of deadlock than a shorter one.
One issue with transactions is that it's possible (unlikely, but possible) to get deadlocks in the DB. You do have to understand how your database works, locks, transacts, etc in order to debug these interesting/frustrating problems.
-Adam
I think the major issue is at the design level. At what level or levels within my application do I utilise transactions.
For example I could:
Create transactions within stored procedures,
Use the data access API (ADO.NET) to control transactions
Use some form of implicit rollback higher in the application
A distributed transaction in (via DTC / COM+).
Using more then one of these levels in the same application often seems to create performance and/or data integrity issues.