Read Committed Snapshot Isolation and Transactions

I am considering enabling Read Committed Snapshot Isolation on our SQL Server 2005 database in an attempt to gain some performance. Does setting this isolation level affect all queries, regardless of whether they use BEGIN TRAN and COMMIT TRAN? According to MSDN:
"Once snapshot isolation is enabled, updated row versions for each transaction are maintained in tempdb."
I am unclear whether "transaction" means all SQL queries or only queries explicitly using transactions.

Every (useful) statement runs within a transaction. If there isn't an open one when you run a particular query, then by default SQL Server opens one, runs the query, and then commits it. This is called Autocommit mode.
This behaviour can be changed so that SQL Server doesn't do that third step (the commit) automatically and leaves the transaction open. That's called Implicit Transaction Mode.
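For example (a minimal sketch; the table name is made up), these two forms are equivalent from the engine's point of view, because the single statement gets its own autocommitted transaction:

-- Autocommit mode: SQL Server wraps the statement in its own transaction.
UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderID = 42;

-- Equivalent explicit form:
BEGIN TRAN;
UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderID = 42;
COMMIT TRAN;

So yes: READ_COMMITTED_SNAPSHOT is a database-level setting, and it affects the autocommitted statement just as much as the explicit transaction.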

Related

Groovy set isolation level - locking UPDATE SQL transaction

Is it possible to lock an UPDATE transaction for writing (leaving it free for reading) in pure Groovy?
The database behind it is MSSQL.
I see there are ways to do it in Java, or at the stored procedure level, but I am interested in the Groovy way.
This is possible using the optimistic transaction isolation levels: READ COMMITTED SNAPSHOT or SNAPSHOT. They use row versioning for reads instead of shared locks, so when a row is updated (and locked with an exclusive lock), its contents are copied to the version store in tempdb. Other processes then don't wait for the update to finish; they simply read the previous version of the row from the version store.
Here is more reading about snapshot isolation levels: https://msdn.microsoft.com/en-us/library/tcbchxcb(v=vs.110).aspx
As both of them need the version store in tempdb, they can't simply be specified on a connection; an ALTER DATABASE is needed first: https://technet.microsoft.com/en-us/library/ms175095(v=sql.105).aspx
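For example (a sketch; the database name is a placeholder, and READ_COMMITTED_SNAPSHOT in particular requires that no other connections are active while it is being set):

ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- then, per session (e.g. issued from Groovy through plain JDBC):
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;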

Locking the database

Hi, I'm trying to see what's locking the database and found two types of locking: optimistic and pessimistic locking. I found some articles on Wikipedia, but I would like to know more! Can someone explain these two kinds of locking to me? Should we only use locking when we need exclusive access to something? Does locking only happen when we use a transaction?
Thanks in advance.
Kevin
Optimistic locking is no locking at all.
It works by noting the state the system was in before you started making your changes, and then going ahead and just making those changes, assuming (optimistically) that no one else will want to make conflicting updates. Just as you are about to atomically commit those changes, you check whether in the meantime someone else has also updated the same data. If they have, your commit fails.
Subversion, for example, uses optimistic locking. When you try to commit, you have to handle any conflicts, but before that, you can do whatever you want in your working copy.
Pessimistic locks work with real locks. Assuming that there will be contention, you lock everything you want to update before touching it. Everyone else will have to wait for you to commit or rollback.
When using a relational database with transaction support, the database usually takes care of locking internally (such as when you issue an UPDATE statement), so for normal online processing you do not need to handle this yourself. Only if you want to do maintenance work or large batches do you sometimes want to lock down tables.
We should only use locking when we need exclusive access to something?
You need it to prevent conflicting operations from other sessions. In general, this means updates. Reading data can normally go on concurrently.
Locking only happens when we use transaction?
Yes. You will accumulate locks while proceeding with your transaction, releasing all of them at the end of it. Note that a single SQL command in auto-commit mode is still a transaction by itself.
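You can watch this happen (a sketch; the table name is made up, and sys.dm_tran_locks is the DMV that exposes held locks):

-- Session 1: leave a transaction open after an update.
BEGIN TRAN;
UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;

-- Session 2: inspect the locks session 1 is holding
-- (52 is a hypothetical value for session 1's @@SPID).
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 52;

-- Back in session 1: everything is released at once here.
COMMIT;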
Transaction isolation levels also specify the locking behaviour. BOL states that transaction isolation levels control:
Whether locks are taken when data is read, and what type of locks are requested.
How long the read locks are held.
Whether a read operation referencing rows modified by another transaction:
Blocks until the exclusive lock on the row is freed.
Retrieves the committed version of the row that existed at the time the statement or transaction started.
Reads the uncommitted data modification.
The available levels are:
Read uncommitted (the lowest level where transactions are isolated only enough to ensure that physically corrupt data is not read)
Read committed (Database Engine default level)
Repeatable read
Serializable (the highest level, where transactions are completely isolated from one another)
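Each of these can be requested per session before a transaction starts; a sketch (table and column names are made up):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRAN;
-- Shared locks taken here are now held until COMMIT,
-- so a second read of the same rows will see the same values.
SELECT Balance FROM dbo.Accounts WHERE AccountID = 1;
COMMIT;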

Default SQL Server IsolationLevel Changes

We have a customer that's been experiencing some blocking issues with our database application. We asked them to run a Blocked Process Report trace, and the trace they gave us shows blocking occurring between a SELECT and an UPDATE operation. The trace files show the following:
The same SELECT query is being executed at different isolation levels. One trace shows a Serializable IsolationLevel while a later trace shows a RepeatableRead IsolationLevel. We do not use an explicit transaction while executing the query.
The UPDATE query is being executed with a RepeatableRead isolation level but is being blocked by the SELECT query. This is expected as our updates are wrapped in an explicit transaction with IsolationLevel of RepeatableRead.
So basically we're at a loss as to why the isolation level of the SELECT query would not be the default ReadCommitted IsolationLevel and, even more confusingly, why the isolation level of the query would change over time. Only one customer is seeing this behaviour, so we suspect it may be a database configuration issue.
Any ideas?
Thanks in advance,
Graham
In your scenario, I would recommend explicitly setting the isolation level to snapshot. That will prevent reads from getting in the way of writes (inserts and updates) by avoiding locks, yet those reads are still "good" reads (i.e. not dirty data; it is not the same as NOLOCK).
Generally I find that where I have locking issues with my queries, I manually control the locks applied. For example, I would do updates with row-level locks to avoid page/table-level locking, and set my reads to READPAST (accepting that I may miss some data; in some scenarios that might be OK).
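A sketch of both hints (table and column names are made up):

-- Ask for row-level locks on the update instead of page/table locks.
UPDATE dbo.Orders WITH (ROWLOCK)
SET Status = 'Shipped'
WHERE OrderID = 42;

-- Skip rows that are currently locked instead of blocking on them.
SELECT OrderID, Status
FROM dbo.Orders WITH (READPAST);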
EDIT-- Combining all the comments into the answer
As part of the optimisation process, SQL Server avoids taking committed-read locks on a page that it knows hasn't changed, and automatically falls back to a lesser locking strategy. In your case, SQL Server drops from a serializable read to a repeatable read.
Q: Thanks for that useful info regarding dropping Isolation Levels. Can you think of any reason that it would use Serializable IsolationLevel in the first place, given that we don't use an explicit transaction for the SELECT - it was our understanding that the implicit transaction would use ReadCommitted?
A: By default, SQL Server will use Read Committed if that is your default isolation level, BUT if you do not additionally specify a locking strategy in your query, you are basically saying to SQL Server "do what you think is best, but my preference is Read Committed". Since SQL Server is free to choose, it does so in order to optimise the query. (The optimisation algorithm in SQL Server is very complex and I do not fully understand it myself.) Not explicitly executing within a transaction does not, as far as I know, affect the isolation level that SQL Server uses.
Q: One last thing, does it seem reasonable that SQL Server would increase the Isolation Level (and presumably the number of locks required) to optimise the query? I'm also wondering whether the reuse of a pooled connection would affect this if it inherited the last used Isolation Level?
A: SQL Server will do that as part of a process called "lock escalation". From http://support.microsoft.com/kb/323630, I quote: "Microsoft SQL Server dynamically determines when to perform lock escalation. When making this decision, SQL Server takes into account the number of locks that are held on a particular scan, the number of locks that are held by the whole transaction, and the memory that is being used for locks in the system as a whole. Typically, SQL Server's default behavior results in lock escalation occurring only at those points where it would improve performance or when you must reduce excessive system lock memory to a more reasonable level. However, some application or query designs may trigger lock escalation at a time when it is not desirable, and the escalated table lock may block other users".
Although lock escalation is not exactly the same thing as changing the isolation level a query runs under, this surprises me, because I would not have expected SQL Server to take more locks than the default isolation level permits.
More info regarding why SQL Server would take more locks by escalating: this is incorrect; escalating reduces (not increases) the number of locks required. A table lock is a single lock versus all the page or row locks required to do the same at a lower level. Lock escalation is always done for one reason: it's more efficient to take a higher-level lock than to lock all the lower-level objects.
For example, perhaps there is no index available to lock efficiently against. That is, if you take a count with UPDLOCK on all records with a year of 2010 in a field, and there is no index on that date field, this will require a row lock on each record in 2010, which is not efficient if many records are hit; a page lock will not help either, since the records are presumably distributed randomly across pages, so SQL Server takes a table lock. Moreover, SQL Server MUST also prevent other records from changing to the year 2010 while the UPDLOCK is held, and with no index on this field to take a range lock on, it has NO CHOICE but to take a table lock to prevent this from happening. This latter point is one often missed by those new to optimization: the realization that SQL Server must also "protect" the integrity of the queries already executed in the transaction.
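A sketch of that scenario (table and column names are made up):

-- With no index on OrderYear, SQL Server cannot take a range lock,
-- so this UPDLOCK count will likely be satisfied with a table lock.
BEGIN TRAN;
SELECT COUNT(*)
FROM dbo.Orders WITH (UPDLOCK)
WHERE OrderYear = 2010;
-- ... work that relies on that count staying stable ...
COMMIT;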

Using IsolationLevel.Snapshot but DB is still locking

I'm part of a team building an ADO.NET-based web site. We sometimes have several developers and an automated testing tool working simultaneously on a development copy of the database.
We use snapshot isolation level, which, to the best of my knowledge, uses optimistic concurrency: rather than locking, it hopes for the best and throws an exception if you try to commit a transaction whose affected rows have been altered by another party during the transaction.
To use snapshot isolation level we use:
ALTER DATABASE <database name>
SET ALLOW_SNAPSHOT_ISOLATION ON;
and in C#:
var transaction = sqlConnection.BeginTransaction(IsolationLevel.Snapshot);
Note that IsolationLevel Snapshot isn't the same as ReadCommitted Snapshot, which we've also tried, but are not currently using.
When one of the developers enters debug mode and pauses the .NET app, they will hold a connection with an active transaction while debugging. Now, I'd expect this not to be a problem - after all, all transactions are using snapshot isolation level, so while one transaction is paused, other transactions should be able to proceed normally since the paused transaction isn't holding any locks. Of course, when the paused transaction completes, it is likely to detect a conflict; but that's acceptable so long as other developers and the automated tests can proceed unhindered.
However, in practice, when one person halts a transaction while debugging, all other DB users attempting to access the same rows are blocked despite using snapshot isolation level.
Does anybody know why this occurs, and/or how I can achieve true optimistic (non-blocking) concurrency?
The resolution (an unfortunate one for me): Remus Rusanu noted that writers always block other writers; this is backed up by MSDN - it doesn't quite come out and say so, but only ever mentions avoiding reader-writer locks. In short, the behavior I want isn't implemented in SQL Server.
The SNAPSHOT isolation level affects, like all isolation levels, only reads. Writes still block each other. If you believe that what you see are read blocks, then you should investigate further and check the resource types and resource names on which blocking occurs (wait_type and wait_resource in sys.dm_exec_requests).
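For example, a quick way to see what blocked requests are waiting on, using the DMV mentioned above:

SELECT session_id, blocking_session_id, wait_type, wait_resource
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;  -- only requests that are currently blocked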
I wouldn't advise making code changes to support a scenario that involves developers staring at a debugger for minutes on end. If you believe this scenario can repeat in production (i.e. a client hangs), then it's a different story. To achieve what you want, you must minimize writes and perform all writes at the end of the transaction, in one single call that commits before returning. That way no client can hold X locks for a long time (it cannot hang while holding X locks). In practice this is pretty hard to pull off and requires a lot of discipline on the part of developers in how they write the data access code.
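You can reproduce the writer-on-writer block directly (a sketch; the table name is made up, and each half runs in a separate session):

-- Session 1: update a row under SNAPSHOT and leave the transaction open.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
UPDATE dbo.Accounts SET Balance = 0 WHERE AccountID = 1;  -- takes an X lock

-- Session 2: blocks despite SNAPSHOT, because writers still block writers.
-- If session 1 then commits, this statement fails with update-conflict error 3960.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
UPDATE dbo.Accounts SET Balance = 1 WHERE AccountID = 1;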
Have you looked at the locks when one developer pauses the transaction? Also, just turning on snapshot isolation level does not have much effect. Have you set ALLOW_SNAPSHOT_ISOLATION ON?
Here are the steps:
ALTER DATABASE <databasename>
SET READ_COMMITTED_SNAPSHOT ON;
GO
ALTER DATABASE <database name>
SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
After the database has been enabled for snapshot isolation, developers and users must then request that their transactions be run in this snapshot mode. This must be done before starting a transaction, either by a client-side directive on the ADO.NET transaction object or within their Transact-SQL query by using the following statement:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
Raj

Is there a difference between commit and rollback in a transaction only having selects?

The in-house application framework we use at my company makes it necessary to put every SQL query into a transaction, even when I know that none of the commands will make changes to the database. At the end of the session, before closing the connection, I commit the transaction to close it properly. I wonder whether there would be any particular difference if I rolled it back instead, especially in terms of speed.
Please note that I am using Oracle, but I guess other databases have similar behaviour. Also, I can't do anything about the requirement to begin the transaction, that part of the codebase is out of my hands.
Databases often preserve either a before-image journal (what it was before the transaction) or an after-image journal (what it will be when the transaction completes.) If it keeps a before-image, that has to be restored on a rollback. If it keeps an after-image, that has to replace data in the event of a commit.
Oracle has both a journal and rollback space. The transaction journal accumulates blocks which are later written by DB writers. Since these are asynchronous, almost nothing DB-writer-related has any impact on your transaction (if the queue fills up, then you might have to wait).
Even for a query-only transaction, I'd be willing to bet that there's some little bit of transactional record-keeping in Oracle's rollback areas. I suspect that a rollback requires some work on Oracle's part before it determines there's nothing to actually roll back, and I think this is synchronous with your transaction. You can't really release any locks until the rollback is completed. [Yes, I know you aren't using any locks in your transaction, but the locking issue is why I think the rollback has to complete fully before all the locks can be released and the transaction is finished.]
On the other hand, the commit is more-or-less the expected outcome, and I suspect that discarding the rollback area might be slightly faster. You created no transaction entries, so the db writer will never even wake up to check and discover that there was nothing to do.
I also expect that while commit may be faster, the differences will be minor. So minor, that you might not be able to even measure them in a side-by-side comparison.
I agree with the previous answers that there's no difference between COMMIT and ROLLBACK in this case. There might be a negligible difference in the CPU time needed to determine that there's nothing to COMMIT versus the CPU time needed to determine that there's nothing to ROLLBACK. But, if it's a negligible difference, we can safely forget about it.
However, it's worth pointing out that there's a difference between a session that does a bunch of queries in the context of a single transaction and a session that does the same queries in the context of a series of transactions.
If a client starts a transaction, performs a query, performs a COMMIT or ROLLBACK, then starts a second transaction and performs a second query, there's no guarantee that the second query will observe the same database state as the first query. Sometimes, maintaining a single consistent view of the data is of the essence. Sometimes, getting a more current view of the data is of the essence. It depends on what you are doing.
I know, I know, the OP didn't ask this question. But some readers may be asking it in the back of their minds.
In general a COMMIT is much faster than a ROLLBACK, but in the case where you have done nothing they are effectively the same.
The documentation states that:
Oracle recommends that you explicitly end every transaction in your application programs with a COMMIT or ROLLBACK statement, including the last transaction, before disconnecting from Oracle Database. If you do not explicitly commit the transaction and the program terminates abnormally, then the last uncommitted transaction is automatically rolled back. A normal exit from most Oracle utilities and tools causes the current transaction to be committed. A normal exit from an Oracle precompiler program does not commit the transaction and relies on Oracle Database to roll back the current transaction.
http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/statements_4010.htm#SQLRF01110
If you want to choose one or the other, then you might as well do the one that is the same as doing nothing, and just commit it.
Well, we must take into account what a SELECT returns in Oracle. There are two modes. By default, a SELECT returns data as it looked at the very moment the SELECT statement started executing (this is the default behavior in READ COMMITTED isolation mode, the default transactional mode). So if an UPDATE/INSERT was executed after the SELECT was issued, it won't be visible in the result set.
This can be a problem if you need to compare two result sets (for example, the debit and credit sides of a general ledger app). For that we have a second mode. In that mode, a SELECT returns data as it looked at the moment the current transaction began (the default behavior in READ ONLY and SERIALIZABLE isolation levels).
So, at least sometimes, it is necessary to execute SELECTs within a transaction.
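A sketch of that second mode in Oracle (table names are made up):

SET TRANSACTION READ ONLY;

-- Both queries see the database as of the moment the transaction began,
-- so the debit and credit sides are guaranteed to be mutually consistent.
SELECT SUM(amount) FROM ledger_debits;
SELECT SUM(amount) FROM ledger_credits;

COMMIT;  -- ends the read-only transaction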
Since you've not done any DML, I suspect there'd be no difference between a COMMIT and ROLLBACK in Oracle. Either way there's nothing to do.
I'd think a COMMIT would be more efficient, since generally you'd expect most DB transactions to be committed, so you would think the DB optimizes for this case (as opposed to trying to be more efficient for a rollback).