Isolation levels and select for update - SQL

What is the relation between isolation levels of relational databases and select for update?
If I use plain vanilla JDBC connectivity with SQL Server, set the isolation level to REPEATABLE_READ, and use a simple select, would I see inconsistency in repeatable reads or not? Or should I always use select for update to avoid inconsistent repeatable reads in transactions? If so, what is the deal with isolation levels, and how would they come into play?

SQL Server doesn't have the select ... for update syntax. The equivalent in SQL Server is to use the UPDLOCK table hint.
This hint is used when reading a row and then immediately updating that same row in an atomic transaction, e.g.:
declare @balance money = (select balance from account where accountId = @id)
update account set balance = @balance + @amount where accountId = @id
At READ COMMITTED, or at any isolation level without a multi-statement transaction, multiple sessions can run the first query concurrently and then overwrite each other's changes when updating the balance (lost updates).
The REPEATABLE READ or SERIALIZABLE isolation level will prevent this update anomaly, but it does so by blocking the first writer whenever concurrent transactions have read the row; if one of those other transactions then attempts its own update, the result is a deadlock.
Mostly this behavior is not worth the performance cost and the annoyance of handling deadlocks. So instead you use 'select for update', aka UPDLOCK, to place an update (U) lock on the row while reading, which blocks subsequent readers from acquiring a conflicting U or X lock until the transaction ends.
e.g.:
begin tran
declare @balance money = (select balance from account with (updlock) where accountId = @id)
update account set balance = @balance + @amount where accountId = @id
commit tran

Not exactly sure what you mean by "inconsistent repeatable reads", but with the REPEATABLE READ isolation level in SQL Server, shared locks are held until the end of the transaction. Only committed data will be returned, and other sessions cannot modify the rows read until the transaction is committed (or auto-committed in the case of no explicit transaction).
REPEATABLE READ does not prevent other sessions from inserting new rows so rerunning the same query in the same transaction may return new rows (phantom reads) that were not returned originally.
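For example, a minimal sketch of a phantom read under REPEATABLE READ (the dbo.orders table and status column are made up for illustration):
-- Session 1
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT COUNT(*) FROM dbo.orders WHERE status = 'open'  -- returns, say, 5
-- Session 2 can insert and commit a new 'open' row here; it is not blocked
SELECT COUNT(*) FROM dbo.orders WHERE status = 'open'  -- may now return 6 (a phantom)
COMMIT TRAN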
The documentation describes this in more detail along with the other transaction isolation levels.

There is no need to override the default behavior with hints or otherwise. SQL Server enforces whatever transaction isolation level you declare.
REPEATABLE READ guarantees that within a transaction, your statements will see a consistent view of the data that was read, but doesn't guarantee there will be no phantom rows. To avoid the latter you will need to use SERIALIZABLE.
The higher the isolation, the higher the concurrency penalty of course.
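As a sketch of the difference (reusing the account table from the first answer, and assuming an index on accountId), SERIALIZABLE closes the phantom gap by taking key-range locks on what was read:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT * FROM account WHERE accountId BETWEEN 100 AND 200
-- key-range locks now block other sessions from inserting accountIds in 100-200 until commit
COMMIT TRAN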
See the JDBC Page for isolation levels and the general page for SET TRANSACTION ISOLATION LEVEL for a somewhat more in-depth discussion.
HTH

Related

How to implement Serializable Isolation Level in SQL Server

I need to implement the serializable isolation level in SQL Server, but I've tried many ways and haven't managed to get it working.
I need to lock one row in a transaction (it doesn't matter if that locks the complete table), so that another transaction can't even select the row (can't read it).
The last thing I tried:
For transaction 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
-- Here I select the same row in another instance
COMMIT TRAN
For transaction 2:
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
COMMIT TRAN
I would expect transaction 2 to wait until transaction 1 commits, but transaction 2 returns the row anyway.
Can anyone explain what I'm missing?
SQL Server conforms to the strict definition of serializability: there must be a result that could logically be produced if both transactions ran in serial order - transaction 1 finishing before transaction 2 starts, or vice versa.
This results in some effects that can be different than you would expect. There is a great explanation of the Serializable isolation level over at SQLPerformance.com that makes clear some of what this logical serializability ends up meaning. (Very helpful site, that one.)
For your above queries, there is no logical requirement to prevent the second query from reading the same row as the first query. No matter in what order the queries are run, they will both return the same data without modifying it. Since the engine can identify this, there is no reason to place a read lock on the data. However, if one of the queries performed an update on the data, then (warning - logical assumption here, since I don't actually know the internals of how SQL Server handles this) the engine would take a stronger lock on the selected rows.
TL;DR - SQL Server wants to minimize blocking, so it uses logical analysis to see what types of locks are needed for a serializable isolation level, and it (tries to) use the minimum number and strength of locks needed to achieve its goal.
Now that we've dealt with that - there are only two ways I can think of to lock a row so that no one else can read it: use XLOCK + TABLOCK (locking the whole table - not a recommended practice), or have some field on each row that is updated when you start your process - something like an SPID column, or a bit flag for Locked. When you update it within your transaction, only SELECTs with the NOLOCK hint will be able to read it.
Clearly, neither of these is optimal. I recommend the "this row is busy - go away" flag, as that's probably the approach I would take for an (almost) absolute lock on a row.
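A minimal sketch of that flag approach (the locked_by column is hypothetical - you would have to add it to the table yourself):
BEGIN TRAN
-- claim the row; @@SPID marks which session holds it
UPDATE table1 SET locked_by = @@SPID WHERE code = 1 AND locked_by IS NULL
IF @@ROWCOUNT = 1
BEGIN
    -- we own the row: do the work, then release the flag
    UPDATE table1 SET locked_by = NULL WHERE code = 1
END
COMMIT TRAN
Cooperating readers then have to filter out rows WHERE locked_by IS NOT NULL themselves - the flag is a convention, not a real lock.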
According to the documentation:
SERIALIZABLE Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
If you're not making any changes to data with an INSERT, UPDATE, or DELETE inside transaction 1, SQL will release the Shared Lock after the read completes.
What you might want to try is adding a table hint to prevent the row lock from being released until the end of transaction 1.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code
FROM table1 WITH(ROWLOCK, HOLDLOCK)
WHERE code = 1
COMMIT TRAN
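Note, though, that HOLDLOCK keeps a shared (S) lock to the end of the transaction, which blocks writers but not other readers. If the goal is to block readers too, an exclusive hint is needed - a sketch:
SELECT code
FROM table1 WITH (XLOCK, ROWLOCK)
WHERE code = 1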
Maybe you can solve this with a hack like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE someTableForThisHack SET val = CASE WHEN val = 1 THEN 0 ELSE 1 END
SELECT code FROM table1 ...
COMMIT TRANSACTION
So you create a table someTableForThisHack and insert one row into it. Because every transaction begins by updating that single row, it takes an exclusive lock on it, and concurrent transactions serialize on that update until the holder commits.
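For completeness, the one-time setup for the hack might look like this (the column type is an assumption):
CREATE TABLE someTableForThisHack (val bit NOT NULL)
INSERT INTO someTableForThisHack (val) VALUES (1)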

Row locks on select

I have a stored procedure which does the following:
selects top N from table
sets these rows as processed
returns these rows to the client
Here is roughly how I am doing it in Sybase ASE:
set rowcount @count
begin tran get_items
insert into #temp_table
select item
from available_item
where is_processed = 0
update available_item
set is_processed = 1
from available_item, #temp_table
where available_item.item = #temp_table.item
-- select the processed items...
commit tran get_items
I am wondering whether there is a race condition here. If two separate processes execute this stored procedure at the same time, could they select and mark processed the same data? Or does having it in a transaction stop this?
If not, is there a way to hold locks on selected rows?
Some of the details will depend on your table's locking scheme. Allpages, datapages, and datarows locking will have different impacts on your ability to run concurrent updates on a single table. I am assuming a datapages/datarows scheme to allow for concurrency.
Your query will grab an initial shared page/row lock, which will be upgraded to an update lock, which will then be followed by an exclusive row lock on the updated pages/rows. No other processes will be able to make changes to the selected pages/rows for the duration of the transaction, but another process could read the selected rows prior to your update, which could lead to some inconsistency.
To get around this possibility, you can set the isolation level in the transaction to either isolation level 2 (repeatable read) or isolation level 3 (serializable). You may want to read up on the specifics of each level to decide which you want to enforce, and the trade-offs associated with it.
In your transaction, you would use it like this:
set rowcount @count
set transaction isolation level 2
...
Something to note is that, depending on the number of records you grab in your query, you could trigger a lock promotion, which could prevent your concurrent transactions from executing even if they are not looking at the same rows as your first transaction. By default, the server will attempt to escalate to a table lock if it acquires locks on more than 200 pages/rows. This can be changed either to an explicit value or to a range of values and a percentage, and it is configurable at the server, database, or table level.
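If escalation does become a problem, ASE lets you raise the promotion thresholds per table - a sketch only, with illustrative values; check the sp_setrowlockpromote documentation for your ASE version:
-- raise the row-lock promotion thresholds for one table so escalation happens later
exec sp_setrowlockpromote "table", "available_item", 2000, 2000, 100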
Relevant Documentation:
Transaction: Maintaining Data Consistency and Recovery
Performance and Tuning Series: Locking and Concurrency Control
Transact-SQL Users Guide 15.7 > Transactions: Maintaining Data Consistency and Recovery

SQL Server: serializable level not working

I have the following SP:
CREATE PROCEDURE [dbo].[sp_LockReader]
AS
BEGIN
    SET NOCOUNT ON;
    begin try
        set transaction isolation level serializable
        begin tran
        select * from teste
        commit tran
    end try
    begin catch
        rollback tran
        set transaction isolation level READ COMMITTED
    end catch
    set transaction isolation level READ COMMITTED
END
The table "test" has many values, so "select * from teste" takes several seconds. I run the sp_LockReader at same time in two diferent query windows and the second one starts showing test table contents without the first one terminates.
Shouldn't serializeble level forces the second query to wait?
How do i get the described behaviour?
Thanks
SERIALIZABLE at the most basic means "hold locks for longer". When you SELECT, the held lock is a shared lock which allows other readers.
If you want to block readers, use the WITH (TABLOCKX) hint to take an exclusive table lock; then you don't need SERIALIZABLE. Or use XLOCK together with SERIALIZABLE.
In other words:
SERIALIZABLE = Isolation Level = lock duration, concurrency
XLOCK = Mode = sharing/exclusivity
TABLOCK = Granularity = what is locked
TABLOCKX = combined
See this question/answer for more info
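A minimal sketch of both options against the question's teste table:
-- Option 1: exclusive table lock, no SERIALIZABLE required
BEGIN TRAN
SELECT * FROM teste WITH (TABLOCKX)
-- other sessions' SELECTs now block until this commits
COMMIT TRAN

-- Option 2: exclusive locks at finer granularity under SERIALIZABLE
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT * FROM teste WITH (XLOCK)
COMMIT TRAN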
A serializable transaction is one whose output is not affected by other concurrent transactions. In your case, you are SELECTing twice from the table; neither of those transactions changes the result set of the other, so they may both run simultaneously.
Even if one transaction did update the table, this would not necessarily prevent the other from executing, as the database may work from snapshots.
Have a look here for a better explanation than I can provide... http://en.wikipedia.org/wiki/Isolation_%28database_systems%29
Another note here. If you're using XLOCK under a SERIALIZABLE isolation, other transactions with READ COMMITTED isolation will still be able to read XLOCK'ed rows.
To prevent that, use PAGLOCK along with XLOCK.
See here for details
http://support.microsoft.com/kb/324417

Deadlocks - Will this really help?

So I've got a query that keeps deadlocking on me. People who know the system well can't figure out why the sproc is deadlocking, but they tell me that I should just add this to it:
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Is this really a valid solution? What does that do?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
This will cause the system to return inconsistent data, including duplicate records and missing records. Read more at Previously committed rows might be missed if NOLOCK hint is used, or at Timebomb - The Consistency problem with NOLOCK / READ UNCOMMITTED.
Deadlocks can be investigated and fixed; it is not a big deal if you follow the proper procedure. Of course, allowing dirty reads may seem easier, but down the road you'll be sitting long hours staring at your general ledger and wondering why the heck it does not balance debits and credits. So read this again until you really grok it: DIRTY READS ARE INCONSISTENT READS.
If you want a get-out-of-jail card, turn on read committed snapshot isolation:
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
But keep in mind that snapshot isolation does not fix the deadlocks, it only hides them. Proper investigation of the deadlock cause and fix is always the appropriate action.
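Note that READ_COMMITTED_SNAPSHOT only changes how the default READ COMMITTED level behaves; full SNAPSHOT isolation is a separate option that each transaction opts into:
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON
-- then, in the sessions that want it:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT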
NOCOUNT will keep your query from returning row counts to the calling application (e.g. "1000000 rows affected").
TRANSACTION ISOLATION LEVEL READ UNCOMMITTED will allow for dirty reads as indicated here.
The isolation level may help, but do you want to allow dirty reads?
Randomly adding SET options to the query is unlikely to help, I'm afraid.
SET NOCOUNT ON
Will have no effect on the issue.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
will prevent your query from taking out shared locks. As well as reading "dirty" data, it can also lead to your query reading the same rows twice, or not at all, depending on what other concurrent activity is happening.
Whether this will resolve your deadlock issue depends upon the type of deadlock. It will have no effect at all if the issue is two writers deadlocking due to non-linear ordering of lock requests (transaction 1 updates row A, transaction 2 updates row B, then tran 1 requests a lock on B while tran 2 requests a lock on A).
Can you post the offending query and deadlock graph? (if you are on SQL 2005 or later)
The best guide is:
http://technet.microsoft.com/es-es/library/ms173763.aspx
Snippet:
Specifies that statements can read rows that have been modified by other transactions but not yet committed.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
In SQL Server, you can also minimize locking contention while protecting transactions from dirty reads of uncommitted data modifications using either: the READ COMMITTED isolation level with the READ_COMMITTED_SNAPSHOT database option set to ON, or the SNAPSHOT isolation level.
On a different tack, there are two other aspects to consider, that may help.
1) Indexes and the indexes used by the SQL. The indexing strategy used on the tables will affect how many rows are locked. If you make the data modifications using a unique index, you may reduce the chance of deadlocks.
One algorithm - of course it will not work in all cases. The use of NOLOCK here is targeted rather than global.
The "old" way:
UPDATE dbo.change_table
SET somecol = newval
WHERE non_unique_value = 'something'
The "new" way:
-- capture the target keys first, then update via the unique uid
CREATE TABLE #temp_table (uid int)
INSERT INTO #temp_table
SELECT uid FROM dbo.change_table WITH (NOLOCK)
WHERE non_unique_value = 'something'
UPDATE c
SET somecol = newval
FROM dbo.change_table c
INNER JOIN
#temp_table t
ON (c.uid = t.uid)
2) Transaction duration
The longer a transaction is open, the more likely contention becomes. If there is a way to reduce the amount of time that records remain locked, you can reduce the chances of a deadlock occurring.
For example, perform as many SELECT statements (e.g. lookups) at the start of the code instead of performing an INSERT or UPDATE, then a lookup, then an INSERT, and then another lookup.
This is where one can use the NOLOCK hint for SELECTs on "static" tables that are not changing, reducing the lock "footprint" of the code.

Would this prevent the row from being read during the transaction?

I remember an example where reading data in a transaction and then writing it back is not safe, because another transaction may read or write it in between. So I would like to check the date and prevent the row from being modified or read until my transaction is finished. Would this do the trick? And are there any SQL variants on which this will not work?
update tbl set id=id where date > expire_date and id = @id
Note: date > expire_date happens to be my condition; it could be anything. Would this prevent other transactions from reading the row until I commit or roll back?
In a lot of cases, your UPDATE statement will not prevent other transactions from reading the row.
ziang mentioned transaction isolation levels.
Depending on the isolation level, databases use different types of locking. At the highest level, locking can be divided into two categories:
- pessimistic
- optimistic
MS SQL 2008, for example, has six isolation levels, four of them pessimistic and two optimistic. By default, it uses the READ COMMITTED isolation level, which falls into the pessimistic category.
Oracle, by contrast, uses optimistic locking by default.
The statement that will lock your record for writing is:
SELECT * FROM tbl WITH (UPDLOCK) WHERE id = @id
From that point on, no other transaction will be able to update your record with id = @id.
Note, though, that an update (U) lock blocks other writers but not plain shared-lock readers, so other transactions can still read the row. To block readers as well, take an exclusive lock with the XLOCK hint; then, under the default READ COMMITTED level, no other transaction will be able to read or write this record until you either commit or roll back your entire transaction, and only transactions running at READ UNCOMMITTED will be able to read it.
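To make the difference concrete, a sketch using the question's table and variable:
BEGIN TRAN
-- U lock: blocks writers and other UPDLOCK readers, but not plain SELECTs
SELECT * FROM tbl WITH (UPDLOCK) WHERE id = @id
-- X lock: blocks plain readers too (except NOLOCK / READ UNCOMMITTED ones)
SELECT * FROM tbl WITH (XLOCK, ROWLOCK) WHERE id = @id
COMMIT TRAN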
It depends on the transaction isolation level you set on your transaction control. There are four levels:
READ UNCOMMITTED: this allows the dirty read
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
For more info, you can check MSDN.
You should be able to do this in a normal select using a combination of
HOLDLOCK/ROWLOCK
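For example (a sketch against the question's table):
BEGIN TRAN
SELECT * FROM tbl WITH (HOLDLOCK, ROWLOCK) WHERE id = @id
-- the shared row lock is held until commit, so no one can modify the row meanwhile
COMMIT TRAN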
It very well may work. Different platforms offer different services. For instance, in T-SQL, you can simply set the isolation level of the transaction and, as a result, force a lock to be obtained. I don't know what platform you are using so I cannot answer your question definitively.