SQL database locking difference between SELECT and UPDATE - sql

I am re-writing an old stored procedure which is called by BizTalk. Now this has the potential to have 50-60 messages pushed through at once.
I occasionally have an issue with database locking when they are all trying to update at once.
I can only make changes in SQL (not BizTalk) and I am trying to find the best way to run my SP.
With this in mind what i have done is to make the majority of the statement to determine if an UPDATE is needed by using a SELECT statement.
What my question is - What is the difference regarding locking between an UPDATE statement and a SELECT with a NOLOCK against it?
I hope this makes sense - Thank you.

You use nolock when you want to read uncommitted data and want to avoid taking any shared lock on the data so that other transactions can take exclusive lock for updating/deleting.
You should not use nolock with update statement, it is really a bad idea, MS says that nolock are ignored for the target of update/insert statement.
Support for use of the READUNCOMMITTED and NOLOCK hints in the FROM
clause that apply to the target table of an UPDATE or DELETE statement
will be removed in a future version of SQL Server. Avoid using these
hints in this context in new development work, and plan to modify
applications that currently use them.
Source
Regarding your locking problem during multiple updates happening at the same time. This normally happens when you read data with the intention to update it later by just putting a shared lock, the following UPDATE statement can’t acquire the necessary Update Locks, because they are already blocked by the Shared Locks acquired in the different session causing the deadlock.
To resolve this you can select the records using UPDLOCK like following
DECLARE #IdToUpdate INT
SELECT #IdToUpdate =ID FROM [Your_Table] WITH (UPDLOCK) WHERE A=B
UPDATE [Your_Table]
SET X=Y
WHERE ID=#IdToUpdate
This will take the necessary Update lock on the record in advance and will stop other sessions to acquire any lock (shared/exclusive) on the record and will prevent from any deadlocks.

NOLOCK: Specifies that dirty reads are allowed. No shared locks are issued to prevent other transactions from modifying data read by the current transaction, and exclusive locks set by other transactions do not block the current transaction from reading the locked data. NOLOCK is equivalent to READUNCOMMITTED.
Thus, while using NOLOCK you get all rows back but there are chances to read Uncommitted (Dirty) data. And while using READPAST you get only Committed Data so there are chances you won’t get those records that are currently being processed and not committed.
For your better understanding please go through below link.
https://www.mssqltips.com/sqlservertip/2470/understanding-the-sql-server-nolock-hint/
https://www.mssqltips.com/sqlservertip/4468/compare-sql-server-nolock-and-readpast-table-hints/
https://social.technet.microsoft.com/wiki/contents/articles/19112.understanding-nolock-query-hint.aspx

Related

SQL Server Update Locks

If you have the following sql, is it possible that if it is run multiple times by many different processes at exactly the same time, that two or more processes may update the table?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
UPDATE table
SET Column1 = 1
WHERE Column1 = 0
No other locks etc are specified in the sql, other that Read Uncommitted.
I'm trying to track down an issue, and I'm now clutching at straws...
Got this from MSDN.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
So basically, this is equivalent to SQL Server , NOLOCK hint. This might result in dirty reads, i.e. if some process in updated 1000 records and updated 500 till now, and other process read that data, then data might be in inconsistent form. This also helps in executing update without getting blocked (shared lock) by multiple select queries.
Hope this make some sense to your question. For reference -- MSDN

In SQL Server 2005, when does a Select query block Inserts or Updates to the same or other table(s)?

In the past I always thought that select query would not blocks other insert sql. However, recently I wrote a query that takes a long time (more than 2 min) to select data from a table. During the select, a number of insert sql statements were timing out.
If select blocks insert, what would be the solution way to prevent the timeout without causing dirty read?
I have investigate option of using isolation snapshot, but currently I have no access to change the client's database to enable the “ALLOW_SNAPSHOT_ISOLATION”.
Thanks
When does a Select query block Inserts or Updates to the same or
other table(s)?
When it holds a lock on a resource that is mutually exclusive with one that the insert or update statement needs.
Under readcommitted isolation level with no additional locking hints then the S locks taken out are typically released as soon as the data is read. For repeatable read or serializable however they will be held until the end of the transaction (statement for a single select not running in an explicit transaction).
serializable will often take out range locks which will cause additional blocking over and above that caused by the holding of locks on the rows and pages actually read.
READPAST might be what you're looking for - check out this article.

Why deadlock occurs?

I use a small transaction which consists of two simple queries: select and update:
SELECT * FROM XYZ WHERE ABC = DEF
and
UPDATE XYZ SET ABC = 123
WHERE ABC = DEF
It is quite often situation when the transaction is started by two threads, and depending on Isolation Level deadlock occurs (RepeatableRead, Serialization). Both transactions try to read and update exactly the same row.
I'm wondering why it is happening. What is the order of queries which leads to deadlock? I've read a bit about lock (shared, exclusive) and how long locks last for each isolation level, but I still don't fully understand...
I've even prepared a simple test which always result in deadlock. I've looked at results of the test in SSMS and SQL Server Profiler. I started first query and then immediately the second.
First query:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT ...
WAITFOR DELAY '00:00:04'
UPDATE ...
COMMIT
Second query:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT ...
UPDATE ...
COMMIT
Now I'm not able to show you detailed logs, but it looks less or more like this (I've very likely missed Lock:deadlock etc. somewhere):
(1) SQL:BatchStarting: First query
(2) SQL:BatchStarting: Second query
(3) Lock:timeout for second query
(4) Lock:timeout for first query
(5) Deadlock graph
If I understand locks well, in (1) first query takes a shared lock (to execute SELECT), then goes to sleep and keeps the shared lock until the end of transaction. In (2) second query also takes shared lock (SELECT) but cannot take exclusive lock (UPDATE) while there are shared locks on the same row, which results in Lock:timeout. But I can't explain why timeout for second query occurs. Probably I don't understand the whole process well. Can anybody give a good explanation?
I haven't noticed deadlocks using ReadCommitted but I'm afraid they may occur.
What solution do you recommend?
A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock
http://msdn.microsoft.com/en-us/library/ms177433.aspx
"But I can't explain why timeout for second query occurs."
Because the first query holds shared lock. Then the update in the first query also tries to get the exclusive lock, which makes him sleep. So the first and second query are both sleeping waiting for the other to wake up - and this is a deadlock which results in timeout :-)
In mysql it works better - the deadlock is detected immediatelly and one of the transactions is rolled back (you need not to wait for timeout :-)).
Also, in mysql, you can do the following to prevent deadlock:
select ... for update
which will put a write-lock (i.e. exclusive lock) just from the beginning of the transaction, and this way you avoid the deadlock situation! Perhaps you can do something similar in your database engine.
For MSSQL there is a mechanism to prevent deadlocks. What you need here is called the WITH NOLOCK hint.
In 99.99% of the cases of SELECT statements it's usable and there is no need to bundle the SELECT with the UPDATE. There is also no need to put a SELECT into a transaction. The only exception is when dirty reads are not allowed.
Changing your queries to this form would solve all your issues:
SELECT ...
FROM yourtable WITH (NOLOCK)
WHERE ...
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE ...
COMMIT
It has been a long time since I last dealt with this, but I believe that the select statement creates a read-lock, which only prevents the data to be changed -- hence multiple queries can hold and share a read-lock on the same data. The shared-read-lock is for read consistency, that is if you multiple times in your transaction reads the same row, then read-consistency should mean that you should always get the same result.
The update statement requires an exclusive lock, and hence the update statement have to wait for the read-lock to be released.
None of the two transactions will release the locks, so the transactions fails.
Different databases implementations have different strategies for how to deal with this, with Sybase and MS-SQL-servers using lock escalation with timeout (escalate from read-to-write-lock) -- Oracle I believe (at some point) implemented read consistency though use of the roll-back-log, where MySQL have yet a different strategy.

Deadlocks - Will this really help?

So I've got a query that keeps deadlocking on me. People who know the system well can't figure out why the sproc is deadlocking, but they tell me that I should just add this to it:
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Is this really a valid solution? What does that do?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
This will cause the system to return inconsitent data, including duplicate records and missing records. Read more at Previously committed rows might be missed if NOLOCK hint is used, or here at Timebomb - The Consistency problem with NOLOCK / READ UNCOMMITTED.
Deadlocks can be investigated and fixed, is not a big deal if you follow the proper procedure. Of course, throwing a dirty read may seem easier, but down the road you'll be sitting long hours staring at your general ledger and wondering why the heck it does not balance debits and credits. So read again until you really grok this: DIRTY READs ARE INCONSISTENT READS.
If you want a get-out-of-jail card, turn on snapshot isolation:
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
But keep in mind that snapshot isolation does not fix the deadlocks, it only hides them. Proper investigation of the deadlock cause and fix is always the appropriate action.
NOCOUNT will keep your query from returning rowcounts back to the calling application (i.e. 1000000 rows affected).
TRANSACTION ISOLATION LEVEL READ UNCOMMITTED will allow for dirty reads as indicated here.
The isolation level may help, but do you want to allow dirty reads?
Randomly adding SET options to the query is unlikely to help I'm afraid
SET NOCOUNT ON
Will have no effect on the issue.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
will prevent your query taking out shared locks. As well as reading "dirty" data it also can lead to your query reading the same rows twice, or not at all, dependant upon what other concurrent activity is happening.
Whether this will resolve your deadlock issue depends upon the type of deadlock. It will have no effect at all if the issue is 2 writers deadlocking due to non linear ordering of lock requests. (transaction 1 updating row a, transaction 2 updating row b then tran 1 requesting a lock on b and tran 2 requesting a lock on a)
Can you post the offending query and deadlock graph? (if you are on SQL 2005 or later)
The best guide is:
http://technet.microsoft.com/es-es/library/ms173763.aspx
Snippet:
Specifies that statements can read rows that have been modified by other
transactions but not yet committed.
Transactions running at the READ
UNCOMMITTED level do not issue shared
locks to prevent other transactions
from modifying data read by the
current transaction. READ UNCOMMITTED
transactions are also not blocked by
exclusive locks that would prevent the
current transaction from reading rows
that have been modified but not
committed by other transactions. When
this option is set, it is possible to
read uncommitted modifications, which
are called dirty reads. Values in the
data can be changed and rows can
appear or disappear in the data set
before the end of the transaction.
This option has the same effect as
setting NOLOCK on all tables in all
SELECT statements in a transaction.
This is the least restrictive of the
isolation levels.
In SQL Server, you can also minimize
locking contention while protecting
transactions from dirty reads of
uncommitted data modifications using
either:
The READ COMMITTED isolation level
with the READ_COMMITTED_SNAPSHOT
database option set to ON. The
SNAPSHOT isolation level
.
On a different tack, there are two other aspects to consider, that may help.
1) Indexes and the indexes used by the SQL. The indexing strategy used on the tables will affect how many rows are affected. If you make the data modifications using a unique index, you may reduce the chance of deadlocks.
One algorithm - of course it will not work it all cases. The use of NOLOCK is targeted rather than being global.
The "old" way:
UPDATE dbo.change_table
SET somecol = newval
WHERE non_unique_value = 'something'
The "new" way:
INSERT INTO #temp_table
SELECT uid FROM dbo.change_table WITH (NOLOCK)
WHERE non_unique_value = 'something'
UPDATE dbo.change_table
SET somecol = newval
FROM dbo.change_table c
INNER JOIN
#temp_table t
ON (c.uid = t.uid)
2) Transaction duration
The longer a transaction is open the more likely there may be contention. If there is a way to reduce the amount of time that records remain locked, you can reduce the chances of a deadlock occurring.
For example, perform as many SELECT statements (e.g. lookups) at the start of the code instead of performing an INSERT or UPDATE, then a lookup, then an INSERT, and then another lookup.
This is where one can use the NOLOCK hint for SELECTs on "static" tables that are not changing reducing the lock "footprint" of the code.

SQL Transactions (b)locking Selects - is my understanding correct

We are using ADO.NET to connect to a SQL 2005 server, and doing a number of inserts/updates and selects in it. We changed one of the updates to be inside a transaction however it appears to (b)lock the entire table when we do it, regardless of the IsolationLevel we set on the transaction.
The behavior that I seem to see is that:
If you have no transactions then it's an all out fight (losers getting dead locked)
If you have a few transactions then they win all the time and block all others out unless
If you have a few transactions and you set something like nolock on the rest then you get transactions and nothing blocked. This is because every statement (select/insert/delete/update) has an isolationlevel regardless of transactions.
Is this correct?
The answer to your question is: It depends.
If you are updating a table, SQL Server uses several strategies to decide how many rows to lock, row level locks, page locks or full table locks.
If you are updating more than a certain percentage of the table (configurable as I remember), then SQL Server gives you a table level lock, which may block selects.
The best reference is:
Understanding Locking in SQL Server:
http://msdn.microsoft.com/en-us/library/aa213039(SQL.80).aspx
(for SQL Server 2000)
Introduction to Locking in SQL Server: http://www.sqlteam.com/article/introduction-to-locking-in-sql-server
Isolation Levels in the Database Engine: http://msdn.microsoft.com/en-us/library/ms189122.aspx (for SQL server 2008, but 2005 version is available).
Good luck.
Your update statement (i.e one that changes data) will hold locks regardless of the isolation level and whether you have explicitly defined a transaction of not.
What you can control is the granularity of the locks by using query hints. So if the update is locking the entire table, then you can specify a query hint to only lock the affected rows (ROWLOCK hint). That is unless your query is updating the whole table of course.
So to answer your question, the first connection to request locks on a resource will hold those locks for the duration of the transaction. You can specify that a select does not hold locks by using the read uncommitted isolation level, statements that change data insert/update/delete always hold locks regardless. The next connection to request locks on the same resource will wait until the first has finished and will then hold its locks. Dead locking is a specific scenario where two connections are holding locks and each is waiting for the other connection's resource, to avoid the engine waiting forever, one connection is chosen as the deadlock victim.