SQL - Update causing deadlock error

I'm trying to update a row in a table when someone views the page (it increments the viewed count), but now and then I get a deadlock error. I'm guessing this is due to two or more people trying to update the same row?
The error is:
Transaction (Process ID 60) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
And my SQL is:
UPDATE [ProductDescription]
SET [ViewCount] = ([ViewCount] + 1)
WHERE ProductCode = #prodCode
AND ApplicationID = #AppID
I believe I may need a WITH(NOLOCK)?

You DO NOT need NOLOCK. This will only remove read locks and will cause unpredictable results. The better thing to do would be to use TABLOCK on the update statement, meaning that other processes cannot access the table until you have finished.
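For illustration, the update from the question with that hint might look like this (just a sketch; TABLOCK makes the statement take its locks at the table level for the duration of the statement):
UPDATE [ProductDescription] WITH (TABLOCK)
SET [ViewCount] = ([ViewCount] + 1)
WHERE ProductCode = #prodCode
AND ApplicationID = #AppID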

Set the transaction isolation level to SERIALIZABLE or SNAPSHOT to update the data properly. For more details, check the isolation level documentation.
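A minimal sketch of the SNAPSHOT route, assuming a database named MyDatabase (SNAPSHOT isolation has to be enabled at the database level first; note that a concurrent write to the same row then raises an update-conflict error instead of blocking):
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRANSACTION
UPDATE [ProductDescription]
SET [ViewCount] = ([ViewCount] + 1)
WHERE ProductCode = #prodCode
AND ApplicationID = #AppID
COMMIT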

The problem is more likely to be caused by users running selects at the same time. The default isolation level is "read committed" which causes locks.
Unless it is critical that the data you're reading is up to date, consider using:
with(nolock)
in the selects or an alternative isolation level.
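For example, a read against the question's table might look like this (a sketch reusing the question's parameters):
SELECT [ViewCount]
FROM [ProductDescription] WITH (NOLOCK)
WHERE ProductCode = #prodCode
AND ApplicationID = #AppID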

Related

Postgres could not serialize access due to concurrent update

I have an issue with "could not serialize access due to concurrent update". I checked logs and I can clearly see that two transactions were trying to update a row at the same time.
My SQL query:
UPDATE sessionstore SET valid_until = %s WHERE sid = %s;
How can I tell postgres to "try" update row without throwing any exception?
There is a caveat here which has been mentioned in the comments: this error only occurs if you are using the REPEATABLE READ isolation level or higher, and that is not typically required unless you really have a specific reason for it.
Your problem will go away if you use the standard READ COMMITTED level. Even so, it is better to use SKIP LOCKED, which avoids lock waits as well as redundant updates and wasted WAL traffic.
As of Postgres 9.5+, there is a much better way to handle this, which would be like this:
UPDATE sessionstore
SET valid_until = %s
WHERE sid = (
SELECT sid FROM sessionstore
WHERE sid = %s
FOR UPDATE SKIP LOCKED
);
The first transaction to acquire the lock in SELECT FOR UPDATE SKIP LOCKED will cause any conflicting transaction to select nothing, leading to a no-op. As requested, it will not throw an exception.
See SKIP LOCKED notes here:
https://www.postgresql.org/docs/current/static/sql-select.html
Also the advice about a savepoint is not specific enough. What if the update fails for a reason besides a serialization error? Like an actual deadlock? You don’t want to just silently ignore all errors. But these are also in general a bad idea - an exception handler or a savepoint for every row update is a lot of extra overhead especially if you have high traffic. That is why you should use READ COMMITTED and SKIP LOCKED both to simplify the matter, and any actual error then would be truly unexpected.
The canonical way to do that would be to set a savepoint before the UPDATE:
SAVEPOINT x;
If the update fails,
ROLLBACK TO SAVEPOINT x;
and the transaction can continue.
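Put together, the pattern looks roughly like this (a sketch; the %s placeholders are filled in by the application, as in the question):
BEGIN;
SAVEPOINT x;
UPDATE sessionstore SET valid_until = %s WHERE sid = %s;
-- if the UPDATE fails with a serialization error, the application issues
-- ROLLBACK TO SAVEPOINT x; and the rest of the transaction can continue
COMMIT;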
The default isolation level is "read committed", unless you have changed it for some specific use case.
You must be using the "repeatable read" or "serializable" isolation level. There, the current transaction rolls back if an already running transaction updates a value that the current transaction was also supposed to update.
This scenario is handled easily by the "read committed" isolation level, where the current transaction accepts the updated value from the other transaction and performs its instructions after the previous transaction has committed.
ALTER DATABASE your_database SET DEFAULT_TRANSACTION_ISOLATION TO 'read committed'; -- your_database is a placeholder
Ref: https://www.postgresql.org/docs/9.5/transaction-iso.html

SQL Server Update Locks

If you have the following SQL, is it possible that, if it is run multiple times by many different processes at exactly the same time, two or more processes may update the table?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
UPDATE table
SET Column1 = 1
WHERE Column1 = 0
No other locks etc. are specified in the SQL, other than Read Uncommitted.
I'm trying to track down an issue, and I'm now clutching at straws...
Got this from MSDN.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
So basically, this is equivalent to SQL Server's NOLOCK hint. It might result in dirty reads: if some process is updating 1000 records and has updated 500 of them so far, and another process reads that data, then the data might be in an inconsistent form. It also helps the update execute without being blocked by the shared locks of multiple SELECT queries.
Hope this makes some sense of your question. For reference -- MSDN
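In other words, these two reads behave the same way with respect to locking (a sketch using the question's placeholder table name):
-- session-level setting:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT Column1 FROM [table]

-- per-table hint under the default READ COMMITTED level:
SELECT Column1 FROM [table] WITH (NOLOCK)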

Check if table data has changed?

I am pulling the data from several tables and then passing the data to a long running process. I would like to be able to record what data was used for the process and then query the database to check if any of the tables have changed since the process was last run.
Is there a method of solving this problem that should work across all sql databases?
One possible solution that I've thought of is having a separate table that is only used for keeping track of whether the data has changed since the process was run. The table contains a "stale" flag. When I start running the process, stale is set to false. If any creation, update, or deletion occurs in any of the tables on which the operation depends, I set stale to true. Is this a valid solution? Are there better solutions?
One concern with my solution is situations like this:
One user starts inserting a new row into one of the tables. Stale gets set to true, but the new row has not actually been added yet. Another user has simultaneously started the long running process, pulling the data from the tables and setting the flag to false. The row is finally added. Now the data used for the process is out of date but the flag indicates it is not stale. Would transactions be able to solve this problem?
EDIT:
This is some SQL for my idea. Not sure if it works, but just to give you a better idea of what I was thinking:
-- First transaction reads the data and sets the flag to false
BEGIN TRANSACTION
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
UPDATE flag SET stale = false
SELECT * FROM DATATABLE
COMMIT TRANSACTION
-- Second transaction updates the data and sets the flag to true
BEGIN TRANSACTION
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
UPDATE data SET val = 15 WHERE ID = 10
UPDATE flag SET stale = true
COMMIT TRANSACTION
I do not have much experience with transactions or hand-writing SQL, so there are probably issues with this. From what I understand, two serializable transactions cannot be interleaved. Please correct me if I'm wrong.
Is there a way to accomplish this with only the first transaction? The process will be run rarely, but the updates to the data table will occur more frequently, so it would be nice to not lock up the data table when performing updates.
Also, is the SET TRANSACTION ISOLATION syntax specific to MS?
The stale flag will probably work, but a timestamp would be better since it provides more metadata about the age of the records which could be used to tune your queries, e.g., only pull data that is over 5 minutes old.
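A minimal sketch of that idea, with illustrative table and column names (last_updated would need a DATETIME2 type on SQL Server):
CREATE TABLE table_versions (
    table_name   VARCHAR(100) PRIMARY KEY,
    last_updated TIMESTAMP NOT NULL
);

INSERT INTO table_versions (table_name, last_updated) VALUES ('DATATABLE', CURRENT_TIMESTAMP);

-- every transaction that modifies the tracked table also runs:
UPDATE table_versions
SET last_updated = CURRENT_TIMESTAMP
WHERE table_name = 'DATATABLE';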
To address your concern about inserting a row at the same time a query is run, transactions with an appropriate isolation level will help. For row inserts, updates, and selects, at least use a transaction with an isolation level that prevents dirty reads so that no other connections can see the updated data until the transaction is committed.
If you are strongly concerned about the case where an update happens at the same time as a record pull, you could use the REPEATABLE READ or even SERIALIZABLE isolation levels, but this will slow DB access down.
Your SQL Server sample should work. For alternate databases, here's an example that works in Postgres:
Transaction 1
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- run queries that update the tables, then set last_updated column
UPDATE sometable SET last_updated = now() WHERE id = 1;
COMMIT;
Transaction 2
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- select data from tables, then set last_queried column
UPDATE sometable SET last_queried = now() WHERE id = 1;
COMMIT;
If transaction 1 starts, and then transaction 2 starts before transaction 1 has completed, transaction 2 will block on the update and will then throw an error when transaction 1 is committed. If transaction 2 starts first, and transaction 1 starts before it has finished, then transaction 1 will error. Your application code or process should be able to handle those errors.
Other databases use similar syntax - MySQL (with InnoDB plugin) requires you to set the isolation level before you start the transaction.
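For reference, a rough MySQL/InnoDB equivalent (same hypothetical table as above; note the isolation level is set before the transaction starts):
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
UPDATE sometable SET last_updated = NOW() WHERE id = 1;
COMMIT;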

Why does a deadlock occur?

I use a small transaction which consists of two simple queries: select and update:
SELECT * FROM XYZ WHERE ABC = DEF
and
UPDATE XYZ SET ABC = 123
WHERE ABC = DEF
It quite often happens that the transaction is started by two threads at the same time, and depending on the isolation level (Repeatable Read, Serializable) a deadlock occurs. Both transactions try to read and update exactly the same row.
I'm wondering why this is happening. What is the order of queries which leads to the deadlock? I've read a bit about locks (shared, exclusive) and how long locks last for each isolation level, but I still don't fully understand...
I've even prepared a simple test which always results in a deadlock. I've looked at the results of the test in SSMS and SQL Server Profiler. I started the first query and then immediately the second.
First query:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT ...
WAITFOR DELAY '00:00:04'
UPDATE ...
COMMIT
Second query:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT ...
UPDATE ...
COMMIT
Now I'm not able to show you detailed logs, but it looks more or less like this (I've very likely missed Lock:deadlock etc. somewhere):
(1) SQL:BatchStarting: First query
(2) SQL:BatchStarting: Second query
(3) Lock:timeout for second query
(4) Lock:timeout for first query
(5) Deadlock graph
If I understand locks well, in (1) the first query takes a shared lock (to execute the SELECT), then goes to sleep and keeps the shared lock until the end of the transaction. In (2) the second query also takes a shared lock (SELECT) but cannot take an exclusive lock (UPDATE) while there are shared locks on the same row, which results in Lock:timeout. But I can't explain why the timeout for the second query occurs. Probably I don't understand the whole process well. Can anybody give a good explanation?
I haven't noticed deadlocks using ReadCommitted but I'm afraid they may occur.
What solution do you recommend?
A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock.
http://msdn.microsoft.com/en-us/library/ms177433.aspx
"But I can't explain why timeout for second query occurs."
Because the first query holds shared lock. Then the update in the first query also tries to get the exclusive lock, which makes him sleep. So the first and second query are both sleeping waiting for the other to wake up - and this is a deadlock which results in timeout :-)
In MySQL it works better - the deadlock is detected immediately and one of the transactions is rolled back (you do not need to wait for the timeout :-)).
Also, in mysql, you can do the following to prevent deadlock:
select ... for update
which will put a write lock (i.e. an exclusive lock) on the rows right from the beginning of the transaction, and this way you avoid the deadlock situation! Perhaps you can do something similar in your database engine.
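In SQL Server, a comparable approach is the UPDLOCK hint on the SELECT, which takes update locks up front instead of shared locks (a sketch using the question's XYZ/ABC/DEF names):
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
-- UPDLOCK makes the SELECT take update locks instead of shared locks,
-- so two competing transactions serialize here instead of deadlocking later
SELECT * FROM XYZ WITH (UPDLOCK) WHERE ABC = DEF
UPDATE XYZ SET ABC = 123 WHERE ABC = DEF
COMMIT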
For MSSQL there is a mechanism to help prevent deadlocks. What you need here is called the WITH (NOLOCK) hint.
In 99.99% of the cases of SELECT statements it's usable, and there is no need to bundle the SELECT with the UPDATE. There is also no need to put a SELECT into a transaction. The only exception is when dirty reads are not allowed.
Changing your queries to this form would solve all your issues:
SELECT ...
FROM yourtable WITH (NOLOCK)
WHERE ...
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE ...
COMMIT
It has been a long time since I last dealt with this, but I believe that the SELECT statement takes a read lock, which only prevents the data from being changed -- hence multiple queries can hold and share a read lock on the same data. The shared read lock is for read consistency; that is, if you read the same row multiple times within your transaction, read consistency means you should always get the same result.
The UPDATE statement requires an exclusive lock, and hence the UPDATE statement has to wait for the read locks to be released.
Neither of the two transactions will release its locks, so they deadlock and one of them fails.
Different database implementations have different strategies for how to deal with this, with Sybase and MS SQL Server using lock escalation with timeout (escalating from read lock to write lock) -- Oracle, I believe, (at some point) implemented read consistency through use of the rollback log, whereas MySQL has yet another strategy.

Deadlocks - Will this really help?

So I've got a query that keeps deadlocking on me. People who know the system well can't figure out why the sproc is deadlocking, but they tell me that I should just add this to it:
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Is this really a valid solution? What does that do?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
This will cause the system to return inconsistent data, including duplicate records and missing records. Read more at Previously committed rows might be missed if NOLOCK hint is used, or here at Timebomb - The Consistency problem with NOLOCK / READ UNCOMMITTED.
Deadlocks can be investigated and fixed; it is not a big deal if you follow the proper procedure. Of course, throwing in a dirty read may seem easier, but down the road you'll be sitting long hours staring at your general ledger and wondering why the heck it does not balance debits and credits. So read again until you really grok this: DIRTY READS ARE INCONSISTENT READS.
If you want a get-out-of-jail card, turn on snapshot isolation:
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
But keep in mind that snapshot isolation does not fix the deadlocks, it only hides them. Proper investigation of the deadlock cause and fix is always the appropriate action.
NOCOUNT will keep your query from returning row counts back to the calling application (e.g. "1000000 rows affected").
TRANSACTION ISOLATION LEVEL READ UNCOMMITTED will allow for dirty reads as indicated here.
The isolation level may help, but do you want to allow dirty reads?
Randomly adding SET options to the query is unlikely to help, I'm afraid.
SET NOCOUNT ON
Will have no effect on the issue.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
will prevent your query from taking out shared locks. As well as reading "dirty" data, it can also lead to your query reading the same rows twice, or not at all, dependent upon what other concurrent activity is happening.
Whether this will resolve your deadlock issue depends upon the type of deadlock. It will have no effect at all if the issue is two writers deadlocking due to non-linear ordering of lock requests (transaction 1 updates row a and transaction 2 updates row b, then tran 1 requests a lock on b and tran 2 requests a lock on a).
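For illustration, that write-write pattern looks like this (a sketch with a hypothetical table t holding rows a and b):
-- Session 1:
BEGIN TRAN
UPDATE t SET col = 1 WHERE id = 'a'   -- locks row a
-- Session 2 (concurrently):
BEGIN TRAN
UPDATE t SET col = 1 WHERE id = 'b'   -- locks row b
-- Session 1:
UPDATE t SET col = 1 WHERE id = 'b'   -- blocks, waiting on session 2
-- Session 2:
UPDATE t SET col = 1 WHERE id = 'a'   -- blocks, waiting on session 1 -> deadlock; one session is chosen as the victim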
Can you post the offending query and deadlock graph? (if you are on SQL 2005 or later)
The best guide is:
http://technet.microsoft.com/es-es/library/ms173763.aspx
Snippet:
Specifies that statements can read rows that have been modified by other transactions but not yet committed.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
In SQL Server, you can also minimize locking contention while protecting transactions from dirty reads of uncommitted data modifications using either:
The READ COMMITTED isolation level with the READ_COMMITTED_SNAPSHOT database option set to ON.
The SNAPSHOT isolation level.
On a different tack, there are two other aspects to consider that may help.
1) Indexes, and the indexes used by the SQL. The indexing strategy used on the tables will affect how many rows are locked. If you make the data modifications using a unique index, you may reduce the chance of deadlocks.
One algorithm - of course it will not work in all cases. The use of NOLOCK here is targeted rather than global.
The "old" way:
UPDATE dbo.change_table
SET somecol = newval
WHERE non_unique_value = 'something'
The "new" way:
-- SELECT ... INTO creates #temp_table(uid) on the fly
SELECT uid
INTO #temp_table
FROM dbo.change_table WITH (NOLOCK)
WHERE non_unique_value = 'something'

-- update via the alias so the target and the FROM reference are the same table instance
UPDATE c
SET somecol = newval
FROM dbo.change_table c
INNER JOIN #temp_table t
ON (c.uid = t.uid)
2) Transaction duration
The longer a transaction is open the more likely there may be contention. If there is a way to reduce the amount of time that records remain locked, you can reduce the chances of a deadlock occurring.
For example, perform as many SELECT statements (e.g. lookups) as possible at the start of the code, instead of performing an INSERT or UPDATE, then a lookup, then an INSERT, and then another lookup.
This is where one can use the NOLOCK hint for SELECTs on "static" tables that are not changing, reducing the lock "footprint" of the code.