I am using SSIS 2008 to execute multiple stored procedures in parallel in control flow.
Each SP is supposed to ultimately update 1 row in a table.
The point to note is that each SP has a defined responsibility to update specific columns only.
It is guaranteed that the different SPs will not update each other's columns. So the columns to be updated are divided between the various SPs, but as per design, each SP is supposed to work on the same row ultimately.
At the moment some of my SPs error out due to deadlocks. I am guessing this may be because of locks held on that row by the other SPs?
How can I work this out?
You have to admit, this seems like a highly unusual thing to do. I wonder if it wouldn't be better to update separate tables, and then have a single update statement at the end that would join the individual tables to the final one? (i.e. update a set a.[1] = ... from a inner join b inner join c etc.).
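For illustration, here is a rough sketch of that alternative, assuming hypothetical staging tables StageB and StageC keyed by the same Id as the final table A (all names are made up for the example):

-- Each SP updates only its own staging table, so they never touch the same row.
-- One final statement folds the staged values into the shared row.
UPDATE a
SET a.Col1 = b.Col1,
    a.Col2 = c.Col2
FROM dbo.A AS a
INNER JOIN dbo.StageB AS b ON b.Id = a.Id
INNER JOIN dbo.StageC AS c ON c.Id = a.Id;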
But, if you want to continue down this path, then just set READ UNCOMMITTED from within each of your stored procedures. That is the best bet.
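If you do go the READ UNCOMMITTED route, it would look roughly like this at the top of each procedure (a sketch only; the procedure, table, and column names are placeholders):

CREATE PROCEDURE dbo.UpdateColumnsForFeatureX
    @Id INT,
    @Value VARCHAR(50)
AS
BEGIN
    -- Reads inside this procedure no longer take shared locks.
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    UPDATE dbo.TargetTable
    SET FeatureXColumn = @Value
    WHERE Id = @Id;
END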
The deadlocks are probably caused by more than just having another SP locking the row. Under that situation the first procedure would just wait until the locking SP releases the lock. That's not to say that your multiple procedures are not causing the problem. There's more to it though.
You may have to do some rework to avoid the problem, but first you should find out more about the deadlock situation. I suspect that you have locks on objects other than the row that's being updated.
There are ways to gather more information on the deadlock. Here's a link where you can learn about the deadlock details.
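One low-friction way to capture those details (one option among several) is to enable the deadlock trace flags, so SQL Server writes the deadlock graph to the error log the next time it happens:

-- 1222 writes an XML-style description of the deadlock (resources, statements, victim);
-- 1204 writes the older, node-oriented format. The -1 applies the flag server-wide.
DBCC TRACEON (1222, -1);
DBCC TRACEON (1204, -1);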
Simple: make sure you do not leave locks anywhere. This is governed by the transaction isolation level of the connection. With the proper transaction isolation there will be no lingering locks, so no deadlocks.
The update is not the problem; it starts with the READS. Go to READ UNCOMMITTED to make sure you do not leave locks behind on reads, and/or use the NOLOCK hint in the SELECT statements to individually force them to leave no locks in place while reading data (the more advisable option).
If you then make sure that each SP commits pretty much immediately after its update (internally or externally), no deadlock should be possible: a write lock will simply result in the next SP waiting for the first transaction to commit.
Especially when you are down to one table and one row only, a deadlock is not possible with update statements alone, IIRC. It is the read locks that turn a merely delaying lock (albeit a small delay) into a deadlock.
http://www.eggheadcafe.com/software/aspnet/30976898/how-does-update-lock-prevent-deadlocks.aspx has a nice example of a deadlock. So if everything BUT the update takes no locks, then in the end a deadlock is not possible.
Row level locking is as low as you can safely go, so I think you will have to break up the updated table into several tables that you can update independently of each other.
No, two sessions cannot update the same row simultaneously. Row level locking is as low as it gets, so when one session is updating a row, other sessions wanting to update that record will wait.
As for the deadlocks, the trouble with SQL Server is that by default SELECT statements will block if they ask for a record that is currently being updated. You can use WITH (NOLOCK) if you don't mind reading uncommitted data.
So, if you've got this order of events:
SessionA
begin transaction
update t1
set c1 = 'x'
where c2 = 5
SessionB
begin transaction
update t2
set c1 = 'y'
where c2 = 7
SessionA
select * from t2 where c2 = 7
<waits on SessionB>
SessionB
select * from t1 where c2 = 5
<waits on SessionA>...oops. Deadlock.
The trick is to lock things for as little time as necessary (but don't break up a series of statements just to release a lock: make sure that the steps that make up a logical transaction stay together in one transaction):
begin transaction
update t1
set c1 = 'x'
where c2 = 5
commit
Or (caveat emptor) use the nolock hint:
SessionA
select c1 from t2 with (nolock) where c2 = 7
<gets the uncommitted value of 'y'>
SessionB
select c1 from t1 with (nolock) where c2 = 5
<gets the uncommitted value of 'x'>
We are trying to retrieve and update the TOP X events from a table, but without locking anything other than the "processed" rows. We looked into different SQL hints like ROWLOCK and READPAST, but haven't figured out what combination of them should be used in this scenario. Also, we need to make sure that the returned rows are unique across different concurrent executions of the query and that the same row will never be selected twice.
Note: This table has got many INSERTs happening concurrently.
UPDATE TOP(:batchSize) a
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time
FROM table_A a WITH (READPAST)
WHERE a.scheduled_time < GETUTCDATE() AND a.consumer_ip IS NULL
Any help is highly appreciated. Many thanks!
I don't quite follow how/why you are trying to use the READPAST hint here.
But anyway - to achieve what you want I would suggest:
WITH xxx AS
(
    SELECT TOP(:batchSize) *
    FROM table_A
    WHERE scheduled_time < GETUTCDATE() AND consumer_ip IS NULL
)
UPDATE xxx
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time;
If all that can happen in the background is new inserts, then I can't see why this would be a problem. The SQL Server optimiser will most likely decide on PAGE/ROW locks (though this depends on your DB settings as well as on the indexes affected and their options). If for any reason you want to stop other transactions until this update is finished (i.e. hold an exclusive lock on the entire table until the end of your transaction), you can just add WITH (TABLOCKX). In any case, I would strongly recommend having a good read on SQL Server concurrency and isolation before you start messing with it in a production environment.
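For completeness, when several consumers do run this query concurrently (as the question describes), the READPAST hint is commonly combined with UPDLOCK and ROWLOCK inside the same CTE pattern, so each consumer simply skips rows another consumer has already claimed and no row is handed out twice. A sketch under the same assumed schema and parameters:

WITH claim AS
(
    SELECT TOP(:batchSize) *
    FROM table_A WITH (ROWLOCK, UPDLOCK, READPAST)
    WHERE scheduled_time < GETUTCDATE() AND consumer_ip IS NULL
)
UPDATE claim
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time;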
I need to implement the serializable isolation level in SQL Server, but I've tried many ways and can't get it to work.
I need to lock 1 row in one transaction (it doesn't matter if it locks the complete table), so that another transaction can't even select the row (no reads at all).
The last thing I tried:
For transaction 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
-- Here I select in another instance the same row
COMMIT TRAN
For transaction 2:
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
COMMIT TRAN
I would expect transaction 2 to wait until transaction 1 commits, but transaction 2 returns the row anyway.
Can anyone explain what I am missing?
SQL Server conforms to the strict definition of a Serializable query. That is, there must be a result that can logically be generated IF both queries ran in serial order - Transaction 1 finishing before Transaction 2 can start, or vice versa.
This results in some effects that can be different than you would expect. There is a great explanation of the Serializable isolation level over at SQLPerformance.com that makes clear some of what this logical serializability ends up meaning. (Very helpful site, that one.)
For your above queries, there is no logical requirement to prevent the second query from reading the same row as the first query. No matter in what order the queries are run, they will both return the same data without modifying it. Since the query engine can identify this, there is no reason to place a blocking read lock on the data. However, if one of the queries performed an update on the data, then (warning: logical assumption here, since I don't actually know the internals of how SQL Server handles this) the engine would take a stronger lock on the selected rows.
TL;DR - SQL Server wants to minimize blocking, so it uses logical analysis to see what types of locks are needed for a serializable isolation level, and it (tries to) use the minimum number and strength of locks needed to achieve its goal.
Now that we've dealt with that - there are only two ways that I can think of to lock a row so that no one else can read it: using XLOCK + TABLOCK (locking the whole table - not a recommended practice) or having some form of a field on each row that is updated when you start your process - something like an SPID field, or a bit flag for Locked. When you update it within your transaction, only SELECTs with NOLOCK hints will be able to read it.
Clearly, neither of these are optimal. I recommend the "This row is busy - go away" flag, as that's probably the approach I would take for an (almost) absolute lock on a row.
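A minimal sketch of that busy-flag idea, assuming a hypothetical LockedBy column added to table1 (the column and pattern are illustrative, not part of the original question):

BEGIN TRAN;

-- Mark the row as busy; the exclusive lock taken by this update is held until
-- COMMIT, so ordinary readers (without NOLOCK) are blocked, as described above.
UPDATE table1
SET LockedBy = @@SPID
WHERE code = 1 AND LockedBy IS NULL;

-- ... work with the row here ...
SELECT code FROM table1 WHERE code = 1;

-- Clear the flag; COMMIT releases the locks.
UPDATE table1
SET LockedBy = NULL
WHERE code = 1 AND LockedBy = @@SPID;

COMMIT TRAN;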
According to the documentation:
SERIALIZABLE Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
If you're not making any changes to data with an INSERT, UPDATE, or DELETE inside transaction 1, SQL will release the Shared Lock after the read completes.
What you might want to try is adding a table hint to prevent the row lock from being released until the end of transaction 1.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code
FROM table1 WITH(ROWLOCK, HOLDLOCK)
WHERE code = 1
COMMIT TRAN
Maybe you can solve this with some hack like this?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE someTableForThisHack SET val = CASE WHEN val = 1 THEN 0 ELSE 1 END
SELECT code FROM table1.....
COMMIT TRANSACTION
So you create the table someTableForThisHack and insert one row into it.
I have an application connected to a SQL Server 2014 database that combines several rows into one. There are no other connections to this database while the application is running.
First, select a chunk of rows within a specific time span. This query uses a non-clustered seek (TIME column) merged with a clustered lookup.
select ...
from FOO
where TIME >= @from and TIME < @to and ...
Then we process these rows in C# and write the changes back as a single update and multiple deletes; this happens many times per chunk. These also use non-clustered index seeks.
begin tran
update FOO set ...
where NON_CLUSTERED_ID = @id
delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)
commit
I am getting deadlocks when running this with multiple parallel chunks. I tried using ROWLOCK for the update and delete but that caused even more deadlocks than before for some reason, even though there are no overlaps between chunks.
Then I tried TABLOCKX, HOLDLOCK on the update, but that means I can't perform my select in parallel so I'm losing the advantages of parallelism.
Any idea how I can avoid deadlocks but still process multiple parallel chunks?
Would it be safe to use NOLOCK on my select in this case, given there is no row overlap between chunks? Then TABLOCKX, HOLDLOCK would only block the update and delete, correct?
Or should I just accept that deadlocks will happen and retry the query in my application?
UPDATE (additional information): All deadlocks so far have happened in the update and delete phase, none in the select. I'll try to get some deadlock logs up if I can't get this solved today (the correct trace flags weren't enabled before).
UPDATE: These are the two arrangements of deadlocks that occur with ROWLOCK; they both refer only to the delete statement and the non-clustered index it uses. I'm not sure if these are the same as the deadlocks that occur without any table hints, as I wasn't able to reproduce any of those.
Ask if there's anything else needed from the .xdl; I'm a bit wary of attaching the whole thing.
The general advice regarding deadlocks: make sure the different processes do everything in the same order, i.e. acquire their locks in the same order.
You can find the same advice in this technical article on microsoft.com regarding Minimizing Deadlocks. There's a good reason it is listed first.
Access objects in the same order.
Avoid user interaction in transactions.
Keep transactions short and in one batch.
Use a lower isolation level.
Use a row versioning-based isolation level.
Set READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning.
Use snapshot isolation.
Use bound connections.
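For the two row-versioning options in that list, the switches are database-level settings; a quick sketch (the database name is a placeholder):

-- Let READ COMMITTED readers use row versions instead of taking shared locks.
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- Allow transactions to explicitly opt into SNAPSHOT isolation.
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;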
Update after question from Cato:
How would acquiring locks in the same order apply here? Have you got any advice on how he would change his SQL to do that?
Deadlocks are always the same, no matter what environment: two processes (say A & B) acquire multiple locks (say X & Y) in a different order so that A is waiting for Y and B is waiting for X while A is holding X and B is holding Y.
It applies here because DELETE and UPDATE statements implicitly acquire locks on the rows, index range, or table (depending on what the engine deems appropriate).
You should analyze your process and see if there are scenarios where locks could be acquired in a different order. If that doesn't reveal anything, you can analyze deadlocks using the SQL Server Profiler:
To trace deadlock events, add the Deadlock graph event class to a trace. This event class populates the TextData data column in the trace with XML data about the process and objects that are involved in the deadlock. SQL Server Profiler can extract the XML document to a deadlock XML (.xdl) file which you can view later in SQL Server Management Studio. You can configure SQL Server Profiler to extract Deadlock graph events to a single file that contains all Deadlock graph events, or to separate files.
I'd use sp_getapplock in the updating transaction to prevent multiple instances of this code from running in parallel. Unlike table locking hints, this will not block the selecting statement.
You should still implement retry logic, because it may take a while to acquire the lock, possibly longer than the timeout parameter allows.
This is how the updating transaction can be wrapped into sp_getapplock.
BEGIN TRANSACTION;
BEGIN TRY
    DECLARE @VarLockResult int;
    EXEC @VarLockResult = sp_getapplock
        @Resource = 'some_unique_name_app_lock',
        @LockMode = 'Exclusive',
        @LockOwner = 'Transaction',
        @LockTimeout = 60000,
        @DbPrincipal = 'public';

    IF @VarLockResult >= 0
    BEGIN
        -- Acquired the lock
        update FOO set ...
        where NON_CLUSTERED_ID = @id

        delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)
    END ELSE BEGIN
        -- return some error code, so that the caller could retry
    END;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;
    -- handle the error
END CATCH;
The selecting statement doesn't need any changes.
I would recommend against NOLOCK, even though you say that IDs in chunks do not overlap. With this hint the SELECT query can skip some pages that are being changed, it can read some pages twice. It is unlikely that such behavior can be tolerated.
Use sp_getapplock in the following format in your code. The stored procedure sp_getapplock places a lock on an application resource.
EXEC sp_getapplock
    @Resource = 'storeprocedurename',
    @LockMode = 'Exclusive',
    @LockOwner = 'Transaction',
    @LockTimeout = 25000
It is very helpful. Increasing the @LockTimeout value can help reduce deadlock errors.
I have two transactions: T1 with the SERIALIZABLE isolation level and T2 (I think with the default READ COMMITTED isolation level, but it doesn't matter).
Transaction T1 performs a SELECT, then WAITFORs 2 seconds, then performs another SELECT.
Transaction T2 performs an UPDATE on data which T1 has read.
This causes a deadlock. Why doesn't transaction T2 wait for T1 to finish?
When T1 uses the REPEATABLE READ isolation level everything is OK, i.e. there is no deadlock and phantom rows occur (as expected at that level).
I thought that when I raised the isolation level to SERIALIZABLE, T2 would wait for T1 to finish.
This is part of a college exercise. I have to demonstrate the negative effects of two parallel transactions running at an incorrect isolation level, and the absence of those effects at the correct isolation level.
Here's the code; unfortunately the field names are in Polish.
T1:
USE MR;
SET IMPLICIT_TRANSACTIONS OFF;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
-- query 1
SELECT
www.IdSamochodu, s.Model, s.Marka, s.NrRejestracyjny, o.PESEL, o.Nazwisko, o.Imie, o.NrTelefonu
FROM
WizytyWWarsztacie www
JOIN
Samochody s
ON s.IdSamochodu = www.IdSamochodu
JOIN
Osoby o
ON o.PESEL = s.PESEL
WHERE
www.[Status] = 'gotowy_do_odbioru'
ORDER BY www.IdSamochodu ASC
;
WAITFOR DELAY '00:00:02';
-- query 2
SELECT
u.IdSamochodu, tu.Nazwa, tu.Opis, u.Oplata
FROM
Uslugi u
JOIN
TypyUslug tu
ON tu.IdTypuUslugi = u.IdTypuUslugi
JOIN
WizytyWWarsztacie www
ON www.IdSamochodu = u.IdSamochodu AND
www.DataOd = u.DataOd
WHERE
www.[Status] = 'gotowy_do_odbioru'
ORDER BY u.IdSamochodu ASC, u.Oplata DESC
;
COMMIT;
T2:
USE MR;
SET IMPLICIT_TRANSACTIONS OFF;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
UPDATE
Uslugi
SET
[Status] = 'wykonano'
WHERE
IdUslugi = 2
;
UPDATE
www
SET
www.[Status] = 'gotowy_do_odbioru'
FROM
WizytyWWarsztacie www
WHERE
www.[Status] = 'wykonywanie_usług' AND
EXISTS (
SELECT 1
FROM
Uslugi u
WHERE
u.IdSamochodu = www.IdSamochodu AND
u.DataOd = www.DataOd AND
u.[Status] = 'wykonano'
GROUP BY u.IdSamochodu, u.DataOd
HAVING COUNT(u.IdUslugi) = (
SELECT
COUNT(u2.IdUslugi)
FROM
Uslugi u2
WHERE
u2.IdSamochodu = www.IdSamochodu AND
u2.DataOd = www.DataOd
GROUP BY u2.IdSamochodu, u2.DataOd
)
)
;
COMMIT;
I use SQL Server Management Studio and have each transaction in a different file. I run this by pressing F5 in T1, then quickly switching to the file which contains T2 and pressing F5 again.
I have read about deadlocks and the locking mechanism in MSSQL, but apparently I haven't understood this topic yet.
Deadlock issue in SQL Server 2008 R2 (.Net 2.0 Application)
SQL Server deadlocks between select/update or multiple selects
Deadlock on SELECT/UPDATE
http://msdn.microsoft.com/en-us/library/ms173763(v=sql.105).aspx
http://www.sql-server-performance.com/2004/advanced-sql-locking/
edit
I figured out that the first UPDATE statement in T2 causes the problem. Why is that?
Troubleshooting deadlocks starts with obtaining the deadlock graph. This is an XML document that tells you the relevant bits about the transactions and resources involved. You can get it through Profiler, extended events, or event notifications (I'm sure there are other methods, but these will do for now). Once you have the graph, examine it to see what types of locks each transaction held on which resources. Where you go from there really depends on what's going on in the graph, so I'll stop there. Bottom line: obtain the deadlock graph and mine it for details.
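If you don't already have a trace running when the deadlock happens, one option (on SQL Server 2008 and later) is to pull recent deadlock graphs out of the built-in system_health extended events session, roughly like this:

-- Extract xml_deadlock_report events from the system_health ring buffer.
SELECT XEvent.query('(event/data/value/deadlock)[1]') AS DeadlockGraph
FROM
(
    SELECT XEvent.query('.') AS XEvent
    FROM
    (
        SELECT CAST(target_data AS xml) AS TargetData
        FROM sys.dm_xe_session_targets st
        JOIN sys.dm_xe_sessions s
            ON s.address = st.event_session_address
        WHERE s.name = 'system_health'
          AND st.target_name = 'ring_buffer'
    ) AS Data
    CROSS APPLY TargetData.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData(XEvent)
) AS DeadlockEvents;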
As an aside, to say that one or the other transaction is "causing" the deadlock is somewhat misleading. All transactions involved in the deadlock were necessary to cause the deadlock situation so neither is more at fault.
I had some problems with my SQL Server Management Studio (Profiler didn't work), but I finally obtained the deadlock graph. This article was helpful for me.
To understand this graph I had to learn about the locking mechanisms and lock symbols.
I think it is explained quite clearly here.
Now that I know about all this, the cause of the deadlock is quite obvious.
I've made a sequence diagram for the described situation:
As I wrote earlier, when we get rid of the first UPDATE statement from transaction T2, the deadlock does not occur.
In that situation T2 does not acquire a lock on the pk_uslugi index, so the second SELECT statement from transaction T1 executes successfully and the pk_wizytywwarsztacie index is unlocked. After that, T2 can also finish.
The problem could be this:
The T1 Select S-locks the row
The T2 Update U-locks the row (succeeds)
The T2 Update X-locks the row (waits, lock is queued)
T1 tries to S-lock again, but the S-lock is incompatible with the queued X-lock.
Locks in SQL Server are queued. If the head of the queue waits, everything else behind it also waits.
Actually, I'm not entirely sure that this is the cause because the same problem should occur with REPEATABLE READ. I'm still posting this idea hoping that it helps.
I ran into a similar issue where I was selecting from a list of available items and then inserting those items into a holding queue table. When I had too many concurrent requests, the select statement would return items that were also concurrently selected during another parallel request. When attempting to insert them into the holding queue table, I would receive a Unique Constraint error (because the same item couldn't go into the holding table twice).
I then tried wrapping a SERIALIZABLE transaction around the whole thing but then I ran into DEADLOCK errors because both transactions were holding onto a lock on the UC index (determined by my Deadlock graph).
I was finally able to resolve the issue by using an exclusive Row lock within the select statement.
You could try using an exclusive row lock on the table/rows in question. This ensures that T1 holds an exclusive lock on those rows, so T2's attempt to update the same rows has to wait until T1 completes.
EXAMPLE:
SELECT *
FROM Uslugi u WITH (XLOCK, ROWLOCK)
I'm not sure yet of the performance impact of this but while running load testing using multiple threads, it doesn't appear to have a negative impact.
I'm writing a high volume trading system. We receive messages at around 300-500 per second and these messages then need to be saved to the database as quickly as possible. These messages get deposited on a Message Queue and are then read from there.
I've implemented a Competing Consumer pattern, which reads from the queue and allows for multithreaded processing of the messages. However I'm getting a frequent primary key violation while the app is running.
We're running SQL 2008. The sample table structure would be:
TableA
{
MessageSequence INT PRIMARY KEY,
Data VARCHAR(50)
}
A stored procedure gets invoked to persist this message and looks something like this:
BEGIN TRANSACTION

INSERT TableA (MessageSequence, Data)
SELECT @MessageSequence, @Data
WHERE NOT EXISTS
(
    SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = @MessageSequence
)

IF (@@ROWCOUNT = 0)
BEGIN
    UPDATE TableA
    SET Data = @Data
    WHERE MessageSequence = @MessageSequence
END

COMMIT TRANSACTION
All of this is in a TRY...CATCH block so if there's an error, it rolls back the transaction.
I've tried using table hints, like ROWLOCK, but it hasn't made a difference. Since the INSERT is evaluated as a single statement, it seems ludicrous that I'm still getting a primary key violation on the insert.
Does anyone have an idea why this is happening? And have you got ANY ideas which may point me in the direction of a solution?
Why is this happening?
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = @MessageSequence
This SELECT will try to locate the row; if it is not found, the EXISTS operator returns FALSE and the INSERT proceeds. However, the decision to INSERT is based on a state that was true at the time of the SELECT, but that is no longer guaranteed to be true at the time of the INSERT. In other words, you have a race condition where two threads can both look up the same @MessageSequence, both find NOT EXISTS and both try to INSERT; only the first one will succeed, and the second one will cause a PK violation.
How do I solve it?
The quickest fix is to add a WITH (UPDLOCK) hint to the SELECT; this forces the lock placed on the @MessageSequence key to be retained, and thus the INSERT/SELECT behaves atomically:
INSERT TableA (MessageSequence, Data)
SELECT @MessageSequence, @Data
WHERE NOT EXISTS (
    SELECT TOP 1 MessageSequence FROM TableA WITH (UPDLOCK) WHERE MessageSequence = @MessageSequence)
To prevent SQL Server from doing fancy stuff like page locking, you can also add the ROWLOCK hint.
However, that is not my recommendation. My recommendation may surprise you, but it is this: do the operation that is most likely to succeed and handle the error if it fails. I.e. if your business case makes it more likely for the @MessageSequence to be new, try the INSERT and handle the PK violation if it fails. This way you avoid the spurious look-ups, and the cost of the catch/retry is amortized over the many cases where it succeeds on the first try.
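A minimal sketch of that insert-first approach, reusing the question's @MessageSequence and @Data parameters (error 2627 is SQL Server's primary key violation error number):

BEGIN TRY
    -- Optimistic path: most messages are new, so just insert.
    INSERT TableA (MessageSequence, Data)
    VALUES (@MessageSequence, @Data);
END TRY
BEGIN CATCH
    -- A duplicate key means the row already exists, so fall back to an update.
    IF ERROR_NUMBER() = 2627
        UPDATE TableA
        SET Data = @Data
        WHERE MessageSequence = @MessageSequence;
    ELSE
        RAISERROR('Unexpected error while persisting message', 16, 1);
END CATCH;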
Also, it is perhaps worth investigating using the built-in queues that come with SQL Server.
Common problem. Explained here:
Defensive database programming: eliminating IF statements
It might be related to the transaction isolation level. You might need
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
before you start the transaction.
Also, if you have more updates than inserts, you should try the update first, check the row count, and do the insert second.
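A sketch of that update-first variant with the same assumed parameters (note that the insert branch still has the race condition described above, so the UPDLOCK hint or a catch-and-retry remains relevant):

BEGIN TRANSACTION

-- Try the update first; only insert if no row was updated.
UPDATE TableA
SET Data = @Data
WHERE MessageSequence = @MessageSequence

IF (@@ROWCOUNT = 0)
BEGIN
    INSERT TableA (MessageSequence, Data)
    VALUES (@MessageSequence, @Data)
END

COMMIT TRANSACTION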
This is very similar to post 939831. Ultimately you want to use the hints (ROWLOCK, READPAST, UPDLOCK). READPAST tells SQL Server to skip to the next record if the current one is locked. UPDLOCK tells SQL Server to take an update lock on the rows it reads, so the lock can later be converted to an exclusive lock for the update.
When I implemented something similar, I locked the next record by the thread ID:
UPDATE TOP (1) foo
SET ProcessorID = @PROCID
FROM OrderTable foo WITH (ROWLOCK, READPAST, UPDLOCK)
WHERE ProcessorID = 0
Then selected the record
SELECT *
FROM foo WITH (NOLOCK)
WHERE ProcessorID = @PROCID
Then marked it as processed
UPDATE foo
SET ProcessorID = -1
WHERE ProcessorID = @PROCID
Later in off hours I perform the relatively expensive operation of performing the delete operation to clear the queue of processed records.
The atomicity of the following statement is what you are after:
INSERT TableA (MessageSequence, Data)
SELECT @MessageSequence, @Data
WHERE NOT EXISTS
(
    SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = @MessageSequence
)
According to this person, it depends on the current isolation level.
On a tangent, if you're thinking of a high volume trading system you might want to consider a tick database designed for such data [I'm not exactly sure what "message" you are storing here], such as discussed in this thread for example: http://www.elitetrader.com/vb/showthread.php?threadid=81345.
These are typically in-memory solutions with proprietary query languages. We use kdb+ at our shop.
Not sure what messaging product you use, but it may be worth looking at the transactions not at the DB level but at the MQ level.
Of course, if you are using a TM (transaction manager), the two operations, 1) get from MQ and 2) write to DB, are both 'bracketed' under the same parent commit.
So I am not sure whether you are using an implicit, explicit, or any TM at all here (for example, Microsoft's DTC).
MessageSequence is the PK, so could the same message from the MQ be getting processed twice?
When you perform a 'GET' from the MQ, make sure the GET is committed (i.e. not a DB commit, but an MQ commit); that will ensure the same MessageID cannot be 'popped' by the next thread that writes messages to the DB.