How to fix "UPDATE <table> OUTPUT WITH (READPAST)" - sql

We are trying to retrieve and update the TOP X events from a table but without locking anything else than the "processed" rows. We looked into different SQL hints like ROWLOCK and READPAST, but haven't figured out what combination of those should be used in this scenario. Also, we need to make sure that the returned rows are unique across different concurrent executions of that query and that the same row will never be selected twice.
Note: This table has got many INSERTs happening concurrently.
UPDATE TOP(:batchSize) wsns WITH (READPAST)
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time
FROM table_A a
WHERE a.scheduled_time < GETUTCDATE() AND a.consumer_ip IS NULL
Any help is highly appreciated. Many thanks!

I don't quite follow how/why are you trying to use the READPAST hint here?
But anyway - to achieve what you want I would suggest:
WITH xxx AS
(
SELECT TOP(:batchSize) *
FROM table_A
)
UPDATE xxx
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time
FROM table_A a
WHERE a.scheduled_time < GETUTCDATE() AND a.consumer_ip IS NULL;
If all that could happen in the background are new inserts then, I can't see why this would be a problem. SQL Server optimiser most likely would decide for PAGE/ROW lock (but this is depending on your DB settings as well as indexes affected and their options). If by any reason you want to stop other transaction until this update is finished - hold an exclusive lock on the entire table, till the end of your transaction, you can just add WITH(TABLOCKX). Therefore, I would strongly recommend to have a good read on the SQL Server concurrency and isolation before you start messing with it in a production environment.

Related

Sql Server 2008 Update query is taking so much time

I have a table name Companies with 372370 records.
And there is only one row which has CustomerNo = 'YP20324'.
I an running following query and its taking so much time I waited 5 minutes and it was still running. I couldn't figure out where is the problem.
UPDATE Companies SET UserDefined3 = 'Unzustellbar 13.08.2012' WHERE CustomerNo = 'YP20324'
You don't have triggers on update on that table?
Do you have a cascade foreign key based on that column?
Are you sure of the performance of your server? try to take a look of the memory, cpu first when you execute the query (for example on a 386 with 640mb i could understand it's slow :p)
And for the locks, you can right click the database and on the report you can see the blocking transactions. Sometimes it helps for concurrent access.
Try adding an index on the field you are using in your WHERE clause:
CREATE INDEX ix_CompaniesCustomerNo ON Companies(CustomerNo);
Also check if there are other active queries which might block the update.
Try this SQL and see what is running:
SELECT TOP 20
R.session_id, R.status, R.start_time, R.command, Q.text
FROM
sys.dm_exec_requests R
CROSS APPLY sys.dm_exec_sql_text(R.sql_handle) Q
WHERE R.status in ('runnable')
ORDER BY R.start_time
More details:
List the queries running on SQL Server
or
http://sqlhint.com/sqlserver/scripts/tsql/list-long-running-queries
Once I found someone shrinking database and blocking all other people.
More likely than not your UPDATE is not doing anything, is just waiting, blocked by some other statement. Use Activity Monitor to investigate what is causing the blocking. Most likely you have another statement that started a transaction and you forgot to close it.
There could be other causes too, eg. database/log growth. Only you can do the investigation. An index on CustomerNo is required, true, but lack of an index is unlikely to explain 5 minutes on 370k records. Blocking is more likely.
There are more advanced tools out there like sp_whoisactive.
5mn is way too long for 370k rows, even without any indexes, someone else is locking your update. use sp_who2 (or activity monitor) and check for BlockedBy Column to find who is locking your update
I would suggest to rebuild your indexes. This should surely help you.
If you do not have index on CustomerNo field you must add one.
In my case, there was a process that was blocking the update;
Run: 'EXEC sp_who;'
Find the process that is blocked by inspecting the 'blk' column; Let's say that we find a process that is blocked by '73';
Inspect the record with column 'spid' = '73' and if it's not important, run 'kill 73';
370k records is nothing for sql erver. You should check indexes on this table. Each index makes update operation longer.

How to solve lck_m_x Lock in sql

I have a complex Query (Without any locking hints) that takes data from many tables say Table1,Table2,Table3
Below is the code the code to retrieve data (there are no transactions)
IDbCommand sqlCmd = dbHelper.CreateCommand(Helper.MyConnString, sbSQL.ToString(), CommandType.Text, arParms);
sqlCmd.CommandTimeout = 300;
ds = dbHelper.ExecuteDataset(sqlCmd);
In an application this query runs every 2 minutes
When i fire a simple update query say
Update Table1 set Col1='abc' where ID=100
(where ID is int and primary key + clustered index)
the update query gets delayed and many times it is timeout
Below is the log
How can i fix this.
You could execute your query inside a transaction that has an isolation level of SNAPSHOT. That way, your query won't acquire any (shared) locks and your UPDATE doesn't have to wait for the exclusive lock (given that the source of the locks on the table that block your UPDATE is really your query that the application runs every two minutes...)
For reference, have a look at Working with Snapshot Isolation and SET TRANSACTION ISOLATION LEVEL on MSDN.
EDIT due to comment:
First, turn on ALLOW_SNAPSHOT_ISOLATION for your database:
ALTER DATABASE YourDB SET ALLOW_SNAPSHOT_ISOLATION ON
Then, write your query as follows:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT Column1, Columns2, ...
FROM Table1
LEFT JOIN Table2
ON Table1.Columns45 = Table2.Column3
[... your complex query ...]
Is that enough for an example?
If you don't want wait reading queries while modifing data, it would be best to use READ_COMMITTED_SNAPSHOT.
It can be transparently on, and don't affect you app code and has no side effect.
SNAPSHOT has many side effects, for example, while you don't have locks on data modifications, you can have conflicting data problems on commiting, this problems very dificult to deal with.

How can multiple Stored Procedures update same row at same time?

I am using SSIS 2008 to execute multiple stored procedures in parallel in control flow.
Each SP is supposed to ultimately update 1 row in a table.
The point to note is that each SP has a defined responsibility to update specific columns only.
It is guaranteed that the different SPs will not update each other's columns. So the columns to be updated are divided between the various SPs, but as per design, each SP is supposed to work on the same row ultimately.
At the moment some of my SPs error out due to deadlock. I am guessing this may be because of the lock on that row by other SPs?
How can i work this out?
You have to admit, this seems like a highly unusual thing to do. I wonder if it wouldn't be better to update separate tables, and then have a single update statement at the end that would join the individual tables to the final one? (i.e. update a set a.[1] = ... from a inner join b inner join c etc.).
But, if you want to continue down this path, then just set READ UNCOMMITTED from within each of your stored procedures. That is the best bet.
The deadlocks are probably caused by more than just having another SP locking the row. Under that situation the first procedure would just wait until the locking SP releases the lock. That's not to say that your multiple procedures are not causing the problem. There's more to it though.
You may have to do some rework to avoid the problem, but first you should find out more about the deadlock situation. I suspect that you have locks on objects other than the row that's being updated.
There are ways to gather more information on the deadlock. Here's a link where you can learn about the deadlock details.
Simple: make suere you do not leave locks anywhere. This works with the transaction isolation determined by the connection. With theproper transaction isaolation, there will be no locks, so no deadlocks.
The update is not the problem. It starts with READS. Go to READ UNCOMMITED to make sure you do not leave locks upon read, and / or use the NOLOCK option in the SELECT statements to individually force them to leave NO LOCKS in place wile reading data (more advisable).
If you then make sure that the SP will commit pretty much immediately after the insert (internally or externally) there should be no deadlock possible - a write lock will result in the next SP waiting for the first Transaction to commit.
Especially when going down to one table / one row only, a deadlock is not possible with update statements only, IIRC. It is the read locks which turn a delaying lock (albeit small delay) into a deadlock.
http://www.eggheadcafe.com/software/aspnet/30976898/how-does-update-lock-prevent-deadlocks.aspx has some nice example of a deadlock. So, if everything BUT the update has no locks, then at the end a deadlock is not possible.
Row level locking is as low as you can safely go, so I think you will have to break up the updated table into several tables that you can update independent of each other.
No, two sessions cannot update the same row simultaneously. Row level locking is as low as it gets so when one session is updating a row, other sessions wanting to update that record will wait.
As for the deadlocks, the trouble with SQL Server is that by default SELECT statements will block if it's asking for a record currently being updated. You can use WITH (NOLOCK) if you don't mind reading uncommitted data.
So, if you've got this order of events:
SessionA
begin transaction
update t1
set c1 = 'x'
where c2 = 5
SessionB
begin transaction
update t2
set c1 = 'y'
where c2 = 7
SessionA
select * from t1 where c2 = 5
<waits on SessionB>
SessionB
select * from t2 where c2 = 7
<waits on SessionA>...oops. Deadlock.
The trick is to only lock something for as little time as is necessary (don't break up a series of statements just to release a lock - make sure that steps that make up a logical transaction stay as a transaction):
begin transaction
update t1
set c1 = 'x'
where c2 = 5
commit
Or (caveat emptor) use the nolock directive:
SessionA
select c1 from t1 where c2 = 5 with (nolock)
<gets the new value of 'x'>
SessionB
select * from t2 where c2 = 7 with (nolock)
<gets the new value of 'y'>

SELECT FOR UPDATE with SQL Server

I'm using a Microsoft SQL Server 2005 database with isolation level READ_COMMITTED and READ_COMMITTED_SNAPSHOT=ON.
Now I want to use:
SELECT * FROM <tablename> FOR UPDATE
...so that other database connections block when trying to access the same row "FOR UPDATE".
I tried:
SELECT * FROM <tablename> WITH (updlock) WHERE id=1
...but this blocks all other connections even for selecting an id other than "1".
Which is the correct hint to do a SELECT FOR UPDATE as known for Oracle, DB2, MySql?
EDIT 2009-10-03:
These are the statements to create the table and the index:
CREATE TABLE example ( Id BIGINT NOT NULL, TransactionId BIGINT,
Terminal BIGINT, Status SMALLINT );
ALTER TABLE example ADD CONSTRAINT index108 PRIMARY KEY ( Id )
CREATE INDEX I108_FkTerminal ON example ( Terminal )
CREATE INDEX I108_Key ON example ( TransactionId )
A lot of parallel processes do this SELECT:
SELECT * FROM example o WITH (updlock) WHERE o.TransactionId = ?
EDIT 2009-10-05:
For a better overview I've written down all tried solutions in the following table:
mechanism | SELECT on different row blocks | SELECT on same row blocks
-----------------------+--------------------------------+--------------------------
ROWLOCK | no | no
updlock, rowlock | yes | yes
xlock,rowlock | yes | yes
repeatableread | no | no
DBCC TRACEON (1211,-1) | yes | yes
rowlock,xlock,holdlock | yes | yes
updlock,holdlock | yes | yes
UPDLOCK,READPAST | no | no
I'm looking for | no | yes
Recently I had a deadlock problem because Sql Server locks more then necessary (page). You can't really do anything against it. Now we are catching deadlock exceptions... and I wish I had Oracle instead.
Edit:
We are using snapshot isolation meanwhile, which solves many, but not all of the problems. Unfortunately, to be able to use snapshot isolation it must be allowed by the database server, which may cause unnecessary problems at customers site. Now we are not only catching deadlock exceptions (which still can occur, of course) but also snapshot concurrency problems to repeat transactions from background processes (which cannot be repeated by the user). But this still performs much better than before.
I have a similar problem, I want to lock only 1 row.
As far as I know, with UPDLOCK option, SQLSERVER locks all the rows that it needs to read in order to get the row. So, if you don't define a index to direct access to the row, all the preceded rows will be locked.
In your example:
Asume that you have a table named TBL with an id field.
You want to lock the row with id=10.
You need to define a index for the field id (or any other fields that are involved in you select):
CREATE INDEX TBLINDEX ON TBL ( id )
And then, your query to lock ONLY the rows that you read is:
SELECT * FROM TBL WITH (UPDLOCK, INDEX(TBLINDEX)) WHERE id=10.
If you don't use the INDEX(TBLINDEX) option, SQLSERVER needs to read all rows from the beginning of the table to find your row with id=10, so those rows will be locked.
You cannot have snapshot isolation and blocking reads at the same time. The purpose of snapshot isolation is to prevent blocking reads.
perhaps making mvcc permanent could solve it (as opposed to specific batch only: SET TRANSACTION ISOLATION LEVEL SNAPSHOT):
ALTER DATABASE yourDbNameHere SET READ_COMMITTED_SNAPSHOT ON;
[EDIT: October 14]
After reading this: Better concurrency in Oracle than SQL Server? and this: http://msdn.microsoft.com/en-us/library/ms175095.aspx
When the READ_COMMITTED_SNAPSHOT
database option is set ON, the
mechanisms used to support the option
are activated immediately. When
setting the READ_COMMITTED_SNAPSHOT
option, only the connection executing
the ALTER DATABASE command is allowed
in the database. There must be no
other open connection in the database
until ALTER DATABASE is complete. The
database does not have to be in
single-user mode.
i've come to conclusion that you need to set two flags in order to activate mssql's MVCC permanently on a given database:
ALTER DATABASE yourDbNameHere SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE yourDbNameHere SET READ_COMMITTED_SNAPSHOT ON;
Try (updlock, rowlock)
The full answer could delve into the internals of the DBMS. It depends on how the query engine (which executes the query plan generated by the SQL optimizer) operates.
However, one possible explanation (applicable to at least some versions of some DBMS - not necessarily to MS SQL Server) is that there is no index on the ID column, so any process trying to work a query with 'WHERE id = ?' in it ends up doing a sequential scan of the table, and that sequential scan hits the lock which your process applied. You can also run into problems if the DBMS applies page-level locking by default; locking one row locks the entire page and all the rows on that page.
There are some ways you could debunk this as the source of trouble. Look at the query plan; study the indexes; try your SELECT with ID of 1000000 instead of 1 and see whether other processes are still blocked.
OK, a single select wil by default use "Read Committed" transaction isolation which locks and therefore stops writes to that set. You can change the transaction isolation level with
Set Transaction Isolation Level { Read Uncommitted | Read Committed | Repeatable Read | Serializable }
Begin Tran
Select ...
Commit Tran
These are explained in detail in SQL Server BOL
Your next problem is that by default SQL Server 2K5 will escalate the locks if you have more than ~2500 locks or use more than 40% of 'normal' memory in the lock transaction. The escalation goes to page, then table lock
You can switch this escalation off by setting "trace flag" 1211t, see BOL for more information
Create a fake update to enforce the rowlock.
UPDATE <tablename> (ROWLOCK) SET <somecolumn> = <somecolumn> WHERE id=1
If that's not locking your row, god knows what will.
After this "UPDATE" you can do your SELECT (ROWLOCK) and subsequent updates.
I'm assuming you don't want any other session to be able to read the row while this specific query is running...
Wrapping your SELECT in a transaction while using WITH (XLOCK,READPAST) locking hint will get the results you want. Just make sure those other concurrent reads are NOT using WITH (NOLOCK). READPAST allows other sessions to perform the same SELECT but on other rows.
BEGIN TRAN
SELECT *
FROM <tablename> WITH (XLOCK,READPAST)
WHERE RowId = #SomeId
-- Do SOMETHING
UPDATE <tablename>
SET <column>=#somevalue
WHERE RowId=#SomeId
COMMIT
Question - is this case proven to be the result of lock escalation (i.e. if you trace with profiler for lock escalation events, is that definitely what is happening to cause the blocking)? If so, there is a full explanation and a (rather extreme) workaround by enabling a trace flag at the instance level to prevent lock escalation. See http://support.microsoft.com/kb/323630 trace flag 1211
But, that will likely have unintended side effects.
If you are deliberately locking a row and keeping it locked for an extended period, then using the internal locking mechanism for transactions isn't the best method (in SQL Server at least). All the optimization in SQL Server is geared toward short transactions - get in, make an update, get out. That's the reason for lock escalation in the first place.
So if the intent is to "check out" a row for a prolonged period, instead of transactional locking it's best to use a column with values and a plain ol' update statement to flag the rows as locked or not.
Application locks are one way to roll your own locking with custom granularity while avoiding "helpful" lock escalation. See sp_getapplock.
Try using:
SELECT * FROM <tablename> WITH ROWLOCK XLOCK HOLDLOCK
This should make the lock exclusive and hold it for the duration of the transaction.
According to this article, the solution is to use the WITH(REPEATABLEREAD) hint.
Revisit all your queries, maybe you have some query that select without ROWLOCK/FOR UPDATE hint from the same table you have SELECT FOR UPDATE.
MSSQL often escalates those row locks to page-level locks (even table-level locks, if you don't have index on field you are querying), see this explanation. Since you ask for FOR UPDATE, i could assume that you need transacion-level(e.g. financial, inventory, etc) robustness. So the advice on that site is not applicable to your problem. It's just an insight why MSSQL escalates locks.
If you are already using MSSQL 2005(and up), they are MVCC-based, i think you should have no problem with row-level lock using ROWLOCK/UPDLOCK hint. But if you are already using MSSQL 2005 and up, try to check some of your queries which query the same table you want to FOR UPDATE if they escalate locks by checking the fields on their WHERE clause if they have index.
P.S.
I'm using PostgreSQL, it also uses MVCC have FOR UPDATE, i don't encounter same problem. Lock escalations is what MVCC solves, so i would be surprised if MSSQL 2005 still escalate locks on table with WHERE clauses that doesn't have index on its fields. If that(lock escalation) is still the case for MSSQL 2005, try to check the fields on WHERE clauses if they have index.
Disclaimer: my last use of MSSQL is version 2000 only.
You have to deal with the exception at commit time and repeat the transaction.
I solved the rowlock problem in a completely different way. I realized that sql server was not able to manage such a lock in a satisfying way. I choosed to solve this from a programatically point of view by the use of a mutex... waitForLock... releaseLock...
Have you tried READPAST?
I've used UPDLOCK and READPAST together when treating a table like a queue.
How about trying to do a simple update on this row first (without really changing any data)? After that you can proceed with the row like in was selected for update.
UPDATE dbo.Customer SET FieldForLock = FieldForLock WHERE CustomerID = #CustomerID
/* do whatever you want */
Edit: you should wrap it in a transaction of course
Edit 2: another solution is to use SERIALIZABLE isolation level

SQL problem - high volume trans and PK violation

I'm writing a high volume trading system. We receive messages at around 300-500 per second and these messages then need to be saved to the database as quickly as possible. These messages get deposited on a Message Queue and are then read from there.
I've implemented a Competing Consumer pattern, which reads from the queue and allows for multithreaded processing of the messages. However I'm getting a frequent primary key violation while the app is running.
We're running SQL 2008. The sample table structure would be:
TableA
{
MessageSequence INT PRIMARY KEY,
Data VARCHAR(50)
}
A stored procedure gets invoked to persist this message and looks something like this:
BEGIN TRANSACTION
INSERT TableA(MessageSequence, Data )
SELECT #MessageSequence, #Data
WHERE NOT EXISTS
(
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = #MessageSequence
)
IF (##ROWCOUNT = 0)
BEGIN
UPDATE TableA
SET Data = #Data
WHERE MessageSequence = #MessageSequence
END
COMMIT TRANSACTION
All of this is in a TRY...CATCH block so if there's an error, it rolls back the transaction.
I've tried using table hints, like ROWLOCK, but it hasn't made a difference. Since the Insert is evaluated as a single statement, it seems ludicrous that I'm still getting a 'Primary Key on insert' issue.
Does anyone have an idea why this is happening? And have you got ANY ideas which may point me in the direction of a solution?
Why is this happening?
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = #MessageSequence
This SELECT will try to locate the row, if not found the EXISTS operator will return FALSE and the INSERT will proceed. Hoewever, the decision to INSERT is based on a state that was true at the time of the SELECT, but that is no longer guaranteed to be true at the time of the INSERT. In other words, you have race conditions where two threads can both look up the same #MessageSequence, both return NOT EXISTS and both try to INSERT, when only the first one will succeed, second one will cause a PK violation.
How do I solve it?
The quickest fix is to add a WITH (UPDLOCK) hint to the SELECT, this will enforce the lock placed on the #MessageSequence key to be retained and thus the INSERT/SELECT to behave atomically:
INSERT TableA(MessageSequence, Data )
SELECT #MessageSequence, #Data
WHERE NOT EXISTS (
SELECT TOP 1 MessageSequence FROM TableA WITH(UPDLOCK) WHERE MessageSequence = #MessageSequence)
To prevent SQL from doing fancy stuff like page lock, you can also add the ROWLOCK hint.
However, that is not my recommendation. My recommendation may surpise you, but is this: do the operation that is most likely to succeed and handle the error if it failed. Ie. if your business case makes it more likely for the #MessageSequnce to be new, try an INSERT and handle the PK if it failed. This way you avoid the spurious look-ups, and hte cost of the catch/retry is amortized over the many cases when it succeeds from the first try.
Also, it is perhaps worth investigating using the built-in queues that come with SQL Server.
Common problem. Explained here:
Defensive database programming: eliminating IF statements
It might be related to the transaction isolation level. You might need
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
before you start the transaction.
Also, if you have more updates than inserts, you should try the update first and check rowcount and do the insert second.
This is very similar to post 939831. Ultimately you want to use the hints (ROWLOCK, READPAST, UPDLOCK). READPAST tells sql server to skip to the next record if the current one is locked. UPDLOCK tells sql server that the read lock is going to escalate to an update lock.
When I implemented something similar I locked the next record by the threadID
UPDATE TOP (1)
foo
SET
ProcessorID = #PROCID
FROM
OrderTable foo WITH (ROWLOCK, READPAST, UPDLOCK)
WHERE
ProcessorID = 0
Then selected the record
SELECT *
FROM foo WITH (NOLOCK)
WHERE ProcessorID = #PROCID
Then marked it as processed
UPDATE foo
SET ProcessorID = -1
WHERE ProcessorID = #PROCID
Later in off hours I perform the relatively expensive operation of performing the delete operation to clear the queue of processed records.
The atomicity of the following statement is what you are after:
INSERT TableA(MessageSequence, Data )
SELECT #MessageSequence, #Data
WHERE NOT EXISTS
(
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = #MessageSequence
)
According to this person, it depends on the current isolation level.
On a tangent, if you're thinking of a high volume trading system you might want to consider a tick database designed for such data [I'm not exactly sure what "message" you are storing here], such as discussed in this thread for example: http://www.elitetrader.com/vb/showthread.php?threadid=81345.
These are typically in-memory solutions with proprietary query languages. We use kdb+ at our shop.
Not sure what Messaging product you use - but it may be worth looking at the transactions not at the DB level, but at the MQ Level.
Of course, if you are using a TM (Transaction manager), the two operations : 1)Get from MQ and 2)Write to DB are both 'bracketed' under the same parent commit.
So I am not sure if you are using an implicit or explicit or any TM here (for example, Microsoft's DTC).
MessageSequence is the PK, so could the same Message from the MQ be getting processed twice.
When you perform a 'GET" from MQ, make sure the GET is committed (i.e. not a db-commit, but a MQ-commit) - that will ensure the same MessageID cannot be 'popped' by the next thread that writes messages to the DB.