Using HOLDLOCK incorrectly in a SQL Server stored procedure

I believe I am using HOLDLOCK incorrectly.
I think this because I have a table that acts like a queue. The queue receives its items from the SQL below, and a console application then processes them one by one. I haven't tested this yet, but I believe that while the console application is processing the table, the code below fails during some of the long selects. Why do I think that? Because I log the GameID when grabbing everything from the queue table and processing each item in the console application, and the games that I believe didn't make it through never appear in that log. So I don't believe they are being inserted into my queue table, and I suspect it's because of the HOLDLOCK below.
Thoughts?
MERGE Test WITH (HOLDLOCK) AS GL
USING (SELECT @GameId AS ID) AS NewTest ON GL.ID = NewTest.ID
WHEN NOT MATCHED THEN
    INSERT
    (
        Id,
        FailedAttempts,
        DateCreated
    )
    VALUES
    (
        NewTest.ID,
        0,
        SYSDATETIME()
    );

I suspect your issue is unrelated to your use of MERGE or HOLDLOCK. That said, I see no reason to introduce the cumbersome MERGE syntax here, since it provides no benefit, especially given the potential issues it can cause in other areas. I suggest a very simple INSERT ... WHERE NOT EXISTS instead:
INSERT dbo.Test (Id, FailedAttempts, DateCreated)
SELECT @GameId, 0, SYSDATETIME()
WHERE NOT EXISTS
(
    SELECT 1 FROM dbo.Test WITH (HOLDLOCK)
    WHERE Id = @GameId
);
I'd prefer this over blindly trying the insert and catching the PK violation, for the reasons outlined here and here - in almost all cases, forcing SQL Server to attempt an operation and raise an exception, instead of checking for the row yourself first, will yield worse performance.
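One caveat worth noting (my addition, not part of the original answer): under heavy concurrency, two sessions can both pass the NOT EXISTS check while holding only shared range locks, and then collide or deadlock on the insert. A common hardening is to add UPDLOCK alongside HOLDLOCK, for example:
INSERT dbo.Test (Id, FailedAttempts, DateCreated)
SELECT @GameId, 0, SYSDATETIME()
WHERE NOT EXISTS
(
    SELECT 1 FROM dbo.Test WITH (UPDLOCK, HOLDLOCK)
    WHERE Id = @GameId
);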

Related

How to fix "UPDATE <table> OUTPUT WITH (READPAST)"

We are trying to retrieve and update the TOP X events from a table, but without locking anything other than the rows being processed. We looked into different locking hints such as ROWLOCK and READPAST, but haven't figured out which combination of them should be used in this scenario. We also need to make sure that the returned rows are unique across concurrent executions of the query, i.e. that the same row will never be selected twice.
Note: this table has many INSERTs happening concurrently.
UPDATE TOP (:batchSize) a
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time
FROM table_A a WITH (READPAST)
WHERE a.scheduled_time < GETUTCDATE() AND a.consumer_ip IS NULL
Any help is highly appreciated. Many thanks!
I don't quite follow how or why you are trying to use the READPAST hint here.
But anyway, to achieve what you want, I would suggest:
WITH xxx AS
(
    SELECT TOP (:batchSize) *
    FROM table_A
    WHERE scheduled_time < GETUTCDATE()
      AND consumer_ip IS NULL
)
UPDATE xxx
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time;
If all that can happen in the background is new inserts, I can't see why this would be a problem. The SQL Server optimiser will most likely choose a PAGE or ROW lock (though this also depends on your database settings, the indexes affected, and their options). If for any reason you want to block other transactions until this update is finished - that is, hold an exclusive lock on the entire table until the end of your transaction - you can just add WITH (TABLOCKX). Either way, I would strongly recommend a good read on SQL Server concurrency and isolation before you start messing with it in a production environment.
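If several consumers run this batch concurrently and each row must be claimed exactly once, one well-known pattern (a sketch on my part, combining the hints mentioned in the question with the CTE above) is to put ROWLOCK, READPAST and UPDLOCK inside the CTE, so that each consumer skips rows already claimed by another:
WITH xxx AS
(
    SELECT TOP (:batchSize) *
    FROM table_A WITH (ROWLOCK, READPAST, UPDLOCK)
    WHERE scheduled_time < GETUTCDATE()
      AND consumer_ip IS NULL
)
UPDATE xxx
SET consumer_ip = :consumerIP
OUTPUT inserted.id, inserted.another_id, inserted.created_time, inserted.scheduled_time;
READPAST makes a session skip rows another session has locked, and UPDLOCK stops two sessions from selecting the same unclaimed row.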

How do I prevent SQL from running transactions simultaneously

I've noticed that MS SQL may begin another transaction just before a previous transaction has completed (or committed). Is there a way to ensure that a transaction completes before the next transaction begins?
My problem is that I want to perform a SQL SELECT almost immediately after a SQL INSERT. What I'm seeing right now is that when the SELECT statement runs, it does not return the (very) recently inserted data.
When I traced this scenario with SQL Profiler, I noticed that the INSERT and SELECT run simultaneously, i.e. the SELECT starts before the INSERT has completed.
Is there a way to fix this problem of mine? Thanks!
From the sounds of it, you're looking for the OUTPUT clause.
From the examples in the documentation (http://msdn.microsoft.com/en-us/library/ms177564.aspx):
DECLARE @MyTableVar table
(
    NewScrapReasonID smallint,
    Name varchar(50),
    ModifiedDate datetime
);

INSERT Production.ScrapReason
OUTPUT INSERTED.ScrapReasonID, INSERTED.Name, INSERTED.ModifiedDate
    INTO @MyTableVar
VALUES (N'Operator error', GETDATE());
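The rows captured by OUTPUT ... INTO can then be read back in the same batch, e.g.:
SELECT NewScrapReasonID, Name, ModifiedDate
FROM @MyTableVar;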
You can run your transactions at the SERIALIZABLE isolation level. That way you ensure the select is performed after the insert. At lower isolation levels the select can run in parallel and returns a snapshot of the data - the way it looks with only those transactions applied that completed before the selecting transaction started.
I'm guessing you want to get an auto-generated identifier back after the insert? I'm not sure of the MSSQL way to do this, but in PostgreSQL there is the INSERT ... RETURNING extension, which solves exactly this problem.
http://www.postgresql.org/docs/9.1/static/sql-insert.html
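For reference, the PostgreSQL form looks like this (table and column names are hypothetical):
INSERT INTO games (name) VALUES ('chess') RETURNING id;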
Are you locked into MSSQL?

What's blocking "Select top 1 * from TableName with (nolock)" from returning a result?

I'm currently running the following statement
select * into adhoc..san_savedi from dps_san..savedi_record
It's taking a painfully long time, and I'd like to see how far along it is, so I ran this:
select count(*) from adhoc..san_savedi with (nolock)
That didn't return anything in a timely manner, so for the heck of it I tried this:
select top 1 * from adhoc..san_savedi with (nolock)
Even that seems to run indefinitely. I could understand the count(*) taking a long time if there are millions of records, but I don't understand why selecting the top 1 record wouldn't come back pretty much immediately, considering I specified nolock.
In the name of full disclosure, dps_san is a view that pulls from an ODBC connection via a linked server. I don't think that affects why I can't return the top row, but I'm throwing it out there in case I'm wrong.
So, what is keeping that statement from running?
EDIT:
As I mentioned above, yes, dps_san..savedi_record is a view. Here's all it does:
select * from DPS_SAN..root.SAVEDI_RECORD
It's nothing more than an alias and does no grouping/sorting/etc., so I don't think the problem lies there, but please enlighten me if I'm wrong about that.
SELECT queries with NOLOCK don't actually take no locks: they still need a SCH-S (schema stability) lock on the table (and, as the table is a heap, they will also take a HOBT lock).
Additionally, before the SELECT can even begin, SQL Server must compile a plan for the statement, which also requires it to take a SCH-S lock on the table.
Because your long-running transaction creates the table via SELECT ... INTO, it holds an incompatible SCH-M (schema modification) lock on the table until the statement completes.
You can verify this by looking in sys.dm_os_waiting_tasks during the period of blocking.
When I tried the following in one connection
BEGIN TRAN
SELECT *
INTO NewT
FROM master..spt_values
/*Remember to rollback/commit this later*/
And then executing the following (or even just trying to view its estimated execution plan)
SELECT *
FROM NewT
WITH (NOLOCK)
in a second connection, the reading query was blocked.
SELECT wait_type,
resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id = <spid_of_waiting_task>
This shows that the wait type is indeed SCH-S and that the blocking resource is held in Sch-M mode:
wait_type resource_description
---------------- -------------------------------------------------------------------------------------------------------------------------------
LCK_M_SCH_S objectlock lockPartition=0 objid=461960722 subresource=FULL dbid=1 id=lock4a8a540 mode=Sch-M associatedObjectId=461960722
It may very well be that there are no locks at all. If dps_san..savedi_record is a view, it may simply take a long time to execute: it may be accessing tables without using an index, sorting millions of records, or whatever. Your query, even a simple TOP or COUNT, will only be as fast as that view can be executed.
A few issues to consider here. Is dps_san..savedi_record a view? If so, it could just be taking a really long time to get your data. The other thing I can think of is that you're creating the table using the SELECT INTO syntax, which is a bad idea: select * into ... will lock tempdb for the duration of the select.
If you are creating the table using that syntax, there is a workaround. First, create the empty table by appending where 1=0 to your initial statement:
select * into ... from ... where 1=0
This creates the table first (which is quick), and you can then populate it with a separate insert, because the table now exists, without the penalty of holding locks for the duration of the query.
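Concretely, for the query in the question that two-step version would look something like this (same object names as above, keeping select * for brevity):
select * into adhoc..san_savedi from dps_san..savedi_record where 1=0

insert into adhoc..san_savedi
select * from dps_san..savedi_record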
Find the session_id that is performing the select into:
SELECT r.session_id, r.blocking_session_id, r.wait_type, r.wait_time
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.plan_handle) AS t
WHERE t.[text] LIKE '%select%into%adhoc..san_savedi%';
This should let you know whether another session is blocking the SELECT INTO, or whether it has a wait type that points to the problem.
You can repeat the process in another window for the session that is running your SELECT. I suspect Martin is right, and that my earlier comment about the schema lock is relevant.

Select Fails With Nonexistent Columns

Executing the following statement with SQL Server 2005 (my tests are through SSMS) results in success upon first execution and failure upon subsequent executions.
IF OBJECT_ID('tempdb..#test') IS NULL
    CREATE TABLE #test ( GoodColumn INT )

IF 1 = 0
    SELECT BadColumn
    FROM #test
What this means is that something is comparing the columns accessed in my select statement against the columns that exist on the table when the script is "compiled". For my purposes this is undesirable behaviour. My question is whether anything can be done so that this code executes successfully on every run; or, if that is not possible, perhaps someone could explain why the demonstrated behaviour is desirable. The only solutions I currently have are to wrap the select in EXEC or to use SELECT *, but I don't like either of those solutions.
Thanks
If you put:
IF OBJECT_ID('tempdb..#test') IS NOT NULL
DROP TABLE #test
GO
at the start, the problem will go away, because the batch is then compiled while the #test table does not exist, so the reference to BadColumn is never validated.
What you're asking is for the system to recognise that 1 = 0 will always evaluate to false. If the condition could ever be true (which is potentially the case for most real-life conditions), then you'd probably want to know up front that you were about to run something that would fail.
If you drop the temporary table and then create a stored procedure that does the same:
CREATE PROC dbo.test
AS
BEGIN
    IF OBJECT_ID('tempdb..#test') IS NULL
        CREATE TABLE #test ( GoodColumn INT )

    IF 1 = 0
        SELECT BadColumn
        FROM #test
END
Then this will happily be created, and you can run it as many times as you like.
Rob
Whether or not this behaviour is "desirable" from a programmer's point of view is debatable, of course - it basically comes down to the difference between statically typed and dynamically typed languages. From a performance point of view it is desirable, because SQL Server needs complete information in order to compile and optimize the execution plan (and to cache those plans).
In a word, T-SQL is not an interpreted or dynamically typed language, so you cannot write code like this. Your options are either to use EXEC, or to use another language and embed the SQL queries within it.
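For completeness, the EXEC workaround mentioned above might look like this (a sketch; the dynamic batch is only compiled if the branch actually runs, so the missing column is never validated):
IF 1 = 0
    EXEC('SELECT BadColumn FROM #test');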
This problem is also visible in these situations:
IF 1 = 1
    select dummy = GETDATE() into #tmp
ELSE
    select dummy = GETDATE() into #tmp
Although the second statement can never execute, the batch still fails with a compile-time error complaining that #tmp already exists.
It seems the query engine's first-level validation ignores conditional statements.
You say you have problems with subsequent requests, and that is because the object already exists. It is recommended that you drop your temporary tables as soon as you are done with them.
Read more about temporary table performance at:
SQL Server Performance.com

SQL problem - high volume trans and PK violation

I'm writing a high volume trading system. We receive messages at around 300-500 per second and these messages then need to be saved to the database as quickly as possible. These messages get deposited on a Message Queue and are then read from there.
I've implemented a Competing Consumer pattern, which reads from the queue and allows for multithreaded processing of the messages. However, I'm getting frequent primary key violations while the app is running.
We're running SQL 2008. The sample table structure would be:
CREATE TABLE TableA
(
    MessageSequence INT PRIMARY KEY,
    Data VARCHAR(50)
);
A stored procedure gets invoked to persist this message and looks something like this:
BEGIN TRANSACTION

INSERT TableA (MessageSequence, Data)
SELECT @MessageSequence, @Data
WHERE NOT EXISTS
(
    SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = @MessageSequence
)

IF (@@ROWCOUNT = 0)
BEGIN
    UPDATE TableA
    SET Data = @Data
    WHERE MessageSequence = @MessageSequence
END

COMMIT TRANSACTION
All of this is in a TRY...CATCH block so if there's an error, it rolls back the transaction.
I've tried using table hints like ROWLOCK, but it hasn't made a difference. Since the insert is evaluated as a single statement, it seems ludicrous that I'm still getting a primary key violation on insert.
Does anyone have an idea why this is happening? And have you got ANY ideas which may point me in the direction of a solution?
Why is this happening?
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = @MessageSequence
This SELECT will try to locate the row; if it is not found, the EXISTS operator returns FALSE and the INSERT proceeds. However, the decision to INSERT is based on a state that was true at the time of the SELECT, but that is no longer guaranteed to be true at the time of the INSERT. In other words, you have a race condition: two threads can both look up the same @MessageSequence, both find NOT EXISTS, and both try to INSERT - only the first one will succeed; the second will cause a PK violation.
How do I solve it?
The quickest fix is to add a WITH (UPDLOCK) hint to the SELECT. This forces the lock placed on the @MessageSequence key to be retained, so the SELECT and INSERT behave atomically:
INSERT TableA (MessageSequence, Data)
SELECT @MessageSequence, @Data
WHERE NOT EXISTS
(
    SELECT TOP 1 MessageSequence FROM TableA WITH (UPDLOCK)
    WHERE MessageSequence = @MessageSequence
)
To prevent SQL Server from doing fancy stuff like taking a page lock, you can also add the ROWLOCK hint.
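Combined, the inner SELECT would then read (same statement as above, just with both hints):
SELECT TOP 1 MessageSequence FROM TableA WITH (UPDLOCK, ROWLOCK)
WHERE MessageSequence = @MessageSequence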
However, that is not my recommendation. My recommendation may surprise you, but it is this: do the operation that is most likely to succeed and handle the error if it fails. I.e. if your business case makes it more likely for @MessageSequence to be new, try the INSERT first and handle the PK violation if it fails. This way you avoid the spurious look-ups, and the cost of the catch/retry is amortized over the many cases where the INSERT succeeds on the first try.
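A minimal sketch of that insert-first approach (error 2627 is the number SQL Server raises for a primary key violation; RAISERROR re-signals other errors, since THROW is not available on SQL Server 2008):
BEGIN TRY
    INSERT TableA (MessageSequence, Data)
    VALUES (@MessageSequence, @Data);
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() = 2627
        -- the row already exists, so fall back to an update
        UPDATE TableA
        SET Data = @Data
        WHERE MessageSequence = @MessageSequence;
    ELSE
    BEGIN
        DECLARE @msg NVARCHAR(2048) = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END
END CATCH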
Also, it is perhaps worth investigating using the built-in queues that come with SQL Server.
Common problem. Explained here:
Defensive database programming: eliminating IF statements
It might be related to the transaction isolation level. You might need
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
before you start the transaction.
Also, if you have more updates than inserts, you should try the UPDATE first, check @@ROWCOUNT, and do the INSERT second.
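That update-first variant is just the mirror image of the stored procedure in the question, something like:
UPDATE TableA
SET Data = @Data
WHERE MessageSequence = @MessageSequence;

IF @@ROWCOUNT = 0
    INSERT TableA (MessageSequence, Data)
    VALUES (@MessageSequence, @Data);
Note that this still has the same race window unless it runs under an appropriate lock or isolation level.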
This is very similar to post 939831. Ultimately you want to use the hints ROWLOCK, READPAST and UPDLOCK. READPAST tells SQL Server to skip to the next record if the current one is locked. UPDLOCK tells SQL Server that the read lock is going to escalate to an update lock.
When I implemented something similar, I locked the next record using the thread ID:
UPDATE TOP (1) foo
SET ProcessorID = @PROCID
FROM OrderTable foo WITH (ROWLOCK, READPAST, UPDLOCK)
WHERE ProcessorID = 0
Then selected the record
SELECT *
FROM OrderTable WITH (NOLOCK)
WHERE ProcessorID = @PROCID
Then marked it as processed
UPDATE OrderTable
SET ProcessorID = -1
WHERE ProcessorID = @PROCID
Later, during off hours, I perform the relatively expensive delete operation to clear the queue of processed records.
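That cleanup might be as simple as (a sketch, using the same marker convention as above):
DELETE FROM OrderTable
WHERE ProcessorID = -1;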
The atomicity of the following statement is what you are after:
INSERT TableA (MessageSequence, Data)
SELECT @MessageSequence, @Data
WHERE NOT EXISTS
(
    SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = @MessageSequence
)
According to this person, it depends on the current isolation level.
On a tangent: since you're building a high-volume trading system, you might want to consider a tick database designed for such data (I'm not exactly sure what "message" you are storing here), such as those discussed in this thread, for example: http://www.elitetrader.com/vb/showthread.php?threadid=81345.
These are typically in-memory solutions with proprietary query languages. We use kdb+ at our shop.
I'm not sure what messaging product you use, but it may be worth looking at the transactions not at the DB level, but at the MQ level.
Of course, if you are using a TM (transaction manager), the two operations - 1) get from MQ and 2) write to DB - are both 'bracketed' under the same parent commit.
So I am not sure whether you are using an implicit, an explicit, or any TM here (for example, Microsoft's DTC).
MessageSequence is the PK, so could the same message from the MQ be getting processed twice?
When you perform a 'GET' from MQ, make sure the GET is committed (i.e. not a DB commit, but an MQ commit) - that will ensure the same MessageID cannot be 'popped' by the next thread that writes messages to the DB.