Looking for the appropriate combination of table hints - sql

I need to do some SQL Server transactions (select + insert) using locks to prevent conflicts. I have the following scenario: I have a table whose primary key is an integer but not auto-incremented (legacy, don't ask), therefore its value is determined as follows:
a select retrieves the maximum ID value from the table
the ID is incremented by one
a new record is inserted in the table using the new ID
All this is done in a transaction, the SQL command being as follows:
SELECT @maxvalue = max(MyId) FROM MyTable
IF @maxvalue > 0
SET @maxvalue = @maxvalue + 1
ELSE
SET @maxvalue = 1
INSERT INTO MyTable(MyValue, ...) VALUES(@maxvalue, ...)
This is prone to duplicate IDs in some scenarios, and the people who wrote it put it in a loop that retried the operation whenever a duplicate key error occurred. So I changed that, removing the loop and setting locks at the transaction level as follows:
SELECT @maxvalue = max(MyId) FROM MyTable WITH (HOLDLOCK, TABLOCKX)
IF @maxvalue > 0
SET @maxvalue = @maxvalue + 1
ELSE
SET @maxvalue = 1
INSERT INTO MyTable(MyValue, ...) VALUES(@maxvalue, ...)
So I specified two table hints, HOLDLOCK and TABLOCKX. That looked fine for some databases, but for some that had tens of thousands of records in this table the transaction took a long time, around 10 minutes. Looking in the SQL Server Activity Monitor I could see the transaction suspended, although after a very long while it executed successfully.
I then changed the hints to (HOLDLOCK, TABLOCK) and it works just as fast as it did before hints were used.
The problem is I am not sure whether this is the best combination for what I am looking for or whether something else is more appropriate. I have seen "Confused about UPDLOCK, HOLDLOCK" and https://www.sqlteam.com/articles/introduction-to-locking-in-sql-server but would appreciate expert opinions.

This should prevent duplicates. Added an extra column to illustrate how to use more columns:
DECLARE @val INT
INSERT MyTable(MyValue, val2)
SELECT coalesce(max(MyValue),0) + 1, @val
FROM mytable
To make sure, create a unique constraint as well:
ALTER TABLE MyTable
ADD CONSTRAINT UC_MyValue UNIQUE (MyValue);
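For the hint question itself, a combination that is often suggested is UPDLOCK together with HOLDLOCK on the read, in the same statement as the insert. The following is only a sketch under assumptions: it reuses the MyTable, MyValue and val2 names from above, assumes an index on MyValue (for example the unique constraint) so that MAX only touches the end of the index, and has not been measured against the 10-minute case from the question.

```sql
DECLARE @val INT;
SET @val = 42;  -- placeholder payload value, not from the question

BEGIN TRANSACTION;

-- UPDLOCK: a second session running this same statement waits here
--          instead of reading the same maximum.
-- HOLDLOCK: the key-range lock is kept until the transaction ends,
--           so nobody can slip in a higher MyValue in between.
INSERT INTO MyTable (MyValue, val2)
SELECT COALESCE(MAX(MyValue), 0) + 1, @val
FROM MyTable WITH (UPDLOCK, HOLDLOCK);

COMMIT TRANSACTION;
```

Unlike TABLOCKX, update locks are compatible with shared locks, so ordinary readers of existing rows are not blocked. The unique constraint remains the safety net if some other code path inserts without these hints.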

Related

Are there best practices for generating an incrementing sequence number in a SQL database?

I wrote a query to auto-increment the sequence number of each new record. It works fine for a single call.
insert into testTable (sequence_no)
select case when exists (select sequence_no from testTable)
then (select top(1) sequence_no + 1 from testTable order by sequence_no desc)
else '1'
end as sequence_no
Then I added a thread marker column, looped 100000 times, and ran the loop from two sessions at the same time.
thread 1:
declare @cnt INT = 0;
while @cnt < 100000
begin
insert into testTable (sequence_no, thread_no)
select case when exists (select sequence_no from testTable)
then (select top(1) sequence_no + 1 from testTable order by sequence_no desc)
else '1'
end as sequence_no, '1' as thread_no
SET @cnt = @cnt + 1;
END;
thread 2:
declare @cnt INT = 0;
while @cnt < 100000
begin
insert into testTable (sequence_no, thread_no)
select case when exists (select sequence_no from testTable)
then (select top(1) sequence_no + 1 from testTable order by sequence_no desc)
else '1'
end as sequence_no, '2' as thread_no
SET @cnt = @cnt + 1;
END;
As a result, around 70% of the requests succeed; the others fail with
Violation of PRIMARY KEY constraint 'sequence_no'. Cannot insert duplicate key in object 'dbo.testTable'.
I thought it would be solved if I used a transaction for each request, but the result is similar: around 70% succeed and the others fail with a duplicate primary key.
Does this mean my approach to sequence number generation is bad?
Can someone suggest an improvement?
Each RDBMS usually has its own "auto-number" setup (usually used for primary keys).
MySQL, MS SQL Server, Oracle (the article shows the three different syntaxes at one URL):
https://www.w3schools.com/sql/sql_autoincrement.asp
PostgreSQL:
https://chartio.com/resources/tutorials/how-to-define-an-auto-increment-primary-key-in-postgresql/
Your question is tagged with (microsoft) sql-server, so I'll paste that.
Syntax for SQL Server: the following SQL statement defines the "PersonKey" column to be an auto-increment primary key field in the "Person" table:
CREATE TABLE dbo.Person (
PersonKey int IDENTITY(1,1) PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255)
);
Do not reinvent the wheel.
So unless you are (trying) to INSERT a million rows in < 2 seconds....use what is already there for you.
Opinions.
You should not care that each primary key is perfectly in sequence. In other words, "gaps" should be OK.
If you think you need perfect sequencing, you need to ask yourself "why".
A primary key needs to be unique. Having "order" helps with indexing.
But 1,2,3,6,7,9,11 are ordered. (4, 5, 8 and 10 are missing, but does it really matter that they are missing?)
I will add that MS SQL Server has had "sequences" since version 2012.
https://learn.microsoft.com/en-us/sql/relational-databases/sequence-numbers/sequence-numbers?view=sql-server-2016
There are reasons to pick one over the other.
https://www.sqlshack.com/difference-between-identity-sequence-in-sql-server/
The 2 cent explanation is that a sequence can provide a range of values, and it is not tied to a single table (like identity is).
But practically, you end up having more "gaps" in the values, because once a sequence value is requested, the same value is never regenerated, even if the potential rows for INSERT do not actually make it as inserted rows.
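To illustrate the "range of values" point, here is a minimal sketch that reserves a block of numbers in one call using the system procedure sys.sp_sequence_get_range. The sequence name dbo.DemoRangeSeq and the range size are made up for this example, and it needs SQL Server 2012 or later.

```sql
-- Hypothetical sequence, used only for this illustration.
CREATE SEQUENCE dbo.DemoRangeSeq AS int START WITH 1 INCREMENT BY 1;
GO

-- The output parameters are typed sql_variant.
DECLARE @first sql_variant, @last sql_variant;

-- Reserve 10 consecutive values in a single call.
EXEC sys.sp_sequence_get_range
    @sequence_name     = N'dbo.DemoRangeSeq',
    @range_size        = 10,
    @range_first_value = @first OUTPUT,
    @range_last_value  = @last OUTPUT;

SELECT CAST(@first AS int) AS first_value,
       CAST(@last  AS int) AS last_value;
```

Any values in the reserved block that the caller never uses are simply lost, which is the source of the extra gaps mentioned above.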
The first example in the question (sequence_no +1) is how NOT to do it! Instead, SQL Server has two ways you can do this, and either is acceptable:
Define the column as an identity column, and omit it from the INSERT completely:
insert into testTable (thread_no) VALUES ('1')
Create a Sequence, and use the value from the sequence:
insert into testTable (sequence_no, thread_no) VALUES (NEXT VALUE FOR testTableSequence, '1')
Again, under no circumstances should you ever try to use TOP 1 sequence_no+1 or MAX(sequence_no)+1. That is wrong in SQL Server (really, it's wrong in MySql, too).
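For completeness, a sketch of the definitions that the two INSERT statements above rely on. The table names dbo.testTableIdentity and dbo.testTableSeq and the thread_no type are assumptions made so that both variants can live side by side; the sequence variant needs SQL Server 2012 or later.

```sql
-- Option 1: identity column; the engine assigns sequence_no.
CREATE TABLE dbo.testTableIdentity (
    sequence_no int IDENTITY(1,1) PRIMARY KEY,
    thread_no   varchar(10) NOT NULL
);
INSERT INTO dbo.testTableIdentity (thread_no) VALUES ('1');
GO

-- Option 2: a sequence object feeding the key.
CREATE SEQUENCE dbo.testTableSequence AS int START WITH 1 INCREMENT BY 1;
GO

CREATE TABLE dbo.testTableSeq (
    sequence_no int NOT NULL PRIMARY KEY
        DEFAULT (NEXT VALUE FOR dbo.testTableSequence),
    thread_no   varchar(10) NOT NULL
);

-- The default fills sequence_no when the column is omitted...
INSERT INTO dbo.testTableSeq (thread_no) VALUES ('1');

-- ...or the value can be requested explicitly.
INSERT INTO dbo.testTableSeq (sequence_no, thread_no)
VALUES (NEXT VALUE FOR dbo.testTableSequence, '2');
```

Both variants can be hammered from several sessions without duplicate key errors, at the cost of possible gaps.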
Due to concurrent access, if you want your code to succeed without any exception, you must lock the whole table with an exclusive lock for the duration of the transaction. So you must rewrite your code as:
BEGIN TRANSACTION;
insert into testTable WITH (XLOCK, TABLOCK, HOLDLOCK) (sequence_no)
SELECT case when exists (select sequence_no from testTable)
then (select top(1) sequence_no +1 from testTable order by sequence_no desc)
else '1'
end as sequence_no;
COMMIT
But this will be catastrophic for concurrent access... Locking the whole table takes time, induces contention, and can generate deadlocks!
That is why RDBMSs offer built-in increments such as IDENTITY or SEQUENCE, which are part of the ISO SQL standard.

SQLServer lock table during stored procedure

I've got a table where I need to auto-assign an ID 99% of the time (the other 1% rules out using an identity column it seems). So I've got a stored procedure to get next ID along the following lines:
select @nextid = lastid+1 from last_auto_id
check next available id in the table...
update last_auto_id set lastid = @nextid
Where the check has to check if users have manually used the IDs and find the next unused ID.
It works fine when I call it serially, returning 1, 2, 3 ... What I need to do is provide some locking for when multiple processes call this at the same time. Ideally, I just need it to exclusively lock the last_auto_id table around this code so that a second call must wait for the first to update the table before it can run its select.
In Postgres, I can do something like 'LOCK TABLE last_auto_id;' to explicitly lock the table. Any ideas how to accomplish it in SQL Server?
Thanks in advance!
The following update increments your lastid by one and assigns this value to your local variable in a single transaction.
Edit
Thanks to Dave and Mitch for pointing out isolation level problems with the original solution.
UPDATE last_auto_id WITH (READCOMMITTEDLOCK)
SET @nextid = lastid = lastid + 1
You guys have between you answered my question. I'm putting in my own reply to collate the working solution I've got into one post. The key seems to have been the transaction approach, with locking hints on the last_auto_id table. Setting the transaction isolation to serializable seemed to create deadlock problems.
Here's what I've got (edited to show the full code so hopefully I can get some further answers...):
DECLARE @NextId AS INT
DECLARE @Pointer AS INT
BEGIN TRANSACTION
-- Check what the next ID to use should be
SELECT @NextId = LastId + 1 FROM Last_Auto_Id WITH (TABLOCKX) WHERE Name = 'CustomerNo'
-- Now check if this next ID already exists in the database
IF EXISTS (SELECT CustomerNo FROM Customer
WHERE ISNUMERIC(CustomerNo) = 1 AND CustomerNo = @NextId)
BEGIN
-- The next ID already exists - we need to find the next lowest free ID
CREATE TABLE #idtbl ( IdNo int )
-- Into temp table, grab all numeric IDs higher than the current next ID
INSERT INTO #idtbl
SELECT CAST(CustomerNo AS INT) FROM Customer
WHERE ISNUMERIC(CustomerNo) = 1 AND CustomerNo >= @NextId
ORDER BY CAST(CustomerNo AS INT)
-- Join the table with itself, based on the right hand side of the join
-- being equal to the ID on the left hand side + 1. We're looking for
-- the lowest record where the right hand side is NULL (i.e. the ID is
-- unused)
SELECT @Pointer = MIN( t1.IdNo ) + 1 FROM #idtbl t1
LEFT OUTER JOIN #idtbl t2 ON t1.IdNo + 1 = t2.IdNo
WHERE t2.IdNo IS NULL
-- Use the free ID that was found (assumed intent; @Pointer was otherwise unused)
SET @NextId = @Pointer
END
UPDATE Last_Auto_Id SET LastId = @NextId WHERE Name = 'CustomerNo'
COMMIT TRANSACTION
SELECT @NextId
This takes out an exclusive table lock at the start of the transaction, which then successfully queues up any further requests until after this request has updated the table and committed its transaction.
I've written a bit of C code to hammer it with concurrent requests from half a dozen sessions and it's working perfectly.
However, I do have one worry which is the term locking 'hints' - does anyone know if SQLServer treats this as a definite instruction or just a hint (i.e. maybe it won't always obey it??)
How is this solution? No TABLE LOCK is required and works perfectly!!!
DECLARE @NextId INT
UPDATE Last_Auto_Id
SET @NextId = LastId = LastId + 1
WHERE Name = 'CustomerNo'
SELECT @NextId
An UPDATE statement always takes a lock to protect its update.
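The same single-statement idea can also be written with the OUTPUT clause instead of the combined variable assignment. This is only an alternative sketch, assuming the Last_Auto_Id table with the LastId and Name columns used above; the table variable @ids is introduced just for the example.

```sql
-- Table variable that receives the freshly incremented value.
DECLARE @ids TABLE (NextId int);

-- Increment and capture in one statement; the row stays locked
-- by the UPDATE until the statement (or enclosing transaction) ends.
UPDATE Last_Auto_Id
SET LastId = LastId + 1
OUTPUT inserted.LastId INTO @ids (NextId)
WHERE Name = 'CustomerNo';

SELECT NextId FROM @ids;
```

Like the variable-assignment version, this only hands out the next counter value; it does not skip IDs that users have already entered manually.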
You might want to consider deadlocks. These usually happen when multiple users call the stored procedure simultaneously. To make sure every call eventually succeeds, handle update failures by retrying inside a TRY...CATCH block. TRY...CATCH is available in SQL Server 2005 and later.
DECLARE @Tries tinyint
SET @Tries = 1
WHILE @Tries <= 3
BEGIN
BEGIN TRANSACTION
BEGIN TRY
-- this line updates the last_auto_id
update last_auto_id set lastid = lastid+1
COMMIT
BREAK
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() as ErrorMessage
ROLLBACK
SET @Tries = @Tries + 1
CONTINUE
END CATCH
END
I prefer doing this using an identity field in a second table. If you make lastid an identity column, then all you have to do is insert a row in that table and select SCOPE_IDENTITY() to get your new value, and you still have the concurrency safety of identity even though the id field in your main table is not an identity column.
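A minimal sketch of that idea, assuming a helper table named dbo.IdGenerator (the name is made up for illustration):

```sql
-- One-time setup: a table whose only job is to hand out numbers.
CREATE TABLE dbo.IdGenerator (Id int IDENTITY(1,1) PRIMARY KEY);
GO

-- Per request: insert an empty row and read back the generated value.
DECLARE @NewId int;
INSERT INTO dbo.IdGenerator DEFAULT VALUES;
SET @NewId = SCOPE_IDENTITY();

SELECT @NewId AS NewId;
```

Identity values are not rolled back, so a failed transaction leaves a gap, and the check for manually assigned IDs described in the question still has to be done separately.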

Copy one column to another for over a billion rows in SQL Server database

Database : SQL Server 2005
Problem : Copy values from one column to another column in the same table with a billion+
rows.
test_table (int id, bigint bigid)
Things tried 1: update query
update test_table set bigid = id
fills up the transaction log and rolls back due to lack of transaction log space.
Tried 2 - a procedure on following lines
set nocount on
declare @rowcount int, @rowsupdated bigint
set @rowcount = 1
set @rowsupdated = 0
set rowcount 500000
while @rowcount > 0
begin
update test_table set bigid = id where bigid is null
set @rowcount = @@rowcount
set @rowsupdated = @rowsupdated + @rowcount
end
set rowcount 0
print @rowsupdated
The above procedure starts slowing down as it proceeds.
Tried 3 - Creating a cursor for update.
generally discouraged in SQL Server documentation and this approach updates one row at a time which is too time consuming.
Is there an approach that can speed up the copying of values from one column to another? Basically I am looking for some 'magic' keyword or logic that will allow the update query to rip through the billion rows half a million at a time sequentially.
Any hints, pointers will be much appreciated.
I'm going to guess that you are closing in on the 2.1 billion limit of an INT datatype on an artificial key for a column. Yes, that's a pain. Much easier to fix before the fact than after you've actually hit that limit and production is shut down while you are trying to fix it :)
Anyway, several of the ideas here will work. Let's talk about speed, efficiency, indexes, and log size, though.
Log Growth
The log blew up originally because it was trying to commit all 2b rows at once. The suggestions in other posts for "chunking it up" will work, but that may not totally resolve the log issue.
If the database is in SIMPLE mode, you'll be fine (the log will re-use itself after each batch). If the database is in FULL or BULK_LOGGED recovery mode, you'll have to run log backups frequently during the running of your operation so that SQL can re-use the log space. This might mean increasing the frequency of the backups during this time, or just monitoring the log usage while running.
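To see which case applies and to watch the log while the batches run, something along these lines can help. It is a sketch only: the database name db1 and the backup path in the commented-out command are placeholders.

```sql
-- Recovery model, and what (if anything) is preventing log reuse,
-- for the current database.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();

-- Log size and percentage used for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- FULL or BULK_LOGGED only: back up the log between batches so the
-- space can be reused. Placeholder database name and path.
-- BACKUP LOG db1 TO DISK = N'X:\backup\db1_log.trn';
```

If log_reuse_wait_desc shows LOG_BACKUP while the update is running, the log is waiting for a log backup before it can reuse space.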
Indexes and Speed
ALL of the where bigid is null answers will slow down as the table is populated, because there is (presumably) no index on the new BIGID field. You could, (of course) just add an index on BIGID, but I'm not convinced that is the right answer.
The key (pun intended) is my assumption that the original ID field is probably the primary key, or the clustered index, or both. In that case, let's take advantage of that fact, and do a variation of Jess' idea:
declare @counter int
set @counter = 1
while @counter < 2000000000 --or whatever
begin
update test_table set bigid = id
where id between @counter and (@counter + 499999) --BETWEEN is inclusive
set @counter = @counter + 500000
end
This should be extremely fast, because of the existing indexes on ID.
The ISNULL check really wasn't necessary anyway, neither is my (-1) on the interval. If we duplicate some rows between calls, that's not a big deal.
Use TOP in the UPDATE statement:
UPDATE TOP (@row_limit) dbo.test_table
SET bigid = id
WHERE bigid IS NULL
You could try to use something like SET ROWCOUNT and do batch updates:
SET ROWCOUNT 5000;
UPDATE dbo.test_table
SET bigid = id
WHERE bigid IS NULL
GO
and then repeat this as many times as you need to.
This way, you're avoiding the RBAR (row-by-agonizing-row) symptoms of cursors and while loops, and yet, you don't unnecessarily fill up your transaction log.
Of course, in between runs, you'd have to do backups (especially of your log) to keep its size within reasonable limits.
Is this a one time thing? If so, just do it by ranges:
declare @counter int
set @counter = 500000
while @counter < 2000000000 --or whatever your max id
begin
update test_table set bigid = id where id between (@counter - 500000) and @counter and bigid is null
set @counter = @counter + 500000
end
I didn't run this to try it, but if you can get it to update 500k at a time I think you're moving in the right direction.
set rowcount 500000
update tt1
set bigid = (SELECT tt2.id FROM test_table tt2 WHERE tt1.id = tt2.id)
from test_table tt1
where tt1.bigid IS NULL
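You can also try switching the recovery model to SIMPLE for the duration of the update. Transactions are still logged, but the log is truncated at checkpoints instead of waiting for log backups. Take a full backup after switching back to FULL so that the log backup chain restarts: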
ALTER DATABASE db1
SET RECOVERY SIMPLE
GO
update test_table
set bigid = id
GO
ALTER DATABASE db1
SET RECOVERY FULL
GO
The first step would be to drop any indexes before the operation. They are probably what is causing the slowdown over time.
The other option, a little outside-the-box thinking... can you express the update in such a way that you could materialize the column values in a select? If you can do this, then you could create what amounts to a NEW table using SELECT INTO, which is a minimally logged operation (assuming in 2005 that you are set to a recovery model of SIMPLE or BULK_LOGGED). This would be pretty fast, and then you can drop the old table, rename this table to the old table name and recreate any indexes.
select id, CAST(id as bigint) bigid into test_table_temp from test_table
drop table test_table
exec sp_rename 'test_table_temp', 'test_table'
I second the UPDATE TOP(X) statement.
Also, if you're in a loop, add a WAITFOR delay or a COMMIT between iterations, to allow other processes some time to use the table if needed instead of blocking them until all the updates are completed.
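Putting the UPDATE TOP and WAITFOR suggestions together, a batched loop might look like the sketch below. The batch size and the one-second pause are arbitrary choices for illustration, and each UPDATE commits on its own because no explicit transaction is opened.

```sql
SET NOCOUNT ON;

DECLARE @batch int, @rows int, @total bigint;
SET @batch = 500000;   -- rows per batch (arbitrary)
SET @total = 0;

WHILE 1 = 1
BEGIN
    -- Each iteration is its own autocommit transaction.
    UPDATE TOP (@batch) dbo.test_table
    SET bigid = id
    WHERE bigid IS NULL;

    SET @rows = @@ROWCOUNT;
    IF @rows = 0 BREAK;          -- nothing left to copy

    SET @total = @total + @rows;

    -- Give other sessions (and log backups) a chance to run.
    WAITFOR DELAY '00:00:01';
END

PRINT @total;
```

Because the filter is bigid IS NULL, this loop still slows down as the table fills unless the search can use an index; ranging over the existing id key avoids that.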

Possible to implement a manual increment with just simple SQL INSERT?

I have a primary key that I don't want to auto increment (for various reasons) and so I'm looking for a way to simply increment that field when I INSERT. By simply, I mean without stored procedures and without triggers, so just a series of SQL commands (preferably one command).
Here is what I have tried thus far:
BEGIN TRAN
INSERT INTO Table1(id, data_field)
VALUES ( (SELECT (MAX(id) + 1) FROM Table1), '[blob of data]');
COMMIT TRAN;
* Data abstracted to use generic names and identifiers
However, when executed, the command errors, saying that
"Subqueries are not allowed in this context. Only scalar expressions are allowed"
So, how can I do this/what am I doing wrong?
EDIT: Since it was pointed out as a consideration, the table to be inserted into is guaranteed to have at least 1 row already.
You understand that you will have collisions, right?
You need to do something like this, and it might cause deadlocks, so be very sure what you are trying to accomplish here:
DECLARE @id int
BEGIN TRAN
SELECT @id = MAX(id) + 1 FROM Table1 WITH (UPDLOCK, HOLDLOCK)
INSERT INTO Table1(id, data_field)
VALUES (@id ,'[blob of data]')
COMMIT TRAN
To explain the collision thing, I have provided some code
first create this table and insert one row
CREATE TABLE Table1(id int primary key not null, data_field char(100))
GO
Insert Table1 values(1,'[blob of data]')
Go
Now open up two query windows and run this at the same time
declare @i int
set @i = 1
while @i < 10000
begin
BEGIN TRAN
INSERT INTO Table1(id, data_field)
SELECT MAX(id) + 1, '[blob of data]' FROM Table1
COMMIT TRAN;
set @i = @i + 1
end
You will see a bunch of these
Server: Msg 2627, Level 14, State 1, Line 7
Violation of PRIMARY KEY constraint 'PK__Table1__3213E83F2962141D'. Cannot insert duplicate key in object 'dbo.Table1'.
The statement has been terminated.
Try this instead:
INSERT INTO Table1 (id, data_field)
SELECT id, '[blob of data]' FROM (SELECT MAX(id) + 1 as id FROM Table1) tbl
I wouldn't recommend doing it that way for any number of reasons though (performance, transaction safety, etc)
It could be because there are no records so the sub query is returning NULL...try
INSERT INTO tblTest(RecordID, Text)
VALUES ((SELECT ISNULL(MAX(RecordID), 0) + 1 FROM tblTest), 'asdf')
I don't know if somebody is still looking for an answer but here is a solution that seems to work:
-- Preparation: execute only once
CREATE TABLE Test (Value int)
CREATE TABLE Lock (LockID uniqueidentifier)
INSERT INTO Lock SELECT NEWID()
-- Real insert
BEGIN TRAN LockTran
-- Lock an object to block simultaneous calls.
UPDATE Lock WITH(TABLOCK)
SET LockID = LockID
INSERT INTO Test
SELECT ISNULL(MAX(T.Value), 0) + 1
FROM Test T
COMMIT TRAN LockTran
We have a similar situation where we needed to increment and could not have gaps in the numbers. (If you use an identity value and a transaction is rolled back, that number will not be inserted and you will have gaps because the identity value does not roll back.)
We created a separate table for last number used and seeded it with 0.
Our insert takes a few steps.
--increment the number
Update dbo.NumberTable
set number = number + 1
--find out what the incremented number is
select @number = number
from dbo.NumberTable
--use the number
insert into dbo.MyTable using the @number
commit or rollback
This causes simultaneous transactions to process in a single line as each concurrent transaction will wait because the NumberTable is locked. As soon as the waiting transaction gets the lock, it increments the current value and locks it from others. That current value is the last number used and if a transaction is rolled back, the NumberTable update is also rolled back so there are no gaps.
Hope that helps.
Another way to cause single file execution is to use a SQL application lock. We have used that approach for longer running processes like synchronizing data between systems so only one synchronizing process can run at a time.
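A sketch of the application-lock approach using sp_getapplock follows. The resource name 'Table1_NextId' and the 10-second timeout are arbitrary choices for this example, and Table1 with its id and data_field columns is borrowed from the question.

```sql
DECLARE @rc int;

BEGIN TRAN;

-- Ask for an exclusive, transaction-owned application lock.
-- Other sessions requesting the same resource name wait here.
EXEC @rc = sp_getapplock
    @Resource    = 'Table1_NextId',
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',
    @LockTimeout = 10000;          -- milliseconds

IF @rc >= 0
BEGIN
    -- Lock granted: only one session at a time reaches this point.
    INSERT INTO Table1 (id, data_field)
    SELECT ISNULL(MAX(id), 0) + 1, '[blob of data]'
    FROM Table1;

    COMMIT TRAN;   -- releases the application lock
END
ELSE
BEGIN
    -- Negative return codes mean timeout, deadlock or another failure.
    ROLLBACK TRAN;
END
```

This only serializes callers that also request the same lock; any code path that inserts into the table without taking it can still collide.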
If you're doing it in a trigger, you could make sure it's an "INSTEAD OF" trigger and do it in a couple of statements:
DECLARE @next INT
SET @next = (SELECT (MAX(id) + 1) FROM Table1)
INSERT INTO Table1
SELECT @next, datablob FROM inserted
The only thing you'd have to be careful about is concurrency - if two rows are inserted at the same time, they could attempt to use the same value for @next, causing a conflict.
Does this accomplish what you want?
It seems very odd to do this sort of thing w/o an IDENTITY (auto-increment) column, making me question the architecture itself. I mean, seriously, this is the perfect situation for an IDENTITY column. It might help us answer your question if you'd explain the reasoning behind this decision. =)
Having said that, some options are:
using an INSTEAD OF trigger for this purpose. So, you'd do your INSERT (the INSERT statement would not need to pass in an ID). The trigger code would handle inserting the appropriate ID. You'd need to use the WITH (UPDLOCK, HOLDLOCK) syntax used by another answerer to hold the lock for the duration of the trigger (which is implicitly wrapped in a transaction) & to elevate the lock type from "shared" to "update" lock (IIRC).
you can use the idea above, but have a table whose purpose is to store the last, max value inserted into the table. So, once the table is set up, you would no longer have to do a SELECT MAX(ID) every time. You'd simply increment the value in the table. This is safe provided that you use appropriate locking (as discussed). Again, that avoids repeated table scans every time you INSERT.
use GUIDs instead of IDs. It's much easier to merge tables across databases, since the GUIDs will always be unique (whereas records across databases will have conflicting integer IDs). To avoid page splitting, sequential GUIDs can be used. This is only beneficial if you might need to do database merging.
Use a stored proc in lieu of the trigger approach (since triggers are to be avoided, for some reason). You'd still have the locking issue (and the performance problems that can arise). But sprocs are preferred over dynamic SQL (in the context of applications), and are often much more performant.
Sorry about rambling. Hope that helps.
How about creating a separate table to maintain the counter? It has better performance than MAX(id), as it will be O(1). MAX(id) is at best O(log n), depending on the implementation.
And then when you need to insert, simply lock the counter table for reading the counter and increment the counter. Then you can release the lock and insert to your table with the incremented counter value.
Have a separate table where you keep your latest ID and for every transaction get a new one.
It may be a bit slower but it should work.
DECLARE @NEWID INT
BEGIN TRAN
-- TABLE stands for the name of your counter table
UPDATE TABLE SET ID=ID+1
SELECT @NEWID=ID FROM TABLE
COMMIT TRAN
PRINT @NEWID -- Do what you want with your new ID
Code without any transaction scope (I use it in my engineering course as an exercise):
-- Preparation: execute only once
CREATE TABLE increment (val int);
INSERT INTO increment VALUES (1);
-- Real insert
DECLARE @newIncrement INT;
UPDATE increment
SET @newIncrement = val,
val = val + 1;
INSERT INTO Table1 (id, data_field)
SELECT @newIncrement, 'some data';
begin tran;
declare @nextId int
set @nextId = (select MAX(id)+1 from Table1)
insert into Table1(id, data_field) values (@nextId, '[blob of data]')
commit;
But perhaps a better approach would be using a scalar function getNextId('table1')
Any critiques of this? Works for me.
DECLARE @m_NewRequestID INT
, @m_IsError BIT = 1
, @m_CatchEndless INT = 0
WHILE @m_IsError = 1
BEGIN TRY
SELECT @m_NewRequestID = (SELECT ISNULL(MAX(RequestID), 0) + 1 FROM Requests)
INSERT INTO Requests ( RequestID
, RequestName
, Customer
, Comment
, CreatedFromApplication)
SELECT RequestID = @m_NewRequestID
, RequestName = dbo.ufGetNextAvailableRequestName(PatternName)
, Customer = @Customer
, Comment = [Description]
, CreatedFromApplication = @CreatedFromApplication
FROM RequestPatterns
WHERE PatternID = @PatternID
SET @m_IsError = 0
END TRY
BEGIN CATCH
SET @m_IsError = 1
SET @m_CatchEndless = @m_CatchEndless + 1
IF @m_CatchEndless > 1000
THROW 51000, '[upCreateRequestFromPattern]: Unable to get new RequestID', 1
END CATCH
This should work:
INSERT INTO Table1 (id, data_field)
SELECT (SELECT (MAX(id) + 1) FROM Table1), '[blob of data]';
Or this (substitute LIMIT for other platforms):
INSERT INTO Table1 (id, data_field)
SELECT TOP 1
id + 1, '[blob of data]'
FROM
Table1
ORDER BY
[id] DESC;

atomic compare and swap in a database

I am working on a work queueing solution. I want to query a given row in the database, where a status column has a specific value, modify that value and return the row, and I want to do it atomically, so that no other query will see it:
begin transaction
select * from table where pk = x and status = y
update table set status = z where pk = x
commit transaction
--(the row would be returned)
it must be impossible for 2 or more concurrent queries to return the row (one query execution would see the row while its status = y) -- sort of like an interlocked CompareAndExchange operation.
I know the code above runs (for SQL server), but will the swap always be atomic?
I need a solution that will work for SQL Server and Oracle
Is PK the primary key? Then this is a non issue, if you already know the primary key there is no sport. If pk is the primary key, then this begs the obvious question how do you know the pk of the item to dequeue...
The problem is if you don't know the primary key and want to dequeue the next 'available' (ie. status = y) and mark it as dequeued (delete it or set status = z).
The proper way to do this is to use a single statement. Unfortunately the syntax differs between Oracle and SQL Server. The SQL Server syntax is:
update top (1) [<table>]
set status = z
output DELETED.*
where status = y;
I'm not familiar enough with Oracle's RETURNING clause to give an example similar to SQL's OUTPUT one.
Other SQL Server solutions require lock hints on the SELECT (with UPDLOCK) to be correct.
In Oracle the preferred avenue is to use FOR UPDATE, but that does not work in SQL Server, since there FOR UPDATE can only be used in conjunction with cursors.
In any case, the behavior you have in the original post is incorrect. Multiple sessions can all select the same row(s) and even all update it, returning the same dequeued item(s) to multiple readers.
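For the case where the primary key is already known, the lock-hint variant mentioned above could look like the following sketch for SQL Server. The table name dbo.WorkQueue, the variable values and the status strings are hypothetical; only the pk and status columns come from the question.

```sql
DECLARE @x int, @y varchar(20), @z varchar(20);
SET @x = 1;          -- key of the row to claim
SET @y = 'ready';    -- expected current status
SET @z = 'taken';    -- status to swap in

BEGIN TRANSACTION;

-- UPDLOCK: a second session running this same SELECT waits here
-- until the first one commits, then no longer sees status = @y.
SELECT *
FROM dbo.WorkQueue WITH (UPDLOCK, ROWLOCK)
WHERE pk = @x AND status = @y;

-- Repeat the status check so the update is a no-op if the row
-- was not returned above.
UPDATE dbo.WorkQueue
SET status = @z
WHERE pk = @x AND status = @y;

COMMIT TRANSACTION;
```

The single UPDATE ... OUTPUT statement remains the simpler choice; this form is only needed when the select and the update have to stay separate.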
As a general rule, to make an operation like this atomic you'll need to ensure that you set an exclusive (or update) lock when you perform the select so that no other transaction can read the row before your update.
The typical syntax for this is something like:
select * from table where pk = x and status = y for update
but you'd need to look it up to be sure.
I have some applications that follow a similar pattern. There is a table like yours that represents a queue of work. The table has two extra columns: thread_id and thread_date. When the app asks for work from the queue, it submits a thread id. Then a single update statement updates all applicable rows, setting the thread id column to the submitted id and the thread date column to the current time. After that update, it selects all rows with that thread id. This way you don't need to declare an explicit transaction. The "locking" occurs in the initial update.
The thread_date column is used to ensure that you do not end up with orphaned work items. What happens if items are pulled from the queue and then your app crashes? You have to have the ability to try those work items again. So you might grab all items off the queue that have not been marked completed but have been assigned to a thread with a thread date in the distant past. It's up to you to define "distant."
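A sketch of that claim-then-select pattern, under assumptions: the table name dbo.WorkQueue, the pk and status columns, the completed flag and the 30-minute definition of "distant" are all invented for illustration; only thread_id and thread_date come from the description above.

```sql
DECLARE @thread_id int;
SET @thread_id = 7;   -- id submitted by the calling worker

-- Claim every unassigned item, plus unfinished items whose previous
-- owner has been silent for more than 30 minutes.
UPDATE dbo.WorkQueue
SET thread_id   = @thread_id,
    thread_date = GETDATE()
WHERE completed = 0
  AND (thread_id IS NULL
       OR thread_date < DATEADD(minute, -30, GETDATE()));

-- Read back exactly the rows this worker now owns.
SELECT pk, status
FROM dbo.WorkQueue
WHERE thread_id = @thread_id
  AND completed = 0;
```

The UPDATE is atomic per row, so two workers running it concurrently end up with disjoint sets of rows.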
Try this. The validation is in the UPDATE statement.
Code
IF EXISTS (SELECT * FROM sys.tables WHERE name = 't1')
DROP TABLE dbo.t1
GO
CREATE TABLE dbo.t1 (
ColID int IDENTITY,
[Status] varchar(20)
)
GO
DECLARE @id int
DECLARE @initialValue varchar(20)
DECLARE @newValue varchar(20)
SET @initialValue = 'Initial Value'
INSERT INTO dbo.t1 (Status) VALUES (@initialValue)
SELECT @id = SCOPE_IDENTITY()
SET @newValue = 'Updated Value'
BEGIN TRAN
UPDATE dbo.t1
SET
@initialValue = [Status],
[Status] = @newValue
WHERE ColID = @id
AND [Status] = @initialValue
SELECT ColID, [Status] FROM dbo.t1
COMMIT TRAN
SELECT @initialValue AS '@initialValue', @newValue AS '@newValue'
Results
ColID  Status
-----  -------------
1      Updated Value

@initialValue  @newValue
-------------  -------------
Initial Value  Updated Value