SQL Table Locking - sql

I have an SQL Server locking question regarding an application we have in house. The application takes submissions of data and persists them into an SQL Server table. Each submission is also assigned a special catalog number (unrelated to the identity field in the table) which is a sequential alpha numeric number. These numbers are pulled from another table and are not generated at run time. So the steps are
1. Insert data into the Submission table.
2. Grab the next unassigned catalog number from the Catalog table.
3. Assign the catalog number to the submission in the Submission table.
All these steps happen sequentially in the same stored procedure.
It's rare, but sometimes we manage to get two submissions in the same second and they both get assigned the same Catalog Number, which causes a localized version of the Apocalypse in our company for a small while.
What can we do to limit the over assignment of the catalog numbers?

When getting your next catalog number, use row locking to protect the time between you finding it and marking it as in use, e.g.:
set transaction isolation level REPEATABLE READ
begin transaction
select top 1 @catalog_number = catalog_number
from catalog_numbers with (updlock,rowlock)
where assigned = 0
update catalog_numbers set assigned = 1 where catalog_number = @catalog_number
commit transaction

You could use an identity field to produce the catalog numbers, that way you can safely create and get the number:
insert into Catalog default values
set @CatalogNumber = scope_identity()
The scope_identity function will return the id of the last record created in the same session, so separate sessions can create records at the same time and still end up with the correct id.
If you can't use an identity field to create the catalog numbers, you have to use a transaction to make sure that you can determine the next number and create it without another session accessing the table.

I like araqnid's response. You could also use an insert trigger on the submission table to accomplish this. The trigger would be in the scope of the insert, and you would effectively embed the logic to assign the catalog_number in the trigger. Just wanted to put your options up here.

Here's the easy solution. No race condition. No blocking from a restrictive transaction isolation level. Probably won't work in SQL dialects other than T-SQL, though.
I assume there is some outside force at work to keep your catalog number table populated with unassigned catalog numbers.
This technique should work for you: just do the same sort of "interlocked update" that retrieves a value, something like:
update top (1) CatalogNumber
set in_use = 1 ,
@newCatalogNumber = catalog_number
from CatalogNumber
where in_use = 0
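Folded into the submission procedure from the question, the interlocked update might look roughly like the sketch below. The names dbo.Submission, submission_id, payload and the catalog_number column on the submission table are assumptions made for illustration; only CatalogNumber, in_use and catalog_number come from the snippet above.

```sql
-- Sketch only: dbo.Submission, submission_id and payload are assumed names.
create procedure dbo.AddSubmission
    @payload varchar(max)
as
begin
    set nocount on;
    set xact_abort on;

    declare @submissionId int;
    declare @newCatalogNumber varchar(50);

    begin transaction;

    -- step 1: store the submission
    insert dbo.Submission ( payload ) values ( @payload );
    set @submissionId = scope_identity();

    -- step 2: find and mark one unassigned number in a single statement
    update top (1) dbo.CatalogNumber
    set in_use = 1,
        @newCatalogNumber = catalog_number
    where in_use = 0;

    -- nothing was claimed: the pool of unassigned numbers is empty
    if @newCatalogNumber is null
    begin
        rollback transaction;
        raiserror('No unassigned catalog numbers left.', 16, 1);
        return;
    end

    -- step 3: attach the claimed number to the submission
    update dbo.Submission
    set catalog_number = @newCatalogNumber
    where submission_id = @submissionId;

    commit transaction;
end
```

Note that TOP in an UPDATE does not promise any particular row order, so if the numbers must be handed out strictly in sequence, the claim would need an ordered variant (for example, updating through a CTE that selects TOP (1) with an ORDER BY).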
Anyway, the following stored procedure just ticks up a number on each execution and hands back the previous one. If you want a fancier value, add a computed column that applies the transform of choice to the incrementing value to get the desired value.
drop table dbo.PrimaryKeyGenerator
go
create table dbo.PrimaryKeyGenerator
(
id varchar(100) not null ,
current_value int not null default(1) ,
constraint PrimaryKeyGenerator_PK primary key clustered ( id ) ,
)
go
drop procedure dbo.GetNewPrimaryKey
go
create procedure dbo.GetNewPrimaryKey
@name varchar(100)
as
set nocount on
set ansi_nulls on
set concat_null_yields_null on
set xact_abort on
declare
@uniqueValue int
--
-- put the supplied key in canonical form
--
set @name = ltrim(rtrim(lower(@name)))
--
-- if the name isn't already defined in the table, define it.
--
insert dbo.PrimaryKeyGenerator ( id )
select id = @name
where not exists ( select *
from dbo.PrimaryKeyGenerator pkg
where pkg.id = @name
)
--
-- now, an interlocked update to get the current value and increment the table
--
update PrimaryKeyGenerator
set @uniqueValue = current_value ,
current_value = current_value + 1
where id = @name
--
-- return the new unique value to the caller
--
return @uniqueValue
go
To use it:
declare @pk int
exec @pk = dbo.GetNewPrimaryKey 'foobar'
select @pk
Trivial to mod it to return a result set or return the value via an OUTPUT parameter.
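As a rough sketch of the OUTPUT-parameter variant, assuming the same dbo.PrimaryKeyGenerator table (the procedure name GetNewPrimaryKeyOut is made up here so it does not clash with the version above):

```sql
-- Hypothetical variant: hands the value back through an OUTPUT parameter.
create procedure dbo.GetNewPrimaryKeyOut
    @name varchar(100),
    @uniqueValue int output
as
    set nocount on
    set @name = ltrim(rtrim(lower(@name)))

    -- define the key on first use
    insert dbo.PrimaryKeyGenerator ( id )
    select @name
    where not exists ( select *
                       from dbo.PrimaryKeyGenerator pkg
                       where pkg.id = @name )

    -- same interlocked update: read the current value, then bump it
    update dbo.PrimaryKeyGenerator
    set @uniqueValue = current_value,
        current_value = current_value + 1
    where id = @name
go

declare @pk int
exec dbo.GetNewPrimaryKeyOut @name = 'foobar', @uniqueValue = @pk output
select @pk
```

One reason to prefer this shape is that a procedure's return value is limited to int and is conventionally used as a status code, whereas an OUTPUT parameter can carry any type.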

Related

Best way to generate a UniqueID for a group of rows?

This is very simplified but I have a web service array of items that look something like this:
[12345, 34131, 13431]
and I am going to be looping through the array and inserting them one by one into a database, and I want that table to look like this. These values would be tied to a unique identifier showing that they were inserted together:
1 12345
1 34131
1 13431
and then if another array came along it would then insert all of its numbers with unique ID 2.... basically this is to keep track of groups.
There will be multiple processes executing this potentially at the same time so what would be the best way to generate the unique identifier and also ensure that 2 processes couldn't have used the same one?
You should fix your data model. It is missing an entity, say, batches.
create table batches (
batch_id int identity(1, 1) primary key,
created_at datetime default getdate()
);
You might have other information as well.
And your table should have a foreign key reference, batch_id to batches.
Then your code should do the following:
Insert a new row into batches. A new batch has begun.
Fetch the id that was just created.
Use this id for the rows that you want to insert.
Although you could do this with a sequence, a separate table makes more sense to me. You are tying a bunch of rows together into something. That something should be represented in the data model.
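A minimal sketch of those three steps, assuming the batches table above and a hypothetical detail table batch_items(batch_id, item_value) that carries the foreign key:

```sql
-- batch_items and item_value are assumed names for the detail table.
declare @batch_id int;

begin transaction;

-- 1. a new batch has begun
insert into batches default values;

-- 2. fetch the id that was just created in this session and scope
set @batch_id = scope_identity();

-- 3. tag every row of the incoming array with that id
insert into batch_items (batch_id, item_value)
values (@batch_id, 12345),
       (@batch_id, 34131),
       (@batch_id, 13431);

commit transaction;
```

Because the identity value is generated by the database, two processes running this at the same time each get their own batch_id without any explicit locking.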
You can declare this :
DECLARE @UniqueID UNIQUEIDENTIFIER = NEWID();
and use this as your unique identifier when you insert your batch
Since it isn't a primary key, an identity column is out. Honestly I'd probably just track it using a separate id sequence table. Create a proc that grabs the next available ID and then increments it. If you open a transaction at the beginning of the proc it should prevent the second thread from getting the number until the first thread is done with its update.
Something like:
CREATE PROCEDURE getNextID
@NextNumber INT OUTPUT
,@id_type VARCHAR(20)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @NextValue TABLE (NextNumber int);
BEGIN TRANSACTION;
-- increment the stored counter itself, not the (possibly NULL) output parameter
UPDATE id_sequence
SET last_used_number = ISNULL(last_used_number, 0) + 1
OUTPUT inserted.last_used_number INTO @NextValue(NextNumber)
WHERE id_type = @id_type;
SELECT @NextNumber = NextNumber FROM @NextValue;
COMMIT TRANSACTION;
END

Creating JVM level and Thread Safe Sequence in DB

This is an old question, but I am asking again to be sure.
I have created a sequence, like the ones in Oracle, using a table, and I want to use it from multiple threads and multiple JVMs, with all processes hitting it in parallel.
The sequence stored procedure follows. Will this work with multiple JVMs and always provide a unique number to the threads in all JVMs, or is there any chance of it returning the same sequence number to more than one call?
create table sequenceTable (id int)
insert into sequenceTable values (0)
create procedure mySequence
AS
BEGIN
declare @seqNum int
declare @rowCount int
select @rowCount = 0
while(@rowCount = 0)
begin
select @seqNum = id from sequenceTable
update sequenceTable set id = id + 1 where id = @seqNum
select @rowCount = @@rowcount
print 'numbers of rows update %1!', @rowCount
end
SELECT @seqNum
END
If you choose to maintain your current design of updating the sequenceTable.id column each time you want to generate a new sequence number, you need to make sure:
the 'current' process gets an exclusive lock on the row containing the desired sequence number
the 'current' process then updates the desired row and retrieves the newly updated value
the 'current' process releases the exclusive lock
While the above can be implemented via a begin tran + update + select + commit tran, it's actually a bit easier with a single update statement, eg:
create procedure mySequence
AS
begin
declare @seqNum int
update sequenceTable
set @seqNum = id + 1,
id = id + 1
select @seqNum
end
The update statement is its own transaction, so the update of the id column and the assignment of @seqNum = id + 1 are performed under an exclusive lock within the update's transaction.
Keep in mind that the exclusive lock will block other processes from obtaining a new id value; net result is that the generation of new id values will be single-threaded/sequential
While this is 'good' from the perspective of ensuring all processes obtain a unique value, it does mean this particular update statement becomes a bottleneck if you have multiple processes hitting the update concurrently.
In such a situation (high volume of concurrent updates) you could alleviate some contention by calling the stored proc less often; this could be accomplished by having the calling processes request a range of new id values (eg, pass @increment as input parameter to the proc, then instead of id + 1 you use id + @increment), with the calling process then knowing it can use sequence numbers (@seqNum-@increment+1) to @seqNum.
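A sketch of that range-reserving variant, assuming the same single-row sequenceTable; the procedure name mySequenceRange and the result column names are made up for illustration:

```sql
-- Hypothetical range variant of mySequence.
create procedure mySequenceRange
    @increment int
as
begin
    declare @seqNum int

    -- one statement both advances the counter and captures the new high value
    update sequenceTable
    set @seqNum = id + @increment,
        id = id + @increment

    -- the caller now owns every value from first_id through last_id
    select @seqNum - @increment + 1 as first_id,
           @seqNum as last_id
end
```

A caller asking for 100 values makes one trip to the table instead of 100, at the cost of gaps in the sequence if it does not use everything it reserved. A real version would probably also reject an @increment that is not positive.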
Obviously (?) any process that uses a stored proc to generate 'next id' values only works if *ALL* processes a) always call the proc for a new id value and b) *ALL* processes only use the id value returned by the proc (eg, they don't generate their own id values).
If there's a possibility of applications not following this process (call proc to get new id value), you may want to consider pushing the creation of the unique id values out to the table where these id values are being inserted; in other words, modify the target table's id column to include the identity attribute; this eliminates the need for applications to call the stored proc (to generate a new id) and it (still) ensures a unique id is generated for each insert.
You can emulate sequences in ASE. Use the reserve_identity function to achieve required type of activity:
create table sequenceTable (id bigint identity)
go
create procedure mySequence AS
begin
select reserve_identity('sequenceTable', 1)
end
go
This solution is non-blocking and does generate minimal transaction log activity.

SQL Server : update value if it does not exist in a column

I am using SQL Server 2008 and trying to create a statement which will update a single value within a row from another table if a certain parameter is met. I need to make this as simple as possible for a member of my team to use.
So in this case I want to store 2 values, the Sales Order and the reference. Unfortunately the Sales order has a unique identifier that I need to record and enter into the jobs table and NOT the actual sales order reference.
The parameter which needs to be met is that the Sales order unique identifier cannot exist anywhere in the sales order column within the jobs table. I can get this to work when the while value is set to 1 but not when it's set to 0 and in my head it should be set to 0. Anyone got any ideas why this doesn't work?
/****** Attach an SO to a WO ******/
/****** ONLY EDIT THE VALUES BETWEEN '' ******/
Declare @Reference nvarchar(30);
Set @Reference = 'WO16119';
Declare @SO nvarchar(30);
Set @SO = '0000016205';
/****** DO NOT ALTER THE CODE BEYOND THIS POINT!!!!!!!!!!!!! ******/
/* store more values */
Declare @SOID nvarchar(30);
Set @SOID = (Select SOPOrderReturnID
FROM Test_DB.dbo.SOTable
Where DocumentNo = @SO);
/* check if update should run */
Declare @Check nvarchar (30);
Set @Check = (Select case when exists (select *
from Test_DB.dbo.Jobs
where SalesOrderNumber != @SOID)
then CAST(1 AS BIT)
ELSE CAST(0 AS BIT) End)
While (@Check = 0)
/* if check is true run code below */
Begin
Update Test_DB.dbo.jobs
SET SalesOrderNumber = (Select SOPOrderReturnID
FROM Test_DB.dbo.SOPOrderReturn
Where DocumentNo = @SO)
Where Reference = @Reference
END;
A few comments. First, to avoid an endless loop, replace your WHILE with an IF statement. You never change the @Check value, so the loop would run forever:
IF (@Check = 0)
BEGIN
/* if check is true run code below */
Update Test_DB.dbo.jobs
SET SalesOrderNumber = (Select SOPOrderReturnID
FROM Test_DB.dbo.SOPOrderReturn
Where DocumentNo = @SO)
Where Reference = @Reference
END
Then, without knowing your application I would say that the way you make checks is going to require you to lock your tables to avoid other users invalidating the results of your SELECTs.
I would instead create a UNIQUE constraint on the column you want to be unique and handle the error gracefully. This way you don't need to take big locks on your tables:
CREATE UNIQUE INDEX IX_UniqueIndex ON Test_DB.dbo.Jobs(SalesOrderNumber)
As per your comment if you cannot create a unique index you may want to try the following SQL although it could cause too much locking and be affected by race conditions:
IF NOT EXISTS (SELECT 1 FROM Test_DB.dbo.Jobs j INNER JOIN Test_DB.dbo.SOTable so ON j.SalesOrderNumber = so.SOPOrderReturnID WHERE so.DocumentNo = @SO)
BEGIN
UPDATE j
SET SalesOrderNumber = so.SOPOrderReturnID
FROM
Test_DB.dbo.Jobs j
INNER JOIN Test_DB.dbo.SOTable so ON so.DocumentNo = @SO
Where
j.Reference = @Reference
END
The risk is that you are running two separate queries (the SELECT and the UPDATE), so the state of the database could change between them. The first query may report that nothing exists for that id, but by the time the update runs another user may have inserted or updated that data, so the earlier result is no longer true.
You can try to avoid this problem by using an isolation level that locks the data it reads (like SERIALIZABLE), but that could cause blocking and even deadlocks in the database.
The best solution here is the unique index. If a certain column has to be unique inside a table the best controller system is the db itself by defining constraints.
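A sketch of the unique-index approach with the error handled in a TRY...CATCH block. It reuses the table and column names from the question; the index name and the WHERE SalesOrderNumber IS NOT NULL filter are assumptions (the filter only matters if several jobs can have no sales order yet, and filtered indexes need SQL Server 2008 or later).

```sql
-- Assumed: jobs without a sales order store NULL, so the index skips them.
CREATE UNIQUE INDEX IX_Jobs_SalesOrderNumber
    ON Test_DB.dbo.Jobs (SalesOrderNumber)
    WHERE SalesOrderNumber IS NOT NULL;
GO

DECLARE @Reference nvarchar(30);
SET @Reference = 'WO16119';
DECLARE @SO nvarchar(30);
SET @SO = '0000016205';

BEGIN TRY
    -- no pre-check needed: the index rejects a duplicate sales order
    UPDATE Test_DB.dbo.Jobs
    SET SalesOrderNumber = (SELECT SOPOrderReturnID
                            FROM Test_DB.dbo.SOTable
                            WHERE DocumentNo = @SO)
    WHERE Reference = @Reference;
END TRY
BEGIN CATCH
    -- 2601: duplicate key in a unique index; 2627: unique constraint violation
    IF ERROR_NUMBER() IN (2601, 2627)
        PRINT 'That sales order is already attached to another job.';
    ELSE
    BEGIN
        -- anything else is unexpected, so pass it on to the caller
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END
END CATCH;
```

The check and the write now happen in one statement, so there is no window between a SELECT and an UPDATE for another session to slip into.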

sql stored procedure not working(no rows affected)

trying to get this stored procedure to work.
ALTER PROCEDURE [team1].[add_testimonial]
-- Add the parameters for the stored procedure here
@currentTestimonialDate char(10),@currentTestimonialContent varchar(512),@currentTestimonialOriginator varchar(20)
AS
BEGIN
DECLARE
@keyValue int
SET NOCOUNT ON;
--Get the Highest Key Value
SELECT @keyValue=max(TestimonialKey)
FROM Testimonial
--Update the Key by 1
SET @keyValue=@keyValue+1
--Store into table
INSERT INTO Testimonial VALUES (@keyValue, @currentTestimonialDate, @currentTestimonialContent, @currentTestimonialOriginator)
END
yet it just returns
Running [team1].[add_testimonial] ( @currentTestimonialDate = 11/11/10, @currentTestimonialContent = this is a test, @currentTestimonialOriginator = theman ).
No rows affected.
(0 row(s) returned)
@RETURN_VALUE = 0
Finished running [team1].[add_testimonial].
and nothing is added to the database, what might be the problem?
There may be problems in two places:
a. There is no data in the table, so max(TestimonialKey) returns NULL and the incremented key is NULL as well. Below is the appropriate way to handle it:
--Get the Highest Key Value
SELECT @keyValue= ISNULL(MAX(TestimonialKey), 0)
FROM Testimonial
--Update the Key by 1
SET @keyValue=@keyValue+1
b. Check the data type of the date column in the table. If it is datetime rather than char, convert @currentTestimonialDate to datetime before inserting.
Also, check which columns do not allow NULLs and make sure you are passing data for them.
If you're not passing data for all columns, specify the column names as below:
--Store into table
INSERT INTO Testimonial(TestimonialKey, currentTestimonialDate,
currentTestimonialContent, currentTestimonialOriginator)
VALUES (@keyValue, @currentTestimonialDate,
@currentTestimonialContent, @currentTestimonialOriginator)
EDIT:
After getting the comment from marc_s:
Make the key column an INT IDENTITY. If multiple users call the procedure concurrently, that won't be a problem, because the DBMS will handle it. The final procedure might look like this:
ALTER PROCEDURE [team1].[add_testimonial]
-- Add the parameters for the stored procedure here
@currentTestimonialDate char(10),
@currentTestimonialContent varchar(512),@currentTestimonialOriginator varchar(20)
AS
BEGIN
SET NOCOUNT ON;
--Store into table
INSERT INTO Testimonial VALUES (@currentTestimonialDate,
@currentTestimonialContent, @currentTestimonialOriginator)
END
Two issues that I can spot:
SELECT @keyValue=max(TestimonialKey)
should be
SELECT @keyValue=ISNULL(max(TestimonialKey), 0)
To account for the case when there are no records in the database
Second, I believe that with NOCOUNT ON, you will not return the count of inserted rows to the caller. So, before your INSERT statement, add
SET NOCOUNT OFF

atomic compare and swap in a database

I am working on a work queueing solution. I want to query a given row in the database, where a status column has a specific value, modify that value and return the row, and I want to do it atomically, so that no other query will see it:
begin transaction
select * from table where pk = x and status = y
update table set status = z where pk = x
commit transaction
--(the row would be returned)
it must be impossible for 2 or more concurrent queries to return the row (one query execution would see the row while its status = y) -- sort of like an interlocked CompareAndExchange operation.
I know the code above runs (for SQL server), but will the swap always be atomic?
I need a solution that will work for SQL Server and Oracle
Is PK the primary key? Then this is a non issue, if you already know the primary key there is no sport. If pk is the primary key, then this begs the obvious question how do you know the pk of the item to dequeue...
The problem is if you don't know the primary key and want to dequeue the next 'available' (ie. status = y) and mark it as dequeued (delete it or set status = z).
The proper way to do this is to use a single statement. Unfortunately the syntax differs between Oracle and SQL Server. The SQL Server syntax is:
update top (1) [<table>]
set status = z
output DELETED.*
where status = y;
I'm not familiar enough with Oracle's RETURNING clause to give an example similar to SQL's OUTPUT one.
Other SQL Server solutions require lock hints on the SELECT (with UPDLOCK) to be correct.
In Oracle the preferred avenue is to use FOR UPDATE, but that does not work in SQL Server, where FOR UPDATE can only be used in conjunction with cursors.
In any case, the behavior you have in the original post is incorrect. Multiple sessions can all select the same row(s) and even all update it, returning the same dequeued item(s) to multiple readers.
As a general rule, to make an operation like this atomic you'll need to ensure that you set an exclusive (or update) lock when you perform the select so that no other transaction can read the row before your update.
The typical syntax for this is something like:
select * from table where pk = x and status = y for update
but you'd need to look it up to be sure.
I have some applications that follow a similar pattern. There is a table like yours that represents a queue of work. The table has two extra columns: thread_id and thread_date. When the app asks for work from the queue, it submits a thread id. Then a single update statement sets the thread_id column of all applicable rows to the submitted id and the thread_date column to the current time. After that update, it selects all rows with that thread id. This way you don't need to declare an explicit transaction. The "locking" occurs in the initial update.
The thread_date column is used to ensure that you do not end up with orphaned work items. What happens if items are pulled from the queue and then your app crashes? You have to have the ability to try those work items again. So you might grab all items off the queue that have not been marked completed but have been assigned to a thread with a thread date in the distant past. It's up to you to define "distant."
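A rough sketch of that claim-then-read pattern. Only thread_id and thread_date come from the description above; the table name work_queue, the pk, payload and completed columns, and the 30-minute definition of "distant" are assumptions.

```sql
-- Assumed schema: work_queue(pk, payload, completed, thread_id, thread_date).
declare @thread_id int;
set @thread_id = 42;   -- id submitted by the calling worker

-- claim unassigned items, plus items whose previous owner went quiet
update work_queue
set thread_id = @thread_id,
    thread_date = getdate()
where completed = 0
  and ( thread_id is null
        or thread_date < dateadd(minute, -30, getdate()) );

-- read back only what this worker now owns
select pk, payload
from work_queue
where thread_id = @thread_id
  and completed = 0;
```

Two workers running the update at the same time cannot both claim a row: the second one waits on the first one's row locks, and once they are released the row no longer matches the WHERE clause.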
Try this. The validation is in the UPDATE statement.
Code
IF EXISTS (SELECT * FROM sys.tables WHERE name = 't1')
DROP TABLE dbo.t1
GO
CREATE TABLE dbo.t1 (
ColID int IDENTITY,
[Status] varchar(20)
)
GO
DECLARE @id int
DECLARE @initialValue varchar(20)
DECLARE @newValue varchar(20)
SET @initialValue = 'Initial Value'
INSERT INTO dbo.t1 (Status) VALUES (@initialValue)
SELECT @id = SCOPE_IDENTITY()
SET @newValue = 'Updated Value'
BEGIN TRAN
UPDATE dbo.t1
SET
@initialValue = [Status],
[Status] = @newValue
WHERE ColID = @id
AND [Status] = @initialValue
SELECT ColID, [Status] FROM dbo.t1
COMMIT TRAN
SELECT @initialValue AS '@initialValue', @newValue AS '@newValue'
Results
ColID Status
----- -------------
1 Updated Value
@initialValue @newValue
------------- -------------
Initial Value Updated Value