SQL number generation in concurrent environment (Transaction isolation level) - sql

I am working with an application that generates invoice numbers (sequentially, based on a few parameters), and so far it has been using a trigger with a serializable transaction. Because the trigger is rather "heavy", it regularly causes the insert query to time out.
I'm now working on a solution to that problem, and so far I have a stored procedure that does the insert, and after the insert a transaction with isolation level SERIALIZABLE (which, by the way, applies to that transaction only, right, or should I set it back after the transaction has been committed?) that:
gets the number
if not found, inserts a row into that table; if found, updates the number (increments it)
commits the transaction
I'm wondering whether there's a better way to ensure the number is used only once and gets incremented while the table is locked (only the numbers table gets locked, right?).
I read about sp_getapplock, would that be somewhat a better way to achieve my goal?

I would optimize the routine for update (and handle "insert if not there" separately), at which point it would be:
declare @number int;
update tbl
set @number = number, number += 1
where year = @year and month = @month and office = @office and type = @type;
You don't need any specific locking hints or isolation levels; SQL Server will ensure no two transactions read the same value before incrementing.
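To make that concrete, here is a minimal sketch of how the routine might be wrapped in a procedure, with the "insert if not there" case handled separately (table and column names are taken from the snippet above; the procedure name and the unique-key assumption are mine):
CREATE PROCEDURE dbo.GetNextNumber   -- hypothetical name
    @year int, @month int, @office int, @type int,
    @number int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE tbl
    SET @number = number, number += 1
    WHERE year = @year AND month = @month AND office = @office AND type = @type;

    IF @@ROWCOUNT = 0
    BEGIN
        -- No counter row for this combination yet: claim number 1 and store 2 as the next value.
        -- A unique key on (year, month, office, type) is assumed, so a racing first insert
        -- fails with a key violation instead of creating a duplicate counter row.
        INSERT INTO tbl (year, month, office, type, number)
        VALUES (@year, @month, @office, @type, 2);
        SET @number = 1;
    END
END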
If you'd like to avoid handling the insert separately, you can:
merge into tbl
using (values (@year, @month, @office, @type)) as v(year, month, office, type)
on tbl.year = v.year and tbl.month = v.month and tbl.office = v.office and tbl.type = v.type
when not matched by target then
insert (year, month, office, type, number) values (@year, @month, @office, @type, 1)
when matched then
update set @number = tbl.number, tbl.number += 1
;
Logically this should provide the same guard against the race condition as the plain update; I recall there being a proof of that, but I no longer remember where.
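If in doubt, the merge target can carry the same kind of hint the other answers in this thread use on their reads; I can't point at the proof either, so treat the hint as belt-and-braces rather than something the statement above strictly requires:
merge into tbl with (holdlock)
using (values (@year, @month, @office, @type)) as v(year, month, office, type)
on tbl.year = v.year and tbl.month = v.month and tbl.office = v.office and tbl.type = v.type
when not matched by target then
insert (year, month, office, type, number) values (@year, @month, @office, @type, 1)
when matched then
update set @number = tbl.number, tbl.number += 1
;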

If you first insert and then update, you have a time window in which an invalid number is set and can be observed. Further, if the second transaction fails (which can always happen), you have inconsistent data.
Try this:
Take a fresh number in tran 1.
Insert in tran 2 with the number that was already taken.
That way you might burn a number, but there will never be inconsistent data.
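A minimal sketch of that two-transaction idea, assuming a counter table and an invoice table along the lines of the ones discussed later in this thread (the names are illustrative, and @year is assumed to be a parameter):
DECLARE @number int;

-- Tran 1: reserve the number and commit straight away.
BEGIN TRAN;
UPDATE dbo.NumberTable
SET @number = number, number += 1
WHERE year = @year;
COMMIT;

-- Tran 2: use the reserved number; if this fails, @number is simply burned.
BEGIN TRAN;
INSERT INTO dbo.Invoice ([Year], SequenceNumber, DateCreated)
VALUES (@year, @number, GETUTCDATE());
COMMIT;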

Related

SQL Server - Multiple Processes Inserting to table

I have a couple of stored procs (Add & Remove) which run some selects, inserts, deletes & updates on some tables. These seem fine.
Each of these procs uses a TRANSACTION.
I begin the transaction before I do any changes to data, and near the end of the proc I do:
IF @@TRANCOUNT > 0
COMMIT TRANSACTION @transName;
Within the Add and Remove procs, and within the TRANSACTION I call another stored procedure (Adjust) to update a table which keeps a running total of values.
I am finding that this is getting out of sync.....
Here is the body of that proc....
INSERT INTO L2(ProductId, LocationId, POId, StockMoveId, BasketId, OrderId, AdjusterValue, CurrentValue)
SELECT TOP 1
@ProductId, @LocationId, null, null, @BasketId, null, @Value, (CurrentValue + @Value)
FROM L2
WHERE 1=1
AND LocationId = @LocationId
AND ProductId = @ProductId
ORDER BY Id Desc
ProductId, LocationId, StockMoveId and OrderId are all foreign keys to the relevant tables, but they allow nulls, so only the appropriate one needs to be populated with an actual value.
Here is an example (from a screenshot) of where it goes wrong:
The 19 should have been added to 324, making a new total of 343; however, as you can see, it seems to have been added to the 300 instead, and 319 was inserted.
Questions...
Is this actually inside the transaction that was begun in the calling stored proc?
How can I prevent this situation?
I've tried using MAX to get the right row to try and speed things up, but the execution plan for that isn't as cost-effective as the simple TOP. Id, by the way, is an IDENTITY column and the primary key.
Do I need to lock the table, and if I do, will the other processes calling Adjust wait, or will they error?
Any assistance much appreciated.
More info....
I have been experimenting, and it would seem the only solution that consistently works as desired is to have the Id column as an INT field and simply increment it myself on the INSERT.
This doesn't sit well with me, as it doesn't make sense to me why the IDENTITY column doesn't seem to cope.
I've tried the posted identity column solution, sequences, and incrementing the Id myself.
After lots of searching and experimenting, I SEEM to have a solution that is now very robust.
I now have the ID as a simple INT column and I manage the ID myself by getting the MAX + 1 for each new insert.
I now wrap the body of the Adjust proc in its own Transaction and use the following to get the next ID.....
DECLARE @trxNam Varchar(10) = 'tranNextId';
DECLARE @newId INT;
DECLARE @currentLevelId INT;
BEGIN TRANSACTION @trxNam;
SELECT @newId = MAX(id) + 1 FROM L2 WITH (UPDLOCK, SERIALIZABLE);
I then do my insert using @newId and COMMIT the named transaction.
I have a scenario set up where a number of Win32 apps call my API hundreds of times, which was consistently failing due to intermittent PKEY violations.
Now it doesn't.
Happy days!
Still, I'm looking to see if I can simply have an identity column again and just use the transaction in the Adjust proc (something like the sketch below); that would be cleaner, I think.
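Something along those lines, keeping the IDENTITY column and serializing only the running-total read, might look like this (a sketch only: it uses the same UPDLOCK/HOLDLOCK hint as the MAX(id) read above, the parameters are the Adjust proc's, and the type for CurrentValue is assumed):
BEGIN TRANSACTION;

DECLARE @current DECIMAL(18, 2);   -- assumed type for CurrentValue

SELECT TOP 1 @current = CurrentValue
FROM L2 WITH (UPDLOCK, HOLDLOCK)
WHERE LocationId = @LocationId
  AND ProductId = @ProductId
ORDER BY Id DESC;

-- ISNULL handles the very first row for this product/location
INSERT INTO L2 (ProductId, LocationId, POId, StockMoveId, BasketId, OrderId, AdjusterValue, CurrentValue)
VALUES (@ProductId, @LocationId, NULL, NULL, @BasketId, NULL, @Value, ISNULL(@current, 0) + @Value);

COMMIT TRANSACTION;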
I found an article that led me to the solution....

Conditionally inserting records into a table in multithreaded environment based on a count

I am writing a T-SQL stored procedure that conditionally adds a record to a table only if the number of similar records is below a certain threshold, 10 in the example below. The problem is this will be run from a web application, so it will run on multiple threads, and I need to ensure that the table never has more than 10 similar records.
The basic gist of the procedure is:
BEGIN
DECLARE @c INT
SELECT @c = count(*)
FROM foo
WHERE bar = @a_param
IF @c < 10
BEGIN
INSERT INTO foo
(bar)
VALUES (@a_param)
END
END
I think I could solve any potential concurrency problems by replacing the select statement with:
SELECT @c = count(*) FROM foo WITH (TABLOCKX, HOLDLOCK) WHERE bar = @a_param
But I am curious if there any methods other than lock hints for managing concurrency problems in T-SQL
One option would be to use the sp_getapplock system stored procedure. You can place your critical section logic in a transaction and use the built-in locking of SQL Server to ensure synchronized access.
Example:
CREATE PROC MyCriticalWork(@MyParam INT)
AS
DECLARE @LockRequestResult INT
SET @LockRequestResult = 0
DECLARE @MyTimeoutMiliseconds INT
SET @MyTimeoutMiliseconds = 5000 --Wait only five seconds max, then time out
BEGIN TRAN
EXEC @LockRequestResult = SP_GETAPPLOCK 'MyCriticalWork', 'Exclusive', 'Transaction', @MyTimeoutMiliseconds
IF (@LockRequestResult >= 0) BEGIN
/*
DO YOUR CRITICAL READS AND WRITES HERE
*/
--Release the lock
COMMIT TRAN
END ELSE
ROLLBACK TRAN
Use SERIALIZABLE. By definition it provides you the illusion that your transaction is the only transaction running. Be aware that this might result in blocking and deadlocking. In fact this SQL code is a classic candidate for deadlocking: Two transactions might first read a set of rows, then both will try to modify that set of rows. Locking hints are the classic way of solving that problem. Retry also works.
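For example, the count-then-insert from the question can hold update-mode range locks on just the rows it counted rather than taking TABLOCKX; a sketch, using the same foo/bar/@a_param names as the question:
BEGIN TRAN;

DECLARE @c INT;

SELECT @c = COUNT(*)
FROM foo WITH (UPDLOCK, HOLDLOCK)   -- locks held to the end of the transaction
WHERE bar = @a_param;

IF @c < 10
    INSERT INTO foo (bar) VALUES (@a_param);

COMMIT TRAN;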
As stated in the comment. Why are you trying to insert on multiple threads? You cannot write to a table faster on multiple threads.
But you don't need a declare
insert into [Table_1] (ID, fname, lname)
select 3, 'fname', 'lname'
from [Table_1]
where ID = 3
having COUNT(*) <= 10
If you need to take a lock then do so
The data is not 3NF.
You should start any design with a proper data model.
Why rule out a table lock?
That could very well be the best approach.
Really, what are the chances?
Even without a lock you would need two sessions at a count of 9 to submit at exactly the same time, and even then it would stop at 11. Is the 10 an absolute hard number?

sql server: Is this nesting in a transaction sufficient for getting a unique number from the database?

I want to generate a unique number from a table.
It has to be thread safe, of course: when I check for the last number and get '3', and then store '4' in the database, I don't want anybody else in between those two actions (getting the number and storing it one higher in the database) to also get '3' back and then also store '4'.
So I thought, put it in a transaction like this:
begin transaction
declare @maxNum int
select @maxNum = MAX(SequenceNumber) from invoice
where YEAR = @year
if @maxNum is null
begin
set @maxNum = 0
end
set @maxNum = @maxNum + 1
INSERT INTO [Invoice]
([Year]
,[SequenceNumber]
,[DateCreated])
VALUES
(@year
,@maxNum
,GETUTCDATE()
)
commit transaction
return @maxNum
But I wondered: is it enough to just put it in a transaction?
My first thought was that it locks this sp for usage by other people, but is that correct? How can SQL Server know what to lock at the first step?
Will this construction guarantee that nobody else will do the select @maxNum part just when I am updating the @maxNum value, and at that moment receive the same @maxNum as I did, so that I'm in trouble?
I hope you understand what I want to accomplish, and also whether you think I chose the right solution.
EDIT:
also described as 'How to Single-Thread a stored procedure'
If you want to have the year and a sequence number stored in the database, and create an invoice number from that, I'd use:
an InvoiceYear column (which could totally be computed as YEAR(InvoiceDate))
an InvoiceID INT IDENTITY column which you could reset every year to 1
create a computed column InvoiceNumber as:
ALTER TABLE dbo.InvoiceTable
ADD InvoiceNumber AS CAST(InvoiceYear AS VARCHAR(4)) + '-' +
RIGHT('000000' + CAST(InvoiceID AS VARCHAR(6)), 6) PERSISTED
This way, you automagically get invoice numbers:
2010-000001
......
2010-001234
......
2010-123456
Of course, if you need more than 6 digits (= 1 million invoices) - just adjust the RIGHT() and CAST() statements for the InvoiceID column.
Also, since this is a persisted computed column, you can index it for fast retrieval.
This way: you don't have to worry about concurrency, stored procs, transactions and stuff like that - SQL Server will do that for you - for free!
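Put together, the table could look roughly like this (a sketch using the column names above; adjust the types and the yearly reset of the IDENTITY to taste):
CREATE TABLE dbo.InvoiceTable
(
    InvoiceID     INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    InvoiceDate   DATE NOT NULL,
    InvoiceYear   AS YEAR(InvoiceDate) PERSISTED,
    -- Built from InvoiceDate directly, since a computed column cannot reference another computed column
    InvoiceNumber AS CAST(YEAR(InvoiceDate) AS VARCHAR(4)) + '-' +
                     RIGHT('000000' + CAST(InvoiceID AS VARCHAR(6)), 6) PERSISTED
);

-- Persisted and deterministic, so it can be indexed for fast retrieval:
CREATE UNIQUE INDEX IX_InvoiceTable_InvoiceNumber ON dbo.InvoiceTable (InvoiceNumber);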
No, it's not enough. The shared lock set by the select will not prevent anyone from reading that same value at the same time.
Change this:
select @maxNum = MAX(SequenceNumber) from invoice where YEAR = @year
To this:
select @maxNum = MAX(SequenceNumber) from invoice with (updlock, holdlock) where YEAR = @year
This way you replace the shared lock with an update lock, and two update locks are not compatible with each other.
The holdlock means that the lock is to be held until the end of the transaction. So you do still need the transaction bit.
Note that this will not help if there's some other procedure that also wants to do the update. If that other procedure reads the value without providing the updlock hint, it will still be able to read the previous value of the counter. This may be a good thing, as it improves concurrency in scenarios where the other readers do not intend to make an update later, but it also may be not what you want, in which case either update all procedures to use updlock, or use xlock instead to place an exclusive lock, not compatible with shared locks.
As it turned out, I didn't want to lock the table; I just wanted to execute the stored procedure one at a time.
In C# code I would place a lock on another object, and that's what was discussed here:
http://www.sqlservercentral.com/Forums/Topic357663-8-1.aspx
So that's what I used:
declare @Result int
EXEC @Result =
sp_getapplock @Resource = 'holdit1', @LockMode = 'Exclusive', @LockTimeout = 10000 --Time to wait for the lock
IF @Result < 0
BEGIN
ROLLBACK TRAN
RAISERROR('Procedure Already Running for holdit1 - Concurrent execution is not supported.',16,9)
RETURN(-1)
END
where 'holdit1' is just a name for the lock.
@Result returns 0 or 1 if it succeeds in getting the lock (0 when the lock is granted immediately, 1 when it is granted after waiting).

Having TRANSACTION In All Queries

Do you think always having a transaction around the SQL statements in a stored procedure is a good practice? I'm just about to optimize this legacy application in my company, and one thing I found is that every stored procedure has BEGIN TRANSACTION. Even a procedure with a single select or update statement has one. I thought it would be nice to have BEGIN TRANSACTION if performing multiple actions, but not just one action. I may be wrong, which is why I need someone else to advise me. Thanks for your time, guys.
It is entirely unnecessary, as each SQL statement executes atomically, i.e., as if it were already running in its own transaction. In fact, opening unnecessary transactions can lead to increased locking, even deadlocks. Forgetting to match COMMITs with BEGINs can leave a transaction open for as long as the connection to the database is open, and it can interfere with other transactions on the same connection.
Such coding almost certainly means that whoever wrote the code was not very experienced in database programming and is a sure smell that there may be other problems as well.
The only possible reason I could see for this is if you have the possibility of needing to roll back the transaction for a reason other than a SQL failure.
However, if the code is literally
begin transaction
statement
commit
Then I see absolutely no reason to use an explicit transaction, and it's probably being done because it's always been done that way.
I don't know of any benefit of not just using auto commit transactions for these statements.
Possible disadvantages of using explicit transactions everywhere might be that it just adds clutter to the code and so makes it less easy to see when an explicit transaction is being used to ensure correctness over multiple statements.
Additionally it increases the risk that a transaction is left open holding locks unless care is taken (e.g. with SET XACT_ABORT ON).
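One common shape for that is sketched below (the procedure name is made up; treat it as a template, not a prescription):
CREATE PROCEDURE dbo.DoSomething
AS
BEGIN
    -- XACT_ABORT makes sure a runtime error or client timeout rolls the
    -- transaction back instead of leaving it open and holding locks.
    SET NOCOUNT, XACT_ABORT ON;

    BEGIN TRAN;

    -- ... the statements that genuinely need to succeed or fail together ...

    COMMIT TRAN;
END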
Also there is a minor performance implication, as shown in @8kb's answer. This illustrates it another way using the Visual Studio profiler.
Setup
(Testing against an empty table)
CREATE TABLE T (X INT)
Explicit
SET NOCOUNT ON
DECLARE @X INT
WHILE ( 1 = 1 )
BEGIN
BEGIN TRAN
SELECT @X = X
FROM T
COMMIT
END
Auto Commit
SET NOCOUNT ON
DECLARE @X INT
WHILE ( 1 = 1 )
BEGIN
SELECT @X = X
FROM T
END
Both of them end up spending time in CMsqlXactImp::Begin and CMsqlXactImp::Commit but for the explicit transactions case it spends a significantly greater proportion of the execution time in these methods and hence less time doing useful work.
+--------------------------------+----------+----------+
| | Auto | Explicit |
+--------------------------------+----------+----------+
| CXStmtQuery::ErsqExecuteQuery | 35.16% | 25.06% |
| CXStmtQuery::XretSchemaChanged | 20.71% | 14.89% |
| CMsqlXactImp::Begin | 5.06% | 13% |
| CMsqlXactImp::Commit | 12.41% | 24.03% |
+--------------------------------+----------+----------+
When performing multiple insert/update/delete statements, it is better to have a transaction to ensure atomicity, so that either all of the operation's tasks are executed or none are.
For a single insert/update/delete statement, it depends on what kind of operation (from a business-layer perspective) you are performing and how important it is. If you perform some calculation before the single insert/update/delete, then it is better to use a transaction, since the data may have changed after you retrieved it for the insert/update/delete.
One plus point is you can add another INSERT (for example) and it's already safe.
Then again, you also have the problem of nested transactions if a stored procedure calls another one. An inner rollback will cause error 266.
If every call is simple CRUD with no nesting then it's pointless; but if you nest, or have multiple writes per TXN, then it's good to have a consistent template.
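One common shape for such a template, sketched here with the caveat that the rollback handling in the CATCH block needs extra care around doomed transactions:
CREATE PROCEDURE dbo.InnerProc   -- hypothetical inner proc
AS
BEGIN
    SET NOCOUNT ON;

    -- Only open (and later commit) a transaction if the caller has not already
    -- done so; keeping @@TRANCOUNT the same on exit as on entry is what avoids
    -- error 266 when procedures call each other.
    DECLARE @startedTran bit = 0;
    IF @@TRANCOUNT = 0
    BEGIN
        BEGIN TRAN;
        SET @startedTran = 1;
    END

    BEGIN TRY
        -- ... the actual work ...

        IF @startedTran = 1
            COMMIT TRAN;
    END TRY
    BEGIN CATCH
        IF @startedTran = 1 AND @@TRANCOUNT > 0
            ROLLBACK TRAN;
        THROW;   -- re-raise so the caller can handle its own transaction
    END CATCH
END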
You mentioned that you'll be optimizing this legacy app.
One of the first, and easiest, things you can do to improve performance is remove all the BEGIN TRAN and COMMIT TRAN for the stored procedures that only do SELECTs.
Here is a simple test to demonstrate:
/* Compare basic SELECT times with and without a transaction */
DECLARE @date DATETIME2
DECLARE @noTran INT
DECLARE @withTran INT
SET @noTran = 0
SET @withTran = 0
DECLARE @t TABLE (ColA INT)
INSERT @t VALUES (1)
DECLARE
@count INT,
@value INT
SET @count = 1
WHILE @count < 1000000
BEGIN
SET @date = GETDATE()
SELECT @value = ColA FROM @t WHERE ColA = 1
SET @noTran = @noTran + DATEDIFF(MICROSECOND, @date, GETDATE())
SET @date = GETDATE()
BEGIN TRAN
SELECT @value = ColA FROM @t WHERE ColA = 1
COMMIT TRAN
SET @withTran = @withTran + DATEDIFF(MICROSECOND, @date, GETDATE())
SET @count = @count + 1
END
SELECT
@noTran / 1000000. AS Seconds_NoTransaction,
@withTran / 1000000. AS Seconds_WithTransaction
/** Results **/
Seconds_NoTransaction Seconds_WithTransaction
--------------------------------------- ---------------------------------------
14.23600000 18.08300000
You can see there is a definite overhead associated with the transactions.
Note: this is assuming these stored procedures are not using any special isolation levels or locking hints (for something like handling pessimistic concurrency). In that case, obviously you would want to keep them.
So to answer the question, I would only leave in the transactions where you are actually attempting to preserve the integrity of the data modifications in case of an error in the code, SQL Server, or the hardware.
I can only say that placing a transaction block like this in every stored procedure might be a novice's work.
A transaction should be placed only around a block that has more than one insert/update statement; other than that, there is no need to place a transaction block in the stored procedure.
BEGIN TRANSACTION / COMMIT syntax shouldn't be used in every stored procedure by default unless you are trying to cover the following scenarios:
You include the WITH MARK option because you want to support restoring the database from a backup to a specific point in time.
You intend to port the code from SQL Server to another database platform like Oracle. Oracle does not commit transactions by default.

Possible to implement a manual increment with just simple SQL INSERT?

I have a primary key that I don't want to auto increment (for various reasons) and so I'm looking for a way to simply increment that field when I INSERT. By simply, I mean without stored procedures and without triggers, so just a series of SQL commands (preferably one command).
Here is what I have tried thus far:
BEGIN TRAN
INSERT INTO Table1(id, data_field)
VALUES ( (SELECT (MAX(id) + 1) FROM Table1), '[blob of data]');
COMMIT TRAN;
* Data abstracted to use generic names and identifiers
However, when executed, the command errors, saying:
"Subqueries are not allowed in this context. Only scalar expressions are allowed."
So, how can I do this/what am I doing wrong?
EDIT: Since it was pointed out as a consideration, the table to be inserted into is guaranteed to have at least 1 row already.
You understand that you will have collisions, right?
You need to do something like this, and it might cause deadlocks, so be very sure what you are trying to accomplish here:
DECLARE @id int
BEGIN TRAN
SELECT @id = MAX(id) + 1 FROM Table1 WITH (UPDLOCK, HOLDLOCK)
INSERT INTO Table1(id, data_field)
VALUES (@id, '[blob of data]')
COMMIT TRAN
To explain the collision thing, I have provided some code
First, create this table and insert one row:
CREATE TABLE Table1(id int primary key not null, data_field char(100))
GO
Insert Table1 values(1,'[blob of data]')
Go
Now open up two query windows and run this at the same time
declare @i int
set @i = 1
while @i < 10000
begin
BEGIN TRAN
INSERT INTO Table1(id, data_field)
SELECT MAX(id) + 1, '[blob of data]' FROM Table1
COMMIT TRAN;
set @i = @i + 1
end
You will see a bunch of these
Server: Msg 2627, Level 14, State 1, Line 7
Violation of PRIMARY KEY constraint 'PK__Table1__3213E83F2962141D'. Cannot insert duplicate key in object 'dbo.Table1'.
The statement has been terminated.
Try this instead:
INSERT INTO Table1 (id, data_field)
SELECT id, '[blob of data]' FROM (SELECT MAX(id) + 1 as id FROM Table1) tbl
I wouldn't recommend doing it that way for any number of reasons though (performance, transaction safety, etc)
It could be because there are no records, so the subquery is returning NULL... try:
INSERT INTO tblTest(RecordID, Text)
VALUES ((SELECT ISNULL(MAX(RecordID), 0) + 1 FROM tblTest), 'asdf')
I don't know if somebody is still looking for an answer but here is a solution that seems to work:
-- Preparation: execute only once
CREATE TABLE Test (Value int)
CREATE TABLE Lock (LockID uniqueidentifier)
INSERT INTO Lock SELECT NEWID()
-- Real insert
BEGIN TRAN LockTran
-- Lock an object to block simultaneous calls.
UPDATE Lock WITH(TABLOCK)
SET LockID = LockID
INSERT INTO Test
SELECT ISNULL(MAX(T.Value), 0) + 1
FROM Test T
COMMIT TRAN LockTran
We have a similar situation where we needed to increment and could not have gaps in the numbers. (If you use an identity value and a transaction is rolled back, that number will not be inserted and you will have gaps because the identity value does not roll back.)
We created a separate table for last number used and seeded it with 0.
Our insert takes a few steps.
--start a transaction so the NumberTable row stays locked until we are done
begin tran
--increment the number
update dbo.NumberTable
set number = number + 1
--find out what the incremented number is
select @number = number
from dbo.NumberTable
--use the number
insert into dbo.MyTable using the @number
commit or rollback
This causes simultaneous transactions to process in a single line as each concurrent transaction will wait because the NumberTable is locked. As soon as the waiting transaction gets the lock, it increments the current value and locks it from others. That current value is the last number used and if a transaction is rolled back, the NumberTable update is also rolled back so there are no gaps.
Hope that helps.
Another way to cause single file execution is to use a SQL application lock. We have used that approach for longer running processes like synchronizing data between systems so only one synchronizing process can run at a time.
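For completeness, that flavour of application lock is usually taken with a session owner rather than a transaction owner, so it can span many transactions; a sketch (the resource name is illustrative):
DECLARE @rc int;

EXEC @rc = sp_getapplock @Resource = 'SyncJob', @LockMode = 'Exclusive',
                         @LockOwner = 'Session', @LockTimeout = 0;
IF @rc < 0
    RETURN;   -- another synchronizing process already holds the lock

-- ... long-running synchronization work, possibly spanning many transactions ...

EXEC sp_releaseapplock @Resource = 'SyncJob', @LockOwner = 'Session';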
If you're doing it in a trigger, you could make sure it's an "INSTEAD OF" trigger and do it in a couple of statements:
DECLARE @next INT
SET @next = (SELECT (MAX(id) + 1) FROM Table1)
INSERT INTO Table1 (id, data_field)
SELECT @next, data_field FROM inserted
The only thing you'd have to be careful about is concurrency - if two rows are inserted at the same time, they could attempt to use the same value for #next, causing a conflict.
Does this accomplish what you want?
It seems very odd to do this sort of thing w/o an IDENTITY (auto-increment) column, making me question the architecture itself. I mean, seriously, this is the perfect situation for an IDENTITY column. It might help us answer your question if you'd explain the reasoning behind this decision. =)
Having said that, some options are:
using an INSTEAD OF trigger for this purpose. So, you'd do your INSERT (the INSERT statement would not need to pass in an ID). The trigger code would handle inserting the appropriate ID. You'd need to use the WITH (UPDLOCK, HOLDLOCK) syntax used by another answerer to hold the lock for the duration of the trigger (which is implicitly wrapped in a transaction) & to elevate the lock type from "shared" to "update" lock (IIRC).
you can use the idea above, but have a table whose purpose is to store the last, max value inserted into the table. So, once the table is set up, you would no longer have to do a SELECT MAX(ID) every time. You'd simply increment the value in the table. This is safe provided that you use appropriate locking (as discussed). Again, that avoids repeated table scans every time you INSERT.
use GUIDs instead of IDs. It's much easier to merge tables across databases, since the GUIDs will always be unique (whereas records across databases will have conflicting integer IDs). To avoid page splitting, sequential GUIDs can be used. This is only beneficial if you might need to do database merging.
Use a stored proc in lieu of the trigger approach (since triggers are to be avoided, for some reason). You'd still have the locking issue (and the performance problems that can arise). But sprocs are preferred over dynamic SQL (in the context of applications), and are often much more performant.
Sorry about rambling. Hope that helps.
How about creating a separate table to maintain the counter? It has better performance than MAX(id), as it will be O(1); MAX(id) is at best O(lg n), depending on the implementation.
And then when you need to insert, simply lock the counter table, read the counter, and increment it. Then you can release the lock and insert into your table with the incremented counter value.
Have a separate table where you keep your latest ID and for every transaction get a new one.
It may be a bit slower but it should work.
DECLARE @NEWID INT
BEGIN TRAN
UPDATE [TABLE] SET ID = ID + 1
SELECT @NEWID = ID FROM [TABLE]
COMMIT TRAN
PRINT @NEWID -- Do what you want with your new ID
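As a side note, the same idea can be collapsed into a single statement by assigning the variable inside the UPDATE itself, which avoids the separate SELECT (a sketch, using the same placeholder table name):
DECLARE @NEWID INT
BEGIN TRAN
UPDATE [TABLE] SET @NEWID = ID = ID + 1
COMMIT TRAN
PRINT @NEWID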
Code without any transaction scope (I use it in my engineering course as an exercise):
-- Preparation: execute only once
CREATE TABLE increment (val int);
INSERT INTO increment VALUES (1);
-- Real insert
DECLARE @newIncrement INT;
UPDATE increment
SET @newIncrement = val,
val = val + 1;
INSERT INTO Table1 (id, data_field)
SELECT @newIncrement, 'some data';
begin transaction
declare @nextId int
set @nextId = (select MAX(id)+1 from Table1)
insert into Table1(id, data_field) values (@nextId, '[blob of data]')
commit;
But perhaps a better approach would be using a scalar function getNextId('table1')
Any critiques of this? Works for me.
DECLARE @m_NewRequestID INT
, @m_IsError BIT = 1
, @m_CatchEndless INT = 0
WHILE @m_IsError = 1
BEGIN TRY
SELECT @m_NewRequestID = (SELECT ISNULL(MAX(RequestID), 0) + 1 FROM Requests)
INSERT INTO Requests ( RequestID
, RequestName
, Customer
, Comment
, CreatedFromApplication)
SELECT RequestID = @m_NewRequestID
, RequestName = dbo.ufGetNextAvailableRequestName(PatternName)
, Customer = @Customer
, Comment = [Description]
, CreatedFromApplication = @CreatedFromApplication
FROM RequestPatterns
WHERE PatternID = @PatternID
SET @m_IsError = 0
END TRY
BEGIN CATCH
SET @m_IsError = 1
SET @m_CatchEndless = @m_CatchEndless + 1;
IF @m_CatchEndless > 1000
THROW 51000, '[upCreateRequestFromPattern]: Unable to get new RequestID', 1;
END CATCH
This should work:
INSERT INTO Table1 (id, data_field)
SELECT (SELECT (MAX(id) + 1) FROM Table1), '[blob of data]';
Or this (substitute LIMIT for other platforms):
INSERT INTO Table1 (id, data_field)
SELECT TOP 1
id + 1, '[blob of data]'
FROM
Table1
ORDER BY
[id] DESC;