MS SQL locking for update - sql

I am implementing a competition where there might be a lot of simultaneous entries. I am collecting some user data, which I am putting in one table called entries. I have another table of pre-generated unique discount codes called discountCodes. I am then assigning one code to each entry. I thought I would do this by putting an entry id in the discountCodes table.
As there may be a lot of concurrent users I think I should select the first unassigned row and then assign the entry id to that row. I need to make sure between picking an unassigned row and adding the entry id that another thread doesn't find the same row.
What is the best way of ensuring that the row doesn't get assigned twice?

Something like this can be done:
The following example sets the TRANSACTION ISOLATION LEVEL for the session. For each Transact-SQL statement that follows, SQL Server holds all of the shared locks until the end of the transaction. Source:MSDN
USE databaseName;
GO
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
GO
BEGIN TRANSACTION;
GO
SELECT *
FROM Table1;
GO
SELECT *
FROM Table2;
GO
COMMIT TRANSACTION;
GO
Read more SET TRANSACTION ISOLATION LEVEL

Instead of having the EntryId in the DiscountCodes table, I would recommend building a bridge table with an EntryId and a DiscountCodeId. Place a unique constraint on both of those fields.
This way your entry point will encounter a constraint violation when it tries to enter a duplicate.
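A minimal sketch of such a bridge table, assuming each column gets its own unique constraint so that an entry can hold only one code and a code can belong to only one entry. The table name, constraint names, and int column types are hypothetical:

```sql
-- Hypothetical bridge table; names and types are assumptions, not from the question.
CREATE TABLE dbo.entryDiscountCodes
(
    entryId        int NOT NULL,
    discountCodeId int NOT NULL,
    -- one code per entry
    CONSTRAINT UQ_entryDiscountCodes_entryId UNIQUE (entryId),
    -- one entry per code
    CONSTRAINT UQ_entryDiscountCodes_discountCodeId UNIQUE (discountCodeId)
);

-- A second insert that reuses either value fails with a unique constraint violation.
INSERT INTO dbo.entryDiscountCodes (entryId, discountCodeId) VALUES (1, 100);
```

The caller still has to decide what to do on a violation, typically picking another unassigned code and retrying.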

WITH e AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) rn
FROM entries ei
WHERE NOT EXISTS
(
SELECT NULL
FROM discountCodes dci
WHERE dci.entryId = ei.id
)
),
dc AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) rn
FROM discountCodes
WHERE entryId IS NULL
)
UPDATE dc
SET dc.entryId = e.id
FROM e
JOIN dc
ON dc.rn = e.rn

I would just put an IDENTITY field on each table and let the corresponding entry match the corresponding discountCode - i.e. if you have a thousand discountCodes up front, your identity column in the discountCodes table will range from 1 to 1000. That will match your first 1 to 1000 entries. If you get more than 1000 entries, just add one discountCode per additional entry.
That way SQL Server handles all the problematic "get the next number in the sequence" logic for you.

You can try using sp_getapplock to synchronize the write operation; just make sure every caller locks against the same resource name (hash), like
DECLARE @state int
BEGIN TRAN
-- here we're using 1 second as the timeout, but you should determine what the minimum value is for your instance
EXEC @state = sp_getapplock 'SyncIt', 'Exclusive', 'Transaction', 1000
-- do insert/update/etc...
-- if you like you can be a little verbose and explicit, otherwise the line below shouldn't be needed
EXEC sp_releaseapplock 'SyncIt', 'Transaction'
COMMIT TRAN
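The snippet above proceeds even if the lock request times out. A hedged variant that checks the return code of sp_getapplock first might look like the following; sp_getapplock returns 0 or 1 when the lock is granted and a negative value otherwise. The error message text is made up for illustration:

```sql
DECLARE @state int;

BEGIN TRAN;

EXEC @state = sp_getapplock
    @Resource    = 'SyncIt',
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',
    @LockTimeout = 1000;   -- milliseconds

IF @state >= 0
BEGIN
    -- Lock granted: do the insert/update here.
    -- A transaction-owned lock is released automatically at COMMIT or ROLLBACK.
    COMMIT TRAN;
END
ELSE
BEGIN
    -- Timeout, cancellation, deadlock, or parameter error: do not touch the data.
    ROLLBACK TRAN;
    RAISERROR('Could not acquire the application lock.', 16, 1);
END
```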

Related

How to guarantee only one process picks up a processing task

I have multiple computers that have the task of sending out emails found in a table on a common SQL Server. Each computer polls the email table to look for messages it can send by looking at a status flag set to 0. If a computer does a
SELECT * FROM tblEmailQueue where StatusFlag=0
if it returns a record it immediately sets the StatusFlag to 1, which should cause the other computers polling the same table not to find this record. My fear is that if two computers find the record at the same time, before either can update the StatusFlag, the email will be sent twice. Does anyone have ideas on how to ensure only one computer will get the record? I know I might be able to do a table lock but I would rather not have to do this.
Instead of using two queries which may cause a race condition, you can update the values and output the updated rows at once using the OUTPUT clause.
This will update the rows with statusflag=0 and output all of the updated ones;
UPDATE tblEmailQueue
SET statusflag=1
OUTPUT DELETED.*
WHERE statusflag=0;
An SQLfiddle to test with.
EDIT: If you're picking one row, you may want some ordering. Since the update itself can't order, you can use a common table expression to do the update;
WITH cte AS (
SELECT TOP 1 id, statusflag FROM tblEmailQueue
WHERE statusflag = 0 ORDER BY id
)
UPDATE cte SET statusflag=1 OUTPUT DELETED.*;
Another SQLfiddle.
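If several machines poll at once, a commonly used variation on the single-row version adds table hints so that concurrent workers skip a row another transaction has already locked instead of waiting on it. This is a sketch under that assumption, not part of the answer above; ROWLOCK, UPDLOCK and READPAST are standard SQL Server table hints, and the table and column names are the ones from the question:

```sql
WITH cte AS (
    SELECT TOP (1) id, statusflag
    FROM tblEmailQueue WITH (ROWLOCK, UPDLOCK, READPAST)  -- skip rows locked by other workers
    WHERE statusflag = 0
    ORDER BY id
)
UPDATE cte
SET statusflag = 1
OUTPUT INSERTED.id;  -- the id of the row this worker claimed, if any
```

An empty result set means there was nothing to claim (or every candidate row was locked), so the worker can simply poll again later.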
You can perform the select and send the email in the same transaction. You can also use the ROWLOCK hint and not commit the transaction until you have sent the email or set a new value for StatusFlag. This means that nobody (except a transaction using the NOLOCK hint or the READ UNCOMMITTED isolation level) can read this row until you commit the transaction.
SELECT * FROM tblEmailQueue WITH(ROWLOCK) where StatusFlag=0
In addition, you should check the isolation level. For your case the isolation level should be READ COMMITTED or REPEATABLE READ.
See information about isolation levels here
Add another column to your table tblEmailQueue (say UserID), then try to pull one email such as
-- Flag an email and assign it to the application that made the request.
-- @CurrentUserID is an id unique to each application or user and supplied by the application
-- or user who made the request; this also ensures that the record goes to
-- the right application, and perhaps you can use it for other purposes such as monitoring.
UPDATE tblEmailQueue set UserID = @CurrentUserID, StatusFlag=1 where ID = isnull((
select top 1 ID from tblEmailQueue where StatusFlag=0 order by ID
), 0)
-- Get the email that was flagged for the current user id
SELECT * FROM tblEmailQueue where StatusFlag=1 and UserID = @CurrentUserID
Here in Indianapolis, we are familiar with race conditions ;-)
Let's assume you actually have an ID field and a StatusFlag field, and create a stored proc that includes
declare @id int
select top 1 @id = id from tblEmailQueue where StatusFlag=0
if @@rowcount = 1
begin
update tblEmailQueue set StatusFlag=1 where ID = @id AND StatusFlag=0
if @@rowcount = 1
begin
-- I won the race, continue processing
...
end
end
ADDED
Explicit handling like this is inferior to Joachim's method if all you want is the result of the select. But this method also works with old versions of SQL Server as well as with other databases.

Delete new record if same data exists

I want to delete a new record if the same record was created before.
My columns are date, time and MsgLog. If the date and time are the same, I want to delete the new one.
I need help.
You can check in the table whether that value exists or not in the column using a query. If it exists, you can show message that a record already exists.
To prevent this kind of erroneous addition, you can add a constraint to your table to ensure unique @Date, @Time pairs; if you don't want to change the data structure (e.g. you only need to add records under this restriction once or twice), you can use the insert ... select construction.
-- MS SQL version, check your DBMS
insert into MyTable(
Date,
Time,
MsgLog)
select @Date,
@Time,
@MsgLog
where not exists(
select 1
from MyTable
where (@Date = Date) and
(@Time = Time)
)
P.S. "Deleting the new one" is equivalent to not inserting the new one in the first place.
You should create a unique constraint in the DB level to avoid invalid data no matter who writes to your DB.
It's always important to have your schema well defined. That way you're safe that no matter how many apps are using your DB or even in case someone just writes some inserts manually.
I don't know which DB you are using, but in MySQL you can use the following DDL:
alter table MY_TABLE add unique index(date, time);
And in Oracle you can :
alter table MY_TABLE ADD CONSTRAINT constraint_name UNIQUE (date, time);
That said, you can also (not instead of) do some checks before inserting new values to avoid dealing with exceptions or to improve performance by avoiding making unnecessary access to your DB (length \ nulls for example could easily be dealt with in the application level).
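For SQL Server, which the MySQL and Oracle statements above do not cover, the equivalent constraint might be sketched as follows. The constraint name is hypothetical, and MyTable, Date and Time are the names used in the earlier answer:

```sql
-- Hypothetical constraint name; brackets avoid any clash with the date and time type names.
ALTER TABLE MyTable
ADD CONSTRAINT UQ_MyTable_Date_Time UNIQUE ([Date], [Time]);
```

The statement fails if the table already contains duplicate (Date, Time) pairs, so existing duplicates have to be removed first.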
You can avoid deleting by checking for duplicate while inserting.
Just modify your insert procedure like this, so no duplicates will be entered.
declare @intCount as int;
select @intCount = count(MsgLog) from MAINTABLE where (date=@date) and (time=@time)
if @intCount=0
begin
-- insert procedure
end
> Edited
Since what you want is to delete the duplicate entries after your bulk insert, think about this logic:
create a temporary table
Insert LogId, date, time from your table into the temp table, ordered by date, time
now declare four variables, @preTime, @preDate, @currTime, @currDate
Loop over each item in the temp table, like this
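declare @pkLogID int, @preTime time, @preDate date, @currTime time, @currDate date -- types are assumed
declare logCursor cursor local fast_forward for
select pkLogId, [date], [time] from tblTemp order by [date], [time]
open logCursor
fetch next from logCursor into @pkLogID, @currDate, @currTime -- get LogID and current values for the first row
while @@fetch_status = 0
begin
-- Delete condition check
if (@currDate=@preDate) and (@currTime=@preTime)
begin
delete from MAINTABLE where pkLogId=@pkLogID
end
select @preDate=@currDate,@preTime=@currTime -- assign current values as preValues for the next entry
fetch next from logCursor into @pkLogID, @currDate, @currTime
end
close logCursor
deallocate logCursor
The strategy above is that we sort all entries by date and time, so duplicates end up next to each other; we then compare each entry with the previous one, and when a match is found we delete the duplicate entry.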
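On SQL Server the same "keep the first, delete the rest" idea can also be expressed without a loop. This is a different technique from the row-by-row comparison described above: a set-based sketch using ROW_NUMBER, assuming the MAINTABLE, pkLogId, date and time names from that answer:

```sql
-- Number the rows within each (date, time) group; the oldest pkLogId gets rn = 1.
WITH ranked AS (
    SELECT pkLogId,
           ROW_NUMBER() OVER (PARTITION BY [date], [time] ORDER BY pkLogId) AS rn
    FROM MAINTABLE
)
-- Deleting through the CTE removes the underlying MAINTABLE rows.
DELETE FROM ranked
WHERE rn > 1;
```

Which row counts as "the original" is an assumption here (the lowest pkLogId); change the ORDER BY if another rule applies.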

Do databases always lock non-existent rows after a query or update?

Given:
customer[id BIGINT AUTO_INCREMENT PRIMARY KEY, email VARCHAR(30), count INT]
I'd like to execute the following atomically: Update the customer if he already exists; otherwise, insert a new customer.
In theory this sounds like a perfect fit for SQL-MERGE but the database I am using doesn't support MERGE with AUTO_INCREMENT columns.
https://stackoverflow.com/a/1727788/14731 seems to indicate that if you execute a query or update statement against a non-existent row, the database will lock the index thereby preventing concurrent inserts.
Is this behavior guaranteed by the SQL standard? Are there any databases that do not behave this way?
UPDATE: Sorry, I should have mentioned this earlier: the solution must use READ_COMMITTED transaction isolation unless that is impossible in which case I will accept the use of SERIALIZABLE.
This question is asked about once a week on SO, and the answers are almost invariably wrong.
Here's the right one.
insert customer (email, count)
select 'foo@example.com', 0
where not exists (
select 1 from customer
where email = 'foo@example.com'
)
update customer set count = count + 1
where email = 'foo@example.com'
If you like, you can insert a count of 1 and skip the update if the inserted rowcount -- however expressed in your DBMS -- returns 1.
The above syntax is absolutely standard and makes no assumption about locking mechanisms or isolation levels. If it doesn't work, your DBMS is broken.
Many people are under the mistaken impression that the select executes "first" and thus introduces a race condition. No: that select is part of the insert. The insert is atomic. There is no race.
Use Russell Fox's code but use SERIALIZABLE isolation. This will take a range lock so that the non-existing row is logically locked (together with all other non-existing rows in the surrounding key range).
So it looks like this:
BEGIN TRAN
IF EXISTS (SELECT 1 FROM foo WITH (UPDLOCK, HOLDLOCK) WHERE [email] = 'thisemail')
BEGIN
UPDATE foo...
END
ELSE
BEGIN
INSERT INTO foo...
END
COMMIT
Most of the code is taken from his answer, but fixed to provide mutual exclusion semantics.
Answering my own question since there seems to be a lot of confusion around the topic. It seems that:
-- BAD! DO NOT DO THIS! --
insert customer (email, count)
select 'foo@example.com', 0
where not exists (
select 1 from customer
where email = 'foo@example.com'
)
is open to race conditions (see "Only inserting a row if it's not already there"). From what I've been able to gather, the only portable solution to this problem is:
1. Pick a key to merge against. This could be the primary key, or another unique key, but it must have a unique constraint.
2. Try to insert a new row. You must catch the error that will occur if the row already exists.
3. The hard part is over. At this point, the row is guaranteed to exist and you are protected from race conditions by the fact that you are holding a write lock on it (due to the insert from the previous step).
4. Go ahead and update if needed or select its primary key.
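The steps above can be sketched in T-SQL as follows. This is only one dialect's rendering of the idea, assuming SQL Server 2012 or later (for THROW) and a unique constraint on customer.email, which the table definition in the question does not show. Error numbers 2627 and 2601 are SQL Server's unique constraint and unique index violations:

```sql
DECLARE @email varchar(30) = 'foo@example.com';

BEGIN TRY
    -- Step 2: try the insert; the unique constraint on email (assumed) rejects duplicates.
    INSERT INTO customer (email, [count]) VALUES (@email, 0);
END TRY
BEGIN CATCH
    -- Swallow only "row already exists"; re-raise anything else.
    IF ERROR_NUMBER() NOT IN (2601, 2627)
        THROW;
END CATCH;

-- Steps 3 and 4: the row now exists either way, so the update is safe.
UPDATE customer
SET [count] = [count] + 1
WHERE email = @email;
```

Other databases expose the duplicate-key error differently, so the catch condition is the part that has to be adapted per DBMS.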
IF EXISTS (SELECT 1 FROM foo WHERE [email] = 'thisemail')
BEGIN
UPDATE foo...
END
ELSE
BEGIN
INSERT INTO foo...
END

SQL Server : add row if doesn't exist, increment value of one column, atomic

I have a table that keeps a count of user actions. Each time an action is done, the value needs to increase. Since the user can have multiple sessions at the same time, the process needs to be atomic to avoid multi-user issues.
The table has 3 columns:
ActionCode as varchar
UserID as int
Count as int
I want to pass ActionCode and UserID to a function that will add a new row if one doesn't already exist, and set count to 1. If the row does exist, it will just increase the count by one. ActionCode and UserID make up the primary unique index for this table.
If all I needed to do was update, I could do something simple like this (because an UPDATE query is atomic already):
UPDATE (Table)
SET Count = Count + 1
WHERE ActionCode = @ActionCode AND UserID = @UserID
I'm new to atomic transactions in SQL. This question has probably been answered in multiple parts here, but I'm having trouble finding those and also placing those parts in one solution. This needs to be pretty fast as well, without getting too complex, because these actions may occur frequently.
Edit: Sorry, this might be a dupe of MySQL how to do an if exist increment in a single query. I searched a lot but had tsql in my search, once I changed to sql instead, that was the top result. It isn't obvious if that is atomic, but pretty sure it would be. I'll probably vote to delete this as dupe, unless someone thinks there can be some new value added by this question and answer.
Assuming you are on SQL Server, to make a single atomic statement you could use MERGE
MERGE YourTable AS target
USING (SELECT @ActionCode, @UserID) AS source (ActionCode, UserID)
ON (target.ActionCode = source.ActionCode AND target.UserID = source.UserID)
WHEN MATCHED THEN
UPDATE SET [Count] = target.[Count] + 1
WHEN NOT MATCHED THEN
INSERT (ActionCode, UserID, [Count])
VALUES (source.ActionCode, source.UserID, 1)
OUTPUT INSERTED.* INTO #MyTempTable;
UPDATE: Use OUTPUT to select the values if necessary. The code has been updated.
Using MERGE in SQL Server 2008 is probably the best bet. There is also another simple way to solve it.
If the UserID/Action doesn't exist, do an INSERT of a new row with a 0 for Count. If this statement fails due to it already being present (as inserted by another concurrent session just then), simply ignore the error.
If you want to do the insert and block while performing it to eliminate any chance of error, you can add some lock hints:
INSERT dbo.UserActionCount (UserID, ActionCode, Count)
SELECT @UserID, @ActionCode, 0
WHERE NOT EXISTS (
SELECT *
FROM dbo.UserActionCount WITH (ROWLOCK, HOLDLOCK, UPDLOCK)
WHERE
UserID = @UserID
AND ActionCode = @ActionCode
);
Then do the UPDATE with + 1 as in the usual case. Problem solved.
DECLARE @NewCount int;
UPDATE UAC
SET
Count = Count + 1,
@NewCount = Count + 1
FROM dbo.UserActionCount UAC
WHERE
ActionCode = @ActionCode
AND UserID = @UserID;
Note 1: The MERGE should be okay, but know that just because something is done in one statement (and therefore atomic) does not mean that it does not have concurrency problems. Locks are acquired and released over time throughout the lifetime of a query's execution. A query like the following WILL experience concurrency problems causing duplicate ID insertion attempts, despite being atomic.
INSERT T
SELECT (SELECT Max(ID) FROM Table) + 1, GetDate()
FROM Table T;
Note 2: An article I read by people experienced in super-high-transaction-volume systems said that they found the "try-it-then-handle-any-error" method to offer higher concurrency than acquiring and releasing locks. This may not be the case in all system designs, but it is at least worth considering. I have since searched for this article several times (including just now) and been unable to find it again... I hope to find it some day and reread it.
In case anyone else needs the syntax to use this in a stored procedure and return the inserted/updated rows (I was surprised that inserted.* also returns the updated rows, but it does), here is what I ended up with. I forgot I had an additional column in my primary key (ActionKey); it is reflected below. You can also use "output inserted.Count" if you only want to return the Count, which is more practical.
CREATE PROCEDURE dbo.AddUserAction
(
@Action varchar(30),
@ActionKey varchar(50) = '',
@UserID int
)
AS
MERGE UserActions AS target
USING (SELECT @Action, @ActionKey, @UserID) AS source (Action, ActionKey, UserID)
ON (target.Action = source.Action AND target.ActionKey = source.ActionKey AND target.UserID = source.UserID)
WHEN MATCHED THEN
UPDATE SET [Count] = target.[Count] + 1
WHEN NOT MATCHED THEN
INSERT (Action, ActionKey, UserID, [Count])
VALUES (source.Action, source.ActionKey, source.UserID, 1)
output inserted.*;

C# SqlParameter - provide SQL (Microsoft SQL)

I am currently tasked with a project on a database whose schema cannot be changed. I need to insert a new row into a table that requires an ID to be unique, but the original creators of the structure did not set this value to autoincrement. To go around this, I have been using code akin to:
(SELECT TOP 1 [ID] from [Table] ORDER BY [ID] DESC) + 1
when giving the value of the ID field, basically having an inner query of sorts. Problem is that a few lines down, I need that ID I just inputted. If I could set a SQLParameter to output for this column, I could get the value it was set to, problem is I'm using SQL, and not a hard value like I do with other SQLParameters. Can't I use SQL in place of just a value?
This is a potential high volume exchange, so I'd rather not do 2 different queries (one to get id, then one to insert).
You say you cannot change the schema, but can you add an additional table to the project that does an autoincrement column? Then you could use that table to (safely) create your new IDs and return them to your code.
This is similar to how Oracle does IDs, and sometimes vendor applications for sql server that also run on Oracle will use that approach just to help minimize the differences between the two databases.
Update:
Ah, I just spotted your comment to the other answer here. In that case, the only other thing I can think that might work is to put your two statements (insert a new ID, and then read back the new ID) inside a transaction with the SERIALIZABLE isolation level. And that just kinda sucks, because it leaves you open to performance and locking gotchas.
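A rough sketch of that SERIALIZABLE approach, using the [Table] and [ID] placeholders from the question. The UPDLOCK hint is an addition of this sketch, not something the answer specifies; it is there so that two sessions cannot both read the same maximum and then deadlock on the insert:

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;

DECLARE @NewId int;

-- Read the current maximum while holding an update lock on the range (UPDLOCK is an assumption).
SELECT @NewId = ISNULL(MAX([ID]), 0) + 1
FROM [Table] WITH (UPDLOCK);

-- The remaining columns are elided in the question and are left out here.
INSERT INTO [Table] ([ID] /* , other columns */)
VALUES (@NewId /* , other values */);

COMMIT TRAN;

-- Return the id that was used.
SELECT @NewId AS NewId;
```

The isolation level stays in effect for the session after the batch, so it may need to be set back to the previous level afterwards.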
Is it possible for you to create a stored procedure in the database to do this and the return value of the stored procedure will then return the ID that you need?
I'm a bit confused about where you need to use this ID. If it inside of the same stored proc just use this method:
DECLARE @NewId int
SELECT TOP 1 @NewId = [ID] + 1 from [Table] ORDER BY [ID] DESC
SELECT @NewId
You can put more than one SQL statement in a single SqlCommand. So you could easily do something along the lines of what Abe suggested:
DECLARE @NewId int
SELECT TOP 1 @NewId = [ID] + 1 from [Table] ORDER BY [ID] DESC
INSERT INTO [Table] (ID, ...) VALUES (@NewId, ...)
SELECT @NewId
Then you just call ExecuteScalar on your SqlCommand, and it will do the INSERT and then return the ID it used.