How to guarantee only one process picks up a processing task - sql

I have multiple computers that have the task of sending out emails found in a table on a common SQL Server. Each computer polls the email table to look for messages it can send by looking at a status flag set to 0. If a computer does a
SELECT * FROM tblEmailQueue where StatusFlag=0
if it returns a record it immediately sets the StatusFlag to 1, which should cause the other computers polling the same table not to find this record. My fear is that if two computers find the record at the same time, before either can update the StatusFlag, the email will be sent twice. Does anyone have ideas on how to ensure only one computer will get the record? I know I might be able to do a table lock, but I would rather not have to do this.

Instead of using two queries which may cause a race condition, you can update the values and output the updated rows at once using the OUTPUT clause.
This will update the rows with statusflag=0 and output all of the updated ones:
UPDATE tblEmailQueue
SET statusflag=1
OUTPUT DELETED.*
WHERE statusflag=0;
EDIT: If you're picking one row, you may want some ordering. Since the update itself can't order, you can use a common table expression to do the update;
WITH cte AS (
SELECT TOP 1 id, statusflag FROM tblEmailQueue
WHERE statusflag = 0 ORDER BY id
)
UPDATE cte SET statusflag=1 OUTPUT DELETED.*;

You can perform the select and send the email in the same transaction. You can also use the ROWLOCK hint and not commit the transaction until you have sent the email or set a new value for StatusFlag. That means nobody (except a transaction using the NOLOCK hint or the READ UNCOMMITTED isolation level) can read this row until you commit the transaction.
SELECT * FROM tblEmailQueue WITH(ROWLOCK) where StatusFlag=0
In addition, you should check the isolation level. For your case the isolation level should be READ COMMITTED or REPEATABLE READ.

Add another column to your table tblEmailQueue (say UserID), then try to pull one email such as
-- Flag an email and assign it to the application that made the request.
-- @CurrentUserID is an id unique to each application or user, supplied by the application
-- or user that made the request. This also ensures that the record goes to
-- the right application, and you can use it for other purposes such as monitoring.
UPDATE tblEmailQueue set UserID = @CurrentUserID, StatusFlag=1 where ID = isnull((
select top 1 ID from tblEmailQueue where StatusFlag=0 order by ID
), 0)
-- Get the email that was flagged for the current user id
SELECT * FROM tblEmailQueue where StatusFlag=1 and UserID = @CurrentUserID

Here in Indianapolis, we are familiar with race conditions ;-)
Let's assume you actually have an ID field and a StatusFlag field, and create a stored proc that includes
declare @id int
select top 1 @id = id from tblEmailQueue where StatusFlag=0
if @@rowcount = 1
begin
update tblEmailQueue set StatusFlag=1 where ID = @id AND StatusFlag=0
if @@rowcount = 1
begin
-- I won the race, continue processing
...
end
end
ADDED
Explicit handling like this is inferior to Joachim's method if all you want is the result of the select. But this method also works with old versions of SQL Server as well as other databases.
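The "I won the race" check above can be tried out without a SQL Server instance. The following is only a minimal sketch using Python's built-in sqlite3 module as a stand-in database; the helper names find_candidate and try_claim are made up for illustration. Two workers both read the same candidate row, but only the first conditional UPDATE reports one affected row.

```python
import sqlite3

def find_candidate(conn):
    """Return the id of the oldest unsent email, or None if the queue is empty."""
    row = conn.execute(
        "SELECT id FROM tblEmailQueue WHERE StatusFlag = 0 ORDER BY id LIMIT 1"
    ).fetchone()
    return row[0] if row else None

def try_claim(conn, email_id):
    """Flip StatusFlag only if it is still 0; True means this caller won the race."""
    cur = conn.execute(
        "UPDATE tblEmailQueue SET StatusFlag = 1 WHERE id = ? AND StatusFlag = 0",
        (email_id,),
    )
    conn.commit()
    return cur.rowcount == 1

# SQLite in-memory database used as a stand-in (assumption, not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblEmailQueue (id INTEGER PRIMARY KEY, "
    "StatusFlag INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO tblEmailQueue (id) VALUES (1)")
conn.commit()

# Simulate the race: both workers see the same row before either updates it.
seen_by_a = find_candidate(conn)
seen_by_b = find_candidate(conn)
a_won = try_claim(conn, seen_by_a)
b_won = try_claim(conn, seen_by_b)
print(a_won, b_won)
```

The worker whose try_claim returns False should simply poll again rather than send the email.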

Related

A trigger that inserts several rows instead of one

We have an issue with the following trigger. We would like to insert a row into the UPDATEPROCESSINFO table when there is no row with the new INSTANCEID and update it for the next ones.
But we were surprised to discover that sometimes we have multiple rows with the same INSTANCEID. Is it because it was very fast? How to prevent this from happening? Our aim is to have one row per INSTANCEID.
Thanks for help
create or replace TRIGGER TRIG_UPDATE_PROCESS_INFO
AFTER INSERT ON PROCESSSTEP
FOR EACH ROW
DECLARE
AUDIT_TIME TIMESTAMP(6);
BEGIN
SELECT MAX(LASTUPDATETIME)
INTO AUDIT_TIME
FROM UPDATEPROCESSINFO
WHERE INSTANCEID = :NEW.INSTANCEID;
IF AUDIT_TIME IS NULL THEN
INSERT INTO UPDATEPROCESSINFO
(INSTANCEID, STEPID, STEPSTATUS, STEPITERATION, LASTUPDATETIME)
VALUES
(:NEW.INSTANCEID, :NEW.STEPID, :NEW.STATUS, :NEW.STEPITERATION, :NEW.AUDITTIMESTAMP);
ELSIF :NEW.AUDITTIMESTAMP > AUDIT_TIME THEN
UPDATE UPDATEPROCESSINFO
SET STEPID = :NEW.STEPID,
LASTUPDATETIME = :NEW.AUDITTIMESTAMP,
STEPSTATUS = :NEW.STATUS,
STEPITERATION = :NEW.STEPITERATION
WHERE INSTANCEID = :NEW.INSTANCEID;
END IF;
END;
This may be occurring because you have multiple sessions which are inserting into PROCESSSTEP for the same INSTANCEID. If two of the sessions insert into PROCESSSTEP at nearly the same time, and neither of them has committed their changes, then neither session will "see" the other's changes, and both will think that a row does not exist in UPDATEPROCESSINFO.
In my view this design appears to have a problem. I suggest changing it to have a PROCESS_STEP_HISTORY table, and as each step in the process is completed a row is inserted into PROCESS_STEP_HISTORY to record the information for the process step that was completed. Then, when something needed to find out information about the "last" step which was completed it would just do something like
SELECT a.*
FROM (SELECT *
FROM PROCESS_STEP_HISTORY h
WHERE INSTANCE_ID = whatever
ORDER BY LASTUPDATETIME DESC) a
WHERE ROWNUM = 1
It also has the advantage of preserving information about every step in the process, which may prove useful.
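As a rough illustration of the history-table idea, here is a small sketch using Python's built-in sqlite3 module instead of Oracle (so LIMIT 1 replaces the ROWNUM filter). The column names STEP_ID and STEP_STATUS, the helper names, and the sample timestamps are assumptions for the example only.

```python
import sqlite3

# SQLite in-memory database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PROCESS_STEP_HISTORY ("
    "INSTANCE_ID INTEGER NOT NULL, STEP_ID INTEGER NOT NULL, "
    "STEP_STATUS TEXT NOT NULL, LASTUPDATETIME TEXT NOT NULL)"
)

def record_step(conn, instance_id, step_id, status, timestamp):
    """Append one row per completed step; existing rows are never updated."""
    conn.execute(
        "INSERT INTO PROCESS_STEP_HISTORY VALUES (?, ?, ?, ?)",
        (instance_id, step_id, status, timestamp),
    )
    conn.commit()

def latest_step(conn, instance_id):
    """Return the most recent step for an instance, or None if there is none."""
    return conn.execute(
        "SELECT STEP_ID, STEP_STATUS, LASTUPDATETIME FROM PROCESS_STEP_HISTORY "
        "WHERE INSTANCE_ID = ? ORDER BY LASTUPDATETIME DESC LIMIT 1",
        (instance_id,),
    ).fetchone()

# Rows may arrive out of order; the query still picks the newest timestamp.
record_step(conn, 42, 2, "DONE", "2024-01-01T10:05:00")
record_step(conn, 42, 1, "DONE", "2024-01-01T10:00:00")
print(latest_step(conn, 42))
```

Because writers only ever insert, two concurrent sessions cannot race on a check-then-insert the way the trigger does.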
I also don't recommend using a trigger to do this sort of thing. This is business logic, and putting business logic into triggers is never a good idea.
Best of luck.

Can an update with nested select be considered atomic in Sybase?

I am trying something like this:
set rowcount 10 -- fetch only 10 rows
Update tableX set x=@BatchId where id in (select id from tableX where x=0)
Basically, mark 10 records as booked by supplying a batchId.
So my question is: if this proc is executed in parallel, can I guarantee that the update with select will be atomic, and that no two invocations will select the same set of records from tableX for booking?
Thanks
To guarantee that no such overlaps occur, you should:
(i) put BEGIN TRANSACTION - COMMIT around the statement
(ii) put the HOLDLOCK keyword directly behind 'tableX' (or run the whole statement at isolation level 3).

MS SQL locking for update

I am implementing a competition where there might be a lot of simultaneous entries. I am collecting some user data, which I am putting in one table called entries. I have another table of pre-generated unique discount codes called discountCodes. I am then assigning one to each entry. I thought I would do this by putting an entry id in the discountCodes table.
As there may be a lot of concurrent users I think I should select the first unassigned row and then assign the entry id to that row. I need to make sure between picking an unassigned row and adding the entry id that another thread doesn't find the same row.
What is the best way of ensuring that the row doesn't get assigned twice?
Something can be done like
The following example sets the TRANSACTION ISOLATION LEVEL for the session. For each Transact-SQL statement that follows, SQL Server holds all of the shared locks until the end of the transaction. Source:MSDN
USE databaseName;
GO
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
GO
BEGIN TRANSACTION;
GO
SELECT *
FROM Table1;
GO
SELECT *
FROM Table2;
GO
COMMIT TRANSACTION;
GO
Read more SET TRANSACTION ISOLATION LEVEL
I would recommend building a bridge table instead of having the EntryId in the DiscountCodes table with an EntryId and a DiscountCodeId. Place a Unique Constraint on both of those fields.
This way your entry point will encounter a constraint violation when it tries to enter a duplicate.
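To make the bridge-table suggestion concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. It assumes the advice means a separate unique constraint on each column, so that neither an entry nor a code can appear twice; the table name EntryDiscountCodes and the helper assign_code are hypothetical.

```python
import sqlite3

# SQLite in-memory database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EntryDiscountCodes ("
    "EntryId INTEGER NOT NULL UNIQUE, "
    "DiscountCodeId INTEGER NOT NULL UNIQUE)"
)

def assign_code(conn, entry_id, code_id):
    """Try to pair an entry with a code; False means a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO EntryDiscountCodes (EntryId, DiscountCodeId) "
                "VALUES (?, ?)",
                (entry_id, code_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = assign_code(conn, 1, 100)      # succeeds
dup_code = assign_code(conn, 2, 100)   # code 100 is already taken
dup_entry = assign_code(conn, 1, 101)  # entry 1 already has a code
retry = assign_code(conn, 2, 101)      # succeeds with a different code
print(first, dup_code, dup_entry, retry)
```

A caller that gets False would pick another unassigned code and try again, which is the constraint-violation handling the answer describes.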
WITH e AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) rn
FROM entries ei
WHERE NOT EXISTS
(
SELECT NULL
FROM discountCodes dci
WHERE dci.entryId = ei.id
)
),
dc AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) rn
FROM discountCodes
WHERE entryId IS NULL
)
UPDATE dc
SET dc.entryId = e.id
FROM e
JOIN dc
ON dc.rn = e.rn
I would just put an IDENTITY field on each table and let the corresponding entry match the corresponding discountCode - i.e. if you have a thousand discountCodes up front, your identity column in the discountCodes table will range from 1 to 1000. That will match your first 1 to 1000 entries. If you get more than 1000 entries, just add one discountCode per additional entry.
That way SQL Server handles all the problematic "get the next number in the sequence" logic for you.
You can try and use sp_getapplock and synchronize the write operation, just make sure it locks against the same hash, like
DECLARE @state Int
BEGIN TRAN
-- here we're using 1 second as the timeout, but you should determine what the minimum value is for your instance
EXEC @state = sp_getapplock 'SyncIt', 'Exclusive', 'Transaction', 1000
-- do insert/update/etc...
-- if you like you can be a little verbose and explicit, otherwise line below shouldn't be needed
EXEC sp_releaseapplock 'SyncIt', 'Transaction'
COMMIT TRAN

SQL Server : add row if doesn't exist, increment value of one column, atomic

I have a table that keeps a count of user actions. Each time an action is done, the value needs to increase. Since the user can have multiple sessions at the same time, the process needs to be atomic to avoid multi-user issues.
The table has 3 columns:
ActionCode as varchar
UserID as int
Count as int
I want to pass ActionCode and UserID to a function that will add a new row if one doesn't already exist, and set count to 1. If the row does exist, it will just increase the count by one. ActionCode and UserID make up the primary unique index for this table.
If all I needed to do was update, I could do something simple like this (because an UPDATE query is atomic already):
UPDATE (Table)
SET Count = Count + 1
WHERE ActionCode = @ActionCode AND UserID = @UserID
I'm new to atomic transactions in SQL. This question has probably been answered in multiple parts here, but I'm having trouble finding those and also placing those parts in one solution. This needs to be pretty fast as well, without getting too complex, because these actions may occur frequently.
Edit: Sorry, this might be a dupe of MySQL how to do an if exist increment in a single query. I searched a lot but had tsql in my search, once I changed to sql instead, that was the top result. It isn't obvious if that is atomic, but pretty sure it would be. I'll probably vote to delete this as dupe, unless someone thinks there can be some new value added by this question and answer.
Assuming you are on SQL Server, to make a single atomic statement you could use MERGE
MERGE YourTable AS target
USING (SELECT @ActionCode, @UserID) AS source (ActionCode, UserID)
ON (target.ActionCode = source.ActionCode AND target.UserID = source.UserID)
WHEN MATCHED THEN
UPDATE SET [Count] = target.[Count] + 1
WHEN NOT MATCHED THEN
INSERT (ActionCode, UserID, [Count])
VALUES (source.ActionCode, source.UserID, 1)
OUTPUT INSERTED.* INTO #MyTempTable;
UPDATE: Use OUTPUT to select the values if necessary. The code has been updated.
Using MERGE in SQL Server 2008 is probably the best bet. There is also another simple way to solve it.
If the UserID/Action doesn't exist, do an INSERT of a new row with a 0 for Count. If this statement fails due to it already being present (as inserted by another concurrent session just then), simply ignore the error.
If you want to do the insert and block while performing it to eliminate any chance of error, you can add some lock hints:
INSERT dbo.UserActionCount (UserID, ActionCode, Count)
SELECT @UserID, @ActionCode, 0
WHERE NOT EXISTS (
SELECT *
FROM dbo.UserActionCount WITH (ROWLOCK, HOLDLOCK, UPDLOCK)
WHERE
UserID = @UserID
AND ActionCode = @ActionCode
);
Then do the UPDATE with + 1 as in the usual case. Problem solved.
DECLARE @NewCount int;
UPDATE UAC
SET
Count = Count + 1,
@NewCount = Count + 1
FROM dbo.UserActionCount UAC
WHERE
ActionCode = @ActionCode
AND UserID = @UserID;
Note 1: The MERGE should be okay, but know that just because something is done in one statement (and therefore atomic) does not mean that it does not have concurrency problems. Locks are acquired and released over time throughout the lifetime of a query's execution. A query like the following WILL experience concurrency problems causing duplicate ID insertion attempts, despite being atomic.
INSERT T
SELECT (SELECT Max(ID) FROM Table) + 1, GetDate()
FROM Table T;
Note 2: An article I read by people experienced in super-high-transaction-volume systems said that they found the "try-it-then-handle-any-error" method to offer higher concurrency than acquiring and releasing locks. This may not be the case in all system designs, but it is at least worth considering. I have since searched for this article several times (including just now) and been unable to find it again... I hope to find it some day and reread it.
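The "try-it-then-handle-any-error" approach can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module as a stand-in for SQL Server; the helper name increment_action is made up, and the table follows the dbo.UserActionCount layout used above.

```python
import sqlite3

# SQLite in-memory database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE UserActionCount ('
    'ActionCode TEXT NOT NULL, UserID INTEGER NOT NULL, '
    '"Count" INTEGER NOT NULL, PRIMARY KEY (ActionCode, UserID))'
)

def increment_action(conn, action_code, user_id):
    """Insert a zero row if missing (ignoring duplicates), then add one."""
    with conn:  # one transaction: commit on success, roll back on error
        try:
            conn.execute(
                'INSERT INTO UserActionCount (ActionCode, UserID, "Count") '
                'VALUES (?, ?, 0)',
                (action_code, user_id),
            )
        except sqlite3.IntegrityError:
            pass  # the row already exists, which is fine
        conn.execute(
            'UPDATE UserActionCount SET "Count" = "Count" + 1 '
            'WHERE ActionCode = ? AND UserID = ?',
            (action_code, user_id),
        )
        row = conn.execute(
            'SELECT "Count" FROM UserActionCount '
            'WHERE ActionCode = ? AND UserID = ?',
            (action_code, user_id),
        ).fetchone()
    return row[0]

first = increment_action(conn, "LOGIN", 7)
second = increment_action(conn, "LOGIN", 7)
other = increment_action(conn, "LOGIN", 8)
print(first, second, other)
```

The primary key does the real work here: a second insert for the same pair fails cleanly, and the increment itself is a single UPDATE statement.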
In case anyone else needs the syntax to use this in a stored procedure and return the inserted/updated rows (I was surprised inserted.* also returns the updated rows, but it does), here is what I ended up with. I forgot I had an additional column in my primary key (ActionKey); it is reflected below. You can also do "output inserted.Count" if you only want to return the Count, which is more practical.
CREATE PROCEDURE dbo.AddUserAction
(
@Action varchar(30),
@ActionKey varchar(50) = '',
@UserID int
)
AS
MERGE UserActions AS target
USING (SELECT @Action, @ActionKey, @UserID) AS source (Action, ActionKey, UserID)
ON (target.Action = source.Action AND target.ActionKey = source.ActionKey AND target.UserID = source.UserID)
WHEN MATCHED THEN
UPDATE SET [Count] = target.[Count] + 1
WHEN NOT MATCHED THEN
INSERT (Action, ActionKey, UserID, [Count])
VALUES (source.Action, source.ActionKey, source.UserID, 1)
output inserted.*;

select the rows affected by an update

If I have a table with these fields:
int:id_account
int:session
string:password
Now for a login statement I run this sql UPDATE command:
UPDATE tbl_name
SET session = session + 1
WHERE id_account = 17 AND password = 'apple'
Then I check if a row was affected, and if one indeed was affected I know that the password was correct.
Next what I want to do is retrieve all the info of this affected row so I'll have the rest of the fields info.
I can use a simple SELECT statement but I'm sure I'm missing something here, there must be a neater way you gurus know, and going to tell me about (:
Besides, this has bothered me since the first login SQL statement I ever wrote.
Is there any performance-wise way to combine a SELECT into an UPDATE if the UPDATE did update a row?
Or am I better leaving it simple with two statements? Atomicity isn't needed, so I might better stay away from table locks for example, no?
You should use the same WHERE statement for SELECT. It will return the modified rows, because your UPDATE did not change any columns used for lookup:
UPDATE tbl_name
SET session = session + 1
WHERE id_account = 17 AND password = 'apple';
SELECT *
FROM tbl_name
WHERE id_account = 17 AND password = 'apple';
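A word of advice: never store passwords as plain text! Store a hash instead (MD5 is shown only for brevity; it is no longer considered safe for passwords, and a salted, slow algorithm such as bcrypt is preferable):
MD5('apple')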
There is ROW_COUNT() (do read about details in the docs).
Following up with a SELECT is fine and simple (which is always good), but it might unnecessarily stress the system.
This won't work for statements such as...
Update Table
Set Value = 'Something Else'
Where Value is Null
Select Value From Table
Where Value is Null
You would have changed the value with the update and would be unable to recover the affected records unless you stored them beforehand.
Select * Into #TempTable
From Table
Where Value is Null
Update Table
Set Value = 'Something Else'
Where Value is Null
Select T.Value, T.UniqueValue
From #TempTable TT
Join Table T
On TT.UniqueValue = T.UniqueValue
If you're lucky, you may be able to join the temp table's records to a unique field within Table to verify the update. This is just one small example of why it is important to enumerate records.
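The capture-first idea can be sketched in a few lines. This is a minimal single-connection illustration using Python's built-in sqlite3 module; the table name Items and the helper update_nulls are hypothetical, and with concurrent writers the select and update would need to run under a suitable lock or isolation level.

```python
import sqlite3

# SQLite in-memory database used as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (UniqueValue INTEGER PRIMARY KEY, Value TEXT)")
conn.executemany(
    "INSERT INTO Items VALUES (?, ?)", [(1, None), (2, "kept"), (3, None)]
)
conn.commit()

def update_nulls(conn, new_value):
    """Remember which rows match, update exactly those, and return their keys."""
    with conn:
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT UniqueValue FROM Items WHERE Value IS NULL "
                "ORDER BY UniqueValue"
            )
        ]
        conn.executemany(
            "UPDATE Items SET Value = ? WHERE UniqueValue = ?",
            [(new_value, i) for i in ids],
        )
    return ids

affected = update_nulls(conn, "Something Else")
rows = conn.execute("SELECT * FROM Items ORDER BY UniqueValue").fetchall()
print(affected, rows)
```

Updating by the saved keys, rather than repeating the original WHERE clause, is what keeps the affected rows recoverable after the filter column has changed.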
You can get the affected rows by just using @@RowCount:
select top (Select @@RowCount) * from YourTable order by 1 desc