How to write deadlock-proof upsert T-SQL?

For a web api performing standard CRUD operations against a database :
For the DML CUD operations I could see two strategies at the endpoint :
Strategy 1 : CUD SPs accept params for table keys. First call read SP to determine if relevant record(s) exist; if not found return 401; if found then proceed with SP insert, update, delete, passing relevant key.
Example : endpoint to add new auto for customer. Auto table has FK to PK of Customer table. First call Auto Read SP ( first name, last name, phone, etc. ). If no row found, return 401 + custom message. If row found, call Auto Update SP passing Customer key.
Strategy 2 : CUD SPs accept properties params and SQL joins to other tables ( if relevant ).
Example : same endpoint. Single call to Auto Insert SP with params auto make, model, year + customer name, phone. If it fails, return 500 + SQL error message.
Advantage of Strategy 1 : better experience for the end user ( custom friendly error message ), and calls to CUD operations are never attempted if they are pre-determined to fail because required records are not present.
Advantages of Strategy 2 : fewer calls to the database per call to the web endpoint. How important is this typically ? Are there other advantages ?
EDIT : I'm trying out Aaron Bertrand's upsert pattern and trying to retrieve the identity of the newly inserted row, but it is always zero:
BEGIN TRAN
INSERT [dbo].[Auto]
    ([Make],[Model],[Year],[Customer])
SELECT @make, @model, @year, @customerId
WHERE NOT EXISTS
    (SELECT 1
     FROM [dbo].[Auto] WITH (UPDLOCK, SERIALIZABLE)
     WHERE [Make] = @make
       AND [Model] = @model
       AND [Year] = @year
       AND [Customer] = @customerId)
IF @@ROWCOUNT = 0
BEGIN
    UPDATE [dbo].[Auto]
    SET [Make] = @make
       ,[Model] = @model
       ,[Year] = @year
       ,[Customer] = @customerId
    WHERE [Make] = @make
      AND [Model] = @model
      AND [Year] = @year
      AND [Customer] = @customerId
END
SET @id = SCOPE_IDENTITY()
COMMIT

This question would be opinion-based if it weren't for the fact that Strategy 1 has absolutely no benefit.
You don't need multiple calls to get a friendly error message. Just call the stored procedure (or ad-hoc SQL batch). If it fails, catch the exception and send an appropriate and helpful HTTP response.
If extra logic is required, put it inside the stored procedure.
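As a rough illustration of the single-call approach, here is a minimal Python sketch that attempts the insert directly and turns a constraint failure into a friendly response. Everything in it is an assumption made for the example: SQLite stands in for SQL Server, the schema is simplified, and add_auto and the status codes 201 and 404 are hypothetical choices, not something the question prescribes.

```python
import sqlite3

def make_auto_db():
    # In-memory SQLite database standing in for the real server (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.execute("CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE Auto ("
        " Id INTEGER PRIMARY KEY, Make TEXT, Model TEXT, Year INTEGER,"
        " Customer INTEGER NOT NULL REFERENCES Customer(Id))"
    )
    conn.execute("INSERT INTO Customer (Id, Name) VALUES (1, 'Ann')")
    conn.commit()
    return conn

def add_auto(conn, make, model, year, customer_id):
    """Hypothetical endpoint body: one database call, no preliminary read."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO Auto (Make, Model, Year, Customer) VALUES (?, ?, ?, ?)",
                (make, model, year, customer_id),
            )
        return 201, {"id": cur.lastrowid}
    except sqlite3.IntegrityError:
        # A constraint was violated (here, the foreign key to Customer);
        # translate it into a friendly message instead of leaking the SQL error.
        return 404, {"error": f"Customer {customer_id} does not exist."}
```

The caller gets the same friendly message that a separate lookup would have produced, with one round trip instead of two.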

Related

Webapp - should a constraint be checked at database or application, or both

Simplified context:
Let's say I have two tables in my database: Room(id, maxContract) and Contract(id, roomid, status).
Let's say room 17 allows at most 2 clients. I would search the Contract table for rows with roomid = 17 and status = active, and if there are already max (in this case 2) such rows, I would prevent further INSERTs until a contract expires.
Question:
Now I see two ways of doing this: the first is in the database itself, maybe with a TRIGGER, and the second is in my webapp DAO: query the Contract table to get the count, check the condition with an if-else, and run the INSERT only if it holds.
But I am just a newbie and don't know which is the best (or most common) approach. Which way should I do it? If this were my personal app, I would do both for maximum safety, but when designing a web app I also have to take performance into consideration.
With frontend and backend, I know that validation is mandatory at the backend and optional at the frontend, but between backend and database I don't know exactly.
(In case this is opinion-based and there is no best practice, I would like to know the pros and cons of both implementations.)
EDIT:
to be more exact: user click JOIN ROOM => call an insertToRoom() method
+solution 1:
insertToRoom(){
if (roomIsAvailable()){
execute INSERT query;
}
else alert: "room is full";
}
roomIsAvailable() is a method to query and count how many contracts are bound to the room
+solution 2:
insertToRoom(){
execute INSERT query;
}
database:
CREATE TRIGGER before INSERT
if (some code to count the rooms)
ROLLBACK TRANSACTION;
In this case, if an unavailable room is joined, the database will return an error, which in turn causes the execute INSERT query call in the application to return false.
Either way, the invalid data is not inserted and the end user gets an error alert.
I expect that more than one user can work with your application.
If you read the current status from the DB into the application, evaluate the condition there, and only then run the insert into the DB table, the condition can be violated. Imagine this sequence of steps:
User A reads the current status from the DB.
User B reads the current status from the DB.
User A evaluates the condition with the result "there is space for one client".
User B evaluates the condition with the result "there is space for one client".
User A updates the DB -> the room is full.
User B updates the DB -> the room holds more than the allowed number of clients.
So that is not an option.
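The interleaving above can be replayed deterministically. The following Python sketch is only an illustration under assumptions: it uses an in-memory SQLite database with the Room and Contract tables from the question, and free_places and insert_contract are hypothetical helper names. Both "users" read before either one writes, exactly as in the sequence of steps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Room (id INTEGER PRIMARY KEY, maxContract INTEGER)")
conn.execute(
    "CREATE TABLE Contract (id INTEGER PRIMARY KEY, roomid INTEGER, status TEXT)"
)
conn.execute("INSERT INTO Room (id, maxContract) VALUES (17, 2)")
conn.execute("INSERT INTO Contract (roomid, status) VALUES (17, 'active')")

def free_places(room_id):
    """Application-side check: allowed contracts minus active contracts."""
    (max_contract,) = conn.execute(
        "SELECT maxContract FROM Room WHERE id = ?", (room_id,)
    ).fetchone()
    (used,) = conn.execute(
        "SELECT COUNT(*) FROM Contract WHERE roomid = ? AND status = 'active'",
        (room_id,),
    ).fetchone()
    return max_contract - used

def insert_contract(room_id):
    conn.execute(
        "INSERT INTO Contract (roomid, status) VALUES (?, 'active')", (room_id,)
    )

# Steps 1-2: both users read the status before anyone writes.
free_seen_by_a = free_places(17)
free_seen_by_b = free_places(17)

# Steps 3-6: each user acts on its own, now stale, result.
if free_seen_by_a > 0:
    insert_contract(17)
if free_seen_by_b > 0:
    insert_contract(17)

active = conn.execute(
    "SELECT COUNT(*) FROM Contract WHERE roomid = 17 AND status = 'active'"
).fetchone()[0]
print(free_seen_by_a, free_seen_by_b, active)
```

Both checks pass because each saw one free place, so the room ends up with three active contracts although only two are allowed.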
You can use triggers (as you mentioned), if your DB has such possibility.
You can also create one SQL statement which will check conditions and update the record in DB in one step (atomically). It depends on your DB engine, whether it is possible.
Update 2022-05-27
I was asked to explain in detail the atomic insert solution.
I'm no MySQL guru, and I'm pretty sure there is a more elegant way to do it, but something like this works too:
Let's create required DB objects first:
create table Room(
id INT,
maxContract INT);
create table Contract(
id INT AUTO_INCREMENT,
roomid INT,
status VARCHAR(30),
PRIMARY KEY(id)
);
create view UsedSpace as
select r.id as roomid, count(c.id) as used
from Room r
left outer join Contract c on c.roomid = r.id and status='active'
group by r.id;
Then we can use this statement to insert new row to Contract table:
insert into Contract(roomid, status)
select r.id, 'active'
from Room r
inner join UsedSpace us on r.id = us.roomid and r.maxContract > us.used
where r.id = 17;
When there are too many active contracts, the new row is not inserted.
You can check if the row was inserted or not via
select row_count();
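From application code, the same idea might look like the sketch below. It is a minimal illustration under assumptions: SQLite stands in for MySQL, the UsedSpace view is replaced by an equivalent correlated subquery, and make_room_db and join_room are hypothetical names. The cursor's rowcount plays the role of row_count().

```python
import sqlite3

def make_room_db(max_contract=2):
    # In-memory SQLite database standing in for MySQL (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Room (id INTEGER PRIMARY KEY, maxContract INTEGER)")
    conn.execute(
        "CREATE TABLE Contract ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT, roomid INTEGER, status TEXT)"
    )
    conn.execute("INSERT INTO Room (id, maxContract) VALUES (17, ?)", (max_contract,))
    conn.commit()
    return conn

def join_room(conn, room_id):
    """Insert an active contract only if the room still has space.

    The check and the insert are one statement, so the application never
    acts on a count it read earlier.
    """
    with conn:
        cur = conn.execute(
            """
            INSERT INTO Contract (roomid, status)
            SELECT r.id, 'active'
            FROM Room r
            WHERE r.id = ?
              AND r.maxContract > (SELECT COUNT(*)
                                   FROM Contract c
                                   WHERE c.roomid = r.id
                                     AND c.status = 'active')
            """,
            (room_id,),
        )
    # rowcount is 1 when the row went in and 0 when the room was full.
    return cur.rowcount == 1
```

The application can then show "room is full" whenever join_room returns False. How well a single statement holds up under concurrent sessions on a server database depends on the engine and isolation level, so this should be verified against the real system.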

Generate code for specific record instead of all records

I have code that generates a unique code for a field in a SQL Server 2017 table. The entire code block works without any issues, but due to user requirements I now have to do this differently. As you can see from my code, it generates a code for every row of data and sets every row's 'Valid' field to 'valid'. What I absolutely have to do is generate the code for one specific record: the most recently updated one. I understand how to get the most recently updated records and how to apply IDENT_CURRENT, @@IDENTITY, and SCOPE_IDENTITY, but I wondered whether there is something similar for recently updated records. By updated record I mean this: an approver in Finance changes the Finance_Approval field status to 'approved'. The specific record that was just changed to 'approved' is the one I want to generate my code against. This is the code:
UPDATE Req_submitted
Set approved_code =
(SELECT FLOOR(RAND()*(525885-15+9)+16458)),
Valid = 'valid'
Below is all of the code for the trigger. I'm hoping this is a challenge someone wants to take on, as my SQL has hit a wall on this one but the feature is very important.
ALTER TRIGGER [dbo].[Approve_request_trig]
ON [dbo].[Req_submitted]
AFTER update
AS
Begin
    declare
        @req_submitted_key int,
        @Submitted_to_finance_approver_email varchar(50),
        @approved_denied varchar(50),
        @approved_finance varchar(50)
    select @req_submitted_key = s.req_submitted_key,
           @Submitted_to_finance_approver_email = s.Submitted_to_finance_approver_email,
           @approved_denied = s.approved_denied,
           @approved_finance = s.approved_finance
    from inserted s;
if update(approved_finance)
UPDATE Req_submitted
Set approved_code =
(SELECT FLOOR(RAND()*(525885-15+9)+16458)),
Valid = 'valid'
;
End

Oracle SQL update double-check locking

Suppose we have table A with fields time: date, status: int, playerId: int, serverid: int
We added constraint on time, playerid and serverid (UNQ_TIME_PLAYERID_SERVERID)
At some time we try to update all rows in table A with new status and date:
update A set status = 1, time = sysdate where serverid = XXX and status != 1 and time > sysdate
The problem is that two separate processes on separate machines can execute the same update at the same sysdate.
Then a UNQ_TIME_PLAYERID_SERVERID violation occurs!
Is there any way to force Oracle to re-check the WHERE clause just before the actual update (once the lock on the row is acquired)?
I do not want to use any 'select for update' constructs.
If it's really the same update 100% of the time, then just catch the exception and ignore it.
In case you want to prevent an error from occurring in the first place, you need to implement some logic to prevent the second update statement from ever executing.
I could think of a "lock table" just for this purpose. Create a table TABLE_A_LOCK_TB (add columns based on what information you want to have stored there for administrative reasons, e.g. user who set the lock or a timestamp, ...).
Before you execute an update statement on table A, just insert a row to TABLE_A_LOCK_TB. Once an update was successful, delete said row.
Before executing any update statement on table A just check whether the TABLE_A_LOCK_TB has a dataset. If it doesn't your update is good to go, if it does you don't execute the update.
To make this process easier you could just write a package for "locking" and "unlocking" table A by inserting / deleting a row from the TABLE_A_LOCK_TB. Also implement a function to check the "lock status".
If you need this logic for several tables you can also make it dynamic by just having a column holding the table name in TABLE_A_LOCK_TB and checking against that.
In your application logic you can handle every update like this then (pseudocode):
IF your_lock_package.lock_status(table_name) = false THEN
    your_lock_package.set_lock(table_name);
    -- update statement(s)
    your_lock_package.release_lock(table_name);
ELSE
    -- "error" handling / information to user + exit
END IF;
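A minimal Python sketch of such a lock table follows, using SQLite as a stand-in for Oracle. It is an illustration under assumptions: try_lock, release_lock and update_if_unlocked are hypothetical names, and the sketch deliberately differs from the pseudocode in one respect. Instead of checking the lock status and then setting the lock in two steps (two sessions could both pass the check), it lets a primary key on the table name reject the second insert, so checking and locking happen in one step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per locked table; the primary key makes a second lock attempt fail.
conn.execute(
    "CREATE TABLE TABLE_A_LOCK_TB ("
    " table_name TEXT PRIMARY KEY,"
    " locked_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)
conn.commit()

def try_lock(conn, table_name):
    """Return True if the lock row was inserted, False if someone holds it."""
    try:
        with conn:  # commit so that other sessions can see the lock row
            conn.execute(
                "INSERT INTO TABLE_A_LOCK_TB (table_name) VALUES (?)",
                (table_name,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def release_lock(conn, table_name):
    with conn:
        conn.execute(
            "DELETE FROM TABLE_A_LOCK_TB WHERE table_name = ?", (table_name,)
        )

def update_if_unlocked(conn, table_name, do_update):
    """Run do_update() only while holding the lock; report whether it ran."""
    if not try_lock(conn, table_name):
        return False  # "error" handling / information to user
    try:
        do_update()
    finally:
        release_lock(conn, table_name)  # release even if the update fails
    return True
```

The finally clause matters in practice: a lock row left behind by a failed update would block every later update until someone deletes it by hand.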

Update SQL Server table with one time use values from another table

I have a table of users that has the usual suspects, name, email, etc. As the users complete an activity (queried from another table), I need to award them a gift card code.
update users
set giftcardcode = 'code from other table'
where email in (select email from useractivity where necessary conditions are met)
I have a table of unique, one-time use gift card codes. So I need to update my user table, setting the award code field equal to a distinct, unused gift card code from the gift card code table. Then I need to mark the 'used' field in the gift card table to 'Y'.
The goal is to do this with SQL and not any programming. I'm stumped.
I think there is a Many To Many relationship between User table and Activity table.
So, you can use a trigger to execute a query when update.
Each time a row will be updated in the Activity table, the trigger will do something.
It will UPDATE the User table by adding a new gift code.
I think you can add an attribute to your GiftCode table to easily check whether the code has already been used. Then you can get an unused code like this:
-- Retrieve an unused code based on a BIT attribute.
SELECT TOP 1 [Code] FROM [GiftCode] WHERE IS_UNUSED = 1;
Don't forget to update this gift code after using it.
You can also use a SELECT statement with a subquery to get a code:
-- Retrieve an unused code based on the codes already used in the User table.
-- The IS NOT NULL filter is needed: NOT IN returns no rows if the subquery yields a NULL.
SELECT TOP 1 [Code] FROM [GiftCode] WHERE [Code] NOT IN (SELECT [Code] FROM [User] WHERE [Code] IS NOT NULL);
This works well if you don't have too many users.
Otherwise, the first statement will be more efficient.
Don't forget to update the User table.
Now you can easily use one of the previous statements in an UPDATE statement.
It will be something like this:
UPDATE [User] SET [Code] = (
    SELECT TOP 1 [Code] FROM [GiftCode] WHERE [Code] NOT IN (
        -- Skip users without a code: a NULL here would make NOT IN match nothing.
        SELECT [Code] FROM [User] WHERE [Code] IS NOT NULL))
WHERE USER_ID = /* ... */;
You can perform this in a trigger.
You can use a stored procedure, it's more efficient and will wrap all the SQL code in a compiled function. Then you can call it in your trigger.
You can execute a stored procedure in a job (see SQL Server Agent jobs) too.
Create a trigger on your table for UPDATE and do what you need inside it, using the inserted and deleted pseudo-tables.
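Whichever mechanism runs it (trigger, stored procedure or job), the two updates belong in one transaction so that a code is never handed out without being marked as used. The Python sketch below illustrates that pairing under assumptions: SQLite stands in for SQL Server, the users, email and giftcardcode names come from the question, and the giftcards table with its code and used columns and the award_code function are hypothetical.

```python
import sqlite3

def make_gift_db():
    # In-memory SQLite database standing in for SQL Server (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, giftcardcode TEXT)")
    conn.execute(
        "CREATE TABLE giftcards (code TEXT PRIMARY KEY, used TEXT NOT NULL DEFAULT 'N')"
    )
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("a@example.com",), ("b@example.com",), ("c@example.com",)],
    )
    conn.executemany("INSERT INTO giftcards (code) VALUES (?)", [("AAA",), ("BBB",)])
    conn.commit()
    return conn

def award_code(conn, email):
    """Give one unused code to the user and mark it used, or return None."""
    with conn:  # both updates commit together or not at all
        row = conn.execute(
            "SELECT code FROM giftcards WHERE used = 'N' ORDER BY code LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # no codes left
        code = row[0]
        claimed = conn.execute(
            "UPDATE giftcards SET used = 'Y' WHERE code = ? AND used = 'N'",
            (code,),
        )
        if claimed.rowcount != 1:
            return None  # the code was taken in the meantime
        conn.execute(
            "UPDATE users SET giftcardcode = ? WHERE email = ?", (code, email)
        )
    return code
```

Repeating the used = 'N' condition in the UPDATE and checking rowcount is what keeps a code from being claimed twice; the preceding SELECT only proposes a candidate.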

SQL Server : add row if doesn't exist, increment value of one column, atomic

I have a table that keeps a count of user actions. Each time an action is done, the value needs to increase. Since the user can have multiple sessions at the same time, the process needs to be atomic to avoid multi-user issues.
The table has 3 columns:
ActionCode as varchar
UserID as int
Count as int
I want to pass ActionCode and UserID to a function that will add a new row if one doesn't already exist, and set count to 1. If the row does exist, it will just increase the count by one. ActionCode and UserID make up the primary unique index for this table.
If all I needed to do was update, I could do something simple like this (because an UPDATE query is atomic already):
UPDATE (Table)
SET Count = Count + 1
WHERE ActionCode = @ActionCode AND UserID = @UserID
I'm new to atomic transactions in SQL. This question has probably been answered in multiple parts here, but I'm having trouble finding those parts and putting them together into one solution. This also needs to be pretty fast, without getting too complex, because these actions may occur frequently.
Edit: Sorry, this might be a dupe of MySQL how to do an if exist increment in a single query. I searched a lot but had tsql in my search, once I changed to sql instead, that was the top result. It isn't obvious if that is atomic, but pretty sure it would be. I'll probably vote to delete this as dupe, unless someone thinks there can be some new value added by this question and answer.
Assuming you are on SQL Server, to make a single atomic statement you could use MERGE:
MERGE YourTable AS target
USING (SELECT @ActionCode, @UserID) AS source (ActionCode, UserID)
ON (target.ActionCode = source.ActionCode AND target.UserID = source.UserID)
WHEN MATCHED THEN
    UPDATE SET [Count] = target.[Count] + 1
WHEN NOT MATCHED THEN
    INSERT (ActionCode, UserID, [Count])
    VALUES (source.ActionCode, source.UserID, 1)
OUTPUT INSERTED.* INTO #MyTempTable;
Update: use OUTPUT to select the values if necessary. The code has been updated.
Using MERGE in SQL Server 2008 is probably the best bet. There is also another simple way to solve it.
If the UserID/Action doesn't exist, do an INSERT of a new row with a 0 for Count. If this statement fails due to it already being present (as inserted by another concurrent session just then), simply ignore the error.
If you want to do the insert and block while performing it to eliminate any chance of error, you can add some lock hints:
INSERT dbo.UserActionCount (UserID, ActionCode, Count)
SELECT @UserID, @ActionCode, 0
WHERE NOT EXISTS (
    SELECT *
    FROM dbo.UserActionCount WITH (ROWLOCK, HOLDLOCK, UPDLOCK)
    WHERE
        UserID = @UserID
        AND ActionCode = @ActionCode
);
Then do the UPDATE with + 1 as in the usual case. Problem solved.
DECLARE @NewCount int;
UPDATE UAC
SET
    Count = Count + 1,
    @NewCount = Count + 1
FROM dbo.UserActionCount UAC
WHERE
    ActionCode = @ActionCode
    AND UserID = @UserID;
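The simpler variant described first (insert a row with a count of 0, ignore the error if the row already exists, then increment) might look like this from application code. This is a minimal sketch under assumptions: SQLite stands in for SQL Server, the table and column names follow the answer, and make_counter_db and increment are hypothetical names.

```python
import sqlite3

def make_counter_db():
    # In-memory SQLite database standing in for SQL Server (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE UserActionCount ('
        ' UserID INTEGER NOT NULL,'
        ' ActionCode TEXT NOT NULL,'
        ' "Count" INTEGER NOT NULL,'
        ' PRIMARY KEY (UserID, ActionCode))'
    )
    conn.commit()
    return conn

def increment(conn, user_id, action_code):
    """Make sure the row exists, then add 1 and return the new count."""
    with conn:
        try:
            conn.execute(
                'INSERT INTO UserActionCount (UserID, ActionCode, "Count")'
                ' VALUES (?, ?, 0)',
                (user_id, action_code),
            )
        except sqlite3.IntegrityError:
            pass  # the row already exists, possibly inserted by another session
        conn.execute(
            'UPDATE UserActionCount SET "Count" = "Count" + 1'
            ' WHERE UserID = ? AND ActionCode = ?',
            (user_id, action_code),
        )
        (new_count,) = conn.execute(
            'SELECT "Count" FROM UserActionCount'
            ' WHERE UserID = ? AND ActionCode = ?',
            (user_id, action_code),
        ).fetchone()
    return new_count
```

Starting the row at 0 keeps the two paths identical: whether or not the insert succeeds, the same UPDATE performs the increment.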
Note 1: The MERGE should be okay, but know that just because something is done in one statement (and therefore atomic) does not mean that it does not have concurrency problems. Locks are acquired and released over time throughout the lifetime of a query's execution. A query like the following WILL experience concurrency problems causing duplicate ID insertion attempts, despite being atomic.
INSERT T
SELECT (SELECT Max(ID) FROM Table) + 1, GetDate()
FROM Table T;
Note 2: An article I read by people experienced in super-high-transaction-volume systems said that they found the "try-it-then-handle-any-error" method to offer higher concurrency than acquiring and releasing locks. This may not be the case in all system designs, but it is at least worth considering. I have since searched for this article several times (including just now) and been unable to find it again... I hope to find it some day and reread it.
In case anyone else needs the syntax to use this in a stored procedure and return the inserted/updated rows (I was surprised that inserted.* also returns the updated rows, but it does), here is what I ended up with. I forgot that I had an additional column in my primary key (ActionKey); it is reflected below. You can also use "output inserted.Count" if you only want to return the Count, which is more practical.
CREATE PROCEDURE dbo.AddUserAction
(
    @Action varchar(30),
    @ActionKey varchar(50) = '',
    @UserID int
)
AS
MERGE UserActions AS target
USING (SELECT @Action, @ActionKey, @UserID) AS source (Action, ActionKey, UserID)
ON (target.Action = source.Action AND target.ActionKey = source.ActionKey AND target.UserID = source.UserID)
WHEN MATCHED THEN
    UPDATE SET [Count] = target.[Count] + 1
WHEN NOT MATCHED THEN
    INSERT (Action, ActionKey, UserID, [Count])
    VALUES (source.Action, source.ActionKey, source.UserID, 1)
OUTPUT inserted.*;