I would like to do something like this
begin tran
declare @maxprice int;
select @maxprice = MAX(price) from products where typeId = 5;
and later on
insert into products (price) values (@maxprice + RAND() * 10);
commit tran
It is rather strange, but it is a strange application.
Between the SELECT and the INSERT I cannot afford to have people inserting rows into the database.
Would it be OK to do the SELECT MAX(price) WITH (XLOCK) to prevent other people from reading the max price until I complete my transaction?
You never have a race condition in the database if you let the DBMS handle the locking for you. To do that, you have to give it enough information:
insert into products (price)
select MAX(price) + RAND() * 10
from products where typeId = 5
Would it be ok to do a select max price with (XLOCK)
Basically, no. You never want to leave a transaction open and hand control back to an application or a user. You want to get your ducks in a row and execute your transaction. If you want to "lock" something (while, say, the user updates the information), you usually resort to an optimistic concurrency strategy, whereby at the time of the update you verify that the row has not changed since it was read. You might want to read up on the timestamp (now rowversion) datatype, which supports that style.
SQL also defines cursors, row-level locking, and serializable isolation. Those are all available in SQL Server too, at some cost to concurrency.
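To illustrate that optimistic style, here is a minimal sketch, assuming the products table has a rowversion column (called RowVer here) that the application read along with the row; the parameter names are made up:
-- Optimistic update sketch: @id, @newPrice and @originalRowVer come from the application,
-- which captured RowVer when it originally read the row.
UPDATE products
SET price = @newPrice
WHERE id = @id
  AND RowVer = @originalRowVer;   -- matches only if nobody changed the row in the meantime

IF @@ROWCOUNT = 0
    RAISERROR('The row was changed by another user; reload and retry.', 16, 1);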
Related
I have a database where users book classes.
There is a Bookings table where, let's say, we want to have only 5 rows for 5 students.
When a student tries to book the class, I first check how many rows are in the table, and if there are fewer than 5, I do the INSERT.
The problem is that when there are concurrent bookings within the same second, I end up with more than 5 records in the table.
Before every INSERT I check the number of rows first, but when the checks run at the same time, they all see the same count, which has not yet increased.
How can I avoid these concurrent inserts and keep the table at 5 rows?
This sounds like the job for a TRIGGER!
create trigger LimitTable
on YourTableToLimit
after insert
as
    declare @tableCount int;

    select @tableCount = count(*)
    from YourTableToLimit;

    if @tableCount > 5
    begin
        rollback;   -- undo the insert that pushed the count past 5
    end
go
To be more clear, and you probably already know this... inserts are never concurrent. The concurrency happens from the calling code.
I might suggest that you have the wrong data structure if you need something like this though. I personally dislike relying on triggers like this.
Without knowing the full use case, it'd be hard to really offer a full solution though.
You can use a unique constraint on (student_id, class_id, seat_number) and set seat_number to 1, 2, 3, 4, 5 (derived from the count). But on delete you must renumber the seats for all remaining bookings in the class.
You can also use a queue to serialize the inserts and prevent them from running concurrently.
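Another rough sketch is to make the count check and the insert a single atomic step with lock hints; the table, column, and parameter names below are assumptions:
begin tran;

declare @count int;

-- updlock + holdlock keep the counted range locked until commit,
-- so two sessions cannot both see 4 rows and each insert a 5th and 6th
select @count = count(*)
from Bookings with (updlock, holdlock)
where ClassId = @classId;

if @count < 5
    insert into Bookings (ClassId, StudentId)
    values (@classId, @studentId);

commit tran;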
This question is related to, and came out of, a discussion about another thing:
What is the correct isolation level for Order header - Order lines transactions?
Imagine a scenario where we have the usual Order_Headers and Order_LineItems tables. Let's also say that we have special business rules:
Each order has a Discount field which is calculated based on the time passed since the last order was entered.
The next order's Discount field is calculated specially if there have been more than X orders in the last Y hours.
The next order's Discount field is calculated specially if the average frequency of the last 10 orders was higher than X per minute.
The next order's Discount field is calculated specially ...
The point here is to show that every order depends on the previous ones, so the isolation level is crucial.
We have a transaction (just the logic of the code is shown):
BEGIN TRANSACTION
INSERT INTO Order_Headers...
SET @Id = SCOPE_IDENTITY()
INSERT INTO Order_LineItems...(using @Id)
DECLARE @SomeVar INT
--just an example of selecting the previous X orders,
--needed to calculate the Discount value for the new order
SELECT @SomeVar = COUNT(*) FROM Order_Headers
WHERE ArbitraryCriteria
UPDATE Order_Headers
SET Discount = dbo.UDF(@SomeVar)
WHERE Id = @Id
COMMIT TRANSACTION
We also have another transaction to read orders:
SELECT TOP 10 * FROM Order_Headers
ORDER BY Id DESC
QUESTIONS
Are SNAPSHOT for the first transaction and READ COMMITTED for the second appropriate isolation levels?
Is there a better way of approaching CREATE/UPDATE transaction or is this the way to do it?
The problem with snapshot is not inserting/reading (which I assume you decided to use); it's the updates you should be concerned about.
Snapshot isolation levels use row versioning. Any time you insert/update/delete a row, a copy of the row goes to tempdb (the version store, the location for those kinds of rows) and the row grows by 14 bytes for the versioning tag, so that a newly started transaction can read the row as of the last committed transaction. Keep in mind that these resized rows stay that way until you rebuild the index.
This means that if your table is really busy, your indexes will fragment much faster and there will be a certain percentage of overhead on tempdb, so keep that in mind.
The even bigger concern here is updates, as I mentioned.
Any time you insert/delete/update a row, you take exclusive locks on those rows (and eventually on the object). Since snapshot uses row versioning, an insert from another transaction takes an exclusive lock on a NEW row, and that is not a problem. However, if session 1 has updated an existing row and session 2 tries to acquire an X lock on that same row, session 2 will fail because session 1 already holds the X lock, and that is where you get the update-conflict error.
Read Committed and Serializable cover these issues well, so you might want to take that approach and test all the solutions before you actually implement one. Remember that those levels will cause blocking on updates, while snapshot/read committed snapshot will simply fail.
Personally, I would have used read committed snapshot and altered the procedure to rerun in the CATCH block N times, but that has flaws as well.
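For what it's worth, a minimal sketch of that rerun-in-the-CATCH-block idea, assuming the order-creation transaction is wrapped in a stored procedure (the procedure name is made up):
DECLARE @retry INT = 3;   -- attempt the transaction up to 3 times
DECLARE @done BIT = 0;

WHILE @retry > 0 AND @done = 0
BEGIN
    BEGIN TRY
        EXEC dbo.CreateOrder;   -- hypothetical proc containing the transaction above
        SET @done = 1;
    END TRY
    BEGIN CATCH
        SET @retry -= 1;
        -- 3960 = "Snapshot isolation transaction aborted due to update conflict"
        IF ERROR_NUMBER() <> 3960 OR @retry = 0
        BEGIN
            THROW;   -- give up: re-raise on other errors or after the last attempt (SQL Server 2012+)
        END
    END CATCH
END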
The serializable option:
Using a pessimistic locking strategy by way of the updlock and serializable table hints to acquire a key range lock specified by the where criteria (backed by a supporting index to lock only the range necessary for the query):
declare #Id int, #SomeVar int;
begin tran;
select #SomeVar = count(OrderDate)
from Order_Headers with (updlock,serializable)
where OrderDate >= '20170101';
insert into Order_Headers (OrderDate, SomeVar)
select sysdatetime(), #SomeVar;
set #Id = scope_identity();
insert into Order_LineItems (id,cols)
select #Id, cols
from @TableValuedParameter;
commit tran;
A good guide to the why and how of using the updlock and serializable table hints to lock a key range with a select, and why you need both, is covered in Sam Saffron's upsert (update/insert) patterns.
Reference:
Documentation on serializable and other Table Hints - MSDN
Key-Range Locking - MSDN
SQL Server Isolation Levels: A Series - Paul White
Questions About T-SQL Transaction Isolation Levels You Were Too Shy to Ask - Robert Sheldon
Isolation Level references curated by Brent Ozar
I have spent a good portion of today and yesterday trying to decide whether to use a loop or cursor in SQL, or to figure out how to use set-based logic to solve the problem. I am not new to set logic, but this problem seems particularly complex.
The Problem
The idea is that if I have a list of all transactions (tens or hundreds of millions) and the date they occurred, I can start combining some of that data into a daily totals table so that it is more rapidly viewable by reporting and analytic systems. The pseudocode for this is:
foreach( row in transactions_table )
if( row in totals_table already exists )
update totals_table, add my totals to the totals row
else
insert into totals_table with my row as the base values
delete ( or archive ) row
As you can tell, the body of the loop is relatively trivial to implement, as is the cursor/looping iteration. However, the execution time is quite slow and unwieldy, and my question is: is there a non-iterative way to perform this task, or is this one of the rare exceptions where I just have to "suck it up" and use a cursor?
There have been a few discussions on the topic, some of which seem to be similar, but not usable due to the if/else statement and the operations on another table, for instance:
How to merge rows of SQL data on column-based logic? This question doesn't seem to be applicable because it simply returns a view of all sums, and doesn't actually make logical decisions about additions or updates to another table
SQL Looping seems to have a few ideas about selection with a couple of CASE statements, which seems possible, but there are two operations that I need performed depending on the status of another table, so this solution does not seem to fit.
SQL Call Stored Procedure for each Row without using a cursor This solution seems to be the closest to what I need to do, in that it can handle arbitrary numbers of operations on each row, but there doesn't seem to be a consensus among that group.
Any advice how to tackle this frustrating problem?
Notes
I am using SQL Server 2008
The schema setup is as follows:
Totals: (id int pk, totals_date date, store_id int fk, machine_id int fk, total_in, total_out)
Transactions: (transaction_id int pk, transaction_date datetime, store_id int fk, machine_id int fk, transaction_type (IN or OUT), transaction_amount decimal)
The totals should be computed by store, by machine, and by date, and should total all of the IN transactions into total_in and the OUT transactions into total_out. The goal is to get a pseudo data cube going.
You would do this in two set-based statements:
BEGIN TRANSACTION;
DECLARE #keys TABLE(some_key INT);
UPDATE tot
SET totals += tx.amount
OUTPUT inserted.some_key -- key values updated
INTO #keys
FROM dbo.totals_table AS tot WITH (UPDLOCK, HOLDLOCK)
INNER JOIN
(
SELECT t.some_key, amount = SUM(amount)
FROM dbo.transactions_table AS t WITH (HOLDLOCK)
INNER JOIN dbo.totals_table AS tot
ON t.some_key = tot.some_key
GROUP BY t.some_key
) AS tx
ON tot.some_key = tx.some_key;
INSERT dbo.totals_table(some_key, amount)
OUTPUT inserted.some_key INTO #keys
SELECT some_key, SUM(amount)
FROM dbo.transactions_table AS tx
WHERE NOT EXISTS
(
SELECT 1 FROM dbo.totals_table
WHERE some_key = tx.some_key
)
GROUP BY some_key;
DELETE dbo.transactions_table
WHERE some_key IN (SELECT some_key FROM #keys);
COMMIT TRANSACTION;
(Error handling, applicable isolation level, rollback conditions etc. omitted for brevity.)
You do the update first so you don't insert new rows and then update them, performing work twice and possibly double counting. You could use output in both cases to a temp table, perhaps, to then archive/delete rows from the tx table.
I'd caution you to not get too excited about MERGE until they've resolved some of these bugs and you have read enough about it to be sure you're not lulled into any false confidence about how much "better" it is for concurrency and atomicity without additional hints. The race conditions you can work around; the bugs you can't.
Another alternative, from Nikola's comment
CREATE VIEW dbo.TotalsView
WITH SCHEMABINDING
AS
SELECT some_key_column(s), SUM(amount) AS total_amount, COUNT_BIG(*) AS row_count
FROM dbo.Transaction_Table
GROUP BY some_key_column(s);
GO
CREATE UNIQUE CLUSTERED INDEX some_key ON dbo.TotalsView(some_key_column(s));
GO
Now if you want to write queries that grab the totals, you can reference the view directly or - depending on query and edition - the view may automatically be matched even if you reference the base table.
Note: if you are not on Enterprise Edition, you may have to use the NOEXPAND hint to take advantage of the pre-aggregated values materialized by the view.
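For example, a reporting query might look like this (the column alias matches the view definition above, and @some_key is just a placeholder):
-- NOEXPAND tells the optimizer to use the indexed view's materialized data directly
SELECT some_key_column(s), total_amount
FROM dbo.TotalsView WITH (NOEXPAND)
WHERE some_key_column(s) = @some_key;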
I do not think you need the loop.
You can just:
Update all rows/sums that match your filters/groups.
Archive/delete the previous rows.
Insert all rows that do not match your filters/groups.
Archive/delete the previous rows.
SQL is meant to work on sets of data, not on rows one by one.
I'm evaluating PostgreSQL for a personal project.
I was inspired by its Multi-Version Concurrency Control (MVCC).
I simulated a basic need: insert a transaction record and update the vendor balance from many threads at the same time, running SQL commands like:
INSERT INTO
VendorAccountTransactions (VendorId, BalanceBefore, BalanceAfter)
VALUES (
1,
(SELECT CurrentBalance FROM VendorAccounts WHERE VendorId = 1),
(SELECT CurrentBalance FROM VendorAccounts WHERE VendorId = 1) + 19.99
);
UPDATE VendorAccounts SET CurrentBalance = CurrentBalance + 19.99 WHERE VendorId = 1;
Any idea how to avoid deadlocks in such a common case?
What is needed is simply to insert the transaction record with "balance before" / "balance after" and to update the balance.
It will be used in a high-load application.
How can I achieve the right result for this simple business need?
Thank you.
Update:
Maybe there is some other solution, such as redesigning the database to avoid deadlocks, or another approach that still satisfies the business need?
Put the update first and include both statements in a transaction. The update will take a row lock on the vendor row and prevent concurrent transactions from proceeding (they will wait until the first transaction completes because the row lock is not available).
This will effectively serialize access to a given vendor which will ensure consistency.
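A sketch of that ordering for the example above; the BalanceBefore/BalanceAfter values are read back from the row that the UPDATE has just locked:
BEGIN;

-- Take the row lock first; concurrent transactions for VendorId = 1 queue up here
UPDATE VendorAccounts
SET CurrentBalance = CurrentBalance + 19.99
WHERE VendorId = 1;

-- The row is locked by this transaction, so these balances are consistent
INSERT INTO VendorAccountTransactions (VendorId, BalanceBefore, BalanceAfter)
SELECT VendorId, CurrentBalance - 19.99, CurrentBalance
FROM VendorAccounts
WHERE VendorId = 1;

COMMIT;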
Suppose I need to select the max value as an order number. So I'll select MAX(Number), assign the number to the order, and save the changes to the database. However, how do I prevent others from messing with the number? Will transactions do? Something like:
ordersRepository.StartTransaction();
order.Number = ordersRepository.GetMaxNumber() + 1;
ordersRepository.Commit();
Will the code above "lock" changes so that order numbers are read/write only by one DB client? Given that transactions are plain NHibernate ones, and GetMaxNumber just does SELECT MAX(Number) FROM Orders.
Using an ITransaction with IsolationLevel.Serializable should do the job. Be careful of table contention, though. If you've got high frequency updates on the table, things could slow down big time. You might want to profile the hit on the db when using GetMaxNumber().
I had to do something similar to generate custom IDs for high concurrency usage. My solution moved the ID generation into the database, and used a separate Counter table to hold the max values.
Using a separate Counter table has a couple of plus points:
It removes the contention on the Order table
It's usually faster
If it's small enough, it can be pinned in memory.
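For reference, the counter table behind this can be as simple as the following (the exact shape is an assumption based on the procedure below):
CREATE TABLE [COUNTER]
(
    COUNTER_ID INT NOT NULL PRIMARY KEY,   -- one row per ID sequence
    Value      INT NOT NULL                -- last value handed out
);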
I also used a stored proc to return the next available ID:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
    DECLARE @newValue INT;   -- @counterId is a parameter of the stored procedure
    UPDATE [COUNTER]
    SET @newValue = Value = Value + 1   -- increment and capture the new value in one statement
    WHERE COUNTER_ID = @counterId;
COMMIT TRAN;
RETURN @newValue;
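Calling it might look like this (GetNextId is a made-up name for the procedure):
DECLARE @orderNumber INT;
EXEC @orderNumber = dbo.GetNextId @counterId = 1;   -- @orderNumber now holds the next available ID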
Hope that helps.