Lock and Isolation for resource reservation pattern

Lock and Isolation for resource reservation pattern - locking

I need to solve a resource reservation pattern with Spring and MariaDB.
The problem is very simple, I have a guest table where I store guest names of events, I have to be sure that the guest count for the event must be less or equals the maximum capacity.
This is the table:
create table guest(
event int,
name varchar(50)
)
create index event on guest (event);
What is the right lock procedure and isolation level for DB?
Please consider that this code will run in multi-threading container.
I chose to lock the table with a "SELECT...FOR UPDATE" to limit the lock only in one event rows.
// START TRANSACTION
#Transactional
public void reserve(int event, String name){
getJdbc().query("SELECT * FROM guest WHERE id=? FOR UPDATE",event);
Integer count=getJdbc().queryForObject("SELECT COUNT(*) FROM guest WHERE id=?",Integer.class,event);
if(count>=MAX_CAPACITY)
throw new ApplicationException("No room left");
getJdbc().query("INSERT INTO guest VALUES (?,?)",event,name);
}
// COMMIT
I made some test and seems that I need the READ_COMMITTED isolation levels, am I right?
This is what I found:
This is the first time I have to change the isolation level and I'm a bit surprised of this need, can you confirm that the standard MariaDB isolation level REPETABLE_READ fails with this pattern?

The problem is, that during the transaction in Thread 2, repeatable_read guarantees that you see the DB in the state as it was at the transaction start. So effects of transaction 1 which has not been completed yet at that time, will be hidden.
Therefore you will always see the same number of records independent on what other transactions meanwhile did. So both transactions will insert a record.
READ_COMMITTED means according to the documentations: "Each consistent read, even within the same transaction, sets and reads its own fresh snapshot". Fresh snapshot means, that the results of committed concurrent transactions will be included.

A suggestion for working around the problem. This involves keeping a counter instead of doing COUNT(*). (Yes, this violates the principle of not having redundant info.)
CREATE TABLE EventCount ( event ..., ct INT ..., PRIMARY KEY(event) ) ENGINE=InnoDB;
START TRANSACTION;
INSERT ...;
UPDATE EventCount
SET ct = ct + 1
WHERE event = ?;
ct = SELECT ct FROM EventCount WHERE event = ?;
if (ct > max)
{
ROLLBACK;
exit;
}
COMMIT;
(Caveat: I have not verified that this works for your situation.)

Related

How to implement Serializable Isolation Level in SQL Server

I need to implement a serializable isolation level in SQL Server but I've tried many ways and I don't get it.
I need to lock 1 row in one transaction (It doesn´t matter if lock the complete table). So, another transaction can´t even select the row (don´t read).
The last thing I tried:
For transaction 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
-- Here I select in another instance the same row
COMMIT TRAN
For transaction 2:
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
COMMIT TRAN
I would expect that transaction 2 wait until transaction 1 commit the operation, but the transaction 2 gives me the row.
Anyone can explain me if I miss something?

SQL Server conforms to the strict definition of a Serializable query. That is, there must be a result that can logically be generated IF both queries ran in serial order - Transaction 1 finishing before Transaction 2 can start, or vice versa.
This results in some effects that can be different than you would expect. There is a great explanation of the Serializable isolation level over at SQLPerformance.com that makes clear some of what this logical serializability ends up meaning. (Very helpful site, that one.)
For your above queries, there is no logical requirement to prevent the second query from reading the same row as the first query. No matter in what order the queries are run, they will both return the same data without modifying it. Since the Query Analyzer can identify this, there is no reason to place a read lock on the data. However, if one of the queries performed an update on the data, then (warning - logic assumption here, since I don't actually know the internals of how SQL Server handles this) the QA would set a stronger lock on the selected rows.
TL;DR - SQL Server wants to minimize blocking, so it uses logical analysis to see what types of locks are needed for a serializable isolation level, and it (tries to) use the minimum number and strength of locks needed to achieve its goal.
Now that we've dealt with that - there are only two ways that I can think of to lock a row so that no one else can read it: using XLOCK + TABLOCK (locking the whole table - not a recommended practice) or having some form of a field on each row that is updated when you start your process - something like an SPID field, or a bit flag for Locked. When you update it within your transaction, only SELECTs with NOLOCK hints will be able to read it.
Clearly, neither of these are optimal. I recommend the "This row is busy - go away" flag, as that's probably the approach I would take for an (almost) absolute lock on a row.

According to the documentation:
SERIALIZABLE Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the
current transaction until the current transaction completes.
If you're not making any changes to data with an INSERT, UPDATE, or DELETE inside transaction 1, SQL will release the Shared Lock after the read completes.
What you might want to try is adding a table hit to prevent the row lock from being released until the end of transaction 1.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code
FROM table1 WITH(ROWLOCK, HOLDLOCK)
WHERE code = 1
COMMIT TRAN

Maybe you can solve this with some hack like this?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE someTableForThisHack set val = CASE WHEN val = 1 THEN 0 else 1 End
SELECT code from table1.....
COMMIT TRANSACTION
So you create a table someTableForThisHack and insert one row to it.

postgresql hang forever on serializable transaction

I use libpqxx for my connection to postgresql. And everything was ok, until i run serialazable query on one table on one row.
table:
CREATE TABLE t1(id integer primary key);
postgres 9.4.4_x64
pqxx::connection c1(conn_str);
pqxx::connection c2(conn_str);
pqxx::transaction<pqxx::isolation_level::serializable> t1(c1);
t1.exec("INSERT INTO t1 (id) VALUES (25)");
pqxx::transaction<pqxx::isolation_level::serializable> t2(c2);
t2.exec("INSERT INTO t1 (id) VALUES (25)"); //hang here
t2.commit();
t1.commit();
my program hang forever. hang in PQexec function. Why? Is i think it must rollback one of transaction? but no? just hang.
UPDATE: same result for pure libpq:
c1 = PQconnectdb(conninfo);
c2 = PQconnectdb(conninfo);
res1 = PQexec(c1, "BEGIN");
PQclear(res1);
res1 = PQexec(c1, "INSERT INTO t1 (id) VALUES (104)");
PQclear(res1);
res2 = PQexec(c2, "BEGIN");
PQclear(res2);
res2 = PQexec(c2, "INSERT INTO t1 (id) VALUES (104)");
PQclear(res2);
res2 = PQexec(c2, "END");
PQclear(res2);
res1 = PQexec(c1, "END");
PQclear(res1);
postgresql 9.1 - same hang

The hang has nothing to do with the serializable isolation level.
I'm no libpqxx expert, but your example appears to be running both transactions in a single thread. That's your problem.
t2.exec("INSERT INTO t1 (id) VALUES (25)");
The above statement has to wait for t1 to commit or rollback before completing, but t1.commit() never gets a chance to execute. Deadlock! This is absolutely normal, and will happen regardless of your chosen isolation level. This is just a consequence of trying to run statements from 2 concurrent transactions in the same thread of execution, not a good idea.
Try running both transactions on different threads, and your hang will go away.

If only the two transaction are involved, you should get a unique violation error for transaction t2 - exactly the same as with default READ COMMITTED isolation level:
ERROR: duplicate key value violates unique constraint "t1_pkey"
DETAIL: Key (id)=(25) already exists.
t1 tried the insert first and wins regardless which transaction tries to commit first. All following transactions trying to insert the same key wait for the first. Again, this is valid for READ COMMITTED and SERIALIZABLE alike.
A possible explanation would be that a third transaction is involved, which tried to insert the same key first and is still open. Or several such transactions, artefacts of your tests.
All transactions wait for the first one that tried to insert the same key. If that one commits, all other get a unique violation. If it rolls back, the next in line gets its chance.
To check look at pg_stat_activity (being connected to the same database):
SELECT * FROM pg_stat_activity;
More specifically:
SELECT * FROM pg_stat_activity WHERE state = 'idle in transaction';
Then commit / rollback the idle transaction. Or brute-force: terminate that connection. Details:
psql: FATAL: too many connections for role

form postgres team:
"In concurrent programming, a deadlock is a situation in which two or more competing actions are each waiting for the other to finish, and thus neither ever does."
https://en.wikipedia.org/wiki/Deadlock
definition it is not a deadlock in the first place. "c1" is not waiting for "c2" to finish and can happily go about its business with id=104. However, your application never gives c1 the next command because it is stubbornly waiting for "c2" to perform its insert - which it cannot due to the lock held by "c1".
Lock testing can be done within a single thread but deadlock testing implies that both connections need to simultaneously be attempting to execute a command - which a single program cannot accomplish without threading.
David J.

SQL Server Insert query for a forum

Considering a forum table and many users simultaneously inserting messages into it, how safe is this transaction?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
DECLARE #LastMessageId SMALLINT
SELECT #LastMessageId = MAX(MessageId)
FROM Discussions
WHERE ForumId = #ForumId AND DiscussionId = #DiscussionId
INSERT INTO Discussions
(ForumId, DiscussionId, MessageId, ParentId, MessageSubject, MessageBody)
VALUES
(#ForumId, #DiscussionId, #LastMessageId + 1, #ParentId, #MessageSubject, #MessageBody)
IF ##ERROR = 0
BEGIN
COMMIT TRANSACTION
RETURN 0
END
ROLLBACK TRANSACTION
RETURN 1
Here I read last MessageId and increment it. I can't use Identity field because it needs to be incremented for every message inserted in a group (not every message insert into table.)

Your transaction should be quite safe indeed - check out the MSDN docs on the SERIALIZABLE transaction level:
SERIALIZABLE
Specifies the following:
Statements cannot read data that has been modified but not yet
committed by other transactions.
No other transactions can modify data that has been read by the
current transaction until the current
transaction completes.
Other transactions cannot insert new rows with key values that
would fall in the range of keys read
by any statements in the current
transaction until the current
transaction completes.
Range locks are placed in the range of key values that match the
search conditions of each statement
executed in a transaction. This blocks
other transactions from updating or
inserting any rows that would qualify
for any of the statements executed by
the current transaction. This means
that if any of the statements in a
transaction are executed a second
time, they will read the same set of
rows. The range locks are held until
the transaction completes. This is the
most restrictive of the isolation
levels because it locks entire ranges
of keys and holds the locks until the
transaction completes. Because
concurrency is lower, use this option
only when necessary. This option has
the same effect as setting HOLDLOCK on
all tables in all SELECT statements in
a transaction.
The main problem with this transaction isolation level is that it's a pretty heavy load on the server, and serializes (as the name implies) any access, so your server performance and scalability will suffer, e.g. with very high numbers of users, you'll possibly get lots of timeouts for users waiting for a transaction to finish.
So using the more lightweight approach of a global message id as INT IDENTITY is definitely much better!

How do I only select rows that have been committed - sql2008

How do I select all rows for a table, their isn't part of any transaction that hasn't committed yet?
Example:
Let's say,
Table T has 10 rows.
User A is doing a transaction with some queries:
INSERT INTO T (...)
SELECT ...
FROM T
// doing other queries
Now, here comes the tricky part:
What if User B, in the time between User A inserted the row and the transaction was committed, was updating a list in the system with a select on Table T.
I only want that the SELECT User B is using returned the 10 rows(all rows from the table, that can't later be rolled back). How do I do this, if it's even possible?
I have tried setting the isolationlevel on the transaction and adding "WITH(NOLOCK)" "WITH(READUNCOMMITTED)" to the query without any luck.
The query either return all 11 records or it's waiting for the transaction to commit, and that's not what I need.
Any tips is much appriciated, thanks.

You need to use (default) read committed isolation level and the READPAST hint to skip rows locked as they are not committed (rather than being blocked waiting for the locks to be released)
This does rely on the INSERT taking out rowlocks though. If it takes out page locks you will be back to being blocked. Example follows
Connection 1
IF OBJECT_ID('test_readpast') IS NULL
BEGIN
CREATE TABLE test_readpast(i INT PRIMARY KEY CLUSTERED)
INSERT INTO test_readpast VALUES (1)
END
BEGIN TRAN
INSERT INTO test_readpast
WITH(ROWLOCK)
--WITH(PAGLOCK)
VALUES (2)
SELECT * FROM sys.dm_tran_locks WHERE request_session_id=##SPID
WAITFOR DELAY '00:01';
ROLLBACK
Connection 2
SELECT i
FROM test_readpast WITH (readpast)

Snapshot isolation ?
Either I or else the three people who have answered early have misread/ misinterpreted your question, so I have given a link so you can determine for yourself.

Actually, read uncommitted and nolock are the same. They mean you get to see rows that have not been committed yet.
If you run at the default isolation level, read committed, you will not see new rows that have not been committed. This should work by default, but if you want to be sure, prefix your select with set transaction isolation level read committed.

SQL problem - high volume trans and PK violation

I'm writing a high volume trading system. We receive messages at around 300-500 per second and these messages then need to be saved to the database as quickly as possible. These messages get deposited on a Message Queue and are then read from there.
I've implemented a Competing Consumer pattern, which reads from the queue and allows for multithreaded processing of the messages. However I'm getting a frequent primary key violation while the app is running.
We're running SQL 2008. The sample table structure would be:
TableA
{
MessageSequence INT PRIMARY KEY,
Data VARCHAR(50)
}
A stored procedure gets invoked to persist this message and looks something like this:
BEGIN TRANSACTION
INSERT TableA(MessageSequence, Data )
SELECT #MessageSequence, #Data
WHERE NOT EXISTS
(
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = #MessageSequence
)
IF (##ROWCOUNT = 0)
BEGIN
UPDATE TableA
SET Data = #Data
WHERE MessageSequence = #MessageSequence
END
COMMIT TRANSACTION
All of this is in a TRY...CATCH block so if there's an error, it rolls back the transaction.
I've tried using table hints, like ROWLOCK, but it hasn't made a difference. Since the Insert is evaluated as a single statement, it seems ludicrous that I'm still getting a 'Primary Key on insert' issue.
Does anyone have an idea why this is happening? And have you got ANY ideas which may point me in the direction of a solution?

Why is this happening?
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = #MessageSequence
This SELECT will try to locate the row, if not found the EXISTS operator will return FALSE and the INSERT will proceed. Hoewever, the decision to INSERT is based on a state that was true at the time of the SELECT, but that is no longer guaranteed to be true at the time of the INSERT. In other words, you have race conditions where two threads can both look up the same #MessageSequence, both return NOT EXISTS and both try to INSERT, when only the first one will succeed, second one will cause a PK violation.
How do I solve it?
The quickest fix is to add a WITH (UPDLOCK) hint to the SELECT, this will enforce the lock placed on the #MessageSequence key to be retained and thus the INSERT/SELECT to behave atomically:
INSERT TableA(MessageSequence, Data )
SELECT #MessageSequence, #Data
WHERE NOT EXISTS (
SELECT TOP 1 MessageSequence FROM TableA WITH(UPDLOCK) WHERE MessageSequence = #MessageSequence)
To prevent SQL from doing fancy stuff like page lock, you can also add the ROWLOCK hint.
However, that is not my recommendation. My recommendation may surpise you, but is this: do the operation that is most likely to succeed and handle the error if it failed. Ie. if your business case makes it more likely for the #MessageSequnce to be new, try an INSERT and handle the PK if it failed. This way you avoid the spurious look-ups, and hte cost of the catch/retry is amortized over the many cases when it succeeds from the first try.
Also, it is perhaps worth investigating using the built-in queues that come with SQL Server.

Common problem. Explained here:
Defensive database programming: eliminating IF statements

It might be related to the transaction isolation level. You might need
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
before you start the transaction.
Also, if you have more updates than inserts, you should try the update first and check rowcount and do the insert second.

This is very similar to post 939831. Ultimately you want to use the hints (ROWLOCK, READPAST, UPDLOCK). READPAST tells sql server to skip to the next record if the current one is locked. UPDLOCK tells sql server that the read lock is going to escalate to an update lock.
When I implemented something similar I locked the next record by the threadID
UPDATE TOP (1)
foo
SET
ProcessorID = #PROCID
FROM
OrderTable foo WITH (ROWLOCK, READPAST, UPDLOCK)
WHERE
ProcessorID = 0
Then selected the record
SELECT *
FROM foo WITH (NOLOCK)
WHERE ProcessorID = #PROCID
Then marked it as processed
UPDATE foo
SET ProcessorID = -1
WHERE ProcessorID = #PROCID
Later in off hours I perform the relatively expensive operation of performing the delete operation to clear the queue of processed records.

The atomicity of the following statement is what you are after:
INSERT TableA(MessageSequence, Data )
SELECT #MessageSequence, #Data
WHERE NOT EXISTS
(
SELECT TOP 1 MessageSequence FROM TableA WHERE MessageSequence = #MessageSequence
)
According to this person, it depends on the current isolation level.

On a tangent, if you're thinking of a high volume trading system you might want to consider a tick database designed for such data [I'm not exactly sure what "message" you are storing here], such as discussed in this thread for example: http://www.elitetrader.com/vb/showthread.php?threadid=81345.
These are typically in-memory solutions with proprietary query languages. We use kdb+ at our shop.

Not sure what Messaging product you use - but it may be worth looking at the transactions not at the DB level, but at the MQ Level.
Of course, if you are using a TM (Transaction manager), the two operations : 1)Get from MQ and 2)Write to DB are both 'bracketed' under the same parent commit.
So I am not sure if you are using an implicit or explicit or any TM here (for example, Microsoft's DTC).
MessageSequence is the PK, so could the same Message from the MQ be getting processed twice.
When you perform a 'GET" from MQ, make sure the GET is committed (i.e. not a db-commit, but a MQ-commit) - that will ensure the same MessageID cannot be 'popped' by the next thread that writes messages to the DB.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas