Is it possible to lock on a value of a column in SQL Server? - sql

I have a table that looks like that:
Id GroupId
1 G1
2 G1
3 G2
4 G2
5 G2
It should at any time be possible to read all of the rows (committed only). When there will be an update I want to have a transaction that will lock on group id, i.e. there should at any given time be only one transaction that attempts to update per GroupId.
It should ideally be still possible to read all committed rows (i.e. other transaction/ordinary reads that will not try to acquire the "update per group lock" should be still able to read).
The reason I want to do this is that an update can not rely on "outdated" data. I.e. I do make some calculations in a transaction and another transaction cannot edit row with id 1 or add a new row with the same GroupId after these rows were read by the first transaction (even though the first transaction would never modify the row itself it will be dependent on it's value).
Another "nice to have" requirement is that sometimes I would need the same requirement "cross group", i.e. the update transaction would have to lock 2 groups at the same time. (This is not a dynamic number of groups, but rather just 2)

Here are some ideas. I don't think any of them are perfect - I think you will need to give yourself a set of use-cases and try them. Some of the situations I tried after applying locks
SELECTs with the WHERE filter as another group
SELECTs with the WHERE filter as the locked group
UPDATES on the table with the WHERE clause as another group
UPDATEs on the table where ID (not GrpID!) was not locked
UPDATEs on the table where the row was locked (e.g., IDs 1 and 2)
INSERTs into the table with that GrpId
I have the funny feeling that none of these will be 100%, but the most likely answer is the second one (setting the transaction isolation level). It will probably lock more than desired, but will give you the isolation you need.
Also one thing to remember: if you lock many rows (e.g., there are thousands of rows with the GrpId you want) then SQL Server can escalate the lock to be a full-table lock. (I believe the tipping point is 5000 locks, but not sure).
Old-school hackjob
At the start of your transaction, update all the relevant rows somehow e.g.,
BEGIN TRAN
UPDATE YourTable
SET GrpId = GrpId
WHERE GrpId = N'G1';
-- Do other stuff
COMMIT TRAN;
Nothing else can use them because (bravo!) they are a write within a transaction.
Convenient - set isolation level
See https://learn.microsoft.com/en-us/sql/relational-databases/sql-server-transaction-locking-and-row-versioning-guide?view=sql-server-ver15#isolation-levels-in-the-
Before your transaction, set the isolation level high e.g., SERIALIZABLE.
You may want to read all the relevant rows at the start of your transaction (e.g., SELECT Grp FROM YourTable WHERE Grp = N'Grp1') to lock them from being updated.
Flexible but requires a lot of coding
Use resource locking with sp_getapplock and sp_releaseapplock.
These are used to lock resources, not tables or rows.
What is a resource? Well, anything you want it to be. In this case, I'd suggest 'Grp1', 'Grp2' etc. It doesn't actually lock rows. Instead, you ask (via sp_getapplock, or APPLOCK_TEST) whether you can get the resource lock. If so, continue. If not, then stop.
Anything code referring to these tables needs to be reviewed and potentially modified to ask if it's allowed to run or not. If something doesn't ask for permission and just does it, there's no actual real locks stopping it (except via any transactions you've explicity specified).
You also need to ensure that errors are handled appropriately (e.g., still releasing the app_lock) and that processes that are blocked are re-tried.

Related

Get distinct sets of rows for an INSERT executed in concurrent transactions

I am implementing a simple pessimistic locking mechanism using Postgres as a medium. The goal is that multiple instances of an application can simultaneously acquire locks on distinct sets of users.
The app instances are not trying to lock specific users. Instead they will take any user locks they can get.
Say, for example, we have three instances of the app running and there are currently 5 users that are not locked. All three instances attempt to acquire locks on up to three users at the same time. Their requests are served in arbitrary order.
Ideally, the first instance served would acquire 3 user locks, the second would acquire 2 and the third would acquire no locks.
So far I have not been able to write a Query that accomplishes this. I'll show you my best attempt so far.
Here are the tables for the example:
CREATE TABLE my_locks (
id bigserial PRIMARY KEY,
user_id bigint NOT NULL UNIQUE
);
CREATE TABLE my_users (
id bigserial PRIMARY KEY,
user_name varchar(50) NOT NULL
);
And this is the Query for acquiring locks:
INSERT INTO my_locks(user_id)
SELECT u.id
FROM my_users AS u
LEFT JOIN my_locks
AS l
ON u.id = l.user_id
WHERE l.user_id IS NULL
LIMIT 3
RETURNING *
I had hoped that folding the collecting of lockable users and the insertion of the locks into the database into a single query would ensure that multiple simultaneous requests would be processed in their entirety one after the other.
It doesn't work that way. If applied to the above example where three instances use this Query to simultaneously acquire locks on a pool of 5 users, one instance acquires three locks and the other instances receive an error for attempting to insert locks with non-unique user-IDs.
This is not ideal, because it prevents the locking mechanism from scaling. There are a number of workarounds to deal with this, but what I am looking for is a database-level solution. Is there a way to tweak the Query or DB configuration in such a way that multiple app instances can (near-)simultaneously acquire the maximum available number of locks in perfectly distinct sets?
The locking clause SKIP LOCKED should be perfect for you. Added with Postgres 9.5.
The manual:
With SKIP LOCKED, any selected rows that cannot be immediately locked are skipped.
FOR NO KEY UPDATE should be strong enough for your purpose. (Still allows other, non-exclusive locks.) And ideally, you take the weakest lock that's strong enough.
Work with just locks
If you can do your work while a transaction locking involved users stays open, then that's all you need:
BEGIN;
SELECT id FROM my_users
LIMIT 3
FOR NO KEY UPDATE SKIP LOCKED;
-- do some work on selected users here !!!
COMMIT;
Locks are gathered along the way and kept till the end of the current transaction. While the order can be arbitrary, we don't even need ORDER BY. No waiting, no deadlock possible with SKIP LOCKED. Each transaction scans over the table and locks the first 3 rows still up for grabs. Very cheap and fast.
Since transaction might stay open for a while, don't put anything else into the same transaction so not to block more than necessary.
Work with lock table additionally
If you can't do your work while a transaction locking involved users stays open, register users in that additional table my_locks.
Before work:
INSERT INTO my_locks(user_id)
SELECT id FROM my_users u
WHERE NOT EXISTS (
SELECT FROM my_locks l
WHERE l.user_id = u.id
)
LIMIT 3
FOR NO KEY UPDATE SKIP LOCKED
RETRUNGING *;
No explicit transaction wrapper needed.
Users in my_locks are excluded in addition to those currently locked exclusively. That works under concurrent load. While each transaction is open, locks are active. Once those are released at the end of the transaction they have already been written to the locks table - and are visible to other transaction at the same time.
There's a theoretical race condition for concurrent statements not seeing newly committed rows in the locks table just yet, and grabbing the same users after locks have just been released. But that would fail trying to write to the locks table. A UNIQUE constraint is absolute and will not allow duplicate entries, disregarding visibility.
Users won't be eligible again until deleted from your locks table.
Further reading:
Postgres UPDATE ... LIMIT 1
Select rows which are not present in other table
Aside:
... multiple simultaneous requests would be processed in their entirety one after the other.
It doesn't work that way.
To understand how it actually works, read about the Multiversion Concurrency Control (MVCC) of Postgres in the manual, starting here.

Some confusion on the description of read consistency in Oracle

Below is a short brief of read consistency from oracle concepts guide.
What is a sql statement, just one sql? Or Pl/SQL or Store Procedure? Anyone can help provide me one opposite example which can indicates the un-consistency read?
read consistency
A consistent view of data seen by a user. For example, in statement-level read
consistency the set of data seen by a SQL statement remains constant throughout
statement execution.
A "statement" in this context is one DML statement: a single SELECT, INSERT, UPDATE, DELETE, MERGE.
It is not a PL/SQL block. Similarly, multiple executions of the same DML statement (say, within a PL/SQL loop) are separate "statements". If you need consistency over multiple statements or within a PL/SQL block, you can achieve that using SET TRANSACTION ISOLATION LEVEL SERIALIZABLE or SET TRANSACTION READ ONLY. Both introduce limitations.
An opposite example of an inconsistent read would be as follows.
Starting conditions: table BIG_TABLE has 10 million rows.
User A at 10:00:
SELECT COUNT(*) FROM BIG_TABLE;
User B at 10:01:
DELETE FROM BIG_TABLE WHERE ID >= 9000000; -- delete the last million rows
User B at 10:02:
COMMIT;
User A at 10:03: query completes:
COUNT(*)
--------------
9309129
That is wrong. User A should have either gotten 10 million rows or 9 million rows. At no point were there 9309129 committed rows in the table. What has happened is that user A had read 309,129 rows that user B was deleting before Oracle actually processed the deletion (or before the COMMIT). Then, after the user B delete/commit, user A's query stopped seeing the deleted rows and stopped counting them.
This sort of problem is impossible in Oracle, thanks to its implementation of Multiversion Read Consistency.
In Oracle, in the above situation, as it encountered blocks that had rows deleted (and committed) by User B, User A's query would have used the UNDO data reconstruct what those blocks looked like at 10:00 -- the time when user A's query started.
That's basically it -- Oracle statements operate on the a version of the database as it existed as of a single point in time. This point in time is almost always the time when the statement started. There are some exception cases involving updates when that point in time will be moved to a point in time "mid statement". But it is always consistent as of one point in time or another.

Lock issues on large recordset

I have a database table that I use as a queue system, where separate process that talk to each other create and read entries in the table. For example, when a user initiates a search an entry is created, then another process that runs every second or two will pick up that new entry, update the status and then do a search, updating the entry again when the search is complete. This all seems to work well with thousands of searches per hour.
However, I have a master admin screen that lets me view the status of all of these 'jobs' but it runs very slowly. I basically return all entries in the table for the last hour so I can keep an eye on what's going on. I think that I am running into lock issues of some sort. I only need to read each entry, and don't really care if it the data is a little bit out of date. I just use a standard 'Select * from Table' statement so maybe it is waiting for other locks to expire before returning data as the jobs are constantly updating the data.
Would this be handled better by a certain kind of cursor to return each row one at a time, etc? Any other ideas?
Thanks
If you really don't care if the data is a bit out of date... or if you only need the data to be 99.99% accurate, consider using WITH (NOLOCK):
SELECT * FROM Table WITH (NOLOCK);
This will instruct your query to use the READ UNCOMMITTED ISOLATION LEVEL, which has the following behavior:
Specifies that dirty reads are allowed. No shared locks are issued to
prevent other transactions from modifying data read by the current
transaction, and exclusive locks set by other transactions do not
block the current transaction from reading the locked data.
Be aware that NOLOCK may cause some inaccuracies in your data, so it probably isn't a good idea to use it throughout the rest of your system.
You need FROM yourtable WITH (NOLOCK) table hint.
You may also want to look at transaction isolation in your update process, if you aren't already
An alternative to NOLOCK (which can lead to very bad things, such as missed rows or duplicated rows) is to allow read committed snapshot isolation at the database level and then issue your query with:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

Are Transactions Always Atomic?

I'm trying to better understand a nuance of SQL Server transactions.
Say I have a query that updates 1,000 existing rows, updating one of the columns to have the values 1 through 1,000. It's possible to execute this query and, when completed, those rows would not be numbered sequentially. This is because it's possible for another query to modify one of those rows before my query finishes.
On the other hand, if I wrap those updates in a transaction, that guarantees that if any one update fails, I can fail all updates. But does it also mean that those rows would be guaranteed to be sequential when I'm done?
In other words, are transactions always atomic?
But does it also mean that those rows would be guaranteed to be sequential when I'm done?
No. This has nothing to do with transactions, because what you're asking for simply doesn't exists: relational tables have no order an asking for 'sequential rows' is the wrong question to ask. You can rephrase the question as 'will the 1000 updated rows contain the entire sequence from 1 to 1000, w/o gaps' ? Most likely yes, but the truth of the matter is that there could be gaps depending on the way you do the updates. Those gaps would not appear because updated rows are modified after the update before commit, but because the update will be a no-op (will not update any row) which is a common problem of read-modify-write back type of updates ( the row 'vanishes' between the read and the write-back due to concurrent operations).
To answer your question more precisely whether your code is correct or not you have to post the exact code you're doing the update with, as well as the exact table structure, including all indexes.
Atomic means the operation(s) within the transaction with either occur, or they don't.
If one of the 1,000 statements fails, none of the operations within the transaction will commit. The smaller the sample of statements within a transaction -- say 100 -- means that the blocks of 100 leading up to the error (say at the 501st) can be committed (the first 400; the 500 block won't, and the 600+ blocks will).
But does it also mean that those rows would be guaranteed to be sequential when I'm done?
You'll have to provide more context about what you're doing in a transaction to be "sequential".
The 2 points are unrelated
Sequential
If you insert values 1 to 1000, it will be sequential with an WHERE and ORDER BY to limit you to these 1000 rows in some column. Unless there are duplicates, so you'd need a unique constraint
If you rely on an IDENTITY, it isn't guaranteed: Do Inserted Records Always Receive Contiguous Identity Values.
Atomicity
All transactions are atomic:
Is neccessary to encapsulate a single merge statement (with insert, delete and update) in a transaction?
SQL Server and connection loss in the middle of a transaction
Does it delete partially if execute a delete statement without transaction?
SQL transactions, like transactions on all database platforms, put the data in isolation to cover the entire ACID acronym (atomic, consistent, isolated and durable). So the answer is yes.
A transaction guarantees atomicity. That is the point.
You problem is that after you do the insert, they are only "Sequential" until the next thing comes along and touches one of the new records.
If another step in you process requires them to still be sequential then that step, too, needs to be within your original transaction.

In Oracle, a way for Updates to not lock rows?

I have an Update query that recalculates -every- column value in exactly one row, each time.
I've been seeing more row-level lock contention, due to these Update queries occurring on the same row.
I'm thinking maybe one solution would be to have subsequent Updates simply preempt any Updates already in progress. Is this possible? Does Oracle support this kind of Update?
To spell out the idea in full:
update query #1 begins, in its own transaction
needs to update row X
acquires lock on row X
update query #2 begins, again in its own transaction
blocks, waiting for query #1 to release the lock on row X.
My thought is, can step 5 simply be: query #1 is aborted, query #2 proceeds. Or maybe dispense with acquiring the row-level lock in the first place.
I realize this logic would be disastrously wrong should the update query be updating only a subset of columns in a given row. But it's not -- every column gets recalculated, each time.
I'd ask whether a physical table is the right mechanism for whatever you are doing. One factor is how transactions needs to be handled. Anything that means "Don't lock for the duration of the transaction" will run into transactional issues.
There are a couple of non-transactional options:
Global context values might be useful (depends if you are on RAC) and how to handle persistence after a restart.
Another option is DBMS_PIPE where you'd have a background process maintaining that table and the separate sessions send messages to that process rather than update the table directly.
Queuing is another thought.
If you just need to need to reduce the time the record is locked, autonomous transactions could be the answer
It's possible to do the opposite of what you're asking, have query 2 fail if query 1 is in progress using SELECT FOR UPDATE and NOWAIT.
Alternatively, you could try to see if you can get the desired effect by adjusting the isolation level, but I do not recommend this without extensive testing, as you don't know what knock-on effects it may have.
Oracle's UPDATE doesn't support any locking hints.
But OraFAQ forum suggests such hacky workaround:
DECLARE
x CHAR(1);
BEGIN
SELECT 'x' INTO x
FROM tablea
WHERE -- your update condition
FOR UPDATE OF cola NOWAIT;
UPDATE tablea
SET cola = value
WHERE -- your update condition
EXCEPTION
WHEN OTHERS THEN
NULL; -- handle the exception
END;