SELECT ... FOR UPDATE SKIP LOCKED in REPEATABLE READ transactions - sql

I have the following statement in my PostgreSQL 10.5 database, which I execute in a repeatable read transaction:
delete from task
where task.task_id = (
select task.task_id
from task
order by task.created_at asc
limit 1
for update skip locked
)
returning
task.task_id,
task.created_at
Unfortunately, when I run it, I sometimes get:
[67] ERROR: could not serialize access due to concurrent update
[67] STATEMENT: delete from task
where task.task_id = (
select task.task_id
from task
order by task.created_at asc
limit $1
for update skip locked
)
returning
task.task_id,
task.created_at
which means the transaction rolled back because some other transaction modified the record in the meantime. (I think?)
I don't quite understand this. How could a different transaction modify a record that was selected with for update skip locked, and deleted?

This quote from the manual discusses your case exactly:
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands
behave the same as SELECT in terms of searching for target rows: they
will only find target rows that were committed as of the transaction
start time. However, such a target row might have already been updated
(or deleted or locked) by another concurrent transaction by the time
it is found. In this case, the repeatable read transaction will wait
for the first updating transaction to commit or roll back (if it is
still in progress). If the first updater rolls back, then its effects
are negated and the repeatable read transaction can proceed with
updating the originally found row. But if the first updater commits
(and actually updated or deleted the row, not just locked it) then the
repeatable read transaction will be rolled back with the message
ERROR: could not serialize access due to concurrent update
This means your transaction was unable to lock the row in the first place, because a concurrent write got there first. SKIP LOCKED cannot fully protect you from this: by the time your statement reaches the row there may no longer be a lock to skip, and a serialization failure still occurs if the row has been changed since your transaction started (and the change committed, which released the lock).
The same statement should work just fine with default READ COMMITTED transaction isolation. Related:
Postgres UPDATE … LIMIT 1
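As a rough sketch of the suggestion above, the same statement can be wrapped in an explicit READ COMMITTED transaction. The table and columns are the ones from the question; spelling out the isolation level in `begin` is only there to make it visible and assumes nothing else in the session overrides it.

```sql
-- Run the dequeue statement under READ COMMITTED (PostgreSQL's default level).
begin isolation level read committed;

delete from task
where task.task_id = (
    select task.task_id
    from task
    order by task.created_at asc
    limit 1
    for update skip locked   -- rows currently locked by other workers are skipped
)
returning
    task.task_id,
    task.created_at;

commit;
```

If REPEATABLE READ is needed for other reasons, the usual alternative is to catch the serialization failure (SQLSTATE 40001) in the application and retry the whole transaction.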

Related

SQL Server READ_COMMITTED_SNAPSHOT Isolation Level - Shared Lock Issue

The Microsoft documentation on transaction isolation levels states:
If READ_COMMITTED_SNAPSHOT is set to OFF (the default on SQL Server), the Database Engine uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. The shared lock type determines when it will be released. Row locks are released before the next row is processed. Page locks are released when the next page is read, and table locks are released when the statement finishes.
In other words, with READ_COMMITTED_SNAPSHOT set to OFF, if I do a SELECT on a certain record inside a transaction, it should hold a shared lock that blocks other transactions from updating that record.
I tested this scenario, but that is not what happens: the UPDATE statement succeeded without blocking.
Why is that? Is the documentation wrong, or am I misunderstanding it?
READ_COMMITTED_SNAPSHOT is currently set to OFF in my database, as the documentation describes.
These are the steps I used to test. I used the Stack Overflow public data dump as my database.
Window #1. Ran the below SELECT query
BEGIN TRANSACTION
SELECT * FROM dbo.Posts WHERE Id=4175774
Window #2. Ran the below UPDATE query
BEGIN TRANSACTION
UPDATE dbo.Posts SET Score=36
WHERE Id=4175774
Expected result:
The UPDATE query should be blocked and not succeed until I commit the transaction in window #1.
Actual result:
The UPDATE query succeeded instantly.
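The quoted documentation says that row locks are released before the next row is processed. That is consistent with the result above: under locking READ COMMITTED the shared lock is already gone by the time window #2 runs. The sketch below is one possible way to get the expected blocking. It assumes the same table as in the test and uses the REPEATABLEREAD table hint, which is not part of the original test, to hold the shared lock until the transaction ends.

```sql
-- Window #1: hold the shared lock on the row for the rest of the transaction.
BEGIN TRANSACTION;
SELECT * FROM dbo.Posts WITH (REPEATABLEREAD) WHERE Id = 4175774;
-- Leave the transaction open for now.

-- Window #2: this UPDATE should now wait until window #1 commits or rolls back.
BEGIN TRANSACTION;
UPDATE dbo.Posts SET Score = 36
WHERE Id = 4175774;
```

Running window #1 with SET TRANSACTION ISOLATION LEVEL REPEATABLE READ instead of the hint should behave the same way for this query.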

Postgres Serialisation Error for SKIP LOCKED with REPEATABLE READ

I am running a query with skip locked
select * from booker.review_tasks where review_task_id in (
2140285001,
2140285031,
2140304551
) for update skip locked ;
and then updating some of the same rows from another concurrent transaction:
update booker.review_tasks set priority = 190 where review_task_id in (2140285001,
2140304551);
I get this error in the first transaction:
was aborted: ERROR: could not serialize access due to concurrent update Call getNextException to see other errors in the batch.
at com.amazon.contentreviewplatformservice.db.RWNamedJdbcTemplate.batchUpdate(RWNamedJdbcTemplate.java:126) ~[ContentReviewPlatformService-1.0.jar:?]
at com.amazon.contentreviewplatformservice.db.dao.DAOUtils.changeUserTaskAssignmentStatus(DAOUtils.java:126) ~[ContentReviewPlatformService-1.0.jar:?]
at com.amazon.contentreviewplatformservice.workflow.dao.WorkflowManagerDAO.changeReviewState(WorkflowManagerDAO.java:784) ~[ContentReviewPlatformService-1.0.jar:?]
How can a row be updated while it is locked? And if locked rows are skipped, why is there a concurrent update error at all?
The SELECT ... FOR UPDATE first selects the rows and then locks them. It sees the database in the state of its transaction snapshot, which was taken by the first statement in the REPEATABLE READ transaction, possibly a while back.
After the SELECT ... FOR UPDATE has found a row, it will try to lock it. Now if the UPDATE has run at any time between the time the snapshot was taken and the attempt to lock the row, PostgreSQL determines that the row version it would have to lock is different from the row version the transaction sees and throws a serialization error.
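The sequence described above can be sketched as a hypothetical interleaving of two sessions. The table and IDs are the ones from the question. The `select count(*)` is only an assumed stand-in for whatever first statement fixed the snapshot in the real application.

```sql
-- Session A: the snapshot is taken by the first statement in the transaction.
begin isolation level repeatable read;
select count(*) from booker.review_tasks;

-- Session B: runs and commits after session A's snapshot was taken.
begin;
update booker.review_tasks set priority = 190
where review_task_id in (2140285001, 2140304551);
commit;   -- the row locks are released here, so there is nothing left to skip

-- Session A: the updated rows are no longer locked, but their current
-- versions are newer than the snapshot, so the attempt to lock them fails.
select * from booker.review_tasks
where review_task_id in (2140285001, 2140285031, 2140304551)
for update skip locked;
-- ERROR:  could not serialize access due to concurrent update
```

SKIP LOCKED only helps while session B still holds its locks. Once session B has committed, the row is unlocked but changed, which is the case REPEATABLE READ rejects.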

How can I read dirty values in SQL UPDATE statement WHERE clause

Let's assume I have the following query in two separate SSMS query windows:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRANSACTION
UPDATE dbo.Jobs
SET [status] = 'Running'
OUTPUT Inserted.*
WHERE [status] = 'Waiting'
--I'm NOT committing yet
--Commit Transaction
I run query window 1 (but do not commit), and then I run query window 2.
I want query window 2 to immediately update only the rows that were inserted after I started query 1 (all new records come in with a status of 'Waiting'). However, SQL Server waits for the first query to finish, because an UPDATE statement does not read dirty values, even when the isolation level is set to READ UNCOMMITTED.
Is there a way to overcome this?
In my application I will have two (or more) processes running this query. I want process 2 to be able to pick up the rows that process 1 has not picked up, without having to wait until process 1 has finished.
What you are asking for is simply impossible.
Even at the lowest isolation level of READ UNCOMMITTED (aka NOLOCK), an X-Lock (exclusive) must be taken in order to make modifications. In other words, writes are always locked, even if the reads that fetched those rows were not locked.
So even though session 2 is running under READ UNCOMMITTED also, if it wants to do a modification it must also take an X-Lock, which is incompatible with the first X-Lock.
The solution here is to either do this in one session, or commit immediately. In any case, do not hold locks for any length of time, as it can cause massive blocking chains and even deadlocks.
If you just want to skip the rows that another session currently has locked, you could use the WITH (READPAST) hint.
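A minimal sketch of the READPAST approach, assuming the dbo.Jobs table from the question. READPAST is only allowed at the READ COMMITTED or REPEATABLE READ isolation levels, so the session is switched away from READ UNCOMMITTED here; that switch is a change from the original script.

```sql
-- READPAST cannot be combined with READ UNCOMMITTED.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

BEGIN TRANSACTION;

-- Rows locked by another session are skipped instead of waited on.
UPDATE dbo.Jobs WITH (READPAST)
SET [status] = 'Running'
OUTPUT Inserted.*
WHERE [status] = 'Waiting';

COMMIT TRANSACTION;
```

Each process then claims only the 'Waiting' rows that no other process holds a lock on. How well this works in practice likely depends on an index on [status], so that the statement takes row-level rather than coarser locks.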
READ UNCOMMITTED as an isolation level or as a hint has huge issues.
It can cause anything from deadlocks to completely incorrect results. For example, you could read a row twice, or not at all, when by the logical definition of the schema there should have been exactly one row. You could read entire pages twice or not at all.
You can get deadlocks due to U-Locks not being taken in UPDATE and DELETE statements.
And you still take schema locks, so you can still get stuck behind a synchronous statistics update or an index rebuild.

How select for update works for queries with multiple rows/results?

So given this transaction:
select * from table_a where field_a = 'A' for update;
Assuming this returns multiple rows, will the database lock all of them right off the bat, or will it lock them one row at a time?
If the latter is true, does that mean running this query concurrently can result in a deadlock?
If so, is adding an ORDER BY, so that rows are locked in a consistent order, needed to solve this problem?
The documentation explains what happens as follows:
FOR UPDATE
FOR UPDATE causes the rows retrieved by the SELECT statement to be
locked as though for update. This prevents them from being locked,
modified or deleted by other transactions until the current
transaction ends. That is, other transactions that attempt UPDATE,
DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE
or SELECT FOR KEY SHARE of these rows will be blocked until
the current transaction ends; conversely, SELECT FOR UPDATE will
wait for a concurrent transaction that has run any of those commands
on the same row, and will then lock and return the updated row (or no
row, if the row was deleted). Within a REPEATABLE READ or
SERIALIZABLE transaction, however, an error will be thrown if a row
to be locked has changed since the transaction started. For further
discussion see Section 13.4.
The direct answer to your question is that Postgres cannot lock all the rows "right off the bat"; it has to find them first. Remember, this is row-level locking rather than table-level locking.
The documentation includes this note:
SELECT FOR UPDATE modifies selected rows to mark them locked, and so
will result in disk writes.
I interpret this as saying that Postgres executes the SELECT query and as it finds the rows, it marks them as locked. The lock (for a given row) starts when Postgres identifies the row. It continues until the end of the transaction.
Based on this, I think it is possible for a deadlock situation to arise using SELECT FOR UPDATE.
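One common way to reduce that deadlock risk, as the question suggests, is to make every transaction lock the rows in the same order. The sketch below assumes `table_a` has a unique `id` column; that column is hypothetical and does not appear in the question.

```sql
begin;

-- Both concurrent transactions request the row locks in ascending id order,
-- so one waits behind the other instead of each holding rows the other needs.
select *
from table_a
where field_a = 'A'
order by id
for update;

-- ... work with the locked rows ...

commit;
```

This is a mitigation rather than a guarantee: other statements that lock the same rows in a different order can still deadlock with it.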

Delete and Insert Inside one Transaction SQL

I just want to ask whether the first query is always executed first when both are wrapped in a transaction. For example, I have 500k records to delete and 500k to insert; is there a possibility of locking problems?
I have already tested this query and it works fine, but I want to make sure my assumption is correct.
Note: this deletes and re-inserts the same records, possibly with updates to other columns.
BEGIN TRAN;
DELETE FROM [OUTPUT TABLE] WHERE ID IN (1, 2, 3, 4 /* etc. */);
INSERT INTO [OUTPUT TABLE] VALUES (1, 2, 3, 4 /* etc. */);
COMMIT TRAN;
Within a transaction all write locks (all locks acquired for modifications) must obey the strict two phase locking rule. One of the consequences is that a write (X) lock acquired in a transaction cannot be released until the transaction commits. So yes, the DELETE and INSERT will execute sequentially and all locks acquired during the DELETE will be retained while executing the INSERT.
Keep in mind that deleting 500k rows in one transaction will most likely escalate the row locks to a single table lock; see Lock Escalation.
Deleting 500k rows and inserting 500k rows in a single transaction, while maybe correct, is a bad idea. You should avoid such large units of work and long transactions if possible. Long transactions pin the log in place, create blocking and contention, increase recovery and database startup time, and increase SQL Server resource consumption (locks require memory).
You should consider doing the operation in small batches (perhaps 10,000 rows at a time), using MERGE instead of DELETE/INSERT (if possible) and, last but not least, a partitioned sliding window implementation; see How to Implement an Automatic Sliding Window in a Partitioned Table.
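The batching suggestion might look roughly like the sketch below. The names `dbo.OutputTable` and `dbo.IdsToReplace` (a staging table holding the IDs to delete) are hypothetical and stand in for the real objects.

```sql
DECLARE @BatchSize int = 10000;
DECLARE @Deleted int = 1;

-- Delete in short transactions until no matching rows remain.
WHILE @Deleted > 0
BEGIN
    BEGIN TRAN;

    DELETE TOP (@BatchSize)
    FROM dbo.OutputTable
    WHERE ID IN (SELECT ID FROM dbo.IdsToReplace);

    -- Capture the row count immediately, before any other statement resets it.
    SET @Deleted = @@ROWCOUNT;

    COMMIT TRAN;
END;
```

Each batch commits on its own, so locks and log space are released as the loop goes. The trade-off is that the overall delete and insert is no longer atomic, and other sessions can see the table in a partially replaced state.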
From the documentation on TRANSACTION (emphasis mine):
BEGIN TRANSACTION represents a point at which the data referenced by a
connection is logically and physically consistent. If errors are
encountered, all data modifications made after the BEGIN TRANSACTION
can be rolled back to return the data to this known state of
consistency. Each transaction lasts until either it completes without
errors and COMMIT TRANSACTION is issued to make the modifications a
permanent part of the database, or errors are encountered and all
modifications are erased with a ROLLBACK TRANSACTION statement.
BEGIN TRANSACTION starts a local transaction for the connection
issuing the statement. Depending on the current transaction isolation
level settings, many resources acquired to support the Transact-SQL
statements issued by the connection are locked by the transaction
until it is completed with either a COMMIT TRANSACTION or ROLLBACK
TRANSACTION statement. Transactions left outstanding for long periods
of time can prevent other users from accessing these locked resources,
and also can prevent log truncation.
Although BEGIN TRANSACTION starts a local transaction, it is not
recorded in the transaction log until the application subsequently
performs an action that must be recorded in the log, such as executing
an INSERT, UPDATE, or DELETE statement. An application can perform
actions such as acquiring locks to protect the transaction isolation
level of SELECT statements, but nothing is recorded in the log until
the application performs a modification action.