Delete Statement and Transaction for One Row in a Table - SQL

I have only 1 row in a table. Suppose I use the following statements:
BEGIN TRANSACTION
DELETE FROM mytable   -- if we check the table now, the row is deleted
ROLLBACK TRANSACTION  -- the row is available in the table again
My question is: when we ROLLBACK the transaction, where does SQL Server restore the data from, i.e. where is the data temporarily held during the transaction?
Thanks

All of the data modifications within the transaction are stored within the transaction log, with additional space also reserved in the log for the undo records, in the event that it has to roll back. Each transaction log record has sufficient information within it to reverse the change it has made, so that the change can be undone if required.
If we take a simple delete operation as an example, the record being deleted is stored inside the transaction log entry of LOP_DELETE_ROWS, and with some non-trivial effort you can decode it and demonstrate that the entire row is within the log entry.
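You can verify this yourself with the undocumented fn_dblog function, which exposes the active portion of the transaction log. A minimal sketch (fn_dblog is unsupported and its output format can differ between SQL Server versions):

BEGIN TRANSACTION;
DELETE FROM mytable;

-- the before-image of each deleted row travels in the
-- LOP_DELETE_ROWS log record, in the [RowLog Contents 0] column
SELECT [Current LSN], Operation, AllocUnitName, [RowLog Contents 0]
FROM fn_dblog(NULL, NULL)
WHERE Operation = 'LOP_DELETE_ROWS';

ROLLBACK TRANSACTION;  -- the undo information re-inserts the row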
If the transaction is rolled back, the undo space reserved in the log is used and the row is re-inserted. The reason for reserving the undo space up front is to ensure that the transaction log cannot fill up mid-transaction, leaving no space to complete or roll back.

Related

Hibernate + SQL Server: One Transaction blocks all other transactions

I am developing a server application with Spring + Hibernate + SQL Server, and I noticed that all my transactions block other transactions, even if the other transactions do not touch the same tables/rows and there are no relationships between these tables.
Here is a screenshot:
Transaction Report
The screenshot shows that a delete statement on table A blocks a select statement on table B. But there is no relationship between the tables.
In my understanding, a transaction should only lock a table or row, and only other transactions that hit the locked table or row should be blocked.
But why are all transactions blocked?
Do I misunderstand anything?
Solution was found:
The process was a copy process. At the beginning I opened one transaction for the whole process and copied n objects inside it. When I needed to do selects while copying, the selects were blocked. So I decided to open a transaction every time I copy one object, and to commit and close that transaction as soon as the object is copied. Now selects are no longer blocked. This design is like the copy process in Windows: if you cancel the copy, the files that were already copied stay. Likewise, in my case the already-copied objects stay, because the transaction was closed and committed for every object separately.
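Expressed in SQL, the pattern that fixed it looks roughly like this (a minimal T-SQL sketch; source_objects and target_objects are hypothetical names, and in the real application the per-object transactions were demarcated by Spring rather than hand-written SQL):

DECLARE @id int;
DECLARE object_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT id FROM source_objects;
OPEN object_cursor;
FETCH NEXT FROM object_cursor INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRANSACTION;   -- one short transaction per object...
    INSERT INTO target_objects (id, payload)
        SELECT id, payload FROM source_objects WHERE id = @id;
    COMMIT TRANSACTION;  -- ...so its locks are released immediately
    FETCH NEXT FROM object_cursor INTO @id;
END;
CLOSE object_cursor;
DEALLOCATE object_cursor;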

SELECT ... FOR UPDATE SKIP LOCKED in REPEATABLE READ transactions

I have the following statement in my PostgreSQL 10.5 database, which I execute in a repeatable read transaction:
delete from task
where task.task_id = (
    select task.task_id
    from task
    order by task.created_at asc
    limit 1
    for update skip locked
)
returning
    task.task_id,
    task.created_at
Unfortunately, when I run it, I sometimes get:
[67] ERROR: could not serialize access due to concurrent update
[67] STATEMENT: delete from task
where task.task_id = (
select task.task_id
from task
order by task.created_at asc
limit $1
for update skip locked
)
returning
task.task_id,
task.created_at
which means the transaction rolled back because some other transaction modified the record in the meantime. (I think?)
I don't quite understand this. How could a different transaction modify a record that was selected with for update skip locked, and deleted?
This quote from the manual discusses your case exactly:
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands
behave the same as SELECT in terms of searching for target rows: they
will only find target rows that were committed as of the transaction
start time. However, such a target row might have already been updated
(or deleted or locked) by another concurrent transaction by the time
it is found. In this case, the repeatable read transaction will wait
for the first updating transaction to commit or roll back (if it is
still in progress). If the first updater rolls back, then its effects
are negated and the repeatable read transaction can proceed with
updating the originally found row. But if the first updater commits
(and actually updated or deleted the row, not just locked it) then the
repeatable read transaction will be rolled back with the message
ERROR: could not serialize access due to concurrent update
Meaning, your transaction was unable to lock the row to begin with, because concurrent write access got there first. SKIP LOCKED cannot save you from this completely: there may not be a lock to skip anymore, and we still run into a serialization failure if the row has already been changed (and the change committed, hence the lock released) since the transaction started.
The same statement should work just fine with the default READ COMMITTED transaction isolation; see the sketch below. Related:
Postgres UPDATE … LIMIT 1
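A sketch of that (the explicit SET line is redundant in practice, since READ COMMITTED is already PostgreSQL's default; it is spelled out only for clarity):

BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
delete from task
where task.task_id = (
    select task.task_id
    from task
    order by task.created_at asc
    limit 1
    for update skip locked
)
returning
    task.task_id,
    task.created_at;
COMMIT;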

How does select for update work for queries with multiple rows/results?

So given this transaction:
select * from table_a where field_a = 'A' for update;
Assuming this gives out multiple rows/results, will the database lock all the results right off the bat, or will it lock them one row at a time?
If the latter is true, does that mean running this query concurrently can result in a deadlock?
And thus, is adding an ORDER BY to keep the locking order consistent what is needed to solve this problem?
The documentation explains what happens as follows:
FOR UPDATE
FOR UPDATE causes the rows retrieved by the SELECT statement to be
locked as though for update. This prevents them from being locked,
modified or deleted by other transactions until the current
transaction ends. That is, other transactions that attempt UPDATE,
DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE
or SELECT FOR KEY SHARE of these rows will be blocked until
the current transaction ends; conversely, SELECT FOR UPDATE will
wait for a concurrent transaction that has run any of those commands
on the same row, and will then lock and return the updated row (or no
row, if the row was deleted). Within a REPEATABLE READ or
SERIALIZABLE transaction, however, an error will be thrown if a row
to be locked has changed since the transaction started. For further
discussion see Section 13.4.
The direct answer to your question is that Postgres cannot lock all the rows "right off the bat"; it has to find them first. Remember, this is row-level locking rather than table-level locking.
The documentation includes this note:
SELECT FOR UPDATE modifies selected rows to mark them locked, and so
will result in disk writes.
I interpret this as saying that Postgres executes the SELECT query and as it finds the rows, it marks them as locked. The lock (for a given row) starts when Postgres identifies the row. It continues until the end of the transaction.
Based on this, I think it is possible for a deadlock situation to arise using SELECT FOR UPDATE.
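To see how such a deadlock can arise, consider two sessions that scan the same rows in opposite order (a hypothetical illustration; the ORDER BY column id is made up):

-- Session 1:
begin;
select * from table_a where field_a = 'A' order by id asc for update;

-- Session 2, at the same time:
begin;
select * from table_a where field_a = 'A' order by id desc for update;

-- Session 1 can lock the lowest id and then wait for a row that
-- session 2 already locked, while session 2 waits for a row that
-- session 1 holds. Postgres detects this deadlock and aborts one of
-- the transactions. Locking rows in the same order in every session
-- (the same ORDER BY everywhere) avoids it.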

Delete and Insert Inside One Transaction - SQL

I just want to ask: is it guaranteed that the first query is executed first when the statements are encapsulated in a transaction? For example, I have 500k records to be deleted and 500k to be inserted; is there a possibility of locking?
I have actually already tested this query and it works fine, but I want to make sure my assumption is correct.
Note: this will delete and insert the same records, with possible updates to other columns.
BEGIN TRAN;
DELETE FROM OutputTable WHERE ID IN (1, 2, 3, 4 /* etc. */);
INSERT INTO OutputTable VALUES (1, 2, 3, 4 /* etc. */);
COMMIT TRAN;
Within a transaction all write locks (all locks acquired for modifications) must obey the strict two phase locking rule. One of the consequences is that a write (X) lock acquired in a transaction cannot be released until the transaction commits. So yes, the DELETE and INSERT will execute sequentially and all locks acquired during the DELETE will be retained while executing the INSERT.
Keep in mind that deleting 500k rows in a transaction will escalate the locks to one table lock, see Lock Escalation.
Deleting 500k rows and inserting 500k rows in a single transaction, while maybe correct, is a bad idea. You should avoid such large units of works, long transaction, if possible. Long transactions pin the log in place, create blocking and contention, increase recovery and DB startup time, increase SQL Server resource consumption (locks require memory).
You should consider doing the operation in small batches (perhaps 10000 rows at a time), use MERGE instead of DELETE/INSERT (if possible) and, last but not least, consider a partitioned sliding window implementation; see How to Implement an Automatic Sliding Window in a Partitioned Table.
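A minimal sketch of the batching idea (the ShouldDelete filter column is hypothetical; DELETE TOP keeps each transaction, and the locks and log space it holds, small):

DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    BEGIN TRAN;
    -- delete at most 10000 rows per transaction
    DELETE TOP (10000) FROM OutputTable WHERE ShouldDelete = 1;
    SET @rows = @@ROWCOUNT;
    COMMIT TRAN;
END;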
From the documentation on TRANSACTION (emphasis mine):
BEGIN TRANSACTION represents a point at which the data referenced by a
connection is logically and physically consistent. If errors are
encountered, all data modifications made after the BEGIN TRANSACTION
can be rolled back to return the data to this known state of
consistency. Each transaction lasts until either it completes without
errors and COMMIT TRANSACTION is issued to make the modifications a
permanent part of the database, or errors are encountered and all
modifications are erased with a ROLLBACK TRANSACTION statement.
BEGIN TRANSACTION starts a local transaction for the connection
issuing the statement. Depending on the current transaction isolation
level settings, many resources acquired to support the Transact-SQL
statements issued by the connection are locked by the transaction
until it is completed with either a COMMIT TRANSACTION or ROLLBACK
TRANSACTION statement. Transactions left outstanding for long periods
of time can prevent other users from accessing these locked resources,
and also can prevent log truncation.
Although BEGIN TRANSACTION starts a local transaction, it is not
recorded in the transaction log until the application subsequently
performs an action that must be recorded in the log, such as executing
an INSERT, UPDATE, or DELETE statement. An application can perform
actions such as acquiring locks to protect the transaction isolation
level of SELECT statements, but nothing is recorded in the log until
the application performs a modification action.

Is there a difference between a SELECT statement inside a transaction and one that is outside of it?

Does the default READ COMMITTED isolation level somehow make the SELECT statement act differently inside of a transaction than one that is not in a transaction?
I am using MS SQL.
Yes, the one inside the transaction can see changes made by previous INSERT/UPDATE/DELETE statements in that transaction; a SELECT statement outside the transaction cannot.
If all you are asking about is what the isolation level does, then understand that all SELECT statements (indeed, all statements of any kind) are in a transaction. The only difference between one that is explicitly in a transaction and one that is standing on its own is that the standalone one starts its transaction immediately before it executes and commits or rolls back immediately after it executes,
whereas the one that is explicitly in a transaction can (because it has a BEGIN TRANSACTION statement) have other statements (inserts/updates/deletes, whatever) occurring within that same transaction, either before or after that SELECT statement.
So whatever the isolation level is set to, both selects (inside or outside an explicit transaction) will nevertheless be in a transaction which is operating at that isolation level.
Addition:
The following is for SQL Server, but all databases MUST work in the same way. In SQL Server the Query Processor is always in one of three transaction modes: AutoCommit, Implicit, or Explicit.
AutoCommit is the default transaction management mode of the SQL Server Database Engine. ... Every Transact-SQL statement is committed or rolled back when it completes. ... If a statement completes successfully, it is committed; if it encounters any error, it is rolled back. This is the default, and is the answer to @Alex's question in the comments.
In Implicit Transaction mode, "... the SQL Server Database Engine automatically starts a new transaction after the current transaction is committed or rolled back. You do nothing to delineate the start of a transaction; you only commit or roll back each transaction. Implicit transaction mode generates a continuous chain of transactions. ..." Note that the italicized snippet is for each transaction, whether it be a single or multiple statement transaction.
The engine is placed in Explicit Transaction mode when you explicitly initiate a transaction with a BEGIN TRANSACTION statement. Then, every statement is executed within that transaction until you explicitly terminate it (with COMMIT or ROLLBACK) or a failure occurs that causes the engine to terminate and roll back.
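A short T-SQL illustration of the three modes (a sketch; the table t and column x are hypothetical):

-- Autocommit (the default): each statement is its own transaction
UPDATE t SET x = 1;   -- commits by itself on success

-- Implicit: the engine opens a transaction before the first
-- qualifying statement; you must end it yourself
SET IMPLICIT_TRANSACTIONS ON;
UPDATE t SET x = 2;   -- a transaction starts here implicitly
COMMIT;               -- and must be ended explicitly
SET IMPLICIT_TRANSACTIONS OFF;

-- Explicit: you delimit the transaction yourself
BEGIN TRANSACTION;
UPDATE t SET x = 3;
COMMIT TRANSACTION;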
Yes, there is a bit of a difference. For MySQL, the database doesn't actually start with a snapshot until your first query. Therefore, it's not begin that matters, but the first statement within the transaction. If I do the following:
#Session 1
begin; select * from mytable;
#Session 2
delete from mytable; #implicit autocommit
#Session 1
select * from mytable;
Then I'll get the same thing in session one both times (the information that was in the table before I deleted it). When I end session one's transaction (commit, begin, or rollback) and check again from that session, the table will show as empty.
The READ COMMITTED isolation level is about the records that have been written. It has nothing to do with whether or not this select statement is in a transaction (except for those things written during that same transaction).
If your database (or in MySQL, the underlying storage engine of all tables used in your select statement) is transactional, then there is simply no way to execute it "outside of a transaction".
Perhaps you meant "run it in autocommit mode", but that is not the same as "not transactional". In the latter case, the statement still runs in a transaction; it's just that the transaction ends immediately after the statement is finished.
So, in both cases, during the run, a single select statement will be isolated at the READ COMMITTED level from the other transactions.
Now what this means for your READ COMMITTED transaction isolation level: perhaps surprisingly, not that much.
READ COMMITTED means that you may encounter non-repeatable reads: when running multiple select statements in the same transaction, it is possible that rows you selected at a certain point in time are modified and committed by another transaction. You will be able to see those changes when you re-execute the select statement later on in the same pending transaction. In autocommit mode, those two select statements would each be executed in their own transaction. If another transaction had modified and committed the rows you selected the first time, you would be able to see those changes just as well when you executed the statement the second time.
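For concreteness, here is what a non-repeatable read under READ COMMITTED looks like in SQL Server (a sketch; the Accounts table is hypothetical):

-- Session 1:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
SELECT Amount FROM Accounts WHERE Id = 1;  -- returns, say, 100

-- Session 2, meanwhile (autocommit):
UPDATE Accounts SET Amount = 200 WHERE Id = 1;

-- Session 1, still in the same transaction:
SELECT Amount FROM Accounts WHERE Id = 1;  -- now returns 200
COMMIT TRANSACTION;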