SQL Server READ_COMMITTED_SNAPSHOT Isolation Level - Shared Lock Issue - sql-server-2017

The Microsoft documentation about transaction isolation levels states:
If READ_COMMITTED_SNAPSHOT is set to OFF (the default on SQL Server), the Database Engine uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. The shared lock type determines when it will be released. Row locks are released before the next row is processed. Page locks are released when the next page is read, and table locks are released when the statement finishes.
That means, with READ_COMMITTED_SNAPSHOT set to OFF, if I do a SELECT on a certain record inside a transaction, it should hold a shared lock that blocks other transactions from doing an update.
I tested this scenario, but it doesn't behave that way. The UPDATE statement succeeded without blocking.
Why is that? Is the documentation wrong, or am I understanding it incorrectly?
My database has READ_COMMITTED_SNAPSHOT set to OFF, the default described in the documentation.
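(One way to verify that setting, if helpful; the database name below is just an example:)
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'StackOverflow';   -- 0 means READ_COMMITTED_SNAPSHOT is OFF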
These are the steps I used to test. I used the Stack Overflow public data dump as my DB.
Window #1. Ran the below SELECT query
BEGIN TRANSACTION
SELECT * FROM dbo.Posts WHERE Id=4175774
Window #2. Ran the below UPDATE query
BEGIN TRANSACTION
UPDATE dbo.Posts SET Score=36
WHERE Id=4175774
Expected Result:
The UPDATE query should be blocked and should not succeed until I commit the Window #1 transaction.
Actual Result:
The UPDATE query succeeded instantly.
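A quick way to see what is (and isn't) being held (a sketch, not part of the original test): while the Window #1 transaction is still open, list the locks held by that session. Under the default READ COMMITTED level the shared row lock taken by the SELECT is normally released as soon as the row has been read, so nothing remains to block the UPDATE.
-- Run in Window #1 while the transaction is still open
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;
-- Typically only the shared database (S) lock shows up here; no KEY or RID
-- shared lock survives the SELECT statement, which is why the UPDATE proceeds.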

Related

How to implement Serializable Isolation Level in SQL Server

I need to implement a serializable isolation level in SQL Server but I've tried many ways and I don't get it.
I need to lock 1 row in one transaction (it doesn't matter if it locks the complete table), so that another transaction can't even SELECT the row (can't read it).
The last thing I tried:
For transaction 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
-- Here I select in another instance the same row
COMMIT TRAN
For transaction 2:
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
COMMIT TRAN
I would expect transaction 2 to wait until transaction 1 commits the operation, but transaction 2 gives me the row.
Can anyone explain what I'm missing?
SQL Server conforms to the strict definition of a Serializable query. That is, there must be a result that can logically be generated IF both queries ran in serial order - Transaction 1 finishing before Transaction 2 can start, or vice versa.
This results in some effects that can be different than you would expect. There is a great explanation of the Serializable isolation level over at SQLPerformance.com that makes clear some of what this logical serializability ends up meaning. (Very helpful site, that one.)
For your above queries, there is no logical requirement to prevent the second query from reading the same row as the first query. No matter in what order the queries are run, they will both return the same data without modifying it. Since the Query Analyzer can identify this, there is no reason to place a read lock on the data. However, if one of the queries performed an update on the data, then (warning - logic assumption here, since I don't actually know the internals of how SQL Server handles this) the QA would set a stronger lock on the selected rows.
TL;DR - SQL Server wants to minimize blocking, so it uses logical analysis to see what types of locks are needed for a serializable isolation level, and it (tries to) use the minimum number and strength of locks needed to achieve its goal.
Now that we've dealt with that - there are only two ways that I can think of to lock a row so that no one else can read it: using XLOCK + TABLOCK (locking the whole table - not a recommended practice) or having some form of a field on each row that is updated when you start your process - something like an SPID field, or a bit flag for Locked. When you update it within your transaction, only SELECTs with NOLOCK hints will be able to read it.
Clearly, neither of these are optimal. I recommend the "This row is busy - go away" flag, as that's probably the approach I would take for an (almost) absolute lock on a row.
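A minimal sketch of that "busy flag" approach, assuming a hypothetical LockedBySpid column added to table1 (none of these names come from the question):
-- one-time setup
ALTER TABLE table1 ADD LockedBySpid INT NULL;

BEGIN TRAN
-- mark the row as busy; the X lock taken here is held until COMMIT, so under
-- the default locking read committed level other sessions' normal SELECTs on
-- this row block, and only SELECT ... WITH (NOLOCK) can still read it
UPDATE table1 SET LockedBySpid = @@SPID WHERE code = 1;

-- ... do the rest of the work on the row here ...

UPDATE table1 SET LockedBySpid = NULL WHERE code = 1;
COMMIT TRAN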
According to the documentation:
SERIALIZABLE Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the
current transaction until the current transaction completes.
If you're not making any changes to data with an INSERT, UPDATE, or DELETE inside transaction 1, SQL will release the Shared Lock after the read completes.
What you might want to try is adding a table hint to prevent the row lock from being released until the end of transaction 1.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code
FROM table1 WITH(ROWLOCK, HOLDLOCK)
WHERE code = 1
COMMIT TRAN
Maybe you can solve this with some hack like this?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE someTableForThisHack set val = CASE WHEN val = 1 THEN 0 else 1 End
SELECT code from table1.....
COMMIT TRANSACTION
So you create a table someTableForThisHack and insert one row into it.
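A sketch of that one-time setup (the table name comes from the answer above; the column definition is an assumption):
CREATE TABLE someTableForThisHack (val INT NOT NULL);
INSERT INTO someTableForThisHack (val) VALUES (1);
-- Under SERIALIZABLE, the UPDATE in the transaction above takes an exclusive
-- lock on this single row, so a second session running the same block waits
-- until the first one commits or rolls back.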

Delete and Insert Inside one Transaction SQL

I just want to ask: will the first query always be executed first when statements are encapsulated in a transaction? For example, I have 500k records to be deleted and 500k to be inserted; is there a possibility of locking?
Actually, I already tested this query and it works fine, but I want to make sure my assumption is correct.
Note: this will delete and insert the same records, with possible updates to other columns.
BEGIN TRAN;
DELETE FROM [OUTPUT TABLE] WHERE ID IN (1, 2, 3, 4 /* etc. */)
INSERT INTO [OUTPUT TABLE] VALUES (1, 2, 3, 4 /* etc. */)
COMMIT TRAN;
Within a transaction all write locks (all locks acquired for modifications) must obey the strict two phase locking rule. One of the consequences is that a write (X) lock acquired in a transaction cannot be released until the transaction commits. So yes, the DELETE and INSERT will execute sequentially and all locks acquired during the DELETE will be retained while executing the INSERT.
Keep in mind that deleting 500k rows in a transaction will escalate the locks to one table lock, see Lock Escalation.
Deleting 500k rows and inserting 500k rows in a single transaction, while maybe correct, is a bad idea. You should avoid such large units of works, long transaction, if possible. Long transactions pin the log in place, create blocking and contention, increase recovery and DB startup time, increase SQL Server resource consumption (locks require memory).
You should consider doing the operation in small batches (perhaps 10,000 rows at a time), use MERGE instead of DELETE/INSERT (if possible) and, last but not least, consider a partitioned sliding window
implementation, see How to Implement an Automatic Sliding Window in a Partitioned Table.
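A sketch of the batching idea under stated assumptions (the table and column names here are hypothetical, not from the question):
-- delete in chunks of 10,000 rows so each transaction stays small and
-- lock escalation to a full table lock is less likely
DECLARE @affected INT = 1;
WHILE @affected > 0
BEGIN
    DELETE TOP (10000) FROM dbo.OutputTable
    WHERE ID IN (SELECT ID FROM dbo.RowsToReplace);   -- hypothetical staging table
    SET @affected = @@ROWCOUNT;
END;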
From the documentation on TRANSACTION (emphasis mine):
BEGIN TRANSACTION represents a point at which the data referenced by a
connection is logically and physically consistent. If errors are
encountered, all data modifications made after the BEGIN TRANSACTION
can be rolled back to return the data to this known state of
consistency. Each transaction lasts until either it completes without
errors and COMMIT TRANSACTION is issued to make the modifications a
permanent part of the database, or errors are encountered and all
modifications are erased with a ROLLBACK TRANSACTION statement.
BEGIN TRANSACTION starts a local transaction for the connection
issuing the statement. Depending on the current transaction isolation
level settings, many resources acquired to support the Transact-SQL
statements issued by the connection are locked by the transaction
until it is completed with either a COMMIT TRANSACTION or ROLLBACK
TRANSACTION statement. Transactions left outstanding for long periods
of time can prevent other users from accessing these locked resources,
and also can prevent log truncation.
Although BEGIN TRANSACTION starts a local transaction, it is not
recorded in the transaction log until the application subsequently
performs an action that must be recorded in the log, such as executing
an INSERT, UPDATE, or DELETE statement. An application can perform
actions such as acquiring locks to protect the transaction isolation
level of SELECT statements, but nothing is recorded in the log until
the application performs a modification action.

SELECT during a lengthy UPDATE - What happens to SELECT for different Transaction Isolation Levels and SELECT WITH (NOLOCK)?

Given an UPDATE execution which takes 5 minutes or so, what happens when SELECT tries to retrieve data from the same table? For different Transaction Isolation Levels and SELECT WITH (NOLOCK), does SELECT wait for UPDATE? If not, does SELECT return old data (data before the UPDATE) or part of the currently inserted records (such as 50% of the records currently being inserted) ?
I found the following question, but it only describes what happens when you execute an UPDATE during a long SELECT.
SQL Server - does [SELECT] lock [UPDATE]?
I am using MS SQL Server 2012. Hopefully, this behaviour is consistent for different implementations.
This post by Gavin Draper explains it quite well and contains some example queries.
SQL Server Isolation Levels By Example
Isolation levels in SQL Server control the way locking works between
transactions.
SQL Server 2008 supports the following isolation levels
Read Uncommitted
Read Committed (The default)
Repeatable Read
Serializable
Snapshot
Before I run through each of these in detail you may want to create a
new database to run the examples, run the following script on the new
database to create the sample data. Note : You’ll also want to drop
the IsolationTests table and re-run this script before each example to
reset the data.
CREATE TABLE IsolationTests
(
Id INT IDENTITY,
Col1 INT,
Col2 INT,
Col3 INT
)
INSERT INTO IsolationTests(Col1,Col2,Col3)
SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
Also before we go any further it is important to understand these two
terms….
Dirty Reads – This is when you read uncommitted data; when doing this there is no guarantee that the data read will ever be committed, meaning the data could well be bad.
Phantom Reads – This is when data that you are working with has been changed by another transaction since you first read it in. This means subsequent reads of this data in the same transaction could well be different.
Read Uncommitted
This is the lowest isolation level there is. Read uncommitted causes
no shared locks to be requested which allows you to read data that is
currently being modified in other transactions. It also allows other
transactions to modify data that you are reading.
As you can probably imagine this can cause some unexpected results in
a variety of different ways. For example data returned by the select
could be in a half way state if an update was running in another
transaction causing some of your rows to come back with the updated
values and some not to.
To see read uncommitted in action, let's run Query1 in one tab of
Management Studio and then quickly run Query2 in another tab before
Query1 completes.
Query1
BEGIN TRAN
UPDATE IsolationTests SET Col1 = 2
--Simulate having some intensive processing here with a wait
WAITFOR DELAY '00:00:10'
ROLLBACK
Query2
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT * FROM IsolationTests
Notice that Query2 will not wait for Query1 to finish and, more importantly, Query2 returns dirty data. Remember, Query1 rolls back all its changes, yet Query2 has returned the data anyway; this is because it didn't wait for the other transactions holding exclusive locks on this data, it just returned what was there at the time.
There is a syntactic shortcut for querying data using the read
uncommitted isolation level by using the NOLOCK table hint. You
could change the above Query2 to look like this and it would do the
exact same thing.
SELECT * FROM IsolationTests WITH(NOLOCK)
Read Committed
This is the default isolation level and means selects will only return
committed data. Select statements will issue shared lock requests against the data you're querying; this causes you to wait if another transaction already has an exclusive lock on that data. Once you have
your shared lock any other transactions trying to modify that data
will request an exclusive lock and be made to wait until your Read
Committed transaction finishes.
You can see an example of a read transaction waiting for a modify
transaction to complete before returning the data by running the
following Queries in separate tabs as you did with Read Uncommitted.
Query1
BEGIN TRAN
UPDATE IsolationTests SET Col1 = 2
--Simulate having some intensive processing here with a wait
WAITFOR DELAY '00:00:10'
ROLLBACK
Query2
SELECT * FROM IsolationTests
Notice how Query2 waited for the first transaction to complete before
returning and also how the data returned is the data we started off
with as Query1 did a rollback. The reason no isolation level was
specified is because Read Committed is the default isolation level for
SQL Server. If you want to check what isolation level you are running
under you can run DBCC useroptions. Remember isolation levels are
Connection/Transaction specific so different queries on the same
database are often run under different isolation levels.
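For example, to check the isolation level of the current connection (the sys.dm_exec_sessions query is an addition, not part of the original post):
DBCC USEROPTIONS;
-- or, for the current session (1 = READ UNCOMMITTED, 2 = READ COMMITTED,
-- 3 = REPEATABLE READ, 4 = SERIALIZABLE, 5 = SNAPSHOT)
SELECT transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;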
Repeatable Read
This is similar to Read Committed but with the additional guarantee
that if you issue the same select twice in a transaction you will get
the same results both times. It does this by holding on to the shared
locks it obtains on the records it reads until the end of the
transaction. This means any transactions that try to modify these
records are forced to wait for the read transaction to complete.
As before, run Query1, then while it's running, run Query2.
Query1
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT * FROM IsolationTests
WAITFOR DELAY '00:00:10'
SELECT * FROM IsolationTests
ROLLBACK
Query2
UPDATE IsolationTests SET Col1 = -1
Notice that Query1 returns the same data for both selects even though
you ran a query to modify the data before the second select ran. This
is because the Update query was forced to wait for Query1 to finish
due to the shared locks Query1 held on the rows it read until the end of its transaction because you specified Repeatable Read.
If you rerun the above Queries but change Query1 to Read Committed you
will notice the two selects return different data and that Query2 does
not wait for Query1 to finish.
One last thing to know about Repeatable Read is that the data can
change between 2 queries if more records are added. Repeatable Read
guarantees records queried by a previous select will not be changed or
deleted, it does not stop new records being inserted so it is still
very possible to get Phantom Reads at this isolation level.
Serializable
This isolation level takes Repeatable Read and adds the guarantee that
no new data will be added eradicating the chance of getting Phantom
Reads. It does this by placing range locks on the queried data. This
causes any other transactions trying to modify or insert data touched
on by this transaction to wait until it has finished.
You know the drill by now run these queries side by side…
Query1
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT * FROM IsolationTests
WAITFOR DELAY '00:00:10'
SELECT * FROM IsolationTests
ROLLBACK
Query2
INSERT INTO IsolationTests(Col1,Col2,Col3)
VALUES (100,100,100)
You’ll see that the insert in Query2 waits for Query1 to complete
before it runs eradicating the chance of a phantom read. If you change
the isolation level in Query1 to repeatable read, you’ll see the
insert no longer gets blocked and the two select statements in Query1
return a different amount of rows.
Snapshot
This provides the same guarantees as serializable. So what's the
difference? Well, it’s more in the way it works: using snapshot doesn't
block other queries from inserting or updating the data touched by the
snapshot transaction. Instead row versioning is used so when data is
changed the old version is kept in tempdb so existing transactions
will see the version without the change. When all transactions that
started before the changes are complete the previous row version is
removed from tempdb. This means that even if another transaction has
made changes you will always get the same results as you did the first
time in that transaction.
So on the plus side you’re not blocking anyone else from modifying the
data whilst you run your transaction but…. You’re using extra
resources on the SQL Server to hold multiple versions of your changes.
To use the snapshot isolation level you need to enable it on the
database by running the following command
ALTER DATABASE IsolationTests
SET ALLOW_SNAPSHOT_ISOLATION ON
If you rerun the examples from serializable but change the isolation
level to snapshot you will notice that you still get the same data
returned but Query2 no longer waits for Query1 to complete.
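For example, Query1 rewritten for snapshot isolation would look like this (assuming ALLOW_SNAPSHOT_ISOLATION has been turned on as above):
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRAN
SELECT * FROM IsolationTests
WAITFOR DELAY '00:00:10'
SELECT * FROM IsolationTests
ROLLBACK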
Summary
You should now have a good idea how each of the different isolation
levels work. You can see how the higher the level you use the less
concurrency you are offering and the more blocking you bring to the
table. You should always try to use the lowest isolation level you can
which is usually read committed.
READ UNCOMMITTED: The SELECT can read all kinds of nasty inconsistencies. Old rows, new rows, duplicate rows, missing rows. It can also totally error out with the famous "data movement" error.
READ COMMITTED: Will block without snapshot isolation. Will return the old state with snapshot isolation in perfect consistency.
REPEATABLE READ/SERIALIZABLE: Will block.
SNAPSHOT: Will return the old state with snapshot isolation in perfect consistency.
It sounds like you should read a few concurrency tutorials. I have written these brief facts to get you started. To really understand what's going on to the point that you can make predictions (that come true) you need to go deeper than an answer on Stack Overflow can provide.
Most of the time, you want to use SNAPSHOT for read-only transactions. It takes away all concurrency concerns. Be aware that it has a few drawbacks.

Does a SQL UPDATE operation read data to "local memory"?

This answer quotes this Technet article which explains the two interpretations of lost updates:
A lost update can be interpreted in one of two ways. In the first scenario, a lost update is considered to have taken place when data that has been updated by one transaction is overwritten by another transaction, before the first transaction is either committed or rolled back. This type of lost update cannot occur in SQL Server 2005 because it is not allowed under any transaction isolation level.
The other interpretation of a lost update is when one transaction (Transaction #1) reads data into its local memory, and then another transaction (Transaction #2) changes this data and commits its change. After this, Transaction #1 updates the same data based on what it read into memory before Transaction #2 was executed. In this case, the update performed by Transaction #2 can be considered a lost update.
So it looks like the difference is that in the first scenario the whole update happens out of "local memory" while in the second one there's "local memory" used and this makes a difference.
Suppose I have the following code:
UPDATE MagicTable SET MagicColumn = MagicColumn + 10 WHERE SomeCondition
Does this involve "local memory"? Is it prone to the first or to the second interpretation of lost updates?
I suppose it would come under the second interpretation.
However, because of the way this type of UPDATE is implemented in SQL Server, a lost update is still not possible. Rows read for the update are protected with a U lock (converted to an X lock when the row is actually updated).
U locks are not compatible with other U locks (or X locks)
So at all isolation levels if two concurrent transactions were to run this statement then one of them would end up blocked behind the other transaction's U lock or X lock and would not be able to proceed until that transaction completes.
Therefore it is not possible for lost updates to occur with this pattern in SQL Server at any isolation level.
To achieve a lost update you would need to do something like
BEGIN TRAN
DECLARE @MagicColumn INT;
/*Two concurrent transactions can both read the same pre-update value*/
SELECT @MagicColumn = MagicColumn FROM MagicTable WHERE SomeCondition
UPDATE MagicTable SET MagicColumn = @MagicColumn + 10 WHERE SomeCondition
COMMIT
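One common way to keep that read-then-write pattern safe (a sketch, not from the quoted answer) is to ask for an update lock on the read, so two transactions cannot both read the pre-update value:
BEGIN TRAN
DECLARE @MagicColumn INT;

-- UPDLOCK makes the read take a U lock and HOLDLOCK keeps it until the end
-- of the transaction, so a second transaction blocks here instead of reading
SELECT @MagicColumn = MagicColumn
FROM MagicTable WITH (UPDLOCK, HOLDLOCK)
WHERE SomeCondition

UPDATE MagicTable SET MagicColumn = @MagicColumn + 10 WHERE SomeCondition
COMMIT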

Is there a difference between a SELECT statement inside a transaction and one that is outside of it?

Does the default READ COMMITTED isolation level somehow makes the SELECT statement act different inside of a transaction than one that is not in a transaction?
I am using MS SQL.
Yes, the one inside the transaction can see changes made by previous INSERT/UPDATE/DELETE statements in that transaction; a SELECT statement outside the transaction cannot.
If all you are asking about is what the isolation level does, then understand that all SELECT statements (hey, all statements of any kind) are in a transaction. The only difference between one that is explicitly in a transaction and one that is standing on its own is that the one standing alone starts its transaction immediately before it executes and commits or rolls back immediately after it executes;
whereas the one that is explicitly in a transaction (because it has a BEGIN TRANSACTION statement) can have other statements (inserts/updates/deletes, whatever) occurring within that same transaction, either before or after that SELECT statement.
So whatever the isolation level is set to, both selects (inside or outside an explicit transaction) will nevertheless be in a transaction which is operating at that isolation level.
Addition:
The following is for SQL Server, but all databases MUST work in the same way. In SQL Server the Query Processor is always in one of 3 Transaction Modes, AutoCommit, Implicit, or Explicit.
AutoCommit is the default transaction management mode of the SQL Server Database Engine. ... Every Transact-SQL statement is committed or rolled back when it completes. ... If a statement completes successfully, it is committed; if it encounters any error, it is rolled back. This is the default, and is the answer to @Alex's question in the comments.
In Implicit Transaction mode, "... the SQL Server Database Engine automatically starts a new transaction after the current transaction is committed or rolled back. You do nothing to delineate the start of a transaction; you only commit or roll back each transaction. Implicit transaction mode generates a continuous chain of transactions. ..." Note that the italicized snippet is for each transaction, whether it be a single or multiple statement transaction.
The engine is placed in Explicit Transaction mode when you explicitly initiate a transaction with BEGIN TRANSACTION Statement. Then, every statement is executed within that transaction until you explicitly terminate the transaction (with COMMIT or ROLLBACK) or if a failure occurs that causes the engine to terminate and Rollback.
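A small sketch of the three modes (the table name is hypothetical):
-- Autocommit (the default): each statement is its own transaction
SELECT * FROM dbo.SomeTable;

-- Implicit: the engine opens a transaction for you; you only end it
SET IMPLICIT_TRANSACTIONS ON;
SELECT * FROM dbo.SomeTable;   -- a transaction is now open
COMMIT;
SET IMPLICIT_TRANSACTIONS OFF;

-- Explicit: you open and end the transaction yourself
BEGIN TRANSACTION;
SELECT * FROM dbo.SomeTable;
COMMIT;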
Yes, there is a bit of a difference. For MySQL, the database doesn't actually start with a snapshot until your first query. Therefore, it's not begin that matters, but the first statement within the transaction. If I do the following:
#Session 1
begin; select * from table;
#Session 2
delete from table; # implicit autocommit
#Session 1
select * from table;
Then I'll get the same thing in session one both times (the information that was in the table before I deleted it). When I end session one's transaction (commit, begin, or rollback) and check again from that session, the table will show as empty.
The READ COMMITTED isolation level is about the records that have been written. It has nothing to do with whether or not this select statement is in a transaction (except for those things written during that same transaction).
If your database (or in MySQL, the underlying storage engine of all tables used in your select statement) is transactional, then there is simply no way to execute it "outside of a transaction".
Perhaps you meant "run it in autocommit mode", but that is not the same as "not transactional". In the latter case, it still runs in a transaction; it's just that the transaction ends immediately after your statement is finished.
So, in both cases, during the run, a single select statement will be isolated at the READ COMMITTED level from the other transactions.
Now what this means for your READ COMMITTED transaction isolation level: perhaps surprisingly, not that much.
READ COMMITTED means that you may encounter non-repeatable reads: when running multiple select statements in the same transaction, it is possible that rows you selected at a certain point in time are modified and committed by another transaction. You will be able to see those changes when you re-execute the select statement later on in the same pending transaction. In autocommit mode, those 2 select statements would each be executed in their own transaction. If another transaction had modified and committed the rows you selected the first time, you would be able to see those changes just as well when you executed the statement the second time.
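A sketch of that non-repeatable read in SQL Server terms (the table is hypothetical; the other session's UPDATE happens between the two SELECTs):
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRAN;
SELECT Amount FROM dbo.Orders WHERE Id = 1;   -- returns the original value
-- another session now runs and commits:
--   UPDATE dbo.Orders SET Amount = Amount + 5 WHERE Id = 1;
SELECT Amount FROM dbo.Orders WHERE Id = 1;   -- may return the new value
COMMIT;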