How do I select all rows for a table, their isn't part of any transaction that hasn't committed yet?
Example:
Let's say,
Table T has 10 rows.
User A is doing a transaction with some queries:
INSERT INTO T (...)
SELECT ...
FROM T
// doing other queries
Now, here comes the tricky part:
What if User B, in the time between User A inserted the row and the transaction was committed, was updating a list in the system with a select on Table T.
I only want that the SELECT User B is using returned the 10 rows(all rows from the table, that can't later be rolled back). How do I do this, if it's even possible?
I have tried setting the isolationlevel on the transaction and adding "WITH(NOLOCK)" "WITH(READUNCOMMITTED)" to the query without any luck.
The query either return all 11 records or it's waiting for the transaction to commit, and that's not what I need.
Any tips is much appriciated, thanks.
You need to use (default) read committed isolation level and the READPAST hint to skip rows locked as they are not committed (rather than being blocked waiting for the locks to be released)
This does rely on the INSERT taking out rowlocks though. If it takes out page locks you will be back to being blocked. Example follows
Connection 1
IF OBJECT_ID('test_readpast') IS NULL
BEGIN
CREATE TABLE test_readpast(i INT PRIMARY KEY CLUSTERED)
INSERT INTO test_readpast VALUES (1)
END
BEGIN TRAN
INSERT INTO test_readpast
WITH(ROWLOCK)
--WITH(PAGLOCK)
VALUES (2)
SELECT * FROM sys.dm_tran_locks WHERE request_session_id=##SPID
WAITFOR DELAY '00:01';
ROLLBACK
Connection 2
SELECT i
FROM test_readpast WITH (readpast)
Snapshot isolation ?
Either I or else the three people who have answered early have misread/ misinterpreted your question, so I have given a link so you can determine for yourself.
Actually, read uncommitted and nolock are the same. They mean you get to see rows that have not been committed yet.
If you run at the default isolation level, read committed, you will not see new rows that have not been committed. This should work by default, but if you want to be sure, prefix your select with set transaction isolation level read committed.
Related
Test setup
I have a SQL Server 2014 and a simple table MyTable that contains columns Code (int) and Data (nvarchar(50)), no indexes created for this table.
I have 4 records in the table in the following manner:
1, First
2, Second
3, Third
4, Fourth
Then I run the following query in a transaction:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
DELETE FROM dbo.MyTable
WHERE dbo.MyTable.Code = 2
I have one affected row and I don't issue either Commit or Rollback.
Next I start yet another transaction:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT TOP 10 Code, Data
FROM dbo.MyTable
WHERE Code = 3
At this step the transaction with the SELECT query hangs waiting for completion of the transaction with the DELETE query.
My question
I don't understand why the transaction with SELECT query is waiting for the transaction with the DELETE query. In my understanding the deleted row (with code 2) has nothing to do with the selected row (with code 3) and as far as I understand the specific of isolation level SERIALIZABLE SQL Server shouldn't lock entire table in this case. Maybe this happens because the minimal locking amount for SERIALIZABLE is a page? Then it could produce an inconsistent behavior for selecting rows from some other pages if the table would have more rows, say 1000000 (some rows from other pages wouldn't be locked then). Please help to figure out why the locking takes place in my case.
Under locking READ COMMITTED, REPEATABLE READ, or SERIALIZABLE a SELECT query must place Shared (S) locks for every row the query plan actually reads. The locks can be placed either at the row-level, page-level, or table-level. Additionally SERIALIZABLE will place locks on ranges, so that no other session could insert a matching row while the lock is held.
And because you have "no indexes created for this table", this query:
SELECT TOP 10 Code, Data
FROM dbo.MyTable
WHERE Code = 3
Has to be executed with a table scan, and it must read all the rows (even those with Code=2) to determine whether they qualify for the SELECT.
This is one reason why you should almost always use Row-Versioning, either by setting the database to READ COMMITTED SNAPSHOT, or by coding read-only transactions to use SNAPSHOT isolation.
I need to implement a serializable isolation level in SQL Server but I've tried many ways and I don't get it.
I need to lock 1 row in one transaction (It doesn´t matter if lock the complete table). So, another transaction can´t even select the row (don´t read).
The last thing I tried:
For transaction 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
-- Here I select in another instance the same row
COMMIT TRAN
For transaction 2:
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
COMMIT TRAN
I would expect that transaction 2 wait until transaction 1 commit the operation, but the transaction 2 gives me the row.
Anyone can explain me if I miss something?
SQL Server conforms to the strict definition of a Serializable query. That is, there must be a result that can logically be generated IF both queries ran in serial order - Transaction 1 finishing before Transaction 2 can start, or vice versa.
This results in some effects that can be different than you would expect. There is a great explanation of the Serializable isolation level over at SQLPerformance.com that makes clear some of what this logical serializability ends up meaning. (Very helpful site, that one.)
For your above queries, there is no logical requirement to prevent the second query from reading the same row as the first query. No matter in what order the queries are run, they will both return the same data without modifying it. Since the Query Analyzer can identify this, there is no reason to place a read lock on the data. However, if one of the queries performed an update on the data, then (warning - logic assumption here, since I don't actually know the internals of how SQL Server handles this) the QA would set a stronger lock on the selected rows.
TL;DR - SQL Server wants to minimize blocking, so it uses logical analysis to see what types of locks are needed for a serializable isolation level, and it (tries to) use the minimum number and strength of locks needed to achieve its goal.
Now that we've dealt with that - there are only two ways that I can think of to lock a row so that no one else can read it: using XLOCK + TABLOCK (locking the whole table - not a recommended practice) or having some form of a field on each row that is updated when you start your process - something like an SPID field, or a bit flag for Locked. When you update it within your transaction, only SELECTs with NOLOCK hints will be able to read it.
Clearly, neither of these are optimal. I recommend the "This row is busy - go away" flag, as that's probably the approach I would take for an (almost) absolute lock on a row.
According to the documentation:
SERIALIZABLE Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the
current transaction until the current transaction completes.
If you're not making any changes to data with an INSERT, UPDATE, or DELETE inside transaction 1, SQL will release the Shared Lock after the read completes.
What you might want to try is adding a table hit to prevent the row lock from being released until the end of transaction 1.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code
FROM table1 WITH(ROWLOCK, HOLDLOCK)
WHERE code = 1
COMMIT TRAN
Maybe you can solve this with some hack like this?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE someTableForThisHack set val = CASE WHEN val = 1 THEN 0 else 1 End
SELECT code from table1.....
COMMIT TRANSACTION
So you create a table someTableForThisHack and insert one row to it.
So here is full paragraph from postgresql 9.6 docs about Read Committed Isolation Level:
Read Committed is the default isolation level in PostgreSQL. When a
transaction uses this isolation level, a SELECT query (without a FOR
UPDATE/SHARE clause) sees only data committed before the query began;
it never sees either uncommitted data or changes committed during
query execution by concurrent transactions. In effect, a SELECT query
sees a snapshot of the database as of the instant the query begins to
run. However, SELECT does see the effects of previous updates executed
within its own transaction, even though they are not yet committed.
Also note that two successive SELECT commands can see different data,
even though they are within a single transaction, if other
transactions commit changes after the first SELECT starts and before
the second SELECT starts.
So basically:
SELECT query sees only data committed before the query began and never sees changes committed during query execution by
concurrent transactions.
But in last sentence it states that:
Also note that two successive SELECT commands can see different data,
even though they are within a single transaction, if other
transactions commit changes after the first SELECT starts and before
the second SELECT starts.
For me it looks contradictory. Could someone elaborate that? How exactly two SELECT queries could see different data within one transaction? Isn`t transaction isolated?
Yes, that's true. To avoid such cases you need to use a higher isolation level: "repeatable read".
Or even "serializable" if you need your transactions to be completely isolated.
Just keep in mind that higher isolation pays higher cost by means of performance.
Here you can find detailed explanation: https://www.postgresql.org/docs/9.1/static/transaction-iso.html
So here is an example how could it happen:
Connection1 opens transaction and reads a row from Table1 with
"select *"
Connection2 updates the same row and commits
Connection1 reads the same row again and gets updated data
The isolation level is "read committed", so literally everything commited becomes visible to other connections.
If you can not use a higher isolation level for some reason, there is a way to prevent such "unexpected" update from happening: your Connection1 can use "select ... for update" instead. This will effectively lock the row until transaction of Connection1 commits or rolls back. So Connection2 will wait for this commit or rollback to be able to update the row.
There is no contradiction. Two consecutive SELECT statements within a transaction may fetch different results. Consider this - you begin a transaction, then issue
select * from emp;
You get 2 records.
Another session inserts a record into emp and commits.
In the first session, you again issue
select * from emp;
You get 3 records. This is expected behavior at READ COMMITTED isolation level.
Sample code
tmp=# begin ;
BEGIN
tmp=# select * from emp;
id
----
1
2
(2 rows)
tmp=# select * from emp;
id
----
1
2
3
(3 rows)
tmp=# commit;
I believe that each SELECT statement in SQL Server will cause either a Shared or Key lock to be placed. But will it place that same type of lock when it is in a transaction? Will Shared or Key locks allow other processes to read the same records?
For example I have the following logic
Begin Trans
-- select data that is needed for the next 2 statements
SELECT * FROM table1 where id = 1000; -- Assuming this returns 10, 20, 30
insert data that was read from the first query
INSERT INTO table2 (a,b,c) VALUES(10, 20, 30);
-- update table 3 with data found in the first query
UPDATE table3
SET d = 10,
e = 20,
f = 30;
COMMIT;
At this point will my select statement still create a shared or key lock or will it get escalated to exclusive lock? Will other transaction be able to read records from the table1 or will all transaction wait until the my transaction is committed before others are able to select from it?
In an application does it makes since to move the select statement outside of a transaction and just keep the insert/update in one transaction?
A SELECT will always place a shared lock - unless you use the WITH (NOLOCK) hint (then no lock will be placed), use a READ UNCOMMITTED transaction isolation level (same thing), or unless you specifically override it with query hints like WITH (XLOCK) or WITH (UPDLOCK).
A shared lock allows other reading processes to also acquire a shared lock and read the data - but they prevent exclusive locks (for insert, delete, update operations) from being acquired.
In this case, with just three rows selected, there will be no lock escalation (that only happens when more than 5000 locks are being acquired by a single transaction).
Depending on the transaction isolation level, those shared locks will be held for different amounts of times. With READ COMMITTED, the default level, the locks is released immediately after the data has been read, while with REPEATABLE READ or SERIALIZABLE levels, the locks will be held until the transaction is committed or rolled back.
Given an UPDATE execution which takes 5 minutes or so, what happens when SELECT tries to retrieve data from the same table? For different Transaction Isolation Levels and SELECT WITH (NOLOCK), does SELECT wait for UPDATE? If not, does SELECT return old data (data before the UPDATE) or part of the currently inserted records (such as 50% of the records currently being inserted) ?
If found the following question, but it only describes what happens when you execute and UPDATE during a long SELECT.
SQL Server - does [SELECT] lock [UPDATE]?
I am using MS SQL Server 2012. Hopefully, this behaviour is consistent for different implementations.
This post by Gavin Draper explains it quite well and contains some example query's.
SQL Server Isolation Levels By Example
Isolation levels in SQL Server control the way locking works between
transactions.
SQL Server 2008 supports the following isolation levels
Read Uncommitted
Read Committed (The default)
Repeatable Read
Serializable
Snapshot
Before I run through each of these in detail you may want to create a
new database to run the examples, run the following script on the new
database to create the sample data. Note : You’ll also want to drop
the IsolationTests table and re-run this script before each example to
reset the data.
CREATE TABLE IsolationTests
(
Id INT IDENTITY,
Col1 INT,
Col2 INT,
Col3 INTupdate te
)
INSERT INTO IsolationTests(Col1,Col2,Col3)
SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
UNION ALL SELECT 1,2,3
Also before we go any further it is important to understand these two
terms….
Dirty Reads – This is when you read uncommitted data, when doing this there is no guarantee that data read will ever be committed
meaning the data could well be bad.
Phantom Reads – This is when data that you are working with has been changed by another transaction since you first read it in.
This
means subsequent reads of this data in the same transaction could
well be different.
Read Uncommitted
This is the lowest isolation level there is. Read uncommitted causes
no shared locks to be requested which allows you to read data that is
currently being modified in other transactions. It also allows other
transactions to modify data that you are reading.
As you can probably imagine this can cause some unexpected results in
a variety of different ways. For example data returned by the select
could be in a half way state if an update was running in another
transaction causing some of your rows to come back with the updated
values and some not to.
To see read uncommitted in action lets run Query1 in one tab of
Management Studio and then quickly run Query2 in another tab before
Query1 completes.
Query1
BEGIN TRAN
UPDATE IsolationTests SET Col1 = 2
--Simulate having some intensive processing here with a wait
WAITFOR DELAY '00:00:10'
ROLLBACK
Query2
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT * FROM IsolationTests
Notice that Query2 will not wait for Query1 to finish, also more
importantly Query2 returns dirty data. Remember Query1 rolls back all
its changes however Query2 has returned the data anyway, this is
because it didn't wait for all the other transactions with exclusive
locks on this data it just returned what was there at the time.
There is a syntactic shortcut for querying data using the read
uncommitted isolation level by using the NOLOCK table hint. You
could change the above Query2 to look like this and it would do the
exact same thing.
SELECT * FROM IsolationTests WITH(NOLOCK)
Read Committed
This is the default isolation level and means selects will only return
committed data. Select statements will issue shared lock requests
against data you’re querying this causes you to wait if another
transaction already has an exclusive lock on that data. Once you have
your shared lock any other transactions trying to modify that data
will request an exclusive lock and be made to wait until your Read
Committed transaction finishes.
You can see an example of a read transaction waiting for a modify
transaction to complete before returning the data by running the
following Queries in separate tabs as you did with Read Uncommitted.
Query1
BEGIN TRAN
UPDATE Tests SET Col1 = 2
--Simulate having some intensive processing here with a wait
WAITFOR DELAY '00:00:10'
ROLLBACK
Query2
SELECT * FROM IsolationTests
Notice how Query2 waited for the first transaction to complete before
returning and also how the data returned is the data we started off
with as Query1 did a rollback. The reason no isolation level was
specified is because Read Committed is the default isolation level for
SQL Server. If you want to check what isolation level you are running
under you can run DBCC useroptions. Remember isolation levels are
Connection/Transaction specific so different queries on the same
database are often run under different isolation levels.
Repeatable Read
This is similar to Read Committed but with the additional guarantee
that if you issue the same select twice in a transaction you will get
the same results both times. It does this by holding on to the shared
locks it obtains on the records it reads until the end of the
transaction, This means any transactions that try to modify these
records are forced to wait for the read transaction to complete.
As before run Query1 then while its running run Query2
Query1
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT * FROM IsolationTests
WAITFOR DELAY '00:00:10'
SELECT * FROM IsolationTests
ROLLBACK
Query2
UPDATE IsolationTests SET Col1 = -1
Notice that Query1 returns the same data for both selects even though
you ran a query to modify the data before the second select ran. This
is because the Update query was forced to wait for Query1 to finish
due to the exclusive locks that were opened as you specified
Repeatable Read.
If you rerun the above Queries but change Query1 to Read Committed you
will notice the two selects return different data and that Query2 does
not wait for Query1 to finish.
One last thing to know about Repeatable Read is that the data can
change between 2 queries if more records are added. Repeatable Read
guarantees records queried by a previous select will not be changed or
deleted, it does not stop new records being inserted so it is still
very possible to get Phantom Reads at this isolation level.
Serializable
This isolation level takes Repeatable Read and adds the guarantee that
no new data will be added eradicating the chance of getting Phantom
Reads. It does this by placing range locks on the queried data. This
causes any other transactions trying to modify or insert data touched
on by this transaction to wait until it has finished.
You know the drill by now run these queries side by side…
Query1
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT * FROM IsolationTests
WAITFOR DELAY '00:00:10'
SELECT * FROM IsolationTests
ROLLBACK
Query2
INSERT INTO IsolationTests(Col1,Col2,Col3)
VALUES (100,100,100)
You’ll see that the insert in Query2 waits for Query1 to complete
before it runs eradicating the chance of a phantom read. If you change
the isolation level in Query1 to repeatable read, you’ll see the
insert no longer gets blocked and the two select statements in Query1
return a different amount of rows.
Snapshot
This provides the same guarantees as serializable. So what's the
difference? Well it’s more in the way it works, using snapshot doesn't
block other queries from inserting or updating the data touched by the
snapshot transaction. Instead row versioning is used so when data is
changed the old version is kept in tempdb so existing transactions
will see the version without the change. When all transactions that
started before the changes are complete the previous row version is
removed from tempdb. This means that even if another transaction has
made changes you will always get the same results as you did the first
time in that transaction.
So on the plus side your not blocking anyone else from modifying the
data whilst you run your transaction but…. You’re using extra
resources on the SQL Server to hold multiple versions of your changes.
To use the snapshot isolation level you need to enable it on the
database by running the following command
ALTER DATABASE IsolationTests
SET ALLOW_SNAPSHOT_ISOLATION ON
If you rerun the examples from serializable but change the isolation
level to snapshot you will notice that you still get the same data
returned but Query2 no longer waits for Query1 to complete.
Summary
You should now have a good idea how each of the different isolation
levels work. You can see how the higher the level you use the less
concurrency you are offering and the more blocking you bring to the
table. You should always try to use the lowest isolation level you can
which is usually read committed.
READ UNCOMMITTED: The SELECT can read all kinds of nasty inconsistencies. Old rows, new rows, duplicate rows, missing rows. It can also totally error out with the famous "data movement" error.
READ COMMITTED: Will block without snapshot isolation. Will return the old state with snapshot isolation in perfect consistency.
REPEATABLE READ/SERIALIZABLE: Will block.
SNAPSHOT: Will return the old state with snapshot isolation in perfect consistency.
It sounds like you should read a few concurrency tutorials. I have written these brief facts to get you started. To really understand what's going on to the point that you can make predictions (that come true) you need to go deeper than an answer on Stack Overflow can provide.
Most of the time, you want to use SNAPSHOT for read-only transactions. It takes away all concurrency concerns. Be aware that it has a few drawbacks.