READ UNCOMMITTED and Estimates - sql-server-2005

From time to time, I want to run a stored procedure to get a rough estimate of how many records in two or three different tables satisfy some criteria. If new records are added, deleted, or updated during this estimate, that is not really a problem (I just want a rough estimate). That being said, I can afford to run this process using SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED. However, I have two questions about this:
1) Since I am only using SELECT COUNT(*) statements, do I really need to wrap them in a BEGIN/COMMIT TRANSACTION block?
2) Do I need to SET TRANSACTION ISOLATION LEVEL READ COMMITTED back at the end of the stored procedure, or will this be restored automatically once its execution ends?

No. Reads don't need to be in a transaction.
The SET is scoped only to the stored procedure. See my answer here: Is it okay if from within one stored procedure I call another one that sets a lower transaction isolation level?. However, you'd use the NOLOCK hint rather than SET: SELECT COUNT(*) FROM myTable WITH (NOLOCK).
If you want an approximate count without WHERE filters, then use sys.dm_db_partition_stats. See my answer here: Fastest way to count exact number of rows in a very large table?
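For example, a metadata-based approximate count might look like the sketch below (a minimal sketch using the myTable name from above; the DMV returns a figure without scanning or locking the table, but it is only approximate):
SELECT SUM(row_count) AS approx_rows
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.myTable')
  AND index_id IN (0, 1);  -- heap (0) or clustered index (1) partitions only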

1) No. SQL Server uses an implicit transaction for each statement if you do not specify a transaction scope, and you do not need an explicit transaction for SET TRANSACTION ISOLATION LEVEL to work.
2) You don't have to reset it to the original level; SQL Server takes care of that. Please refer to this SO entry: Transaction Isolation Level Scopes
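Putting both answers together, a minimal sketch of such a stored procedure might look like this (procedure, table, and column names are placeholders, not from the question):
CREATE PROCEDURE dbo.GetRoughCounts
AS
BEGIN
    -- Scoped to this procedure; the caller's isolation level is restored when it returns.
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    -- No explicit BEGIN/COMMIT TRANSACTION needed; each SELECT runs in its own implicit transaction.
    SELECT COUNT(*) AS RoughCount1 FROM dbo.TableA WHERE SomeFlag = 1;
    SELECT COUNT(*) AS RoughCount2 FROM dbo.TableB WHERE SomeFlag = 1;
END;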

Related

Using NOLOCK for reading a single static row. What's the harm?

Can anyone with NOLOCK experience enlighten me?
I read that it can cause log file corruption - is that possible? I think MS would never allow that. Also, if "some situations", like mine, are okay with NOLOCK, why not use it?
I have no datasets or returned tables (like other posts on Stack Overflow). I have one SQL statement selecting by ID, which returns only one row, like:
sqlstr = "SELECT Parameter1 FROM Companies WITH (NOLOCK) WHERE ID = 25"
Also, this parameter does not change. But as this is a heavy-load ASP.NET application (not a web site) and I run this kind of query again and again, every SQL read causes a lock in SQL Server. If possible I'd prefer to avoid that.
Every post on this site is about multiple records, recordsets, and dirty reads. I could not find anything about reading a single record which is not changing all the time.
Any expert's opinion, please?
This simple SELECT statement, when executed without any lock/NOLOCK hints under the default transaction isolation level, obtains a shared lock on the row. That means other users can also read this row while it is being read by this query.
On the other hand, when you specify the WITH (NOLOCK) query hint, it does not obtain any locks at all. In this case other users can still read the row, but you might be reading a dirty row (data that has not been committed yet and is in the process of being modified).
So in either case this simple SELECT will not cause a deadlock. The question you should really be asking yourself is: should users be able to see dirty data or not? In most cases the answer is no.
Therefore, do not worry about getting deadlocks with this SELECT query as long as you are using the default transaction isolation level. In a stricter isolation level like SERIALIZABLE a SELECT can lock out other users, but at the default level you should be okay.
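To make the contrast concrete, here are the two forms of the single-row read from the question (a sketch only; the "default" case assumes the standard READ COMMITTED level):
-- Default READ COMMITTED: takes a shared lock on the row for the duration of the read,
-- so it may wait briefly if a writer currently holds an exclusive lock on that row.
SELECT Parameter1 FROM Companies WHERE ID = 25;

-- With NOLOCK: takes no shared lock, so it never waits on writers,
-- but the value returned may be uncommitted ("dirty").
SELECT Parameter1 FROM Companies WITH (NOLOCK) WHERE ID = 25;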
NOLOCK has two main disadvantages: It can return uncommitted data (you don't seem worried about that) and it can cause queries to spuriously fail under very rare circumstances. Never will NOLOCK cause physical database corruption.
Consider using snapshot isolation for transactions that only read data. Readers under SI do not lock or block. SI takes them out of the picture. It provides perfect consistency for read-only transactions. Be sure to find out about the drawbacks.
It isn't worth it.
NOLOCK is often exploited as a magic way to speed up database reads, but I try to avoid using it whenever possible.
The result set can contain rows that have not yet been committed, that are often later rolled back.
The query can fail with an error, or the result set can be empty, be missing rows, or display the same row multiple times.
This is because other transactions are moving data at the same time you're reading it.
READ UNCOMMITTED adds a further issue where data within a single column can be read in a corrupted state when multiple users change the same cell simultaneously.
There are other side-effects too, which result in sacrificing the speed increase you were hoping to gain in the first place.
Now you know, never use it again.
After deep searches and asking many experts, I found out that using the NOLOCK hint causes no problem in this scenario, yet it is not advised. There is nothing wrong with NOLOCK, but since I use SQL Server 2014 I "should" use the ISOLATION LEVEL option instead; it is the approach that came to replace NOLOCK. For example, for huge table selects that cause deadlocks:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT * FROM HugeTable;
COMMIT TRANSACTION;
is very handy.
I had HugeTable and a web form that uses a SqlAdapter and a RadGrid to show this data. Whenever I ran this report, even though the indexes and the paging of the RadGrid were fine, it caused a deadlock, which makes sense. I changed the SELECT statement of the SqlAdapter to the statement above, and it's perfect now.
best.

Differences between transaction levels: READ WRITE and ISOLATION LEVEL SERIALIZABLE

What are the differences between these two transaction levels: READ WRITE and ISOLATION LEVEL SERIALIZABLE?
As I understand it, READ WRITE allows dirty reads, while ISOLATION LEVEL SERIALIZABLE prevents data from being changed by other users (I think I'm mistaken here), or just reads the data that was available at the beginning of the transaction (it does not see data that has been changed by other users during my transaction).
You can find detailed information about this topic on the Oracle site.
Basically READ COMMITTED allows "nonrepeatable reads" and "phantom reads", while both are prohibited in SERIALIZABLE.
If non-repeatable reads are permitted, the same SELECT query inside the same transaction might return different results depending on when it is issued. Other parallel transactions may change the data, and these changes can become visible inside your transaction.
If phantom reads are permitted, then when you issue the same SELECT query twice inside one transaction while another transaction inserts rows into the table in parallel, those rows may become visible inside your transaction, but only in the result set of the second SELECT. So the same SELECT statement might return, for example, 5 rows the first time and 10 rows the second time it is executed.
Both properties are similar, but the first says something about existing data that may change, while the second says something about additional rows that may be returned.
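To see the phantom-read case in practice, here is a rough sketch (T-SQL-style syntax with a placeholder Orders table; the same idea applies under Oracle's READ COMMITTED default):
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- use SERIALIZABLE to prevent the phantom
BEGIN TRANSACTION;
SELECT COUNT(*) FROM Orders WHERE Amount > 100;  -- returns, say, 5

-- Meanwhile another session runs and commits:
-- INSERT INTO Orders (Amount) VALUES (200); COMMIT;

SELECT COUNT(*) FROM Orders WHERE Amount > 100;  -- now returns 6: a "phantom" row appeared
COMMIT;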

Lock issues on large recordset

I have a database table that I use as a queue system, where separate processes that talk to each other create and read entries in the table. For example, when a user initiates a search, an entry is created; then another process that runs every second or two picks up that new entry, updates its status, performs the search, and updates the entry again when the search is complete. This all seems to work well with thousands of searches per hour.
However, I have a master admin screen that lets me view the status of all of these 'jobs', but it runs very slowly. I basically return all entries in the table for the last hour so I can keep an eye on what's going on. I think I am running into lock issues of some sort. I only need to read each entry, and don't really care if the data is a little bit out of date. I just use a standard 'Select * from Table' statement, so maybe it is waiting for other locks to expire before returning data, as the jobs are constantly updating the data.
Would this be handled better by a certain kind of cursor to return each row one at a time, etc? Any other ideas?
Thanks
If you really don't care if the data is a bit out of date... or if you only need the data to be 99.99% accurate, consider using WITH (NOLOCK):
SELECT * FROM Table WITH (NOLOCK);
This will instruct your query to use the READ UNCOMMITTED ISOLATION LEVEL, which has the following behavior:
Specifies that dirty reads are allowed. No shared locks are issued to prevent other transactions from modifying data read by the current transaction, and exclusive locks set by other transactions do not block the current transaction from reading the locked data.
Be aware that NOLOCK may cause some inaccuracies in your data, so it probably isn't a good idea to use it throughout the rest of your system.
You need the FROM yourtable WITH (NOLOCK) table hint.
You may also want to look at the transaction isolation level in your update process, if you aren't doing so already.
An alternative to NOLOCK (which can lead to very bad things, such as missed rows or duplicated rows) is to enable snapshot isolation at the database level (ALLOW_SNAPSHOT_ISOLATION) and then issue your query with:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
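A rough sketch of that setup (the database name is a placeholder; the yourtable name is from the answer above; note that SET TRANSACTION ISOLATION LEVEL SNAPSHOT only works once ALLOW_SNAPSHOT_ISOLATION is ON):
-- One-time, per database: allow snapshot isolation.
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then, in the reading session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT * FROM yourtable;
Alternatively, turning on READ_COMMITTED_SNAPSHOT for the database makes ordinary READ COMMITTED reads use row versions without any per-query SET.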

Does it make sense to begin a transaction where there will be only some data-retrieval operations?

Does it make sense to begin a transaction where there will be only some data-retrieval operations and no UPDATE or INSERT will occur?
Thanks!
Not normally.
If you have two SELECTs, they could become inconsistent in the fraction of a second between reads.
A transaction won't fix this for SQL Server/Sybase-style locking, because read locks are released as soon as each statement completes. You'd need to use higher isolation levels instead, which will affect concurrency (potentially quite seriously).
The trade off between "tiny risk of inconsistent data" and "loss of performance" is up to you.
Starting a transaction for that case ensures that the data seen will be consistent; some other process will not be able to update the rows you are looking at such that the second SELECT sees something different from the first when it should be seeing the same thing.
Yes, to ensure transaction-level read consistency.
Ensuring Repeatable Reads with Read-Only Transactions
By default, the consistency model for Oracle guarantees statement-level read consistency, but does not guarantee transaction-level read consistency (repeatable reads). If you want transaction-level read consistency, and if your transaction does not require updates, then you can specify a read-only transaction. After indicating that your transaction is read-only, you can execute as many queries as you like against any database table, knowing that the results of each query in the read-only transaction are consistent with respect to a single point in time.
http://download.oracle.com/docs/cd/B10501_01/appdev.920/a96590/adg08sql.htm
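In Oracle, that pattern looks roughly like the following sketch (table and column names are placeholders):
SET TRANSACTION READ ONLY;
SELECT COUNT(*) FROM orders;
SELECT SUM(amount) FROM orders;  -- both queries see the same point-in-time snapshot
COMMIT;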

Difference between "read commited" and "repeatable read" in SQL Server

I think the above isolation levels are so alike. Could someone please describe with some nice examples what the main difference is?
Read committed is an isolation level that guarantees that any data read was committed at the moment it is read. It simply restricts the reader from seeing any intermediate, uncommitted, 'dirty' read. It makes no promise whatsoever that if the transaction re-issues the read, it will find the same data; data is free to change after it is read.
Repeatable read is a higher isolation level that, in addition to the guarantees of the read committed level, also guarantees that any data read cannot change: if the transaction reads the same data again, it will find the previously read data in place, unchanged, and available to read.
The next isolation level, serializable, makes an even stronger guarantee: in addition to everything repeatable read guarantees, it also guarantees that no new data can be seen by a subsequent read.
Say you have a table T with a column C and one row in it, say with the value '1'. And consider you have a simple task like the following:
BEGIN TRANSACTION;
SELECT * FROM T;
WAITFOR DELAY '00:01:00'
SELECT * FROM T;
COMMIT;
That is a simple task that issues two reads from table T, with a delay of 1 minute between them.
Under READ COMMITTED, the second SELECT may return any data. A concurrent transaction may update the record, delete it, or insert new records. The second SELECT will always see the new data.
Under REPEATABLE READ the second SELECT is guaranteed to return at least the rows that were returned from the first SELECT, unchanged. New rows may be added by a concurrent transaction in that one minute, but the existing rows cannot be deleted or changed.
Under SERIALIZABLE the second SELECT is guaranteed to see exactly the same rows as the first. No row can be changed or deleted, nor can new rows be inserted by a concurrent transaction.
If you follow the logic above you can quickly realize that SERIALIZABLE transactions, while they may make life easy for you, are always completely blocking every possible concurrent operation, since they require that nobody can modify, delete, or insert any row. The default transaction isolation level of the .NET System.Transactions scope is Serializable, and this usually explains the abysmal performance that results.
And finally, there is also the SNAPSHOT isolation level. SNAPSHOT isolation level makes the same guarantees as serializable, but not by requiring that no concurrent transaction can modify the data. Instead, it forces every reader to see its own version of the world (its own 'snapshot'). This makes it very easy to program against as well as very scalable as it does not block concurrent updates. However, that benefit comes with a price: extra server resource consumption.
Supplemental reads:
Isolation Levels in the Database Engine
Concurrency Effects
Choosing Row Versioning-based Isolation Levels
Repeatable Read
The state of the database is maintained from the start of the transaction. If you retrieve a value in session1, then update that value in session2, retrieving it again in session1 will return the same results. Reads are repeatable.
session1> BEGIN;
session1> SELECT firstname FROM names WHERE id = 7;
Aaron
session2> BEGIN;
session2> SELECT firstname FROM names WHERE id = 7;
Aaron
session2> UPDATE names SET firstname = 'Bob' WHERE id = 7;
session2> SELECT firstname FROM names WHERE id = 7;
Bob
session2> COMMIT;
session1> SELECT firstname FROM names WHERE id = 7;
Aaron
Read Committed
Within the context of a transaction, you will always retrieve the most recently committed value. If you retrieve a value in session1, update it in session2, then retrieve it in session1 again, you will get the value as modified in session2. It reads the last committed row.
session1> BEGIN;
session1> SELECT firstname FROM names WHERE id = 7;
Aaron
session2> BEGIN;
session2> SELECT firstname FROM names WHERE id = 7;
Aaron
session2> UPDATE names SET firstname = 'Bob' WHERE id = 7;
session2> SELECT firstname FROM names WHERE id = 7;
Bob
session2> COMMIT;
session1> SELECT firstname FROM names WHERE id = 7;
Bob
Makes sense?
Simply put, the answer, according to my reading and understanding of this thread and #remus-rusanu's answer, is based on this simple scenario:
There are two transactions A and B.
Transaction B is reading Table X
Transaction A is writing in table X
Transaction B is reading again in Table X.
ReadUncommitted: Transaction B can read uncommitted data from Transaction A, so it could see different rows depending on what A is writing. No locks at all.
ReadCommitted: Transaction B can read ONLY data that Transaction A has committed, so it could see different rows, but only based on A's committed writes. Could we call it a simple lock?
RepeatableRead: Transaction B will read the same data (rows) whatever Transaction A is doing, but Transaction A can still change other rows. Row-level blocking.
Serializable: Transaction B will read the same rows as before, and Transaction A cannot insert, update, or delete rows in the range B has read. Range/table-level blocking.
Snapshot: every transaction works on its own copy of the data; each one has its own view.
Old question which has an accepted answer already, but I like to think of these two isolation levels in terms of how they change the locking behavior in SQL Server. This might be helpful for those who are debugging deadlocks like I was.
READ COMMITTED (default)
Shared locks are taken in the SELECT and then released when the SELECT statement completes. This is how the system can guarantee that there are no dirty reads of uncommitted data. Other transactions can still change the underlying rows after your SELECT completes and before your transaction completes.
REPEATABLE READ
Shared locks are taken in the SELECT and then released only after the transaction completes. This is how the system can guarantee that the values you read will not change during the transaction (because they remain locked until the transaction finishes).
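One way to observe the difference is to hold a transaction open and look at the locks the session still owns (a sketch only; T and C are placeholders, reused from the earlier example table):
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT * FROM T WHERE C = 1;

-- Still inside the open transaction, in the same session:
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;
-- Under REPEATABLE READ the shared (S) locks from the SELECT are still listed here;
-- under READ COMMITTED they are already released once the SELECT completes.

COMMIT TRANSACTION;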
Trying to explain this with simple diagrams.
Read Committed: In this isolation level, Transaction T1 will read the updated value of X committed by Transaction T2.
Repeatable Read: In this isolation level, Transaction T1 will not see the changes committed by Transaction T2.
I think this picture can also be useful; it helps me as a reference when I want to quickly remember the differences between isolation levels (thanks to kudvenkat on YouTube).
Please note that the "repeatable" in repeatable read applies to a tuple, not to the entire table. Under the ANSI isolation levels, the phantom read anomaly can still occur, which means reading a table with the same WHERE clause twice may return different result sets. Literally, it's not repeatable.
My observation on the initially accepted solution.
Under RR (the default in MySQL): if a transaction is open and a SELECT has been fired, another transaction can NOT delete any row belonging to the previous read's result set until the previous transaction is committed (in fact, the DELETE statement in the new transaction will just hang); however, the next transaction can delete all rows from the table without any trouble. By the way, a subsequent read in the previous transaction will still see the old data until it is committed.