uncommitted reads? - sql

I was looking at potential concurrency issues in databases, so I went to read up. I found http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/c0005267.htm and it mentions access to uncommitted data.
Access to uncommitted data.
Application A might update a value in
the database, and application B might
read that value before it was
committed. Then, if the value of A is
not later committed, but backed out,
the calculations performed by B are
based on uncommitted (and presumably
invalid) data.
What? Other sessions (in the same app, even the same thread) can read data that has not been committed yet? I thought only the connection/session (I am not sure of my terminology) that wrote the data in the uncommitted transaction could read that uncommitted data.
Can other threads really read data that hasn't been committed?
I plan to use MySQL, but I may use SQLite.

What other sessions can read depends on how you set up your database. In MySQL it also depends on what database engine you use. The term you're looking for (in ANSI SQL terms) is "isolation level".
Many databases will default to an isolation level where reads on uncommitted data will block. So if transaction A updates record 1234 in table T and then transaction B tries to select record 1234 before A commits or rolls back then B will block until A does one of those things.
See MySQL Transactions, Part II - Transaction Isolation Levels.
One serious downside of this is that batch update operations, which typically live in long-running transactions, can block many requests.
You can also set it so B will see uncommitted data but that is often ill-advised.
Alternatively you can use a scheme called MVCC ("Multiversion concurrency control"), which will give different transactions a consistent view of the data based on the time the transaction started. This avoids the uncommitted read problem (reading data that may be rolled back) and is much more scalable, especially in the context of long-lived transactions.
MySQL supports MVCC through the InnoDB storage engine.
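To make the difference between isolation levels concrete, here is a minimal sketch for MySQL with InnoDB, run from two separate connections. The table T and record 1234 follow the example above; the columns `id` and `val` and the value 42 are assumptions made up for illustration.

```sql
-- Session A: change a row but do not commit yet.
START TRANSACTION;
UPDATE T SET val = 42 WHERE id = 1234;

-- Session B (a different connection): opt in to dirty reads.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT val FROM T WHERE id = 1234;   -- may return 42 although A has not committed

-- Session B: switch to READ COMMITTED for its following transactions.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT val FROM T WHERE id = 1234;   -- returns the last committed value, not A's change

-- Session A: back the change out. The 42 that B read earlier was never committed.
ROLLBACK;
```

InnoDB's default level is REPEATABLE READ, under which a plain SELECT reads from a consistent snapshot, so in that setup session B neither blocks nor sees A's uncommitted change.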

In SQL Server you certainly can, but you have to choose to do it; it is not the default. If you use the right isolation level or query hint, you can choose to read an uncommitted row. This can lead to problems and, in theory, even to a double read of the same row.

That article mentions access to uncommitted data as one of the problems eliminated by the database manager.
The database manager controls this
access to prevent undesirable effects,
such as:
...
Access to uncommitted data.
MySQL's InnoDB storage engine supports several transaction isolation levels. For details, see
http://dev.mysql.com/doc/refman/5.4/en/set-transaction.html.

For some versions of some databases, setting queries to be able to read uncommitted will improve performance, because of reduced locking. That still leaves questions of security, reliability, and scalability to be answered.
To give a specific example, I used to work on a very large e-commerce site. They used read uncommitted on reads to the store catalog, since the data was heavily accessed, infrequently changed, and not sensitive to concerns about reading uncommitted data. Any data from the catalog that was used to place an order would be re-verified anyway. This was on SQL Server 2000, which was known to have locking performance problems. On newer versions of SQL Server, the locking performance has improved, so this wouldn't be necessary.

Related

Is it possible to set no lock or TRANSACTION ISOLATION LEVEL READ UNCOMMITTED at database level?

In our application, as per a recommendation from our DBA, we are adding the NOLOCK hint to each and every select query.
This requires modifying every select query to add the table hint, which is time-consuming to do manually.
Since we want to use the hint across all tables in the database, is it possible to set the NOLOCK hint (or TRANSACTION ISOLATION LEVEL READ UNCOMMITTED) at the database level, so that the table hint applies to all queries without modifying each one?
The short answer is "no". The default isolation level in SQL Server is READ COMMITTED, and there is no way to change this to UNCOMMITTED, either globally or per-database. And that's a very good thing too.
WITH (NOLOCK) is a recipe for trouble when it comes to getting accurate results from your database, and in bad cases it can even cause timeouts from queries that run forever due to data getting moved (which NOLOCK cannot protect against). See Is the NOLOCK (Sql Server hint) bad practice? for some more discussion, and some good tips on alternatives.
In particular, many applications that are reader-heavy and want to proceed without blocking can benefit from snapshot isolation. Unlike UNCOMMITTED, you can make snapshot isolation the default with the READ_COMMITTED_SNAPSHOT option. Be sure to read up on the pros and cons of snapshot isolation before you do this -- or better yet, ask your DBA to do this, as any DBA who recommends a global use of WITH (NOLOCK) has some reading up to do. Query hints should be used only as a last resort.
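As a rough sketch of what turning that option on looks like in T-SQL: the database name `MyDb` is a placeholder, and whether `WITH ROLLBACK IMMEDIATE` is acceptable depends on your environment, since it rolls back other sessions' open transactions.

```sql
-- The change needs the database free of other active connections.
-- WITH ROLLBACK IMMEDIATE rolls back open transactions so the ALTER can proceed;
-- treat this as a maintenance-window operation.
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Confirm the setting (1 = on).
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'MyDb';
```

After this, statements running under the default READ COMMITTED level read row versions instead of taking shared locks, and no query text has to change.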

For a long running report, do I use a read only or serializable transaction?

I have a long running report written in SQL*Plus with a couple of SELECTs.
I'd like to change the transaction isolation level to get a consistent view on the data. I found two possible solutions:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
and
SET TRANSACTION READ ONLY;
Which one do I use for a report and why? Any performance implications? Any implications on other sessions (this is a production database).
Please note that the question is specifically about the two options above, not about the various isolation levels.
Does SERIALIZABLE block changes to a table that is queried for my report?
I would naively assume that READ ONLY is a little bit less stressful for the database, as there are no data changes to be expected. Is this true, does Oracle take advantage of that?
In Oracle, you can really choose between SERIALIZABLE and READ COMMITTED.
READ ONLY is the same as serializable, in regard to the way it sees other sessions' changes, with the exception it does not allow table modifications.
With SERIALIZABLE or READ ONLY your queries won't see the changes made to the database after your serializable transaction had begun.
With READ COMMITTED, your queries won't see the changes made to the database during the queries' lifetime.
SERIALIZABLE         READ COMMITTED       ANOTHER SESSION
(or READ ONLY)
                                          Change 1
Transaction start    Transaction start
                                          Change 2
Query1 Start         Query1 Start
...                  ...                  Change 3
Query1 End           Query1 End
Query2 Start         Query2 Start
...                  ...                  Change 4
Query2 End           Query2 End
With serializable, query1 and query2 will only see change1.
With read committed, query1 will see changes 1 and 2, and query2 will see changes 1 through 3.
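For the report case, a minimal sketch of the READ ONLY variant in Oracle might look like the following. The tables `orders` and `order_items` and the column `amount` are hypothetical stand-ins for the report's real queries.

```sql
-- Must be the first statement of the transaction.
SET TRANSACTION READ ONLY;

-- Both queries see only changes committed before the transaction began,
-- no matter what other sessions commit while the report runs.
SELECT COUNT(*) FROM orders;
SELECT SUM(amount) FROM order_items;

-- End the read-only transaction (COMMIT or ROLLBACK both work).
COMMIT;
```

Replacing the first statement with SET TRANSACTION ISOLATION LEVEL SERIALIZABLE gives the same view of the data for these two SELECTs.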
This article from the Oracle documentation gives a lot of detailed info about the different transaction isolation levels. http://docs.oracle.com/cd/B10501_01/server.920/a96524/c21cnsis.htm
In your example, it sounds like you are wanting Serializable. Oracle does not block when reading data, so using serializable in your read-only query should not block queries or crud operations in other transactions.
As mentioned in other answers, using the read only isolation level is similar to using serializable, except that read only does not allow inserts, updates, or deletes. However, since read only is not part of the SQL standard and serializable is, I would use serializable in this situation: it should accomplish the same thing, it will be clear to other developers in the future, and Oracle provides more detailed documentation about what is going on "behind the scenes" with the serializable isolation level.
Here is some info about serializable, from the article referenced above (I added some comments in square brackets for clarification):
Serializable isolation mode provides somewhat more consistency by
protecting against phantoms [reading inserts from other transactions] and nonrepeatable reads [reading updates/deletes from other transactions] and can be
important where a read/write transaction executes a query more than
once.
Unlike other implementations of serializable isolation, which lock
blocks for read as well as write, Oracle provides nonblocking queries [non-blocking reads]
and the fine granularity of row-level locking, both of which reduce
write/write contention.
.. a lot of questions.
Both isolation levels are equivalent for sessions that only use selects, so it does not make a difference which one you choose. (READ ONLY is not an ANSI standard.)
Apart from performance effects, a session using SERIALIZABLE or READ ONLY has no implications for other sessions, and other sessions have none for it, unless you commit changes in that session (possible with SERIALIZABLE only).
Performance of your select inside these two isolation levels should not differ because you don't change data there.
Performance using one of these two isolation levels compared against Oracle default READ COMMITTED is not optimal. Especially if a lot of data is changing during your SERIALIZABLE transaction, you can expect a performance downside.
I would naively assume that READ ONLY is a little bit less stressful for the database, as there are no data changes to be expected. Is this true, does Oracle take advantage of that?
=> No.
hope this helps.
Interesting question. I believe that SERIALIZABLE and READ ONLY would have the same "tax" on the database, and that it would be greater than that of READ COMMITTED (usually the default). There shouldn't be a significant performance difference to you or other concurrent users. However, if the database can't maintain your read consistency because the UNDO tablespace is too small or undo_retention is too short (the default is 15 minutes), then your query will fail with the infamous ORA-01555. Other users shouldn't experience pain, unless there are other users trying to do something similar. Ask your DBA what the undo_retention parameter is set to, how big the UNDO tablespace is, and whether or not it's autoextensible.
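If you have access to the relevant dictionary views yourself, the undo settings can be checked with something like the sketch below. It assumes SELECT privileges on `v$parameter` and `dba_data_files`, and the tablespace name `UNDOTBS1` is only a guess at a common default.

```sql
-- Undo retention target in seconds (900 = 15 minutes) and the active undo tablespace.
SELECT name, value
FROM v$parameter
WHERE name IN ('undo_retention', 'undo_tablespace');

-- Size and autoextend setting of the undo datafiles.
-- 'UNDOTBS1' is an assumed name; use the undo_tablespace value returned above.
SELECT file_name, bytes / 1024 / 1024 AS size_mb, autoextensible
FROM dba_data_files
WHERE tablespace_name = 'UNDOTBS1';
```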
If there's a similarly sized non-prod environment, try benchmarking your queries with different isolation levels. Check the database for user locks after the 1st query runs (or have your DBA check if you don't have enough privileges). Do this test several times, each with different isolation levels. Basically, documentation is great, but an experiment is often quicker and unequivocal.
Finally, to dodge the issue completely, is there any way you could combine your two queries into one, perhaps with a UNION ALL? This depends largely on the relationship of your two queries. If so, then the question becomes moot. The one combined query would be self-consistent.

Understanding the NOLOCK hint

Let's say I have a table with 1,000,000 rows and running a SELECT * FROM TableName on this table takes around 10 seconds to return the data.
Without a NOLOCK statement (putting to one side issues around dirty reads) would this query lock the table for 10 seconds meaning that no other process could read or write to the table?
I am often informed by DBAs then when querying live data to diagnose data issues I should use NOLOCK to ensure that I don't lock the table which may cause issues for users. Is this true?
With the NOLOCK table hint, no shared locks are taken on the table in question. The READUNCOMMITTED isolation level does the same, but it applies to everything involved rather than to a single table. So the answer is "no, it won't lock the table". Note that schema locks may still be held even with READUNCOMMITTED.
Because it does not ask for shared locks, the read operation can potentially read dirty data (updated but not yet committed) and non-existent data (updated, but then rolled back). The result is transactionally inconsistent in any case, and the read might even skip whole pages (if page splits happen at the same time as the read operation).
Specifying either NOLOCK or READUNCOMMITTED is not considered a good practice. It will be faster, of course. Just make sure you're aware of consequences.
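For reference, the two forms discussed above look like this in T-SQL. `TableName` is the table from the question; nothing else is assumed.

```sql
-- Table hint: only this table reference is read without shared locks.
SELECT * FROM TableName WITH (NOLOCK);

-- Isolation level: applies to every table read by later statements on this connection.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM TableName;

-- Return the connection to the SQL Server default when done.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```

Both SELECTs are subject to the same dirty-read caveats; the difference is only the scope of the setting.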
Also, support for these hints in UPDATE and DELETE statements will be removed in a future version, according to docs.
Even with NOLOCK, SQL Server can choose to override and obtain a lock, nevertheless. It is called a query HINT for a reason. The surefire way of avoiding locks would be to use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED at the beginning of the session.
Your query will try to obtain a table-level lock, as you're asking for all the data. So yes, it will be held for the duration of the query, and all other processes trying to acquire exclusive locks on that table will be put on wait queues. Processes that are also trying to read from that table will not be blocked.
Any diagnosis performed on a live system should be done with care, and using the NOLOCK hint will allow you to view data without creating any contention for users.
EDIT: As pointed out, update locks are compatible with shared locks, so they won't be blocked by the read process.

Which database isolation level should we use, and which isolation level is best?

In SQL Server 2005,
I have many stored procedures; some update table records inside transactions, and some are used to fetch table records.
When one SP that updates table records is running in one session, and at that time I run another SP to get table data, the second one should run without waiting. What do I need to do?
Which database isolation level should we use for this, and which isolation level is best?
It may be possible with the snapshot transaction isolation level, but that reads old snapshot data from the tempdb database, which can degrade performance.
What do you suggest?
READ COMMITTED is the default transaction isolation level in SQL Server.
Use of any other isolation level constitutes a code smell (to me, at least) that requires some real justification, with the possible exception of the limited use of READ UNCOMMITTED in certain contexts.
You should use SNAPSHOT isolation level. The best is to turn on READ COMMITTED SNAPSHOT at the database level, which will silently transform all the default READ COMMITTED transactions into snapshot transactions.
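For a reader that wants explicit SNAPSHOT isolation instead of the database-wide READ COMMITTED SNAPSHOT switch, a minimal T-SQL sketch could look like this. The database `MyDb`, the table `dbo.Orders` and the column `CustomerId` are hypothetical names.

```sql
-- One-time database setting that permits SNAPSHOT transactions.
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the reading session (for example, inside the stored procedure that fetches data):
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    -- Reads committed row versions; it does not wait on rows locked by
    -- a concurrent updating transaction.
    SELECT * FROM dbo.Orders WHERE CustomerId = 42;
COMMIT TRANSACTION;
```

The row versions are kept in tempdb, which is the resource cost the question mentions, so it is worth measuring tempdb usage after enabling this.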
The reasons why SNAPSHOT is best for applications, especially for applications that can run into blocking due to concurrency issues, are countless, and the benefits are endless. It is true that SNAPSHOT isolation incurs a cost in resources used, but unless you have measured and found conclusive evidence that it is the row version store that is causing the problems, you cannot dismiss it upfront.
The one isolation level one should never use is UNCOMMITTED. That is asking for trouble. Dirty reads are inconsistent reads.
REPEATABLE and SERIALIZABLE have extremely narrow use cases and you'll most likely never need them. Unfortunately SERIALIZABLE is abused by the .Net System.Transactions and by MTS/COM+, so a lot of applications end up using it (and having huge scalability issues because of it), although it is not required.
If the first transaction has updated data in some table, then the second will have to wait to read that data anyway (except under the READ UNCOMMITTED isolation level, but in that case you can get very inconsistent data).

Sql 2005 Locking for OLTP - Committed or Uncommitted?

A DBA that my company hired to troubleshoot deadlock issues just told me that our OLTP databases' locking problems will improve if we set the transaction level to READ COMMITTED from READ UNCOMMITTED.
Isn't that just 100% false? READ COMMITTED will cause more locks, correct?
More Details:
Our data is very "siloed" and user specific. 99.9999999 % of all user interactions work with your own data, and our dirty read scenarios, if they happen, can barely affect what the user is trying to do.
Thanks for all the answers, the dba in question ended up being useless, and we fixed the locking issues by adding a single index.
I regret that I didn't specify that the locking problems were occurring for update statements and not regular selects. From my googling, the two different query types have distinct solutions when dealing with locking issues.
That does sound like a bit of a rash decision, however without all the details of your environment it is difficult to say.
You should advise your DBA to consider the use of SQL Server's advanced isolation features, i.e. the use of row versioning techniques. These were introduced in SQL Server 2005 specifically to address issues with OLTP databases that experience high locking.
The following white paper contains quite complicated subject matter, but it is a must read for all exceptional DBAs. It includes examples of how to use each of the additional isolation levels in different types of environments, i.e. OLTP, offloaded reporting environments, etc.
http://msdn.microsoft.com/en-us/library/ms345124.aspx
In summary, it would be both foolish and rash to modify the transaction isolation for all of your T-SQL queries without first developing a solid understanding of how the excessive locking is occurring within your environment.
I hope this helps but please let me know if you require further clarification.
Cheers!
Doesn't it depend on what your problem is: for example if your problem is a deadlock, mightn't an increase in the locking level cause an earlier acquisition of locks and therefore a decreased possibility of deadly embrace?
If the data is siloed and you are still getting deadlocks, then you may simply need to add rowlock hints to the queries that are causing the problem, so that the locks are taken at the row level and not the page level (which is the default).
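A rowlock hint on an update might be written as in the sketch below. The table `dbo.UserSettings`, its columns and the values are invented for illustration; only the `WITH (ROWLOCK)` hint itself is the point.

```sql
-- SQL Server 2005 compatible declaration (no inline initialisation).
DECLARE @UserId int;
SET @UserId = 42;

-- Ask for row-level locks on the rows this statement touches.
UPDATE dbo.UserSettings WITH (ROWLOCK)
SET Theme = 'dark'
WHERE UserId = @UserId;
```

The hint is only a request: SQL Server can still escalate to coarser locks, and without a usable index on the filter column the statement has to scan and lock far more rows than it changes. That fits the asker's own report that adding a single index fixed the problem.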
READ UNCOMMITTED will reduce the number of locks if you are locking data because of SELECT statements. If you are locking data because of INSERT, UPDATE and DELETE statements then changing the isolation level to READ UNCOMMITTED won't do anything for you. READ UNCOMMITTED has the same effect as adding WITH (NOLOCK) to your queries.
This sounds scary. Do you really want to just change these parameters to avoid deadlocks? Maybe the data needs to be locked?
That said, it could be that the DBA is referring to the new (as of SQL Server 2005) READ COMMITTED SNAPSHOT that uses row versioning and can eliminate some kinds of deadlocks.
http://www.databasejournal.com/features/mssql/article.php/3566746/Controlling-Transactions-and-Locks-Part-5-SQL-2005-Snapshots.htm