SQL isolation levels, read and write locks

A bit of a lame question, but I got confused...
As far as I understand, the difference between isolation levels is in how they manage their locks (http://en.wikipedia.org/wiki/Isolation_(database_systems)). The article mentions Read, Write and Range locks, but it never defines what they actually are:
what you are allowed to do with each, and what not. When I googled it I found nothing concrete,
and instead got confused by new terms like pessimistic lock, optimistic lock, exclusive lock, gap lock and so on. I'd be pleased if someone could give me a short overview and maybe point me to some good materials to read up on.
My initial question, which started my research into isolation levels, was:
What happens when I have concurrent inserts (different users of a web app) into one table when my transaction isolation level is READ_COMMITTED? Is the whole table locked or not?
Or generally what happens down there :) ?
Thanks in advance!

What happens when I have concurrent inserts (different users of a web app) into one table when my transaction isolation level is READ_COMMITTED?
"Read committed" means that other sessions cannot see the newly inserted row until its transaction is committed. A SQL statement that runs without an explicit transaction is wrapped in an implicit one, so "read committed" affects all inserts.
Some databases implement "read committed" with locks: for example, the newly inserted row is locked exclusively, preventing other transactions from reading it until the insert is committed. Other databases, like Oracle, use multiversion concurrency control. That means they can reconstruct a version of the database as it was before the insert, which lets them implement "read committed" without blocking readers.
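To make the concurrent-insert case concrete, here is a rough two-session sketch at READ COMMITTED (the orders table and its columns are invented for illustration; exact syntax varies by engine):

-- Session 1 (web user A)
BEGIN TRANSACTION;
INSERT INTO orders (id, user_id) VALUES (1, 42);
-- not yet committed

-- Session 2 (web user B), also READ COMMITTED
INSERT INTO orders (id, user_id) VALUES (2, 43);  -- succeeds: only the new rows are locked, not the whole table
SELECT * FROM orders WHERE id = 1;                -- does not return session 1's uncommitted row
                                                  -- (MVCC engines return the pre-insert snapshot;
                                                  --  purely locking engines may block on that row instead)

-- Session 1
COMMIT;                                           -- session 2's next query can now see id = 1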

As I understand it, the isolation level decides how and when locks are acquired and released.
Ref: http://aboutsqlserver.com/2011/04/28/locking-in-microsoft-sql-server-part-2-locks-and-transaction-isolation-levels/

This is what I was looking for ...
http://en.wikipedia.org/wiki/Two-phase_locking

Related

How does transaction isolation level work with respect to read/writes and read/write locks?

I understand the dirty read, non-repeatable read and phantom read issue.
Also I have read about isolation levels: read uncommitted, read committed, repeatable read, serializable.
I also understand that reading results in a shared lock. To get a shared lock, there shouldn't already be an active exclusive lock. Whereas insert/update/delete results in an exclusive lock; to get an exclusive lock, there shouldn't be any other exclusive or shared lock active.
For each level, none of the articles I have read explain the isolation level concept with respect to:
Whether the level is applicable to a read or write transaction or both.
Whether reading/writing enforces any read/write locks different to the above explanation
A transaction is an all-or-nothing concept with regard to writes. Is the transaction isolation level, by contrast, a concept that applies to reads only?
If anyone can enlighten me on these points for each level, it would be very helpful.
You might find these articles by Paul White to be very useful.
But in answer to your questions:
Firstly, Shared vs Exclusive locks define what is allowed to happen concurrently against the lock. The isolation level defines how much is locked and how long for.
Isolation level is applicable to both types of transactions. SNAPSHOT in particular has different effects depending on whether a write is involved or not.
There are also Intent locks, which are taken at a coarser granularity (page, table, partition) to signal that shared or exclusive locks are held at a finer level, and which support escalating a lock from page or row level up to table/partition level.
You also have Schema Modification locks, which prevent anyone changing the table/column (or index) definitions from underneath you (this is applicable even to NOLOCK).
The isolation level defines how much gets locked (is it a row or a range?) and what happens to a lock after it has been used: is it held until the end of the transaction, or released as soon as the statement that needed it completes?
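As a rough illustration of the "how long" part, here is a SQL Server style sketch (the table and column names are invented): under READ COMMITTED the shared lock from the first SELECT is released when the statement finishes, while under REPEATABLE READ it is held to the end of the transaction.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    SELECT Balance FROM dbo.Accounts WHERE AccountId = 42;  -- takes a shared lock on the row
    -- under REPEATABLE READ the shared lock is kept, so another session's UPDATE of this row now blocks
    SELECT Balance FROM dbo.Accounts WHERE AccountId = 42;  -- guaranteed to return the same value
COMMIT;                                                      -- locks released here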

For a long-running report, do I use a read-only or serializable transaction?

I have a long running report written in SQL*Plus with a couple of SELECTs.
I'd like to change the transaction isolation level to get a consistent view on the data. I found two possible solutions:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
and
SET TRANSACTION READ ONLY;
Which one do I use for a report, and why? Any performance implications? Any implications for other sessions (this is a production database)?
Please note that the question is specifically about the two options above, not about the various isolation levels.
Does SERIALIZABLE block changes to a table that is queried for my report?
I would naively assume that READ ONLY is a little bit less stressful for the database, as there are no data changes to be expected. Is this true, does Oracle take advantage of that?
In Oracle, you can really only choose between SERIALIZABLE and READ COMMITTED.
READ ONLY is the same as SERIALIZABLE in regard to the way it sees other sessions' changes, with the exception that it does not allow table modifications.
With SERIALIZABLE or READ ONLY, your queries won't see changes made to the database after your serializable transaction began.
With READ COMMITTED, your queries won't see the changes made to the database during the queries' lifetime.
SERIALIZABLE          READ COMMITTED        ANOTHER SESSION
(or READ ONLY)
                                            Change 1
Transaction start     Transaction start
                                            Change 2
Query1 Start          Query1 Start
...                   ...                   Change 3
Query1 End            Query1 End
Query2 Start          Query2 Start
...                   ...                   Change 4
Query2 End            Query2 End
With serializable, query1 and query2 will only see change1.
With read committed, query1 will see changes 1 and 2, and query2 will see changes 1 through 3.
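In SQL*Plus the report would then look roughly like this (the report queries themselves are placeholders):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;     -- or: SET TRANSACTION READ ONLY;
SELECT COUNT(*)    FROM orders;                   -- Query1: sees only Change 1
SELECT SUM(amount) FROM payments;                 -- Query2: still sees only Change 1
COMMIT;                                           -- ends the consistent transaction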
This article from the Oracle documentation gives a lot of detailed info about the different transaction isolation levels. http://docs.oracle.com/cd/B10501_01/server.920/a96524/c21cnsis.htm
In your example, it sounds like you want SERIALIZABLE. Oracle does not block when reading data, so using SERIALIZABLE for your read-only report should not block queries or CRUD operations in other transactions.
As mentioned in other answers, the READ ONLY isolation level is similar to SERIALIZABLE, except that READ ONLY does not allow inserts, updates, or deletes. However, since READ ONLY is not part of the SQL standard and SERIALIZABLE is, I would use SERIALIZABLE in this situation: it should accomplish the same thing, it will be clearer to other developers in the future, and Oracle provides more detailed documentation about what goes on "behind the scenes" with the serializable isolation level.
Here is some info about serializable, from the article referenced above (I added some comments in square brackets for clarification):
Serializable isolation mode provides somewhat more consistency by protecting against phantoms [reading inserts from other transactions] and nonrepeatable reads [reading updates/deletes from other transactions] and can be important where a read/write transaction executes a query more than once.
Unlike other implementations of serializable isolation, which lock blocks for read as well as write, Oracle provides nonblocking queries [non-blocking reads] and the fine granularity of row-level locking, both of which reduce write/write contention.
... a lot of questions.
Both isolation levels are equivalent for sessions that only run SELECTs, so it does not make a difference which one you choose. (READ ONLY is not an ANSI standard.)
Apart from performance effects, there are no implications from other sessions, or to other sessions, for a session running at SERIALIZABLE or READ ONLY, unless you commit something in that session (possible with SERIALIZABLE only).
Performance of your SELECTs should not differ between these two isolation levels, because you don't change data there.
Compared with Oracle's default READ COMMITTED, however, both isolation levels carry some overhead; especially if a lot of data changes during your SERIALIZABLE transaction, you can expect a performance downside.
I would naively assume that READ ONLY is a little bit less stressful for the database, as there are no data changes to be expected. Is this true, does Oracle take advantage of that?
=> No.
hope this helps.
Interesting question. I believe that SERIALIZABLE and READ ONLY would put the same "tax" on the database, greater than that of READ COMMITTED (usually the default). There shouldn't be a significant performance difference for you or other concurrent users. However, if the database can't maintain your read consistency because the UNDO tablespace is too small or undo_retention is too short (the default is 15 minutes), your query will fail with the infamous ORA-01555. Other users shouldn't experience pain, unless they are trying to do something similar. Ask your DBA what the undo_retention parameter is set to, how big the UNDO tablespace is, and whether or not it is autoextensible.
If there's a similarly sized non-prod environment, try benchmarking your queries with different isolation levels. Check the database for user locks after the 1st query runs (or have your DBA check if you don't have enough privileges). Do this test several times, each with different isolation levels. Basically, documentation is great, but an experiment is often quicker and unequivocal.
Finally, to dodge the issue completely, is there any way you could combine your two queries into one, perhaps with a UNION ALL? This depends largely on the relationship of your two queries. If so, then the question becomes moot. The one combined query would be self-consistent.
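For example, if the two reports are simple aggregates, something along these lines would do (table and column names are purely illustrative); a single statement is read-consistent as a whole even at the default READ COMMITTED:

SELECT 'orders'   AS source, COUNT(*) AS cnt FROM orders
UNION ALL
SELECT 'payments' AS source, COUNT(*) AS cnt FROM payments;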

Will transactions stop other code from reading inconsistent data?

I have a stored procedure that inserts into several tables in a single transaction. I know transactions can maintain data consistency in non-concurrent situations by allowing rollbacks after errors, power failure, etc., but if other code selects from these tables before I commit the transaction, could it possibly select inconsistent data?
Basically, can you select uncommitted transactions?
If so, then how do people typically deal with this?
This depends on the isolation level of the read query rather than the writing transaction. It can be set centrally on the connection or supplied as a table hint in the SELECT.
See:
Connection side: http://msdn.microsoft.com/en-us/library/system.data.isolationlevel.aspx
Database side: http://msdn.microsoft.com/en-us/library/ms173763.aspx
As already mentioned by Aliostad, this depends on the selected isolation level. The Wikipedia article has examples of the different common scenarios.
So yes, you can choose to get uncommitted data, but only by choice. I never did that and I have to admit that the idea seems a bit ... dangerous to me. But there are probably reasonable use cases.
Extending Aliostad's answer:
By default, other reading processes won't read data that is being changed (uncommitted data, a.k.a. "dirty reads"). This applies to all clients and drivers.
You have to override this default deliberately, with the NOLOCK hint or by changing the isolation level, to allow dirty reads.
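For completeness, these are the two usual ways of opting into dirty reads in SQL Server (the table name is just an example); both are deliberate overrides and should be used sparingly:

SELECT * FROM dbo.Orders WITH (NOLOCK);             -- per-table hint on a single query

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;   -- per-session setting
SELECT * FROM dbo.Orders;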

Which database isolation level should we use for this, and which isolation level is best?

In SQL Server 2005,
I have many stored procedures; some update table records within transactions, and some just read table records.
When one SP is running in one session and updating table records, and at the same time I run another SP to read table data, the second one should run without waiting. What do I need to do?
Which database isolation level should we use for this, and which isolation level is best?
It may be possible with the snapshot isolation level, but that keeps old row versions in the tempdb database, which can degrade performance.
What do you suggest?
READ COMMITTED is the default transaction isolation level in SQL Server.
Use of any other isolation level constitutes, to me at least, a code smell that requires some real justification, with the possible exception of limited use of READ UNCOMMITTED in certain contexts.
You should use SNAPSHOT isolation level. The best is to turn on READ COMMITTED SNAPSHOT at the database level, which will silently transform all the default READ COMMITTED transactions into snapshot transactions.
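A sketch of what that looks like (the database name is a placeholder; the ALTER DATABASE statements typically need exclusive access to the database to complete):

ALTER DATABASE MyAppDb SET READ_COMMITTED_SNAPSHOT ON;
-- existing READ COMMITTED transactions now read row versions instead of taking shared locks

-- full SNAPSHOT isolation can additionally be enabled and requested per transaction:
ALTER DATABASE MyAppDb SET ALLOW_SNAPSHOT_ISOLATION ON;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;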
The reasons why SNAPSHOT is best for applications, especially applications that can run into blocking due to concurrency issues, are countless, and the benefits are substantial. It is true that SNAPSHOT isolation incurs a cost in the resources used, but unless you measure and find conclusive evidence that the row version store is causing problems, you cannot dismiss it up front.
The one isolation level you should never use is READ UNCOMMITTED. That is asking for trouble: dirty reads are inconsistent reads.
REPEATABLE READ and SERIALIZABLE have extremely narrow use cases, and you'll most likely never need them. Unfortunately, SERIALIZABLE is abused by .NET System.Transactions and by MTS/COM+, so a lot of applications end up using it (and having huge scalability issues because of it) even though it is not required.
If the first transaction updated data in some table, the second will have to wait for that data anyway (except at the READ UNCOMMITTED isolation level, but in that case you can get very inconsistent data).

uncommitted reads?

I was looking at potential concurrency issues in the DB, so I went to read up. I found http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/c0005267.htm and it mentions access to uncommitted data.
Access to uncommitted data. Application A might update a value in the database, and application B might read that value before it was committed. Then, if the value of A is not later committed, but backed out, the calculations performed by B are based on uncommitted (and presumably invalid) data.
What... I thought other sessions (same app and even same thread) could not read data that has not been committed yet? I thought only the connection/session (I am not sure of my terminology) that wrote the data in the uncommitted transaction can read that uncommitted data.
Can other threads really read data that hasn't been committed?
I plan to use MySQL, but I may use SQLite.
What other sessions can read depends on how you set up your database. In MySQL it also depends on what database engine you use. The term you're looking for (in ANSI SQL terms) is "isolation level".
Many databases will default to an isolation level where reads on uncommitted data will block. So if transaction A updates record 1234 in table T and then transaction B tries to select record 1234 before A commits or rolls back, B will block until A does one of those things.
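Sketched out, using a locking read so the behaviour is the same on MVCC engines such as InnoDB, where a plain SELECT would not block (the val column is invented):

-- Session A
BEGIN;
UPDATE T SET val = val + 1 WHERE id = 1234;    -- row 1234 is now locked

-- Session B
SELECT * FROM T WHERE id = 1234 FOR UPDATE;    -- blocks until A commits or rolls back

-- Session A
COMMIT;                                         -- B's SELECT now returns, seeing the new value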
See MySQL Transactions, Part II - Transaction Isolation Levels.
One serious downside of this is that batch update operations, which typically live in long-running transactions, can potentially block many requests.
You can also set it so B will see uncommitted data but that is often ill-advised.
Alternatively you can use a scheme called MVCC ("Multiversion concurrency control"), which will give different transactions a consistent view of the data based on the time the transaction started. This avoids the uncommitted read problem (reading data that may be rolled back) and is much more scalable, especially in the context of long-lived transactions.
MySQL supports MVCC.
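In MySQL/InnoDB you can check and change the level per session, for example as follows (the variable was named tx_isolation in older versions and transaction_isolation in later ones):

SELECT @@tx_isolation;                                    -- InnoDB defaults to REPEATABLE-READ
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;   -- affects subsequent transactions in this session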
Certainly in SQL Server you can. You have to choose to do it, as it is not the default, but if you use the right isolation level or query hint you can choose to read uncommitted rows. This can lead to problems, and in theory even to reading the same row twice.
That article mentions access to uncommitted data as one of the problems eliminated by the database manager.
The database manager controls this access to prevent undesirable effects, such as:
...
Access to uncommitted data.
MySQL's InnoDB storage engine supports several transaction isolation levels. For details, see
http://dev.mysql.com/doc/refman/5.4/en/set-transaction.html.
For some versions of some databases, setting queries to be able to read uncommitted will improve performance, because of reduced locking. That still leaves questions of security, reliability, and scalability to be answered.
To give a specific, I used to work on a very large e-commerce site. They used read uncommitted on reads to the store catalog, since the data was heavily accessed, infrequently changed, and not sensitive to concerns about reading uncommitted data. Any data from the catalog that was used to place an order would be re-verified anyway. This was on SQL Server 2000, which was known to have locking performance problems. On newer versions of SQL Server, the locking performance has improved, so this wouldn't be necessary.