What happens to Isolation Level in Stored Proc that fails - sql

I have a SP that takes about 30 seconds to run. 10 seconds of this creating an initial temp table on which I operate eg.
SELECT * INTO #tmp FROM view_name ....(lots of joins and where conditions)
My understanding is that by default this long select (insert) can lock the table and therefore the application attached to the database?
My solution to this was to add
SET ISOLATION TRANSACTION LEVEL READ UNCOMMITTED;
to the top of the Stored Proc and then reset it to COMMITED at the end.
I guess this is a two part question firstly is my understanding of this correct and secondly if somewhere in my SP it fails with an error. Will the DB remain in UNCOMMITTED mode?

For the specific case of code executed from a stored procedure, the isolation level is reset to what it was before you called the stored procedure, regardless of what happens. From the docs:
If you issue SET TRANSACTION ISOLATION LEVEL in a stored procedure or
trigger, when the object returns control the isolation level is reset
to the level in effect when the object was invoked.
If you're not using a stored procedure, or you're dealing with code inside the stored procedure, the (much more complicated) story follows below.
Isolation level is a property of the session, not of the database. Whether or not a statement like SET TRANSACTION ISOLATION LEVEL executes after an error depends on whether the error aborts the entire batch or only the statement. The rules for this are complicated and unintuitive. To make things even more interesting, the isolation level of a session is not reset if it's associated with a pooled connection. If you want foolproof code that maintains an isolation level, preferably set the isolation level consistently from client code. If you must do it in T-SQL, it's a good idea to use SET XACT_ABORT ON and TRY .. CATCH, or the WITH (NOLOCK) table hint to restrict the locking options to a particular table or tables.
But why complicate your life? Consider if you really need to use READ UNCOMMITTED (or WITH (NOLOCK)), because you have no guarantee of data consistency anymore when you do, and even if you think you're OK with that, it can lead to results that aren't just a little bit wrong but completely useless. Unfortunately, this is true especially when the load on the database increases (which is why you might consider reducing locking in the first place).
If your query takes a long time while locking data, and that is a problem to applications (neither of which should be assumed, because SQL Server is good at locking data for only as long as necessary), your first instinct should be to see if the query itself can be optimized to reduce lock time (by adding indexes, most likely). If you've still got unacceptable locking left, consider using snapshot isolation. Only after those things have been evaluated should you consider dirty reads, and even then only for cases where consistency of the data is not your primary concern (for example, the query gets some monitoring statistics that are refreshed every so often anyway).

Related

Do default Tranactions in postgresql provide any benefit when only the last statement is writing?

I just learned that in Postgresql the default transaction isolation level is "Read committed". I'm very used to MySQLs "REPEATABLE READ" isolation level. In postgresql by my understanding this means in a default transaction "two successive SELECT commands can see different data". With that in mind, is there any benefit to transactions when only the last statement in the transaction is writing?
The transaction does not prevent you from data changing between statements, the only benefit I see is rolling the transaction back on failure. But if only one writing statement exists at the end, then that would happen anyway.
To make a bit more clear what I'm referring to, lets take a generic simple sequence of (pseudo) queries to a table:
BEGIN TRANSACTION
SELECT userId FROM users WHERE username = "the provided username"
INSERT INTO activites (activity, user_fk) VALUES ("posted on SO", userId)
COMMIT
In this sequence and any general sequence of statments where only the last statement is writing, is there a benefit in postgresql to using a transaction with the default isolation level?
Bonus question, is there any overhead from it?
The difference between READ COMMITTED and REPEATABLE READ is that the former takes a new database snapshot for each statement, while the latter takes a snapshot only for the first SQL statement and uses that snapshot for the whole transaction. This implies that REPEATABLE READ actually performs better that READ COMMITTED, since it takes fewer snapshots.
The disadvantage of REPEATABLE READ is that you can get serialization errors. That does not affect your example, but if you had an UPDATE instead of an INSERT, it could be that the row you are trying to update has been modified by a concurrent transaction since the snapshot was taken. The serialization error that causes would mean that you have to repeat the transaction. Another disadvantage of REPEATABLE READ transactions is that a long-running read-only transaction can hinder the progress of VACUUM, which it wouldn't do in READ COMMITTED mode.
For read-only transactions or transactions like the one you are showing, REPEATABLE READ is often the better isolation level. The nice thing about READ COMMITTED is that you can get no serialization errors apart from deadlocks.
To explicitly answer your question: there is no advantage to running the statement from your example in a single transaction. You may as well use the default autocommit mode to run them in separate transactions.
Incidentally, the SQL standard decrees that the default transaction isolation level be SERIALIZABLE, but I don't know any database that implements that.

Avoid dirty/phantom reads while selecting from database

I have two tables A and B.
My transactions are like this:
Read -> read from table A
Write -> write in table B, write in table A
I want to avoid dirty/phantom reads since I have multiple nodes making request to same database.
Here is an example:
Transaction 1 - Update is happening on table B
Transaction 2 - Read is happening on table A
Transaction 1 - Update is happening on table A
Transaction 2 - Completed
Transaction 1 - Rollback
Now Transaction 2 client has dirty data. How should I avoid this?
If your database is not logged, there is nothing you can do. By choosing an unlogged database, those who set it up decided this sort of issue was not a problem. The only way to fix the problem here is change the database mode to logged, but that is not something you do casually on a whim — there are lots of ramifications to the change.
Assuming your database is logged — it doesn't matter here whether it is buffered logging or unbuffered logging or (mostly) a MODE ANSI database — then unless you set DIRTY READ isolation, you are running using at least COMMITTED READ isolation (it will be Informix's REPEATABLE READ level, standard SQL's SERIALIZABLE level, if the database is MODE ANSI).
If you want to ensure that data rows do not change after a transaction has read them, you need to run at a higher isolation — REPEATABLE READ. (See SET ISOLATION in the manual for the details. (Beware of the nomenclature for SET TRANSACTION; there's a section of the manual about Comparing SET ISOLATION and SET TRANSACTION and related sections.) The downside to using SET ISOLATION TO REPEATABLE READ (or SET TRANSACTION ISOLATION LEVEL SERIALIZABLE) is that the extra locks needed reduce concurrency — but give you the best guarantees about the state of the database.

What is wrong with the below tsql code

BEGIN TRAN
SELECT * FROM AnySchema.AnyTable
WHERE AnyColumn = SomeCondition
COMMIT
I know the transaction is not required here because it is just a select but just want to know how bad a programming it is and whether it is going to be an overhead on the DB engine.
You may use transaction on SELECT statements to ensure nobody else could update / delete records of the table of while the bunch of your select queries are executing.
Using WITH(NOLOCK):
Anyways, you may also use WITH(NOLOCK) for t_sql
SELECT * FROM AnySchema.AnyTable WITH(NOLOCK)
WHERE AnyColumn = SomeCondition
WITH (NOLOCK) is the equivalent of using READ UNCOMMITED as a transaction isolation level. Here stand the risk of reading an uncommitted row that is subsequently rolled back, i.e. data that never made it into the database.
So, while it can prevent reads being deadlocked by other operations, it comes with a risk.
TRANSACTION Block :
Using TRANSACTION block will not cause much of extra DB overload but if you keep the same type practice on, and suppose , at any SQL block you forget (you / your developers may forget, right ?) to close the transaction, then other processes can't work on the same table.
Anyways, it depends on what type of application you are using. If very frequent update and select things are there , it is advised not to use such transaction blocks. If medium level of updates and select are there, occurs next to each other, you may use transaction blocks for select (but ensure to close the transaction).
That's actually a good question. To understand what's going on, you need to know about SET IMPLICIT_TRANSACTIONS.
When ON, the system is in implicit transaction mode. This means that if ##TRANCOUNT = 0, any of the following Transact-SQL statements begins a new transaction. It is equivalent to an unseen BEGIN TRANSACTION being executed first:
When OFF, each of the preceding T-SQL statements is bounded by an unseen BEGIN TRANSACTION and an unseen COMMIT TRANSACTION statement. When OFF, we say the transaction mode is autocommit. [this is the default]
If your T-SQL code visibly issues a BEGIN TRANSACTION, we say the transaction mode is explicit.
https://msdn.microsoft.com/en-us/library/ms187807.aspx
Since the SQL Server would have created a transaction for you, manually doing doesn't actually change anything. The exact same thing would have happened either way.
Summary: What you are doing isn't 'wrong' because it has no effect, but unnecessary and confusing to the reader.
I think you would be taking a holdlock and most likely a tablock
table hints
That is not always a good thing as you would block any update or deletes (maybe even inserts)
It would be better to let SQL decide what level of of locks to take. Most likely pagelocks. I would stay away from nolock as bad stuff can happen.
On a select on a single table just let the optimizer do it's thing.

In SQL Server 2005, when does a Select query block Inserts or Updates to the same or other table(s)?

In the past I always thought that select query would not blocks other insert sql. However, recently I wrote a query that takes a long time (more than 2 min) to select data from a table. During the select, a number of insert sql statements were timing out.
If select blocks insert, what would be the solution way to prevent the timeout without causing dirty read?
I have investigate option of using isolation snapshot, but currently I have no access to change the client's database to enable the “ALLOW_SNAPSHOT_ISOLATION”.
Thanks
When does a Select query block Inserts or Updates to the same or
other table(s)?
When it holds a lock on a resource that is mutually exclusive with one that the insert or update statement needs.
Under readcommitted isolation level with no additional locking hints then the S locks taken out are typically released as soon as the data is read. For repeatable read or serializable however they will be held until the end of the transaction (statement for a single select not running in an explicit transaction).
serializable will often take out range locks which will cause additional blocking over and above that caused by the holding of locks on the rows and pages actually read.
READPAST might be what you're looking for - check out this article.

Deadlocks - Will this really help?

So I've got a query that keeps deadlocking on me. People who know the system well can't figure out why the sproc is deadlocking, but they tell me that I should just add this to it:
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Is this really a valid solution? What does that do?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
This will cause the system to return inconsitent data, including duplicate records and missing records. Read more at Previously committed rows might be missed if NOLOCK hint is used, or here at Timebomb - The Consistency problem with NOLOCK / READ UNCOMMITTED.
Deadlocks can be investigated and fixed, is not a big deal if you follow the proper procedure. Of course, throwing a dirty read may seem easier, but down the road you'll be sitting long hours staring at your general ledger and wondering why the heck it does not balance debits and credits. So read again until you really grok this: DIRTY READs ARE INCONSISTENT READS.
If you want a get-out-of-jail card, turn on snapshot isolation:
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
But keep in mind that snapshot isolation does not fix the deadlocks, it only hides them. Proper investigation of the deadlock cause and fix is always the appropriate action.
NOCOUNT will keep your query from returning rowcounts back to the calling application (i.e. 1000000 rows affected).
TRANSACTION ISOLATION LEVEL READ UNCOMMITTED will allow for dirty reads as indicated here.
The isolation level may help, but do you want to allow dirty reads?
Randomly adding SET options to the query is unlikely to help I'm afraid
SET NOCOUNT ON
Will have no effect on the issue.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
will prevent your query taking out shared locks. As well as reading "dirty" data it also can lead to your query reading the same rows twice, or not at all, dependant upon what other concurrent activity is happening.
Whether this will resolve your deadlock issue depends upon the type of deadlock. It will have no effect at all if the issue is 2 writers deadlocking due to non linear ordering of lock requests. (transaction 1 updating row a, transaction 2 updating row b then tran 1 requesting a lock on b and tran 2 requesting a lock on a)
Can you post the offending query and deadlock graph? (if you are on SQL 2005 or later)
The best guide is:
http://technet.microsoft.com/es-es/library/ms173763.aspx
Snippet:
Specifies that statements can read rows that have been modified by other
transactions but not yet committed.
Transactions running at the READ
UNCOMMITTED level do not issue shared
locks to prevent other transactions
from modifying data read by the
current transaction. READ UNCOMMITTED
transactions are also not blocked by
exclusive locks that would prevent the
current transaction from reading rows
that have been modified but not
committed by other transactions. When
this option is set, it is possible to
read uncommitted modifications, which
are called dirty reads. Values in the
data can be changed and rows can
appear or disappear in the data set
before the end of the transaction.
This option has the same effect as
setting NOLOCK on all tables in all
SELECT statements in a transaction.
This is the least restrictive of the
isolation levels.
In SQL Server, you can also minimize
locking contention while protecting
transactions from dirty reads of
uncommitted data modifications using
either:
The READ COMMITTED isolation level
with the READ_COMMITTED_SNAPSHOT
database option set to ON. The
SNAPSHOT isolation level
.
On a different tack, there are two other aspects to consider, that may help.
1) Indexes and the indexes used by the SQL. The indexing strategy used on the tables will affect how many rows are affected. If you make the data modifications using a unique index, you may reduce the chance of deadlocks.
One algorithm - of course it will not work it all cases. The use of NOLOCK is targeted rather than being global.
The "old" way:
UPDATE dbo.change_table
SET somecol = newval
WHERE non_unique_value = 'something'
The "new" way:
INSERT INTO #temp_table
SELECT uid FROM dbo.change_table WITH (NOLOCK)
WHERE non_unique_value = 'something'
UPDATE dbo.change_table
SET somecol = newval
FROM dbo.change_table c
INNER JOIN
#temp_table t
ON (c.uid = t.uid)
2) Transaction duration
The longer a transaction is open the more likely there may be contention. If there is a way to reduce the amount of time that records remain locked, you can reduce the chances of a deadlock occurring.
For example, perform as many SELECT statements (e.g. lookups) at the start of the code instead of performing an INSERT or UPDATE, then a lookup, then an INSERT, and then another lookup.
This is where one can use the NOLOCK hint for SELECTs on "static" tables that are not changing reducing the lock "footprint" of the code.