Would this prevent the row from being read during the transaction? - sql

I remember an example where reading data in a transaction and then writing it back is not safe, because another transaction may read or write the row in the time between. So I would like to check the date and prevent the row from being modified or read until my transaction is finished. Would this do the trick? And are there any SQL variants that this will not work on?
update tbl set id=id where date>expire_date and id=#id
Note: date>expire_date happens to be my condition; it could be anything. Would this prevent other transactions from reading the row until I commit or roll back?

In a lot of cases, your UPDATE statement will not prevent other transactions from reading the row.
ziang mentioned transaction isolation levels.
Depending on the isolation level, databases use different types of locking. At the highest level, locking can be divided into two categories:
- pessimistic,
- optimistic
MS SQL 2008, for example, has 6 isolation levels: 4 of them are pessimistic and 2 are optimistic. By default, it uses the READ COMMITTED isolation level, which falls into the pessimistic category.
Oracle, by contrast, uses optimistic locking by default.
The statement that will lock your record for writing is
SELECT * FROM TBL WITH (UPDLOCK) WHERE id=#id
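From that point on, no other transaction will be able to update the record with id=#id, or take its own update lock on it, until you either commit or roll back your entire transaction.
Transactions running at isolation level READ UNCOMMITTED will still be able to read it.
Be aware that an update lock is compatible with shared locks, so with the default transaction level, READ COMMITTED, other transactions can generally still read the record; what they cannot do is write to it. If readers must be blocked as well, you need an exclusive lock on the row (for example via the XLOCK hint), and you should test that it behaves as you expect on your version.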
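To make the "writes are blocked, reads may not be" distinction concrete, here is a minimal sketch in Python. It uses SQLite rather than SQL Server, simply because it runs without a server, so treat it as an illustration of how engine-specific the answer is and not as a description of SQL Server's locking. The `val` column, the file name and the `demo` function are made up for the example.

```python
import os
import sqlite3
import tempfile


def demo():
    """Show what a second connection can do while the first holds a write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: we issue BEGIN/COMMIT ourselves.
        a = sqlite3.connect(path, isolation_level=None)
        # timeout=0: connection b fails at once instead of waiting for a lock.
        b = sqlite3.connect(path, isolation_level=None, timeout=0)
        try:
            a.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, val TEXT)")
            a.execute("INSERT INTO tbl VALUES (1, 'old')")

            # Transaction A takes the write lock and changes the row.
            a.execute("BEGIN IMMEDIATE")
            a.execute("UPDATE tbl SET val = 'new' WHERE id = 1")

            # B can still read; it sees the last committed value.
            read_during = b.execute(
                "SELECT val FROM tbl WHERE id = 1").fetchall()[0][0]

            # B cannot write until A finishes.
            try:
                b.execute("UPDATE tbl SET val = 'other' WHERE id = 1")
                write_blocked = False
            except sqlite3.OperationalError:
                write_blocked = True

            a.execute("COMMIT")
            read_after = b.execute(
                "SELECT val FROM tbl WHERE id = 1").fetchall()[0][0]
            return read_during, write_blocked, read_after
        finally:
            a.close()
            b.close()


if __name__ == "__main__":
    print(demo())
```

In this engine the in-flight UPDATE stops other writers but not other readers, which is the same kind of surprise the question is about. Other databases draw the line differently, so check the one you actually use.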

It depends on the transaction isolation level you set for your transaction. There are 4 isolation levels:
READ UNCOMMITTED: this allows dirty reads
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
For more info, you can check MSDN.

You should be able to do this in a normal select using a combination of
HOLDLOCK/ROWLOCK

It very well may work. Different platforms offer different services. For instance, in T-SQL, you can simply set the isolation level of the transaction and, as a result, force a lock to be obtained. I don't know what platform you are using so I cannot answer your question definitively.

Related

Do default transactions in PostgreSQL provide any benefit when only the last statement is writing?

I just learned that in PostgreSQL the default transaction isolation level is "Read committed". I'm very used to MySQL's "REPEATABLE READ" isolation level. In PostgreSQL, by my understanding, this means that in a default transaction "two successive SELECT commands can see different data". With that in mind, is there any benefit to transactions when only the last statement in the transaction is writing?
The transaction does not protect you from data changing between statements; the only benefit I see is rolling the transaction back on failure. But if only one writing statement exists at the end, then that would happen anyway.
To make it a bit clearer what I'm referring to, let's take a simple, generic sequence of (pseudo) queries against a table:
BEGIN TRANSACTION
SELECT userId FROM users WHERE username = "the provided username"
INSERT INTO activites (activity, user_fk) VALUES ("posted on SO", userId)
COMMIT
In this sequence, and in any general sequence of statements where only the last statement is writing, is there a benefit in PostgreSQL to using a transaction with the default isolation level?
Bonus question: is there any overhead from it?
The difference between READ COMMITTED and REPEATABLE READ is that the former takes a new database snapshot for each statement, while the latter takes a snapshot only for the first SQL statement and uses that snapshot for the whole transaction. This implies that REPEATABLE READ actually performs better than READ COMMITTED, since it takes fewer snapshots.
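The snapshot rule described above can be sketched with a toy model in Python. This is not how PostgreSQL is implemented; `Store`, `Transaction` and `two_reads` are hypothetical names invented for the illustration, and the only thing modelled is when a snapshot is taken.

```python
class Store:
    """Tiny multi-version store: each committed write gets a new version number."""

    def __init__(self):
        self.version = 0
        self.history = {}  # key -> list of (version, value), oldest first

    def commit_write(self, key, value):
        self.version += 1
        self.history.setdefault(key, []).append((self.version, value))

    def read_as_of(self, key, snapshot):
        """Return the newest value committed at or before the snapshot."""
        value = None
        for version, candidate in self.history.get(key, []):
            if version <= snapshot:
                value = candidate
        return value


class Transaction:
    def __init__(self, store, level):
        if level not in ("READ COMMITTED", "REPEATABLE READ"):
            raise ValueError(level)
        self.store = store
        self.level = level
        self.snapshot = None  # taken by the first statement

    def select(self, key):
        # READ COMMITTED: fresh snapshot for every statement.
        # REPEATABLE READ: snapshot taken once, then reused.
        if self.level == "READ COMMITTED" or self.snapshot is None:
            self.snapshot = self.store.version
        return self.store.read_as_of(key, self.snapshot)


def two_reads(level):
    """Read the same key twice while another transaction commits in between."""
    store = Store()
    store.commit_write("balance", 100)
    tx = Transaction(store, level)
    first = tx.select("balance")
    store.commit_write("balance", 200)  # a concurrent transaction commits
    second = tx.select("balance")
    return first, second


if __name__ == "__main__":
    print(two_reads("READ COMMITTED"))   # the second read sees the new commit
    print(two_reads("REPEATABLE READ"))  # both reads use the first snapshot
```

The two successive reads differ only under READ COMMITTED, which is the "two successive SELECT commands can see different data" behaviour the question quotes.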
The disadvantage of REPEATABLE READ is that you can get serialization errors. That does not affect your example, but if you had an UPDATE instead of an INSERT, it could be that the row you are trying to update has been modified by a concurrent transaction since the snapshot was taken. The serialization error that causes would mean that you have to repeat the transaction. Another disadvantage of REPEATABLE READ transactions is that a long-running read-only transaction can hinder the progress of VACUUM, which it wouldn't do in READ COMMITTED mode.
For read-only transactions or transactions like the one you are showing, REPEATABLE READ is often the better isolation level. The nice thing about READ COMMITTED is that you can get no serialization errors apart from deadlocks.
To explicitly answer your question: there is no advantage to running the statement from your example in a single transaction. You may as well use the default autocommit mode to run them in separate transactions.
Incidentally, the SQL standard decrees that the default transaction isolation level be SERIALIZABLE, but I don't know any database that implements that.

Avoid dirty/phantom reads while selecting from database

I have two tables A and B.
My transactions are like this:
Read -> read from table A
Write -> write in table B, write in table A
I want to avoid dirty/phantom reads since I have multiple nodes making requests to the same database.
Here is an example:
Transaction 1 - Update is happening on table B
Transaction 2 - Read is happening on table A
Transaction 1 - Update is happening on table A
Transaction 2 - Completed
Transaction 1 - Rollback
Now Transaction 2 client has dirty data. How should I avoid this?
If your database is not logged, there is nothing you can do. By choosing an unlogged database, those who set it up decided this sort of issue was not a problem. The only way to fix the problem here is to change the database mode to logged, but that is not something you do casually on a whim; there are lots of ramifications to the change.
Assuming your database is logged — it doesn't matter here whether it is buffered logging or unbuffered logging or (mostly) a MODE ANSI database — then unless you set DIRTY READ isolation, you are running using at least COMMITTED READ isolation (it will be Informix's REPEATABLE READ level, standard SQL's SERIALIZABLE level, if the database is MODE ANSI).
If you want to ensure that data rows do not change after a transaction has read them, you need to run at a higher isolation level: REPEATABLE READ. (See SET ISOLATION in the manual for the details. Beware of the nomenclature for SET TRANSACTION; there's a section of the manual about Comparing SET ISOLATION and SET TRANSACTION and related sections.) The downside to using SET ISOLATION TO REPEATABLE READ (or SET TRANSACTION ISOLATION LEVEL SERIALIZABLE) is that the extra locks needed reduce concurrency, but they give you the best guarantees about the state of the database.

Concurrent transaction management in MS SqlServer

I have a question concerning transaction isolation in SQL Server. The default isolation level is set to 2 (READ_COMMITTED). In the first transaction, I insert some data into the users table; in the second, I try, unsuccessfully, to select all data from the same table. It seems that the second transaction waits for the first one to commit or roll back.
Does anyone have an explanation?
That is exactly what READ COMMITTED means: the second transaction waits until it is able to read all the data, which it cannot do while the rows inserted by the first transaction are still uncommitted. There are ways to either read the uncommitted data or skip it, but that is not something you would normally want to do.
There is quite a lot of material available on this, for example this one on TechNet.

SQL Server Update Locks

If you have the following sql, is it possible that if it is run multiple times by many different processes at exactly the same time, that two or more processes may update the table?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
UPDATE table
SET Column1 = 1
WHERE Column1 = 0
No other locks etc. are specified in the SQL, other than Read Uncommitted.
I'm trying to track down an issue, and I'm now clutching at straws...
Got this from MSDN.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
So basically, this is equivalent to the SQL Server NOLOCK hint. It can result in dirty reads: if one process is updating 1000 records and has updated 500 so far, and another process reads that data, the data it sees may be in an inconsistent state. It also lets an update run without being blocked by the shared locks that multiple SELECT queries would otherwise take.
Hope this makes some sense for your question. For reference -- MSDN
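On the original question of whether two processes can both perform the update: the isolation level only governs how a transaction reads, and a data modification still takes an exclusive lock on the rows it changes at every level, READ UNCOMMITTED included. A common way to find out which process actually "won" is to look at the affected row count of the conditional UPDATE. The following is a minimal sketch of that check in Python, with SQLite standing in for SQL Server so that it runs anywhere; the `work_items` table and the `race_to_claim` function are invented for the example, and the two connections run one after the other to stand in for the order in which the engine would serialise two writers.

```python
import os
import sqlite3
import tempfile

# Same shape as the statement in the question: only rows still at 0 match.
CLAIM = "UPDATE work_items SET Column1 = 1 WHERE Column1 = 0"


def race_to_claim():
    """Run the conditional UPDATE from two connections; return both row counts."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "claim.db")
        # isolation_level=None puts the connections in autocommit mode.
        first = sqlite3.connect(path, isolation_level=None)
        second = sqlite3.connect(path, isolation_level=None)
        try:
            first.execute("CREATE TABLE work_items (Column1 INTEGER NOT NULL)")
            first.execute("INSERT INTO work_items VALUES (0)")

            # Whoever gets the write lock first flips the row from 0 to 1.
            changed_by_first = first.execute(CLAIM).rowcount
            # The later writer re-checks the WHERE clause and finds no match.
            changed_by_second = second.execute(CLAIM).rowcount
            return changed_by_first, changed_by_second
        finally:
            first.close()
            second.close()


if __name__ == "__main__":
    print(race_to_claim())
```

If the code in question selects the rows first and updates them in a separate statement, the gap between the two statements is where two processes can pick up the same row, and that is worth checking when tracking down the issue.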

Deadlocks - Will this really help?

So I've got a query that keeps deadlocking on me. People who know the system well can't figure out why the sproc is deadlocking, but they tell me that I should just add this to it:
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Is this really a valid solution? What does that do?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
This will cause the system to return inconsistent data, including duplicate records and missing records. Read more at "Previously committed rows might be missed if NOLOCK hint is used", or here at "Timebomb - The Consistency problem with NOLOCK / READ UNCOMMITTED".
Deadlocks can be investigated and fixed; it is not a big deal if you follow the proper procedure. Of course, throwing in a dirty read may seem easier, but down the road you'll be sitting long hours staring at your general ledger and wondering why the heck it does not balance debits and credits. So read again until you really grok this: DIRTY READs ARE INCONSISTENT READS.
If you want a get-out-of-jail card, turn on read committed snapshot isolation:
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
But keep in mind that snapshot isolation does not fix the deadlocks, it only hides them. Proper investigation of the deadlock cause and fix is always the appropriate action.
NOCOUNT will keep your query from returning row counts back to the calling application (e.g. 1000000 rows affected).
TRANSACTION ISOLATION LEVEL READ UNCOMMITTED will allow for dirty reads as indicated here.
The isolation level may help, but do you want to allow dirty reads?
Randomly adding SET options to the query is unlikely to help, I'm afraid.
SET NOCOUNT ON
will have no effect on the issue.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
will prevent your query from taking out shared locks. As well as reading "dirty" data, it can also lead to your query reading the same rows twice, or not at all, depending on what other concurrent activity is happening.
Whether this will resolve your deadlock issue depends upon the type of deadlock. It will have no effect at all if the issue is 2 writers deadlocking due to non linear ordering of lock requests. (transaction 1 updating row a, transaction 2 updating row b then tran 1 requesting a lock on b and tran 2 requesting a lock on a)
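The usual cure for that kind of writer/writer deadlock is to make every transaction request its locks in the same order. Here is a minimal sketch of the idea in Python, using in-process `threading.Lock` objects as stand-ins for row locks; it is an analogy, not SQL Server code, and the row names and functions are made up for the illustration.

```python
import threading


def run(iterations=1000):
    """Two workers touch rows 'a' and 'b' in opposite orders without deadlocking."""
    row_locks = {"a": threading.Lock(), "b": threading.Lock()}
    rows = {"a": 0, "b": 0}

    def update_both(first, second):
        # The caller may name the rows in any order (they must be two
        # different rows), but the locks are always taken in sorted key
        # order, so no worker can hold 'b' while waiting for 'a'.
        low, high = sorted((first, second))
        with row_locks[low]:
            with row_locks[high]:
                rows[first] += 1
                rows[second] += 1

    def worker(first, second):
        for _ in range(iterations):
            update_both(first, second)

    # "Transaction 1" works on a then b; "transaction 2" on b then a.
    t1 = threading.Thread(target=worker, args=("a", "b"))
    t2 = threading.Thread(target=worker, args=("b", "a"))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return rows


if __name__ == "__main__":
    print(run())
```

If `update_both` locked `first` and then `second` exactly as passed in, the two workers could each grab one lock and wait forever for the other, which is the scenario described in the parentheses above. In a database the equivalent fix is to have all procedures touch tables and rows in a consistent order.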
Can you post the offending query and deadlock graph? (if you are on SQL 2005 or later)
The best guide is:
http://technet.microsoft.com/es-es/library/ms173763.aspx
Snippet:
Specifies that statements can read rows that have been modified by other transactions but not yet committed.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
In SQL Server, you can also minimize locking contention while protecting transactions from dirty reads of uncommitted data modifications using either:
- The READ COMMITTED isolation level with the READ_COMMITTED_SNAPSHOT database option set to ON.
- The SNAPSHOT isolation level.
On a different tack, there are two other aspects to consider, that may help.
1) Indexes and the indexes used by the SQL. The indexing strategy used on the tables will affect how many rows are affected. If you make the data modifications using a unique index, you may reduce the chance of deadlocks.
Here is one algorithm; of course, it will not work in all cases. The use of NOLOCK is targeted rather than being global.
The "old" way:
UPDATE dbo.change_table
SET somecol = newval
WHERE non_unique_value = 'something'
The "new" way:
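-- Collect the keys of the rows to change without taking shared locks
-- (#temp_table is assumed to have been created already)
INSERT INTO #temp_table
SELECT uid FROM dbo.change_table WITH (NOLOCK)
WHERE non_unique_value = 'something'
-- Update through the unique key so that only the targeted rows are locked
UPDATE dbo.change_table
SET somecol = newval
FROM dbo.change_table c
INNER JOIN #temp_table t
ON (c.uid = t.uid)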
2) Transaction duration
The longer a transaction is open the more likely there may be contention. If there is a way to reduce the amount of time that records remain locked, you can reduce the chances of a deadlock occurring.
For example, perform as many SELECT statements (e.g. lookups) at the start of the code instead of performing an INSERT or UPDATE, then a lookup, then an INSERT, and then another lookup.
This is where one can use the NOLOCK hint for SELECTs on "static" tables that are not changing, reducing the lock "footprint" of the code.