Hi, I am new to PostgreSQL transaction isolation levels and want to learn more about them. I have two windows here, each running a psql shell, to emulate two clients connected to the PostgreSQL server on my local machine.
From one client I inserted into the test table with a plain query outside any transaction, and in the other client I set up a transaction using the READ COMMITTED isolation level.
I find it weird that the READ COMMITTED transaction can see changes made by the other client. Isn't READ COMMITTED meant to see a snapshot of the data as it was when the transaction began?
On the left,
a SELECT statement is run inside a transaction.
On the right,
an INSERT statement is run outside any transaction.
Can you explain how this is meant to happen?
I have two tables A and B.
My transactions are like this:
Read -> read from table A
Write -> write in table B, write in table A
I want to avoid dirty/phantom reads, since I have multiple nodes making requests to the same database.
Here is an example:
Transaction 1 - Update is happening on table B
Transaction 2 - Read is happening on table A
Transaction 1 - Update is happening on table A
Transaction 2 - Completed
Transaction 1 - Rollback
Now the Transaction 2 client has dirty data. How should I avoid this?
If your database is not logged, there is nothing you can do. By choosing an unlogged database, those who set it up decided this sort of issue was not a problem. The only way to fix the problem here is to change the database mode to logged, but that is not something you do casually on a whim; there are lots of ramifications to the change.
Assuming your database is logged (it doesn't matter here whether it is buffered logging, unbuffered logging, or (mostly) a MODE ANSI database), then unless you set DIRTY READ isolation, you are running with at least COMMITTED READ isolation (it will be Informix's REPEATABLE READ level, standard SQL's SERIALIZABLE level, if the database is MODE ANSI).
If you want to ensure that data rows do not change after a transaction has read them, you need to run at a higher isolation level: REPEATABLE READ. See SET ISOLATION in the manual for the details. (Beware of the nomenclature for SET TRANSACTION; there's a section of the manual about Comparing SET ISOLATION and SET TRANSACTION, and related sections.) The downside to using SET ISOLATION TO REPEATABLE READ (or SET TRANSACTION ISOLATION LEVEL SERIALIZABLE) is that the extra locks needed reduce concurrency, but they give you the best guarantees about the state of the database.
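To make the schedule from the question concrete, here is a minimal Python sketch. It is a toy in-memory model, not Informix or any real driver: the Row class and run_schedule function are hypothetical names invented for illustration. It only shows what a reader gets back with and without dirty reads when the writer later rolls back. A real lock-based engine would typically make the committed reader wait on the writer's lock instead of returning the old value immediately.

```python
# Toy model of a single row. It tracks the last committed value and at
# most one uncommitted (pending) write. Purely illustrative.
class Row:
    def __init__(self, value):
        self.committed = value
        self.pending = None  # uncommitted write from an open transaction

    def write(self, value):
        self.pending = value

    def rollback(self):
        self.pending = None

    def read(self, dirty):
        # A dirty read returns the uncommitted value if one exists;
        # a committed read only ever returns committed data.
        if dirty and self.pending is not None:
            return self.pending
        return self.committed


def run_schedule(dirty):
    row_a = Row("old")
    row_a.write("new")        # Transaction 1 updates table A
    seen = row_a.read(dirty)  # Transaction 2 reads table A and completes
    row_a.rollback()          # Transaction 1 rolls back
    return seen, row_a.committed


dirty_result = run_schedule(dirty=True)
committed_result = run_schedule(dirty=False)
print(dirty_result)      # ('new', 'old'): the reader holds data that never existed
print(committed_result)  # ('old', 'old'): the reader agrees with the database
```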
Experiment details:
I am running this in Microsoft SQL Server Management Studio.
On one query window I run:
BEGIN TRANSACTION a;
ALTER table <table name>
ALTER column <column> varchar(1025)
On the other I run:
SELECT 1
FROM sys.objects
WHERE name = ' <other_table name>'
Or this:
SELECT 1
FROM sys.objects
WHERE object_id = OBJECT_ID(N'[<other_table name>]')
For some reason the SELECT with name = does not return until I commit the transaction.
I am using the transaction to simulate a long-running ALTER COLUMN operation that we sometimes have in our DB, and I don't want it to harm other operations.
This is because you are working under the default transaction isolation level, i.e. READ COMMITTED.
Read Committed
Under the READ COMMITTED transaction isolation level, when a transaction you have explicitly begun modifies resources, SQL Server obtains exclusive locks on them in order to maintain data integrity and prevent other users from making dirty reads. Once you have begun a transaction and are working with some rows, other users will not be able to see those rows until you commit or roll back your transaction. This is SQL Server's default behaviour, and it can be changed by using a different transaction isolation level. For instance, under READ UNCOMMITTED you will be able to read rows that are being modified by other users, but there is a chance of dirty reads: "Data that you think is still in the database but another user has changed it."
My Suggestion
If dirty reads are something you can live with, go ahead and use:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Otherwise, and preferably, stick to SQL Server's default behaviour and let the user wait a bit rather than giving them dirty/wrong data. I hope it helps.
Edit
It does not matter whether both queries are executed under the same or different isolation levels; what matters is the isolation level each query runs under. If you want other users to have access to the data while an explicit transaction is working on it, the reading query has to run under the READ UNCOMMITTED isolation level. That is not recommended and is bad practice. I would personally use snapshot isolation, which makes extensive use of tempdb and only shows the last committed data.
Suppose we try to run concurrently the following two transactions:
T1:
BEGIN
UPDATE place
SET state_code=5
WHERE state_code=6
AND type='town'
COMMIT
T2:
BEGIN
UPDATE place
SET state_code=6
WHERE state_code=5
AND type='town'
COMMIT
On Postgres 9.0 both transactions complete successfully. However, when run on SQL Server 2005, T1 gets aborted, thus all states end up with state_code=5.
Which DBMS did/did not obey strict transaction serialisation and why?
What settings might or might not have been set in each DBMS to cause this behaviour?
MS SQL writes the transaction statements as:
BEGIN transaction
UPDATE place
SET state_code=5
WHERE state_code=6
AND type='town'
COMMIT
In PostgreSQL you can test this by using a table-level lock to ensure that the competing transactions run concurrently. Setup:
CREATE TABLE place (id serial primary key, state_code integer, type text);
INSERT INTO place (state_code, type) VALUES (5, 'town'), (6, 'town');
Then in a session T0:
BEGIN; LOCK TABLE place;
In two new sessions issue the commands for T1 and T2.
Now back in T0 issue:
ROLLBACK;
to release the table-level lock and allow T1 and T2 to race.
In READ COMMITTED isolation (the default) both updates will apply and the two state codes will be transposed... though in the real world this will depend on the exact ordering of the transactions and cannot be relied upon.
In REPEATABLE READ, (called SERIALIZABLE isolation in PostgreSQL 9.0) they will both be transposed, with the same caveat as with READ COMMITTED since both are essentially equivalent for single statement transactions; the snapshot is taken when the first statement is run, not at the BEGIN.
In SERIALIZABLE isolation, set via default_transaction_isolation GUC or BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE, one transaction will fail to commit with a serialization failure while the other will succeed, resulting in all rows having state_code 5, or all rows having state_code 6, depending on which transaction won.
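Why the transposed outcome counts as a serialization anomaly can be checked with a small Python sketch. This is a toy simulation, not PostgreSQL: the helper names (swap, fresh, serial, concurrent) are invented for illustration, and a "snapshot" is simply a copy of the rows taken before the statement runs. It compares the two possible serial orders with a race in which both statements see the initial state.

```python
# Toy simulation of the two competing UPDATEs from the question.
def swap(rows, snapshot, old, new):
    # UPDATE place SET state_code=new WHERE state_code=old AND type='town'
    # Row selection uses the snapshot; the write goes to the live rows.
    for i, seen in enumerate(snapshot):
        if seen["state_code"] == old and seen["type"] == "town":
            rows[i]["state_code"] = new

def t1(rows, snapshot):
    swap(rows, snapshot, old=6, new=5)

def t2(rows, snapshot):
    swap(rows, snapshot, old=5, new=6)

def fresh():
    return [{"state_code": 5, "type": "town"},
            {"state_code": 6, "type": "town"}]

def codes(rows):
    return [r["state_code"] for r in rows]

def serial(first, second):
    # Each transaction sees everything the previous one committed.
    rows = fresh()
    first(rows, [dict(r) for r in rows])
    second(rows, [dict(r) for r in rows])
    return codes(rows)

def concurrent():
    # Both statements take their snapshot before either one commits.
    rows = fresh()
    snapshot = [dict(r) for r in rows]
    t1(rows, snapshot)
    t2(rows, snapshot)
    return codes(rows)

serial_t1_t2 = serial(t1, t2)
serial_t2_t1 = serial(t2, t1)
raced = concurrent()
print(serial_t1_t2, serial_t2_t1, raced)  # [6, 6] [5, 5] [6, 5]
```

The raced result matches neither serial order, which is why a strictly serializable system has to abort one of the two transactions rather than let both commit.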
If SQL Server also aborts one of the transactions, then it's doing proper serialization.
SQL Server obeyed proper serialization (as would a current version of PostgreSQL when strict serialization is requested), while the tested version of PostgreSQL was set to either READ COMMITTED isolation (where it isn't supposed to serialize) or pre-9.1's SERIALIZABLE isolation (which could not detect this anomaly) and did not obey strict transaction serialization semantics.
This could be experienced in PostgreSQL 9.2 if default_transaction_isolation is set to anything except SERIALIZABLE, and was the norm before PostgreSQL 9.1.
It looks like SQL Server uses SET TRANSACTION ISOLATION LEVEL. In PostgreSQL you use the default_transaction_isolation GUC or SET TRANSACTION ISOLATION LEVEL. It isn't clear if SQL Server lets you set a default transaction isolation level globally.
We have this scenario:
SQL Server 2008
we have replication on the db
we have a simple stored procedure that performs an ALTER on one of the tables (adds a new column)
the isolation level is the default (READ COMMITTED)
Stored procedure fails with error:
You can only specify the READPAST lock in the READ COMMITTED or REPEATABLE READ isolation levels
Questions:
What causes the problem?
How to fix that?
UPDATE:
I believe this is a very common problem, so I wonder why there are no good explanations of why replication causes this issue.
You can only specify READPAST when reading committed data.
The reason is that READPAST ignores locked rows, so when you use it you are pretty much telling SQL Server: give me everything that has not been touched by any other transaction. Example from BOL:
For example, assume table T1 contains a single integer column with the
values of 1, 2, 3, 4, 5. If transaction A changes the value of 3 to 8
but has not yet committed, a SELECT * FROM T1 (READPAST) yields values
1, 2, 4, 5.
It doesn't make much sense to say that under the READ UNCOMMITTED isolation level, which by default brings back uncommitted values. It's kind of requesting two opposite things.
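The BOL example can be mimicked with a short Python sketch. This is only a toy model, not SQL Server: the two function names and the way the uncommitted change is stored (a dict keyed by row position) are assumptions made for illustration. It shows why skipping locked rows and reading uncommitted values pull in opposite directions.

```python
# Table T1 holds 1..5. Transaction A has changed the value 3 to 8 but has
# not committed, so the row at index 2 is locked with a pending value.
rows = [1, 2, 3, 4, 5]
uncommitted = {2: 8}  # row index -> pending (uncommitted) value

def select_readpast(rows, uncommitted):
    # READPAST-style read: skip every row locked by another transaction.
    return [v for i, v in enumerate(rows) if i not in uncommitted]

def select_read_uncommitted(rows, uncommitted):
    # READ UNCOMMITTED-style read: return pending values where they exist.
    return [uncommitted.get(i, v) for i, v in enumerate(rows)]

readpast_result = select_readpast(rows, uncommitted)
dirty_result = select_read_uncommitted(rows, uncommitted)
print(readpast_result)  # [1, 2, 4, 5]
print(dirty_result)     # [1, 2, 8, 4, 5]
```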
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
-- Rest of query: add WITH (READPAST) after the table names, for example:
SELECT FOO.* FROM dbo.foobar FOO WITH (READPAST)
And/or this should help you.
http://support.microsoft.com/kb/981995
Are you sure isolation level is set to READ COMMITTED?
I've seen this error when isolation is set to serializable and you use ALTER TABLE on a table published for replication. This is because some replication stored procedures use the READPAST hint to avoid blocking which can only be used in READ COMMITTED or REPEATABLE READ isolation levels.
If you are sure that the isolation level is set to READ COMMITTED, then I would recommend contacting Microsoft PSS on this one as it should not be happening. They will be able to assist you better.
I remember an example where reading data in a transaction and then writing it back is not safe, because another transaction may read or write it in the time between. So I would like to check the date and prevent the row from being modified or read until my transaction is finished. Would this do the trick? And are there any SQL variants that this will not work on?
update tbl set id=id where date>expire_date and id=#id
Note: date>expire_date happens to be my condition. It could be anything. Would this prevent other transactions from reading the row until I commit or roll back?
In a lot of cases, your UPDATE statement will not prevent other transactions from reading the row.
ziang mentioned transaction isolation levels.
Depending on the isolation level, databases use different types of locking. At the highest level, locking can be divided into two categories:
- pessimistic,
- optimistic
MS SQL 2008, for example, has 6 isolation levels: 4 of them are pessimistic and 2 are optimistic. By default, it uses the READ COMMITTED isolation level, which falls into the pessimistic category.
Oracle, on the other hand, uses optimistic locking by default.
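As a rough illustration of the optimistic idea (let everyone read freely and detect conflicts at write time, instead of locking up front), here is a minimal Python sketch. It uses a hypothetical VersionedRow class with a version counter. This is a generic pattern and an assumption for illustration only, not how SQL Server or Oracle implement their isolation levels internally.

```python
# Generic optimistic concurrency check based on a version counter.
class VersionedRow:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # Readers take no lock; they just remember the version they saw.
        return self.value, self.version

    def write_if_unchanged(self, new_value, expected_version):
        # The write succeeds only if nobody else wrote since our read.
        if self.version != expected_version:
            return False
        self.value = new_value
        self.version += 1
        return True

row = VersionedRow("a")
_, version_seen_by_tx1 = row.read()
_, version_seen_by_tx2 = row.read()

first_ok = row.write_if_unchanged("b", version_seen_by_tx1)   # wins
second_ok = row.write_if_unchanged("c", version_seen_by_tx2)  # conflict detected
print(first_ok, second_ok, row.value)  # True False b
```

A pessimistic scheme would instead make the second transaction wait at read or write time until the first one finishes.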
The statement that will lock your record for writing is
SELECT * FROM TBL WITH (UPDLOCK) WHERE id=#id
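From that point on, no other transaction will be able to update your record with id=#id, or take its own update lock on it, until you either commit or roll back your entire transaction.
Be aware that an update lock is compatible with shared locks, so UPDLOCK on its own does not stop other transactions from reading the row, even under the default READ COMMITTED level.
To keep READ COMMITTED readers out as well, the row needs an exclusive lock, which is what actually modifying it acquires; transactions running under READ UNCOMMITTED will still be able to read it.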
It depends on the transaction isolation level you set for your transaction. There are 4 isolation levels:
READ UNCOMMITTED: this allows the dirty read
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
For more info, you can check MSDN.
You should be able to do this in a normal select using a combination of
HOLDLOCK/ROWLOCK
It very well may work. Different platforms offer different services. For instance, in T-SQL, you can simply set the isolation level of the transaction and, as a result, force a lock to be obtained. I don't know what platform you are using so I cannot answer your question definitively.