READ COMMITTED database isolation level in oracle - sql

I'm working on a web app connected to oracle. We have a table in oracle with a column "activated". Only one row can have this column set to 1 at any one time. To enforce this, we have been using SERIALIZED isolation level in Java, however we are running into the "cannot serialize transaction" error, and cannot work out why.
We were wondering if an isolation level of READ COMMITTED would do the job. So my question is this:
If we have a transaction which involves the following SQL:
SELECT *
FROM MODEL;
UPDATE MODEL
SET ACTIVATED = 0;
UPDATE MODEL
SET ACTIVATED = 1
WHERE RISK_MODEL_ID = ?;
COMMIT;
Given that it is possible for more than one of these transactions to be executing at the same time, would it be possible for more than one MODEL row to have the activated flag set to 1 ?
Any help would be appreciated.

your solution should work: your first update will lock the whole table. If another transaction is not finished, the update will wait. Your second update will guarantee that only one row will have the value 1 because you are locking the table (it doesn't prevent INSERT statements however).
You should also make sure that the row with the RISK_MODEL_ID exists (or you will have zero row with the value '1' at the end of your transaction).
To prevent concurrent INSERT statements, you would LOCK the table (in EXCLUSIVE MODE).

You could consider using a unique, function based index to let Oracle handle the constraint of only having a one row with activated flag set to 1.
CREATE UNIQUE INDEX MODEL_IX ON MODEL ( DECODE(ACTIVATED, 1, 1, NULL));
This would stop more than one row having the flag set to 1, but does not mean that there is always one row with the flag set to 1.

If what you want is to ensure that only one transaction can run at a time then you can use the FOR UPDATE syntax. As you have a single row which needs locking this is a very efficient approach.
declare
cursor c is
select activated
from model
where activated = 1
for update of activated;
r c%rowtype;
begin
open c;
-- this statement will fail if another transaction is running
fetch c in r;
....
update model
set activated = 0
where current of c;
update model
set activated = 1
where risk_model_id = ?;
close c;
commit;
end;
/
The commit frees the lock.
The default behaviour is to wait until the row is freed. Otherwise we can specify NOWAIT, in which case any other session attempting to update the current active row will fail immediately, or we can add a WAIT option with a polling time. NOWAIT is the option to choose to absolutely avoid the risk of hanging, and it also gives us the chance to inform the user that someone else is updating the table, which they might want to know.
This approach is much more scalable than updating all the rows in the table. Use a function-based index as WW showed to enforce the rule that only one row can have ACTIVATED=1.

Related

How to implement Serializable Isolation Level in SQL Server

I need to implement a serializable isolation level in SQL Server but I've tried many ways and I don't get it.
I need to lock 1 row in one transaction (It doesn´t matter if lock the complete table). So, another transaction can´t even select the row (don´t read).
The last thing I tried:
For transaction 1:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
-- Here I select in another instance the same row
COMMIT TRAN
For transaction 2:
BEGIN TRAN
SELECT code FROM table1 WHERE code = 1
COMMIT TRAN
I would expect that transaction 2 wait until transaction 1 commit the operation, but the transaction 2 gives me the row.
Anyone can explain me if I miss something?
SQL Server conforms to the strict definition of a Serializable query. That is, there must be a result that can logically be generated IF both queries ran in serial order - Transaction 1 finishing before Transaction 2 can start, or vice versa.
This results in some effects that can be different than you would expect. There is a great explanation of the Serializable isolation level over at SQLPerformance.com that makes clear some of what this logical serializability ends up meaning. (Very helpful site, that one.)
For your above queries, there is no logical requirement to prevent the second query from reading the same row as the first query. No matter in what order the queries are run, they will both return the same data without modifying it. Since the Query Analyzer can identify this, there is no reason to place a read lock on the data. However, if one of the queries performed an update on the data, then (warning - logic assumption here, since I don't actually know the internals of how SQL Server handles this) the QA would set a stronger lock on the selected rows.
TL;DR - SQL Server wants to minimize blocking, so it uses logical analysis to see what types of locks are needed for a serializable isolation level, and it (tries to) use the minimum number and strength of locks needed to achieve its goal.
Now that we've dealt with that - there are only two ways that I can think of to lock a row so that no one else can read it: using XLOCK + TABLOCK (locking the whole table - not a recommended practice) or having some form of a field on each row that is updated when you start your process - something like an SPID field, or a bit flag for Locked. When you update it within your transaction, only SELECTs with NOLOCK hints will be able to read it.
Clearly, neither of these are optimal. I recommend the "This row is busy - go away" flag, as that's probably the approach I would take for an (almost) absolute lock on a row.
According to the documentation:
SERIALIZABLE Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the
current transaction until the current transaction completes.
If you're not making any changes to data with an INSERT, UPDATE, or DELETE inside transaction 1, SQL will release the Shared Lock after the read completes.
What you might want to try is adding a table hit to prevent the row lock from being released until the end of transaction 1.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT code
FROM table1 WITH(ROWLOCK, HOLDLOCK)
WHERE code = 1
COMMIT TRAN
Maybe you can solve this with some hack like this?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
UPDATE someTableForThisHack set val = CASE WHEN val = 1 THEN 0 else 1 End
SELECT code from table1.....
COMMIT TRANSACTION
So you create a table someTableForThisHack and insert one row to it.

Stored procedures vs standard select update, avoid locks

In order to retrieve an ID, I first do a select and then an update, in two consequent queries.
The problem is that I am having problems with locked rows. I've read that putting both this statements, Select and Update in one stored procedure it helps with the locks. Is this true?
The queries I run are:
select counter
from dba.counter_list
where table_name = :TableName
update dba.counter_list
set counter = :NewCounter
where table_name = :TableName
The problem is that it can happen that multiple users are selecting the same row and also possible that they update the same row.
Assumptions:
you're using Sybase ASE
your select returns a single value for counter
you may want the old counter value for some purpose other than performing the update
Consider the following update statement which should eliminate any race conditions that may occur with multiple users running your select/update logic concurrently:
declare #counter int -- change to the appropriate datatype
update dba.counter_list
set #counter = counter, -- grab current value
counter = :NewCounter -- set to new value
where table_name = :TableName
select #counter -- send previous counter value to client
the update obtains an exclusive lock on the desired row (or page/table depending on table design and locking scheme)
with an exclusive lock in place you're able to retrieve the current value and set the new value with a single statement
Whether you submit the above via a SQL batch or a stored proc call is up to you and your DBA to decide ...
if statement cache is disabled, a SQL batch will need to be compiled each time it's submitted to the dataserver
if statement cache is enabled, and you submit this SQL batch on a regular basis then there's a chance the previous query plan is still in statement/procedure cache thus eliminating the (costly) compilation step
if a copy of previous stored proc (query) plan is not in procedure cache then you'll incur the (costly) compilation step when loading a (proc) query plan into procedure cahe
a stored proc is typically easier to replace in the event of a syntax/logic/performance issue (as opposed to editing, and possibly compiling, a front-end application)
... add your (least) favorite argument for SQL batch vs stored proc (vs prepared statement?) vs ??? ...
Is the table counter_list accessed by multiple clients concurrently ?
The best practices for OLTP is to call a stored procedure that will perform the update logic in one transaction.
Check that the table dba.counter_list has an index on column table_name.
Check also that it is row level locked.

exclusive lock and shared lock for select statement - SQL Server

I am not able to understand how select will behave while its part of exclusive transaction. Please consider following scenarios –
Scenario 1
Step 1.1
create table Tmp(x int)
insert into Tmp values(1)
Step 1.2 – session 1
begin tran
set transaction isolation level serializable
select * from Tmp
Step 1.3 – session 2
select * from Tmp
Even first session hasn't been finished, session 2 will be able to read tmp table. I thought Tmp will have exclusive lock and shared lock should not be issued to select query in session 2. And it’s not happening. I have made sure that default isolation level is READ COMMITED.
Thanks in advance for helping me in understanding this behavior.
EDIT : Why I need select in exclusive lock?
I have a SP which actually generate sequential values. So flow is -
read max values from Table and store value in variables
Update table set value=value+1
This SP is executed in parallel by several thousand instances. If two instances execute SP at same time, then they will read same value and will update value+1. Though I would like to have sequential value for every execution. I think its possible only if select is also part of exclusive lock.
If you want a transaction to be serializable, you have to change that option before you start the outermost transaction. So your first session is incorrect and is still actually running under read committed (or whatever other level was in effect for that session).
But even if you correct the order of statements, it still will not acquire an exclusive lock for a plain SELECT statement.
If you want the plain SELECT to acquire an exclusive lock, you need to ask for it:
select * from Tmp with (XLOCK)
or you need to execute a statement that actually requires an exclusive lock:
update Tmp set x = x
Your first session doesn't need an exclusive lock because it's not changing the data. If your first (serializable) session had run to completion and either rolled back or committed, before your second session was started, that session's results would still be the same because your first session didn't change the data - and so the "serializable" nature of the transaction was correct.

Check if table data has changed?

I am pulling the data from several tables and then passing the data to a long running process. I would like to be able to record what data was used for the process and then query the database to check if any of the tables have changed since the process was last run.
Is there a method of solving this problem that should work across all sql databases?
One possible solution that I've thought of is having a separate table that is only used for keeping track of whether the data has changed since the process was run. The table contains a "stale" flag. When I start running the process, stale is set to false. If any creation, update, or deletion occurs in any of the tables on which the operation depends, I set stale to true. Is this a valid solution? Are there better solutions?
One concern with my solution is situations like this:
One user starts inserting a new row into one of the tables. Stale gets set to true, but the new row has not actually been added yet. Another user has simultaneously started the long running process, pulling the data from the tables and setting the flag to false. The row is finally added. Now the data used for the process is out of date but the flag indicates it is not stale. Would transactions be able to solve this problem?
EDIT:
This is some SQL for my idea. Not sure if it works, but just to give you a better idea of what I was thinking:
# First transaction reads the data and sets the flag to false
BEGIN TRANSACTION
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
UPDATE flag SET stale = false
SELECT * FROM DATATABLE
COMMIT TRANSACTION
# Second transaction updates the data and sets the flag to true
BEGIN TRANSACTION
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
UPDATE data SET val = 15 WHERE ID = 10
UPDATE flag SET stale = true
COMMIT TRANSACTION
I do not have much experience with transactions or handwriting xml, so there are probably issues with this. From what I understand two serializable transactions can not be interleaved. Please correct me if I'm wrong.
Is there a way to accomplish this with only the first transaction? The process will be run rarely, but the updates to the data table will occur more frequently, so it would be nice to not lock up the data table when performing updates.
Also, is the SET TRANSACTION ISOLATION syntax specific to MS?
The stale flag will probably work, but a timestamp would be better since it provides more metadata about the age of the records which could be used to tune your queries, e.g., only pull data that is over 5 minutes old.
To address your concern about inserting a row at the same time a query is run, transactions with an appropriate isolation level will help. For row inserts, updates, and selects, at least use a transaction with an isolation level that prevents dirty reads so that no other connections can see the updated data until the transaction is committed.
If you are strongly concerned about the case where an update happens at the same time as a record pull, you could use the REPEATABLE READ or even SERIALIZABLE isolation levels, but this will slow DB access down.
Your SQLServer sampled should work. For alternate databases, Here's an example that works in PostGres:
Transaction 1
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- run queries that update the tables, then set last_updated column
UPDATE sometable SET last_updatee = now() WHERE id = 1;;
COMMIT;
Transaction 2
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- select data from tables, then set last_queried column
UPDATE sometable SET last_queried = now() WHERE id = 1;
COMMIT;
If transaction 1 starts, and then transaction 2 starts before transaction 1 has completed, transaction 2 will block during on the update, and then will throw an error when transaction 1 is committed. If transaction 2 starts first, and transaction 1 starts before that has finished, then transaction 1 will error. Your application code or process should be able to handle those errors.
Other databases use similar syntax - MySQL (with InnoDB plugin) requires you to set the isolation level before you start the transaction.

SQL Server Race Condition Question

(Note: this is for MS SQL Server)
Say you have a table ABC with a primary key identity column, and a CODE column. We want every row in here to have a unique, sequentially-generated code (based on some typical check-digit formula).
Say you have another table DEF with only one row, which stores the next available CODE (imagine a simple autonumber).
I know logic like below would present a race condition, in which two users could end up with the same CODE:
1) Run a select query to grab next available code from DEF
2) Insert said code into table ABC
3) Increment the value in DEF so it's not re-used.
I know that, two users could get stuck at Step 1), and could end up with same CODE in the ABC table.
What is the best way to deal with this situation? I thought I could just wrap a "begin tran" / "commit tran" around this logic, but I don't think that worked. I had a stored procedure like this to test, but I didn't avoid the race condition when I ran from two different windows in MS:
begin tran
declare #x int
select #x= nextcode FROM def
waitfor delay '00:00:15'
update def set nextcode = nextcode + 1
select #x
commit tran
Can someone shed some light on this? I thought the transaction would prevent another user from being able to access my NextCodeTable until the first transaction completed, but I guess my understanding of transactions is flawed.
EDIT: I tried moving the wait to after the "update" statement, and I got two different codes... but I suspected that. I have the waitfor statement there to simulate a delay so the race condition can be easily seen. I think the key problem is my incorrect perception of how transactions work.
Set the Transaction Isolation Level to Serializable.
At lower isolation levels, other transactions can read the data in a row that is read, (but not yet modified) in this transaction. So two transactions can indeed read the same value. At very low isolation (Read Uncommitted) other transactions can even read data after it's been modified (but before committed)...
Review details about SQL Server Isolation Levels here
So bottom line is that the Isolation level is crtitical piece here to control what level of access other transactions get into this one.
NOTE. From the link, about Serializable
Statements cannot read data that has been modified but not yet committed by other transactions.
This is because the locks are placed when the row is modified, not when the Begin Trans occurs, So what you have done may still allow another transaction to read the old value until the point where you modify it. So I would change the logic to modify it in the same statement as you read it, thereby putting the lock on it at the same time.
begin tran
declare #x int
update def set #x= nextcode, nextcode += 1
waitfor delay '00:00:15'
select #x
commit tran
As other responders have mentioned, you can set the transaction isolation level to ensure that anything you 'read' using a SELECT statement cannot change within a transaction.
Alternatively, you could take out a lock specifically on the DEF table by adding the syntax WITH HOLDLOCK after the table name, e.g.,
SELECT nextcode FROM DEF WITH HOLDLOCK
It doesn't make much difference here, as your transaction is small, but it can be useful to take out locks for some SELECTs and not others within a transaction. It's a question of 'repeatability versus concurrency'.
A couple of relavant MS-SQL docs.
Isolation levels
Table hints
Late answer. You want to avoid a race condition...
"SQL Server Process Queue Race Condition"
Recap:
You began a transaction. This doesn't actually "do" anything in and of itself, it modifies subsequent behavior
You read data from a table. The default isolation level is Read Committed, so this select statement is not made part of the transaction.
You then wait 15 seconds
You then issue an update. With the declared transaction, this will generate a lock until the transaction is committed.
You then commit the transaction, releasing the lock.
So, guessing you ran this simultaneously in two windows (A and B):
A read the "next" value from table def, then went into wait mode
B read the same "next" value from the table, then went into wait mode. (Since A only did a read, the transaction did not lock anything.)
A then updated the table, and probably commited the change before B exited the wait state.
B then updated the table, after A's write was committed.
Try putting the wait statement after the update, before the commit, and see what happens.
It's not a real race condition. It's more a common problem with concurrent transactions. One solution is to set a read lock on the table and therefor have a serialization in place.
This is actually a common problem in SQL databases and that is why most (all?) of them have some built in features to take care of this issue of obtaining a unique identifier. Here are some things to look into if you are using Mysql or Postgres. If you are using a different database I bet the provide something very similar.
A good example of this is postgres sequences which you can check out here:
Postgres Sequences
Mysql uses something called auto increments.
Mysql auto increment
You can set the column to a computed value that is persisted. This will take care of the race condition.
Persisted Computed Columns
NOTE
Using this method means you do not need to store the next code in a table. The code column becomes the reference point.
Implementation
Give the column the following properties under computed column specification.
Formula = dbo.GetNextCode()
Is Persisted = Yes
Create Function dbo.GetNextCode()
Returns VarChar(10)
As
Begin
Declare #Return VarChar(10);
Declare #MaxId Int
Select #MaxId = Max(Id)
From Table
Select #Return = Code
From Table
Where Id = #MaxId;
/* Generate New Code ... */
Return #Return;
End