Transaction isolation level - choosing the right one - SQL

I'm an SQL beginner and I need help concerning isolation levels of transactions.
I need to know which isolation level is the best for the following situation and why:
There are 3 tables in the database:
Animals (registered by inserting a chip into them) KEY - ID_CHIP REF CHIPS
Chips (which can, but don't have to, be inserted into an animal) KEY - ID_CHIP. One of the attributes is "INSERTED_BY", which references the third table PEOPLE (the ID of the person who inserted the chip, or NULL if it hasn't been inserted yet)
People - KEY: ID
Now let's consider the following transactions: a new chip has been inserted into an animal. A person who updates the database has to change two things:
add a new entity to ANIMALS
update the chip record that was inserted (change the INSERTED_BY attribute from NULL to ID of a person who inserted the chip)
The second transaction is a controller transaction, which checks whether the number of entities in ANIMALS is equal to the number of CHIPS whose INSERTED_BY attribute is not NULL.
A situation is shown by the image below:
Can anyone tell me which of the four isolation levels is best and why? I'm stuck here. Any help would be appreciated.

Your situation is easy because one of the transactions is a purely read transaction. Look into snapshot isolation. Running the reader under the SNAPSHOT isolation level will give it a point-in-time consistent view of the entire database. No locks will be taken or waited on.
This means that at t2 the insert will not be visible to C2.
This is very easy to implement and solves the problem completely.
Without SNAPSHOT isolation you'd need SERIALIZABLE isolation, and you'd deadlock a lot. Then you'd need to investigate locking hints. Much more complex, and not necessary.
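For example, a minimal sketch of the controller transaction under SNAPSHOT isolation (SQL Server syntax; AnimalsDb is just a placeholder database name, and snapshot isolation has to be enabled for the database first):

ALTER DATABASE AnimalsDb SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
-- both counts are taken from the same point-in-time view, so they always match
-- as long as the updater does its INSERT and UPDATE in one transaction
SELECT COUNT(*) FROM ANIMALS;
SELECT COUNT(*) FROM CHIPS WHERE INSERTED_BY IS NOT NULL;
COMMIT;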

Related

Check if a relation does not exist and update another entity

I have a Product table, a Location table and a ProductLocationRel table, which is a relation table mapping locationId to productId.
I need to update a Location entity (mark it deactivated) if no relation with the given location exists.
I thought about having a single SQL query for that, but I'd like to keep such business rules at the code level rather than delegating them to the database.
Therefore, the idea is to programmatically check whether any relation exists, in a single transaction with the SERIALIZABLE isolation level, through find-relation, check-condition and then update steps, like so:
(pseudocode)
t = transaction.start()
exist = t.find(relation with locationId).
if(exist) throw Error("can't do this");
location.isActive = false;
t.update(location);
t.commit();
But I'm not sure how the transaction would behave in this case.
The questions I have on this one are:
If new relation records appear in the DB during the transaction, would the transaction fail? I think yes, but I'm not sure.
Would that approach block the whole relation table for this operation? That might become a bottleneck here.
If it were just a simple location delete, I wouldn't need to care, as DB-level referential integrity would catch this at the delete step, but that is not the case.
I don't think it's relevant, as this touches purely transaction execution and SQL, but the database is Postgres and the runtime is Node.js.
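For reference, roughly what I have in mind at the SQL level (just a sketch; locations, product_location_rel, location_id and is_active stand in for my actual table and column names):

BEGIN ISOLATION LEVEL SERIALIZABLE;

UPDATE locations
SET is_active = false
WHERE location_id = 42   -- example id
  AND NOT EXISTS (SELECT 1
                  FROM product_location_rel r
                  WHERE r.location_id = 42);

COMMIT;

From what I've read, SERIALIZABLE in Postgres aborts one of two conflicting transactions with a serialization failure instead of letting both commit, so I assume I'd have to retry in that case.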

Database for upvotes and downvotes

I have a database with a table for articles, and the table has a field for upvotes.
I was thinking about creating an SQL query with which I would first get the current value of upvotes and then increment the value by 1.
But what if 5 people click on the upvote button at once, what will happen then?
Or is there a better way to do this altogether?
My strong suggestion is to keep a record of every upvote and downvote in a votes table:
create table votes (
votes_id <autoincrement> primary key,
user_id int references users(user_id), -- whodunnit
topic_id int references topics(topic_id), -- what they're voting on
inc int,
created_at datetime default current_timestamp,
check (inc in (-1, 1))
);
You can then summarize the votes as you want. You can see trends in voting over time. You can ensure that someone can "unvote" if they have voted in the past.
And, inserting into a table runs no risk of having different users interfere with each other.
The downside is that summarizing the results takes a bit more time. You can optimize that when the issue arises.
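For example, the current score per topic can be computed with a simple aggregate over the table above (a sketch, assuming one row per vote as defined there):

SELECT topic_id, SUM(inc) AS score
FROM votes
GROUP BY topic_id;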
There are two solutions:
If you really need to load the value into your application and increment it there, writing it back afterwards, get an appropriate lock on the table before selecting the value. Release the lock once you are finished with the value, either because the operation was cancelled or because the actual upvote has been written back.
Otherwise a concurrent instance B could read the same value and write it back after the first instance A. Say both read 3. Both increment it to 4. A writes it back before B, so the value in the database is 4; B then also writes it back and the value in the database is again 4, even though two upvotes should have made it 5. One upvote gets "lost" this way. This is called the "lost update" problem.
You can prevent this with a lock as mentioned: B cannot read from the table before A has updated the value and released the lock. Afterwards B reads 4 instead of 3 and therefore writes back 5, which is correct.
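A minimal sketch of that read-modify-write variant, using a row-level lock (SELECT ... FOR UPDATE) rather than a full table lock, and assuming the votes/article names used in the statement below:

BEGIN;
-- locks the row so a concurrent writer has to wait until COMMIT
SELECT votes FROM votes WHERE article = 42 FOR UPDATE;
-- the application increments the value it just read, e.g. 3 -> 4
UPDATE votes SET votes = 4 WHERE article = 42;
COMMIT;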
But preferably, do it in a single update, like
UPDATE votes
SET votes = votes + 1
WHERE article = #some_id;
That is, you increment the actual value in the database, regardless of what your application thinks this value currently is.
Provided that your transaction has an appropriate isolation level, the database will take care of locking by itself and thus keep concurrent transactions from updating with "dirty", outdated data.
I suggest you read a little more about transactions, isolation levels and locking to fully understand the problem.

Oracle Audit Trail to get the list of columns which got updated in last transaction

Consider a table (Student) under a schema, say Candidates (NOT DBA):
Student{RollNumber : VARCHAR2(10), Name : VARCHAR2(100), Class : VARCHAR2(5), .........}
Let us assume that the table already contains some valid data.
I executed an update query to modify the name and class of the Student table
UPDATE STUDENT SET Name = 'ASHWIN' , CLASS = 'XYZ'
WHERE ROLLNUMBER = 'AQ1212'
Followed by another update query in which I am updating some other fields
UPDATE STUDENT SET Math_marks = 100, PHY_marks = 95, CLASS = 'XYZ'
WHERE ROLLNUMBER = 'AQ1212'
Since I modified different columns in two different queries, I need to fetch the list of columns that got updated in the last transaction. I am pretty sure that Oracle must be maintaining this in some table logs which could be accessed by a DBA, but I don't have DBA access.
All I need is the list of columns that got updated in the last transaction under the schema Candidates. I DO NOT have DBA rights.
Please suggest some ways.
NOTE: Above I mentioned a simple table, but in actual fact I have 8-10 tables for which I need to do this auditing, where a key, let's say ROLLNUMBER, acts as a foreign key for all the other tables. Writing triggers for all tables would be complex, so please help me out if there exists some other way to fetch the same.
"I am pretty sure that oracle must be maintaining this in some table logs which could be accessed by DBA."
Actually, no, not by default. An audit trail is a pretty expensive thing to maintain, so Oracle does nothing out of the box. It leaves us to decide what we want to audit (actions, objects, granularity) and then to switch on auditing for those things.
Oracle requires DBA access to enable the built-in functionality, so that may rule it out for you anyway.
Auditing is a very broad topic, with lots of things to consider and configure; the Oracle documentation devotes a big chunk of the Security manual to it, starting with its Introduction to Auditing chapter. For monitoring updates to specific columns, what you're talking about is Fine-Grained Auditing.
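Just for illustration, a fine-grained audit policy on those two columns might look roughly like this (DBMS_FGA.ADD_POLICY; note that executing DBMS_FGA normally requires privileges a non-DBA account will not have):

BEGIN
  DBMS_FGA.ADD_POLICY(
    object_schema   => 'CANDIDATES',
    object_name     => 'STUDENT',
    policy_name     => 'AUDIT_STUDENT_UPDATES',
    audit_column    => 'NAME,CLASS',
    statement_types => 'UPDATE');
END;
/

The audited statements then show up in DBA_FGA_AUDIT_TRAIL, which again is a DBA view.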
"I have got 8-10 tables ... Writing triggers would be a complex for all tables."
Not necessarily. The triggers will all resemble each other, so you could build a code generator using the data dictionary view USER_TAB_COLUMNS to customise some generic boilerplate text.
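As a rough sketch of that idea (audit_pkg.log_change is a hypothetical procedure standing in for whatever logging you implement):

SELECT 'IF UPDATING(''' || column_name || ''') THEN audit_pkg.log_change(''STUDENT'', ''' || column_name || '''); END IF;'
FROM   user_tab_columns
WHERE  table_name = 'STUDENT'
ORDER  BY column_id;

The query emits one IF UPDATING(...) branch per column, which you can paste into a generic AFTER UPDATE trigger body, and repeat per table.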

Database design: different precision level to describe a table

I am designing a DB and I cannot figure out how to model the following situation:
There is a main table that is called "Transaction".
Every transaction has a "Status" to describe it.
Every status has 1 or 2 "Substatus" to describe it.
A "Substatus" can have a "Subsubstatus" to describe it.
Moreover, I need to express in the model that every "substatus" or "subsubstatus" is strictly linked to its master table: for a given "substatus" there is only one possible status.
The link between "Status", "Substatus" and "Subsubstatus" logically looks like this:
Status : STA_Id,STA_Name
Substatus : SST_Id,*STA_Id*,SST_Name
Subsubstatus : SSS_Id,*SST_Id*,SSS_Name
But the problem is how to link that to the "Transaction" table, taking into account that it can have 2 substatuses and a subsubstatus.
I thought of linking "Subsubstatus" to "Transaction", but that forces me to give a subsubstatus to every transaction, which is not really the case.
If you have an idea about that, it would be awesome!
You are describing several things:
1. A Transaction has a Status, a SubStatus and a SubSubStatus.
2. There is a parent-child relationship between Status and SubStatus.
3. There is a parent-child relationship between SubStatus and SubSubStatus.
4. A Transaction's SubStatus is constrained by its Status, defined by the relationship between Status and SubStatus.
5. A Transaction's SubSubStatus is constrained by its SubStatus, defined by the relationship between SubStatus and SubSubStatus.
Points 1, 2 and 3 are obviously defined by appropriate foreign keys.
Points 4 and 5 will need to be defined by more complicated constraints. These constraints will depend on how your database is implemented.
Alternatively, you could denormalise all valid combinations into a single entity. This would make it easy to ensure a Transaction has a valid combination, but problematic when the relationships between statuses change.
Store all the (sub)*statuses in the same table.
STA_Id,STA_Name,*SuperStatus_ID*
You might also want an extra field, STA_Depth to indicate how many sub levels you are down.
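A sketch of that single table (generic SQL, reusing the naming convention from the question):

CREATE TABLE Status (
  STA_Id         INT PRIMARY KEY,
  STA_Name       VARCHAR(100) NOT NULL,
  SuperStatus_Id INT REFERENCES Status(STA_Id),  -- NULL for a top-level status
  STA_Depth      INT NOT NULL                    -- 0 = status, 1 = substatus, 2 = subsubstatus
);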
Make another table and join the subsubstatus and transaction tables in it, such as:
trans_subsubstatus: SSS_Id, SST_Id, STA_Id
It is easier to maintain a normal table than a complex one.
Looks like you need something similar to this:
(NOTE: There is also F2 in front of Status.TransactionId, but I couldn't show it above due to the limitation of the diagramming tool I used.)
Legend:
F1: The Status table references (via FOREIGN KEY) the Transaction table.
F2: The Status table references itself (SuperStatusName references StatusName and TransactionId references itself).
PK: Denotes primary keys.
The Status table represents both statuses and sub-statuses. It allows a transaction to have an unlimited number of statuses [1], each status to have an unlimited number of sub-statuses, and so on...
Since we have used the identifying relationships and the resulting composite natural keys, all statuses (at all levels of hierarchy) of the same transaction share the same TransactionId. This makes it very easy to get the set of all statuses of the given transaction. However, putting these statuses in the correct order requires recursive querying.
It also makes it very easy to query in the opposite direction and find the transaction of the given status.
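A sketch of that Status table in DDL form (generic SQL; it assumes the Transaction table has a TransactionId key, and "Transaction" is quoted because it is a reserved word in most dialects):

CREATE TABLE Status (
  TransactionId   INT          NOT NULL,
  StatusName      VARCHAR(100) NOT NULL,
  SuperStatusName VARCHAR(100) NULL,
  PRIMARY KEY (TransactionId, StatusName),                                                    -- PK
  FOREIGN KEY (TransactionId) REFERENCES "Transaction" (TransactionId),                       -- F1
  FOREIGN KEY (TransactionId, SuperStatusName) REFERENCES Status (TransactionId, StatusName)  -- F2
);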
In addition to that, a given StatusName can appear only once per transaction (i.e. two statuses of the same transaction cannot share the same name, even if they are at different levels of the hierarchy) [2].
[1] If you really want to limit this to only 2 statuses at the database level, that is possible, but it would complicate the conceptually simple model above and would unnecessarily limit you should the requirements change in the future.
[2] This may or may not be what you want. If not, the model above can be changed accordingly.

NHibernate transaction and race condition

I've got an ASP.NET app using NHibernate to transactionally update a few tables upon a user action. There is a date range involved, whereby only one entry can be made in a 'Booking' table for a given date range, i.e. the dates must be exclusive.
My problem is how to prevent a race condition whereby two user actions occur almost simultaneously and cause multiple entries in 'Booking' for >1 date. I can't check just prior to calling .Commit() because I think that will still leave me with a race condition.
All I can see is to do a check AFTER the commit and roll the change back manually, but that leaves me with a very bad taste in my mouth! :)
booking_ref (INT) PRIMARY_KEY AUTOINCREMENT
booking_start (DATETIME)
booking_end (DATETIME)
Make the isolation level of your transaction SERIALIZABLE (session.BeginTransaction(IsolationLevel.Serializable)) and do the check and the insert in the same transaction. You should not set the isolation level to SERIALIZABLE in general, just in situations like this.
or
Lock the table before you check and, if appropriate, insert. You can do this by firing a SQL query through NHibernate:
session.CreateSQLQuery("SELECT null as dummy FROM Booking WITH (tablockx, holdlock)").AddScalar("dummy", NHibernateUtil.Int32);
This will lock only that table for selects / inserts for the duration of that transaction.
Hope it helped
The above solutions can be used, with a caveat: when I sent 100 requests using Parallel.For while the transaction level was SERIALIZABLE, there were indeed no duplicate request ids, but 25 of the transactions failed. That was not acceptable for my client, so we fixed the problem by storing only the request id and adding a unique index on a separate table as a temporary measure.
Your database should manage your data integrity.
You could make your 'date' column unique. Then, if 2 threads try to insert the same date, one will succeed and the other will throw a unique key violation.
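For example (a sketch against the Booking columns from the question; a plain unique constraint like this only works if bookings are stored per single date rather than as arbitrary overlapping ranges):

ALTER TABLE Booking ADD CONSTRAINT uq_booking_start UNIQUE (booking_start);

The second of two concurrent inserts for the same date then fails with a unique key violation, which the application can catch and report.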