Imagine I have this simple table:
Table Name: Table1
Columns: Col1 NUMBER (Primary Key)
Col2 NUMBER
If I insert a record into Table1 with no commit...
INSERT INTO Table1 (Col1, Col2) Values (100, 1234);
How does Oracle know that this next INSERT statement violates the PK constraint, given that nothing has been committed to the database yet?
INSERT INTO Table1 (Col1, Col2) Values (100, 5678);
Where/how does Oracle manage the transactions so that it knows I'm violating the constraint when I haven't even committed the transaction yet?
Oracle creates an index to enforce the primary key constraint (a unique index by default). When Session A inserts the first row, the index structure is updated, but the change is not committed. When Session B tries to insert the second row, the index maintenance operation notes that there is already a pending entry in the index with that particular key. Session B cannot acquire the lock protecting that index entry, so it blocks until Session A's transaction completes. At that point, Session B will either be able to make its own modification to the index (because A rolled back) or it will see that the other entry has been committed and throw a unique constraint violation (because A committed).
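A minimal two-session sketch of that sequence (the constraint name in the error will be whatever the PK constraint is actually called; a placeholder is shown here):

-- Session A
INSERT INTO Table1 (Col1, Col2) VALUES (100, 1234);
-- no commit

-- Session B: hangs here, waiting on Session A
INSERT INTO Table1 (Col1, Col2) VALUES (100, 5678);

-- Session A
COMMIT;

-- Session B now returns:
-- ORA-00001: unique constraint (SCOTT.TABLE1_PK) violated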
It's because of the unique index that enforces the primary key constraint. Even though the insert into the data block is not yet committed, an attempt to add a duplicate entry to the index cannot succeed, even from another session.
Just because you haven't committed yet does not mean the first record hasn't been sent to the server. Oracle already knows about your intention to insert the first record. When you insert the second record, Oracle knows for sure there is no way this can ever succeed without a constraint violation, so it refuses.
If another user were to insert the second record while the first is uncommitted, their statement will normally just block until your transaction completes; only with a deferred constraint would Oracle accept it, in which case, if the second user commits before you do, your commit will fail.
Unless a particular constraint is deferred, it is checked at the point of statement execution. If it is deferred, it is checked at the end of the transaction. I'm assuming you did not defer your PRIMARY KEY, and that's why you get a violation even before you commit.
How this is really done is an implementation detail and may vary between different database systems and even versions of the same system. The application developer should probably not make too many assumptions about it. In Oracle's case, PRIMARY KEY uses the underlying index for performance reasons, while there are systems out there that do not even require an index (if you can live with the corresponding performance hit).
BTW, a deferrable Oracle PRIMARY KEY constraint relies on a non-unique index (vs a non-deferrable PRIMARY KEY, which uses a unique index).
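For illustration, a deferrable primary key can be declared like this (a sketch; the constraint name is made up):

CREATE TABLE Table1 (
    Col1 NUMBER,
    Col2 NUMBER,
    CONSTRAINT table1_pk PRIMARY KEY (Col1) DEFERRABLE INITIALLY DEFERRED
);

-- the duplicate-key check is now postponed until COMMIT instead of
-- firing at statement execution, and Oracle backs the constraint
-- with a non-unique index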
--- EDIT ---
I just realized you didn't even commit the first INSERT. I think Justin's answer explains nicely how what is essentially lock contention causes one of the transactions to stall.
Related
I have a postgres database in which I'm refreshing data periodically. Most of the time it works, but sometimes I have issues with a unique index.
Minimal example
create table test_table (
id int
);
create unique index test_table_unique on test_table(id);
(I know, in this case it should be a primary key, but for the sake of example, please bear with me.)
Now, every hour, I do something like this:
begin;
delete from test_table;
insert into test_table (id) values (1), (2), (3)...
commit;
As I said, most of the time it will just work fine. However, sometimes postgres complains about a duplicate entry in the unique index.
ERROR: duplicate key value violates unique constraint "test_table_unique"
DETAIL: Key (id)=(2) already exists.
My real database
In my actual table I'm using JSON payloads, and the unique index is defined on fields of that JSON payload. The table, the index, and the error detail are as follows:
create table if not exists source (
id serial primary key,
payload jsonb not null
);
create unique index if not exists source_index_and_id on source ((payload->>'_index'), (payload->>'_id'));
error details: "Key ((payload ->> '_index'::text), (payload ->> '_id'::text))=(companies, AC9860) already exists."
I'm confident there is no actual duplicate data: I'm deleting everything for a particular payload->>'_index', and payload->>'_id' is unique within my source data.
My understanding was that if I delete rows from a table, the indexes are updated before the next statements execute, but that doesn't seem to be the case. I've found that it helps (though I'm not sure it actually solves the issue) to commit the changes after the delete and before the inserts:
begin;
delete...
commit;
begin;
insert...
commit;
What's happening here?
The only ways this could happen are:
the deleting transaction rolled back
concurrent transactions inserted new rows after you deleted the original ones (sketched below)
the inserting transaction inserts the same key twice
the inserting transaction is accidentally run before the deleting one
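The second of these, concurrent inserts, can play out like this (a sketch with two sessions; id 4 is a key that was not in the table before, so the concurrent insert does not block on your delete):

-- Session 1
begin;
delete from test_table;

-- Session 2 (autocommit), in between
insert into test_table (id) values (4);

-- Session 1
insert into test_table (id) values (1), (2), (3), (4);
-- ERROR: duplicate key value violates unique constraint "test_table_unique"

Under READ COMMITTED, Session 1's insert runs into Session 2's committed row even though the delete came first.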
PostgreSQL is not a fully relational DBMS: it does not satisfy rule 7 of Codd's rules (set-level insert, update, and delete).
Contrary to other RDBMSs, PostgreSQL processes and checks rows one by one, and this lack of set-level operation sometimes produces phantom key violations.
In my paper comparing PostgreSQL to MS SQL Server I made a test that shows this (§ 7, "The hard way to update unique values").
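The classic demonstration of this per-row checking (a sketch, not taken from the paper):

create table t (id int unique);
insert into t values (1), (2), (3);

update t set id = id + 1;
-- typically fails with a duplicate-key error the moment row 1
-- becomes 2 (while 2 still exists): a non-deferred unique
-- constraint is checked per row, not at the end of the statement

Whether it fails depends on the physical order in which the rows are updated; declaring the constraint DEFERRABLE avoids the error.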
Which is the bigger performance hit on a Postgres database when a table has a unique constraint:
Trying the insert and letting it throw a unique-constraint violation error
Checking whether the entry exists and skipping the insert if it does
I'm importing some data, and the ORM connects some entries via a many-to-many connection table. It does not check whether the connection exists; it just runs the query and fails with a unique-constraint violation when it does.
Is it better to leave it like that, or to introduce a step where I check whether the entry exists and then only insert if it doesn't?
I would assume that your check from option 2 would be an extra statement, so it is probably more expensive. I cannot say for sure, since you were rather vague in your question.
Besides, the second approach suffers from a race condition: you can never guarantee that no conflicting row gets inserted by a concurrent session after you checked.
If you want to avoid the error, the best approach would be
INSERT INTO ... VALUES (...) ON CONFLICT DO NOTHING;
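In the many-to-many case from the question that might look like this (the connection table and its columns are made up for illustration):

INSERT INTO book_author (book_id, author_id)
VALUES (1, 2)
ON CONFLICT (book_id, author_id) DO NOTHING;

Note that the conflict target must match the unique constraint (or primary key) on the connection table.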
Performance hit:
Since a unique constraint creates an index on the specified column, it affects the rate of insertion and updates, most noticeably in batch operations where the number of inserts and updates is very large.
I'm writing a program which inserts data into a MariaDB server and can be used by several people at the same time. The transactions take some time, so the following problem might occur: person A starts a transaction with primary key "c" and, while that transaction is still uncommitted, person B wants to insert data with the same primary key "c". How can I prevent B from starting its transaction with a primary key that A is already using in its uncommitted transaction?
I use MariaDB as database and InnoDB as Engine.
I've checked the isolation levels but couldn't figure out how to use them to solve my problem.
Thanks!
It has nothing to do with transaction isolation levels. It's about locking.
Any insert/update/delete to a specific entry in an index locks that entry. Locks are granted first-come, first-served. The next session that tries to insert/update/delete the same index entry will be blocked.
You can demo this yourself. Open two MySQL client windows side by side.
First window:
mysql> START TRANSACTION;
mysql> INSERT INTO mytable SET c = 42;
Then don't commit yet.
Second window:
mysql> INSERT INTO mytable SET c = 42;
Notice that it hangs at this point, waiting for the lock.
First window:
mysql> commit;
Second window finally returns:
ERROR 1062 (23000): Duplicate entry '42' for key 'PRIMARY'
Every table should have a PRIMARY KEY. In MySQL, the PRIMARY KEY is, by definition, UNIQUE.
You can also have UNIQUE keys declared on the table.
Each connection should be doing this to demarcate a transaction:
BEGIN;
various SQL statements
COMMIT;
If any of those SQL statements inserts a row, it uses the unique key(s) to block others from inserting the same unique value into that table. This can lead to some form of error: a deadlock (fatal to the transaction), a "lock wait timeout" (which the session might recover from), etc.
Note: If you have any SELECTs in the transaction, you may need to stick FOR UPDATE on the end of them. This signals what rows you might change in the transaction, thereby giving other connections a heads-up to stay out of the way.
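For example (mytable from the demo above; a sketch):

BEGIN;
SELECT c FROM mytable WHERE c = 42 FOR UPDATE;
-- other connections now block if they try to modify (or
-- FOR UPDATE-select) the matching rows, until COMMIT/ROLLBACK
UPDATE mytable SET c = 43 WHERE c = 42;
COMMIT;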
Can you find out if any of this is going on? Not really. But why bother? Simply plow ahead and do what you need to do. But check for errors to see if some other connection prevented you from doing it.
Think of it as "optimistic" coding.
Leave the isolation level alone; it only adds confusion to typical tasks.
Primary keys are internal values that ensure uniqueness of rows and are not meant to be exposed to the external world.
Generate your primary keys using IDENTITY columns or using SEQUENCEs. They will handle multiple simultaneous inserts gracefully and will assign each one different values.
Using IDENTITY:
CREATE TABLE house (
id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
address VARCHAR(40) NOT NULL
);
INSERT INTO house (address) VALUES ('123 Maple Street');
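To read back the key that was just generated, use the standard MySQL/MariaDB function:

SELECT LAST_INSERT_ID();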
Using a SEQUENCE:
CREATE SEQUENCE myseq1;
CREATE TABLE house (
id INTEGER NOT NULL PRIMARY KEY,
address VARCHAR(40) NOT NULL
);
INSERT INTO house (id, address) VALUES (NEXT VALUE FOR myseq1, '123 Maple Street');
I'm dealing with a set of tables in a database that appear to have a circular relationship (see image). This is the ARTS database if that is of any help to anyone.
A user signing on:
a) must create a (insert into) session, which in turn needs a SessionStartTransactionID (=SignOnTransaction)
b) a SignOnTransaction is a type of ControlTransaction
c) a ControlTransaction is a type of Transaction
d) a Transaction needs a reference to an existing Session (along with Operator, etc.)
Note:
The Transaction.SessionStartTransactionID, Transaction.OperatorID, and Transaction.WorkStationID columns (those 3 form the composite primary key in Session) cannot be NULL in the Transaction table.
I can't figure out how to create (insert into) SignOnTransaction, or insert into any of the tables mentioned above.
How do you do that in SQL Server? Is that even possible?
Where would I start?
Thanks!
If something you're describing seems impossible, then you're understanding it wrong. You can't have a table A with a required key that references table B, which in turn has a required key that references table A. One of the two keys has to be nullable, or the foreign key relationships aren't being enforced.
Some ideas
Given that Session uses StartTransactionID as part of its primary key, it can't be null there, so it seems likely that StartTransactionID in Transaction can be null. In that case you insert Transaction, then ControlTransaction, then SignOnTransaction, then Session, and finally update the Transaction that was created with the id, as sketched below. (If the FK is not enforced, you can skip the update and just use the same value for the PK, provided it isn't an identity column.)
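A sketch of that sequence (column lists abbreviated, key values made up; it assumes Transaction.SessionStartTransactionID is nullable):

-- 1. Create the transaction, leaving the circular FK NULL for now
INSERT INTO [Transaction] (TransactionID, SessionStartTransactionID, OperatorID, WorkStationID)
VALUES (1, NULL, 10, 20);

-- 2. Walk down the subtype chain
INSERT INTO ControlTransaction (TransactionID) VALUES (1);
INSERT INTO SignOnTransaction (TransactionID) VALUES (1);

-- 3. Create the session keyed by that transaction
INSERT INTO [Session] (SessionStartTransactionID, OperatorID, WorkStationID)
VALUES (1, 10, 20);

-- 4. Close the loop
UPDATE [Transaction] SET SessionStartTransactionID = 1
WHERE TransactionID = 1;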
The only other possible solution I can think of is to disable the constraint every time you first insert into Transaction, with something like ALTER TABLE [Transaction] NOCHECK CONSTRAINT constraint_name, and then restore it after you update the table. Seems like a hackish solution, to be sure, especially because while the constraint is disabled nothing stops violating rows from getting in, so you leave yourself open to a ton of problems.
...
Since this appears to be part of a production system, why don't you run a SQL Trace to see how the data is getting populated? That helps me all the time.