Prevent duplicate key usage - sql

The database is PostgreSQL. For a simplified example, I will insert measurement data into various tables. The DDL for one example table looks like this:
CREATE TABLE measurement
(
    id_meas BIGINT NOT NULL,
    ...
    PRIMARY KEY (id_meas)
);
The process of inserting data currently works like this:
Select max id value from table
Increment id value
Insert next data row using incremented id value
This works as long as there is only one client inserting data. But what if more than one client is inserting, so that two clients both select 567 as the max id value and both increment it to 568 as the next id to insert? In that case the second client executing the insert will receive a duplicate key error. Is there a way to prevent those errors other than re-executing the insertion process after an error occurs?
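Expressed as a single statement, the max+1 pattern looks roughly like the sketch below (assuming the remaining measurement columns are nullable or have defaults); even in this form, two concurrent clients can compute the same value and one of them will hit the duplicate key error:
-- Racy max+1 pattern: concurrent clients may both read the same max value
INSERT INTO measurement (id_meas)
SELECT coalesce(max(id_meas), 0) + 1
FROM measurement;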

You are looking for a serial column:
CREATE TABLE measurement (
    id_meas bigserial primary key,
    ...
);
bigserial is a bigint that auto-increments. You can also just use serial if an int is big enough.
This puts the database in charge of incrementing the value, rather than the application. You are guaranteed that race conditions will not result in the same value in different records. It is possible that gaps in the value will appear under some circumstances.
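With a serial column, clients stop computing ids entirely. A minimal sketch (again assuming the remaining columns are nullable or have defaults):
INSERT INTO measurement DEFAULT VALUES
RETURNING id_meas;  -- the database assigns the id and hands it back to the client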

Related

concurrent insertion with unique key constraint

I have a web application with multiple instances running, all connected to a PostgreSQL database with a table, myTable, that has only two columns, one of which has a unique key constraint.
This application will receive at least 15K requests per second across the instances, and my requirement is to avoid duplicate insertions into the table.
myTable:
CREATE TABLE myTable (
    id SERIAL PRIMARY KEY,
    myData VARCHAR(50) UNIQUE
);
Insert statement:
INSERT INTO myTable (myData) VALUES ('data-1');
My question is: if all the instances try to insert the same data at the same time (assume the above insert statement coming from every instance), will only one record be inserted, with one instance getting a success message and all the others getting a duplicate key error?
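For what it's worth, if the goal is to avoid the error itself rather than merely detect it, PostgreSQL (9.5+) can skip the conflicting insert; a sketch against the table above:
INSERT INTO myTable (myData)
VALUES ('data-1')
ON CONFLICT (myData) DO NOTHING;  -- concurrent duplicates are skipped instead of raising an error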

How do I ensure that a referencing table also has data

My Postgres database has the following schema, where a user can store multiple profile images.
CREATE TABLE users (
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name VARCHAR(50)
);
CREATE TABLE images (
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    url VARCHAR(50)
);
CREATE TABLE user_images (
    user_id INT REFERENCES users(id),
    image_id INT REFERENCES images(id)
);
How do I ensure that when I insert a user object, I also insert at least one user image?
You cannot do so very easily, and I wouldn't encourage you to enforce it. Why? It is a "chicken and egg" problem: you cannot insert a row into users because there is no image yet, and you cannot insert a row into user_images first because there is no user_id yet.
Although you can handle this situation with transactions or deferred constraint checking, that covers only half the issue, because you also have to prevent deletion of the last image.
Here are two alternatives.
First, you can simply add a main_image_id to the users table and insist that it be NOT NULL. Voila! At least one image is required.
Second, you can use a trigger to maintain a count of images in users. Then treat rows with no images as "deleted" so they are never seen.
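A minimal sketch of the first alternative, assuming the images table from the question (using the main_image_id name from above):
CREATE TABLE users (
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name VARCHAR(50),
    main_image_id INT NOT NULL REFERENCES images(id)  -- at least this one image must exist
);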
When you insert data into a table, the database can return the id of the row that was inserted; if an id comes back, the row was inserted. But first, add an id column (bigserial, auto-incrementing, unique) to all the tables.
INSERT INTO user_images VALUES (...) RETURNING id;

Max Number Generation

I have been developing an application to register students' details for admission. In SQL Server, I have to save the records and generate a reference number, and that reference number must be unique.
At present I take the max number from the table and insert it back, but duplicate records get inserted within milliseconds. How do I avoid duplicates in the reference number column? In my application, 1000 concurrent users register at the same time.
For example, IRCTC ticket booking: they generate PNRs without duplicates.
There is no reason why a regular auto increment primary key column should not suffice here:
CREATE TABLE students (
    id INT NOT NULL IDENTITY PRIMARY KEY,
    ...
);
SQL Server would never generate a duplicate value for the above id column. If, for some reason, this would not meet your expectations, then you could consider using a UUID for each student record, using the NEWID() function.
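A minimal sketch of that UUID alternative (the name column here is only a placeholder for the real student details):
CREATE TABLE students (
    id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
    name NVARCHAR(100) NOT NULL
);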

SQL Server : increment number w/ constraints

I have created a table that will generate an ID for customers: a number starting at 101 and increasing by 1 for each customer. So the first customer will have the ID 101, the second 102, and so on. In addition to the ID I have other information, namely first and last names. I have also added a constraint on the first and last name columns that forces the entries to consist of letters only.
Here is the SQL statement:
CREATE TABLE tblcustomer
(
    CUST_ID INT NOT NULL IDENTITY(101,1) PRIMARY KEY,
    FIRST_NAME VARCHAR(15) NOT NULL,
    LAST_NAME VARCHAR(15) NOT NULL,
    CONSTRAINT firstlet CHECK (FIRST_NAME NOT LIKE '%[^a-zA-Z]%'
                           AND LAST_NAME NOT LIKE '%[^a-zA-Z]%')
);
This works as intended except for one small issue. When I try to insert, say, a number for one of the name columns, the constraint kicks in and nothing is entered into the table. But when I then insert a first and last name correctly, the row is added, yet CUST_ID skips a number.
Example Inserts:
insert into tblcustomer(FIRST_NAME,LAST_NAME) values ('Bob','Smith');
insert into tblcustomer(FIRST_NAME,LAST_NAME) values ('Greg','Johns');
insert into tblcustomer(FIRST_NAME,LAST_NAME) values ('Todd','123');
insert into tblcustomer(FIRST_NAME,LAST_NAME) values ('Todd','Howe');
Output:
CUST_ID  FIRST_NAME  LAST_NAME
------------------------------
101      Bob         Smith
102      Greg        Johns
104      Todd        Howe
So where the CUST_ID shows 104 should actually be 103.
Skipping a number is fine. It's normal behavior in any database, and you shouldn't expect the numbers to remain consecutive forever. If this bothers you, try using a GUID key instead.
An identity value is consumed the moment an insert is attempted. Hence even when the insertion fails due to a validation constraint, that number is already taken.
If your business case requires an exact sequence of IDs (preserving insertion order), you will need to set the ID column value manually with SET IDENTITY_INSERT ON and increment the max ID yourself. Note that if multiple such requests arrive concurrently, there can be race conditions where two records are attempted with the same ID and the second fails on the primary key constraint.
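A sketch of that manual approach (SQL Server); note that it does not remove the race condition just described:
SET IDENTITY_INSERT tblcustomer ON;
INSERT INTO tblcustomer (CUST_ID, FIRST_NAME, LAST_NAME)
SELECT COALESCE(MAX(CUST_ID), 100) + 1, 'Todd', 'Howe'  -- compute max+1 manually
FROM tblcustomer;
SET IDENTITY_INSERT tblcustomer OFF;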
If all you want is for the primary key to be unique and assigned automatically, use a GUID field. That will save you all this effort.
A simple example: you are using a sequence for auto-increment. You begin a transaction, insert a record into the table, and then roll the transaction back.
The next insert will skip that value, because the insert does not hold a lock on the sequence.
The sequence simply advances; its job is done whether you use the value or not. As a best practice this is good and healthy for performance.
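A small demonstration of that behaviour against the table above (SQL Server):
BEGIN TRANSACTION;
INSERT INTO tblcustomer (FIRST_NAME, LAST_NAME) VALUES ('Ann', 'Lee');
ROLLBACK;  -- the row is gone, but the identity value it consumed is not given back
INSERT INTO tblcustomer (FIRST_NAME, LAST_NAME) VALUES ('Ben', 'Ray');  -- gets the next value, leaving a gap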

SQL self calculated field

I have two tables which are connected by an m2m (many-to-many) relationship.
CREATE TABLE words
(
    id INT PRIMARY KEY,
    word VARCHAR(100) UNIQUE,
    counter INT
);
CREATE TABLE urls
(
    id INT PRIMARY KEY,
    url VARCHAR(100) UNIQUE
);
CREATE TABLE urls_words
(
    url_id INT NOT NULL REFERENCES urls(id),
    word_id INT NOT NULL REFERENCES words(id)
);
And I have a counter field in the words table. How can I automate the process of updating the counter field, which should hold how many rows are stored in urls_words for a particular word?
I would investigate why you want to store this value. There may be good reasons, but triggers complicate databases.
If this is a "load-then-query" database, then you can update the count when you load data -- presumably at some frequency such as once a day or once a week. You don't need to worry about triggers.
If this is a transactional database, then triggers would be needed and these add complexity to the processing. They also lock tables when you might not want them locked.
An alternative is to have an index on urls_words(word_id, url_id). This would greatly speed up calculating the count when you need it, and it requires no triggers or locks on multiple tables during an update.
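A sketch of that index-based alternative, computing the count on demand (the index name and the word_id value are just examples):
CREATE INDEX idx_urls_words_word_url ON urls_words (word_id, url_id);
SELECT count(*) FROM urls_words WHERE word_id = 42;  -- fast count for one word via the index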
Create a trigger on the urls_words table which updates the counter column in the words table every time a change is made (i.e. insert, update, delete).
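A minimal sketch of such a trigger, assuming PostgreSQL 11+ (function and trigger names are my own; use EXECUTE PROCEDURE on older versions):
CREATE OR REPLACE FUNCTION sync_word_counter() RETURNS trigger AS $$
BEGIN
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        UPDATE words
           SET counter = (SELECT count(*) FROM urls_words WHERE word_id = NEW.word_id)
         WHERE id = NEW.word_id;
    END IF;
    IF TG_OP IN ('DELETE', 'UPDATE') THEN
        UPDATE words
           SET counter = (SELECT count(*) FROM urls_words WHERE word_id = OLD.word_id)
         WHERE id = OLD.word_id;
    END IF;
    RETURN NULL;  -- AFTER trigger, so the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER urls_words_counter
AFTER INSERT OR UPDATE OR DELETE ON urls_words
FOR EACH ROW EXECUTE FUNCTION sync_word_counter();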