Databases - ID column - identity or not? - sql

I did some research on SQL batch inserts. Let's say I have 100k items to insert, and I set the batch size to 100.
If the ID column is not marked as Identity, then that bulk insert will work.
But I found a quite interesting (theoretical, so far) problem, and I need some opinions:
The problem is this: if, say, 5 users are performing bulk inserts at the same time, how do I safely provide the ID column value? I can't just take the table row count + 1, because then all 5 users would end up with duplicate IDs and the bulk insert operations would fail.

You can use a SEQUENCE as a unique ID generator, or try a BEFORE INSERT trigger to get a unique ID.
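Since the question mentions Identity, here is a minimal sketch of the sequence approach in SQL Server syntax (the sequence and table names are placeholders, not from the question):

CREATE SEQUENCE item_id_seq START WITH 1 INCREMENT BY 1;

-- Each session draws a distinct value from the sequence, so five users
-- bulk-inserting at the same time cannot collide on the ID.
INSERT INTO items (id, name)
VALUES (NEXT VALUE FOR item_id_seq, 'first item');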
EDIT
With MySQL you can build a trigger that runs for every row:
DELIMITER $$
CREATE TRIGGER adresse_trigger_insert_check
BEFORE INSERT ON adresse
FOR EACH ROW BEGIN
    -- fill in a default when no value was supplied
    IF NEW.land IS NULL THEN
        SET NEW.land := 'XY';
    END IF;
END$$
DELIMITER ;
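Note that the trigger above fills in a default value rather than generating a unique ID. In MySQL itself, the usual concurrency-safe ID generator is AUTO_INCREMENT; a minimal sketch (the table and columns are placeholders):

CREATE TABLE items (
    -- MySQL hands out distinct AUTO_INCREMENT values even under concurrent inserts
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100)
);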



Compound Trigger is Triggering Only Once

I have an ORDS-enabled schema which accepts bulk JSON records, splits them one by one, and inserts them one by one into the UEM table.
I've tried to create a trigger which fetches the last inserted row's ID and uses that value to insert into another table. The problem is, the trigger below only fetches the last inserted row's ID, and it only does 1 insert.
To be more specific:
ORDS gets a bulk JSON payload which consists of 4 records.
The POST handler starts a procedure which splits these 4 records by line break and immediately inserts them into CLOB columns of the UEM table as 4 separate rows using "connect by level". There is also the ID column, which is automatically created and incremented.
In parallel, I would also like to get the IDs of these rows and use them in an insert into another table. I've created the compound trigger below, but this trigger only retrieves the ID of the last record, and inserts only one row.
Why do you think it behaves like this? In the end, the procedure "inserted" 4 records.
CREATE OR REPLACE TRIGGER TEST_TRIGGER5
FOR INSERT ON UEM
COMPOUND TRIGGER
    lastid NUMBER;

    AFTER STATEMENT IS
    BEGIN
        SELECT MAX(ID) INTO lastid FROM UEM;
        INSERT INTO SPRINT1 (tenantid, usersessionid, newuser, totalerrorcount, userid)
        VALUES ('deneme', 'testsessionid', 'yes', lastid, 'asdasfqwqwe');
    END AFTER STATEMENT;
END TEST_TRIGGER5;
inserts them into CLOB columns of the UEM table as 4 separate rows using "connect by level".
You have 1 INSERT statement that is inserting 4 rows.
In parallel, I would also like to get the IDs of these rows and use them in an insert into another table. I've created the compound trigger below, but this trigger only retrieves the ID of the last record, and inserts only one row.
Why do you think it behaves like this? In the end, the procedure "inserted" 4 records.
It may have inserted 4 ROWS but there was only 1 STATEMENT and you are using an AFTER STATEMENT trigger. If you want it to run for each row then you need to use a row-level trigger.
CREATE OR REPLACE TRIGGER TEST_TRIGGER5
AFTER INSERT ON UEM
FOR EACH ROW
BEGIN
    INSERT INTO SPRINT1 (tenantid, usersessionid, newuser, totalerrorcount, userid)
    VALUES ('deneme', 'testsessionid', 'yes', :NEW.id, 'asdasfqwqwe');
END TEST_TRIGGER5;
/
db<>fiddle here
Why? Because it is a statement-level trigger. If you wanted it to fire for each row, you'd, obviously, use a row-level trigger, which has the "for each row" clause.
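If you would rather keep the compound trigger structure, the same per-row insert can live in its AFTER EACH ROW section instead; a sketch, equivalent to the row-level trigger above:

CREATE OR REPLACE TRIGGER TEST_TRIGGER5
FOR INSERT ON UEM
COMPOUND TRIGGER
    AFTER EACH ROW IS
    BEGIN
        -- runs once per inserted row, so each UEM row produces a SPRINT1 row
        INSERT INTO SPRINT1 (tenantid, usersessionid, newuser, totalerrorcount, userid)
        VALUES ('deneme', 'testsessionid', 'yes', :NEW.id, 'asdasfqwqwe');
    END AFTER EACH ROW;
END TEST_TRIGGER5;
/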

Oracle : Create session sequence?

I have a table as follows.
The table contains my application users and stores their clients. The column User Client ID is a foreign key linked to a different table that stores the client details.
I need another column (User Client Counter) which is just a per-user counter of clients. It needs to start from 1 and go up for each individual application user.
For the moment I'm populating this by counting the number of clients for each user + 1 before inserting a new row in the table:
select count(*) + 1 into MyVariable from Mytable where UserClientId = Something
Then I use MyVariable for the column User Client Counter.
This method works quite well, but if the user is connected from two different sessions, the query may produce a wrong count... In addition, performance may suffer on big tables...
Is there any better way to replace this process, for example by using sequences?
I've been looking at session sequences, but they are reset at the end of each session.
(This column is a business need and cannot be replaced by something like row_number() in reporting queries, since every client has to keep the same identifier for the application user.)
Thank you in advance.
Cheers,
I think you can just create a unique index on the app user and the running number:
create unique index idx on mytable (app_user_id, num);
And then insert with max + 1. If two sessions ever compute the same value concurrently, the unique index makes one of the inserts fail rather than letting a duplicate in:
insert into mytable (app_user_id, client_id, num)
values
(
    :app_user_id,
    :client_id,
    coalesce((select max(num) + 1 from mytable where app_user_id = :app_user_id), 1)
);
For this sort of requirement to be safe, you will need to be able to lock rows at the right level, so that you don't have two sessions that think they are allowed to use the same value. The impact of this is that while one session is inserting a row for the 'Company X' user, another session that is also trying to insert a row for 'Company X' will wait for the first to commit.
This is super easy when you just have a table that stores information at the right level.
You can have a table of your users with a counter column which starts at 0.
MY_APPLICATION_USER   CLIENT_COUNTER
-------------------   --------------
Company X                          1
Company Y                          3
Company Z                          1
As you insert rows into your main table, you update this table first, setting client_counter to client_counter + 1 (you do this as one atomic update statement, not a risky select-then-update!), then you use the returning clause to read the updated value back into the new row's counter column. This can all be done with a simple trigger.
create or replace trigger app_clients_ins
before insert
on app_clients
for each row
begin
    -- atomically bump the per-user counter and read the new value back
    update app_users
       set client_counter = client_counter + 1
     where my_application_user = :new.my_application_user
    return client_counter into :new.user_client_number;
end;
/
Of course, like any sequence, if you delete a row the next insert is not going to fill that gap.
(db<>fiddle https://dbfiddle.uk/?rdbms=oracle_18&fiddle=7f1b4f4b6316d5983b921cae7b89317a )
If you want unique values to be inserted and there is a chance that multiple users will insert rows into the same table at the same time, then it is better to use an Oracle sequence.
CREATE SEQUENCE id_seq INCREMENT BY 1;
INSERT INTO Mytable(id) VALUES (id_seq.nextval);
In your case, I think you want a different sequence created for each customer. How many different customers do you have? If you have hundreds, then I don't think creating sequences will work, as you would have to create just as many sequences.

Restricting max no of rows insert in SQLite

I want to create a table which should contain a max of 5 rows; whenever there is a 6th insert operation, the last row must be deleted.
I want a max of 5 rows, and I want to do this in a SQLite database on Android.
Please suggest a simple query.
You could do a row count in your application. If it returns five or more rows, you delete the last row before inserting a new one.
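A minimal sketch of that application-side approach in SQLite (table and column names are placeholders; this assumes, as in the trigger attempt below, that the row with the highest id is the one to delete):

BEGIN;
-- evict the last row only when the table is already at its 5-row cap
DELETE FROM test_table
 WHERE id = (SELECT MAX(id) FROM test_table)
   AND (SELECT COUNT(*) FROM test_table) >= 5;
INSERT INTO test_table (data) VALUES ('new row');
COMMIT;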
UPDATE:
I did a bit of testing on a table which I named test_table, and added this trigger:
DELIMITER $$
CREATE TRIGGER `schema_name`.`trigger_name` BEFORE INSERT ON `database_name`.`test_table`
FOR EACH ROW
BEGIN
    IF (SELECT COUNT(*) FROM test_table) > 4 /* count records to see if it exceeds your limit */
    THEN
        DELETE FROM test_table WHERE id = (SELECT MAX(id) FROM test_table); /* delete last row */
    END IF;
END$$
DELIMITER ;
When I try to insert a sixth row I'm getting this error:
ERROR 1442: 1442: Can't update table 'test_table' in stored function/trigger because it is already used by statement which invoked this stored function/trigger.
I looked it up and found this answer: https://stackoverflow.com/a/21117071/1355562 - which led me to http://dev.mysql.com/doc/refman/5.6/en/stored-program-restrictions.html.
The section "Restrictions for Stored Functions", point 5, says "... cannot modify a table that is already being used..."
Looks like you have to do this in your application, bro ;)
Oops... just noticed you're working on SQLite. Maybe it's different; this was for MySQL... Sorry about that...
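For what it's worth, SQLite does not share MySQL's restriction on modifying the trigger's own table, so a trigger-based version is possible there. A minimal sketch (table and column names are placeholders):

CREATE TRIGGER keep_five_rows AFTER INSERT ON test_table
BEGIN
    -- keep only the 5 newest rows (by id); everything else is deleted
    DELETE FROM test_table
    WHERE id NOT IN (SELECT id FROM test_table ORDER BY id DESC LIMIT 5);
END;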

multithreading with the trigger

I have written a trigger which transfers a record from a table members_new to members_old. The function of the trigger is to insert a record into members_old after each insert into members_new. So suppose a record is getting inserted into members_new like:
nMmbID    nMmbName    nMmbAdd
1         Abhi        Bangalore
This record will get inserted into members_old, which has the same structure.
My trigger looks like this:
create trigger add_new_record
after insert on members_new
for each row
INSERT INTO `test`.`members_old`
(
    `nMmbID`,
    `nMmbName`,
    `nMmbAdd`
)
(
    SELECT
        `members_new`.`nMmbID`,
        `members_new`.`nMmbName`,
        `members_new`.`nMmbAdd`
    FROM `test`.`members_new`
    -- reads the last record from members_new to stop duplication in members_old;
    -- this also reduces the chance of errors
    where nMmbID = (select max(nMmbID) from `test`.`members_new`)
);
This trigger is working for now, but my confusion is about what will happen if multiple insertions happen at the same instant.
Will it reduce the performance?
Will I ever face a deadlock in any case, given that members_old has FKs?
If there is any better solution for this situation, please shed some light on it.
From the manual:
You can refer to columns in the subject table (the table associated with the trigger) by using the aliases OLD and NEW. OLD.col_name refers to a column of an existing row before it is updated or deleted. NEW.col_name refers to the column of a new row to be inserted or an existing row after it is updated.
create trigger add_new_record
after insert on members_new
for each row
INSERT INTO `test`.`members_old`
SET
    `nMmbID` = NEW.nMmbID,
    `nMmbName` = NEW.nMmbName,
    `nMmbAdd` = NEW.nMmbAdd;
And you will have no problem with deadlocks or whatever. It should also be much faster, because you don't have to read the max value first (which is also unsafe and might lead to compromised data). Read about isolation levels and transactions if you're interested in why...

Why are sequences not updated when COPY is performed in PostgreSQL?

I'm inserting bulk records using the COPY statement in PostgreSQL. What I've realized is that the sequence IDs are not getting updated, and when I try to insert a record later, it throws a duplicate sequence ID error. Should I manually update the sequence number after performing COPY? Isn't there a way to have COPY increment the sequence, that is, the primary key field of the table, as it loads? Please clarify this for me. Thanks in advance!
For instance, if I insert 200 records, COPY works fine and my table shows all the records. When I manually insert a record later, it throws a duplicate sequence ID error. This clearly implies that it didn't increment the sequence IDs during COPY, as it does during a normal INSERT. Instead of manually setting the sequence to the max ID, isn't there a mechanism to tell the COPY command to increment the sequence IDs during its bulk copying?
You ask:
Should I manually update the sequence number to get the number of records after performing COPY?
Yes, you should, as documented here:
Update the sequence value after a COPY FROM:
BEGIN;
COPY distributors FROM 'input_file';
SELECT setval('serial', max(id)) FROM distributors;
END;
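As a side note, the sequence name doesn't have to be hard-coded; pg_get_serial_sequence can look it up from the table and column names (a sketch against the same example table, assuming the serial column is named id):

BEGIN;
COPY distributors FROM 'input_file';
-- resolve the sequence backing distributors.id, then sync it to the loaded data
SELECT setval(pg_get_serial_sequence('distributors', 'id'), max(id)) FROM distributors;
END;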
You write:
it didn't increment the sequence IDs during COPY, as it does during a normal INSERT
But that is not so! :) When you perform a normal INSERT, typically you do not specify an explicit value for the SEQUENCE-backed primary key. If you did, you would run into the same problems as you are having now:
postgres=> create table uh_oh (id serial not null primary key, data char(1));
NOTICE: CREATE TABLE will create implicit sequence "uh_oh_id_seq" for serial column "uh_oh.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "uh_oh_pkey" for table "uh_oh"
CREATE TABLE
postgres=> insert into uh_oh (id, data) values (1, 'x');
INSERT 0 1
postgres=> insert into uh_oh (data) values ('a');
ERROR: duplicate key value violates unique constraint "uh_oh_pkey"
DETAIL: Key (id)=(1) already exists.
Your COPY command, of course, is supplying an explicit id value, just like the example INSERT above.
I realize that this is a bit old, but maybe someone might still be looking for the answer.
As others said, COPY works in a similar way to INSERT, so when inserting into a table that has a sequence, you simply don't mention the sequence-backed field at all and it is taken care of for you. For COPY it works in exactly the same way. But doesn't COPY require ALL fields in the table to be present in the text file? The correct answer is NO, it doesn't, but that is the default behavior.
To COPY and leave the sequence out, do the following:
COPY $YOURSCHEMA.$YOURTABLE(col1,col2,col3,col4) FROM '$your_input_file' DELIMITER ',' CSV HEADER;
No need to manually update the sequence afterwards; it works as intended and in my testing is just about as fast.
You could copy to a sister table, then insert into mytable select * from sister - that would increment the sequence.
If your loaded data has the id field, don't select it for the insert: insert into mytable (col1, col2, col3) select col1, col2, col3 from sister
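A sketch of that sister-table route (the table names and file path are placeholders; this assumes mytable's id column defaults to its sequence):

-- clone the structure, load the file (including its id field) into the sister table
CREATE TABLE sister (LIKE mytable);
COPY sister FROM '/tmp/input_file' CSV;

-- leave id out of the insert so each row draws a fresh value from mytable's sequence
INSERT INTO mytable (col1, col2, col3)
SELECT col1, col2, col3 FROM sister;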