I have two tables (this is a very simplified model of my use case):
- TableCounter with 2 columns: idEntry, counter
- TableObject with 2 columns: idEntry, seq (the pair idEntry/seq is unique)
I need to be able in 1 transaction to:
- increase counter for idEntry = x
- insert (x,new_counter_value) in the TableObject.
I must not lose any sequence value, and this transaction is highly concurrent and called very often.
How would you write such a transaction in a statement (not for a stored procedure)? Would you lock the row of TableCounter for idEntry = x?
So far I have this, but I am looking for a better solution.
BEGIN TRANSACTION;
SELECT counter FROM TableCounter WHERE idEntry=1 FOR UPDATE;
UPDATE TableCounter SET counter=counter+1 WHERE idEntry=1;
INSERT INTO TableObject(idEntry, seq) SELECT TableCounter.idEntry, TableCounter.counter FROM TableCounter WHERE TableCounter.idEntry = 1;
COMMIT TRANSACTION;
Thank you
The SELECT ... FOR UPDATE is useless if the next thing you do is update the row anyway (this is true for any DBMS that supports SELECT ... FOR UPDATE).
For Postgres this can be done in a single statement using a data modifying CTE:
with updated as (
update tablecounter
set counter = counter + 1
where identry = 1
returning identry, counter
)
insert into tableobject (identry, seq)
select identry, counter
from updated;
The update will lock the row, which means that any concurrent insert/update (for the same identry) will have to wait until the above is committed or rolled back.
If I (really) needed a gapless sequence and I could live with the scalability issues of such a solution (because the requirement is more important than performance or scalability), I would probably put that into a function. Something like the following:
Define the sequence (=counter) table
create table gapless_sequence
(
entity text not null primary key,
sequence_value integer not null default 0
);
-- "create" a new sequence
insert into gapless_sequence (entity) values ('some_table');
commit;
Now create a function that claims a new value
create function next_value(p_entity text)
returns integer
as
$$
update gapless_sequence
set sequence_value = sequence_value + 1
where entity = p_entity
returning sequence_value;
$$
language sql;
Same as above: the transaction that acquires the next sequence for an entity will block all subsequent calls to the function for the same entity, until the first transaction is committed (or rolled back).
Now defining a table that uses the gapless sequence is quite easy:
create table some_table
(
id integer primary key default next_value('some_table'),
some_column text
);
And then you simply do:
insert into some_table (some_column) values ('foo');
A concurrent insert into some_table would wait until the first transaction commits. The update will then see the committed value and return the appropriate next sequence value.
Of course this can also be done without using a default clause in the table definition, but then you would need to call the function explicitly in the insert statement:
insert into some_table
(id, some_column)
values
(next_value('some_table'), 'foo');
However that has the potential pitfall that nothing forces you to use the correct entity name when calling the function.
All the examples above assume that auto commit is turned off.
I'm moving from MySql to Postgres, and I noticed that when you delete rows from MySql, the unique ids for those rows are re-used when you make new ones. With Postgres, if you create rows, and delete them, the unique ids are not used again.
Is there a reason for this behaviour in Postgres? Can I make it act more like MySql in this case?
Sequences have gaps to permit concurrent inserts. Attempting to avoid gaps or to re-use deleted IDs creates horrible performance problems. See the PostgreSQL wiki FAQ.
PostgreSQL SEQUENCEs are used to allocate IDs. These only ever increase, and they're exempt from the usual transaction rollback rules to permit multiple transactions to grab new IDs at the same time. This means that if a transaction rolls back, those IDs are "thrown away"; there's no list of "free" IDs kept, just the current ID counter. Sequences are also usually incremented if the database shuts down uncleanly.
Synthetic keys (IDs) are meaningless anyway. Their order is not significant, their only property of significance is uniqueness. You can't meaningfully measure how "far apart" two IDs are, nor can you meaningfully say if one is greater or less than another. All you can do is say "equal" or "not equal". Anything else is unsafe. You shouldn't care about gaps.
If you need a gapless sequence that re-uses deleted IDs, you can have one, you just have to give up a huge amount of performance for it - in particular, you cannot have any concurrency on INSERTs at all, because you have to scan the table for the lowest free ID, locking the table for write so no other transaction can claim the same ID. Try searching for "postgresql gapless sequence".
The simplest approach is to use a counter table and a function that gets the next ID. Here's a generalized version that uses a counter table to generate consecutive gapless IDs; it doesn't re-use IDs, though.
CREATE TABLE thetable_id_counter ( last_id integer not null );
INSERT INTO thetable_id_counter VALUES (0);
CREATE OR REPLACE FUNCTION get_next_id(countertable regclass, countercolumn text) RETURNS integer AS $$
DECLARE
next_value integer;
BEGIN
EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
RETURN next_value;
END;
$$ LANGUAGE plpgsql;
COMMENT ON FUNCTION get_next_id(regclass, text) IS 'Increment and return value from integer column $2 in table $1';
Usage:
INSERT INTO dummy(id, blah)
VALUES ( get_next_id('thetable_id_counter','last_id'), 42 );
Note that when one open transaction has obtained an ID, all other transactions that try to call get_next_id will block until the first transaction commits or rolls back. This is unavoidable for gapless IDs and is by design.
If you want to store multiple counters for different purposes in a table, just add a parameter to the above function, add a column to the counter table, and add a WHERE clause to the UPDATE that matches the parameter to the added column. That way you can have multiple independently-locked counter rows. Do not just add extra columns for new counters.
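The multi-counter layout described above can be sketched from the client side. This is only an illustration: it uses Python's standard-library sqlite3 module as a stand-in engine so that it runs anywhere, and the table id_counters, its columns and the helper next_id are hypothetical names. On PostgreSQL the UPDATE would lock just the one counter row, whereas SQLite locks the whole database for writes.

```python
import sqlite3


def make_db():
    # isolation_level=None: autocommit mode, transactions are opened explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE id_counters ("
        " counter_name TEXT PRIMARY KEY,"
        " last_id INTEGER NOT NULL DEFAULT 0)"
    )
    # One row per counter, not one column per counter.
    conn.executemany(
        "INSERT INTO id_counters (counter_name) VALUES (?)",
        [("invoice",), ("order",)],
    )
    return conn


def next_id(conn, counter_name):
    """Increment one counter row and return its new value.

    Call this inside an open transaction so that a rollback
    also undoes the increment.
    """
    cur = conn.execute(
        "UPDATE id_counters SET last_id = last_id + 1 WHERE counter_name = ?",
        (counter_name,),
    )
    if cur.rowcount != 1:
        raise KeyError(f"unknown counter: {counter_name}")
    (value,) = conn.execute(
        "SELECT last_id FROM id_counters WHERE counter_name = ?",
        (counter_name,),
    ).fetchone()
    return value


conn = make_db()
conn.execute("BEGIN")
invoice_ids = [next_id(conn, "invoice") for _ in range(3)]
order_ids = [next_id(conn, "order")]
conn.execute("COMMIT")
```

Each counter advances independently because the WHERE clause touches only its own row.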
This function does not re-use deleted IDs, it just avoids introducing gaps.
To re-use IDs I advise ... not re-using IDs.
If you really must, you can do so by adding an ON INSERT OR UPDATE OR DELETE trigger on the table of interest that adds deleted IDs to a free-list side table, and removes them from the free-list table when they're INSERTed. Treat an UPDATE as a DELETE followed by an INSERT. Now modify the ID generation function above so that it does a SELECT free_id INTO next_value FROM free_ids FOR UPDATE LIMIT 1 and if found, DELETEs that row. IF NOT FOUND gets a new ID from the generator table as normal. Here's an untested extension of the prior function to support re-use:
CREATE OR REPLACE FUNCTION get_next_id_reuse(countertable regclass, countercolumn text, freelisttable regclass, freelistcolumn text) RETURNS integer AS $$
DECLARE
next_value integer;
BEGIN
EXECUTE format('SELECT %I FROM %s FOR UPDATE LIMIT 1', freelistcolumn, freelisttable) INTO next_value;
IF next_value IS NOT NULL THEN
EXECUTE format('DELETE FROM %s WHERE %I = %L', freelisttable, freelistcolumn, next_value);
ELSE
EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
END IF;
RETURN next_value;
END;
$$ LANGUAGE plpgsql;
I ran into transaction race condition troubles with PostgreSQL (probably not only PostgreSQL). I'm trying to perform this simple task from multiple threads:
BEGIN;
SELECT * FROM t WHERE id = 1;
DELETE FROM t WHERE id = 1;
INSERT INTO t (id, value) VALUES (1, 'thread X'); -- X = 1,2,3,..
SELECT 1 FROM pg_sleep(10); -- only for race condition simulation
COMMIT;
However, the threads collide inside these transactions, so multiple inserts are executed (primary key collision error). So I tried to use a SELECT FOR UPDATE statement:
BEGIN;
SELECT * FROM t WHERE id = 1 FOR UPDATE;
DELETE FROM t WHERE id = 1;
INSERT INTO t (id, value) VALUES (1, 'thread X'); -- X = 1,2,3,..
SELECT 1 FROM pg_sleep(10); -- only for race condition simulation
COMMIT;
The transactions now block correctly on the FOR UPDATE statement, waiting for the other thread to commit.
However, after "semaphore up" (waking up on that statement after another thread's transaction has committed), the DBMS returns an empty result set although the data is available in the table (from the INSERT statement of the faster thread):
BEGIN;
SELECT * FROM t WHERE id = 1 FOR UPDATE; -- blocking ... then return 0 records WRONG
SELECT * FROM t WHERE id = 1 FOR UPDATE; -- second try ... returns 1 record CORRECT
DELETE FROM t WHERE id = 1;
INSERT INTO t (id, value) VALUES (1, 'thread X'); -- X = 1,2,3,..
SELECT 1 FROM pg_sleep(10); -- only for race condition simulation
COMMIT;
As seen above, second (duplicated) select statement behaves correctly. Why?
The reason is that the blocked statement's snapshot was taken before the other transaction committed its INSERT, so the statement cannot see the new row once the lock is released.
You can see the row in the following statement because in the READ COMMITTED isolation level each statement gets its own snapshot, so the second statement's snapshot includes the newly inserted row.
You could use REPEATABLE READ isolation level. In that case you should get a serialization error (I didn't test that, so please try it out – maybe you need SERIALIZABLE). Then you have to write your program so that it retries the transaction if it gets a serialization error, and everything should work.
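A retry wrapper of the kind described above might look like the following minimal sketch. It is driver-agnostic: SerializationFailure is a hypothetical stand-in for whatever exception your database driver raises for SQLSTATE 40001 (serialization_failure), and run_with_retry and flaky_transaction are made-up names. The callable is assumed to run the whole transaction, including its own BEGIN, COMMIT and ROLLBACK.

```python
import time


class SerializationFailure(Exception):
    """Stand-in for the driver's error for SQLSTATE 40001 (an assumption)."""


def run_with_retry(transaction, max_attempts=5, base_delay=0.01):
    """Call transaction() until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
            time.sleep(base_delay * attempt)  # brief linear backoff


# Simulated transaction that fails twice before it succeeds.
calls = {"count": 0}


def flaky_transaction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"


result = run_with_retry(flaky_transaction)
```

The important design point is that the whole transaction is retried from the start, not just the statement that failed.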
I know it may sound odd but is there any way I can call my trigger on ROLLBACK event in a table? I was going through postgresql triggers documentation, there are events only for CREATE, UPDATE, DELETE and INSERT on table.
My requirement is on transaction ROLLBACK my trigger will select last_id from a table and reset table sequence with value = last_id + 1; in short I want to preserve sequence values on rollback.
Any ideas and feedback will be appreciated!
You can't use a sequence for this. You need a single serialization point through which all inserts have to go - otherwise the "gapless" attribute can not be guaranteed. You also need to make sure that no rows will ever be deleted from that table.
The serialization also means that only a single transaction can insert rows into that table - all other inserts have to wait until the "previous" insert has been committed or rolled back.
One way to implement this is to have a table where the "sequence" numbers are stored. Let's assume we need this for invoice numbers, which have to be gapless for legal reasons.
So we first create the table to hold the "current value":
create table slow_sequence
(
seq_name varchar(100) not null primary key,
current_value integer not null default 0
);
-- create a "sequence" for invoices
insert into slow_sequence values ('invoice');
Now we need a function that will generate the next number but that guarantees that no two transactions can obtain the next number at the same time.
create or replace function next_number(p_seq_name text)
returns integer
as
$$
update slow_sequence
set current_value = current_value + 1
where seq_name = p_seq_name
returning current_value;
$$
language sql;
The function will increment the counter and return the incremented value as a result. Due to the update the row for the sequence is now locked and no other transaction can update that value. If the calling transaction is rolled back, so is the update to the sequence counter. If it is committed, the new value is persisted.
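The rollback behaviour described above is easy to try out. The following is only a sketch under assumptions: it uses Python's standard-library sqlite3 module as a stand-in engine (not PostgreSQL), a reduced invoice table, and a hypothetical helper create_invoice with a fail flag that simulates an error after the number has been claimed.

```python
import sqlite3

# isolation_level=None: autocommit mode, transactions are opened explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE slow_sequence ("
    " seq_name TEXT PRIMARY KEY,"
    " current_value INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE invoice ("
    " invoice_number INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL)"
)
conn.execute("INSERT INTO slow_sequence (seq_name) VALUES ('invoice')")


def create_invoice(conn, customer_id, fail=False):
    """Claim the next number and insert the invoice in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        conn.execute(
            "UPDATE slow_sequence SET current_value = current_value + 1"
            " WHERE seq_name = 'invoice'"
        )
        (number,) = conn.execute(
            "SELECT current_value FROM slow_sequence WHERE seq_name = 'invoice'"
        ).fetchone()
        conn.execute(
            "INSERT INTO invoice (invoice_number, customer_id) VALUES (?, ?)",
            (number, customer_id),
        )
        if fail:
            raise RuntimeError("simulated failure after claiming a number")
        conn.execute("COMMIT")
        return number
    except Exception:
        conn.execute("ROLLBACK")  # also undoes the counter increment
        raise


first = create_invoice(conn, 42)
try:
    create_invoice(conn, 43, fail=True)
except RuntimeError:
    pass
second = create_invoice(conn, 44)
numbers = [
    row[0]
    for row in conn.execute(
        "SELECT invoice_number FROM invoice ORDER BY invoice_number"
    )
]
```

The failed transaction leaves no gap because the counter update and the insert are committed or rolled back together.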
To ensure that every transaction uses the function, a trigger should be created.
Create the table in question:
create table invoice
(
invoice_number integer not null primary key,
customer_id integer not null,
due_date date not null
);
Now create the trigger function and the trigger:
create or replace function f_invoice_trigger()
returns trigger
as
$$
begin
-- the number is assigned unconditionally so that this can't
-- be prevented by supplying a specific number
new.invoice_number := next_number('invoice');
return new;
end;
$$
language plpgsql;
create trigger invoice_trigger
before insert on invoice
for each row
execute procedure f_invoice_trigger();
Now if one transaction does this:
insert into invoice (customer_id, due_date)
values (42, date '2015-12-01');
The new number is generated. A second transaction then needs to wait until the first insert is committed or rolled back.
As I said: this solution is not scalable. Not at all. It will slow down your application massively if there are a lot of inserts into that table. But you can't have both: a scalable and correct implementation of a gapless sequence.
I'm also pretty sure that there are edge cases that are not covered by the above code. So it's pretty likely that you can still wind up with gaps.
I wrote a function to create posts for a simple blogging engine:
CREATE FUNCTION CreatePost(VARCHAR, TEXT, VARCHAR[])
RETURNS INTEGER AS $$
DECLARE
InsertedPostId INTEGER;
TagName VARCHAR;
BEGIN
INSERT INTO Posts (Title, Body)
VALUES ($1, $2)
RETURNING Id INTO InsertedPostId;
FOREACH TagName IN ARRAY $3 LOOP
DECLARE
InsertedTagId INTEGER;
BEGIN
-- I am concerned about this part.
BEGIN
INSERT INTO Tags (Name)
VALUES (TagName)
RETURNING Id INTO InsertedTagId;
EXCEPTION WHEN UNIQUE_VIOLATION THEN
SELECT INTO InsertedTagId Id
FROM Tags
WHERE Name = TagName
FETCH FIRST ROW ONLY;
END;
INSERT INTO Taggings (PostId, TagId)
VALUES (InsertedPostId, InsertedTagId);
END;
END LOOP;
RETURN InsertedPostId;
END;
$$ LANGUAGE 'plpgsql';
Is this prone to race conditions when multiple users delete tags and create posts at the same time?
Specifically, do transactions (and thus functions) prevent such race conditions from happening?
I'm using PostgreSQL 9.2.3.
It's the recurring problem of SELECT or INSERT under possible concurrent write load, related to (but different from) UPSERT (which is INSERT or UPDATE).
This PL/pgSQL function uses UPSERT (INSERT ... ON CONFLICT ..) to INSERT or SELECT a single row:
CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT _tag_id int)
LANGUAGE plpgsql AS
$func$
BEGIN
SELECT tag_id -- only if row existed before
FROM tag
WHERE tag = _tag
INTO _tag_id;
IF NOT FOUND THEN
INSERT INTO tag AS t (tag)
VALUES (_tag)
ON CONFLICT (tag) DO NOTHING
RETURNING t.tag_id
INTO _tag_id;
END IF;
END
$func$;
There is still a tiny window for a race condition. To make absolutely sure we get an ID:
CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT _tag_id int)
LANGUAGE plpgsql AS
$func$
BEGIN
LOOP
SELECT tag_id
FROM tag
WHERE tag = _tag
INTO _tag_id;
EXIT WHEN FOUND;
INSERT INTO tag AS t (tag)
VALUES (_tag)
ON CONFLICT (tag) DO NOTHING
RETURNING t.tag_id
INTO _tag_id;
EXIT WHEN FOUND;
END LOOP;
END
$func$;
This keeps looping until either INSERT or SELECT succeeds.
Call:
SELECT f_tag_id('possibly_new_tag');
If subsequent commands in the same transaction rely on the existence of the row and it is actually possible that other transactions update or delete it concurrently, you can lock an existing row in the SELECT statement with FOR SHARE.
If the row gets inserted instead, it is locked (or not visible for other transactions) until the end of the transaction anyway.
Start with the common case (INSERT vs SELECT) to make it faster.
Related:
Get Id from a conditional INSERT
How to include excluded rows in RETURNING from INSERT ... ON CONFLICT
Related (pure SQL) solution to INSERT or SELECT multiple rows (a set) at once:
How to use RETURNING with ON CONFLICT in PostgreSQL?
What's wrong with this pure SQL solution?
CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT _tag_id int)
LANGUAGE sql AS
$func$
WITH ins AS (
INSERT INTO tag AS t (tag)
VALUES (_tag)
ON CONFLICT (tag) DO NOTHING
RETURNING t.tag_id
)
SELECT tag_id FROM ins
UNION ALL
SELECT tag_id FROM tag WHERE tag = _tag
LIMIT 1;
$func$;
Not entirely wrong, but it fails to seal a loophole, as @FunctorSalad worked out. The function can come up with an empty result if a concurrent transaction tries to do the same at the same time. The manual:
All the statements are executed with the same snapshot
If a concurrent transaction inserts the same new tag a moment earlier, but hasn't committed yet:
The UPSERT part comes up empty, after waiting for the concurrent transaction to finish. (If the concurrent transaction should roll back, it still inserts the new tag and returns a new ID.)
The SELECT part also comes up empty, because it's based on the same snapshot, where the new tag from the (yet uncommitted) concurrent transaction is not visible.
We get nothing. Not as intended. That's counter-intuitive to naive logic (and I got caught there), but that's how the MVCC model of Postgres works - has to work.
So do not use this if multiple transactions can try to insert the same tag at the same time. Or loop until you actually get a row. The loop will hardly ever be triggered in common work loads anyway.
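The "same snapshot" effect can be illustrated with a deliberately tiny toy model. This is not how Postgres is implemented; ToyTagTable and its methods are made-up names, and the only assumptions it encodes are the ones stated above: the uniqueness check sees the latest committed state, while the SELECT part sees only what was committed when the statement's snapshot was taken.

```python
class ToyTagTable:
    """Toy model: each row remembers which transaction created it."""

    def __init__(self):
        self.committed = set()  # ids of committed transactions
        self.rows = {}          # tag -> id of the creating transaction

    def take_snapshot(self):
        # A snapshot is the set of transactions committed right now.
        return frozenset(self.committed)

    def insert_do_nothing(self, tag, txid):
        # The unique check looks at the current state, not the snapshot.
        # (The real system would first wait for an uncommitted inserter.)
        if tag in self.rows:
            return None
        self.rows[tag] = txid
        return tag

    def select(self, tag, snapshot):
        # A row is visible only if its creator is in the snapshot.
        creator = self.rows.get(tag)
        return tag if creator in snapshot else None


table = ToyTagTable()

# Session 2 starts its single statement: the snapshot is taken now.
stmt_snapshot = table.take_snapshot()

# Session 1 inserts tag 'a' and commits while session 2 is waiting.
table.rows["a"] = 1
table.committed.add(1)

ins = table.insert_do_nothing("a", txid=2)   # conflict, so nothing returned
sel = table.select("a", stmt_snapshot)       # old snapshot, row invisible

# A new statement takes a new snapshot and finds the row.
retry = table.select("a", table.take_snapshot())
```

Both parts of the single statement come up empty, while the retry succeeds, which is why looping fixes the problem.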
Postgres 9.4 or older
Given this (slightly simplified) table:
CREATE TABLE tag (
tag_id serial PRIMARY KEY
, tag text UNIQUE
);
An almost 100% secure function to insert a new tag or select an existing one could look like this:
CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT tag_id int)
LANGUAGE plpgsql AS
$func$
BEGIN
LOOP
BEGIN
WITH sel AS (SELECT t.tag_id FROM tag t WHERE t.tag = _tag FOR SHARE)
, ins AS (INSERT INTO tag(tag)
SELECT _tag
WHERE NOT EXISTS (SELECT 1 FROM sel) -- only if not found
RETURNING tag.tag_id) -- qualified so no conflict with param
SELECT sel.tag_id FROM sel
UNION ALL
SELECT ins.tag_id FROM ins
INTO tag_id;
EXCEPTION WHEN UNIQUE_VIOLATION THEN -- insert in concurrent session?
RAISE NOTICE 'It actually happened!'; -- hardly ever happens
END;
EXIT WHEN tag_id IS NOT NULL; -- else keep looping
END LOOP;
END
$func$;
Why not 100%? Consider the notes in the manual for the related UPSERT example:
https://www.postgresql.org/docs/current/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE
Explanation
Try the SELECT first. This way you avoid the considerably more expensive exception handling 99.99% of the time.
Use a CTE to minimize the (already tiny) time slot for the race condition.
The time window between the SELECT and the INSERT within one query is super tiny. If you don't have heavy concurrent load, or if you can live with an exception once a year, you could just ignore the case and use the SQL statement, which is faster.
No need for FETCH FIRST ROW ONLY (= LIMIT 1). The tag name is obviously UNIQUE.
Remove FOR SHARE in my example if you don't usually have concurrent DELETE or UPDATE on the table tag. Costs a tiny bit of performance.
Never quote the language name: 'plpgsql'. plpgsql is an identifier. Quoting may cause problems and is only tolerated for backwards compatibility.
Don't use non-descriptive column names like id or name. When joining a couple of tables (which is what you do in a relational DB) you end up with multiple identical names and have to use aliases.
Built into your function
Using this function you could largely simplify your FOREACH LOOP to:
...
FOREACH TagName IN ARRAY $3
LOOP
INSERT INTO taggings (PostId, TagId)
VALUES (InsertedPostId, f_tag_id(TagName));
END LOOP;
...
Faster, though, as a single SQL statement with unnest():
INSERT INTO taggings (PostId, TagId)
SELECT InsertedPostId, f_tag_id(tag)
FROM unnest($3) tag;
Replaces the whole loop.
Alternative solution
This variant builds on the behavior of UNION ALL with a LIMIT clause: as soon as enough rows are found, the rest is never executed:
Way to try multiple SELECTs till a result is available?
Building on this, we can outsource the INSERT into a separate function. Only there we need exception handling. Just as safe as the first solution.
CREATE OR REPLACE FUNCTION f_insert_tag(_tag text, OUT tag_id int)
RETURNS int
LANGUAGE plpgsql AS
$func$
BEGIN
INSERT INTO tag(tag) VALUES (_tag) RETURNING tag.tag_id INTO tag_id;
EXCEPTION WHEN UNIQUE_VIOLATION THEN -- catch exception, NULL is returned
END
$func$;
Which is used in the main function:
CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT _tag_id int)
LANGUAGE plpgsql AS
$func$
BEGIN
LOOP
SELECT tag_id FROM tag WHERE tag = _tag
UNION ALL
SELECT f_insert_tag(_tag) -- only executed if tag not found
LIMIT 1 -- not strictly necessary, just to be clear
INTO _tag_id;
EXIT WHEN _tag_id IS NOT NULL; -- else keep looping
END LOOP;
END
$func$;
This is a bit cheaper if most of the calls only need SELECT, because the more expensive block with INSERT containing the EXCEPTION clause is rarely entered. The query is also simpler.
FOR SHARE is not possible here (not allowed in UNION query).
LIMIT 1 would not be necessary (tested in pg 9.4). Postgres derives LIMIT 1 from INTO _tag_id and only executes until the first row is found.
There's still something to watch out for even when using the ON CONFLICT clause introduced in Postgres 9.5. Using the same function and example table as in @Erwin Brandstetter's answer, if we do:
Session 1: begin;
Session 2: begin;
Session 1: select f_tag_id('a');
f_tag_id
----------
11
(1 row)
Session 2: select f_tag_id('a');
[Session 2 blocks]
Session 1: commit;
[Session 2 returns:]
f_tag_id
----------
NULL
(1 row)
So f_tag_id returned NULL in session 2, which would be impossible in a single-threaded world!
If we raise the transaction isolation level to repeatable read (or the stronger serializable), session 2 throws ERROR: could not serialize access due to concurrent update instead. So no "impossible" results at least, but unfortunately we now need to be prepared to retry the transaction.
Edit: With repeatable read or serializable, if session 1 inserts tag a, then session 2 inserts b, then session 1 tries to insert b and session 2 tries to insert a, one session detects a deadlock:
ERROR: deadlock detected
DETAIL: Process 14377 waits for ShareLock on transaction 1795501; blocked by process 14363.
Process 14363 waits for ShareLock on transaction 1795503; blocked by process 14377.
HINT: See server log for query details.
CONTEXT: while inserting index tuple (0,3) in relation "tag"
SQL function "f_tag_id" statement 1
After the session that received the deadlock error rolls back, the other session continues. So I guess we should treat deadlock just like serialization_failure and retry, in a situation like this?
Alternatively, insert the tags in a consistent order, but this is not easy if they don't all get added in one place.
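Both ideas can be sketched on the client side. The helper names below are hypothetical; the two SQLSTATE codes are the standard PostgreSQL ones for serialization_failure (40001) and deadlock_detected (40P01). How you obtain the SQLSTATE from an exception depends on your driver, so that part is left out.

```python
# SQLSTATE codes after which retrying the whole transaction is reasonable.
RETRYABLE_SQLSTATES = {"40001", "40P01"}  # serialization_failure, deadlock_detected


def is_retryable(sqlstate):
    """Return True if the transaction should simply be run again."""
    return sqlstate in RETRYABLE_SQLSTATES


def insertion_order(tags):
    """Deduplicate and sort so every session inserts tags in the same order."""
    return sorted(set(tags))


# Two sessions that received the same tags in different orders
# end up inserting them in the same order.
session_1 = insertion_order(["a", "b"])
session_2 = insertion_order(["b", "a", "b"])
```

With a consistent order, the second session waits for the first on tag a instead of each holding one tag and waiting for the other.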
I think there is a slight chance that when the tag already existed it might be deleted by another transaction after your transaction has found it. Using a SELECT FOR UPDATE should solve that.
In an Oracle table (e.g. MYTABLE, with a numeric sequenced field as primary key), I have to insert several thousand rows, but some of them are supposed to already exist in the table.
Naturally, I should try to use MERGE, but I also need to retrieve all created (when inserting) and existing (when updating) primary keys.
It should also be as fast as possible.
Is the following attempt (pseudo code) the only way to go? Thanks.
keys_list = empty array
for each row to merge
do query 'SELECT PK_MYTABLE FROM MYTABLE WHERE PK_MYTABLE = '+row.pk_mytable
==> retrieve key
if found then:
add key to keys_list
else:
do query 'INSERT INTO MYTABLE (PK_MYTABLE, ...) VALUES (SEQ_MYTABLE.NEXTVAL, ...)'
do query 'SELECT SEQ_MYTABLE.CURRVAL FROM DUAL' ==> retrieve key
add key to keys_list
Add a MODIFICATION_DATE column to the table.
Grab and save the current SYSDATE before you start.
When you merge, set MODIFICATION_DATE to that saved value in both the update and the insert branch.
When the merge is complete, select the rows where MODIFICATION_DATE equals the saved value, and you have the set you are interested in.
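The idea can be tried out with a small sketch. Several things here are assumptions for illustration only: it uses Python's standard-library sqlite3 module rather than Oracle, an update-then-insert loop stands in for the single MERGE statement, a plain string stamp stands in for the saved SYSDATE, and the table layout and merge_rows are made-up names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    " pk INTEGER PRIMARY KEY,"
    " val TEXT,"
    " modification_stamp TEXT)"
)
# Rows left over from an earlier run.
conn.execute("INSERT INTO mytable VALUES (1, 'old', 'earlier-run')")
conn.execute("INSERT INTO mytable VALUES (2, 'old', 'earlier-run')")
conn.commit()


def merge_rows(conn, rows, stamp):
    """Update or insert each row, tagging it with this run's stamp."""
    for pk, val in rows:
        cur = conn.execute(
            "UPDATE mytable SET val = ?, modification_stamp = ? WHERE pk = ?",
            (val, stamp, pk),
        )
        if cur.rowcount == 0:  # nothing updated, so the row is new
            conn.execute(
                "INSERT INTO mytable VALUES (?, ?, ?)", (pk, val, stamp)
            )
    conn.commit()
    # Every row touched by this run carries the saved stamp.
    return [
        row[0]
        for row in conn.execute(
            "SELECT pk FROM mytable WHERE modification_stamp = ? ORDER BY pk",
            (stamp,),
        )
    ]


touched = merge_rows(conn, [(2, "new"), (3, "new")], "this-run")
```

The final query returns the keys of both the updated and the newly inserted rows, while rows untouched by this run keep their old stamp.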
Why can't you use a MERGE statement for this? This is exactly what a MERGE is for. Here is a rough idea of how it would look...
merge into mytable mt
using
(
select key_field, value_field from sourcetable
) st
on
( mt.key_field = st.key_field )
when matched then update
set mt.value_field = st.value_field
when not matched then insert
( key_field, value_field )
values
( st.key_field, st.value_field )
;
Using a MERGE statement is fast because it is a single statement, so the Oracle optimizer can use indexes and choose a better execution plan than you get by iterating through a cursor in PL/SQL.
If the keys are being generated from a sequence, then the normal way to get the key generated by that insert is to use the returning clause:
declare
v_insert_seq integer;
begin
insert into t1 (pk, c1)
values (myseq.nextval, 'value') returning pk into v_insert_seq;
end;
/
However, as best as I can tell, the merge statement doesn't support that returning feature.
Depending on the source of your new rows, there are different ways you could do this. If you are inserting one row at a time, then the approach above will work pretty well.
To detect the duplicate records, just catch the exceptions when you are inserting (when dup_val_on_index) and then handle them with updates.
If your source of rows is another table, you probably want to look at bulk inserts, and allowing Oracle to return you an array of new PK values. I tried this, but couldn't get it working, so perhaps it's not supported (or I'm missing something today - it gives a syntax error):
declare
type t_type is table of t1.pk%type;
v_insert_seqs t_type;
begin
insert into t1 (pk, c1)
select level newpk, 'value' c1value
from dual
connect by level <= 10 returning pk bulk collect into v_insert_seqs;
exception
when dup_val_on_index then
raise;
end;
/
The next best thing is to select the rows into arrays and then use bulk binds with the returning clause to capture the new PK IDs, and also use SAVE EXCEPTIONS to catch all the rows that failed to insert. Then you can process any of the failed inserts afterwards:
set serveroutput on
declare
type t_pk is table of t1.pk%type;
type t_c1 is table of t1.c1%type;
v_pks t_pk;
v_c1s t_c1;
v_new_pks t_pk;
ex_dml_errors EXCEPTION;
PRAGMA EXCEPTION_INIT(ex_dml_errors, -24381);
begin
-- get the batch of rows you want to insert
select level newpk, 'value' c1
bulk collect into v_pks, v_c1s
from dual connect by level <= 10;
-- bulk bind insert, saving exceptions and capturing the newly inserted
-- records
forall i in v_pks.first .. v_pks.last save exceptions
insert into t1 (pk, c1)
values (v_pks(i), v_c1s(i)) returning pk bulk collect into v_new_pks;
exception
-- Process the exceptions
when ex_dml_errors then
for i in 1..SQL%BULK_EXCEPTIONS.count loop
DBMS_OUTPUT.put_line('Error: ' || i ||
' Array Index: ' || SQL%BULK_EXCEPTIONS(i).error_index ||
' Message: ' || SQLERRM(-SQL%BULK_EXCEPTIONS(i).ERROR_CODE));
end loop;
end;
/
If you are running Oracle 10g or later, you might be able to do much the same thing nearly for free: issue a commit before the merge to update the SCN, then, after the merge, use ORA_ROWSCN to detect which rows have changed.