How can we tune this DELETE query? - sql

Justin, as per your suggestion, I added a loop here.
Is there any way we can tune this procedure? I haven't tested it yet.
We are just deleting records older than 90 days from the master and child tables.
Assume the tables have more than 20k records to delete; I commit after every 5k records. Please let me know if I am doing anything wrong here.
create or replace
Procedure PURGE_CLE_ALL_STATUS
( days_in IN number )
IS
i number := 0;
cursor s1 is
select EXCEPTIONID
from EXCEPTIONREC -- master table
where TIME_STAMP < (sysdate - days_in);
BEGIN
for c1 in s1 loop
delete from EXCEPTIONRECALTKEY -- child table
where EXCEPTIONID = c1.EXCEPTIONID;
delete from EXCEPTIONREC -- master table
where EXCEPTIONID = c1.EXCEPTIONID;
i := i + 1;
if i >= 5000 then -- commit every 5,000 purged ids
commit;
i := 0;
end if;
end loop;
commit;
-- the cursor FOR loop opens and closes s1 automatically
EXCEPTION
WHEN OTHERS THEN
raise_application_error(-20001, 'An error was encountered - ' ||
SQLCODE || ' -ERROR- ' || SQLERRM);
END;
/

Instead of a cursor, you can put the condition directly in the DELETE statement, like below:
create or replace
Procedure PURGE_CLE_ALL_STATUS
( days_in IN number )
IS
BEGIN
delete from EXCEPTIONRECALTKEY -- child table
where EXCEPTIONID = -- (1)
(select EXCEPTIONID
from EXCEPTIONREC -- master table
where TIME_STAMP < (sysdate - days_in));
delete from EXCEPTIONREC -- master table
where EXCEPTIONID = -- (2)
(select EXCEPTIONID
from EXCEPTIONREC
where TIME_STAMP < (sysdate - days_in));
commit;
END;
/
I fully agree with Justin Cave; he made a very good point.
If your subquery returns multiple rows, you can use IN in place of = at (1) and (2), as shown below.
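For example, the first DELETE would then read like this (a sketch of the same statement, with the table and column names unchanged):
delete from EXCEPTIONRECALTKEY -- child table
where EXCEPTIONID IN
(select EXCEPTIONID
from EXCEPTIONREC -- master table
where TIME_STAMP < (sysdate - days_in));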

"delete records from master and child table prior to 90days history records "
To get full performance benefit , you shall implement partition table. of courser this change in design but may not change your application.
your use case is to delete data on regular basis. You try creating daily or weekly partition which will store the new data in newer partition & drop the older partition regularly.
Current way of deleting will have performance bottle neck, since you trying to delete data older than 90 days. let us say there 10,000 record per day, then by 90th day there will be 90*10000. locating & deleting few records from 90 million record will be always will be slow & create some other lock problem.
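As a rough sketch of that idea (illustrative only; it assumes 11g+ interval partitioning, that the master table can be rebuilt, that TIME_STAMP is a DATE, and that the child table is handled separately, e.g. with reference partitioning; the new table name and dates are made up):
-- recreate the master table partitioned by day on TIME_STAMP
create table EXCEPTIONREC_PART
partition by range (TIME_STAMP)
interval (numtodsinterval(1, 'DAY'))
(partition p_start values less than (date '2024-01-01'))
as select * from EXCEPTIONREC;

-- purging then becomes a cheap metadata operation:
-- drop the daily partition that has aged past the retention window
alter table EXCEPTIONREC_PART drop partition for (date '2024-01-02') update indexes;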

Related

Oracle: how to limit number of rows in "select .. for update skip locked"

I have got a table:
table foo{
bar number,
status varchar2(50)
}
I have multiple threads/hosts each consuming the table. Each thread updates the status, i.e. pessimistically locks the row.
This is on Oracle 12.2.
select ... for update skip locked seems to do the job, but I want to limit the number of rows. The new FETCH NEXT sounds right, but I can't get the syntax right:
SELECT * FROM foo ORDER BY bar
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY
FOR UPDATE SKIP LOCKED;
What is the simplest way to achieve this, i.e. with minimum code[1] (ideally without a PL/SQL function)?
I want something like this:
select * from (select * from foo
where status<>'baz' order by bar
) where rownum<10 for update skip locked
PS
[1] We are considering moving away from Oracle.
I suggest creating a PL/SQL function and using dynamic SQL to control the number of locked records. The lock is acquired at fetch time, so fetching N records automatically locks them. Keep in mind that the records are unlocked once you finish the transaction - commit or rollback.
The following example locks N records and returns their ID values as an array (it assumes you have added a primary key ID column to your table):
create or replace function get_next_unlocked_records(iLockSize number)
return sys.odcinumberlist
is
cRefCursor sys_refcursor;
aIds sys.odcinumberlist := sys.odcinumberlist();
begin
-- open cursor. No locks so far
open cRefCursor for
'select id from foo '||
'for update skip locked';
-- we fetch and lock at the same time
fetch cRefCursor bulk collect into aIds limit iLockSize;
-- close cursor
close cRefCursor;
-- return locked ID values,
-- lock is kept until the transaction is finished
return aIds;
end;
/
sys.odcinumberlist is the built-in array of numbers.
Here is the test script to run in db:
declare
aRes sys.odcinumberlist;
begin
aRes := get_next_unlocked_records(10);
for c in (
select column_value id
from table(aRes)
) loop
dbms_output.put_line(c.id);
end loop;
end;
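To tie this back to the original use case, the caller could update the status of the locked rows in the same transaction before committing (a sketch; it assumes foo has the id and status columns from the question, and the status value is made up):
declare
aRes sys.odcinumberlist;
begin
aRes := get_next_unlocked_records(10);
-- the rows stay locked until commit/rollback, so it is safe to update them here
forall i in 1 .. aRes.count
update foo
set status = 'PROCESSING' -- illustrative value
where id = aRes(i);
commit; -- releases the locks
end;
/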

Why row is visible to several sessions when selected FOR UPDATE SKIP LOCKED?

Assume there are two tables TST_SAMPLE (10000 rows) and TST_SAMPLE_STATUS (empty).
I want to iterate over each record in TST_SAMPLE and add exactly one record to TST_SAMPLE_STATUS accordingly.
In a single thread that would be simply this:
begin
for r in (select * from TST_SAMPLE)
loop
insert into TST_SAMPLE_STATUS(rec_id, rec_status)
values (r.rec_id, 'TOUCHED');
end loop;
commit;
end;
/
In a multithreaded solution there's a situation that is not clear to me: what causes one row of TST_SAMPLE to be processed several times?
Please see the details below.
create table TST_SAMPLE(
rec_id number(10) primary key
);
create table TST_SAMPLE_STATUS(
rec_id number(10),
rec_status varchar2(10),
session_id varchar2(100)
);
begin
insert into TST_SAMPLE(rec_id)
select LEVEL from dual connect by LEVEL <= 10000;
commit;
end;
/
CREATE OR REPLACE PROCEDURE tst_touch_recs(pi_limit int) is
v_last_iter_count int;
begin
loop
v_last_iter_count := 0;
--------------------------
for r in (select *
from TST_SAMPLE A
where rownum < pi_limit
and NOT EXISTS (select null
from TST_SAMPLE_STATUS B
where B.rec_id = A.rec_id)
FOR UPDATE SKIP LOCKED)
loop
insert into TST_SAMPLE_STATUS(rec_id, rec_status, session_id)
values (r.rec_id, 'TOUCHED', SYS_CONTEXT('USERENV', 'SID'));
v_last_iter_count := v_last_iter_count + 1;
end loop;
commit;
--------------------------
exit when v_last_iter_count = 0;
end loop;
end;
/
In the FOR loop I try to iterate over rows that:
- have no status yet (the NOT EXISTS clause)
- are not currently locked by another thread (FOR UPDATE SKIP LOCKED)
There's no requirement on the exact number of rows in an iteration.
Here pi_limit is just the maximum size of one batch. The only thing needed is to process each row of TST_SAMPLE in exactly one session.
So let's run this procedure in 3 threads.
declare
v_job_id number;
begin
dbms_job.submit(v_job_id, 'begin tst_touch_recs(100); end;', sysdate);
dbms_job.submit(v_job_id, 'begin tst_touch_recs(100); end;', sysdate);
dbms_job.submit(v_job_id, 'begin tst_touch_recs(100); end;', sysdate);
commit;
end;
Unexpectedly, we see that some rows were processed in several sessions:
select count(unique rec_id) AS unique_count,
count(rec_id) AS total_count
from TST_SAMPLE_STATUS;
| unique_count | total_count |
------------------------------
| 10000 | 17397 |
------------------------------
-- run to see duplicates
select *
from TST_SAMPLE_STATUS
where REC_ID in (
select REC_ID
from TST_SAMPLE_STATUS
group by REC_ID
having count(*) > 1
)
order by REC_ID;
Please help me find the mistakes in the implementation of the tst_touch_recs procedure.
Here's a little example that shows why you're reading rows twice.
Run the following code in two sessions, starting the second a few seconds after the first:
declare
cursor c is
select a.*
from TST_SAMPLE A
where rownum < 10
and NOT EXISTS (select null
from TST_SAMPLE_STATUS B
where B.rec_id = A.rec_id)
FOR UPDATE SKIP LOCKED;
type rec is table of c%rowtype index by pls_integer;
rws rec;
begin
open c; -- data are read consistent to this time
dbms_lock.sleep ( 10 );
fetch c
bulk collect
into rws;
for i in 1 .. rws.count loop
dbms_output.put_line ( rws(i).rec_id );
end loop;
commit;
end;
/
You should see both sessions display the same rows.
Why?
Because Oracle Database has statement-level consistency, the result set for both is frozen when you open the cursor.
But when you have SKIP LOCKED, the FOR UPDATE locking only kicks in when you fetch the rows.
So session 1 starts and finds the first 9 rows not in TST_SAMPLE_STATUS. It then waits 10 seconds.
Provided you start session 2 within these 10 seconds, the cursor will look for the same nine rows.
At this point no rows are locked.
Now, here's where it gets interesting.
The sleep in the first session will finish. It'll then fetch the rows, locking them and skipping any that are already locked.
Very shortly after, it'll commit, releasing the locks.
A few moments later, session 2 comes to read these rows. At this point the rows are not locked!
So there's nothing to skip.
How exactly you solve this depends on what you're trying to do.
Assuming you can't move to a set-based approach, you could make the transactions serializable by adding:
set transaction isolation level serializable;
before the cursor loop. This moves to transaction-level consistency, enabling the database to detect that "something changed" when it fetches the rows.
But you'll need to catch "ORA-08177: can't serialize access for this transaction" errors within the outer loop, or any process that re-reads the same rows will drop out at that point.
Or, as commenters have suggested, use Advanced Queuing.
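Going back to the serializable option, here is a rough sketch (illustrative only; it reuses the batch logic from tst_touch_recs with a fixed batch size and simply retries the batch when ORA-08177 is raised):
declare
  e_serialize exception;
  pragma exception_init(e_serialize, -8177);
  v_last_iter_count int;
begin
  loop
    begin
      set transaction isolation level serializable;
      v_last_iter_count := 0;
      for r in (select *
                from TST_SAMPLE A
                where rownum < 100
                and NOT EXISTS (select null
                                from TST_SAMPLE_STATUS B
                                where B.rec_id = A.rec_id)
                FOR UPDATE SKIP LOCKED)
      loop
        insert into TST_SAMPLE_STATUS(rec_id, rec_status, session_id)
        values (r.rec_id, 'TOUCHED', SYS_CONTEXT('USERENV', 'SID'));
        v_last_iter_count := v_last_iter_count + 1;
      end loop;
      commit;
      exit when v_last_iter_count = 0;
    exception
      when e_serialize then
        -- another session touched the same rows first; retry this batch
        rollback;
    end;
  end loop;
end;
/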

SQL optimization

Right now I am facing an optimization problem.
I have a list of articles (17,000+) and some of them are inactive. The list was provided by the client in an Excel file, and he asked me to resend it (obviously only the active articles).
For this, I have to filter the production database based on the list provided by the customer. Unfortunately, I cannot load the list into a separate table in production, but I was able to load it into a UAT database that is linked to the production one.
The production article master data contains 200,000,000+ rows, but with filtering I can reduce that to around 80,000,000.
In order to retrieve only the active articles from production, I was thinking of using collections, but the last filter is taking far too long.
Here is my code:
declare
type t_art is table of number index by pls_integer;
v_art t_art;
v_filtered t_art;
idx number := 0;
begin
for i in (select * from test_table@UAT_DATABASE)
loop
idx := idx + 1;
v_art(idx) := i.art_nr;
end loop;
for j in v_art.first .. v_art.last
loop
select distinct art_nr
bulk collect into v_filtered
from production_article_master_data
where status = 0 -- status is active
and sperr_stat in (0, 2)
and trunc(valid_until) >= trunc(sysdate)
and art_nr = v_art(j);
end loop;
end;
Explanation: from the UAT database, via DB link, I am inserting the list into an associative array in production (v_art). Then, for each of the 17,000+ distinct articles in v_art, I filter the production article master data, returning only the valid articles (there might be 6,000-8,000) into a second associative array.
Unfortunately, this filtering step takes hours.
Can someone give me some hints on how to improve this in order to decrease the execution time, please?
Thank you.
Just use SQL and join the two tables:
select distinct p.art_nr
from production_article_master_data p
INNER JOIN
test_table@UAT_DATABASE t
ON ( p.art_nr = t.art_nr )
where p.status = 0 -- status is active
and p.sperr_stat in (0, 2)
and trunc(p.valid_until) >= trunc(sysdate)
If you have to do it in PL/SQL then:
CREATE OR REPLACE TYPE numberlist IS TABLE OF NUMBER;
/
declare
-- If you are using Oracle 12c you should be able to declare the
-- type in the PL/SQL block. In earlier versions you will need to
-- declare it in the SQL scope instead.
-- TYPE numberlist IS TABLE OF NUMBER;
v_art NUMBERLIST;
v_filtered NUMBERLIST;
begin
select art_nr
BULK COLLECT INTO v_art
from test_table@UAT_DATABASE;
select distinct art_nr
bulk collect into v_filtered
from production_article_master_data
where status = 0 -- status is active
and sperr_stat in (0, 2)
and trunc(valid_until) >= trunc(sysdate)
and art_nr MEMBER OF v_art;
end;
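If MEMBER OF turns out to be slow on a 17,000-element collection, a variant of the same idea (a sketch, reusing the assumed names above and replacing the final SELECT of the block) is to expose the collection to SQL with the TABLE() operator, so the optimizer can treat it as a row source:
select distinct p.art_nr
bulk collect into v_filtered
from production_article_master_data p
where p.status = 0 -- status is active
and p.sperr_stat in (0, 2)
and trunc(p.valid_until) >= trunc(sysdate)
and p.art_nr in (select t.column_value from table(v_art) t);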

PL/SQL Control Structure - LOOP

I have about 94,000 records that need to be deleted, but I have been told not to delete them all at once because it will slow performance due to the delete trigger. What would be the best solution to accomplish this? I was thinking of an additional loop after the commit of 1,000 rows, but I'm not sure how to implement it, or whether it would reduce performance even more.
DECLARE
CURSOR CLEAN IS
SELECT EMP_ID, ACCT_ID FROM RECORDS_TO_DELETE F; -- table containing the records that need to be deleted
COUNTER INTEGER := 0;
BEGIN
FOR F IN CLEAN LOOP
COUNTER := COUNTER + 1;
DELETE FROM EMPLOYEES
WHERE EMP_ID = F.EMP_ID AND ACCT_ID = F.ACCT_ID;
IF MOD(COUNTER, 1000) = 0 THEN
COMMIT;
END IF;
END LOOP;
COMMIT;
END;
You need to read a bit about BULK COLLECT in Oracle; it is commonly considered the proper way of working with large tables.
Example:
DECLARE
CURSOR c_delete IS
SELECT ROWID FROM ps_al_chk_memo; -- add your own WHERE clause here
TYPE t_rowid_tab IS TABLE OF ROWID;
t_delete t_rowid_tab;
l_delete_buffer PLS_INTEGER := 1000; -- rows per batch
BEGIN
OPEN c_delete;
LOOP
FETCH c_delete BULK COLLECT INTO t_delete LIMIT l_delete_buffer;
FORALL i IN 1..t_delete.COUNT
DELETE FROM ps_al_chk_memo
WHERE ROWID = t_delete(i);
COMMIT;
EXIT WHEN c_delete%NOTFOUND;
END LOOP;
CLOSE c_delete;
END;
You can do it in a single statement; this should be the fastest approach in any case:
DELETE FROM EMPLOYEES
WHERE (EMP_ID, ACCT_ID) = ANY (SELECT EMP_ID, ACCT_ID FROM RECORDS_TO_DELETE);
Since the volume of records is not that large, you can still go with plain SQL rather than PL/SQL; whenever possible, prefer SQL. I don't think it should cause that much of a performance impact.
DELETE FROM EMPLOYEES E
WHERE EXISTS
(SELECT 1 FROM RECORDS_TO_DELETE F
WHERE E.EMP_ID = F.EMP_ID
AND E.ACCT_ID = F.ACCT_ID);
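If each transaction really must stay small because of the delete trigger, a middle ground (a sketch using the same tables, with an assumed batch size of 1,000) is to repeat the single-statement delete in limited batches:
DECLARE
v_cnt NUMBER := 1;
BEGIN
WHILE v_cnt > 0 LOOP
DELETE FROM EMPLOYEES E
WHERE EXISTS
(SELECT 1 FROM RECORDS_TO_DELETE F
WHERE E.EMP_ID = F.EMP_ID
AND E.ACCT_ID = F.ACCT_ID)
AND ROWNUM <= 1000; -- batch size, tune to taste
v_cnt := SQL%ROWCOUNT;
COMMIT;
END LOOP;
END;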
Hope this helps.

SQL optimization question (oracle)

Edit: Please answer one of the two questions I ask below. I know there are other options that would be better in a different case. Those other potential options (partitioning the table, running it as one large delete statement without committing in batches, etc.) are NOT options in my case due to things outside my control.
I have several very large tables to delete from. All have the same foreign key that is indexed. I need to delete certain records from all tables.
table source
id --primary_key
import_source --used for choosing the ids to delete
table t1
id --foreign key
--other fields
table t2
id --foreign key
--different other fields
Usually when doing a delete like this, I'll put together a loop to step through all the ids:
declare
my_counter integer := 0;
begin
for cur in (
select id from source where import_source = 'bad.txt'
) loop
begin
delete from source where id = cur.id;
delete from t1 where id = cur.id;
delete from t2 where id = cur.id;
my_counter := my_counter + 1;
if my_counter > 500 then
my_counter := 0;
commit;
end if;
end;
end loop;
commit;
end;
However, in some code I saw elsewhere, it was put together as separate loops, one for each delete:
declare
type import_ids is table of integer index by pls_integer;
my_import_ids import_ids;
begin
select id bulk collect into my_import_ids from source where import_source = 'bad.txt';
for h in 1..my_import_ids.count loop
delete from t1 where id = my_import_ids(h);
--do commit check
end loop;
for h in 1..my_import_ids.count loop
delete from t2 where id = my_import_ids(h);
--do commit check
end loop;
end;
("--do commit check" would be replaced with the same chunk that commits every 500 rows, as in the code above.)
So I need one of the following answered:
1) Which of these is better?
2) How can I find out which is better for my particular case? (i.e. does it depend on how many tables I have, how big they are, etc.?)
Edit:
I must do this in a loop due to the size of these tables. I will be deleting thousands of records from tables with hundreds of millions of records. This is happening on a system that can't afford to have the tables locked for that long.
EDIT:
NOTE: I am required to commit in batches. The amount of data is too large to do it in one batch; the rollback/undo generated would crash our database.
If there is a way to commit in batches other than looping, I'd be willing to hear it. Otherwise, don't bother saying that I shouldn't use a loop...
Why loop at all?
delete from t1 where id IN (select id from source where import_source = 'bad.txt');
delete from t2 where id IN (select id from source where import_source = 'bad.txt');
delete from source where import_source = 'bad.txt';
That's standard SQL. I don't know Oracle specifically, but many DBMSes also feature multi-table JOIN-based DELETEs that would let you do the whole thing in a single statement.
David, if you insist on committing, you can use the following code:
declare
type import_ids is table of integer index by pls_integer;
my_import_ids import_ids;
cursor c is select id from source where import_source = 'bad.txt';
begin
open c;
loop
fetch c bulk collect into my_import_ids limit 500;
forall h in 1..my_import_ids.count
delete from t1 where id = my_import_ids(h);
forall h in 1..my_import_ids.count
delete from t2 where id = my_import_ids(h);
commit;
exit when c%notfound;
end loop;
close c;
end;
This program fetches ids in chunks of 500 rows, deleting and committing each chunk. It should be much faster than row-by-row processing, because bulk collect and forall work as a single operation (a single round trip to and from the database), thus minimizing the number of context switches. See Bulk Binds, Forall, Bulk Collect for details.
First of all, you shouldn't commit in the loop - it is not efficient (it generates lots of redo) and if an error occurs, you can't roll back.
As mentioned in previous answers, you should issue single DELETE statements, or, if you are deleting most of the records, it could be more optimal to create new tables with the remaining rows, drop the old ones and rename the new ones to the old names.
Something like this:
CREATE TABLE new_table AS SELECT * FROM old_table WHERE <filter only remaining rows>;
-- index new_table
-- grant on new_table
-- add constraints on new_table
-- etc. on new_table
DROP TABLE old_table;
RENAME new_table TO old_table;
See also Ask Tom
Larry Lustig is right that you don't need a loop. Nonetheless there may be some benefit in doing the delete in smaller chunks. Here PL/SQL bulk binds can improve speed greatly:
declare
type import_ids is table of integer index by pls_integer;
my_import_ids import_ids;
begin
select id bulk collect into my_import_ids from source where import_source = 'bad.txt';
forall h in 1..my_import_ids.count
delete from t1 where id = my_import_ids(h);
forall h in 1..my_import_ids.count
delete from t2 where id = my_import_ids(h);
end;
The way I wrote it, it does it all at once, in which case yes, the single SQL statement is better. But you can change your loop conditions to break it into chunks. The key points are:
- don't commit on every row. If anything, commit only every N rows.
- when using chunks of N, don't run the delete in an ordinary loop. Use forall to run the delete as a bulk bind, which is much faster.
The reason, aside from the overhead of commits, is that each time you execute a SQL statement inside PL/SQL code it essentially does a context switch. Bulk binds avoid that.
You may try partitioning anyway, to use parallel execution, not just to drop one partition. The Oracle documentation may prove useful in setting this up. Each partition would use its own rollback segment in this case.
If you are doing the delete from the source before the t1/t2 deletes, that suggests you don't have referential integrity constraints (as otherwise you'd get errors saying child records exist).
I'd go for creating the constraints with ON DELETE CASCADE. Then a simple loop like this:
DECLARE
v_cnt NUMBER := 1;
BEGIN
WHILE v_cnt > 0 LOOP
DELETE FROM source WHERE import_source = 'bad.txt' and rownum < 5000;
v_cnt := SQL%ROWCOUNT;
COMMIT;
END LOOP;
END;
The child records would get deleted automatically.
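For reference, the constraints might look something like this (a sketch; the constraint names are made up):
alter table t1 add constraint t1_source_fk
foreign key (id) references source (id) on delete cascade;
alter table t2 add constraint t2_source_fk
foreign key (id) references source (id) on delete cascade;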
If you can't have the ON DELETE CASCADE, I'd go with a GLOBAL TEMPORARY TABLE with ON COMMIT DELETE ROWS
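The temporary table itself would be created once up front, something like this (a sketch; the name and column match the block below):
CREATE GLOBAL TEMPORARY TABLE temp (id NUMBER) ON COMMIT DELETE ROWS;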
DECLARE
v_cnt NUMBER := 1;
BEGIN
WHILE v_cnt > 0 LOOP
INSERT INTO temp (id)
SELECT id FROM source WHERE import_source = 'bad.txt' and rownum < 5000;
v_cnt := SQL%ROWCOUNT;
DELETE FROM t1 WHERE id IN (SELECT id FROM temp);
DELETE FROM t2 WHERE id IN (SELECT id FROM temp);
DELETE FROM source WHERE id IN (SELECT id FROM temp);
COMMIT;
END LOOP;
END;
I'd also go for the largest chunk your DBA will allow.
I'd expect each transaction to last for at least a minute. More frequent commits would be a waste.
"This is happening on a system that can't afford to have the tables locked for that long."
Oracle doesn't lock tables, only rows. I'm assuming no-one will be locking the rows you are deleting (or at least not for long). So locking is not an issue.