PL/SQL bulk collect fetchall not completing

I wrote this procedure to bulk delete data (35 million records). Can you see why this PL/SQL procedure runs without exiting and the rows are not getting deleted?
create or replace procedure clear_logs
as
CURSOR c_logstodel IS SELECT * FROM test where id=23;
TYPE typ_log is table of test%ROWTYPE;
v_log_del typ_log;
BEGIN
OPEN c_logstodel;
LOOP
FETCH c_logstodel BULK COLLECT INTO v_log_del LIMIT 5000;
EXIT WHEN c_logstodel%NOTFOUND;
FORALL i IN v_log_del.FIRST..v_log_del.LAST
DELETE FROM test WHERE id =v_log_del(i).id;
COMMIT;
END LOOP;
CLOSE c_logstodel;
END clear_logs;

Selecting rowid instead of the column value, using exit when v_log_del.count = 0; instead of EXIT WHEN c_logstodel%NOTFOUND;, and raising the chunk limit to 50,000 allowed the script to clear 35 million rows in 15 minutes:
create or replace procedure clear_logs
as
CURSOR c_logstodel IS SELECT rowid FROM test where id=23;
TYPE typ_log is table of rowid index by binary_integer;
v_log_del typ_log;
BEGIN
OPEN c_logstodel;
LOOP
FETCH c_logstodel BULK COLLECT INTO v_log_del LIMIT 50000;
exit when v_log_del.count = 0;
FORALL i IN v_log_del.FIRST..v_log_del.LAST
DELETE FROM test WHERE rowid =v_log_del(i);
COMMIT;
END LOOP;
COMMIT;
CLOSE c_logstodel;
END clear_logs;

First off, when using BULK COLLECT with LIMIT X, %NOTFOUND takes on a slightly unexpected meaning: it means Oracle could not retrieve X rows. (Technically it always means that; a plain single-row fetch asks for 1 row and reports that it could not fill the 1-row buffer.) So the final, partial batch is fetched, but your EXIT skips it before the FORALL runs. Just move the EXIT WHEN %NOTFOUND to after the FORALL. But there is actually no reason to retrieve the data and then delete the retrieved rows. A single DELETE statement would be considerably faster, but 35M rows would require significant rollback space. There is an intermediate solution.
Although not commonly used this way, DELETE statements generate rownum just as SELECTs do. This value can be used to limit the number of rows processed. So to break the work into a given commit size, just limit rownum on the delete:
create or replace procedure clear_logs
as
k_max_rows_per_interation constant integer := 50000;
begin
loop
delete
from test
where id=23
and rownum <= k_max_rows_per_interation;
exit when sql%rowcount < k_max_rows_per_interation;
commit;
end loop;
commit;
end;
As @Stilgar points out, deletes are expensive, meaning slow, so their solution may be better. But this one has the advantage that it does not essentially take the table completely out of service during the operation. NOTE: I tend to use a much larger commit interval, generally around 300,000 to 400,000 rows. I suggest you talk with your DBA and see what they think this limit should be. Remember, it is their job to properly size rollback space for typical operations. If this is normal in your operation, they need to set it accordingly. If you can get rollback space for 35M deletes, then a single DELETE is the fastest you are going to get.
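For reference, the %NOTFOUND fix mentioned at the start of this answer might look like the sketch below. It reuses the table and cursor names from the question and the rowid approach from the asker's revised version. Iterating over 1 .. COUNT rather than FIRST .. LAST is my own choice, so that an empty final fetch (when the row count is an exact multiple of the limit) should be a no-op rather than an error. Treat it as an untested outline, not a drop-in replacement.

```sql
create or replace procedure clear_logs
as
  cursor c_logstodel is select rowid from test where id = 23;
  type typ_log is table of rowid;
  v_log_del typ_log;
begin
  open c_logstodel;
  loop
    fetch c_logstodel bulk collect into v_log_del limit 5000;
    -- process whatever was fetched, including a short final batch
    forall i in 1 .. v_log_del.count
      delete from test where rowid = v_log_del(i);
    commit;
    -- %NOTFOUND is true once a fetch returns fewer rows than the limit,
    -- so test it only after the batch has been processed
    exit when c_logstodel%notfound;
  end loop;
  close c_logstodel;
end clear_logs;
/
```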

Related

Updating Millions of Records Oracle

I have written a query to update a column in 35 million records,
but unfortunately it took more than an hour to process.
Did I miss anything in the query below?
DECLARE
CURSOR exp_cur IS
SELECT
DECODE(
COLUMN_NAME,
NULL, NULL,
standard_hash(COLUMN_NAME)
) AS COLUMN_NAME
FROM TABLE1;
TYPE nt_fName IS TABLE OF VARCHAR2(100);
fname nt_fName;
BEGIN
OPEN exp_cur;
FETCH exp_cur BULK COLLECT INTO fname LIMIT 1000000;
CLOSE exp_cur;
--Print data
FOR idx IN 1 .. fname.COUNT
LOOP
UPDATE TABLE1 SET COLUMN_NAME=fname(idx);
commit;
DBMS_OUTPUT.PUT_LINE (idx||' '||fname(idx) );
END LOOP;
END;
The reason why bulk collect used with a forall construction is generally faster than the equivalent row-by-row loop is that it applies all the updates in one shot, instead of laboriously stepping through the rows one at a time and launching 35 million separate update statements, each one requiring the database to search for the individual row before updating it. But what you have written (even when the bugs are fixed) is still a row-by-row loop with 35 million search and update statements, plus the additional overhead of populating a 700 MB array in memory, 35 million commits, and 35 million dbms_output messages. It has to be slower because it has significantly more work to do than a plain update.
If it is practical to copy the data to a new table, insert will be a lot faster than update. At the end you can reapply any grants, indexes and constraints to the new table, rename both tables and drop the old one. You can also insert /*+ parallel enable_parallel_dml */ (or prior to Oracle 12c, you have to alter session enable parallel dml separately.) You could define the new table as nologging during the copy, but check with your DBA as that can affect replication and backups, though that might not matter if this is a test system. This will all need careful scripting if it's going to form part of a routine workflow.
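A rough sketch of that copy-and-swap approach is below. It rests on several assumptions: TABLE1 is taken to have only a key column ID plus COLUMN_NAME (list your real columns instead), TABLE1_NEW and TABLE1_OLD are made-up names, and RAWTOHEX just makes explicit the RAW-to-string conversion that the original update relied on implicitly. The parallel and nologging options mentioned above are left out to keep it short.

```sql
-- Build the replacement table with the hashed values already in place.
-- Null values stay null, matching the DECODE in the question.
create table table1_new as
select id,
       case
         when column_name is not null
         then rawtohex(standard_hash(column_name))
       end as column_name
from   table1;

-- Reapply grants, indexes and constraints on table1_new here.

-- Swap the tables, then drop the old copy once you are satisfied.
alter table table1 rename to table1_old;
alter table table1_new rename to table1;
drop table table1_old;
```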
Your code updates all records of TABLE1 in each iteration, because the UPDATE has no WHERE clause. (It loops once per fetched value, up to 1,000,000 times because of the LIMIT, and each iteration updates all 35 million records. That is why it takes so long.)
You can simply use a single update statement as follows:
UPDATE TABLE1 SET COLUMN_NAME = standard_hash(COLUMN_NAME)
WHERE COLUMN_NAME IS NOT NULL;
If you still want to use BULK COLLECT and FORALL, you can do it as follows. Note that the fetch has to sit inside a loop; a single FETCH with LIMIT 1000000 would process only the first million values:
DECLARE
  CURSOR EXP_CUR IS
    SELECT COLUMN_NAME FROM TABLE1
    WHERE COLUMN_NAME IS NOT NULL;
  TYPE NT_FNAME IS TABLE OF VARCHAR2(100);
  FNAME NT_FNAME;
BEGIN
  OPEN EXP_CUR;
  LOOP
    -- fetch the next batch; stop once nothing is left
    FETCH EXP_CUR BULK COLLECT INTO FNAME LIMIT 1000000;
    EXIT WHEN FNAME.COUNT = 0;
    FORALL IDX IN 1 .. FNAME.COUNT
      UPDATE TABLE1
      SET COLUMN_NAME = STANDARD_HASH(COLUMN_NAME)
      WHERE COLUMN_NAME = FNAME(IDX);
  END LOOP;
  CLOSE EXP_CUR;
  COMMIT;
END;
/

Oracle: how to limit number of rows in "select .. for update skip locked"

I have got a table:
table foo{
bar number,
status varchar2(50)
}
I have multiple threads/hosts each consuming the table. Each thread updates the status, i.e. pessimistically locks the row. This is on Oracle 12.2.
select ... for update skip locked seems to do the job, but I want to limit the number of rows. The new FETCH NEXT sounds right, but I can't get the syntax right:
SELECT * FROM foo ORDER BY bar
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY
FOR UPDATE SKIP LOCKED;
What is the simplest way to achieve this, i.e. with minimum code [1] (ideally without a PL/SQL function)?
I want something like this:
select * from (select * from foo
where status<>'baz' order by bar
) where rownum<10 for update skip locked
PS
1. We are considering moving away from oracle.
I suggest creating a PL/SQL function and using dynamic SQL to control the number of locked records. The lock is acquired at fetch time, so fetching N records automatically locks them. Keep in mind that the records are unlocked once you finish the transaction with a commit or rollback.
The following example locks N records and returns their id values as an array (assuming you have added a primary key column ID to your table):
create or replace function get_next_unlocked_records(iLockSize number)
return sys.odcinumberlist
is
cRefCursor sys_refcursor;
aIds sys.odcinumberlist := sys.odcinumberlist();
begin
-- open cursor. No locks so far
open cRefCursor for
'select id from foo '||
'for update skip locked';
-- we fetch and lock at the same time
fetch cRefCursor bulk collect into aIds limit iLockSize;
-- close cursor
close cRefCursor;
-- return locked ID values,
-- lock is kept until the transaction is finished
return aIds;
end;
sys.odcinumberlist is the built-in array of numbers.
Here is the test script to run in db:
declare
aRes sys.odcinumberlist;
begin
aRes := get_next_unlocked_records(10);
for c in (
select column_value id
from table(aRes)
) loop
dbms_output.put_line(c.id);
end loop;
end;
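A consumer would normally do its work on the locked rows and then commit, all in the same transaction. A minimal sketch of that, assuming the ID column added above and using 'baz' from the question purely as a placeholder status value:

```sql
declare
  l_ids sys.odcinumberlist;
begin
  -- lock up to 10 rows that no other session currently holds
  l_ids := get_next_unlocked_records(10);

  -- update the rows this session has locked
  forall i in 1 .. l_ids.count
    update foo
    set    status = 'baz'
    where  id = l_ids(i);

  -- ending the transaction releases the locks
  commit;
end;
/
```

If the function returns an empty list, the FORALL range is empty and nothing is updated.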

Fast Update database with more than 10 million records

I am fairly new to SQL and was wondering if someone can help me.
I got a database that has around 10 million rows.
I need to make a script that finds the records that have some NULL fields, and then updates it to a certain value.
The problem I have from doing a simple update statement, is that it will blow the rollback space.
I was reading around that I need to use BULK COLLECT AND FETCH.
My idea was to fetch 10,000 records at a time, update, commit, and continue fetching.
I tried looking for examples on Google but I have not found anything yet.
Any help?
Thanks!!
This is what I have so far:
DECLARE
CURSOR rec_cur IS
SELECT DATE_ORIGIN
FROM MAIN_TBL WHERE DATE_ORIGIN IS NULL;
TYPE date_tab_t IS TABLE OF DATE;
date_tab date_tab_t;
BEGIN
OPEN rec_cur;
LOOP
FETCH rec_cur BULK COLLECT INTO date_tab LIMIT 1000;
EXIT WHEN date_tab.COUNT() = 0;
FORALL i IN 1 .. date_tab.COUNT
UPDATE MAIN_TBL SET DATE_ORIGIN = '23-JAN-2012'
WHERE DATE_ORIGIN IS NULL;
END LOOP;
CLOSE rec_cur;
END;
I think I see what you're trying to do. There are a number of points I want to make about the differences between the code below and yours.
Your forall loop will not use an index. This is easy to get round by using rowid to update your table.
By committing after each forall you reduce the amount of undo needed; but make it more difficult to rollback if something goes wrong. Though logically your query could be re-started in the middle easily and without detriment to your objective.
rowids are small, collect at least 25k at a time; if not 100k.
Oracle does not store entirely null keys in a B-tree index, so a plain index on date_origin cannot be used to find the nulls. There are plenty of questions on Stack Overflow about this if you need more information. A function-based index on something like nvl(date_origin,'x'), as a loose example, would increase the speed at which you select data. It also means you never actually have to use the table itself. You only select from the index.
Your date data-type seems to be a string. I've kept this but it's not wise.
If you can get someone to increase your undo tablespace size then a straight up update will be quicker.
Assuming as per your comments date_origin is a date then the index should be on something like:
nvl(date_origin,to_date('absolute_minimum_date_in_Oracle_as_a_string','yyyymmdd'))
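I don't have access to a DB at the moment, but to find out the absolute minimum date in Oracle as a string (the "amdiOaas"), run the following query:
select to_date('0000','yyyy') from dual;
It should raise a useful error for you (ORA-01841, whose message states the valid range of years).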
Working example in PL/SQL Developer.
create table main_tbl as
select cast( null as date ) as date_origin
from all_objects
;
create index i_main_tbl
on main_tbl ( nvl( date_origin
, to_date('0001-01-01' ,'yyyy-mm-dd') )
)
;
declare
cursor c_rec is
select rowid
from main_tbl
where nvl(date_origin,to_date('0001-01-01','yyyy-mm-dd'))
= to_date('0001-01-01','yyyy-mm-dd')
;
type t__rec is table of rowid index by binary_integer;
t_rec t__rec;
begin
open c_rec;
loop
fetch c_rec bulk collect into t_rec limit 50000;
exit when t_rec.count = 0;
forall i in t_rec.first .. t_rec.last
update main_tbl
set date_origin = to_date('23-JAN-2012','DD-MON-YYYY')
where rowid = t_rec(i)
;
commit ;
end loop;
close c_rec;
end;
/

how to fetch, delete, commit from cursor

I am trying to delete a lot of rows from a table. I want to try the approach of putting rows I want to delete into a cursor and then keep doing fetch, delete, commit on each row of the cursor until it is empty.
In the below code we are fetching rows and putting them in a TYPE.
How can I modify the below code to remove the TYPE from the picture and just simply do fetch,delete,commit on the cursor itself.
OPEN bulk_delete_dup;
LOOP
FETCH bulk_delete_dup BULK COLLECT INTO arr_inc_del LIMIT c_rows;
FORALL i IN arr_inc_del.FIRST .. arr_inc_del.LAST
DELETE FROM UIV_RESPONSE_INCOME
WHERE ROWID = arr_inc_del(i);
COMMIT;
arr_inc_del.DELETE;
EXIT WHEN bulk_delete_dup%NOTFOUND;
END LOOP;
arr_inc_del.DELETE;
CLOSE bulk_delete_dup;
Why do you want to commit in batches? That is only going to slow down your processing. Unless there are other sessions that are trying to modify the rows you are trying to delete, which seems problematic for other reasons, the most efficient approach would be simply to delete the data with a single DELETE, i.e.
DELETE FROM uiv_response_income uri
WHERE EXISTS(
SELECT 1
FROM (<<bulk_delete_dup query>>) bdd
WHERE bdd.rowid = uri.rowid
)
Of course, there may well be a more optimal way of writing this depending on how the query behind your cursor is designed.
If you really want to eliminate the BULK COLLECT (which will slow the process down substantially), you could use the WHERE CURRENT OF syntax to do the DELETE
SQL> create table foo
2 as
3 select level col1
4 from dual
5 connect by level < 10000;
Table created.
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is select * from foo for update;
3 l_rowtype c1%rowtype;
4 begin
5 open c1;
6 loop
7 fetch c1 into l_rowtype;
8 exit when c1%notfound;
9 delete from foo where current of c1;
10 end loop;
11* end;
SQL> /
PL/SQL procedure successfully completed.
Be aware, however, that since you have to lock the row (with the FOR UPDATE clause), you cannot put a commit in the loop. Doing a commit would release the locks you had requested with the FOR UPDATE and you'll get an ORA-01002: fetch out of sequence error
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is select * from foo for update;
3 l_rowtype c1%rowtype;
4 begin
5 open c1;
6 loop
7 fetch c1 into l_rowtype;
8 exit when c1%notfound;
9 delete from foo where current of c1;
10 commit;
11 end loop;
12* end;
SQL> /
declare
*
ERROR at line 1:
ORA-01002: fetch out of sequence
ORA-06512: at line 7
You may not get a runtime error if you remove the locking and avoid the WHERE CURRENT OF syntax, deleting the data based on the value(s) you fetched from the cursor. However, this is still doing a fetch across commit which is a poor practice and radically increases the odds that you will, at least intermittently, get an ORA-01555: snapshot too old error. It will also be painfully slow compared to the single SQL statement or the BULK COLLECT option.
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is select * from foo;
3 l_rowtype c1%rowtype;
4 begin
5 open c1;
6 loop
7 fetch c1 into l_rowtype;
8 exit when c1%notfound;
9 delete from foo where col1 = l_rowtype.col1;
10 commit;
11 end loop;
12* end;
SQL> /
PL/SQL procedure successfully completed.
Of course, you also have to ensure that your process is restartable in case you process some subset of rows and have some unknown number of interim commits before the process dies. If the DELETE is sufficient to cause the row to no longer be returned from your cursor, your process is probably already restartable. But in general, that's a concern if you try to break a single operation into multiple transactions.
A few things. It seems from your company's "no transaction over 8 second" rule (8 seconds, you in Texas?), you have a production db instance that traditionally supported apps doing OLTP stuff (insert 1 row, update 2 rows, etc), and has now also become the batch processing db (remove 50% of the rows and replace with 1mm new rows).
Batch processing should be separated from OLTP instance. In a batch ("data factory") instance, I wouldn't try deleting in this case, I'd probably do a CTAS, drop old table, rename new table, rebuild indexes/stats, recompile invalid objs approach.
Assuming you are stuck doing batch processing in your "8 second" instance, you'll probably find your company will ask for more and more of this in the future, so ask the DBAs for as much rollback as you can get, and hope you don't get a snapshot too old by fetching across commits (cursor select driving the deletes, commit every 1000 rows or so, delete using rowid).
If the DBAs can't help, you may be able to first create a temp table containing the rowids that you wish to delete, and then loop through the temp table to delete from the main table (which avoids fetching across commits on the main table), but your company will probably have some rule against this as well, as it is another (basic) batch technique.
Something like:
declare
-- assuming index on someCol
cursor sel_cur is
select rowid as row_id
from someTable
where someCol = 'BLAH';
v_ctr pls_integer := 0;
begin
for rec in sel_cur
loop
v_ctr := v_ctr + 1;
-- watch out for snapshot too old...
delete from someTable
where rowid = rec.row_id;
if (mod(v_ctr, 1000) = 0) then
commit;
end if;
end loop;
commit;
end;
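The temp-table variant described above could be sketched roughly as follows. The work table name del_rowids and the batch size of 1000 are made up for illustration, and an ordinary table stands in for the temp table. Because the driving cursor reads only del_rowids, which is not modified during the run, it should be much less exposed to snapshot too old than a cursor on someTable itself.

```sql
-- Capture the rowids to delete in a work table (hypothetical name).
create table del_rowids as
select rowid as row_id
from   someTable
where  someCol = 'BLAH';

declare
  cursor c_ids is select row_id from del_rowids;
  type t_ids is table of rowid;
  v_ids t_ids;
begin
  open c_ids;
  loop
    fetch c_ids bulk collect into v_ids limit 1000;
    exit when v_ids.count = 0;
    -- delete one batch from the main table, then commit it
    forall i in 1 .. v_ids.count
      delete from someTable where rowid = v_ids(i);
    commit;
  end loop;
  close c_ids;
end;
/

drop table del_rowids;
```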

In oracle, do explicit cursors load the entire query result in memory?

I have a table with about 1 billion rows. I'm the sole user so there's no contention on locks, etc.
I noticed that when I run something like this:
DECLARE
CURSOR cur IS SELECT col FROM table where rownum < N;
BEGIN
OPEN cur;
LOOP
dbms_output.put_line("blah")
END LOOP;
CLOSE cur;
END;
there is a lag between the time when I hit enter and the time the output begins to flow in. If N is small then it's insignificant. For large N (or no WHERE clause) this lag is on the order of hours.
I'm new to oracle as you can tell, and I assumed that cursors just keep a pointer in the table which they update on every iteration of the loop. So I didn't expect a lag proportional to the size of the table over which iteration is performed. Is this wrong? Do cursors load the entire query result prior to iterating over it?
Is there a way to iterate over a table row by row without an initial overhead?
What you are seeing is that the output from DBMS_OUTPUT.PUT_LINE is not displayed until the program has finished. It doesn't tell you anything about how fast the query returned a first row. (I assume you intended to actually fetch data in your example).
There are many ways you can monitor a session, one is like this:
DECLARE
CURSOR cur IS SELECT col FROM table;
l_col table.col%TYPE;
BEGIN
OPEN cur;
LOOP
FETCH cur INTO l_col;
EXIT WHEN cur%NOTFOUND;
dbms_application_info.set_module('TEST',l_col);
END LOOP;
CLOSE cur;
END;
While that is running, from another session run:
select action from v$session where module='TEST';
You will see that the value of ACTION keeps changing as the cursor fetches rows.
I also like to monitor v$session_longops for operations deemed by the Oracle optimizer to be "long operations":
select message, time_remaining
from v$session_longops
where time_remaining > 0;
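To check how quickly the first row actually comes back, independent of when DBMS_OUTPUT is displayed, one option is to time a single fetch. This is only a sketch: some_table and col are placeholders for your own table and column, and DBMS_UTILITY.GET_TIME counts in hundredths of a second.

```sql
declare
  cursor cur is select col from some_table;  -- placeholder names
  l_col   some_table.col%type;
  l_start pls_integer := dbms_utility.get_time;
begin
  open cur;
  -- fetch only the first row and report the elapsed time
  fetch cur into l_col;
  dbms_output.put_line('first row after '
    || (dbms_utility.get_time - l_start) / 100 || ' s');
  close cur;
end;
/
```

Because the block stops after one fetch, the output appears as soon as the first row is available rather than after a full pass over the table.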