Nested cursor performance tuning - sql

I have two cursors: one fetches records from a table with 50 columns and 10,000+ rows, and the other checks whether a particular column value exists in another large table (2 million rows). I need to write all the records from cursor 1 for a given year to a file. If the value exists in cursor 2, I should log a message saying that it exists and not delete the row. If it does not exist, I should delete the row and write it to the same file with a "record deleted" message.
I used a nested cursor, but the performance is very poor because every row from cursor 1 is checked against cursor 2 individually.
CURSOR cursor1
IS
select a.* ,a.rowid
FROM table1 a
WHERE a.year = p_year;
CURSOR check_c2(lv_cd )
IS
Select DISTINCT 'Y'
from table2
where table2 ='R'
AND table2.year= p_year
and table2_code= lv_cd ;
BEGIN
FOR r in cursor1 LOOP
EXIT WHEN cursor1%NOTFOUND;
OPEN check_c2(r.cd);
FETCH check_c2 INTO lv_check;
IF check_c2%NOTFOUND THEN
lv_check :='N';
END IF;
CLOSE check_c2;
IF lv_check ='Y' THEN
lv_msg =(r.col1,r.col2....r.col50, R code exists do not delete)
utl_file.put_line(lv_log_file, lv_msg, autoflush=>TRUE);
ELSE
DELETE from table1 where rowid= r.rowid;
lv_msg =(r.col1,r.col2....r.col50, delete row)
utl_file.put_line(lv_log_file, lv_msg, autoflush=>TRUE);
END IF;
END LOOP;

I don't have enough reputation to comment, so I'll write this as an answer.
Have you tried adding timing marks to see which parts take the most time?
Does table2 have an index on year and code? What is the explain plan for the cursor 2 query? If there is an index, how many rows are there on average per year+code combination?
If the total amount of data selected from table2 is huge, it will probably be faster to run a single query that does a full scan or an index range scan by year on table2, groups the result, and hash left outer joins from table1 to table2, like:
select a.*, a.rowid, nvl2(c.code, 'Y', 'N') check_col
from table1 a,
(
select distinct code
from table2 b
where b.year = p_year
) c
where a.year = p_year
and c.code(+) = a.cd
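
The single-pass idea can be tried on a toy dataset. The following is only a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for Oracle, so LEFT JOIN replaces the (+) syntax and CASE replaces nvl2. The function name classify_and_delete, the reduced two-column tables, and the sample rows are all hypothetical; the set-based DELETE is an extension based on the question, not part of the query above.

```python
import sqlite3

def classify_and_delete(conn, year):
    """Flag each table1 row for `year` as 'Y' (code found in table2) or 'N',
    then delete the 'N' rows with one set-based statement.
    Returns the flagged rows as (cd, check_col) tuples, ordered by cd."""
    flagged = conn.execute(
        """
        SELECT a.cd,
               CASE WHEN c.code IS NULL THEN 'N' ELSE 'Y' END AS check_col
        FROM table1 a
        LEFT JOIN (SELECT DISTINCT code FROM table2 WHERE year = ?) c
               ON c.code = a.cd
        WHERE a.year = ?
        ORDER BY a.cd
        """,
        (year, year),
    ).fetchall()
    # One DELETE replaces the per-row lookups of the nested cursor.
    conn.execute(
        """
        DELETE FROM table1
        WHERE year = ?
          AND NOT EXISTS (SELECT 1 FROM table2 b
                          WHERE b.year = table1.year
                            AND b.code = table1.cd)
        """,
        (year,),
    )
    return flagged

# Hypothetical sample data: only code 'A' exists in table2 for 2020.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (cd TEXT, year INTEGER);
    CREATE TABLE table2 (code TEXT, year INTEGER);
    INSERT INTO table1 VALUES ('A', 2020), ('B', 2020), ('C', 2019);
    INSERT INTO table2 VALUES ('A', 2020), ('A', 2020), ('B', 2019);
""")
result = classify_and_delete(conn, 2020)
remaining = conn.execute("SELECT cd, year FROM table1 ORDER BY cd").fetchall()
print(result)
print(remaining)
```

The flagged rows can be written to the log file in one pass after the query, so the database work no longer depends on the number of rows in table1.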

How about this? A three-step operation:
Step 1: "save" rows you'll later delete
create table log_table as
select *
from table1 a
where not exists (select null   -- the question deletes rows that have no match in table2
                  from table2 b
                  where b.year = a.year
                  and b.code = a.code
                 );
Step 2: delete rows:
delete from table1 a
where not exists (select null
                  from table2 b
                  where b.year = a.year
                  and b.code = a.code
                 );
Step 3: if you must, store rows saved in the LOG_TABLE into that file of yours. If not, leave them in LOG_TABLE.

Calling utl_file.put_line on every loop iteration adds overhead. Try appending to lv_msg until the string approaches 32767 bytes, and only then write it out with a single call.
This reduces the number of I/O calls, so performance should improve.


Delete two tables based on results of one join

I am trying to delete data from two tables at the same time using an inner join. However, when I run my query I get this error:
SQL command not properly ended
Some background on what I am trying to do and on the two tables, table1 and table2: both tables have a common field, say "ABC". I would like to delete data from both tables using an inner join, restricted by a WHERE condition on a field (XYZ) in table1 that must equal a given value.
This is my SQL statement:
DELETE table1, table2
FROM table1
INNER JOIN table2 ON table1.ABC = table2.ABC
WHERE table1.XYZ = 'TESTIT';
You can't delete from more than one table in a single statement.
You must use two separate DELETE statements.
To do this you can create a temporary table that stores the IDs to delete, for example:
CREATE TABLE app (ABC varchar(100));
INSERT INTO app (ABC)
SELECT table1.ABC
FROM table1
INNER JOIN table2 ON table1.ABC = table2.ABC
WHERE table1.XYZ = 'TESTIT';
DELETE
FROM table1
WHERE table1.ABC IN (SELECT ABC FROM app);
DELETE
FROM table2
WHERE table2.ABC IN (SELECT ABC FROM app);
DROP TABLE app;
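
The key-table approach above can be sketched end to end. This is only an illustration, assuming SQLite (through Python's sqlite3 module) in place of the asker's database; the function name delete_from_both, the temporary table name keys_to_delete, and the sample rows are hypothetical.

```python
import sqlite3

def delete_from_both(conn, xyz_value):
    """Capture the join keys once, then run one DELETE per table.
    Returns (rows deleted from table1, rows deleted from table2)."""
    with conn:  # commits on success, rolls back pending changes on error
        conn.execute(
            "CREATE TEMP TABLE keys_to_delete (abc INTEGER PRIMARY KEY)")
        conn.execute(
            """
            INSERT INTO keys_to_delete (abc)
            SELECT DISTINCT t1.abc
            FROM table1 t1
            JOIN table2 t2 ON t2.abc = t1.abc
            WHERE t1.xyz = ?
            """,
            (xyz_value,),
        )
        n1 = conn.execute(
            "DELETE FROM table1 WHERE abc IN (SELECT abc FROM keys_to_delete)"
        ).rowcount
        n2 = conn.execute(
            "DELETE FROM table2 WHERE abc IN (SELECT abc FROM keys_to_delete)"
        ).rowcount
        conn.execute("DROP TABLE keys_to_delete")
    return n1, n2

# Hypothetical sample data: keys 1 and 2 are shared and tagged 'TESTIT'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (abc INTEGER, xyz TEXT);
    CREATE TABLE table2 (abc INTEGER, note TEXT);
    INSERT INTO table1 VALUES (1, 'TESTIT'), (2, 'TESTIT'), (3, 'OTHER');
    INSERT INTO table2 VALUES (1, 'x'), (2, 'y'), (2, 'z'), (4, 'w');
""")
counts = delete_from_both(conn, 'TESTIT')
left1 = conn.execute("SELECT abc, xyz FROM table1").fetchall()
left2 = conn.execute("SELECT abc, note FROM table2").fetchall()
print(counts, left1, left2)
```

The keys have to be captured before the first DELETE runs, because once table1 has been emptied of the matching rows the join can no longer find them for table2.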
In Oracle you cannot delete from two tables in a single statement the way you are trying to. The syntax is wrong. You can use something like this instead:
DELETE table1
where table1.ABC IN (select table2.ABC   -- IN rather than =, in case table2 has several matching rows
                     from table2
                     WHERE table2.ABC = table1.ABC
                     and table1.XYZ = 'TESTIT');
A PL/SQL solution might be something like this:
declare
    type abc_tt is table of table1.abc%type index by pls_integer;
    l_abc_collection abc_tt;
begin
    select distinct t1.abc bulk collect into l_abc_collection
    from   table1 t1
           join table2 t2 on t2.abc = t1.abc
    where  t1.xyz = 'TESTIT';
    dbms_output.put_line('Stored ' || l_abc_collection.count || ' values for processing');
    forall i in 1..l_abc_collection.count
        delete table1 t
        where  t.xyz = 'TESTIT'
        and    t.abc = l_abc_collection(i);
    dbms_output.put_line('Deleted ' || sql%rowcount || ' rows from table1');
    forall i in 1..l_abc_collection.count
        delete table2 t
        where  t.xyz = 'TESTIT'
        and    t.abc = l_abc_collection(i);
    dbms_output.put_line('Deleted ' || sql%rowcount || ' rows from table2');
end;
Output:
Stored 1000 values for processing
Deleted 1000 rows from table1
Deleted 1000 rows from table2
Test setup:
create table table1 (abc, xyz) as
select rownum, 'TESTIT' from dual connect by rownum <= 1000
union all
select rownum, 'OTHER' from dual connect by rownum <= 100;
create table table2 as select * from table1;
After deletion there are 100 rows in each table. I have assumed we only want to delete the ones where xyz = 'TESTIT' even when abc values are common to both tables.
select distinct table1.ABC into Temptable
FROM table1
INNER JOIN table2 ON table1.ABC = table2.ABC
WHERE table1.XYZ = 'TESTIT'
delete table1 where ABC in (select ABC from Temptable)
delete table2 where ABC in (select ABC from Temptable)
drop table Temptable

PL SQL map result set to multiple records

If we have 3 tables for example A, B and C and a cursor like
FOR i IN (
SELECT *
FROM A
JOIN B ON(...)
JOIN C ON(...)
) LOOP
--is there an easy way to map every row to 3 records(A%rowtype, B%rowtype and C%rowtype)?
END LOOP;
Note that fetching a record by ROWID is faster than fetching it through an index. Even so, the following trick costs a little performance compared with a simple FOR loop, because nothing is free.
DECLARE
    l_rec_a table_A%ROWTYPE;
    l_rec_b table_B%ROWTYPE;
BEGIN
    -- Oracle does not accept AS before a table alias, so the aliases are bare.
    FOR i IN (SELECT a.ROWID AS rid_a
                   , b.ROWID AS rid_b
              FROM table_A a
              JOIN table_B b ON a.id = b.id)
    LOOP
        SELECT * INTO l_rec_a FROM table_A WHERE ROWID = i.rid_a;
        SELECT * INTO l_rec_b FROM table_B WHERE ROWID = i.rid_b;
        /* do something */
    END LOOP;
END;

Need help in SQL Select Query

I need some help with this query. I just want to know whether what I am doing is fine or whether I need a JOIN to improve it. Sorry if this is a silly question, but I am a little worried because I query the same table thrice. Thanks in advance.
Select *
from TableA
where (A_id in (1, 2, 3, 4)
       and flag = 'Y')
   or (A_id in
         (select A_id from TableB
          where A_id in
                (Select A_id from TableA
                 where (A_id in (1, 2, 3, 4)
                        and flag = 'N')
                 group by A_id
                 having sum(qty) > 0)
         )
      )
Relation between TableA and TableB is one-to-many
Condition or Logic:
if the flag is true, the data can be selected without further checks
if the flag is false, we have to refer TableB to see if sum of the qty column is greater than 0
Your approach is indeed way too complicated. Select from A where flag = Y or the sum of related B > 0. Do the latter in a subquery.
select *
from a
where a_id in (1,2,3,4)
and
(
flag = 'Y'
or
(select sum(qty) from b where b.a_id = a.a_id) > 0
)
There's nothing badly wrong with the query you've presented, but there are improvements that can be made. If you move the test for Flag='N' into your first select from TableA and correlate your select from TableB with your first select from TableA, then you can dispense with the second select from TableA:
Select *
from TableA A
where A_id in (1, 2, 3, 4)
and (flag = 'Y'
or (flag = 'N'
and A_id in (select A_id
from TableB B
where b.A_id = a.A_id
group by A_id
having sum(qty) > 0))
);
This will eliminate an extra lookup on TableA for information that should already be known. Second, since TableA.A_Id is now correlated with TableB.A_Id, the A_Id in (...) can be changed to an exists clause:
Select *
from TableA A
where A_id in (1, 2, 3, 4)
and (flag = 'Y'
or (flag = 'N'
and exists (select A_id
from TableB B
where b.A_id = a.A_id
group by A_id
having sum(qty) > 0))
);
This may (depending on the database type) inform the database's query optimizer that it can stop retrieving rows from TableB after the first row is found.
In an Oracle database on a small unindexed sample dataset these two changes shaved 25% off of the cost of the query, so the performance increases could be significant.
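
The scalar-subquery form and the EXISTS form can be checked against each other on a small dataset. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) instead of Oracle and hypothetical tables a(a_id, flag) and b(a_id, qty) with made-up rows; it only checks that both forms select the same rows and says nothing about their relative cost.

```python
import sqlite3

# Form 1: compare the summed quantity from b directly.
SUM_SUBQUERY = """
    SELECT a_id FROM a
    WHERE a_id IN (1, 2, 3, 4)
      AND (flag = 'Y'
           OR (SELECT SUM(qty) FROM b WHERE b.a_id = a.a_id) > 0)
    ORDER BY a_id
"""

# Form 2: correlated EXISTS with GROUP BY / HAVING.
EXISTS_FORM = """
    SELECT a_id FROM a
    WHERE a_id IN (1, 2, 3, 4)
      AND (flag = 'Y'
           OR (flag = 'N'
               AND EXISTS (SELECT b.a_id FROM b
                           WHERE b.a_id = a.a_id
                           GROUP BY b.a_id
                           HAVING SUM(qty) > 0)))
    ORDER BY a_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a_id INTEGER PRIMARY KEY, flag TEXT);
    CREATE TABLE b (a_id INTEGER, qty INTEGER);
    -- 1: flagged Y; 2: N with positive sum; 3: N with zero sum;
    -- 4: N with no rows in b; 5: flagged Y but outside the IN list.
    INSERT INTO a VALUES (1, 'Y'), (2, 'N'), (3, 'N'), (4, 'N'), (5, 'Y');
    INSERT INTO b VALUES (1, -3), (2, 5), (2, -1), (3, 0), (3, 0);
""")
ids_sum = [row[0] for row in conn.execute(SUM_SUBQUERY)]
ids_exists = [row[0] for row in conn.execute(EXISTS_FORM)]
print(ids_sum, ids_exists)
```

Row 4 is the edge case worth noting: with no rows in b the sum is NULL, the comparison with 0 is not true, and the EXISTS subquery returns no group, so both forms leave it out.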
Would it be possible for you to split this query into a stored procedure?
For example:
DELIMITER $$
CREATE FUNCTION flaggedSelection ( my_flag varchar(1) )
RETURNS varchar(255) -- TODO: change to appropriate output
BEGIN
    DECLARE return_value varchar(255); -- TODO: change to appropriate output
    IF my_flag = 'Y'
    THEN
        -- Perform select without further checks
        -- return_value = QUERY;
    ELSE
        -- Refer TableB to see if sum of the qty column is greater than 0
        -- return_value = QUERY;
    END IF;
    RETURN return_value;
END; $$
DELIMITER ;

Oracle : after insert into select, update the table

I need your advice on the case below.
I get data from mainTable and insert it into dataTable where rownum <= some value.
Once all the data has been inserted into dataTable, I want to update the status of that data in mainTable.
The problem is that when rownum is more than 500k, it takes about 10 minutes. During that time another request could pick up the same data. How can I prevent this?
Below is my SQL.
insert into dataTable(id,num,status) select m.id,m.num,m.status from mainTable m where m.status = 'FREE' and rownum <= 100000;
update mainTable m set m.status = 'RESERVED' where m.num in (select d.num from dataTable d where d.status = 'FREE');
I did some research, but I don't know whether I need to use SELECT FOR UPDATE or a MERGE statement.
You can't use MERGE, as you can only insert into or update the target table. I would guess that the problem is either the selectivity of the column STATUS in dataTable or of the column NUM in mainTable.
Either way, if you only want to update those rows in mainTable that you've just inserted into dataTable, the simplest thing to do would be to remember what you've just inserted and update that. A BULK COLLECT seems apposite.
declare
    cursor c_all is
        select rowid as rid, id, num, status
        from maintable
        where status = 'FREE'
        and rownum <= 100000;
    type t__all is table of c_all%rowtype index by binary_integer;
    t_all t__all;
begin
    open c_all;
    loop
        fetch c_all bulk collect into t_all limit 10000;
        exit when t_all.count = 0; -- stop once the cursor is exhausted
        forall i in t_all.first .. t_all.last
            insert into datatable (id, num, status)
            values (t_all(i).id, t_all(i).num, t_all(i).status);
        forall i in t_all.first .. t_all.last
            update maintable
            set status = 'RESERVED'
            where rowid = t_all(i).rid;
    end loop;
    commit;
    close c_all;
end;
/
This is not equivalent to your query; it assumes that maintable is unique on NUM. If it is unique on ID I would change the UPDATE to a MERGE (it's cleaner) and remove the ROWID column from the cursor:
forall i in t_all.first .. t_all.last
    merge into maintable m
    using ( select t_all(i).num as num from dual ) d
    on ( m.num = d.num )
    when matched then
        update
        set m.status = 'RESERVED';
As I've written though, if the problem is the selectivity of the columns/indexing you need to post the explain plan, indexes etc.
I think it is better to use EXISTS instead of IN in your update query; it is often faster:
update mainTable m
set m.status = 'RESERVED'
where exists (select * from dataTable d where m.num = d.num and d.status = 'FREE');

Deleting rows in a table a chunk at a time

I have a single-column table Ids, whose column ID is of type uniqueidentifier. I have another table MyTable which has an ID column as well as many other columns. I would like to delete rows from MyTable 1000 at a time, where the ID from MyTable matches an ID in Ids.
WHILE 1 = 1 BEGIN
    DELETE t FROM (SELECT TOP 1000 ID FROM Ids) d INNER JOIN MyTable t ON d.ID = t.ID;
    IF @@ROWCOUNT < 1 BREAK;
    WAITFOR DELAY @sleeptime; -- some time to be determined later
END
This doesn't seem to work though. What should the statement actually be?
Try this:
DECLARE @BatchSize INT
SET @BatchSize = 100000
WHILE @BatchSize <> 0
BEGIN
    DELETE TOP (@BatchSize) t
    FROM [MyTable] t
    INNER JOIN [Ids] d ON d.ID=t.ID
    WHERE ????
    SET @BatchSize = @@rowcount
END
Has the benefit that the only variable you need to create is the size, as it uses it for the WHILE loop check. When the delete gets below 100000, it will set the variable to that number, on the next pass there will be nothing to delete and the rowcount will be 0... and so you exit. Clean, simple, and easy to understand. Never use a CURSOR when WHILE will do the trick!
Try
Delete from MyTable
Where ID in
    (select top 1000 t.ID
     from Ids t
     inner join MyTable d on d.Id = t.Id)
You could also try:
set rowcount 1000
delete from mytable where id in (select id from ids)
set rowcount 0 --reset it when you are done.
http://msdn.microsoft.com/en-us/library/ms188774.aspx
WHILE EXISTS (SELECT TOP 1 * FROM MyTable mt JOIN IDs i ON mt.ID = i.ID)
BEGIN
    DELETE TOP (1000) FROM MyTable
    FROM MyTable mt JOIN IDS i ON mt.ID = i.ID
    --you can wait if you want, but probably not necessary
END
--Sorry for the quick post; was in a hurry :)
The DELETE statement in SQL Server supports two FROM clauses; the first FROM identifies the table that is having rows deleted, and the second FROM clause is used for JOINS.
See: http://msdn.microsoft.com/en-us/library/ms189835.aspx
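
The loop-until-nothing-is-deleted pattern used in the answers above can be sketched in a runnable form. This is only an illustration, assuming SQLite (through Python's sqlite3 module) rather than SQL Server, so a LIMIT subquery on rowid stands in for DELETE TOP (n) and the cursor's rowcount stands in for @@ROWCOUNT. The function name delete_in_batches, the integer IDs, and the sample sizes are hypothetical.

```python
import sqlite3

def delete_in_batches(conn, batch_size):
    """Delete MyTable rows whose ID appears in Ids, at most batch_size rows
    per statement. Returns the number of rows deleted by each batch."""
    batches = []
    while True:
        with conn:  # commit after every batch to keep transactions short
            cur = conn.execute(
                """
                DELETE FROM MyTable
                WHERE rowid IN (SELECT t.rowid
                                FROM MyTable t
                                JOIN Ids d ON d.ID = t.ID
                                LIMIT ?)
                """,
                (batch_size,),
            )
        if cur.rowcount == 0:  # nothing left to delete, so stop
            break
        batches.append(cur.rowcount)
    return batches

# Hypothetical sample data: 25 rows in MyTable, 7 of them listed in Ids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE Ids (ID INTEGER PRIMARY KEY);
""")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)",
                 [(i, "row %d" % i) for i in range(1, 26)])
conn.executemany("INSERT INTO Ids VALUES (?)", [(i,) for i in range(1, 8)])
conn.commit()

batches = delete_in_batches(conn, 3)
remaining = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]
print(batches, remaining)
```

A pause between batches, like the WAITFOR DELAY in the question, could go right after the rowcount check if other sessions need room to work.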