Delete SQL - Taking forever

Delete SQL scripts are taking a very long time, and some even hang forever, in Oracle 12c. We have hundreds of delete scripts like the ones below, and we even tried running them with a parallel hint /*+ PARALLEL (a,4) */, but saw no performance improvement.
Is there any way to tune these delete statements?
Can we use a PL/SQL FOR loop to get any performance improvement?
If yes, please share your thoughts and advice.
Some sample SQL scripts:
DELETE FROM E_PROJ_DETAIL
WHERE CATEGORY_ID IN (SELECT PRIMARY_KEY FROM Y_OBJ_CATEGORY WHERE TREE_POSITION = 'VEN$_MADD');
COMMIT;

delete from e_proj_group_access
where enterprise_object_id in (select primary_key from t_project
                               where application_id in (select application_id from y_object_definition
                                                         where unique_code = 'VEN$_MADD'));
commit;

I don't know of any way to 'tune' DELETE statements as such, except perhaps dropping any useless (= unused) indexes and constraints upfront and recreating them afterwards.
In these cases (deleting many rows) I used FOR loops with commits inside, something like this:
DECLARE
  i PLS_INTEGER := 0;
BEGIN
  FOR c IN (SELECT id FROM my_table WHERE [conditions to delete])
  LOOP
    DELETE FROM my_table t WHERE t.id = c.id;  /* id = primary key */
    i := i + 1;
    IF i > 1000 THEN
      COMMIT;
      i := 0;
    END IF;
  END LOOP;
  COMMIT;
END;
But here you can occasionally run into ORA-01555: snapshot too old, because you are deleting rows from the same table on which you opened the cursor in the FOR loop.
In other situations, you could do CREATE TABLE newtable AS SELECT * FROM oldtable WHERE [conditions for rows you want to keep], then TRUNCATE oldtable, and finally INSERT /*+ APPEND */ INTO oldtable SELECT * FROM newtable; to write the correct data back.
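For example, a minimal sketch of that write-back approach, assuming nothing else touches the table while it runs (oldtable, newtable and the keep-condition are placeholders, not names from the question):
CREATE TABLE newtable AS
  SELECT * FROM oldtable
   WHERE status = 'ACTIVE';        -- hypothetical condition for rows to keep
TRUNCATE TABLE oldtable;
INSERT /*+ APPEND */ INTO oldtable
  SELECT * FROM newtable;
COMMIT;                            -- a direct-path insert must be committed before the table is queried again
DROP TABLE newtable;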
It really depends on the situation you are in (as others commented - how many rows do you have in the table, how many rows do you want to delete, etc.).
hth :-)

Depending on whether it is a one-shot operation or not, creating a new table with only the rows you want to keep is often much faster:
CREATE TABLE E_PROJ_DETAIL_NEW AS
SELECT * FROM E_PROJ_DETAIL
WHERE CATEGORY_ID NOT IN (SELECT PRIMARY_KEY FROM Y_OBJ_CATEGORY WHERE TREE_POSITION = 'VEN$_MADD');
Then drop the old table and rename the new one.
You may need to re-create indexes / foreign keys if you had any.
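A rough sketch of that swap for the first table in the question; the index and constraint names below are invented, so re-create whatever actually existed on the original table:
DROP TABLE E_PROJ_DETAIL;
ALTER TABLE E_PROJ_DETAIL_NEW RENAME TO E_PROJ_DETAIL;
CREATE INDEX e_proj_detail_cat_ix ON E_PROJ_DETAIL (CATEGORY_ID);      -- example index
ALTER TABLE E_PROJ_DETAIL ADD CONSTRAINT e_proj_detail_cat_fk
  FOREIGN KEY (CATEGORY_ID) REFERENCES Y_OBJ_CATEGORY (PRIMARY_KEY);   -- example FK, assuming PRIMARY_KEY is the key of Y_OBJ_CATEGORY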

Related

Truncating part of a postgres table

I am running Postgres version 10.01
psql -V
psql (PostgreSQL) 10.5
I have a table mytable with about 250 million rows. My objective is to create a new table newtable and copy about half of mytable into newtable (SELECT * WHERE time > '2019-01-01'), then delete the copied records from mytable.
Of course, I want to keep all the indexes on mytable.
What is the most efficient way to do this in psql? TRUNCATE TABLE is efficient but removes all rows. DELETE would probably take a lot of time and prevent inserts from happening (INSERTs are scheduled every 10 mins).
Any suggestions would be helpful
You would need to proceed in two steps.
First, copy the rows to the new table. You can use a CREATE TABLE ... AS SELECT statement (but you will need to recreate indexes and other objects, such as constraints, manually on the new table afterwards).
CREATE TABLE new_table
AS SELECT * FROM old_table WHERE time > '2019-01-01';
Then, delete the records from the old table. It looks like an efficient way would be to JOIN with the new table, using the DELETE ... USING syntax. Assuming that you have a primary key called id:
DELETE FROM old_table o
USING new_table n
WHERE n.id = o.id;
(Be sure to create an index on id in the new table before running this.)
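For instance, assuming the names used above (new_table with primary-key column id):
CREATE INDEX new_table_id_idx ON new_table (id);   -- supports the DELETE ... USING join
ANALYZE new_table;                                 -- refresh planner statistics so the join can use the index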
If you are just trying to delete rows, couldn't you simply run the delete as its own transaction? Unless your inserts depend on the existing data in the table, there should be no blocking.
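A minimal sketch of that idea, using the cut-off date from the question:
BEGIN;
DELETE FROM mytable WHERE time > '2019-01-01';   -- the rows already copied to newtable
COMMIT;
Under PostgreSQL's MVCC, this delete only takes row-level locks, so concurrent INSERTs into mytable should not be blocked.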

Deletion of millions of records without parallel hint and bulk collect

I have a table PROD_MAIN which has 750 million records, on a single database. The database infrastructure is very basic, without RAC; it is just one database.
The requirement is to delete the records that are more than one year old. I wrote PL/SQL code with a parallel hint and bulk collect, but it takes a very long time to execute. Please find the code below.
ALTER SESSION ENABLE PARALLEL DML;
DECLARE
  TYPE TABLE_DELETE IS TABLE OF ROWID;
  T_DELETE        TABLE_DELETE;

  CURSOR C_DELETE IS
    SELECT /*+ PARALLEL(10) */ ROWID
      FROM PROD_MAIN
     WHERE RECORD_DATE < (TRUNC(SYSDATE) - 366);

  L_DELETE_BUFFER PLS_INTEGER := 50000;
BEGIN
  OPEN C_DELETE;
  LOOP
    FETCH C_DELETE BULK COLLECT INTO T_DELETE LIMIT L_DELETE_BUFFER;

    FORALL I IN 1 .. T_DELETE.COUNT
      DELETE /*+ PARALLEL(10) */ PROD_MAIN WHERE ROWID = T_DELETE(I);

    EXIT WHEN C_DELETE%NOTFOUND;
    COMMIT;
  END LOOP;
  CLOSE C_DELETE;
  COMMIT;
END;
ALTER SESSION DISABLE PARALLEL DML;
I also set NOLOGGING on the table. I created indexes and gathered statistics, but the performance did not improve. So, is there any other way I can delete these millions of records within 3-5 hours?
If the table is partitioned by date, you can truncate the partitions that hold data more than one year old (truncating a partition takes almost no time and does not degrade the table).
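For example, assuming PROD_MAIN is range-partitioned on RECORD_DATE and p_2016_01 is one of the out-of-date partitions (the partition name is made up):
ALTER TABLE PROD_MAIN TRUNCATE PARTITION p_2016_01 UPDATE GLOBAL INDEXES;
-- repeat for every partition that only holds data older than one year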
If it has no partitions, I think the best thing you can do is not to try to remove all the records in one single transaction. Remove a batch of records at a time inside a loop. For example, if you want to delete 10,000 records per iteration you can do:
DELETE FROM your_table WHERE your_conditions LIMIT 10000;   (MySQL)
DELETE FROM your_table WHERE your_conditions AND ROWNUM < 10000;   (Oracle)
Remember to optimize the table (rebuild its indexes) after finishing, or even in between deletes, because heavy deleting degrades the indexes.
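On Oracle, wrapping that delete in a loop could look roughly like this (the 10,000-row batch size is arbitrary; the predicate is the "more than one year old" condition from the question):
BEGIN
  LOOP
    DELETE FROM PROD_MAIN
     WHERE RECORD_DATE < TRUNC(SYSDATE) - 366
       AND ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;   -- nothing left to delete
    COMMIT;                       -- keep each transaction (and its undo) small
  END LOOP;
  COMMIT;
END;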
Depending on your environment and requirements, another thing you can try is to create an empty copy of the table and perform an INSERT ... SELECT, inserting into the new table all the rows that you want to keep. After that, truncate the original table, drop it, and rename the new one:
MyOriginalTable with all the data
Create an empty copy: MyTemporalTable (without indexes)
Move the valid data from MyOriginalTable to MyTemporalTable
Truncate and drop MyOriginalTable
Create the indexes on MyTemporalTable
Rename MyTemporalTable to MyOriginalTable
I think the problem is that this table is a master table for other table(s).
To speed things up, disable the foreign keys on those child tables, delete the rows, and then re-enable the constraints.
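A sketch of that, with made-up child-table and constraint names:
ALTER TABLE child_table DISABLE CONSTRAINT fk_child_prod_main;
-- ... run the big delete on PROD_MAIN ...
ALTER TABLE child_table ENABLE CONSTRAINT fk_child_prod_main;   -- ENABLE re-validates the child rows and can be slow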
But the third solution, from 'Diego Sal Diaz', of copying the remaining rows to a temporary table and renaming it is also good.
I resolved this issue by creating a temporary table PROD_MAIN_TEMP which has the exact same structure as PROD_MAIN. After creating it, I inserted the data that I wanted to keep:
INSERT /*+ APPEND */ INTO PROD_MAIN_TEMP
SELECT /*+ PARALLEL(10) */ * FROM PROD_MAIN WHERE RECORD_DATE >= TRUNC(SYSDATE) - 366;
Then I dropped the main table PROD_MAIN and renamed the temporary table PROD_MAIN_TEMP to PROD_MAIN.
This whole process completed in 3 hours.

How to delete large amount of data from Oracle table using separated transactions?

I need to delete about 5 million records from an Oracle table.
Because of the performance impact (redo logs), I would like to remove 100,000 records per transaction, like this:
DECLARE
  v_limit PLS_INTEGER := 100000;

  CURSOR person_deleted_cur IS
    SELECT rowid AS row_id
      FROM Persons p
     WHERE City = 'ABC'
       AND NOT EXISTS (SELECT O_Id
                         FROM Orders o
                        WHERE p.P_Id = o.P_Id);

  TYPE person_deleted_ntt IS TABLE OF person_deleted_cur%ROWTYPE
    INDEX BY PLS_INTEGER;

  person_deleted_nt person_deleted_ntt;   -- collection variable of the table type
BEGIN
  OPEN person_deleted_cur;
  LOOP
    FETCH person_deleted_cur BULK COLLECT INTO person_deleted_nt LIMIT v_limit;

    FORALL indx IN 1 .. person_deleted_nt.COUNT
      DELETE FROM Persons WHERE rowid = person_deleted_nt(indx).row_id;

    COMMIT;   -- commit each batch of up to v_limit rows
    EXIT WHEN person_deleted_cur%NOTFOUND;
  END LOOP;
  CLOSE person_deleted_cur;
  COMMIT;
END;
But Liquibase runs a changeSet in one transaction and rolls it back if there are any errors. Is it a good habit to use COMMIT explicitly in Liquibase scripts?
What would a well-written script look like?
In the book "Oracle for professionals" Tom Kyte written about update in others transactions. The point is: if you can change table with one query then so do. Because one query will faster than differ transactions or plsql loop with partition delete.
Another a approach would be to use CREATE TABLE with NOLOGGING instead of UPDATE/DELETE. It is the best solution for a change many rows.
So create nologging table with your query, than to delete original table and recreate index, constraints and etc, than rename temp table to original table.
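A rough sketch of that for the Persons table in this question (the new table name is invented, and the WHERE clause is simply the negation of the delete condition):
CREATE TABLE persons_keep NOLOGGING AS
  SELECT p.*
    FROM Persons p
   WHERE NOT (    p.City = 'ABC'
              AND NOT EXISTS (SELECT 1 FROM Orders o WHERE o.P_Id = p.P_Id));
-- then drop Persons, rename persons_keep to Persons,
-- and re-create the indexes, constraints and grants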
Agree with #jimmbraddock, but a simpler solution that has a lower impact on an OLTP system might be to repeatedly run this query until it affects no more rows:
DELETE FROM Persons p
WHERE City = 'ABC'
AND NOT EXISTS
(SELECT O_Id
FROM Orders o
WHERE p.P_Id = o.P_Id)
AND ROWNUM <= 100000;
The total resource usage would be higher than a single delete, and thus a single delete would still be better if your system can accommodate it, but this would be pretty robust, and with an index on persons(city,p_id) and one on orders(p_id) it should be very performant.
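For reference, the supporting indexes mentioned there would be something like this (index names invented):
CREATE INDEX persons_city_pid_ix ON Persons (City, P_Id);
CREATE INDEX orders_pid_ix ON Orders (P_Id);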

Oracle MERGE deadlock

I want to insert rows with a MERGE statement in a specified order to avoid deadlocks. Deadlocks could otherwise happen because multiple transactions will call this statement with overlapping sets of keys. Note that this code is also sensitive to duplicate-value exceptions, but I handle that by retrying, so that is not my question. I was doing the following:
MERGE INTO targetTable t
USING (
  SELECT ...
  FROM sourceCollection
  ORDER BY <desiredUpdateOrder>
) s
ON (t.<key> = s.<key>)
WHEN MATCHED THEN
  UPDATE ...
WHEN NOT MATCHED THEN
  INSERT ...
Now I'm still getting the deadlock, so I'm becoming unsure whether Oracle maintains the order of the sub-query. Does anyone know how best to make sure that Oracle locks the rows in targetTable in the same order in this case? Do I have to do a SELECT FOR UPDATE before the merge? In which order does SELECT FOR UPDATE lock the rows? The Oracle UPDATE statement has an ORDER BY clause that MERGE seems to be missing. Is there another way to avoid deadlocks other than locking the rows in the same order every time?
[Edit]
This query is used to maintain a count of how often a certain action has taken place. When the action happens the first time a row is inserted, when it happens a second time the "count" column is incremented. There are millions of different actions and they happen very often. A table lock wouldn't work.
Controlling the order in which the target table rows are modified requires that you control the query execution plan of the USING subquery. That's a tricky business, and depends on what sort of execution plans your query is likely to be getting.
If you're getting deadlocks then I'd guess that you're getting a nested loop join from the source collection to the target table, as a hash join would probably be based on hashing the source collection and would modify the target table roughly in target-table rowid order because that would be full scanned -- in any case, the access order would be consistent across all of the query executions.
Likewise, if there was a sort-merge between the two data sets you'd get consistency in the order in which target table rows are accessed.
Ordering of the source collection seems desirable, but the optimiser might not be applying it, so check the execution plan. If it is not, then try inserting your data into a global temporary table using APPEND and with an ORDER BY clause, then selecting from there without an ORDER BY clause, and explore the use of hints to entrench a nested loop join.
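A very rough sketch of that staging idea, with invented table and column names (merge_src, id, delta) purely to show the shape; whether the optimiser actually honours the hints and the load order still has to be verified against the real execution plan:
CREATE GLOBAL TEMPORARY TABLE merge_src (id NUMBER, delta NUMBER)
  ON COMMIT PRESERVE ROWS;

INSERT /*+ APPEND */ INTO merge_src
  SELECT id, delta FROM sourceCollection ORDER BY id;
COMMIT;   -- a direct-path insert must be committed before the table can be read

MERGE /*+ LEADING(s) USE_NL(t) */ INTO targetTable t
USING (SELECT id, delta FROM merge_src) s
ON (t.id = s.id)
WHEN MATCHED THEN UPDATE SET t.cnt = t.cnt + s.delta
WHEN NOT MATCHED THEN INSERT (id, cnt) VALUES (s.id, s.delta);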
I don't believe the ORDER BY will affect anything (though I'm more than willing to be proven wrong); I think MERGE will lock everything it needs to.
Assume I'm completely wrong, assume that you get row-by-row locks with MERGE. Your problem still isn't solved as you have no guarantees that your two MERGE statements won't hit the same row simultaneously. In fact, from the information given, you have no guarantees that an ORDER BY improves the situation; it might make it worse.
Despite there being no 'skip locked rows' syntax for MERGE as there is for SELECT ... FOR UPDATE, there is still a simple answer: stop trying to update the same row from within different transactions. If feasible, you can use some form of parallel execution, for instance the DBMS_PARALLEL_EXECUTE subprogram CREATE_CHUNKS_BY_ROWID, and ensure that your transactions only work on a specific subset of the rows in the table.
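For illustration only, a bare-bones DBMS_PARALLEL_EXECUTE sketch that splits targetTable into disjoint rowid ranges; the task name, chunk size and the per-chunk statement (here a plain counter bump on a hypothetical cnt column, not the original MERGE logic) are all made up, and the statement must use the :start_id and :end_id binds:
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'bump_counts');
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'bump_counts',
    table_owner => USER,
    table_name  => 'TARGETTABLE',
    by_row      => TRUE,
    chunk_size  => 10000);
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'bump_counts',
    sql_stmt       => 'UPDATE targetTable SET cnt = cnt + 1
                        WHERE rowid BETWEEN :start_id AND :end_id',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);
  DBMS_PARALLEL_EXECUTE.DROP_TASK(task_name => 'bump_counts');
END;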
As an aside I'm a little worried by your description of the problem. You say there's some duplicate erroring that you fix by rerunning the MERGE. If the data in these duplicates is different you need to ensure that the ORDER BY is done not only on the data to be merged but the data being merged into. If you don't then there's no guarantee that you don't overwrite the correct data with older, incorrect, data.
First, locks are not really managed at row level but at block level, so you may encounter an ORA-00060 error even without modifying the same row. This can be tricky, and managing it is the application developer's job.
One possible workaround is to index-organize your table (never do that on huge tables or on tables with heavy change rates):
https://use-the-index-luke.com/sql/clustering/index-organized-clustered-index
Rather than doing a merge, I suggest that you try to lock the row: if that succeeds, update it; if not, insert a new row. By default the lock will wait if another process already has the row locked.
CREATE TABLE brianl.deleteme_table
(
id INTEGER PRIMARY KEY
, cnt INTEGER NOT NULL
);
CREATE OR REPLACE PROCEDURE brianl.deleteme_table_proc (
p_id IN deleteme_table.id%TYPE)
AUTHID DEFINER
AS
l_id deleteme_table.id%TYPE;
-- This isolates this procedure so that it doesn't commit
-- anything outside of the procedure.
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
-- select the row for update
-- this will pause if someone already has the row locked.
SELECT id
INTO l_id
FROM deleteme_table
WHERE id = p_id
FOR UPDATE;
-- Row was locked, update it.
UPDATE deleteme_table
SET cnt = cnt + 1
WHERE id = p_id;
COMMIT;
EXCEPTION
WHEN NO_DATA_FOUND
THEN
-- we were unable to lock the record, insert a new row
INSERT INTO deleteme_table (id, cnt)
VALUES (p_id, 1);
COMMIT;
END deleteme_table_proc;
CREATE OR REPLACE PROCEDURE brianl.deleteme_proc_test
AUTHID CURRENT_USER
AS
BEGIN
-- This resets the table to empty for the test
EXECUTE IMMEDIATE 'TRUNCATE TABLE brianl.deleteme_table';
brianl.deleteme_table_proc (p_id => 1);
brianl.deleteme_table_proc (p_id => 2);
brianl.deleteme_table_proc (p_id => 3);
brianl.deleteme_table_proc (p_id => 2);
FOR eachrec IN ( SELECT id, cnt
FROM brianl.deleteme_table
ORDER BY id)
LOOP
DBMS_OUTPUT.put_line (
a => 'id: ' || eachrec.id || ', cnt:' || eachrec.cnt);
END LOOP;
END;
BEGIN
-- runs the test;
brianl.deleteme_proc_test;
END;

Optimize Delete query with large number of data on oracle

I'm working on Oracle 9i. I have a table with 135,000,000 records, partitioned with each partition having approx. 10,000,000 rows, all indexed and everything.
I need to delete around 70,000,000 rows from it due to a new business requirement.
So I created a backup of the rows to be deleted as separate table.
Table1 <col1, col2........> -- main table (135,000,000 rows)
Table2 <col1, col2........> -- backup table (70,000,000 rows)
Tried the below delete query.
Delete from table1 t1 where exists (select 1 from table2 t2 where t2.col1 = t1.col1)
but it takes infinite hours.
then tried
declare
  cursor c1 is
    select col1 from table2;
  c2  c1%rowtype;
  cnt number;
begin
  cnt := 0;
  open c1;
  loop
    fetch c1 into c2;
    exit when c1%notfound;
    delete from table1 t1 where t1.col1 = c2.col1;
    if cnt >= 100000 then
      commit;
    end if;
    cnt := cnt + 1;
  end loop;
  close c1;
end;
Even so, it has been running for more than 12 hours and has still not completed.
Please note that there are multiple indexes on table1 and an index on col1 of table2. All the tables and indexes are analyzed.
Please advise if there is any way of optimizing for this scenario.
Thanks guys.
Drop all indexes (back up their CREATE statements)
Use the SELECT statement that was used to build the backup table, and create a DELETE command from it
Recreate all indexes
I remember facing this issue earlier. In that case, we resorted to doing this since it worked out faster than any other delete operation:
1) Create another table with identical structure
2) Insert into the new table the records you want to keep (use Direct path insert to speed this up)
3) Drop the old table
4) Rename the new table
You say that the table is partitioned. Is your intention to drop all the data in certain partitions? If so, you should be able to simply drop the 7 partitions that have the 70 million rows that you want to drop. I'm assuming, however, that your problem isn't that simple.
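If dropping whole partitions were an option, it would be roughly this per partition (the partition name is invented):
ALTER TABLE table1 DROP PARTITION p_obsolete UPDATE GLOBAL INDEXES;
-- UPDATE GLOBAL INDEXES keeps global indexes usable; without it they are marked UNUSABLE and must be rebuilt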
If you can do interim commits (which implies that you don't care about transactional consistency), the most efficient approach is likely something along the lines of:
CREATE TABLE rows_to_save
AS SELECT *
FROM table1
WHERE <<criteria to select the 65 million rows you want to keep>>;
TRUNCATE TABLE table1;
INSERT /*+ append */
INTO table1
SELECT *
FROM rows_to_save;
Barring that, rather than creating the backup table, it would be more efficient to simply issue the DELETE statement:
DELETE FROM table1
WHERE <<criteria to select the 70 million rows you want to delete>>;
You may also benefit from dropping or disabling the indexes and constraints before running the DELETE.
I'm going to answer this assuming that it is cheaper to filter against the backup table, but it would probably be cheaper to just use the negation of the criteria you used to populate the backup table.
1) create a new table with the same structure. No indexes, constraints, or triggers.
2)
select 'insert /*+ append nologging */ into new_table partition (' || n.partition_name || ') select * from old_table partition (' || o.partition_name || ') minus select * from bak_table partition (' || b.partition_name || ');'
from all_tab_partitions o, all_tab_partitions n, all_tab_partitions b
where o.partition_no = all( n.partition_no, b.partition_no)
and o.table_name = 'OLD_TABLE' and o.table_owner = 'OWNER'
and n.table_name = 'NEW_TABLE' and n.table_owner = 'OWNER'
and b.table_name = 'BAK_TABLE' and b.table_owner = 'OWNER';
-- note: I haven't run this; it may need minor corrections in addition to the obvious substitutions
3) Verify, and then run the result of the previous query
4) build the indexes, constraints, and triggers if needed
This avoids massive amounts of redo and undo compared to the delete.
The APPEND hint gives you direct-path inserts.
NOLOGGING further reduces redo - make sure you take a backup afterwards.
It takes advantage of your partitioning to break the work into chunks that can be sorted in fewer passes.
You could probably go faster with parallel insert + parallel select, but it is probably not necessary. Just don't do a parallel select without the insert and an "alter session enable parallel dml".
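A hedged sketch of that parallel variant for one of the generated per-partition statements (the partition name p1 and the degree of 4 are placeholders):
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(new_table, 4) */ INTO new_table PARTITION (p1)
  SELECT /*+ PARALLEL(o, 4) */ * FROM old_table PARTITION (p1) o
  MINUS
  SELECT * FROM bak_table PARTITION (p1);

COMMIT;
ALTER SESSION DISABLE PARALLEL DML;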