Update taking time in SQL

I have a question regarding an update:
update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123 ;
The staging table should update its records when a matching entry exists in the err table for error_id 999. The err table has 3 million rows for error_id 999; stg has 13 million rows in total for pid 123. Both tables are partitioned on pid. Err has an index on all three columns. This is running very slowly.
Please suggest a better approach for doing this. I have to do this in a PL/SQL block. Also let me know if more details are required. This job will run daily. There are no foreign keys.

You may try to transform your update into an inline view update; look for an example here: https://oracle-base.com/articles/misc/updates-based-on-queries
So I tried to reproduce a test case with the information you provided:
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> create table err (pid number not null, imp_id number not null, error_id number not null);
Table created.
SQL> create table stg (pid number not null, imp_id number not null, flg char(1) not null );
Table created.
SQL> insert into stg select 123, level, 'N' from dual connect by level <= 13000000;
13000000 rows created.
SQL> commit;
Commit complete.
SQL> insert into err select 123, level, 999 from dual connect by level <= 3000000;
3000000 rows created.
SQL> commit;
Commit complete.
SQL> set timing on
SQL> update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123;
3000000 rows updated.
Elapsed: 00:00:13.63
SQL> rollback;
Rollback complete.
Elapsed: 00:00:09.71
SQL> create unique index udx_err on err (pid, imp_id, error_id);
Index created.
Elapsed: 00:00:06.21
SQL> create index idx_stg on stg (pid, imp_id);
Index created.
Elapsed: 00:00:23.21
SQL> exec dbms_stats.gather_table_stats( user, 'ERR' );
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.11
SQL> exec dbms_stats.gather_table_stats( user, 'STG' );
PL/SQL procedure successfully completed.
Elapsed: 00:00:09.44
SQL> update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123;
3000000 rows updated.
Elapsed: 00:00:13.84
SQL> explain plan for update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123;
Explained.
Elapsed: 00:00:00.01
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display(format => 'ALL'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 728722196
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 3002K| 74M| | 29231 (1)| 00:00:02 |
| 1 | UPDATE | STG | | | | | |
|* 2 | HASH JOIN RIGHT SEMI| | 3002K| 74M| 74M| 29231 (1)| 00:00:02 |
|* 3 | TABLE ACCESS FULL | ERR | 2999K| 40M| | 2146 (2)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | STG | 12M| 148M| | 8547 (2)| 00:00:01 |
--------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$3FF8579E
3 - SEL$3FF8579E / ERR#SEL$1
4 - SEL$3FF8579E / STG#UPD$1
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ERR"."PID"="STG"."PID" AND "ERR"."IMP_ID"="STG"."IMP_ID")
3 - filter("ERR"."ERROR_ID"=999 AND "ERR"."PID"=123)
4 - filter("PID"=123)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
2 - (#keys=2; upd=4; cmp=1,2) "STG"."PID"[NUMBER,22], "STG"."IMP_ID"[NUMBER,22], "STG".ROWID[ROWID,10], "FLG"[CHARACTER,1]
3 - (rowset=256) "ERR"."PID"[NUMBER,22], "ERR"."IMP_ID"[NUMBER,22], "ERR"."ERROR_ID"[NUMBER,22]
4 - "STG".ROWID[ROWID,10], "PID"[NUMBER,22], "STG"."IMP_ID"[NUMBER,22], "FLG"[CHARACTER,1]
35 rows selected.
Elapsed: 00:00:00.08
Using an inline view now:
SQL> update (select stg.flg
from stg, err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999
and stg.pid = 123
)
set flg = 'Y';
3000000 rows updated.
Elapsed: 00:00:05.69
SQL>
And it runs faster...
My advice:
- check your statistics
- check your indexes: are they well defined?
- check your columns' data types
- check whether columns should be declared NOT NULL
Your execution plan should not use indexes here; with row counts like these, an index-driven plan would run forever...
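For the first point, a quick sketch of how to check statistics freshness (assuming you can query USER_TAB_STATISTICS; STALE_STATS is populated from 11g onward, and the table names are taken from the test case above):
select table_name, num_rows, last_analyzed, stale_stats
from user_tab_statistics
where table_name in ('STG', 'ERR');
If LAST_ANALYZED is old or STALE_STATS is YES, regather with DBMS_STATS before judging the plan.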

Related

01722. 00000 - "invalid number" : Oracle

I have a table1 with
t1id NUMBER(10,0)
channel_id NUMBER(10,0)
and another table2 with columns
t1id NUMBER(10,0)
channel2_id NVARCHAR2(100 CHAR)
cl2_id NVARCHAR2(100 CHAR)
Having called the query
SELECT t1.id, t2.cl2_id
FROM table1 t1
LEFT JOIN table2 t2
ON t1.channel_id = t2.channel2_id;
I receive the below error when running the join. Is it due to the data types of the two columns? How do I resolve this?
01722. 00000 - "invalid number"
*Cause: The specified number was invalid.
*Action: Specify a valid number.
If you have a datatype mismatch, Oracle will (silently) try to correct that mismatch, e.g.:
SQL> create table t ( num_col_stored_as_string varchar2(10));
Table created.
SQL>
SQL> insert into t values ('123');
1 row created.
SQL> insert into t values ('456');
1 row created.
SQL> insert into t values ('789');
1 row created.
SQL>
SQL> explain plan for
2 select * from t
3 where num_col_stored_as_string = 456;
Explained.
SQL>
SQL> select * from dbms_xplan.display();
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 7 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("NUM_COL_STORED_AS_STRING")=456)
Notice that Oracle silently added a TO_NUMBER to the filter to ensure that the column matched the input value (456).
It then becomes obvious why this can cause problems, because:
SQL> insert into t values ('NOT A NUM');
1 row created.
SQL> select * from t
2 where num_col_stored_as_string = 456;
ERROR:
ORA-01722: invalid number
As others have suggested in the comments, look at:
- getting your data types aligned
- using TO_CHAR
- using VALIDATE_CONVERSION
but ideally, data type alignment is the way to go.
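For illustration, hedged sketches of the last two options against the test table above (VALIDATE_CONVERSION requires Oracle 12.2 or later):
-- compare as strings, so no implicit TO_NUMBER is applied to the column
select * from t where num_col_stored_as_string = to_char(456);
-- find the rows that would break an implicit TO_NUMBER
select * from t where validate_conversion(num_col_stored_as_string as number) = 0;
The second query is a way to locate the offending rows (such as 'NOT A NUM') before deciding how to clean them up.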

how to obtain execution plan from a PLSQL

I have a code snippet like :
declare
cursor c is select ....
begin
for i in c loop
update t1 set ... where t1.c1 = i.column ...
end loop;
end;
I am wondering how to obtain an execution plan for the update statement, since it involves a condition between a table column and a cursor column.
Can anyone give me a hint, please?
Thank you.
The simplest way might be to capture the plan while it's executing, by noting the sql_id and sql_child_number from v$session, and running
select plan_table_output
from table(dbms_xplan.display_cursor(sql_id, sql_child_number));
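To find those two values while the block is running, you could run something like this from another session (the username filter is an assumption; adjust it to identify your session):
select sid, sql_id, sql_child_number
from v$session
where status = 'ACTIVE'
and username = 'YOUR_USER';
Then plug the resulting sql_id and sql_child_number into the dbms_xplan.display_cursor call above.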
Alternatively, you can embed the call in the PL/SQL for testing purposes and have it output the plan using dbms_output.put_line. (The plan will be the same for every loop iteration, unless you want to check the actual row counts etc, so I have added an exit; statement to make it stop after one execution.)
declare
cursor c is select * from departments;
begin
for r in c loop
update employees e set e.email = e.email
where e.department_id = r.department_id;
for p in (
select p.plan_table_output
from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')) p
)
loop
dbms_output.put_line(p.plan_table_output);
end loop;
rollback;
exit;
end loop;
end;
Which in my 19c demo HR schema gives:
SQL_ID b5xbwbu4kd3r4, child number 1
-------------------------------------
UPDATE EMPLOYEES E SET E.EMAIL = E.EMAIL WHERE E.DEPARTMENT_ID = :B1
Plan hash value: 4075606039
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 1 | | 0 |00:00:00.01 | 4 |
| 1 | UPDATE | EMPLOYEES | 1 | | 0 |00:00:00.01 | 4 |
|* 2 | INDEX RANGE SCAN| EMP_DEPARTMENT_IX | 1 | 1 | 1 |00:00:00.01 | 1 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("E"."DEPARTMENT_ID"=:B1)
To get the actual row counts as displayed above you would need to
alter session set statistics_level = all;
or else include the /*+ gather_plan_statistics */ hint in the update.
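For instance, a hinted version of the update from the block above (same logic, only the hint added) would be:
update /*+ gather_plan_statistics */ employees e
set e.email = e.email
where e.department_id = r.department_id;
The hint makes Oracle collect the per-operation row counts and timings for just this statement, without changing the session-wide statistics_level.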
That's just an ordinary UPDATE statement. As it is in a loop, it executes once per row returned by the cursor (though each execution might update multiple rows, depending on the UPDATE's own WHERE clause).
Therefore, you'd just "extract" it from the loop and explain it, alone, e.g. if it is
cursor c is
select id, job from emp_2;
begin
for i in c loop
update emp e set e.job = i.job where e.id = i.id;
end loop;
end;
Suppose one of the rows returned by the cursor has id = 7369, job = 'CLERK'; then
explain plan for update emp e set e.job = 'CLERK' where e.id = 7369;
select * from plan_table;

How to debug the issue of index monitoring wherein index entry is not found in table v$object_usage?

I am facing an issue while enabling monitoring on indexes. I executed the following command to enable index monitoring and then checked for an entry in the v$object_usage view, but couldn't find any record in it:
alter index REPORT.DY_SUM_DLY_SCH_TRN_DIV monitoring usage;
The output is: Index REPORT.DY_SUM_DLY_SCH_TRN_DIV altered.
Checking the entry in v$object_usage :
SELECT index_name,table_name,monitoring,used,start_monitoring,end_monitoring
FROM v$object_usage
The output returns no records.
How can I debug this issue?
If you are not seeing records in v$object_usage, it is for one of the following two reasons:
- The index has not been used by any statement.
- You are not querying the view as the owner of the index.
Normally, the second is the reason.
The V$OBJECT_USAGE view does not contain an OWNER column, so you must log on as the object owner to see the usage data. From Oracle 12.1 onward, the V$OBJECT_USAGE view has been deprecated in favour of the {DBA|USER}_OBJECT_USAGE views. The structure is the same, but the DBA_OBJECT_USAGE view includes an OWNER column.
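So on 12.1 and later, given access to the DBA views, you can check usage for any owner without reconnecting; for the index in the question that would be something like:
select owner, index_name, table_name, monitoring, used,
       start_monitoring, end_monitoring
from dba_object_usage
where owner = 'REPORT'
and index_name = 'DY_SUM_DLY_SCH_TRN_DIV';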
Logged as SYS
SQL> drop table t1 ;
Table dropped.
SQL> create table t1 ( c1 number , c2 varchar2(20) ) ;
Table created.
SQL> create index idx_t1 on t1 ( c1 ) ;
Index created.
SQL> declare
2 begin
3 for h in 1 .. 10000
4 loop
5 insert into t1 values ( h , round(dbms_random.value(1,100000)) ) ;
6 end loop;
7 commit;
8* end;
SQL> /
PL/SQL procedure successfully completed.
SQL> select count(*) from t1 ;
COUNT(*)
----------
10000
SQL> analyze table t1 compute statistics ;
Table analyzed.
SQL> analyze index idx_t1 compute statistics ;
Index analyzed.
SQL> set autotrace traceonly explain
SQL> set lines 220 pages 600
SQL> select * from t1 where c1 = 1009 ;
Execution Plan
----------------------------------------------------------
Plan hash value: 3491035275
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 8 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T1 | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C1"=1009)
SQL> alter index idx_t1 monitoring usage ;
Index altered.
SQL> set autotrace off
SQL> select * from t1 where c1 = 1009 ;
C1 C2
---------- --------------------
1009 21639
SQL> select index_name from v$object_usage where index_name = 'IDX_T1' ;
no rows selected
Logged as Owner
SQL> conn test1/Oracle_12
Connected.
SQL> col index_name for a30
SQL> col table_name for a30
SQL>
SQL> select index_name,table_name,monitoring,used,start_monitoring,end_monitoring
from v$object_usage where index_name = 'IDX_T1' ;
INDEX_NAME TABLE_NAME MON USE START_MONITORING END_MONITORING
------------------------------ ------------------------------ --- --- ------------------- -------------------
IDX_T1 T1 YES YES 09/22/2021 08:06:28
SQL>

Why does explain plan show the wrong number of rows?

I am trying to simulate this and to do that I have created the following procedure to insert a large number of rows:
create or replace PROCEDURE a_lot_of_rows is
i carte.cod%TYPE;
a carte.autor%TYPE := 'Author #';
t carte.titlu%TYPE := 'Book #';
p carte.pret%TYPE := 3.23;
e carte.nume_editura%TYPE := 'Penguin Random House';
begin
for i in 8..1000 loop
insert into carte
values (i, e, a || i, t || i, p, 'hardcover');
commit;
end loop;
for i in 1001..1200 loop
insert into carte
values (i, e, a || i, t || i, p, 'paperback');
commit;
end loop;
end;
I have created a bitmap index on the tip_coperta column (which can only have the values 'hardcover' and 'paperback') and then inserted 1200 more rows. However, the result given by the explain plan is the following (before the insert procedure, the table had 7 rows, of which 4 had the tip_coperta = 'paperback'):
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 284 | 34 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 4 | 284 | 34 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
Motto: bad statistics are worse than no statistics.
TL;DR: your statistics are stale and need to be regathered. When you create an index, the index statistics are gathered automatically, but not the table statistics, which are the ones relevant to your case.
Let's simulate your example with the following script, which creates the table and fills it with 1000 hardcovers and 200 paperbacks.
create table CARTE
(cod int,
autor VARCHAR2(100),
titlu VARCHAR2(100),
pret NUMBER,
nume_editura VARCHAR2(100),
tip_coperta VARCHAR2(100)
);
insert into CARTE
(cod,autor,titlu,pret,nume_editura,tip_coperta)
select rownum,
'Author #'||rownum ,
'Book #'||rownum,
3.23,
'Penguin Random Number',
case when rownum <=1000 then 'hardcover'
else 'paperback' end
from dual connect by level <= 1200;
commit;
This leaves the new table without optimizer object statistics, which you can verify with the following query; it returns only NULLs:
select NUM_ROWS, LAST_ANALYZED from user_tables where table_name = 'CARTE';
So, let's check what Oracle's impression of the table is:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
The script above produces the execution plan for your query asking for paperbacks, and you see that the Rows estimate is correct (200). How is this possible?
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 46800 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 46800 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
The explanation is in the Note section of the plan output: dynamic sampling was used.
Basically, while parsing the query, Oracle executes an additional sampling query to estimate the number of rows matching the filter predicate.
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Dynamic sampling is fine for tables that are used seldom, but if the table is queried regularly, we need optimizer statistics to save the overhead of dynamic sampling.
So let's collect statistics
exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'CARTE' );
Now you can see that the statistics are gathered, the total number of rows is correct, and a frequency histogram was created in the column statistics - this is important for estimating the count of records with a specific value!
select NUM_ROWS, LAST_ANALYZED from user_tables where table_name = 'CARTE';
NUM_ROWS LAST_ANALYZED
---------- -------------------
1200 09.01.2021 16:48:26
select NUM_DISTINCT,HISTOGRAM from user_tab_columns where table_name = 'CARTE' and column_name = 'TIP_COPERTA';
NUM_DISTINCT HISTOGRAM
------------ ---------------
2 FREQUENCY
Let's verify how the statistics work now in the execution plan:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
Basically we see the same correct result
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 12400 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 12400 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
Now we delete all but four of the 'paperback' rows from the table:
delete from CARTE
where tip_coperta = 'paperback' and cod > 1004;
commit;
select count(*) from CARTE
where tip_coperta = 'paperback'
COUNT(*)
----------
4
With this action the statistics went stale and give a wrong estimate based on obsolete data. This wrong estimate will persist until the statistics are regathered.
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 12400 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 12400 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Set up a policy that keeps your statistics up to date!
It is the table statistics that matter for this cardinality.
You will need to wait until the automatic stats gathering task fires up and gathers the new statistics, or you can do it yourself:
exec dbms_stats.gather_table_stats(null,'carte',method_opt=>'for all columns size 1 for columns size 254 TIP_COPERTA')
This will force a histogram on the TIP_COPERTA column and not on the others (you may wish to use for all columns size skew, for all columns size auto, or just let it default to whatever the preferred method_opt parameter is set to). Have a read of this article for details about this parameter.
In some of the later versions of Oracle, depending on where you are running it, you may also have Real-Time Statistics. This is where Oracle will be keeping your statistics up to date even after conventional DML.
It's important to remember that cardinality estimates do not need to be completely accurate for you to obtain reasonable execution plans. A common rule of thumb is that it should be within an order of magnitude, and even then you will probably be fine most of the time.
To obtain an estimate for the number of rows, Oracle needs you to analyze the table (or index). When you create an index, there is an automatic analyze.

Is there any DB server that can optimize the following query?

Let's say I have the table my_table(id int not null primary key, datafield varchar(100)). The query
SELECT * from my_table where id = 100
performs an index seek. If I change it to
SELECT * from my_table where id + 1 = 101
it scans the whole index (index scan), at least in SQL Server and MySQL. Is there any DB server that 'understands' that id + 1 = 101 is the same as id = 101 - 1? I realize it's not a typical database operation and the server doesn't have to perform any math in such cases, but I wonder if it's implemented anywhere?
Thanks
UPDATE
So far I've tried SQL Server 2008 Enterprise and MySQL 5.1/5.5. SQL Server shows a clustered index seek and a clustered index scan, respectively. MySQL EXPLAIN shows ref: const, key: PRIMARY, rows: 1 for the first query, and ref: null, key: null, rows: (total number of rows in the table) for the second.
id +1 = 101 is the same as id = 101-1
No it isn't. What if the +1 overflows the id?
I tried this with PostgreSQL 9.0 and it does not use an index unless I create one on (id - 1).
So with the following index definition
create index idx_minus on my_table ( (id - 1) );
PostgreSQL uses an index for the query
select *
from my_table
where id - 1 = 12345
Interesting.
You can add Oracle Release 10.2.0.1.0 to your list (not able to rewrite the query).
create table t(
id
,x
,padding
,primary key (id)
) as
select rownum as id
,'x' as x
,lpad('x', 100, 'x') as padding
from dual
connect by level <= 50000;
Query 1.
select id
from t
where id = 100 + 1;
----------------------------------------+
| Id | Operation | Name |
-----------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX UNIQUE SCAN| SYS_C006659 |
-----------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ID"=101)
Query 2.
select id
from t
where id + 1 = 101;
--------------------------------------------
| Id | Operation | Name |
--------------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX FAST FULL SCAN| SYS_C006659 |
--------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"+1=101)
Query 3.
select x
from t
where id + 1 = 101;
------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
|* 1 | TABLE ACCESS FULL| T | 1 |
------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"+1=101)
Why not just do this instead (assuming you don't want the server to do the math for calculating the actual ID you're looking for)?
SELECT * FROM my_table WHERE id = (101 - 1)