I have a table1 with
t1id NUMBER(10,0)
channel_id NUMBER(10,0)
and another table2 with columns
t1id NUMBER(10,0)
channel2_id NVARCHAR2(100 CHAR)
cl2_id NVARCHAR2(100 CHAR)
Having called the query
SELECT t1.t1id, t2.cl2_id
FROM table1 t1
LEFT JOIN table2 t2
ON t1.channel_id = t2.channel2_id;
I receive the error below when running the join. Is it due to the data types of the two columns? How do I resolve this?
01722. 00000 - "invalid number"
*Cause: The specified number was invalid.
*Action: Specify a valid number.
If you have a datatype mismatch, we will (silently) try to correct that mismatch, e.g.:
SQL> create table t ( num_col_stored_as_string varchar2(10));
Table created.
SQL>
SQL> insert into t values ('123');
1 row created.
SQL> insert into t values ('456');
1 row created.
SQL> insert into t values ('789');
1 row created.
SQL>
SQL> explain plan for
2 select * from t
3 where num_col_stored_as_string = 456;
Explained.
SQL>
SQL> select * from dbms_xplan.display();
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 7 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("NUM_COL_STORED_AS_STRING")=456)
Notice that we silently added a TO_NUMBER to the filter to ensure that the column matches the input value (456).
It then becomes obvious why this can cause problems:
SQL> insert into t values ('NOT A NUM');
1 row created.
SQL> select * from t
2 where num_col_stored_as_string = 456;
ERROR:
ORA-01722: invalid number
As others have suggested in comments, look at
- getting your datatypes aligned
- using TO_CHAR
- using VALIDATE_CONVERSION
but ideally, data type alignment is the way to go (the two workarounds are sketched below).
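A minimal sketch of those two workarounds, assuming the table and column names from the question. TO_CHAR moves the conversion to the NUMBER side, so no string value ever has to parse as a number; VALIDATE_CONVERSION (available from Oracle 12.2 onward) flags the rows that break the implicit TO_NUMBER:

SELECT t1.t1id, t2.cl2_id
FROM table1 t1
LEFT JOIN table2 t2
ON TO_CHAR(t1.channel_id) = t2.channel2_id;

-- find the offending rows first (12.2+); VALIDATE_CONVERSION returns 0
-- for values that cannot be converted to NUMBER
SELECT channel2_id
FROM table2
WHERE VALIDATE_CONVERSION(channel2_id AS NUMBER) = 0;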
I am facing an issue while enabling monitoring on indexes. I executed the following command to enable index monitoring, then checked for the entry in the v$object_usage view but couldn't find any record in it:
alter index REPORT.DY_SUM_DLY_SCH_TRN_DIV monitoring usage;
The output is: Index REPORT.DY_SUM_DLY_SCH_TRN_DIV altered.
Checking the entry in v$object_usage :
SELECT index_name,table_name,monitoring,used,start_monitoring,end_monitoring
FROM v$object_usage
The output returns no records.
How can I debug this issue?
If you are not seeing records in v$object_usage, it is for one of the following two reasons:
- The index is not used by any statement.
- You are not querying the view as the owner of the index.
Normally, the second one is the reason.
The V$OBJECT_USAGE view does not contain an OWNER column, so you must log on as the object owner to see the usage data. From Oracle 12.1 onward, the V$OBJECT_USAGE view has been deprecated in favour of the {DBA|USER}_OBJECT_USAGE views. The structure is the same, but the DBA_OBJECT_USAGE view includes an OWNER column.
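So from 12.1 onward you can check index usage for another schema without reconnecting. A minimal sketch, assuming a privileged account and the owner and index used in the demo below:

SELECT owner, index_name, table_name, monitoring, used,
       start_monitoring, end_monitoring
FROM dba_object_usage
WHERE owner = 'TEST1'
AND index_name = 'IDX_T1';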
Logged as SYS
SQL> drop table t1 ;
Table dropped.
SQL> create table t1 ( c1 number , c2 varchar2(20) ) ;
Table created.
SQL> create index idx_t1 on t1 ( c1 ) ;
Index created.
SQL> declare
2 begin
3 for h in 1 .. 10000
4 loop
5 insert into t1 values ( h , round(dbms_random.value(1,100000)) ) ;
6 end loop;
7 commit;
8* end;
SQL> /
PL/SQL procedure successfully completed.
SQL> select count(*) from t1 ;
COUNT(*)
----------
10000
SQL> analyze table t1 compute statistics ;
Table analyzed.
SQL> analyze index idx_t1 compute statistics ;
Index analyzed.
SQL> set autotrace traceonly explain
SQL> set lines 220 pages 600
SQL> select * from t1 where c1 = 1009 ;
Execution Plan
----------------------------------------------------------
Plan hash value: 3491035275
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 8 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T1 | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C1"=1009)
SQL> alter index idx_t1 monitoring usage ;
Index altered.
SQL> set autotrace off
SQL> select * from t1 where c1 = 1009 ;
C1 C2
---------- --------------------
1009 21639
SQL> select index_name from v$object_usage where index_name = 'IDX_T1' ;
no rows selected
Logged as Owner
SQL> conn test1/Oracle_12
Connected.
SQL> col index_name for a30
SQL> col table_name for a30
SQL>
SQL> select index_name,table_name,monitoring,used,start_monitoring,end_monitoring
  2  from v$object_usage where index_name = 'IDX_T1' ;
INDEX_NAME TABLE_NAME MON USE START_MONITORING END_MONITORING
------------------------------ ------------------------------ --- --- ------------------- -------------------
IDX_T1 T1 YES YES 09/22/2021 08:06:28
SQL>
I am trying to simulate this and to do that I have created the following procedure to insert a large number of rows:
create or replace PROCEDURE a_lot_of_rows is
i carte.cod%TYPE;
a carte.autor%TYPE := 'Author #';
t carte.titlu%TYPE := 'Book #';
p carte.pret%TYPE := 3.23;
e carte.nume_editura%TYPE := 'Penguin Random House';
begin
for i in 8..1000 loop
insert into carte
values (i, e, a || i, t || i, p, 'hardcover');
commit;
end loop;
for i in 1001..1200 loop
insert into carte
values (i, e, a || i, t || i, p, 'paperback');
commit;
end loop;
end;
I have created a bitmap index on the tip_coperta column (which can only have the values 'hardcover' and 'paperback') and then inserted 1200 more rows. However, the result given by the explain plan is the following (before the insert procedure, the table had 7 rows, of which 4 had tip_coperta = 'paperback'):
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 284 | 34 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 4 | 284 | 34 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
Motto: Bad Statistics are Worse than No Statistics
TL;DR: your statistics are stale and need to be recollected. When you create an index, the index statistics are gathered automatically, but not the table statistics, which are the relevant ones in your case.
Let's simulate your example with the following script, which creates the table and fills it with 1000 hardcovers and 200 paperbacks.
create table CARTE
(cod int,
autor VARCHAR2(100),
titlu VARCHAR2(100),
pret NUMBER,
nume_editura VARCHAR2(100),
tip_coperta VARCHAR2(100)
);
insert into CARTE
(cod,autor,titlu,pret,nume_editura,tip_coperta)
select rownum,
'Author #'||rownum ,
'Book #'||rownum,
3.23,
'Penguin Random House',
case when rownum <=1000 then 'hardcover'
else 'paperback' end
from dual connect by level <= 1200;
commit;
This leaves the new table without optimizer object statistics, which you can verify with the following query; it returns only NULLs:
select NUM_ROWS, LAST_ANALYZED from user_tables where table_name = 'CARTE';
So, let's check Oracle's impression of the table:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
The script above produces the execution plan for your query asking for paperbacks, and you can see that the Rows estimate is fine (= 200). How is this possible?
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 46800 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 46800 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
The explanation is in the Note section of the plan output: dynamic sampling was used.
Basically, while parsing the query, Oracle executes an additional query to estimate the number of rows matching the filter predicate.
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Dynamic sampling is fine for tables that are seldom used, but if the table is queried regularly, we need optimizer statistics to save the overhead of dynamic sampling.
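As a side note, if you want to see the optimizer's bare estimate without the sampling overhead, one option (a sketch using the documented DYNAMIC_SAMPLING hint; level 0 disables dynamic sampling for the statement) is:

explain plan for
select /*+ dynamic_sampling(0) */ *
from CARTE
where tip_coperta = 'paperback';

With neither statistics nor sampling, the optimizer falls back on heuristic defaults, so expect a much rougher Rows estimate.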
So let's collect statistics
exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'CARTE' );
Now you see that the statistics are gathered: the total number of rows is correct, and in the column statistics a frequency histogram has been created. This is important for estimating the count of records with a specific value!
select NUM_ROWS, LAST_ANALYZED from user_tables where table_name = 'CARTE';
NUM_ROWS LAST_ANALYZED
---------- -------------------
1200 09.01.2021 16:48:26
select NUM_DISTINCT,HISTOGRAM from user_tab_columns where table_name = 'CARTE' and column_name = 'TIP_COPERTA';
NUM_DISTINCT HISTOGRAM
------------ ---------------
2 FREQUENCY
Let's verify how the statistics work now in the execution plan:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
Basically, we see the same correct result:
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 12400 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 12400 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
Now we delete all but four of the 'paperback' rows from the table:
delete from CARTE
where tip_coperta = 'paperback' and cod > 1004;
commit;
select count(*) from CARTE
where tip_coperta = 'paperback'
COUNT(*)
----------
4
With this action the statistics went stale and give a wrong estimate based on obsolete data. This wrong estimate will persist until the statistics are recollected.
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 12400 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 12400 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Set up a policy that keeps your statistics up to date!
It is the table statistics that are important for this cardinality.
You will need to wait for the automatic stats gathering task to fire up and gather the new statistics, or you can do it yourself:
exec dbms_stats.gather_table_stats(null,'carte',method_opt=>'for all columns size 1 for columns size 254 TIP_COPERTA')
This will force a histogram on the TIP_COPERTA column and not on the others (you may wish to use "for all columns size skew" or "for all columns size auto", or even just let it default to whatever the preferred method_opt parameter is set to). Have a read of this article for details about this parameter.
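If you want the regular maintenance job to keep using that method_opt for this table, one option (a sketch using DBMS_STATS table preferences, which the automatic stats gathering task honours) is:

exec dbms_stats.set_table_prefs(null, 'CARTE', 'METHOD_OPT', 'for all columns size 1 for columns size 254 TIP_COPERTA')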
In some of the later versions of Oracle, depending on where you are running it, you may also have Real-Time Statistics, where Oracle keeps your statistics up to date even after conventional DML.
It's important to remember that cardinality estimates do not need to be completely accurate for you to obtain reasonable execution plans. A common rule of thumb is that it should be within an order of magnitude, and even then you will probably be fine most of the time.
To obtain an estimate for the number of rows, Oracle needs you to analyze the table (or index). When you create an index, there is an automatic analyze.
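You can see both effects in the data dictionary. A minimal sketch against the user views: the index row gets NUM_ROWS and LAST_ANALYZED at creation time, while the table row is populated only once table statistics are gathered.

select index_name, num_rows, last_analyzed from user_indexes where table_name = 'CARTE';
select table_name, num_rows, last_analyzed from user_tables where table_name = 'CARTE';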
I have a question regarding an update:
update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123 ;
so the staging table should update its records when there is an entry in the err table for error_id 999. The err table has 3 million records for error_id 999; the stg table has 13 million records in total for pid 123. Both tables are partitioned on pid. Err has an index on all 3 columns. This is running very slowly.
Please suggest any better approach for doing this. I have to do it in a PL/SQL block. Also let me know if more details are required. This job will run daily. There are no foreign keys.
You may try to transform your update into an inline view update; there is an example here: https://oracle-base.com/articles/misc/updates-based-on-queries
So I tried to reproduce a test case with the information you provided:
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> create table err (pid number not null, imp_id number not null, error_id number not null);
Table created.
SQL> create table stg (pid number not null, imp_id number not null, flg char(1) not null );
Table created.
SQL> insert into stg select 123, level, 'N' from dual connect by level <= 13000000;
13000000 rows created.
SQL> commit;
Commit complete.
SQL> insert into err select 123, level, 999 from dual connect by level <= 3000000;
3000000 rows created.
SQL> commit;
Commit complete.
SQL> set timing on
SQL> update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123;
3000000 rows updated.
Elapsed: 00:00:13.63
SQL> rollback;
Rollback complete.
Elapsed: 00:00:09.71
SQL> create unique index udx_err on err (pid, imp_id, error_id);
Index created.
Elapsed: 00:00:06.21
SQL> create index idx_stg on stg (pid, imp_id);
Index created.
Elapsed: 00:00:23.21
SQL> exec dbms_stats.gather_table_stats( user, 'ERR' );
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.11
SQL> exec dbms_stats.gather_table_stats( user, 'STG' );
PL/SQL procedure successfully completed.
Elapsed: 00:00:09.44
SQL> update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123;
3000000 rows updated.
Elapsed: 00:00:13.84
SQL> explain plan for update stg set flg = 'Y' where exists (
select 1
from err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999 )
and pid = 123;
Explained.
Elapsed: 00:00:00.01
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display(format => 'ALL'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 728722196
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 3002K| 74M| | 29231 (1)| 00:00:02 |
| 1 | UPDATE | STG | | | | | |
|* 2 | HASH JOIN RIGHT SEMI| | 3002K| 74M| 74M| 29231 (1)| 00:00:02 |
|* 3 | TABLE ACCESS FULL | ERR | 2999K| 40M| | 2146 (2)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | STG | 12M| 148M| | 8547 (2)| 00:00:01 |
--------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$3FF8579E
3 - SEL$3FF8579E / ERR#SEL$1
4 - SEL$3FF8579E / STG#UPD$1
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ERR"."PID"="STG"."PID" AND "ERR"."IMP_ID"="STG"."IMP_ID")
3 - filter("ERR"."ERROR_ID"=999 AND "ERR"."PID"=123)
4 - filter("PID"=123)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
2 - (#keys=2; upd=4; cmp=1,2) "STG"."PID"[NUMBER,22], "STG"."IMP_ID"[NUMBER,22], "STG".ROWID[ROWID,10], "FLG"[CHARACTER,1]
3 - (rowset=256) "ERR"."PID"[NUMBER,22], "ERR"."IMP_ID"[NUMBER,22], "ERR"."ERROR_ID"[NUMBER,22]
4 - "STG".ROWID[ROWID,10], "PID"[NUMBER,22], "STG"."IMP_ID"[NUMBER,22], "FLG"[CHARACTER,1]
35 rows selected.
Elapsed: 00:00:00.08
Using inline view now:
SQL> update (select stg.flg
from stg, err
where err.pid = stg.pid
and err.imp_id = stg.imp_id
and err.error_id = 999
and stg.pid = 123
)
set flg = 'Y';
3000000 rows updated.
Elapsed: 00:00:05.69
SQL>
And it runs faster...
My advice:
- check your statistics (a quick staleness check is sketched below)
- check your indexes: are they well defined?
- check your columns' data types
- check whether columns should be declared NOT NULL
Your execution plan should not use indexes here, as that would run forever...
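For the first point, a quick way to see whether the optimizer statistics are stale (a sketch against the user dictionary views; the STALE_STATS column is available from 11g onward):

select table_name, num_rows, last_analyzed, stale_stats
from user_tab_statistics
where table_name in ('STG', 'ERR');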
I am currently trying to prove something out, to see whether adding an index is necessary.
If I have an index on columns A,B,C and I create a query that in the where clause is only explicitly utilizing A and C, will I get the benefit of the index?
In this scenario imagine the where clause is like this:
A = 'Q' AND (B is not null OR B is null) AND C='G'
I investigated this in Oracle using EXPLAIN PLAN and it doesn't seem to use the index. Also, from my understanding of how indexes are created and used, it won't be able to benefit because the index can't leverage column B when the condition on it doesn't narrow anything down.
Currently looking at this in either MSSQL or ORACLE. Not sure if one optimizes differently than the other.
Any advice is appreciated! Thank you!
Connected to Oracle Database 12c Enterprise Edition Release 12.1.0.2.0
SQL> create table t$ (a integer not null, b integer, c integer, d varchar2(100 char));
Table created
SQL> insert into t$ select rownum, rownum, rownum, lpad('0', '1', 100) from dual connect by level <= 1000000;
1000000 rows inserted
SQL> create index t$i on t$(a, b, c);
Index created
SQL> analyze table t$ estimate statistics;
Table analyzed
SQL> explain plan for select * from t$ where a = 128 and c = 128;
Explained
SQL> select * from table(dbms_xplan.display());
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3274478018
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4 (0)
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T$ | 1 | 13 | 4 (0)
|* 2 | INDEX RANGE SCAN | T$I | 1 | | 3 (0)
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"=128 AND "C"=128)
filter("C"=128)
15 rows selected
Any questions?
If you look at the B+tree structure of the index, the answer is as follows:
The left-hand columns of the index, up to and including the first inequality, go into the Seek Predicate; the remaining conditions end up as a residual Predicate in the query plan.
For example, read http://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys
I have a simple SQL query on Oracle 10g. I want to know the difference between these queries:
select * from employee where id = 123 and
to_char(start_date, 'yyyyMMdd') >= '2013101' and
to_char(end_date, 'yyyyMMdd') <= '20121231';
select * from employee where id = 123 and
start_date >= to_date('2013101', 'yyyyMMdd') and
end_date <= to_date('20121231', 'yyyyMMdd');
Questions:
1. Are these queries the same? start_date and end_date are indexed date columns.
2. Does one perform better than the other?
Please let me know. Thanks.
The latter is almost certain to be faster:
- It avoids data type conversions on the column values.
- Oracle will estimate the number of possible values between two dates much better than between two strings that merely represent dates.
Note that neither query will return any rows, as the lower limit (a 2013 date) is higher than the upper limit (a 2012 date) according to the numbers you've given. Also, you've missed a digit in '2013101'.
One of the biggest flaws of converting, casting, or wrapping columns in expressions (e.g. NVL, COALESCE, etc.) in the WHERE clause is that the CBO will not be able to use an index on that column. I slightly modified your example to show the difference:
SQL> create table t_test as
2 select * from all_objects;
Table created
SQL> create index T_TEST_INDX1 on T_TEST(CREATED, LAST_DDL_TIME);
Index created
Created table and index for our experiment.
SQL> execute dbms_stats.set_table_stats(ownname => 'SCOTT',
tabname => 'T_TEST',
numrows => 100000,
numblks => 10000);
PL/SQL procedure successfully completed
We make the CBO think that our table is a big one.
SQL> explain plan for
2 select *
3 from t_test tt
4 where tt.owner = 'SCOTT'
5 and to_char(tt.last_ddl_time, 'yyyyMMdd') >= '20130101'
6 and to_char(tt.created, 'yyyyMMdd') <= '20121231';
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2796558804
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 300 | 2713 (1)| 00:00:33 |
|* 1 | TABLE ACCESS FULL| T_TEST | 3 | 300 | 2713 (1)| 00:00:33 |
----------------------------------------------------------------------------
A full table scan is used, which would be costly on a big table.
SQL> explain plan for
2 select *
3 from t_test tt
4 where tt.owner = 'SCOTT'
5 and tt.last_ddl_time >= to_date('20130101', 'yyyyMMdd')
6 and tt.created <= to_date('20121231', 'yyyyMMdd');
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1868991173
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 300 | 4 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| T_TEST | 3 | 300 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T_TEST_INDX1 | 8 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
See, now it's an index range scan and the cost is significantly lower.
SQL> drop table t_test;
Table dropped
Finally, clean up.
For output (display) purposes, use TO_CHAR.
For date handling (insert, update, compare, etc.), use TO_DATE.
I don't have a performance link to share, but the TO_DATE version of the above query should run faster.
With TO_CHAR, the date column must first be converted to a string for every row before the comparison can be made, so there is a small performance loss. With TO_DATE, there is no per-row cast: the literal is converted once and the column is compared directly as a date.
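To make the rule of thumb concrete, here is a minimal sketch (the emp table, its hire_date column, and the index on hire_date are hypothetical, purely for illustration):

-- sargable: the literal is converted once, so an index on hire_date can be used
select * from emp where hire_date >= to_date('20130101', 'yyyymmdd');

-- not sargable: hire_date is converted to a string for every row,
-- so an index on hire_date cannot be used for this predicate
select * from emp where to_char(hire_date, 'yyyymmdd') >= '20130101';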