I have the following query, and in the FROM clause there is a left join with ga, followed by other tables.
Should we use the LEFT JOIN keyword for all the other tables after the ga table, or can we leave them as they are in the query? Are there any performance issues with this query?
query:
from
a#db_link st left join (Select a,b,c,d
from b#db_link where id = 'AD' and num = 4) ga
on st.compensationdate = ga.compensationdate
and st.salestransactionseq = ga.salestransactionseq ,
b#db_link ta,
c#db_link cr,
d#db_link crd_typ,
e#db_link evt_typ,
f#db_link disputes
where st.salestransactionseq = ta.salestransactionseq
and st.id = 'AD'
This is the query plan:
Plan hash value: 3767304471
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | Inst |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE | | 1 | 661 | 342 (1)| 00:00:01 | | | |
| 1 | NESTED LOOPS | | 1 | 661 | 342 (1)| 00:00:01 | | | |
| 2 | NESTED LOOPS | | 1 | 661 | 342 (1)| 00:00:01 | | | |
| 3 | NESTED LOOPS | | 1 | 612 | 342 (1)| 00:00:01 | | | |
| 4 | NESTED LOOPS | | 1 | 564 | 342 (1)| 00:00:01 | | | |
|* 5 | HASH JOIN | | 1 | 549 | 342 (1)| 00:00:01 | | | |
| 6 | NESTED LOOPS | | 1 | 503 | 0 (0)| 00:00:01 | | | |
| 7 | NESTED LOOPS | | 1 | 503 | 0 (0)| 00:00:01 | | | |
| 8 | NESTED LOOPS | | 1 | 450 | 0 (0)| 00:00:01 | | | |
| 9 | NESTED LOOPS | | 1 | 407 | 0 (0)| 00:00:01 | | | |
| 10 | PARTITION RANGE SINGLE | | 1 | 217 | 0 (0)| 00:00:01 | 1357 | 1357 | |
|* 11 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| CS_SALESTRANSACTION | 1 | 217 | 0 (0)| 00:00:01 | 1357 | 1357 | PRD121 |
|* 12 | INDEX RANGE SCAN | CS_SALESTRANSACTION_PK | 1 | | 0 (0)| 00:00:01 | 1357 | 1357 | PRD121 |
| 13 | PARTITION RANGE SINGLE | | 1 | 190 | 0 (0)| 00:00:01 | 1356 | 1356 | |
| 14 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| CS_TRANSACTIONASSIGNMENT | 1 | 190 | 0 (0)| 00:00:01 | 1356 | 1356 | PRD121 |
|* 15 | INDEX RANGE SCAN | CS_TRANSACTIONASSIGNMENT_PK | 1 | | 0 (0)| 00:00:01 | 1356 | 1356 | PRD121 |
|* 16 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | CS_GASALESTRANSACTION | 1 | 43 | 0 (0)| 00:00:01 | ROWID | ROWID | PRD121 |
|* 17 | INDEX RANGE SCAN | GASALESTRANSACTION_IDX | 3 | | 0 (0)| 00:00:01 | | | PRD121 |
| 18 | PARTITION RANGE SINGLE | | 1 | | 2 (0)| 00:00:01 | 8 | 8 | |
| 19 | PARTITION LIST ALL | | 1 | | 2 (0)| 00:00:01 | 1 | 268 | |
|* 20 | INDEX RANGE SCAN | OD_CREDIT_UTVALUE | 1 | | 2 (0)| 00:00:01 | 1347 | 1614 | PRD121 |
|* 21 | TABLE ACCESS BY LOCAL INDEX ROWID | CS_CREDIT | 1 | 53 | 3 (0)| 00:00:01 | 1 | 1 | PRD121 |
| 22 | TABLE ACCESS FULL | ADTV_FRS_DISPUTES | 27011 | 1213K| 341 (0)| 00:00:01 | | | PRD121 |
|* 23 | TABLE ACCESS BY INDEX ROWID | ADTV_FRS_CONTROL | 1 | 15 | 1 (0)| 00:00:01 | | | PRD121 |
|* 24 | INDEX UNIQUE SCAN | ADTV_FRS_CONTROL_PK | 1 | | 0 (0)| 00:00:01 | | | PRD121 |
| 25 | PARTITION LIST SINGLE | | 1 | 48 | 1 (0)| 00:00:01 | 2 | 2 | |
|* 26 | TABLE ACCESS BY LOCAL INDEX ROWID | CS_EVENTTYPE | 1 | 48 | 1 (0)| 00:00:01 | 2 | 2 | PRD121 |
|* 27 | INDEX UNIQUE SCAN | CS_EVENTTYPE_PK | 1 | | 0 (0)| 00:00:01 | 2 | 2 | PRD121 |
| 28 | PARTITION LIST SINGLE | | 1 | | 0 (0)| 00:00:01 | 2 | 2 | |
|* 29 | INDEX UNIQUE SCAN | CS_CREDITTYPE_PK | 1 | | 0 (0)| 00:00:01 | 2 | 2 | PRD121 |
|* 30 | TABLE ACCESS BY LOCAL INDEX ROWID | CS_CREDITTYPE | 1 | 49 | 1 (0)| 00:00:01 | 2 | 2 | PRD121 |
------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("GENERICATTRIBUTE13"="A1"."ACTIVITY_ID" AND "LINENUMBER"="A1"."ITEM_ID")
11 - filter("SUBLINENUMBER"<>2)
12 - access("TENANTID"='ADTV' AND "PROCESSINGUNITSEQ"=3.82805968326498E16)
15 - access("TENANTID"='ADTV' AND "PROCESSINGUNITSEQ"=3.82805968326498E16 AND "COMPENSATIONDATE"="COMPENSATIONDATE" AND
"SALESTRANSACTIONSEQ"="SALESTRANSACTIONSEQ")
16 - filter("COMPENSATIONDATE"="COMPENSATIONDATE" AND "PAGENUMBER"=4)
17 - access("SALESTRANSACTIONSEQ"="SALESTRANSACTIONSEQ" AND "TENANTID"='ADTV')
20 - access("TENANTID"='ADTV' AND "PROCESSINGUNITSEQ"=3.82805968326498E16)
21 - filter("SALESTRANSACTIONSEQ"="SALESTRANSACTIONSEQ" AND "SALESORDERSEQ"="SALESORDERSEQ")
23 - filter(UPPER("A8"."STATUS")='NEW')
24 - access("A1"."CASE_NO"="A8"."CASE_NO")
26 - filter("EVENTTYPEID"='PROTECTIONPLAN CHARGEBACK' OR "EVENTTYPEID"='PROTECTIONPLAN CHARGEBACK-FRS' OR "EVENTTYPEID"='PROTECTIONPLAN
INCENTIVE' OR "EVENTTYPEID"='PROTECTIONPLAN INCENTIVE-FRS' OR "EVENTTYPEID"='PROTECTIONPLAN KICKER' OR "EVENTTYPEID"='PROTECTIONPLAN KICKER-FRS' OR
"EVENTTYPEID"='UNIVERSAL BILLER' OR "EVENTTYPEID"='UNIVERSAL BILLER-FRS' OR "EVENTTYPEID"='WORK ORDER' OR "EVENTTYPEID"='WORK ORDER-FRS')
27 - access("TENANTID"='ADTV' AND "EVENTTYPESEQ"="DATATYPESEQ" AND "REMOVEDATE"=TO_DATE(' 2200-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
29 - access("TENANTID"='ADTV' AND "CREDITTYPESEQ"="DATATYPESEQ" AND "REMOVEDATE"=TO_DATE(' 2200-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
30 - filter("CREDITTYPEID"="A1"."CREDIT_TYPE" OR "CREDITTYPEID" LIKE "A1"."CREDIT_TYPE"||'%FRS')
Note
-----
- fully remote statement
- dynamic statistics used: dynamic sampling (level=7)
LEFT OUTER JOIN, to give it its full name, is a two part thing
OUTER JOIN is a special case where "if the join fails, permit the solid side table/resultset to exist in the output and fill the partial side with NULLs"
The LEFT is a direction to the database as to which side shall be considered "solid". All the rows from the solid side are present at least once.
Absent any parentheses or sub queries driving execution direction:
SELECT *
FROM
a
LEFT OUTER JOIN b ON ...
a is on the left, so a is the solid side. All rows from a will be present. Columns from b will hold b's values where the join predicates matched, or NULL where they matched no rows
Once this is done this whole resultset of "a and b, nulls, warts and all" will become "the left side" for subsequent joins
SELECT *
FROM
a
LEFT OUTER JOIN b
some_kind_of JOIN c
Is effectively the same as:
SELECT *
FROM
(
SELECT *
FROM
a
LEFT OUTER JOIN b
) newLeft
some_kind_of JOIN c
Remember, the OUTER specifier permits the join to fail and still keeps the declared solid side rows
Whether you can use INNER or LEFT/RIGHT OUTER to join c in depends on what you're joining it to
If you're joining it to, say, a column from a then it could be fine to use INNER or OUTER - you'd use whatever you'd use if b wasn't even in the picture.
Will the join from a to c fail sometimes and you still want the rows from a? Use an OUTER.
Will it never fail, or do you not want any rows that do fail? Use an INNER.
However, if you're joining it to a column that was provided by b, then you will probably want to use an OUTER join, otherwise there was no point in making the query do a left outer join to b. Rows from a that found no match will have NULL in every column from b, and those are exactly the rows you wanted to keep. If you then INNER JOIN c to some column from b that is NULL because the join failed, the row will disappear from the output. Nothing is ever equal to NULL, so the INNER JOIN to c can never match on the NULL in the column from b. In effect, the INNER JOIN undoes all the good work the OUTER join to b did in keeping a's data
Doing
a
LEFT JOIN b ON a.b_id = b.id
LEFT JOIN c ON b.c_id = c.id
allows those rows from a-join-b where b.c_id is null (because the join failed) to stay in the output (because it's an outer join to c, not an inner one).
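To see the difference in practice, here is a minimal sketch that runs both join chains against an in-memory SQLite database from Python. SQLite only stands in for Oracle here; the tables a, b, c and their rows are made up for illustration, and only the join semantics (which are standard SQL) carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, c_id INTEGER);
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER);
    INSERT INTO c VALUES (100, 'c100');
    INSERT INTO b VALUES (10, 100);
    -- a.id = 1 has a matching b row; a.id = 2 points at a b row that does not exist
    INSERT INTO a VALUES (1, 10), (2, 99);
""")

# INNER JOIN to c on a column of b: the NULL-extended row for a.id = 2 is lost
inner_rows = conn.execute("""
    SELECT a.id, b.id, c.label
    FROM a
    LEFT JOIN b ON a.b_id = b.id
    INNER JOIN c ON b.c_id = c.id
    ORDER BY a.id
""").fetchall()

# LEFT JOIN to c on a column of b: the row for a.id = 2 survives with NULLs
left_rows = conn.execute("""
    SELECT a.id, b.id, c.label
    FROM a
    LEFT JOIN b ON a.b_id = b.id
    LEFT JOIN c ON b.c_id = c.id
    ORDER BY a.id
""").fetchall()

print("inner join to c:", inner_rows)
print("left join to c: ", left_rows)
```

The first query returns only the row for a.id = 1, while the second also keeps a.id = 2 with NULL in the columns from b and c.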
Generally we inner join everything we can, then switch to left joining everything else because it makes the queries easier to follow. In that "if c is being inner joined to a" scenario we would perhaps:
a
INNER JOIN c on a.c_id = c.id
LEFT JOIN b on a.b_id = b.id
Rather than:
a
LEFT JOIN b on a.b_id = b.id
INNER JOIN c on a.c_id = c.id
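When both b and c are joined to columns of a, the two orderings return the same rows, so the choice is about readability. The sketch below checks this on a small in-memory SQLite database from Python; SQLite is a stand-in for Oracle, and the tables and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY);
    CREATE TABLE c (id INTEGER PRIMARY KEY);
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER, c_id INTEGER);
    INSERT INTO b VALUES (10);
    INSERT INTO c VALUES (100);
    -- a.id = 2 has no matching b row; a.id = 3 has no matching c row
    INSERT INTO a VALUES (1, 10, 100), (2, 99, 100), (3, 10, 999);
""")

# Inner join first, then the left join
inner_first = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    INNER JOIN c ON a.c_id = c.id
    LEFT JOIN b ON a.b_id = b.id
    ORDER BY a.id
""").fetchall()

# Left join first, then the inner join (still joined to a column of a)
left_first = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON a.b_id = b.id
    INNER JOIN c ON a.c_id = c.id
    ORDER BY a.id
""").fetchall()

print(inner_first)
print(left_first)
```

Both queries drop a.id = 3 (no matching c) and keep a.id = 2 with a NULL for b, because the inner join condition only looks at a's own column.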
If a table is being joined to a table that was left joined, left join it too. Avoid RIGHT join because it goes against the evaluation direction of SQL and makes things harder to reason about; any time you think about using a right join, turn it around and rewrite it as a left.
Don't forget to use sub queries too. If you want every a joined to b which is joined to c only if both b and c sides match, it's probably clearest to:
a
LEFT JOIN (
SELECT * FROM b INNER JOIN c ON b.c_id = c.id
) b_and_c
Try to see your SQL as developing some growing-wider-with-every-join resultset that, at every join, becomes the new left side
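The sub query form can be compared with a plain chain of left joins using the sketch below. It uses Python with an in-memory SQLite database as a stand-in for Oracle; the tables, rows and the explicit column aliases (b_id, c_id) are assumptions made for the example, since SELECT * in the sub query would expose two columns called id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, c_id INTEGER);
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER);
    INSERT INTO c VALUES (100);
    -- b.id = 20 points at a c row that does not exist
    INSERT INTO b VALUES (10, 100), (20, 999);
    -- a.id = 3 points at a b row that does not exist
    INSERT INTO a VALUES (1, 10), (2, 20), (3, 99);
""")

# b and c are only attached to a when BOTH of them match
sub_query_rows = conn.execute("""
    SELECT a.id, b_and_c.b_id, b_and_c.c_id
    FROM a
    LEFT JOIN (
        SELECT b.id AS b_id, c.id AS c_id
        FROM b INNER JOIN c ON b.c_id = c.id
    ) b_and_c ON a.b_id = b_and_c.b_id
    ORDER BY a.id
""").fetchall()

# Chained left joins: b is attached even when c is missing
chained_rows = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON a.b_id = b.id
    LEFT JOIN c ON b.c_id = c.id
    ORDER BY a.id
""").fetchall()

print(sub_query_rows)
print(chained_rows)
```

Every row of a is present in both results. They differ for a.id = 2: the chained version shows b.id = 20 with a NULL c, while the sub query version shows NULL for both because b 20 has no c.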
I have a query something like:
WITH
str_table as (
SELECT stringtext, stringnumberid
FROM STRING_TABLE
WHERE LANGID IN (23,62)
),
data as (
select *
from employee emp
left outer join str_table st on emp.nameid = st.stringnumberid
)
select * from data
I know the WITH clause works in this manner:
Step 1: The SQL query within the WITH clause is executed first.
Step 2: The output of that query is stored in a temporary relation.
Step 3: The main query is executed against the temporary relation produced in the previous step.
Now I want to ask: will the indexes created on the actual STRING_TABLE help with the temporary str_table produced by the WITH clause, or do they have no impact on it?
Oracle will not necessarily process CTEs one by one. It analyzes the SQL statement as a whole. In the eyes of the Oracle optimizer, your SQL is most likely the same as the following (the LANGID filter sits in the ON clause; in the WHERE clause it would discard the employees without a match and turn the outer join into an inner join):
select emp.*, st.stringtext, st.stringnumberid
from employee emp left outer join STRING_TABLE st
on emp.nameid = st.stringnumberid
and st.LANGID IN (23,62);
Oracle can use an index on STRING_TABLE. Whether it does depends on the table statistics. For example, if the table has few rows (say a few hundred), Oracle will likely not use the index.
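One caveat when inlining the CTE by hand: the LANGID filter belongs to the outer-joined side, so it has to sit in the ON clause rather than the WHERE clause to keep the LEFT JOIN semantics of the original query. The sketch below checks this with Python and an in-memory SQLite database standing in for Oracle; the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE string_table (stringnumberid INTEGER, langid INTEGER, stringtext TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, nameid INTEGER);
    INSERT INTO string_table VALUES (500, 23, 'Alice'), (501, 7, 'Bob');
    -- employee 2 only has a string in another language; employee 3 has none at all
    INSERT INTO employee VALUES (1, 500), (2, 501), (3, 502);
""")

# Original shape: the filter lives inside the CTE
cte_rows = conn.execute("""
    WITH str_table AS (
        SELECT stringtext, stringnumberid
        FROM string_table
        WHERE langid IN (23, 62)
    )
    SELECT emp.id, st.stringtext
    FROM employee emp
    LEFT OUTER JOIN str_table st ON emp.nameid = st.stringnumberid
    ORDER BY emp.id
""").fetchall()

# Inlined with the filter in the ON clause: same result as the CTE
on_rows = conn.execute("""
    SELECT emp.id, st.stringtext
    FROM employee emp
    LEFT OUTER JOIN string_table st
        ON emp.nameid = st.stringnumberid AND st.langid IN (23, 62)
    ORDER BY emp.id
""").fetchall()

# Inlined with the filter in the WHERE clause: unmatched employees disappear
where_rows = conn.execute("""
    SELECT emp.id, st.stringtext
    FROM employee emp
    LEFT OUTER JOIN string_table st ON emp.nameid = st.stringnumberid
    WHERE st.langid IN (23, 62)
    ORDER BY emp.id
""").fetchall()

print(cte_rows)
print(on_rows)
print(where_rows)
```

The CTE and ON-clause versions keep all three employees, while the WHERE-clause version returns only employee 1.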
It depends.
First of all, a WITH clause is not a temporary table. As the documentation says:
Oracle Database optimizes the query by treating the query name as either an inline view or as a temporary table.
The optimizer decides to materialize the WITH subquery if you either force it to by using the /*+ materialize */ hint inside the subquery or reference the subquery more than once.
In the example below Oracle uses with clause as inline view and merges it within the main query:
explain plan for
with a as (
select
s.textid,
s.textvalue,
a.id,
a.other_column
from string_table s
join another_tab a
on s.textid = a.textid
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
and b.job_textid = a_name.id
| PLAN_TABLE_OUTPUT |
| :----------------------------------------------------------------------------------- |
| Plan hash value: 1854049435 |
| |
| ------------------------------------------------------------------------------------ |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ------------------------------------------------------------------------------------ |
| | 0 | SELECT STATEMENT | | 1 | 1147 | 74 (0)| 00:00:01 | |
| |* 1 | HASH JOIN | | 1 | 1147 | 74 (0)| 00:00:01 | |
| | 2 | TABLE ACCESS FULL | ANOTHER_TAB | 39 | 3042 | 3 (0)| 00:00:01 | |
| |* 3 | HASH JOIN | | 31 | 33139 | 71 (0)| 00:00:01 | |
| | 4 | TABLE ACCESS FULL| BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| |* 5 | TABLE ACCESS FULL| STRING_TABLE | 1143 | 589K| 68 (0)| 00:00:01 | |
| ------------------------------------------------------------------------------------ |
But depending on the statistics and hints, it may evaluate the subquery first and then join it to the main query:
explain plan for
with a as (
select
s.textid,
s.textvalue,
a.id,
a.other_column
from string_table s
join another_tab a
on s.textid = a.textid
where langid in (1)
)
select /*+NO_MERGE(a_name)*/ *
from big_table b
join a a_name
on b.name_textid = a_name.textid
and b.job_textid = a_name.id
| PLAN_TABLE_OUTPUT |
| :------------------------------------------------------------------------------------ |
| Plan hash value: 4105667421 |
| |
| ------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 101 | 110K| 74 (0)| 00:00:01 | |
| |* 1 | HASH JOIN | | 101 | 110K| 74 (0)| 00:00:01 | |
| | 2 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| | 3 | VIEW | | 64 | 37120 | 71 (0)| 00:00:01 | |
| |* 4 | HASH JOIN | | 64 | 38784 | 71 (0)| 00:00:01 | |
| | 5 | TABLE ACCESS FULL| ANOTHER_TAB | 39 | 3042 | 3 (0)| 00:00:01 | |
| |* 6 | TABLE ACCESS FULL| STRING_TABLE | 1143 | 589K| 68 (0)| 00:00:01 | |
| ------------------------------------------------------------------------------------- |
When you use the WITH subquery twice, the optimizer decides to materialize it:
explain plan for
with a as (
select
s.textid,
s.textvalue
from string_table s
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
join a a_job
on b.job_textid = a_job.textid
| PLAN_TABLE_OUTPUT |
| :--------------------------------------------------------------------------------------------------------------------- |
| Plan hash value: 1371454296 |
| |
| ---------------------------------------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ---------------------------------------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 63 | 98973 | 67 (0)| 00:00:01 | |
| | 1 | TEMP TABLE TRANSFORMATION | | | | | | |
| | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D7224_469C01 | | | | | |
| | 3 | TABLE ACCESS BY INDEX ROWID BATCHED | STRING_TABLE | 999 | 515K| 22 (0)| 00:00:01 | |
| |* 4 | INDEX RANGE SCAN | IX | 999 | | 4 (0)| 00:00:01 | |
| |* 5 | HASH JOIN | | 63 | 98973 | 45 (0)| 00:00:01 | |
| |* 6 | HASH JOIN | | 35 | 36960 | 24 (0)| 00:00:01 | |
| | 7 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| | 8 | VIEW | | 999 | 502K| 21 (0)| 00:00:01 | |
| | 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D7224_469C01 | 999 | 502K| 21 (0)| 00:00:01 | |
| | 10 | VIEW | | 999 | 502K| 21 (0)| 00:00:01 | |
| | 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D7224_469C01 | 999 | 502K| 21 (0)| 00:00:01 | |
| ---------------------------------------------------------------------------------------------------------------------- |
So when there are indexes on the tables inside the WITH subquery, they may be used in all of the above cases: before materialization, when the subquery is not merged, and when the subquery is merged and some indexes provide a better plan for the merged query (even when those indexes are not used when you execute the subquery alone).
As for the indexes: if they provide high selectivity (i.e. the number of rows retrieved through the index is small compared to the overall number of rows), then Oracle will consider using them. Note that index access has two steps: read the index blocks, then read the table blocks that contain the rowids found in the index. If the table is not much bigger than the index, Oracle may use a table scan instead of an index scan even for quite a selective predicate (because of the doubled I/O).
In the below example I've used "small" texts (100 chars) and big_table table of 20 rows and this index for text table:
create index ix
on string_table(langid, textid)
The optimizer decides to use an index range scan that uses only the leading column of the index (LANGID) as the access predicate:
explain plan for
with a as (
select
s.textid,
s.textvalue
from string_table s
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
| PLAN_TABLE_OUTPUT |
| :---------------------------------------------------------------------------------------------------- |
| Plan hash value: 1660330381 |
| |
| ----------------------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ----------------------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 29 | 31001 | 26 (0)| 00:00:01 | |
| |* 1 | HASH JOIN | | 29 | 31001 | 26 (0)| 00:00:01 | |
| | 2 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| STRING_TABLE | 999 | 515K| 23 (0)| 00:00:01 | |
| |* 4 | INDEX RANGE SCAN | IX | 999 | | 4 (0)| 00:00:01 | |
| ----------------------------------------------------------------------------------------------------- |
| |
| Predicate Information (identified by operation id): |
| --------------------------------------------------- |
| |
| 1 - access("B"."NAME_TEXTID"="S"."TEXTID") |
| 4 - access("LANGID"=1) |
But when we reduce the number of rows in big_table, it uses both the columns for index scan:
delete from big_table
where id > 4
explain plan for
with a as (
select
s.textid,
s.textvalue
from string_table s
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
| PLAN_TABLE_OUTPUT |
| :-------------------------------------------------------------------------------------------- |
| Plan hash value: 1766926914 |
| |
| --------------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| --------------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 6 | 18216 | 11 (0)| 00:00:01 | |
| | 1 | NESTED LOOPS | | 6 | 18216 | 11 (0)| 00:00:01 | |
| | 2 | NESTED LOOPS | | 6 | 18216 | 11 (0)| 00:00:01 | |
| | 3 | TABLE ACCESS FULL | BIG_TABLE | 4 | 4032 | 3 (0)| 00:00:01 | |
| |* 4 | INDEX RANGE SCAN | IX | 1 | | 1 (0)| 00:00:01 | |
| | 5 | TABLE ACCESS BY INDEX ROWID| STRING_TABLE | 2 | 4056 | 2 (0)| 00:00:01 | |
| --------------------------------------------------------------------------------------------- |
| |
| Predicate Information (identified by operation id): |
| --------------------------------------------------- |
| |
| 4 - access("LANGID"=1 AND "B"."NAME_TEXTID"="S"."TEXTID") |
| |
You may check above code snippets in the db<>fiddle.
I have a view on which I apply some filters to retrieve data. The query that retrieves the data takes a long time. The explain plan is provided below, along with the query and its access info. I need to retrieve this data quickly (within 30 seconds), but the query runs for more than 15 minutes and then times out without returning data. Any idea how we can retrieve the data quickly?
View definition as below:
CREATE VIEW DQ_DB.DQM_RESULT_VIEW
AS SELECT
res.ACTIVE_FL AS ACTIVE_FL,
res.VERSION as VERSION,
res.rule_constituents_tx,
nvl(ruletable.rule_desc,'N/A') AS rule_ds,
nvl(res.effective_dt, TO_DATE('31-dec-9999','dd-mon-yyyy')) AS effective_dt,
nvl(res.rule_id,'N/A') AS rule_id,
res.audit_update_ts AS rule_processed_at,
res.load_dt,
res.vendor_group_key,
nvl(res.vendor_entity_key,'N/A') AS vendor_entity_key,
res.vendor_entity_producer_nm,
(SELECT category_value_tx FROM dq_db.category_lookup_view WHERE category_nm = 'RESULT_STATUS_NB' AND category_value_cd = res.result_status_nb ) AS result,
--catlkp.category_value_tx as result,
res.entity_type,
nvl(rgrp.grp_nm,'N/A') AS rule_category,
nvl(ruletable.rule_nm,'N/A') AS rule_nm,
feedsumm.feed_run_nm AS file_nm,
res.application_id AS application,
res.data_source_id AS datasource,
res.entity_nm,
res.rule_entity_effective_dt,
res.result_id,
dim.dimension_nm,
dim.sub_dimension_nm,
ruletable.execution_env AS execution_env,
ruletable.ops_action AS ops_action,
rulefunctiontable.func_nm AS rule_func_nm,
-- nvl2(res.primary_dco_sid,dq_db.get_dco_name(res.primary_dco_sid),null) AS dco_primary,
-- nvl2(res.delegate_dco_sid,dq_db.get_dco_name(res.delegate_dco_sid),null) AS dco_delegate,
res.primary_dco_sid AS dco_primary,
res.delegate_dco_sid AS dco_delegate,
ruletable.data_concept_id AS data_concept_id,
res.latest_result_fl as latest_result_fl,
res.batch_execution_ts as batch_execution_ts
FROM
dq_db.dqm_result res
--LEFT OUTER JOIN dq_db.category_lookup_view catlkp on (catlkp.category_nm = 'RESULT_STATUS_NB' AND catlkp.category_value_cd = res.result_status_nb)
LEFT OUTER JOIN dq_db.feed_run_summary feedsumm ON res.vendor_group_key = feedsumm.batch_id
LEFT OUTER JOIN dq_db.dqm_rule ruletable ON res.rule_id = ruletable.rule_id
LEFT OUTER JOIN dq_db.dqm_rule_grp rgrp ON ruletable.rule_grp_id = rgrp.rule_grp_id
LEFT OUTER JOIN dq_db.dqm_rule_function rulefunctiontable ON ruletable.func_id = rulefunctiontable.func_id
LEFT OUTER JOIN dq_db.dq_dimension_view dim ON dim.dimension_id = ruletable.dimension_id
Explain plan of query used:
select * from ( select count(resultview0_.RULE_CATEGORY) as col_0_0_,
resultview0_.RULE_CATEGORY as col_1_0_ from DQ_DB.DQM_RESULT_VIEW
resultview0_ where (resultview0_.LATEST_RESULT_FL like :1 ) and
resultview0_.APPLICATION=:2 and (resultview0_.DATASOURCE in (:3 )) and
resultview0_.EFFECTIVE_DT>=:4 and resultview0_.EFFECTIVE_DT<=:5 and
resultview0_.LOAD_DT>=:6 and resultview0_.LOAD_DT<=:7 and
(resultview0_.RESULT in (:8 , :9 )) group by
resultview0_.RULE_CATEGORY ) where rownum <= :10
Plan hash value: 722164065
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 746K(100)| | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | VIEW | | 592 | 155K| 746K (1)| 02:29:24 | | |
|* 3 | SORT GROUP BY STOPKEY | | 592 | 222K| 746K (1)| 02:29:24 | | |
| 4 | NESTED LOOPS | | 1 | 102 | 4 (0)| 00:00:01 | | |
| 5 | NESTED LOOPS | | 1 | 102 | 4 (0)| 00:00:01 | | |
|* 6 | TABLE ACCESS FULL | DATA_LOOKUP_VALUE | 1 | 51 | 3 (0)| 00:00:01 | | |
|* 7 | INDEX UNIQUE SCAN | PK_DATA_LOOKUP_CATEGORY | 1 | | 0 (0)| | | |
|* 8 | TABLE ACCESS BY INDEX ROWID | DATA_LOOKUP_CATEGORY | 1 | 51 | 1 (0)| 00:00:01 | | |
|* 9 | VIEW | DQM_RESULT_VIEW | 592 | 222K| 746K (1)| 02:29:24 | | |
|* 10 | FILTER | | | | | | | |
|* 11 | HASH JOIN OUTER | | 592 | 287K| 746K (1)| 02:29:24 | | |
|* 12 | HASH JOIN RIGHT OUTER | | 592 | 259K| 746K (1)| 02:29:16 | | |
| 13 | VIEW | index$_join$_009 | 39 | 3783 | 2 (0)| 00:00:01 | | |
|* 14 | HASH JOIN | | | | | | | |
| 15 | INDEX FAST FULL SCAN | PK_DQM_RULE_GRP | 39 | 3783 | 1 (0)| 00:00:01 | | |
| 16 | INDEX FAST FULL SCAN | UK_DQM_RULE_GRP | 39 | 3783 | 1 (0)| 00:00:01 | | |
|* 17 | HASH JOIN RIGHT OUTER | | 592 | 202K| 746K (1)| 02:29:16 | | |
| 18 | VIEW | DQ_DIMENSION_VIEW | 28 | 224 | 2 (0)| 00:00:01 | | |
| 19 | NESTED LOOPS OUTER | | 28 | 840 | 2 (0)| 00:00:01 | | |
|* 20 | HASH JOIN OUTER | | 28 | 616 | 2 (0)| 00:00:01 | | |
| 21 | INDEX FULL SCAN | PK_DQM_FW_DQ_DIM | 28 | 224 | 1 (0)| 00:00:01 | | |
| 22 | INDEX FULL SCAN | PK_DQM_FW_DQ_DIM_HRCHY | 21 | 294 | 1 (0)| 00:00:01 | | |
|* 23 | INDEX UNIQUE SCAN | PK_DQM_FW_DQ_DIM | 1 | 8 | 0 (0)| | | |
|* 24 | HASH JOIN RIGHT OUTER | | 592 | 198K| 746K (1)| 02:29:16 | | |
| 25 | TABLE ACCESS FULL | DQM_RULE | 451 | 37884 | 16 (0)| 00:00:01 | | |
| 26 | PARTITION RANGE ITERATOR | | 592 | 149K| 746K (1)| 02:29:16 | KEY | KEY |
|* 27 | TABLE ACCESS BY LOCAL INDEX ROWID| DQM_RESULT | 592 | 149K| 746K (1)| 02:29:16 | KEY | KEY |
|* 28 | INDEX SKIP SCAN | IDX_PK_DQM_RESULT | 379K| | 373K (1)| 01:14:42 | KEY | KEY |
|* 29 | INDEX FAST FULL SCAN | INDEX_BATCH_ID_RUN_SMRY | 149K| 7158K| 637 (1)| 00:00:08 | | |
---------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=:10)
3 - filter(ROWNUM<=:10)
6 - filter(TO_NUMBER("VAL"."CATEGORY_VALUE_CD")=:B1)
7 - access("CAT"."CATEGORY_ID"="VAL"."CATEGORY_ID")
8 - filter("CAT"."CATEGORY_NM"='RESULT_STATUS_NB')
9 - filter(("RESULTVIEW0_"."RESULT"=:8 OR "RESULTVIEW0_"."RESULT"=:9))
10 - filter((:5>=:4 AND :7>=:6))
11 - access("RES"."VENDOR_GROUP_KEY"="FEEDSUMM"."BATCH_ID")
12 - access("RULETABLE"."RULE_GRP_ID"="RGRP"."RULE_GRP_ID")
14 - access(ROWID=ROWID)
17 - access("DIM"."DIMENSION_ID"="RULETABLE"."DIMENSION_ID")
20 - access("SUB_DIM"."SUB_DIMENSION_ID"="DIM"."DIMENSION_ID")
23 - access("DIM1"."DIMENSION_ID"="SUB_DIM"."DIMENSION_ID")
24 - access("RES"."RULE_ID"="RULETABLE"."RULE_ID")
27 - filter(NVL("RES"."LATEST_RESULT_FL",U'Y') LIKE SYS_OP_C2C(:1))
28 - access("RES"."LOAD_DT">=:6 AND "RES"."APPLICATION_ID"=SYS_OP_C2C(:2) AND "RES"."DATA_SOURCE_ID"=SYS_OP_C2C(:3) AND
"RES"."EFFECTIVE_DT">=:4 AND "RES"."LOAD_DT"<=:7 AND "RES"."EFFECTIVE_DT"<=:5)
filter(("RES"."EFFECTIVE_DT">=:4 AND "RES"."DATA_SOURCE_ID"=SYS_OP_C2C(:3) AND "RES"."APPLICATION_ID"=SYS_OP_C2C(:2)
AND "RES"."EFFECTIVE_DT"<=:5))
29 - filter("FEEDSUMM"."BATCH_ID" IS NOT NULL)
I have different indexes on DQM_RESULT table as below.
IDX_RULE_ID --> {RULE_ID}
IDX_PK_DQM_RESULT --> {LOAD_DT, APPLICATION_ID, DATA_SOURCE_ID, EFFECTIVE_DT, RESULT_ID}
IDX_EFF_DT_VENDOR_KEY --> {EFFECTIVE_DT, VENDOR_ENTITY_KEY}
INDEX_VENDOR_GROUP_KEY --> {VENDOR_GROUP_KEY}
IDX_EFFDT_APPDS_RUL_EID --> {LOAD_DT, APPLICATION_ID, DATA_SOURCE_ID, EFFECTIVE_DT, RULE_ID, VENDOR_ENTITY_KEY, LATEST_RESULT_FL, RESULT_ID}
DQM_RESULT Table is partitioned on LOAD_DT column and each load date contains around 15 data sources. Each data source loads around 1.5 million rows of data to each load date partition.
Change the order of the columns in this index to have the most selective columns first, or create another index with only the selective columns:
IDX_PK_DQM_RESULT --> {LOAD_DT, APPLICATION_ID, DATA_SOURCE_ID, EFFECTIVE_DT, RESULT_ID}
According to the execution plan, these operations are responsible for most of the time of the query:
|* 27 | TABLE ACCESS BY LOCAL INDEX ROWID| DQM_RESULT | 592 | 149K| 746K (1)| 02:29:16 | KEY | KEY |
|* 28 | INDEX SKIP SCAN | IDX_PK_DQM_RESULT | 379K| | 373K (1)| 01:14:42 | KEY | KEY |
Skip scans require an index access for each distinct value of the initial columns, which in this case is LOAD_DT. That column might be in some sort of anti-Goldilocks zone, where it's too distinct to be useful for a skip scan, but not distinct enough to be useful for a range scan.
If the above suggestion doesn't help, you should gather more data. The explain plan only shows the guesses about what the optimizer will do. Use the below code to generate an execution plan, which will show both the estimates and the actual values. Edit your question and post the results and you may get better answers.
--Run the query with this hint to generate extra statistics.
select /*+ gather_plan_statistics */ ... your query here ...;
--Find the SQL_ID for your statement.
select sql_id, sql_text from gv$sql where lower(sql_text) like '%gather_plan_statistics%';
--Generate execution plan.
select * from table(dbms_xplan.display_cursor(sql_id => 'SQL_ID from above', format => 'allstats last'));
I'm having a hard time optimising a sql query that takes about 1 min to complete. Here is the query :
SELECT mpg.ID_PROD_GARN, mpg.NO_PROD
FROM ACO.prime p
JOIN ACO.facture_compt fc
ON fc.id_factr = p.id_factr
JOIN ACO.V_CONTRAT vc
ON vc.NO_POLC = p.NO_POLC
JOIN ACO.MV_PRODUIT mp
ON mp.NO_PROD =vc.NO_PROD
JOIN ACO.MV_PROD_GARN mpg
ON mpg.NO_PROD = mp.NO_PROD
WHERE p.id_prime =
( SELECT MAX(id_prime) AS prime FROM ACO.prime p WHERE p.no_polc='T3167978')
AND mpg.ID_PROD_GARN = '1238'
AND fc.cd_stat_factr = 'comp';
V_CONTRAT is a view and (if my understanding is correct) when joining the view in this way SQL is running through all the rows to find the result. I did a bit of research and found that indexing this view could speed up my query. So:
CREATE INDEX indx_no_produit ON ACO.V_CONTRAT(NO_PROD);
Unfortunately I get an error saying that I can't index a view: ORA-01702: a view is not appropriate here.
*Cause: Among other possible causes, this message will be produced if an
attempt was made to define an Editioning View over a view.
*Action: An Editioning View may only be created over a base table.
So my question is how could I speed up this query elegantly?
Many thanks in advance!
Edit 1 : here is the explained plan
Plan hash value: 3107129748
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 102 | | 543K (1)| 00:00:22 |
| 1 | NESTED LOOPS | | 1 | 102 | | 543K (1)| 00:00:22 |
| 2 | NESTED LOOPS | | 1 | 90 | | 543K (1)| 00:00:22 |
| 3 | NESTED LOOPS | | 1 | 35 | | 3 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 14 | | 2 (0)| 00:00:01 |
| 5 | MAT_VIEW ACCESS BY INDEX ROWID| MV_PROD_GARN | 1 | 9 | | 1 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | PK_PROG | 1 | | | 1 (0)| 00:00:01 |
| 7 | BITMAP CONVERSION TO ROWIDS | | 1 | 5 | | 1 (0)| 00:00:01 |
|* 8 | BITMAP INDEX FAST FULL SCAN | MV_PROD_NO_PROD_IDX | | | | | |
| 9 | TABLE ACCESS BY INDEX ROWID | PRIME | 1 | 21 | | 1 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | PK_PRIME | 1 | | | 1 (0)| 00:00:01 |
| 11 | SORT AGGREGATE | | 1 | 15 | | | |
| 12 | FIRST ROW | | 1 | 15 | | 1 (0)| 00:00:01 |
|* 13 | INDEX RANGE SCAN (MIN/MAX) | PRIME_NO_POLC_IDX | 1 | 15 | | 1 (0)| 00:00:01 |
|* 14 | VIEW | V_CONTRAT | 1 | 55 | | 543K (1)| 00:00:22 |
| 15 | SORT UNIQUE | | 16M| 5011M| 2740M| 543K (1)| 00:00:22 |
| 16 | UNION-ALL | | | | | | |
|* 17 | HASH JOIN | | 16M| 2502M| | 55963 (4)| 00:00:03 |
| 18 | VIEW | index$_join$_016 | 103 | 1339 | | 2 (0)| 00:00:01 |
|* 19 | HASH JOIN | | | | | | |
| 20 | INDEX FAST FULL SCAN | PK_PROD | 103 | 1339 | | 1 (0)| 00:00:01 |
| 21 | INDEX FAST FULL SCAN | PRODUIT_COMBINE_IDX | 103 | 1339 | | 1 (0)| 00:00:01 |
|* 22 | HASH JOIN RIGHT OUTER | | 16M| 2293M| 261M| 55860 (4)| 00:00:03 |
| 23 | INDEX FAST FULL SCAN | ROL_INDEX1 | 10M| 145M| | 5703 (2)| 00:00:01 |
|* 24 | HASH JOIN RIGHT OUTER | | 6751K| 824M| 117M| 44540 (4)| 00:00:02 |
| 25 | INLIST ITERATOR | | | | | | |
|* 26 | INDEX RANGE SCAN | ROL_CDROL_NOINTR_IDX | 3975K| 72M| | 756 (2)| 00:00:01 |
|* 27 | HASH JOIN RIGHT OUTER | | 4192K| 435M| 90M| 40898 (3)| 00:00:02 |
|* 28 | TABLE ACCESS FULL | ROLE | 2881K| 57M| | 14553 (4)| 00:00:01 |
|* 29 | HASH JOIN | | 2941K| 246M| 104M| 24558 (3)| 00:00:01 |
| 30 | INDEX FAST FULL SCAN | INFO_BASE_DISTRIBUTEUR_FK1 | 4047K| 57M| | 1925 (2)| 00:00:01 |
|* 31 | HASH JOIN | | 2961K| 206M| 136M| 20967 (3)| 00:00:01 |
| 32 | TABLE ACCESS FULL | CONTRAT_ITER | 4088K| 89M| | 12159 (2)| 00:00:01 |
|* 33 | HASH JOIN RIGHT OUTER | | 2961K| 141M| 32M| 7292 (3)| 00:00:01 |
|* 34 | INDEX RANGE SCAN | ROL_CDROL_NOINTR_IDX | 933K| 22M| | 890 (1)| 00:00:01 |
|* 35 | TABLE ACCESS FULL | CONTRAT | 2615K| 62M| | 5781 (3)| 00:00:01 |
|* 36 | HASH JOIN OUTER | | 29239 | 2912K| | 10228 (3)| 00:00:01 |
|* 37 | HASH JOIN | | 29239 | 1941K| | 285 (2)| 00:00:01 |
| 38 | TABLE ACCESS FULL | DISTRIBUTEUR | 9142 | 91420 | | 45 (3)| 00:00:01 |
|* 39 | HASH JOIN | | 29239 | 1656K| | 240 (1)| 00:00:01 |
| 40 | INDEX FULL SCAN | PRODUIT_COMBINE_IDX | 103 | 1030 | | 1 (0)| 00:00:01 |
|* 41 | TABLE ACCESS FULL | TPA_CONTRAT_MENSUEL | 29239 | 1370K| | 239 (1)| 00:00:01 |
| 42 | TABLE ACCESS FULL | COUNTERPARTY | 3547K| 115M| | 9921 (2)| 00:00:01 |
|* 43 | INDEX RANGE SCAN | FACTURE_COMPT_INDEX9 | 1 | 12 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("MPG"."ID_PROD_GARN"=1238)
8 - filter("MPG"."NO_PROD"="MP"."NO_PROD")
10 - access("P"."ID_PRIME"= (SELECT MAX("ID_PRIME") FROM "ACO"."PRIME" "P" WHERE "P"."NO_POLC"='T3167978'))
13 - access("P"."NO_POLC"='T3167978')
14 - filter("VC"."NO_POLC"="P"."NO_POLC" AND "MP"."NO_PROD"="VC"."NO_PROD")
17 - access("PROD"."NO_PROD"="ITER"."NO_PROD" AND "PROD"."NO_VERS_PROD"="ITER"."NO_VERS_PROD")
19 - access(ROWID=ROWID)
22 - access("ROLAUTRE"."ID_CONT"(+)="CONT"."ID_CONT" AND "ROLAUTRE"."NO_ITER_CONT"(+)="CONT"."NO_DERN_ITER")
24 - access("ROLA"."ID_CONT"(+)="CONT"."ID_CONT" AND "ROLA"."NO_ITER_CONT"(+)="CONT"."NO_DERN_ITER")
26 - access("ROLA"."CD_ROLE"(+)='ap' OR "ROLA"."CD_ROLE"(+)='debit' OR "ROLA"."CD_ROLE"(+)='emprun')
27 - access("ROLPAY"."ID_CONT"(+)="CONT"."ID_CONT" AND "ROLPAY"."NO_ITER_CONT"(+)="CONT"."NO_DERN_ITER")
28 - filter("ROLPAY"."CD_ROLE"(+)='pay' AND "ROLPAY"."IND_PAY_PRI"(+)='1')
29 - access("ITER"."ID_INFO_BASE"="IB"."ID_INFO_BASE" AND "ITER"."NO_ITER_CONT"="IB"."NO_ITER_CONT" AND
SYS_OP_DESCEND("ITER"."NO_ITER_CONT")=SYS_OP_DESCEND("IB"."NO_ITER_CONT"))
31 - access("ITER"."ID_CONT"="CONT"."ID_CONT" AND "ITER"."NO_ITER_CONT"="CONT"."NO_DERN_ITER" AND
SYS_OP_DESCEND("ITER"."NO_ITER_CONT")=SYS_OP_DESCEND("CONT"."NO_DERN_ITER"))
33 - access("ROLP"."ID_CONT"(+)="CONT"."ID_CONT" AND "ROLP"."NO_ITER_CONT"(+)="CONT"."NO_DERN_ITER")
34 - access("ROLP"."CD_ROLE"(+)='pren')
35 - filter("CONT"."NO_SEQ_PROPO_SEL"=0)
36 - access("CTRPY"."LAST_NAME"(+)="TCM"."CON_NOM_ASS_PRINC" AND
"CTRPY"."FRST_NAME"(+)="TCM"."CON_PRENOM_ASS_PRINC" AND "CTRPY"."DT_BIRTH"(+)="TCM"."CON_DATE_NAISS_ASS_PRINC")
37 - access("TCM"."CON_NO_DISTRIBUTEUR"="A"."NO_DIST")
39 - access("PROD"."NO_PROD"="TCM"."CON_CODE_PRODDUIT")
41 - filter("TCM"."CON_VERSION_CONTRAT"=1)
43 - access("FC"."ID_FACTR"="P"."ID_FACTR" AND "FC"."CD_STAT_FACTR"='comp')
Edit 2 : Here is the view V_CONTRAT
SELECT DISTINCT cont.ID_CONT,
cont.NO_POLC,
cont.cd_divs,
prod.no_prod,
prod.cd_cie_encai,
prod.cd_faml_cptb,
ib.no_dist_init,
CASE
WHEN ROLP.ID_ROLE IS NOT NULL THEN ROLP.NO_INTR
WHEN rola.no_intr IS NOT NULL THEN rola.no_intr
WHEN rolpay.no_intr IS NOT NULL THEN rolpay.no_intr
ELSE rolAutre.no_intr
END
AS NO_INTR_PRINC,
rolpay.no_intr AS NO_INTR_PAY
FROM VIRAGE.CONTRAT CONT
INNER JOIN
VIRAGE.contrat_iter iter
ON iter.id_cont = cont.id_cont
AND ITER.NO_ITER_CONT = CONT.NO_DERN_ITER
AND CONT.NO_SEQ_PROPO_SEL = 0
INNER JOIN
VIRAGE.info_base ib
ON iter.id_info_base = ib.id_info_base
AND iter.no_iter_cont = ib.no_iter_cont
INNER JOIN
VIRAGE.produit prod
ON prod.no_prod = iter.no_prod
AND prod.no_vers_prod = iter.no_vers_prod
LEFT JOIN
VIRAGE.role rolp
ON rolp.id_cont = cont.id_cont
AND rolp.no_iter_cont = cont.no_dern_iter
AND rolp.cd_role = 'pren'
LEFT JOIN
VIRAGE.role rola
ON rola.id_cont = cont.id_cont
AND rola.no_iter_cont = cont.no_dern_iter
AND ( rola.cd_role = 'ap'
OR ROLA.CD_ROLE = 'debit'
OR rola.cd_role = 'emprun')
LEFT JOIN
VIRAGE.role rolpay
ON rolpay.id_cont = cont.id_cont
AND rolpay.no_iter_cont = cont.no_dern_iter
AND rolpay.cd_role = 'pay'
AND rolpay.ind_pay_pri = '1'
LEFT JOIN
VIRAGE.role rolAutre
ON rolAutre.id_cont = cont.id_cont
AND rolAutre.no_iter_cont = cont.no_dern_iter
UNION
SELECT DISTINCT
CAST (tcm.con_sequence AS NUMBER (10, 0)) AS id_cont,
CAST (tcm.con_numero_contrat AS VARCHAR2 (20)) AS no_polc,
CAST (vd.cd_divs_compt AS VARCHAR2 (11)) AS cd_divs,
CAST (tcm.con_code_prodduit AS NUMBER (5, 0)) AS no_prod,
CAST (prod.cd_cie_encai AS VARCHAR2 (1)) AS cd_cie_encai,
CAST (prod.cd_faml_cptb AS VARCHAR2 (11)) AS cd_faml_cptb,
CAST (tcm.con_no_distributeur AS VARCHAR2 (15)) AS no_dist_init,
CAST (ctrpy.no_ctrpy AS NUMBER) AS no_intr_princ,
CAST (ctrpy.no_ctrpy AS NUMBER) AS no_intr_pay
FROM TPA.TPA_CONTRAT_MENSUEL tcm
INNER JOIN
VIRAGE.produit prod
ON prod.no_prod = tcm.con_code_prodduit
LEFT JOIN
counterparty ctrpy
ON ctrpy.last_name = tcm.con_nom_ass_princ
AND ctrpy.frst_name = tcm.con_prenom_ass_princ
AND ctrpy.dt_birth = tcm.con_date_naiss_ass_princ
INNER JOIN
v_distributeur vd
ON tcm.con_no_distributeur = vd.no_dist
WHERE TCM.CON_VERSION_CONTRAT=1
Indexes can only be created on tables, not on ordinary views, so index the tables underlying the view.
Make sure you create an index for every foreign key in every table. Unlike some other systems, Oracle does not create them by default.
Also check the performance of the subquery; if it takes too long, create indexes on the columns used in its WHERE clause as well.
Don't forget that indexes need maintenance: each one adds overhead to every insert, update, and delete on its table.
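As a minimal sketch of the foreign-key advice: V_CONTRAT joins VIRAGE.role to VIRAGE.contrat on id_cont and no_iter_cont, so those columns are natural index candidates. The index name below is hypothetical, and the sketch assumes that no existing index (for example ROL_INDEX1, whose definition is not shown) already leads with these columns.

```sql
-- Hypothetical index name; the columns are the ones V_CONTRAT uses
-- to join ROLE to CONTRAT (id_cont, no_iter_cont).
-- Assumption: no existing index already starts with these two columns.
CREATE INDEX VIRAGE.role_cont_iter_ix
    ON VIRAGE.role (id_cont, no_iter_cont);
```

Compare the plan before and after creating it; if the optimizer still prefers the hash joins shown above, the index only adds write overhead and can be dropped.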
I'm working on optimizing queries on Oracle because of the huge amount of data involved.
One of the queries looks like this.
With a subquery:
SELECT
STG.ID1,
STG.ID2
FROM (SELECT
DISTINCT
H1.ID1,
H2.ID2
FROM T_STGDV STG
INNER JOIN T_HUB1 H1 ON STG.BK1 = H1.BK1
INNER JOIN T_HUB2 H2 ON STG.BK2 = H2.BK2 ) STG
LEFT OUTER JOIN T_LINK L ON L.ID1 = STG.ID1 AND L.ID2 = STG.ID2
WHERE L.IDL IS NULL;
I'm optimizing it like this:
SELECT
DISTINCT
H1.ID1,
H2.ID2
FROM T_STGDV STG
INNER JOIN T_HUB1 H1 ON STG.BK1 = H1.BK1
INNER JOIN T_HUB2 H2 ON STG.BK2 = H2.BK2
LEFT OUTER JOIN T_LINK L ON L.ID1 = H1.ID1 AND L.ID2 = H2.ID2
WHERE L.IDL IS NULL;
I want to know whether the result and the behavior will be the same.
I ran some tests and found no difference, but maybe I missed a test case.
Any idea what the difference between these queries could be?
Thanks.
Some details: the explain plans for these test tables (the costs are not representative of the real tables).
The first query:
Plan hash value: 2680307749
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 11 (28)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | HASH JOIN OUTER | | 1 | 65 | 11 (28)| 00:00:01 |
| 3 | VIEW | | 1 | 26 | 8 (25)| 00:00:01 |
| 4 | HASH UNIQUE | | 1 | 134 | 8 (25)| 00:00:01 |
|* 5 | HASH JOIN | | 1 | 134 | 7 (15)| 00:00:01 |
|* 6 | HASH JOIN | | 1 | 94 | 5 (20)| 00:00:01 |
| 7 | TABLE ACCESS FULL| T_STGDV | 1 | 54 | 2 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL| T_HUB1 | 2 | 80 | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | T_HUB2 | 2 | 80 | 2 (0)| 00:00:01 |
| 10 | TABLE ACCESS FULL | T_LINK | 3 | 117 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("L"."IDL" IS NULL)
2 - access("L"."ID2"(+)="STG"."ID2" AND "L"."ID1"(+)="STG"."ID1")
5 - access("STG"."BK2"="H2"."BK2")
6 - access("STG"."BK1"="H1"."BK1")
Note
-----
- dynamic sampling used for this statement (level=2)
The second query:
Plan hash value: 2149614538
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 11 (28)| 00:00:01 |
| 1 | HASH UNIQUE | | 1 | 65 | 11 (28)| 00:00:01 |
|* 2 | FILTER | | | | | |
|* 3 | HASH JOIN OUTER | | 1 | 65 | 10 (20)| 00:00:01 |
| 4 | VIEW | | 1 | 26 | 7 (15)| 00:00:01 |
|* 5 | HASH JOIN | | 1 | 134 | 7 (15)| 00:00:01 |
|* 6 | HASH JOIN | | 1 | 94 | 5 (20)| 00:00:01 |
| 7 | TABLE ACCESS FULL| T_STGDV | 1 | 54 | 2 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL| T_HUB1 | 2 | 80 | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | T_HUB2 | 2 | 80 | 2 (0)| 00:00:01 |
| 10 | TABLE ACCESS FULL | T_LINK | 3 | 117 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("L"."IDL" IS NULL)
3 - access("L"."ID2"(+)="H2"."ID2" AND "L"."ID1"(+)="H1"."ID1")
5 - access("STG"."BK2"="H2"."BK2")
6 - access("STG"."BK1"="H1"."BK1")
Note
-----
- dynamic sampling used for this statement (level=2)
The queries look equivalent to me, because of the where clause.
Without the where clause they are not equivalent: duplicates in t_link (relative to the join keys) would produce duplicate rows in the first query, whereas the outer DISTINCT in the second query would remove them. However, you are looking for rows with no match, so this is not an issue. When there is no match, the two versions should be equivalent.
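Since only the "no match" rows are wanted, the same intent can also be written as an explicit anti-join. This is a sketch, not a drop-in replacement: it assumes T_LINK.IDL is never NULL (for example because it is the primary key), so that L.IDL IS NULL can only mean "no matching T_LINK row".

```sql
-- Anti-join written with NOT EXISTS instead of LEFT JOIN ... IS NULL.
-- Assumption: T_LINK.IDL is NOT NULL, so both forms return the same rows.
SELECT DISTINCT
       H1.ID1,
       H2.ID2
FROM T_STGDV STG
INNER JOIN T_HUB1 H1 ON STG.BK1 = H1.BK1
INNER JOIN T_HUB2 H2 ON STG.BK2 = H2.BK2
WHERE NOT EXISTS (SELECT 1
                  FROM T_LINK L
                  WHERE L.ID1 = H1.ID1
                    AND L.ID2 = H2.ID2);
```

Written this way, duplicates in T_LINK cannot affect the result at all, which is the point made above.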
If you want to test them with your current dataset you can use minus.
query 1
MINUS
query 2
If any results are displayed, they are not the same.
You have to flip them around to try the other way too...
query 2
MINUS
query 1
If both tests return no records, the queries have the same effect on your current dataset.
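Both directions can be run in one statement. The sketch below wraps the two queries from the question in a WITH clause; the names q1, q2, and src are made up for the illustration. Note that MINUS compares distinct rows, so it will not reveal a difference that consists only of duplicate rows.

```sql
WITH q1 AS (   -- original query, with the subquery
    SELECT STG.ID1, STG.ID2
    FROM (SELECT DISTINCT H1.ID1, H2.ID2
          FROM T_STGDV STG
          INNER JOIN T_HUB1 H1 ON STG.BK1 = H1.BK1
          INNER JOIN T_HUB2 H2 ON STG.BK2 = H2.BK2) STG
    LEFT OUTER JOIN T_LINK L ON L.ID1 = STG.ID1 AND L.ID2 = STG.ID2
    WHERE L.IDL IS NULL
),
q2 AS (        -- rewritten query, without the subquery
    SELECT DISTINCT H1.ID1, H2.ID2
    FROM T_STGDV STG
    INNER JOIN T_HUB1 H1 ON STG.BK1 = H1.BK1
    INNER JOIN T_HUB2 H2 ON STG.BK2 = H2.BK2
    LEFT OUTER JOIN T_LINK L ON L.ID1 = H1.ID1 AND L.ID2 = H2.ID2
    WHERE L.IDL IS NULL
)
-- Rows returned by only one of the two queries, labelled by origin.
SELECT 'only in query 1' AS src, ID1, ID2
FROM (SELECT ID1, ID2 FROM q1 MINUS SELECT ID1, ID2 FROM q2)
UNION ALL
SELECT 'only in query 2' AS src, ID1, ID2
FROM (SELECT ID1, ID2 FROM q2 MINUS SELECT ID1, ID2 FROM q1);
```

An empty result means the two queries agree on the current data; it does not prove they agree on every possible dataset.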
This might be the difference: look at these lines in your execution plans:
2 - access("L"."ID2"(+)="STG"."ID2" AND "L"."ID1"(+)="STG"."ID1")
and
3 - access("L"."ID2"(+)="H2"."ID2" AND "L"."ID1"(+)="H1"."ID1")
In the first query, STG is a temporary result set that Oracle builds for the duration of the query (the ambiguity between the T_STGDV alias and the subquery alias was reason enough on its own to rewrite the query). This temporary result is of course unindexed. After your refactoring, the Oracle optimizer joins T_LINK to H1 and H2 instead of to the temporary result, which allows it to use the indexes built on those tables, thus giving you the 20x increase in speed.
After testing, they give the same result, and the second one is more efficient.
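One way to check this on the real tables is to regenerate the plan for the rewritten query and see which access paths appear for T_LINK, T_HUB1, and T_HUB2. A minimal sketch using the standard EXPLAIN PLAN statement and DBMS_XPLAN.DISPLAY:

```sql
-- Store the optimizer's plan for the rewritten query in PLAN_TABLE.
EXPLAIN PLAN FOR
SELECT DISTINCT
       H1.ID1,
       H2.ID2
FROM T_STGDV STG
INNER JOIN T_HUB1 H1 ON STG.BK1 = H1.BK1
INNER JOIN T_HUB2 H2 ON STG.BK2 = H2.BK2
LEFT OUTER JOIN T_LINK L ON L.ID1 = H1.ID1 AND L.ID2 = H2.ID2
WHERE L.IDL IS NULL;

-- Print the plan just explained, including the predicate section.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the test-table plans posted in the question both versions still use TABLE ACCESS FULL, so any index effect would only show up on the real, larger tables.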
I have a query that completes when selecting all columns (using select * from), but doesn't complete when selecting a single column. I have created the necessary indexes. Here is my query:
SELECT q2.ssn vn_ssn
-- with * here instead of the column name, the query completes
FROM table_0 q2
LEFT OUTER JOIN
(SELECT ial_t.pin,
ial_t.serial_number,
ial_t.surname,
ial_t.name,
ial_t.patronymic,
ial_t.prev_surname
FROM
(SELECT pin,
MAX(serial_number) m_serial_number
FROM table_1
GROUP BY pin
) ial_m
INNER JOIN table_1 ial_t
ON ial_t.serial_number = ial_m.m_serial_number
) ial ON q2.pincode = ial.pin
LEFT OUTER JOIN table_2 v_q2
ON V_Q2.VN_TPN = Q2.TPN
WHERE v_q2.vn_tpn IS NULL;
** EDIT: **
1. Plan (Select * from table_name)
Plan hash value: 2508092269
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 438K| 248M| | 341K (1)| 01:08:13 |
|* 1 | HASH JOIN OUTER | | 438K| 248M| 193M| 341K (1)| 01:08:13 |
|* 2 | HASH JOIN RIGHT OUTER| | 438K| 188M| 54M| 19424 (1)| 00:03:54 |
| 3 | TABLE ACCESS FULL | VN_Q2 | 439K| 49M| | 1673 (2)| 00:00:21 |
| 4 | TABLE ACCESS FULL | Q2 | 438K| 139M| | 7889 (1)| 00:01:35 |
| 5 | VIEW | | 6751K| 914M| | 262K (1)| 00:52:34 |
|* 6 | HASH JOIN | | 6751K| 386M| 122M| 262K (1)| 00:52:34 |
| 7 | VIEW | | 6742K| 45M| | 134K (1)| 00:26:55 |
| 8 | HASH GROUP BY | | 6742K| 109M| 458M| 134K (1)| 00:26:55 |
| 9 | TABLE ACCESS FULL| IAMAS_ALL_LAST_2 | 10M| 167M| | 90003 (1)| 00:18:01 |
| 10 | TABLE ACCESS FULL | IAMAS_ALL_LAST_2 | 10M| 521M| | 90270 (1)| 00:18:04 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("Q2"."PINCODE"="IAL"."PIN"(+))
2 - access("V_Q2"."VN_TPN"(+)="Q2"."TPN")
6 - access("IAL_T"."SERIAL_NUMBER"="IAL_M"."M_SERIAL_NUMBER")
2. Plan (Select column_name from table_name)
Plan hash value: 1784658367
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | | 144K (1)| 00:28:52 |
| 1 | NESTED LOOPS OUTER | | 1 | 55 | | 144K (1)| 00:28:52 |
|* 2 | HASH JOIN RIGHT ANTI | | 1 | 51 | 9880K| 9735 (1)| 00:01:57 |
| 3 | INDEX FAST FULL SCAN | VN_Q2_TPN_IDX | 439K| 4722K| | 301 (2)| 00:00:04 |
| 4 | TABLE ACCESS FULL | Q2 | 438K| 16M| | 7867 (1)| 00:01:35 |
| 5 | VIEW PUSHED PREDICATE | | 1 | 4 | | 134K (1)| 00:26:55 |
|* 6 | HASH JOIN | | 1 | 32 | | 134K (1)| 00:26:55 |
| 7 | TABLE ACCESS BY INDEX ROWID| IAMAS_ALL_LAST_2 | 2 | 50 | | 5 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | IAMAS_ALL_LAST_2_INDEX2 | 2 | | | 3 (0)| 00:00:01 |
| 9 | VIEW | | 6742K| 45M| | 134K (1)| 00:26:55 |
| 10 | SORT GROUP BY | | 6742K| 109M| 458M| 134K (1)| 00:26:55 |
| 11 | TABLE ACCESS FULL | IAMAS_ALL_LAST_2 | 10M| 167M| | 90003 (1)| 00:18:01 |
------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("V_Q2"."VN_TPN"="Q2"."TPN")
6 - access("IAL_T"."SERIAL_NUMBER"="IAL_M"."M_SERIAL_NUMBER")
8 - access("IAL_T"."PIN"="Q2"."PINCODE")
You can replace the self-join inside the first outer join's derived table with an OLAP (analytic) function:
FROM table_0 q2
LEFT OUTER JOIN
(
SELECT *
FROM
(
-- number the rows of each pin, highest serial_number first
SELECT ial_t.pin,
ial_t.serial_number,
ial_t.surname,
ial_t.name,
ial_t.patronymic,
ial_t.prev_surname,
ROW_NUMBER() OVER (PARTITION BY ial_t.pin ORDER BY ial_t.serial_number DESC) AS rn
FROM table_1 ial_t
) ial_m
-- keep only the latest row per pin
WHERE rn = 1
) ial ON q2.pincode = ial.pin
And if you don't need any columns from this table in your select list, you can simply remove this join: it is an outer join and the derived table returns at most one row per pin, so removing it will not change the number of rows returned.
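For the query as posted, which selects only q2.ssn, dropping that join would leave something like the sketch below. It assumes the derived table contributes at most one row per pin, as it does in the ROW_NUMBER version; if the original derived table could return several rows per pin, the original query would return duplicates that this version does not.

```sql
-- Query from the question with the unused outer join to table_1 removed.
-- Only the anti-join against table_2 remains.
SELECT q2.ssn vn_ssn
FROM table_0 q2
LEFT OUTER JOIN table_2 v_q2
       ON v_q2.vn_tpn = q2.tpn
WHERE v_q2.vn_tpn IS NULL;
```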