I have a table that looks like this:
| Bundleref | Subbundleref |
| :-------- | :----------- |
| 123       | 456          |
| 456       | 789          |
Starting from a certain reference (e.g. 123), I want a list of all descendants, all the way until the leaf values.
In Oracle, I can use a CONNECT BY clause like this:
select subbundleref from store.tw_bundles
start with bundleref = 2201114
connect by prior subbundleref = bundleref
For compatibility reasons, I am trying to convert it to a recursive CTE, like this:
WITH bundles(br,sr)
AS
(
SELECT bundleref, subbundleref
FROM store.tw_bundles where bundleref = 2201114
UNION ALL
SELECT bundleref, subbundleref
FROM store.tw_bundles twb
inner join bundles on twb.bundleref = bundles.sr
)
select sr from bundles
This gives me the same result. There is one problem though: the CONNECT BY query takes 300 ms, the recursive query takes about 50 seconds. Am I doing something inefficient here or is this not being optimized? (I'm using Oracle 19c.)
Explain plan for first query:
Plan hash value: 4216745508
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 218 | 5668 | 36 (6)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING| | | | | |
|* 2 | INDEX RANGE SCAN | TW_BUNDLES_BUNDLEREF_IDX | 14 | 168 | 3 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 204 | 5100 | 31 (0)| 00:00:01 |
| 4 | CONNECT BY PUMP | | | | | |
|* 5 | INDEX RANGE SCAN | TW_BUNDLES_BUNDLEREF_IDX | 15 | 180 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("BUNDLEREF"=PRIOR "SUBBUNDLEREF")
2 - access("BUNDLEREF"=2201114)
5 - access("connect$_by$_pump$_002"."prior subbundleref "="BUNDLEREF")
Note
-----
- this is an adaptive plan
And for the second one:
Plan hash value: 1467025167
-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 259K (3)| 00:00:11 |
| 1 | SORT AGGREGATE | | 1 | | | | |
| 2 | VIEW | | 1975M| | | 259K (3)| 00:00:11 |
| 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | | |
|* 4 | INDEX RANGE SCAN | TW_BUNDLES_BUNDLEREF_IDX | 14 | 168 | | 3 (0)| 00:00:01 |
|* 5 | HASH JOIN | | 1975M| 45G| 258M| 259K (3)| 00:00:11 |
| 6 | BUFFER SORT (REUSE) | | | | | | |
| 7 | TABLE ACCESS FULL | TW_BUNDLES | 11M| 129M| | 9208 (1)| 00:00:01 |
| 8 | RECURSIVE WITH PUMP | | | | | | |
-------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("BUNDLEREF"=2201114)
5 - access("TWB"."BUNDLEREF"="BUNDLES"."SR")
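One thing that might be worth testing, given the plan above (a full scan of TW_BUNDLES hash-joined on every iteration, with an estimate of 1975M rows), is to steer the recursive branch toward the same index-driven nested loop that the CONNECT BY plan uses. The sketch below reuses the table and index names from the question; the alias `b` and the hints are my own additions, the optimizer is free to ignore them, and this is something to experiment with rather than a confirmed fix.

```sql
WITH bundles (br, sr) AS (
  -- anchor member: same starting rows as START WITH bundleref = 2201114
  SELECT bundleref, subbundleref
  FROM   store.tw_bundles
  WHERE  bundleref = 2201114
  UNION ALL
  -- recursive member: drive from the rows found so far and probe the index for each one
  SELECT /*+ LEADING(b twb) USE_NL(twb) INDEX(twb TW_BUNDLES_BUNDLEREF_IDX) */
         twb.bundleref, twb.subbundleref
  FROM   bundles b
  JOIN   store.tw_bundles twb ON twb.bundleref = b.sr
)
SELECT sr FROM bundles;
```

If the hints take effect, the plan should show NESTED LOOPS with an INDEX RANGE SCAN under the RECURSIVE WITH PUMP instead of HASH JOIN over TABLE ACCESS FULL.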
Related
I have an Oracle 19c database table with "resources" that are organized hierarchically like a nested folder tree. The table contains around 2.5 million rows and the tree is up to 10 levels deep.
create table RESOURCES (
ID_ NUMBER(10) not null constraint PK_RESOURCES primary key,
FOLDERID_ NUMBER(10) constraint FK_PARENTFOLDER references RESOURCES
);
create index FOLDERIDINDEX on RESOURCES (FOLDERID_);
I'm using SQL recursive common table expressions (aka recursive subquery factoring) to find all descendants of some given resources.
In general, this works quite nicely, but if I try to get descendants of multiple folders with one query, some queries don't perform at all using Oracle. I'd like to understand why this is the case, and if there's some easy way to speed things up (query hint?, bugfix?, ...?)
For example, this statement does not return within 60 minutes(!):
WITH cte1 (id_) AS (SELECT id_ FROM Resources where id_ = 11
UNION ALL
SELECT r.id_ FROM Resources r, cte1 c WHERE r.folderId_ = c.id_),
cte2 (id_) AS (SELECT id_ FROM Resources where id_ = 808965
UNION ALL
SELECT r.id_ FROM Resources r, cte2 c WHERE r.folderId_ = c.id_)
SELECT count(*)
FROM Resources r
WHERE (r.folderId_ IN (SELECT * FROM cte1) OR r.folderId_ IN (SELECT * FROM cte2));
If I replace the two sub-selects in the last line with a UNION, it just takes a few seconds:
WITH cte1 (id_) AS (SELECT id_ FROM Resources where id_ = 11
UNION ALL
SELECT r.id_ FROM Resources r, cte1 c WHERE r.folderId_ = c.id_),
cte2 (id_) AS (SELECT id_ FROM Resources where id_ = 808965
UNION ALL
SELECT r.id_ FROM Resources r, cte2 c WHERE r.folderId_ = c.id_)
SELECT count(*)
FROM Resources r
WHERE (r.folderId_ IN (SELECT * FROM cte1 UNION SELECT * FROM cte2));
While that could already be the solution, it's a bit hard to change in my project, because SQL is auto-generated by code from application queries, and at that point not so easy to change. Also, application queries could be much more complex and such a replacement might not even be possible. These are just simplified examples. Maybe there's some other way to speed things up?
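If the generated SQL can be influenced at all, another variant that may be worth comparing is a single recursive CTE seeded with both start ids. It should return the same count as the OR of the two IN conditions, because IN ignores duplicates. This is an untested sketch against the RESOURCES table shown above (the name `cte` is arbitrary), and whether Oracle picks a better plan for it would need to be checked with EXPLAIN PLAN.

```sql
WITH cte (id_) AS (
  -- anchor member: both start folders at once
  SELECT id_ FROM Resources WHERE id_ IN (11, 808965)
  UNION ALL
  -- recursive member: children of everything found so far
  SELECT r.id_ FROM Resources r, cte c WHERE r.folderId_ = c.id_
)
SELECT count(*)
FROM Resources r
WHERE r.folderId_ IN (SELECT id_ FROM cte);
```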
Interestingly, the slow query works without any performance problems on other databases like MySQL 8, PostgreSQL 13, SQL Server 2016 (with small syntax changes for the databases). It's just Oracle which seems to have a problem here.
This is the query plan for the first query, i.e. the slow one:
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 54G (1)|591:38:12 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | FILTER | | | | | |
| 3 | TABLE ACCESS FULL | RESOURCES | 2410K| 11M| 9389 (1)| 00:00:01 |
|* 4 | VIEW | | 239K| 3046K| 23128 (1)| 00:00:01 |
| 5 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | |
|* 6 | INDEX UNIQUE SCAN | PK_RESOURCES | 1 | 6 | 2 (0)| 00:00:01 |
|* 7 | HASH JOIN | | 239K| 5623K| 23126 (1)| 00:00:01 |
| 8 | RECURSIVE WITH PUMP | | | | | |
| 9 | BUFFER SORT (REUSE) | | | | | |
| 10 | TABLE ACCESS FULL | RESOURCES | 2410K| 25M| 9386 (1)| 00:00:01 |
|* 11 | VIEW | | 239K| 3046K| 23128 (1)| 00:00:01 |
| 12 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | |
|* 13 | INDEX UNIQUE SCAN | PK_RESOURCES | 1 | 6 | 2 (0)| 00:00:01 |
|* 14 | HASH JOIN | | 239K| 5623K| 23126 (1)| 00:00:01 |
| 15 | RECURSIVE WITH PUMP | | | | | |
| 16 | BUFFER SORT (REUSE) | | | | | |
| 17 | TABLE ACCESS FULL | RESOURCES | 2410K| 25M| 9386 (1)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( EXISTS (SELECT 0 FROM "CTE1" "CTE1" WHERE "CTE1"."ID_"=:B1) OR EXISTS (SELECT 0
FROM "CTE2" "CTE2" WHERE "CTE2"."ID_"=:B2))
4 - filter("CTE1"."ID_"=:B1)
6 - access("ID_"=11)
7 - access("R"."FOLDERID_"="C"."ID_")
11 - filter("CTE2"."ID_"=:B1)
13 - access("ID_"=808965)
14 - access("R"."FOLDERID_"="C"."ID_")
For comparison, the faster query using a UNION seems to use a better plan:
------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | | 55733 (1)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | 18 | | | |
|* 2 | HASH JOIN | | 2806K| 48M| 11M| 55733 (1)| 00:00:03 |
| 3 | VIEW | VW_NSO_1 | 479K| 6092K| | 50820 (1)| 00:00:02 |
| 4 | SORT UNIQUE | | 479K| 6092K| 9424K| 50820 (1)| 00:00:02 |
| 5 | UNION-ALL | | | | | | |
| 6 | VIEW | | 239K| 3046K| | 23128 (1)| 00:00:01 |
| 7 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | | |
|* 8 | INDEX UNIQUE SCAN | PK_RESOURCES | 1 | 6 | | 2 (0)| 00:00:01 |
|* 9 | HASH JOIN | | 239K| 5623K| | 23126 (1)| 00:00:01 |
| 10 | RECURSIVE WITH PUMP | | | | | | |
| 11 | BUFFER SORT (REUSE) | | | | | | |
| 12 | TABLE ACCESS FULL | RESOURCES | 2410K| 25M| | 9386 (1)| 00:00:01 |
| 13 | VIEW | | 239K| 3046K| | 23128 (1)| 00:00:01 |
| 14 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | | |
|* 15 | INDEX UNIQUE SCAN | PK_RESOURCES | 1 | 6 | | 2 (0)| 00:00:01 |
|* 16 | HASH JOIN | | 239K| 5623K| | 23126 (1)| 00:00:01 |
| 17 | RECURSIVE WITH PUMP | | | | | | |
| 18 | BUFFER SORT (REUSE) | | | | | | |
| 19 | TABLE ACCESS FULL | RESOURCES | 2410K| 25M| | 9386 (1)| 00:00:01 |
| 20 | INDEX FAST FULL SCAN | FOLDERIDINDEX | 2410K| 11M| | 2392 (1)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("R"."FOLDERID_"="ID_")
8 - access("ID_"=11)
9 - access("R"."FOLDERID_"="C"."ID_")
15 - access("ID_"=808965)
16 - access("R"."FOLDERID_"="C"."ID_")
I have a query something like this:
WITH
str_table as (
SELECT stringtext, stringnumberid
FROM STRING_TABLE
WHERE LANGID IN (23,62)
),
data as (
select *
from employee emp
left outer join str_table st on emp.nameid = st.stringnumberid
)
select * from data
My understanding is that the WITH clause works like this:
Step 1: The SQL query within the WITH clause is executed first.
Step 2: The output of that query is stored in a temporary relation.
Step 3: The main query is executed against the temporary relation produced in the previous step.
Will the indexes created on the actual STRING_TABLE still help when the query reads from the temporary str_table produced by the WITH clause, or do they have no effect on it?
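Oracle does not process the CTEs one by one; it analyzes the SQL statement as a whole. To the Oracle optimizer, your query is most likely equivalent to the following (the LANGID filter belongs in the ON clause, because putting it in a WHERE clause would make the left outer join behave like an inner join):
select emp.*, st.stringtext, st.stringnumberid
from employee emp left outer join STRING_TABLE st
on emp.nameid = st.stringnumberid
and st.LANGID IN (23,62);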
Oracle can use an index on STRING_TABLE; whether it does depends on the table statistics. For example, if the table has few rows (say a few hundred), Oracle will likely not use the index.
It depends.
First of all, a WITH clause is not necessarily a temporary table. As the documentation says:
Oracle Database optimizes the query by treating the query name as either an inline view or as a temporary table.
The optimizer decides to materialize a WITH subquery if you either force it to by using the /*+materialize*/ hint inside the subquery or reference the subquery more than once.
In the example below Oracle uses with clause as inline view and merges it within the main query:
explain plan for
with a as (
select
s.textid,
s.textvalue,
a.id,
a.other_column
from string_table s
join another_tab a
on s.textid = a.textid
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
and b.job_textid = a_name.id
| PLAN_TABLE_OUTPUT |
| :----------------------------------------------------------------------------------- |
| Plan hash value: 1854049435 |
| |
| ------------------------------------------------------------------------------------ |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ------------------------------------------------------------------------------------ |
| | 0 | SELECT STATEMENT | | 1 | 1147 | 74 (0)| 00:00:01 | |
| |* 1 | HASH JOIN | | 1 | 1147 | 74 (0)| 00:00:01 | |
| | 2 | TABLE ACCESS FULL | ANOTHER_TAB | 39 | 3042 | 3 (0)| 00:00:01 | |
| |* 3 | HASH JOIN | | 31 | 33139 | 71 (0)| 00:00:01 | |
| | 4 | TABLE ACCESS FULL| BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| |* 5 | TABLE ACCESS FULL| STRING_TABLE | 1143 | 589K| 68 (0)| 00:00:01 | |
| ------------------------------------------------------------------------------------ |
But depending on the statistics and hints it may evaluate subquery first and then add it to the main query:
explain plan for
with a as (
select
s.textid,
s.textvalue,
a.id,
a.other_column
from string_table s
join another_tab a
on s.textid = a.textid
where langid in (1)
)
select /*+NO_MERGE(a_name)*/ *
from big_table b
join a a_name
on b.name_textid = a_name.textid
and b.job_textid = a_name.id
| PLAN_TABLE_OUTPUT |
| :------------------------------------------------------------------------------------ |
| Plan hash value: 4105667421 |
| |
| ------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 101 | 110K| 74 (0)| 00:00:01 | |
| |* 1 | HASH JOIN | | 101 | 110K| 74 (0)| 00:00:01 | |
| | 2 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| | 3 | VIEW | | 64 | 37120 | 71 (0)| 00:00:01 | |
| |* 4 | HASH JOIN | | 64 | 38784 | 71 (0)| 00:00:01 | |
| | 5 | TABLE ACCESS FULL| ANOTHER_TAB | 39 | 3042 | 3 (0)| 00:00:01 | |
| |* 6 | TABLE ACCESS FULL| STRING_TABLE | 1143 | 589K| 68 (0)| 00:00:01 | |
| ------------------------------------------------------------------------------------- |
When you reference the WITH subquery twice, the optimizer decides to materialize it:
explain plan for
with a as (
select
s.textid,
s.textvalue
from string_table s
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
join a a_job
on b.job_textid = a_job.textid
| PLAN_TABLE_OUTPUT |
| :--------------------------------------------------------------------------------------------------------------------- |
| Plan hash value: 1371454296 |
| |
| ---------------------------------------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ---------------------------------------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 63 | 98973 | 67 (0)| 00:00:01 | |
| | 1 | TEMP TABLE TRANSFORMATION | | | | | | |
| | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D7224_469C01 | | | | | |
| | 3 | TABLE ACCESS BY INDEX ROWID BATCHED | STRING_TABLE | 999 | 515K| 22 (0)| 00:00:01 | |
| |* 4 | INDEX RANGE SCAN | IX | 999 | | 4 (0)| 00:00:01 | |
| |* 5 | HASH JOIN | | 63 | 98973 | 45 (0)| 00:00:01 | |
| |* 6 | HASH JOIN | | 35 | 36960 | 24 (0)| 00:00:01 | |
| | 7 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| | 8 | VIEW | | 999 | 502K| 21 (0)| 00:00:01 | |
| | 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D7224_469C01 | 999 | 502K| 21 (0)| 00:00:01 | |
| | 10 | VIEW | | 999 | 502K| 21 (0)| 00:00:01 | |
| | 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D7224_469C01 | 999 | 502K| 21 (0)| 00:00:01 | |
| ---------------------------------------------------------------------------------------------------------------------- |
So indexes on the tables inside a WITH subquery may be used in all of the above cases: before materialization, when the subquery is not merged, and when the subquery is merged and an index gives a better plan for the merged query (even if that index would not be used when the subquery is executed on its own).
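For completeness, here is a minimal sketch of forcing materialization when the subquery is referenced only once. It reuses the string_table and big_table names from the examples above. Note that materialize is an undocumented (though widely used) hint, so treat its behavior as an assumption to verify on your version.

```sql
explain plan for
with a as (
  select /*+ materialize */   -- asks for a temp table even though "a" is referenced once
    s.textid,
    s.textvalue
  from string_table s
  where langid in (1)
)
select *
from big_table b
join a a_name
  on b.name_textid = a_name.textid;

-- show the plan that was just explained
select * from table(dbms_xplan.display);
```

If the hint is honored, the plan should contain a TEMP TABLE TRANSFORMATION step like the one in the two-reference example, and any index on string_table can still be used while loading the temporary table.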
As for indexes: if an index provides high selectivity (i.e. the number of rows retrieved through the index is small compared to the overall number of rows), Oracle will consider using it. Note that index access has two steps: read the index blocks, then read the table blocks that contain the rowids found in the index. If the table is not much bigger than the index, Oracle may use a table scan instead of an index scan even for a quite selective predicate (because of the doubled I/O).
In the example below I've used "small" texts (100 chars), a big_table of 20 rows, and this index on the text table:
create index ix
on string_table(langid, textid)
The optimizer decides to use an index range scan that uses only the first column of the index (langid) as the access predicate:
explain plan for
with a as (
select
s.textid,
s.textvalue
from string_table s
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
| PLAN_TABLE_OUTPUT |
| :---------------------------------------------------------------------------------------------------- |
| Plan hash value: 1660330381 |
| |
| ----------------------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ----------------------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 29 | 31001 | 26 (0)| 00:00:01 | |
| |* 1 | HASH JOIN | | 29 | 31001 | 26 (0)| 00:00:01 | |
| | 2 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | |
| | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| STRING_TABLE | 999 | 515K| 23 (0)| 00:00:01 | |
| |* 4 | INDEX RANGE SCAN | IX | 999 | | 4 (0)| 00:00:01 | |
| ----------------------------------------------------------------------------------------------------- |
| |
| Predicate Information (identified by operation id): |
| --------------------------------------------------- |
| |
| 1 - access("B"."NAME_TEXTID"="S"."TEXTID") |
| 4 - access("LANGID"=1) | |
But when we reduce the number of rows in big_table, it uses both index columns for the range scan:
delete from big_table
where id > 4
explain plan for
with a as (
select
s.textid,
s.textvalue
from string_table s
where langid in (1)
)
select *
from big_table b
join a a_name
on b.name_textid = a_name.textid
| PLAN_TABLE_OUTPUT |
| :-------------------------------------------------------------------------------------------- |
| Plan hash value: 1766926914 |
| |
| --------------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| --------------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 6 | 18216 | 11 (0)| 00:00:01 | |
| | 1 | NESTED LOOPS | | 6 | 18216 | 11 (0)| 00:00:01 | |
| | 2 | NESTED LOOPS | | 6 | 18216 | 11 (0)| 00:00:01 | |
| | 3 | TABLE ACCESS FULL | BIG_TABLE | 4 | 4032 | 3 (0)| 00:00:01 | |
| |* 4 | INDEX RANGE SCAN | IX | 1 | | 1 (0)| 00:00:01 | |
| | 5 | TABLE ACCESS BY INDEX ROWID| STRING_TABLE | 2 | 4056 | 2 (0)| 00:00:01 | |
| --------------------------------------------------------------------------------------------- |
| |
| Predicate Information (identified by operation id): |
| --------------------------------------------------- |
| |
| 4 - access("LANGID"=1 AND "B"."NAME_TEXTID"="S"."TEXTID") |
| |
You may check above code snippets in the db<>fiddle.
I'm having performance issues executing the following query (Q1):
select
z_out.*,
a_out.id
from orders a_out, test z_out
where a_out.id=z_out.id and a_out.created>trunc(sysdate) and rownum<10
Table orders contains millions of rows; orders.id is the primary key and orders.created is indexed.
The view is:
create or replace view test as
select/*+qb_name(q_outer)*/
id,
min(value) keep (dense_rank first order by id) as value
from (
select/*+qb_name(q_inner)*/
id,
case
when substr(id, -1)<'5'
--and exists(select 1 from dual@db2)
then 'YYY'
end as attr_1
from orders a1
) a2, small_table b2
where b2.attr_1 in (nvl(a2.attr_1, '#'), '*')
group by id
where small_table b2 contains about 200 records (all the columns are varchar2).
As written, Q1 performs very well and has the following execution plan:
Plan hash value: 2906430222
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 274 | 64 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | NESTED LOOPS | | 1 | 274 | 64 (0)| 00:00:01 | | |
| 3 | PARTITION LIST ALL | | 1 | 22 | 59 (0)| 00:00:01 | 1 | 2 |
| 4 | PARTITION RANGE ALL | | 1 | 22 | 59 (0)| 00:00:01 | 1 | LAST |
| 5 | TABLE ACCESS BY LOCAL INDEX ROWID| ORDERS | 1 | 22 | 59 (0)| 00:00:01 | 1 | 29 |
|* 6 | INDEX RANGE SCAN | IDX_ORDERS_CREATED | 1 | | 57 (0)| 00:00:01 | 1 | 29 |
| 7 | VIEW PUSHED PREDICATE | TEST | 1 | 252 | 5 (0)| 00:00:01 | | |
|* 8 | FILTER | | | | | | | |
| 9 | SORT AGGREGATE | | 1 | 55 | | | | |
| 10 | NESTED LOOPS | | 259 | 14245 | 5 (0)| 00:00:01 | | |
|* 11 | INDEX UNIQUE SCAN | PK_ID | 1 | 14 | 2 (0)| 00:00:01 | | |
|* 12 | INDEX STORAGE FAST FULL SCAN | IDX_MN_AN_AD_ALL | 259 | 10619 | 3 (0)| 00:00:01 | | |
-----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
6 - access("A_OUT"."CREATED">TRUNC(SYSDATE@!))
8 - filter(COUNT(*)>0)
11 - access("ID"="A_OUT"."ID")
12 - storage("B2"."ATTR_1"=NVL(CASE WHEN SUBSTR("ID",(-1))<'5' THEN 'YYY' END ,'#') OR "B2"."ATTR_1"='*')
filter("B2"."ATTR_1"=NVL(CASE WHEN SUBSTR("ID",(-1))<'5' THEN 'YYY' END ,'#') OR "B2"."ATTR_1"='*')
The performance problems with Q1 appear when the line --and exists(select 1 from dual@db2) in the view is uncommented.
The new execution plan is:
Plan hash value: 3271081243
----------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | Inst |IN-OUT|
----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 288 | 5273K (1)| 00:03:27 | | | | |
|* 1 | COUNT STOPKEY | | | | | | | | | |
|* 2 | HASH JOIN | | 1 | 288 | 5273K (1)| 00:03:27 | | | | |
| 3 | JOIN FILTER CREATE | :BF0000 | 1 | 22 | 59 (0)| 00:00:01 | | | | |
| 4 | PARTITION LIST ALL | | 1 | 22 | 59 (0)| 00:00:01 | 1 | 2 | | |
| 5 | PARTITION RANGE ALL | | 1 | 22 | 59 (0)| 00:00:01 | 1 | LAST | | |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID| ORDERS | 1 | 22 | 59 (0)| 00:00:01 | 1 | 29 | | |
|* 7 | INDEX RANGE SCAN | IDX_ORDERS_CREATED | 1 | | 57 (0)| 00:00:01 | 1 | 29 | | |
| 8 | VIEW | TEST | 3840K| 974M| 5273K (1)| 00:03:27 | | | | |
| 9 | SORT GROUP BY | | 3840K| 201M| 5273K (1)| 00:03:27 | | | | |
| 10 | JOIN FILTER USE | :BF0000 | 994M| 50G| 5273K (1)| 00:03:27 | | | | |
| 11 | NESTED LOOPS | | 994M| 50G| 5273K (1)| 00:03:27 | | | | |
| 12 | INDEX FULL SCAN | PK_ID | 3840K| 51M| 66212 (1)| 00:00:03 | | | | |
|* 13 | INDEX STORAGE FAST FULL SCAN | IDX_MN_AN_AD_ALL | 259 | 10619 | 1 (0)| 00:00:01 | | | | |
| 14 | REMOTE | | | | | | | | DB2 | R->S |
----------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
2 - access("A_OUT"."ID"="Z_OUT"."ID")
7 - access("A_OUT"."CREATED">TRUNC(SYSDATE@!))
13 - filter("B2"."ATTR_1"=NVL(CASE WHEN (SUBSTR("ID",(-1))<'5' AND EXISTS (SELECT 0 FROM "A1")) THEN 'YYY' END ,'#') OR
"B2"."ATTR_1"='*')
Remote SQL Information (identified by operation id):
----------------------------------------------------
14 - EXPLAIN PLAN INTO PLAN_TABLE@! FOR SELECT 0 FROM "DUAL" "A1" (accessing 'DB2' )
I would like the view to be accessed n times, like in the first scenario.
I tried using hints but didn't succeed.
It may be useful to mention that, even with the line and exists(select 1 from dual@db2) uncommented in the view, the following query performs very well (I know it is different from Q1):
select
(select value from test z_out where a_out.id=z_out.id) as value,
a_out.id
from orders a_out
where a_out.created>trunc(sysdate) and rownum<10
So I guess the view works fine when it is accessed n times, even if the line and exists(select 1 from dual@db2) is uncommented, but I have not been able to force the execution plan in that direction.
If hints are necessary, I'd like to add them inside the view DDL only (if possible), so that whoever uses the view won't have to worry about it.
================================================================
Edit: the following were executed:
alter session set statistics_level = 'ALL';
-- Q1 (the query I'm having problems with)
select * from table (dbms_xplan.display_cursor (format=>'ALLSTATS LAST'));
Plan hash value: 3271081243
------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 0 | 0 | | | |
|* 1 | COUNT STOPKEY | | 1 | | 0 |00:00:00.01 | 0 | 0 | | | |
|* 2 | HASH JOIN | | 1 | 1 | 0 |00:00:00.01 | 0 | 0 | 3789K| 3789K| 1078K (0)|
| 3 | JOIN FILTER CREATE | :BF0000 | 1 | 1 | 25602 |00:00:00.22 | 23345 | 161 | | | |
| 4 | PARTITION LIST ALL | | 1 | 1 | 25602 |00:00:00.21 | 23345 | 161 | | | |
| 5 | PARTITION RANGE ALL | | 2 | 1 | 25602 |00:00:00.21 | 23345 | 161 | | | |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID| ORDERS | 29 | 1 | 25602 |00:00:00.20 | 23345 | 161 | | | |
|* 7 | INDEX RANGE SCAN | IDX_CREATED | 13 | 1 | 25602 |00:00:00.12 | 474 | 161 | 1025K| 1025K| |
| 8 | VIEW | TEST | 1 | 3820K| 0 |00:00:00.01 | 0 | 0 | | | |
| 9 | SORT GROUP BY | | 1 | 3820K| 0 |00:00:00.01 | 0 | 0 | 73728 | 73728 | |
| 10 | JOIN FILTER USE | :BF0000 | 1 | 989M| 106M|00:03:38.87 | 60M| 52960 | | | |
| 11 | NESTED LOOPS | | 1 | 989M| 328M|00:03:04.11 | 60M| 52960 | | | |
| 12 | INDEX FULL SCAN | PK_ID | 1 | 3820K| 1245K|00:00:21.04 | 200K| 52959 | 1025K| 1025K| |
|* 13 | INDEX STORAGE FAST FULL SCAN | IDX_MN_AN_AD_ALL | 1245K| 259 | 328M|00:02:12.09 | 60M| 1 | 1025K| 1025K| |
| 14 | REMOTE | | 1 | | 1 |00:00:00.01 | 0 | 0 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
2 - access("A_OUT"."ID"="Z_OUT"."ID")
7 - access("A_OUT"."CREATED">TRUNC(SYSDATE@!))
13 - filter(("B2"."ATTR_1"=NVL(CASE WHEN (SUBSTR("ID",(-1))<'5' AND IS NOT NULL) THEN 'YYY' END ,'#') OR "B2"."ATTR_1"='*'))
Note: Q1 is too slow to complete when and exists(select 1 from dual@db2) is uncommented in the view. To get the previous execution plan I had to alter the session, run Q1, cancel it after about 4 minutes, and then display the plan.
The following execution plan was generated the same way, but with the line --and exists(select 1 from dual@db2) left commented out in the view (performance was good).
Plan hash value: 2906430222
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9 |00:00:00.01 | 223 |
|* 1 | COUNT STOPKEY | | 1 | | 9 |00:00:00.01 | 223 |
| 2 | NESTED LOOPS | | 1 | 1 | 9 |00:00:00.01 | 223 |
| 3 | PARTITION LIST ALL | | 1 | 1 | 9 |00:00:00.01 | 41 |
| 4 | PARTITION RANGE ALL | | 1 | 1 | 9 |00:00:00.01 | 41 |
| 5 | TABLE ACCESS BY LOCAL INDEX ROWID| ORDERS | 14 | 1 | 9 |00:00:00.01 | 41 |
|* 6 | INDEX RANGE SCAN | IDX_CREATED | 12 | 1 | 9 |00:00:00.01 | 33 |
| 7 | VIEW PUSHED PREDICATE | TEST | 9 | 1 | 9 |00:00:00.01 | 182 |
|* 8 | FILTER | | 9 | | 9 |00:00:00.01 | 182 |
| 9 | SORT AGGREGATE | | 9 | 1 | 9 |00:00:00.01 | 182 |
| 10 | NESTED LOOPS | | 9 | 259 | 2376 |00:00:00.01 | 182 |
|* 11 | INDEX UNIQUE SCAN | PK_ID | 9 | 1 | 9 |00:00:00.01 | 20 |
|* 12 | INDEX STORAGE FAST FULL SCAN | IDX_MN_AN_AD_ALL | 9 | 259 | 2376 |00:00:00.01 | 162 |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
6 - access("A_OUT"."CREATED">TRUNC(SYSDATE@!))
8 - filter(COUNT(*)>0)
11 - access("ID"="A_OUT"."ID")
12 - storage(("B2"."ATTR_1"=NVL(CASE WHEN SUBSTR("ID",(-1))<'5' THEN 'YYY' END ,'#') OR
"B2"."ATTR_1"='*'))
filter(("B2"."ATTR_1"=NVL(CASE WHEN SUBSTR("ID",(-1))<'5' THEN 'YYY' END ,'#') OR
"B2"."ATTR_1"='*'))
Oracle query.
The following query takes some time to execute:
SELECT GCCOM_ACCOUNT_CONTRACT.ID_ACCOUNT_CONTRACT
FROM GCCOM_ACCOUNT_CONTRACT, GCCOM_ACCOUNT_GROUP
WHERE
GCCOM_ACCOUNT_CONTRACT.ID_ACCOUNT_GROUP = GCCOM_ACCOUNT_GROUP.ID_ACCOUNT_GROUP (+) AND
EXISTS (SELECT 1 FROM GCCOM_CONTRACTED_SERVICE
WHERE ID_ACCOUNT_CONTRACT = GCCOM_ACCOUNT_CONTRACT.ID_ACCOUNT_CONTRACT AND
STATUS = 'ESTSC00002' AND
DROP_DATE IS NULL ) AND
EXISTS (SELECT 1 FROM GCCOM_SEND_SERVICE
WHERE (ID_ACCOUNT_CONTRACT = GCCOM_ACCOUNT_CONTRACT.ID_ACCOUNT_CONTRACT OR
ID_ACCOUNT_GROUP = GCCOM_ACCOUNT_GROUP.ID_ACCOUNT_GROUP)
) AND
(( GCCOM_ACCOUNT_CONTRACT.ACCOUNT_CODE between 200000001 AND 900468243))
ORDER BY
GCCOM_ACCOUNT_CONTRACT.ID_COMPANY,
GCCOM_ACCOUNT_GROUP.ID_ACCOUNT_GROUP,
GCCOM_ACCOUNT_CONTRACT.ACCOUNT_CODE
The explain plan is as follows:
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 391653930
------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 45 | | 570K (1)| 01:54:06 | | |
| 1 | SORT ORDER BY | | 1 | 45 | 12M| 570K (1)| 01:54:06 | | |
|* 2 | FILTER | | | | | | | | |
| 3 | NESTED LOOPS OUTER | | 255K| 10M| | 17381 (1)| 00:03:29 | | |
|* 4 | HASH JOIN RIGHT SEMI | | 255K| 9979K| 8648K| 17380 (1)| 00:03:29 | | |
| 5 | PARTITION HASH ALL | | 260K| 5592K| | 7175 (1)| 00:01:27 | 1 | 16 |
|* 6 | TABLE ACCESS FULL | GCCOM_CONTRACTED_SERVICE | 260K| 5592K| | 7175 (1)| 00:01:27 | 1 | 16 |
|* 7 | TABLE ACCESS FULL | GCCOM_ACCOUNT_CONTRACT | 810K| 13M| | 8627 (1)| 00:01:44 | | |
|* 8 | INDEX UNIQUE SCAN | PK_GCCOM_ACCOUNT_GROUP | 1 | 5 | | 1 (0)| 00:00:01 | | |
| 9 | CONCATENATION | | | | | | | | |
| 10 | TABLE ACCESS BY GLOBAL INDEX ROWID| GCCOM_SEND_SERVICE | 1 | 7 | | 1 (0)| 00:00:01 | ROWID | ROWID |
|* 11 | INDEX RANGE SCAN | IDX_GCCOMSENDSERVICE_27 | 1 | | | 1 (0)| 00:00:01 | | |
|* 12 | TABLE ACCESS BY GLOBAL INDEX ROWID| GCCOM_SEND_SERVICE | 2 | 14 | | 1 (0)| 00:00:01 | ROWID | ROWID |
|* 13 | INDEX RANGE SCAN | IDX_GCCOMSENDSERVICE_04 | 1 | | | 1 (0)| 00:00:01 | | |
------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( EXISTS (SELECT 0 FROM "GCCOM_SEND_SERVICE" "GCCOM_SEND_SERVICE"<not feasible>)
4 - access("ID_ACCOUNT_CONTRACT"="GCCOM_ACCOUNT_CONTRACT"."ID_ACCOUNT_CONTRACT")
6 - filter("DROP_DATE" IS NULL AND "STATUS"='ESTSC00002')
7 - filter("GCCOM_ACCOUNT_CONTRACT"."ACCOUNT_CODE">=200000001 AND "GCCOM_ACCOUNT_CONTRACT"."ACCOUNT_CODE"<=900468243)
8 - access("GCCOM_ACCOUNT_CONTRACT"."ID_ACCOUNT_GROUP"="GCCOM_ACCOUNT_GROUP"."ID_ACCOUNT_GROUP"(+))
11 - access("ID_ACCOUNT_CONTRACT"=:B1)
12 - filter(LNNVL("ID_ACCOUNT_CONTRACT"=:B1))
13 - access("ID_ACCOUNT_GROUP"=:B1)
32 rows selected.
The row count of the table and the cardinality of ID_ACCOUNT_GROUP are as follows:
select count(*) from GCCOM_ACCOUNT_CONTRACT >> rows: 810412
select avg(distinct ID_ACCOUNT_GROUP) from GCCOM_ACCOUNT_CONTRACT >> cardinality: 87173
The tables are heavily indexed.
I have tried many things, but nothing has helped.
Any ideas?
I'm having a big performance problem with the following query and need help making it as fast as possible.
VIEW_SHIPMENT_ORDER_RELEASE has 2 million rows, and I'm sure the query can be written better to speed this up. The application takes almost 2 minutes to run it.
SELECT O.ORDER_RELEASE_GID
FROM ORDER_RELEASE O, ORDER_RELEASE_STATUS S
WHERE O.ORDER_RELEASE_GID = S.ORDER_RELEASE_GID
AND S.STATUS_TYPE_GID = 'STATUS'
AND S.STATUS_VALUE_GID IN ('OPEN', 'OPEN-HANDLE')
AND O.SOURCE_LOCATION_GID = '114'
AND O.ORDER_RELEASE_GID NOT IN
(SELECT V.ORDER_RELEASE_GID FROM VIEW_SHIPMENT_ORDER_RELEASE V
WHERE V.ORDER_RELEASE_GID = O.ORDER_RELEASE_GID)
Here's the view code:
create or replace view glogowner.view_shipment_order_release as
select distinct shp.perspective, shp.shipment_gid, ssul.order_release_gid
from shipment shp,
shipment_s_equipment_join ssej,
s_equipment_s_ship_unit_join sessuj,
s_ship_unit_line ssul
where shp.shipment_gid = ssej.shipment_gid
and ssej.s_equipment_gid = sessuj.s_equipment_gid
and sessuj.s_ship_unit_gid = ssul.s_ship_unit_gid
and ssul.order_release_gid is not null
The explain plan:
Plan hash value: 1257125198
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Inst |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE | | 314 | 98596 | | 35795 (1)| 00:07:10 | |
| 1 | NESTED LOOPS | | | | | | | |
| 2 | NESTED LOOPS | | 314 | 98596 | | 35795 (1)| 00:07:10 | |
|* 3 | HASH JOIN ANTI | | 201 | 48441 | | 35192 (1)| 00:07:03 | |
| 4 | TABLE ACCESS BY INDEX ROWID | ORDER_RELEASE | 20104 | 726K| | 3893 (1)| 00:00:47 | ABC123 |
|* 5 | INDEX RANGE SCAN | OR_SOURCE_LOCATION_GID | 20104 | | | 157 (0)| 00:00:02 | ABC123 |
| 6 | VIEW | VW_SQ_1 | 1515K| 294M| | 31293 (1)| 00:06:16 | ABC123 |
|* 7 | HASH JOIN | | 1515K| 144M| | 31293 (1)| 00:06:16 | |
| 8 | INDEX STORAGE FAST FULL SCAN | IND_SSEJ_SEQUIPGID | 69218 | 811K| | 91 (0)| 00:00:02 | ABC123 |
|* 9 | HASH JOIN | | 1515K| 127M| 73M| 31195 (1)| 00:06:15 | |
| 10 | INDEX STORAGE FAST FULL SCAN| PK_S_EQUIPMENT_S_SHIP_UNIT_JOI | 1515K| 56M| | 3958 (1)| 00:00:48 | ABC123 |
|* 11 | TABLE ACCESS STORAGE FULL | S_SHIP_UNIT_LINE | 1619K| 75M| | 18893 (1)| 00:03:47 | ABC123 |
|* 12 | INDEX UNIQUE SCAN | PK_ORDER_RELEASE_STATUS | 1 | | | 2 (0)| 00:00:01 | ABC123 |
|* 13 | TABLE ACCESS BY INDEX ROWID | ORDER_RELEASE_STATUS | 2 | 146 | | 3 (0)| 00:00:01 | ABC123 |
--------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A2"."ORDER_RELEASE_GID"="ORDER_RELEASE_GID")
5 - access("A2"."SOURCE_LOCATION_GID"='114')
7 - access("SSEJ"."S_EQUIPMENT_GID"="SESSUJ"."S_EQUIPMENT_GID")
9 - access("SESSUJ"."S_SHIP_UNIT_GID"="SSUL"."S_SHIP_UNIT_GID")
11 - storage("SSUL"."ORDER_RELEASE_GID" IS NOT NULL)
     filter("SSUL"."ORDER_RELEASE_GID" IS NOT NULL)
12 - access("A2"."ORDER_RELEASE_GID"="A1"."ORDER_RELEASE_GID" AND "A1"."STATUS_TYPE_GID"='STATUS')
13 - filter("A1"."STATUS_VALUE_GID"='OPEN' OR "A1"."STATUS_VALUE_GID"='OPEN-HANDLE')
I'd make sure that the following are indexed:
shipment.shipment_gid
shipment_s_equipment_join.s_equipment_gid
s_equipment_s_ship_unit_join.s_ship_unit_gid
s_ship_unit_line.order_release_gid
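One way to check which of these columns are already indexed is to query the data dictionary. This is only a sketch: it assumes the four tables live in the GLOGOWNER schema (taken from the owner of the view, which may not be where the tables are) and that you can see them through ALL_IND_COLUMNS.

```sql
-- Lists every indexed column on the four tables behind the view.
-- Assumption: the tables are owned by GLOGOWNER (guessed from the view's
-- owner); change the owner filter if they live in another schema.
SELECT table_name, index_name, column_position, column_name
FROM all_ind_columns
WHERE table_owner = 'GLOGOWNER'
AND table_name IN ('SHIPMENT',
                   'SHIPMENT_S_EQUIPMENT_JOIN',
                   'S_EQUIPMENT_S_SHIP_UNIT_JOIN',
                   'S_SHIP_UNIT_LINE')
ORDER BY table_name, index_name, column_position
```

A column is most useful for these joins when it shows up with column_position = 1, that is, as the leading column of an index. The plan already uses IND_SSEJ_SEQUIPGID and PK_S_EQUIPMENT_S_SHIP_UNIT_JOI, so some of these columns are probably covered; S_SHIP_UNIT_LINE is the one that is read with a full table scan.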
The NOT IN might work better as a NOT EXISTS.
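The NOT EXISTS form of the query would look roughly like this. It keeps the original joins and filters and only replaces the NOT IN subquery; it is untested against your data.

```sql
SELECT O.ORDER_RELEASE_GID
FROM ORDER_RELEASE O, ORDER_RELEASE_STATUS S
WHERE O.ORDER_RELEASE_GID = S.ORDER_RELEASE_GID
AND S.STATUS_TYPE_GID = 'STATUS'
AND S.STATUS_VALUE_GID IN ('OPEN', 'OPEN-HANDLE')
AND O.SOURCE_LOCATION_GID = '114'
-- Keep only order releases that have no matching row in the view.
AND NOT EXISTS
(SELECT 1 FROM VIEW_SHIPMENT_ORDER_RELEASE V
WHERE V.ORDER_RELEASE_GID = O.ORDER_RELEASE_GID)
```

Note that the posted plan already shows a HASH JOIN ANTI at step 3, so the optimizer is already turning the subquery into an anti-join. The rewrite may therefore produce the same plan; compare the two explain plans before counting on a speedup.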