Comparing two join queries in Oracle

I have two queries that do the same job:
SELECT * FROM student_info
INNER JOIN class
ON student_info.id = class.studentId
WHERE student_info.name = 'Ken'
SELECT * FROM (SELECT * FROM student_info WHERE name = 'Ken') studInfo
INNER JOIN class
ON studInfo.id = class.studentId
Which one is faster? I guess the second, but I'm not sure. I am using Oracle 11g.
UPDATED:
My tables are non-indexed, and I can confirm that the two PLAN_TABLE_OUTPUTs are almost the same.

In recent versions of Oracle, the optimizer is smart enough that it won't matter: both of your queries will be optimized internally to do the task efficiently. The optimizer may rewrite the query and choose an efficient execution plan.
Let's understand this with a small example of EMP and DEPT table. I will use two similar queries like yours in the question.
I will take two cases, first a predicate having a non-indexed column, second with an indexed column.
Case 1 - predicate having a non-indexed column
SQL> explain plan for
2 SELECT * FROM emp e
3 INNER JOIN dept d
4 ON e.deptno = d.deptno
5 where ename = 'SCOTT';
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 3625962092
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 59 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 1 | 59 | 4 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | EMP | 1 | 39 | 3 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)| 00:00:01 |
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("E"."ENAME"='SCOTT')
4 - access("E"."DEPTNO"="D"."DEPTNO")
Note
-----
- this is an adaptive plan
22 rows selected.
SQL>
SQL> explain plan for
2 SELECT * FROM (SELECT * FROM emp WHERE ename = 'SCOTT') e
3 INNER JOIN dept d
4 ON e.deptno = d.deptno;
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 3625962092
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 59 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 1 | 59 | 4 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | EMP | 1 | 39 | 3 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)| 00:00:01 |
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("ENAME"='SCOTT')
4 - access("EMP"."DEPTNO"="D"."DEPTNO")
Note
-----
- this is an adaptive plan
22 rows selected.
SQL>
Case 2 - predicate having an indexed column
SQL> explain plan for
2 SELECT * FROM emp e
3 INNER JOIN dept d
4 ON e.deptno = d.deptno
5 where empno = 7788;
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 2385808155
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 59 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 59 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 39 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("E"."EMPNO"=7788)
5 - access("E"."DEPTNO"="D"."DEPTNO")
18 rows selected.
SQL>
SQL> explain plan for
2 SELECT * FROM (SELECT * FROM emp where empno = 7788) e
3 INNER JOIN dept d
4 ON e.deptno = d.deptno;
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 2385808155
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 59 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 59 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 39 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMPNO"=7788)
5 - access("EMP"."DEPTNO"="D"."DEPTNO")
18 rows selected.
SQL>
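Is there any difference between the two explain plans within each case? No. The plan hash values are identical: 3625962092 for both queries in case 1 and 2385808155 for both queries in case 2.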
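To run the same check against the tables from the question, something along these lines should work. It is only a sketch: the table and column names are copied from the question, and the statement IDs 'q_join' and 'q_inline' are labels I made up so the two plans can be stored and displayed separately.

```sql
-- Sketch only: table and column names come from the question;
-- the statement IDs 'q_join' and 'q_inline' are hypothetical labels.
EXPLAIN PLAN SET STATEMENT_ID = 'q_join' FOR
SELECT * FROM student_info
INNER JOIN class
ON student_info.id = class.studentId
WHERE student_info.name = 'Ken';

EXPLAIN PLAN SET STATEMENT_ID = 'q_inline' FOR
SELECT * FROM (SELECT * FROM student_info WHERE name = 'Ken') studInfo
INNER JOIN class
ON studInfo.id = class.studentId;

-- Display each stored plan; matching "Plan hash value" lines mean the same plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'q_join', 'TYPICAL'));
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'q_inline', 'TYPICAL'));
```

If the two plan hash values match, the optimizer has chosen the same plan for both forms and there is nothing to choose between them on performance grounds.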

You'd need to show us the query plans and the execution statistics to be certain. That said, assuming name is indexed and statistics are reasonably accurate, I'd be shocked if the two queries didn't generate the same plan (and, thus, the same performance). With either query, Oracle is free to evaluate the predicate before or after it evaluates the join so it is unlikely that it would choose differently in the two cases.
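One way to collect the plan together with actual execution statistics is sketched below. It assumes SQL*Plus and that your account is allowed to read the V$ views that DBMS_XPLAN.DISPLAY_CURSOR relies on; the query itself is the first one from the question.

```sql
-- Sketch only: assumes SQL*Plus and read access to the V$ views
-- used by DBMS_XPLAN.DISPLAY_CURSOR.
-- SERVEROUTPUT is switched off so that the "last statement" in the
-- session is the query below and not an internal DBMS_OUTPUT call.
SET SERVEROUTPUT OFF

SELECT /*+ GATHER_PLAN_STATISTICS */ *
FROM student_info
INNER JOIN class
ON student_info.id = class.studentId
WHERE student_info.name = 'Ken';

-- Show the plan of the last statement executed in this session,
-- with estimated (E-Rows) and actual (A-Rows) row counts side by side.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Repeating this for the second query and comparing the two outputs shows whether the plans and the actual row counts really are the same.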

I would definitely lean towards the first query.
When selects are nested, Oracle has fewer optimization opportunities. It generally has to evaluate the inner select into a temporary view and then apply the outer select to that. That is rarely faster than a JOIN where Oracle will evaluate everything together.
Showing your EXPLAIN PLAN would provide extra info for us as well.

Related

Two similar tables but different join performances

I am running the exact same join query using two different tables, but the first one (table A) times out whereas the second (table B) does not.
SELECT * FROM table_X
INNER JOIN table_A
ON table_A.point_origin = table_X.item_id
WHERE ROWNUM < 10;
SELECT * FROM table_X
INNER JOIN table_B
ON table_B.point_origin = table_X.item_id
WHERE ROWNUM < 10;
As far as I know, table A is a subset of table B. Neither table A nor table B has point_origin indexed.
(Edit for clarification: table A is only a subset of table B in terms of row identifiers, not in terms of exact column data.)
For what it's worth, I'm dealing with very large tables and item_id is indexed.
Is there anything else that would affect performance here or am I definitely wrong about some information provided?
Edit: Additional information per a comment below
table_A:
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Pstart| Pstop |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 4743 | 12 (0)| | |
|* 1 | COUNT STOPKEY | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| table_X | 1 | 227 | 1 (0)| | |
| 3 | NESTED LOOPS | | 11 | 5797 | 12 (0)| | |
| 4 | PARTITION RANGE ALL | | 10M| 2969M| 2 (0)| 1 | 4 |
| 5 | TABLE ACCESS FULL | table_A | 10M| 2969M| 2 (0)| 1 | 4 |
|* 6 | INDEX RANGE SCAN | table_X_IP_PK | 1 | | 1 (0)| | |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
6 - access("table_A"."POINT_ORIGIN"="table_X"."ITEM_ID")
Note
-----
- 'PLAN_TABLE' is old version
table_B:
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 3879 | 11 (0)|
|* 1 | COUNT STOPKEY | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| table_X | 1 | 227 | 1 (0)|
| 3 | NESTED LOOPS | | 10 | 4310 | 11 (0)|
| 4 | TABLE ACCESS FULL | table_B | 118M| 22G| 2 (0)|
|* 5 | INDEX RANGE SCAN | table_X_IP_PK | 1 | | 1 (0)|
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
5 - access("table_B"."POINT_ORIGIN"="table_X"."ITEM_ID")
Note
-----
- 'PLAN_TABLE' is old version
It appears that table_A is partitioned and that the plan scans all four of its partitions (PARTITION RANGE ALL, Pstart 1 to Pstop 4), while table_B is not partitioned and must be read in its entirety. The optimizer estimates that the four partitions of table_A hold 10 million rows, while table_B holds 118 million rows. With a nested loop you'd expect roughly O(n) performance, so based on these statistics it would make sense for the second query to take about 11.8 times as long as the first.
Are the optimizer's estimates accurate? The optimizer is only as good as the statistics you've given it and it is possible that one or both tables have stale statistics.
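A rough way to check this is sketched below. It assumes both tables are owned by the current user and that their dictionary names are TABLE_A and TABLE_B, which are placeholders for whatever the real tables are called.

```sql
-- Sketch only: assumes the tables belong to the current user and that
-- their dictionary names are TABLE_A and TABLE_B (hypothetical names).
SELECT table_name, partition_name, num_rows, last_analyzed, stale_stats
FROM user_tab_statistics
WHERE table_name IN ('TABLE_A', 'TABLE_B');

-- Compare the estimate with reality (note: this is itself a full scan).
SELECT COUNT(*) FROM table_A;

-- Refresh the statistics if they turn out to be stale or missing.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TABLE_A');
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TABLE_B');
END;
/
```

If NUM_ROWS is far from the real count, or LAST_ANALYZED is old, regenerate the plans after gathering statistics before drawing conclusions from them.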

SQL Multiple Minus vs Multiple Join Performance

I hope someone can explain the performance of joining multiple tables vs. using MINUS to eliminate records. I looked at a few other stack overflow questions but didn't see what I was looking for.
I thought these two queries would produce the same output, and I have always heard "use joins, use joins!", particularly from stackoverflow posts, that they were expected to be faster...
This is the first query I ran which I thought would be much slower, but it takes only a matter of minutes to run...
select some_id
from table1
MINUS
select some_id
from table2
where table2.value = 'some_value'
MINUS
select some_id
from table3
where table3.value = 'some_value'
group by some_id
This is the second query which I thought would be faster, but it has been running for over 3 hours now (with no end in sight?)
select some_id
from table1
join table2 on table1.id=table2.id
join table3 on table1.id=table3.id
where table2.value = 'some_value'
or table3.value = 'some_value'
group by some_id
I should note all 3 tables have > 1 Million records, up to 15 Million records each.
EDIT:
Sorry - I meant to let you know I was avoiding the use of NOT EXISTS in this question as a response, as I really am curious about just these two scenarios.
Try this version:
select some_id
from table1 t1
where not exists (select 1 from table2 t2 where t1.id = t2.id and t2.value = 'some_value') or
not exists (select 1 from table3 t3 where t1.id = t3.id and t3.value = 'some_value')
For best performance, you want indexes on table2(id, value) and table3(id, value).
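As a minimal sketch, the indexes could be created like this. The index names are made up; only the table and column names come from the question.

```sql
-- Sketch only: the index names are hypothetical.
-- (id, value) covers both the correlation column and the filter column,
-- so each NOT EXISTS probe can be answered from the index alone.
CREATE INDEX table2_id_value_ix ON table2 (id, value);
CREATE INDEX table3_id_value_ix ON table3 (id, value);
```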
First, make sure you have the indexes in place.
Look at the plan: if it is doing full table scans, go ahead and create the indexes, otherwise the query is going to take a long, long time.
If you have PL/SQL Developer, paste the query into a SQL window and press F5; it will give you the explain plan.
Or you can do this:
SCOTT#research 17-APR-15> EXPLAIN PLAN FOR
2 select empno
3 from emp
4 MINUS
5 select empno
6 from empp
7 where empp.empno = '7839'
8 MINUS
9 select empno
10 from emppp
11 where emppp.empno = '7902'
12 group by empno
13 ;
Explained.
SCOTT#research 17-APR-15> SET LINESIZE 130
SCOTT#research 17-APR-15> SET PAGESIZE 0
SCOTT#research 17-APR-15> SELECT *
2 FROM TABLE(DBMS_XPLAN.DISPLAY);
Plan hash value: 4222598102
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 82 | 10 (90)| 00:00:01 |
| 1 | MINUS | | | | | |
| 2 | MINUS | | | | | |
| 3 | SORT UNIQUE NOSORT | | 14 | 56 | 2 (50)| 00:00:01 |
| 4 | INDEX FULL SCAN | PK_EMP | 14 | 56 | 1 (0)| 00:00:01 |
| 5 | SORT UNIQUE NOSORT | | 1 | 13 | 4 (25)| 00:00:01 |
|* 6 | TABLE ACCESS FULL | EMPP | 1 | 13 | 3 (0)| 00:00:01 |
| 7 | SORT UNIQUE NOSORT | | 1 | 13 | 4 (25)| 00:00:01 |
| 8 | SORT GROUP BY NOSORT| | 1 | 13 | 4 (25)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | EMPPP | 1 | 13 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("EMPP"."EMPNO"=7839)
9 - filter("EMPPP"."EMPNO"=7902)
Note
-----
- dynamic sampling used for this statement (level=2)
26 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2137789089
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR PICKLER FETCH| DISPLAY | 8168 | 16336 | 29 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Or, if you want to use autotrace, run:
set autotrace on explain
This is how it would look:
SCOTT#research 17-APR-15> select empno
2 from emp
3 MINUS
4 select empno
5 from empp
6 where empp.empno = '7839'
7 MINUS
8 select empno
9 from emppp
10 where emppp.empno = '7902'
11 group by empno
12 ;
EMPNO
----------
234
7499
7521
7566
7654
7698
7782
7788
7844
7876
7900
7934
12 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 4222598102
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 82 | 10 (90)| 00:00:01 |
| 1 | MINUS | | | | | |
| 2 | MINUS | | | | | |
| 3 | SORT UNIQUE NOSORT | | 14 | 56 | 2 (50)| 00:00:01 |
| 4 | INDEX FULL SCAN | PK_EMP | 14 | 56 | 1 (0)| 00:00:01 |
| 5 | SORT UNIQUE NOSORT | | 1 | 13 | 4 (25)| 00:00:01 |
|* 6 | TABLE ACCESS FULL | EMPP | 1 | 13 | 3 (0)| 00:00:01 |
| 7 | SORT UNIQUE NOSORT | | 1 | 13 | 4 (25)| 00:00:01 |
| 8 | SORT GROUP BY NOSORT| | 1 | 13 | 4 (25)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | EMPPP | 1 | 13 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("EMPP"."EMPNO"=7839)
9 - filter("EMPPP"."EMPNO"=7902)
Note
-----
- dynamic sampling used for this statement (level=2)
SCOTT#research 17-APR-15>
SCOTT#research 17-APR-15> select emp.empno
2 from emp
3 join empp on emp.empno=empp.empno
4 join emppp on emp.empno=emppp.empno
5 where empp.empno = '7839'
6 or emppp.empno = '7902'
7 group by emp.empno
8 ;
EMPNO
----------
7839
7902
Execution Plan
----------------------------------------------------------
Plan hash value: 1435156579
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 8 (25)| 00:00:01 |
| 1 | HASH GROUP BY | | 1 | 30 | 8 (25)| 00:00:01 |
|* 2 | HASH JOIN | | 1 | 30 | 7 (15)| 00:00:01 |
| 3 | NESTED LOOPS | | 6 | 102 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| EMPPP | 6 | 78 | 3 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN| PK_EMP | 1 | 4 | 0 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | EMPP | 10 | 130 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMP"."EMPNO"="EMPP"."EMPNO")
filter("EMPP"."EMPNO"=7839 OR "EMPPP"."EMPNO"=7902)
5 - access("EMP"."EMPNO"="EMPPP"."EMPNO")
Note
-----
- dynamic sampling used for this statement (level=2)

Why is Oracle ignoring index with ORDER BY?

My intention is to obtain a paginated resultset of customers. I am using this algorithm, from Tom:
select * from (
select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
from CUSTOMER C
)
where RN between 1 and 20
order by RN;
I also have an index defined on the column "CUSTOMER"."FIRST_NAME":
CREATE INDEX CUSTOMER_FIRST_NAME_TEST ON CUSTOMER (FIRST_NAME ASC);
The query returns the expected resultset, but from the explain plan I notice that the index is not used:
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15467 | 679K| 157 (3)| 00:00:02 |
| 1 | SORT ORDER BY | | 15467 | 679K| 157 (3)| 00:00:02 |
|* 2 | VIEW | | 15467 | 679K| 155 (2)| 00:00:02 |
|* 3 | WINDOW SORT PUSHED RANK| | 15467 | 151K| 155 (2)| 00:00:02 |
| 4 | TABLE ACCESS FULL | CUSTOMER | 15467 | 151K| 154 (1)| 00:00:02 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RN">=1 AND "RN"<=20)
3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)
I am using Oracle 11g. Since I just query for the first 20 rows, ordered by the indexed column, I would expect the index to be used.
Why is the Oracle optimizer ignoring the index? I assume it's something wrong with the pagination algorithm, but I can't figure out what.
Thanks.
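More than likely your FIRST_NAME column is nullable. A B-tree index does not store rows whose indexed columns are all NULL, so Oracle cannot rely on the index alone to return every row in FIRST_NAME order.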
SQL> create table customer (first_name varchar2(20), last_name varchar2(20));
Table created.
SQL> insert into customer select dbms_random.string('U', 20), dbms_random.string('U', 20) from dual connect by level <= 100000;
100000 rows created.
SQL> create index c on customer(first_name);
Index created.
SQL> explain plan for select * from (
2 select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
3 from CUSTOMER C
4 )
5 where RN between 1 and 20
6 order by RN;
Explained.
SQL> @explain ""
Plan hash value: 1474094583
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 117K| 2856K| | 1592 (1)| 00:00:20 |
| 1 | SORT ORDER BY | | 117K| 2856K| 4152K| 1592 (1)| 00:00:20 |
|* 2 | VIEW | | 117K| 2856K| | 744 (2)| 00:00:09 |
|* 3 | WINDOW SORT PUSHED RANK| | 117K| 1371K| 2304K| 744 (2)| 00:00:09 |
| 4 | TABLE ACCESS FULL | CUSTOMER | 117K| 1371K| | 205 (1)| 00:00:03 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RN">=1 AND "RN"<=20)
3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)
Note
-----
- dynamic sampling used for this statement (level=2)
21 rows selected.
SQL> alter table customer modify first_name not null;
Table altered.
SQL> explain plan for select * from (
2 select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
3 from CUSTOMER C
4 )
5 where RN between 1 and 20
6 order by RN;
Explained.
SQL> @explain ""
Plan hash value: 1725028138
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 117K| 2856K| | 850 (1)| 00:00:11 |
| 1 | SORT ORDER BY | | 117K| 2856K| 4152K| 850 (1)| 00:00:11 |
|* 2 | VIEW | | 117K| 2856K| | 2 (0)| 00:00:01 |
|* 3 | WINDOW NOSORT STOPKEY| | 117K| 1371K| | 2 (0)| 00:00:01 |
| 4 | INDEX FULL SCAN | C | 117K| 1371K| | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RN">=1 AND "RN"<=20)
3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)
Note
-----
- dynamic sampling used for this statement (level=2)
21 rows selected.
SQL>
So either declare the column NOT NULL, as above, or add an IS NOT NULL predicate to the query:
SQL> explain plan for select * from (
2 select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
3 from CUSTOMER C
4 where first_name is not null
5 )
6 where RN between 1 and 20
7 order by RN;
Explained.
SQL> @explain ""
Plan hash value: 1725028138
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 117K| 2856K| | 850 (1)| 00:00:11 |
| 1 | SORT ORDER BY | | 117K| 2856K| 4152K| 850 (1)| 00:00:11 |
|* 2 | VIEW | | 117K| 2856K| | 2 (0)| 00:00:01 |
|* 3 | WINDOW NOSORT STOPKEY| | 117K| 1371K| | 2 (0)| 00:00:01 |
|* 4 | INDEX FULL SCAN | C | 117K| 1371K| | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RN">=1 AND "RN"<=20)
3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)
4 - filter("FIRST_NAME" IS NOT NULL)
Note
-----
- dynamic sampling used for this statement (level=2)
22 rows selected.
SQL>
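To check whether this applies to your table, you can look the column up in the data dictionary. A small sketch, assuming the CUSTOMER table is owned by the current user:

```sql
-- Sketch only: assumes CUSTOMER belongs to the current user.
-- NULLABLE is 'Y' when the column allows NULLs, 'N' otherwise.
SELECT column_name, nullable
FROM user_tab_columns
WHERE table_name = 'CUSTOMER'
AND column_name = 'FIRST_NAME';

-- How many rows are missing from the single-column index
-- because their key is NULL?
SELECT COUNT(*) FROM customer WHERE first_name IS NULL;
```

If the count is zero and the business rules allow it, altering the column to NOT NULL is the cleaner fix, since it needs no change to the query.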
You're querying for more columns than first_name. The index on first_name just contains the first_name column and a reference to the table. So to retrieve the other columns, Oracle has to perform a lookup to the table itself for each row. Most databases try to avoid this if they can't guarantee a low record count.
A database is typically not smart enough to know the effects of a where clause on a row_number column. However, your hint /*+ FIRST_ROWS(20) */ might have done the trick.
Perhaps the table is really small, so that Oracle expects the table scan to be cheaper than lookups, even for just 20 rows.

sql query optimisation

Please compare the following:
INNER JOIN table1 t1 ON t1.someID LIKE 'search.%' AND
t1.someID = ( 'search.' || t0.ID )
vs.
INNER JOIN table1 t1 ON t1.someID = ( 'search.' || t0.ID )
I've been told that the first case is optimized, but I cannot understand why. As far as I understand, the second example should run faster.
We use Oracle, but I suppose it does not matter at the moment.
Please explain if I'm wrong.
Thank you
So, here is the explain plan for a query which joins on just the concatenated string:
SQL> explain plan for
2 select e.* from emp e
3 join big_table bt on bt.col2 = 'search'||trim(to_char(e.empno))
4 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 179424166
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1052 | 65224 | 43 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1052 | 65224 | 43 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 20 | 780 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | BIG_VC_I | 53 | 1219 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("BT"."COL2"='search'||TRIM(TO_CHAR("E"."EMPNO")))
15 rows selected.
SQL>
Compare and contrast with the plan for a query which includes the LIKE clause in its join:
SQL> explain plan for
2 select e.* from emp e
3 join big_table bt on (bt.col2 like 'search%'
4 and bt.col2 = 'search'||trim(to_char(e.empno)))
5 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 179424166
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 62 | 5 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 62 | 5 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| EMP | 1 | 39 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | BIG_VC_I | 1 | 23 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter('search'||TRIM(TO_CHAR("E"."EMPNO")) LIKE 'search%')
3 - access("BT"."COL2"='search'||TRIM(TO_CHAR("E"."EMPNO")))
filter("BT"."COL2" LIKE 'search%')
17 rows selected.
SQL>
The cost of the second query is much lower than the first. But this is because the optimizer is estimating that the second query will return far fewer rows than the first query. More information allows the database to make a more accurate prediction. (In fact the query will return no rows).
Of course this does presume the joined column is indexed, otherwise it won't make any difference.
The other thing to bear in mind is that the columns which are queried can affect the plan. This version selects from BIG_TABLE rather than EMP.
SQL> explain plan for
2 select bt.* from emp e
3 join big_table bt on (bt.col2 like 'search%'
4 and bt.col2 = 'search'||trim(to_char(e.empno)))
5 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------
Plan hash value: 4042413806
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 46 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 1 | 46 | 4 (0)| 00:00:01 |
|* 3 | INDEX FULL SCAN | PK_EMP | 1 | 4 | 1 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | BIG_VC_I | 1 | | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 1 | 42 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter('search'||TRIM(TO_CHAR("E"."EMPNO")) LIKE 'search%')
4 - access("BT"."COL2"='search'||TRIM(TO_CHAR("E"."EMPNO")))
filter("BT"."COL2" LIKE 'search%')
19 rows selected.
SQL>
The query analysis of the various database engines would really tell the story, but my first instinct would be that the first form is in fact optimized. The reason is that the compiler cannot guess as to the results of the concatenation. It must do more work to determine the value against which to do the match, which would likely result in a table scan. The first form still must do that; however, it is able to narrow the result set using the LIKE operator first (presuming an index exists on the someID column) and thus has to do fewer concatenations.

(Oracle Performance) Will a query based on a view limit the view using the where clause?

In Oracle (10g), when I use a View (not Materialized View), does Oracle take into account the where clause when it executes the view?
Let's say I have:
MY_VIEW =
SELECT *
FROM PERSON P, ORDERS O
WHERE P.P_ID = O.P_ID
And I then execute the following:
SELECT *
FROM MY_VIEW
WHERE MY_VIEW.P_ID = '1234'
When this executes, does oracle first execute the query for the view and THEN filter it based on my where clause (where MY_VIEW.P_ID = '1234') or does it do this filtering as part of the execution of the view? If it does not do the latter, and P_ID had an index, would I also lose out on the indexing capability since Oracle would be executing my query against the view which doesn't have the index rather than the base table which has the index?
It will not execute the view's query first. If you have an index on P_ID, it will be used.
The execution plan is the same as if you had merged the view code and the WHERE clause into a single SELECT statement.
You can try this for yourself:
EXPLAIN PLAN FOR
SELECT *
FROM MY_VIEW
WHERE MY_VIEW.P_ID = '1234'
followed by
SELECT * FROM TABLE( dbms_xplan.display );
---------------------------------------------------------------------------------
|Id | Operation | Name |Rows| Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | 2 (0)| 00:00:01|
| 1 | NESTED LOOPS | | 1 | 52 | 2 (0)| 00:00:01|
| 2 | TABLE ACCESS BY INDEX ROWID| PERSON | 1 | 26 | 2 (0)| 00:00:01|
| 3 | INDEX UNIQUE SCAN | PK_P | 1 | | 1 (0)| 00:00:01|
| 4 | TABLE ACCESS BY INDEX ROWID| ORDERS | 1 | 26 | 0 (0)| 00:00:01|
| 5 | INDEX RANGE SCAN | IDX_O | 1 | | 0 (0)| 00:00:01|
---------------------------------------------------------------------------------
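For anyone who wants to reproduce this from scratch, here is a minimal sketch. It lists the view columns explicitly because a view defined with SELECT * over both tables would be rejected with a duplicate column name error (both tables have P_ID). The column order_id is a made-up example; substitute the real ORDERS columns.

```sql
-- Sketch only: order_id is a hypothetical column of ORDERS.
-- Columns are listed explicitly because SELECT * would expose P_ID twice,
-- which is not allowed in a view definition.
CREATE OR REPLACE VIEW my_view AS
SELECT p.p_id, o.order_id
FROM person p, orders o
WHERE p.p_id = o.p_id;

EXPLAIN PLAN FOR
SELECT * FROM my_view WHERE p_id = '1234';

-- If the predicate was pushed into the view, the plan shows index access
-- on PERSON and ORDERS rather than a VIEW step followed by a filter.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```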
Wow, this is interesting. I get two different explain plans depending on the data volume and the query inside the logical view (this is my assumption).
The original question's case: it is definitely doing the filtering first.
I have a small amount of data (fewer than 10 rows in total) in this test table.
| 0 | SELECT STATEMENT | | 2 | 132 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 2 | 132 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| PERSON | 1 | 40 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PERSON_PK | 1 | | 0 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | ORDERS_PK | 2 | 52 | 1 (0)| 00:00:01 |
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("P"."P_ID"=1)
4 - access("O"."P_ID"=1)
Note
-----
- dynamic sampling used for this statement
However, when the data becomes larger (just several hundred rows though, 300 to 400) and the view query becomes complex (using CONNECT BY etc.), the plan changes, I think:
| 0 | SELECT STATEMENT | | 1 | 29 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 29 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| RP_TRANSACTION | 1 | 12 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | RP_TRANSACTION_PK | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| RP_REQUEST | 279 | 4743 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | RP_REQUEST_PK | 1 | | 0 (0)| 00:00:01 |
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("TRANSACTION_ID"=18516648)
5 - access("REQ"."REQUEST_ID"="TRANS"."REQUEST_ID")
---- Below is my original post
To my knowledge, Oracle first executes the view (logical view) using temporary space and then applies the filter. So your query is basically the same as
SELECT *
FROM ( SELECT *
FROM PERSON P, ORDERS O
WHERE P.P_ID = O.P_ID
) where P_ID='1234'
I don't think you can create an index on a logical view (a materialized view uses indexes).
Also, be aware that you execute the query for MY_VIEW every time you run
select *
from MY_VIEW
where P_ID = '1234'.
I mean every single time. Naturally, that is not a good idea for performance.