How is this cardinality being calculated in Explain plan?

I am analyzing the explain plan for the following statement:
SELECT * FROM friends WHERE SUBSTR(activity,1,2) = '49';
Oracle SQL Developer tells me that it has a cardinality of 1513 and a cost of 1302.
How are these values calculated? Can they be reproduced with a statement (that is, can I run a SELECT and obtain the same value)?

The cardinality generated by an explain plan can be based on many factors, but in your code Oracle is probably just guessing that the SUBSTR expression will return 1% of all rows from the table.
For example, we can recreate your cardinality estimate by creating a simple table with 151,300 rows:
drop table friends;
create table friends(activity varchar2(100));
create index friends_idx on friends(activity);
insert into friends select level from dual connect by level <= 1513 * 100;
begin
dbms_stats.gather_table_stats(user, 'FRIENDS', no_invalidate => false);
end;
/
The resulting explain plan estimates the query will return 1% of the table, or 1513 rows:
explain plan for SELECT * FROM friends WHERE SUBSTR(activity,1,2) = '49';
select * from table(dbms_xplan.display);
Plan hash value: 3524934291
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1513 | 9078 | 72 (6)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FRIENDS | 1513 | 9078 | 72 (6)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUBSTR("ACTIVITY",1,2)='49')
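As for reproducing the number with a SELECT: assuming the optimizer really is falling back to its 1% guess for the SUBSTR predicate, and that the statistics on FRIENDS are current, a sketch like the following should give the same value from the data dictionary (the column alias is only illustrative):

```sql
-- Assumption: no expression statistics exist, so the optimizer guesses
-- 1% selectivity for "function(column) = literal".
SELECT num_rows,
       ROUND(num_rows * 0.01) AS guessed_cardinality
FROM user_tables
WHERE table_name = 'FRIENDS';
```

For the 151,300-row sample table this works out to 1513.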
The above code is the simplest explanation, but there are potentially dozens of other weird things that are going on with your query. Running EXPLAIN PLAN FOR SELECT... and then SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY); is often enough to investigate the cardinality. Pay special attention to the "Note" section for any unexpected gotchas.
Not all of these cardinality rules and features are documented. But if you have a lot of free time, and want to understand the math behind it all, run some 10053 trace files and read Jonathan Lewis' blog and book. His book also explains how the "cost" is generated, but the calculations are so complicated that it's not worth worrying about.
Why doesn't Oracle calculate a perfect cardinality estimate?
It's too expensive to calculate actual cardinalities before running the queries. To create an always-perfect estimate for the SUBSTR operation, Oracle would have to run something like the below query:
SELECT SUBSTR(activity,1,2), COUNT(*)
FROM friends
GROUP BY SUBSTR(activity,1,2);
For my sample data, the above query returns 99 counts, and determines that the cardinality estimate should be 1111 for the original query.
But the above query has to first read all the data from FRIENDS.ACTIVITY, which requires either an index fast full scan or a full table scan. Then the data has to be sorted or hashed to get the counts per group (which is likely an O(N*LOG(N)) operation). If the table is large, the intermediate results won't fit in memory and must be written and then read from disk.
Pre-calculating the cardinality would be more work than the actual query itself. The results could perhaps be saved, but storing those results could take up a lot of space, and how does the database know that the predicate will ever be needed again? And even if the pre-calculated cardinalities were stored, as soon as someone modifies the table those values may become worthless.
And this whole effort assumes that the functions are deterministic. While SUBSTR works reliably, what if the predicate used a non-deterministic function like DBMS_RANDOM.VALUE, or a custom function? These problems are both theoretically impossible (the halting problem), and very difficult in practice. Instead, the optimizer relies on guesses like DBA_TABLES.NUM_ROWS (from when the statistics were last gathered) * 0.01 for "complex" predicates.
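To see how far the 1% guess is from reality on the sample table, the two figures quoted earlier can be checked directly. This is only a sketch for the small test data; on a large table both statements read every row:

```sql
-- Number of distinct two-character prefixes (99 for the sample data)
SELECT COUNT(DISTINCT SUBSTR(activity,1,2)) AS prefix_count
FROM friends;

-- Actual number of rows matching the original predicate (1111 for the sample data)
SELECT COUNT(*) AS actual_rows
FROM friends
WHERE SUBSTR(activity,1,2) = '49';
```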
Dynamic Sampling
Dynamic sampling, also known as dynamic statistics, will pre-run parts of your SQL statement to create a better estimate. You can set the amount of data to be sampled, and by setting the value to 10, Oracle will effectively run the whole thing ahead of time to determine the cardinality. This feature can obviously be pretty slow, and there are lots of weird edge cases and other features I'm not discussing here, but for your query it can create a perfect estimate of 1,111 rows:
EXPLAIN PLAN FOR SELECT /*+ dynamic_sampling(10) */ * FROM friends WHERE SUBSTR(activity,1,2) = '49';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Plan hash value: 3524934291
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1111 | 6666 | 72 (6)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FRIENDS | 1111 | 6666 | 72 (6)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUBSTR("ACTIVITY",1,2)='49')
Note
-----
- dynamic statistics used: dynamic sampling (level=10)
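If the hint cannot be added to the statement, the same level can be requested for the whole session through the OPTIMIZER_DYNAMIC_SAMPLING parameter. This is a sketch rather than a recommendation, since level 10 can be expensive on large tables:

```sql
-- Session-level equivalent of the dynamic_sampling(10) hint
ALTER SESSION SET optimizer_dynamic_sampling = 10;

EXPLAIN PLAN FOR SELECT * FROM friends WHERE SUBSTR(activity,1,2) = '49';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Restore the default level (2) afterwards
ALTER SESSION SET optimizer_dynamic_sampling = 2;
```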
Dynamic Reoptimization
Oracle can keep track of the number of rows at run-time and adjust the plan accordingly. This feature doesn't help you with your simple sample query. But if the table was used as part of a join, where cardinality estimates become more important, Oracle will build multiple versions of the plan and choose among them at run time based on the actual cardinality.
In the below explain plan, you can see the estimate is still the same old 1513. But if the actual number is much lower at run time, Oracle will disable the HASH JOIN operation meant for a large number of rows, and will switch to the NESTED LOOPS operation that is better suited for a smaller number of rows.
EXPLAIN PLAN FOR
SELECT *
FROM friends friends1
JOIN friends friends2
ON friends1.activity = friends2.activity
WHERE SUBSTR(friends1.activity,1,2) = '49';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format => '+adaptive'));
Plan hash value: 215764417
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1530 | 18360 | 143 (5)| 00:00:01 |
| * 1 | HASH JOIN | | 1530 | 18360 | 143 (5)| 00:00:01 |
|- 2 | NESTED LOOPS | | 1530 | 18360 | 143 (5)| 00:00:01 |
|- 3 | STATISTICS COLLECTOR | | | | | |
| * 4 | TABLE ACCESS FULL | FRIENDS | 1513 | 9078 | 72 (6)| 00:00:01 |
|- * 5 | INDEX RANGE SCAN | FRIENDS_IDX | 1 | 6 | 168 (2)| 00:00:01 |
| 6 | TABLE ACCESS FULL | FRIENDS | 151K| 886K| 70 (3)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("FRIENDS1"."ACTIVITY"="FRIENDS2"."ACTIVITY")
4 - filter(SUBSTR("FRIENDS1"."ACTIVITY",1,2)='49')
5 - access("FRIENDS1"."ACTIVITY"="FRIENDS2"."ACTIVITY")
Note
-----
- this is an adaptive plan (rows marked '-' are inactive)
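EXPLAIN PLAN only shows what the optimizer intends to do. To see which branch of an adaptive plan was actually used, the statement has to be executed and its cursor inspected. A sketch, assuming the session can read the V$ views that DBMS_XPLAN.DISPLAY_CURSOR relies on (in SQL*Plus, run SET SERVEROUTPUT OFF first so that the "last statement" is the query itself):

```sql
-- Run the statement for real, collecting row-source statistics
SELECT /*+ gather_plan_statistics */ *
FROM friends friends1
JOIN friends friends2
ON friends1.activity = friends2.activity
WHERE SUBSTR(friends1.activity,1,2) = '49';

-- Plan of the last statement run in this session, with estimated (E-Rows)
-- and actual (A-Rows) row counts; inactive adaptive rows are marked '-'
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST +ADAPTIVE'));
```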
Expression Statistics
Expression statistics tell Oracle to gather additional types of statistics. We can force Oracle to gather statistics on the SUBSTR expression, and then those statistics can be used for more accurate estimates. In the below example, the final estimate is actually only slightly different. Expression statistics alone don't work well here, but that was just bad luck in this case.
SELECT dbms_stats.create_extended_stats(extension => '(SUBSTR(activity,1,2))', ownname => user, tabname => 'FRIENDS')
FROM DUAL;
begin
dbms_stats.gather_table_stats(user, 'FRIENDS');
end;
/
EXPLAIN PLAN FOR SELECT * FROM friends WHERE SUBSTR(activity,1,2) = '49';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Plan hash value: 3524934291
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1528 | 13752 | 72 (6)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FRIENDS | 1528 | 13752 | 72 (6)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUBSTR("ACTIVITY",1,2)='49')
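The new estimate of 1528 is consistent with the usual no-histogram arithmetic of table rows divided by distinct values of the expression (151,300 / 99 is roughly 1528). Assuming that is what the optimizer did here, the inputs can be inspected with a query along these lines (the aliases are illustrative):

```sql
-- The extension appears as a hidden column with a system-generated name
SELECT e.extension_name,
       s.num_distinct,
       s.histogram,
       ROUND(t.num_rows / NULLIF(s.num_distinct, 0)) AS rows_per_value
FROM user_stat_extensions e
JOIN user_tab_col_statistics s
  ON s.table_name = e.table_name
 AND s.column_name = e.extension_name
JOIN user_tables t
  ON t.table_name = e.table_name
WHERE e.table_name = 'FRIENDS';
```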
Expression Statistics and Histograms
With the addition of a histogram, we're finally creating something pretty similar to what your teacher described. When the expression statistics are gathered, a histogram will save information about the number of unique values in up to 255 different ranges or buckets. In our case, since there are only 99 unique values, the histogram will perfectly estimate the number of rows for '49' as '1111'.
--(There are several ways to gather histograms. Instead of directly forcing it, I prefer to call the query
-- multiple times so that Oracle will register the need for a histogram, and automatically create one.)
SELECT * FROM friends WHERE SUBSTR(activity,1,2) = '49';
SELECT * FROM friends WHERE SUBSTR(activity,1,2) = '49';
begin
dbms_stats.gather_table_stats(user, 'FRIENDS');
end;
/
EXPLAIN PLAN FOR SELECT * FROM friends WHERE SUBSTR(activity,1,2) = '49';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Plan hash value: 3524934291
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1111 | 9999 | 72 (6)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FRIENDS | 1111 | 9999 | 72 (6)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUBSTR("ACTIVITY",1,2)='49')
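For comparison, the histogram can also be requested directly instead of waiting for Oracle to notice the predicate. This is a sketch of that alternative, not the approach used above; the bucket count of 254 is an assumption chosen to be large enough for 99 distinct values:

```sql
begin
-- Ask for a histogram on the expression explicitly
dbms_stats.gather_table_stats(user, 'FRIENDS',
method_opt => 'FOR COLUMNS (SUBSTR(activity,1,2)) SIZE 254');
end;
/

-- Check whether a histogram now exists on the hidden expression column
SELECT column_name, num_distinct, histogram, num_buckets
FROM user_tab_col_statistics
WHERE table_name = 'FRIENDS';
```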
Summary
Oracle will not automatically pre-run all predicates to perfectly estimate cardinalities. But there are several mechanisms we can use to get Oracle to do something very similar for a small number of queries that we care about.
The situation gets even more complicated when you consider bind variables - what if the value '49' changes frequently? (Adaptive Cursor Sharing can help with that.) Or what if a huge number of rows are modified, how do we update statistics quickly? (Online Statistics Gathering and Incremental Statistics can help with that.)
The optimizer doesn't really optimize. There's only enough time to satisfice.

Related

Why my Oracle sql query is not using the available indexes on join columns?

I have executed the below query but the indexes are not being used.
Following are the indexes available for the below tables.
I have provided the explain plan generated for the query.
Can someone please tell me why the indexes are not being used?
I have gathered the table statistics multiple times also.
wms_area_master - Index name: WMS_AREA_MASTER_PK - Index columns: DC_CODE, DC_AREA
wms_bin_master - WMS_BIN_MASTER_IDX - DC_CODE, DC_AREA
EXPLAIN PLAN FOR
SELECT *
from wms_area_master wam ,
wms_bin_master wbm
where WAM.DC_CODE = wBM.DC_CODE
and WAM.DC_AREA = wBM.DC_AREA;
select * from table(dbms_xplan.display);
Plan hash value: 2387754896
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 41079 | 12M| 252 (2)| 00:00:01 |
|* 1 | HASH JOIN | | 41079 | 12M| 252 (2)| 00:00:01 |
| 2 | TABLE ACCESS FULL| WMS_AREA_MASTER | 217 | 32984 | 4 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| WMS_BIN_MASTER | 41058 | 6214K| 248 (2)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("WAM"."DC_CODE"="WBM"."DC_CODE" AND
"WAM"."DC_AREA"="WBM"."DC_AREA")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- this is an adaptive plan
- 1 Sql Plan Directive used for this statement
Thanks
Your query doesn't appear to have any predicates, just join conditions, so there doesn't appear to be any reason to use an index here. Since you need to read all the data from both tables, the fastest way to do so will be to do table scans. Using an index isn't necessarily faster and doing a table scan isn't necessarily slower; it depends on how much of the data you need to access.
If you had predicates in your query that restricted the rows that were returned, Oracle might find it advantageous to use an index on those columns. If your projection (the columns in the select) list were only columns that were part of an index rather than every column in the table, it is possible that Oracle would choose to do a full scan of the index rather than of the table assuming the index segment was meaningfully smaller than the table segment.
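As a rough illustration of both points, the variant below restricts the rows and selects only indexed columns, which gives the optimizer a reason to consider the indexes. The filter value 'DC01' is hypothetical (the question does not show the column's datatype or contents), and the optimizer may still prefer full scans on tables this small:

```sql
-- Hypothetical filter value; 'DC01' is not from the original question
EXPLAIN PLAN FOR
SELECT wam.dc_code, wam.dc_area
FROM wms_area_master wam,
     wms_bin_master wbm
WHERE wam.dc_code = wbm.dc_code
AND wam.dc_area = wbm.dc_area
AND wbm.dc_code = 'DC01';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```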

Oracle SQL execution plan changes due to SYS_OP_C2C internal conversion

I'm wondering why cost of this query
select * from address a
left join name n on n.address_id=a.id
where a.street='01';
is higher than
select * from address a
left join name n on n.address_id=a.id
where a.street=N'01';
where address table looks like this
ID NUMBER
STREET VARCHAR2(255 CHAR)
POSTAL_CODE VARCHAR2(255 CHAR)
and name table looks like this
ID NUMBER
ADDRESS_ID NUMBER
NAME VARCHAR2(255 CHAR)
SURNAME VARCHAR2(255 CHAR)
These are costs returned by explain plan
Explain plan for '01'
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3591 | 1595K| 87 (0)| 00:00:02 |
| 1 | NESTED LOOPS OUTER | | 3591 | 1595K| 87 (0)| 00:00:02 |
|* 2 | TABLE ACCESS FULL | ADDRESS | 3 | 207 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| NAME | 1157 | 436K| 47 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | NAME_HSI | 1157 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"."STREET"='01')
4 - access("N"."ADDRESS_ID"(+)="A"."ID")
Explain plan for N'01'
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 347 | 154K| 50 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 347 | 154K| 50 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL | ADDRESS | 1 | 69 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| NAME | 1157 | 436K| 47 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | NAME_HSI | 1157 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(SYS_OP_C2C("A"."STREET")=U'01')
4 - access("N"."ADDRESS_ID"(+)="A"."ID")
As you can see cost for N'01' query is lower than cost for '01'. Any idea why? N'01' needs additionally convert varchar to nvarchar so cost should be higher (SYS_OP_C2C()). The other question is why rows processed by N'01' query is lower than '01'?
[EDIT]
Table address has 30 rows.
Table name has 19669 rows.
SYS_OP_C2C is an internal function which does an implicit conversion of varchar2 to national character set using TO_NCHAR function. Thus, the filter completely changes as compared to the filter using normal comparison.
I am not sure why the estimated number of rows is lower in your case, but it could just as well be higher. Cost estimation won't be affected.
Let's try to see step-by-step in a test case.
SQL> CREATE TABLE t AS SELECT 'a'||LEVEL col FROM dual CONNECT BY LEVEL < 1000;
Table created.
SQL>
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE col = 'a10';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 5 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
1 - filter("COL"='a10')
13 rows selected.
SQL>
So far so good. Since there is only one row with the value 'a10', the optimizer estimated one row.
Let's see what happens with the national character set conversion.
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE col = N'a10';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 50 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 10 | 50 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
1 - filter(SYS_OP_C2C("COL")=U'a10')
13 rows selected.
SQL>
What happened here? We can see filter(SYS_OP_C2C("COL")=U'a10'), which means an internal function is applied to the column and it converts the varchar2 value to nvarchar2. The optimizer can no longer use the column statistics directly, and it now estimates 10 rows.
This will also suppress any index usage, since now a function is applied on the column. We can tune it by creating a function-based index to avoid the full table scan.
SQL> create index nchar_indx on t(to_nchar(col));
Index created.
SQL>
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE to_nchar(col) = N'a10';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1400144832
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 50 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 10 | 50 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | NCHAR_INDX | 4 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
---------------------------------------------------
2 - access(SYS_OP_C2C("COL")=U'a10')
14 rows selected.
SQL>
However, will this make the execution plans similar? No. I think that with two different character sets the filter will not be applied in the same way, and that is where the difference comes from.
My research says,
Usually, such scenarios occur when the data coming via an application
is nvarchar2 type, but the table column is varchar2. Thus, Oracle
applies an internal function in the filter operation. My suggestion
is, to know your data well, so that you use similar data types during
design phase.
When worrying about explain plans, it matters whether there are current statistics on the tables. If the statistics do not represent the actual data reasonably well, then the optimizer will make mistakes and estimate cardinalities incorrectly.
You can check how long ago statistics were gathered by querying the data dictionary:
select table_name, last_analyzed
from user_tables
where table_name in ('ADDRESS','NAME');
You can gather statistics for the optimizer to use by calling DBMS_STATS:
begin
dbms_stats.gather_table_stats(user, 'ADDRESS');
dbms_stats.gather_table_stats(user, 'NAME');
end;
/
So perhaps after gathering statistics you will get different explain plans. Perhaps not.
The difference in your explain plans is primarily because the optimizer estimates how many rows it will find in address table differently in the two cases.
In the first case you have an equality predicate with same datatype - this is good and the optimizer can often estimate cardinality (row count) reasonably well for cases like this.
In the second case a function is applied to the column - this is often bad (unless you have function-based indexes) and will force the optimizer to take a wild guess. That wild guess will be different in different versions of Oracle as the developers of the optimizer try to improve upon it. In some versions the wild guess will simply be something like "I guess 5% of the number of rows in the table."
When comparing different datatypes, it is best to avoid implicit conversions, particularly when, as in this case, the implicit conversion applies a function to the column rather than to the literal. If you have cases where you get a value as datatype NVARCHAR2 and need to use it in a predicate like above, it can be a good idea to explicitly convert the value to the datatype of the column.
select * from address a
left join name n on n.address_id=a.id
where a.street = CAST( N'01' AS VARCHAR2(255));
In this case with a literal it does not make sense, of course. Here you would just use your first query. But if it was a variable or function parameter, maybe you could have use cases for doing something like this.
As I can see, the first query is estimated to return 3591 rows and the second one 347 rows. So Oracle expects fewer I/O operations for the second query, which is why its cost is lower.
Don't be confused with
N'01' needs additionally convert varchar to nvarchar
Oracle does one hard parse and then uses soft parses for the same queries, so the longer your Oracle instance runs, the faster it becomes.

Explain plan and Query execution time differences

I have two tables, TABLE_A and TABLE_B (one to many, with the FK to TABLE_A in TABLE_B). I have written the following 3 queries. They all do basically the same thing, but each one runs at a different speed.
Time: 3.916 seconds.
SELECT count(*)
FROM TABLE_A hconn
WHERE EXISTS
(SELECT *
FROM TABLE_B hipconn
WHERE HIPCONN.A_ID = HCONN.A_ID
);
Time: 3.52 seconds
SELECT COUNT(*)
FROM TABLE_A hconn,
TABLE_B HIPCONN
WHERE HCONN.A_ID = HIPCONN.A_ID;
Time: 2.72 seconds.
SELECT COUNT(*)
FROM TABLE_A HCONN
JOIN TABLE_B HIPCONN
ON HCONN.A_ID = HIPCONN.A_ID;
From the above timings, the last query performs better than the others. (I've tested them a number of times; they always finish in the order shown, with the last query the fastest.)
I started looking at the explain plans for the above queries to find out why this is happening.
The explain plan shows the same cost and time for all of the above queries, without any difference (explain plan below). I re-ran it a couple of times, but the result is the same for all of them.
Question: Why does the speed vary when the explain plan shows the same estimated time for all the queries? Where am I going wrong?
Plan hash value: 600428245
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | | 12913 (2)| 00:02:35 |
| 1 | SORT AGGREGATE | | 1 | 11 | | | |
|* 2 | HASH JOIN RIGHT SEMI | | 2273K| 23M| 39M| 12913 (2)| 00:02:35 |
| 3 | INDEX STORAGE FAST FULL SCAN| BIN$ACCkNNuTHKPgUJAKNJgj5Q==$0 | 2278K| 13M| | 1685 (2)| 00:00:21 |
| 4 | INDEX STORAGE FAST FULL SCAN| BIN$ACCkNNubHKPgUJAKNJgj5Q==$0 | 6448K| 30M| | 4009 (2)| 00:00:49 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("HIPCONN"."A_ID"="HCONN"."A_ID")
You may use DBMS_XPLAN.DISPLAY_CURSOR to display the actual execution plan for the last SQL statement executed, since the queries may have more than one execution plan in the library cache.
Also you may enable a 10046 trace at level 12 to check why the queries are responding with different execution times.
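Both suggestions can be sketched as follows. This assumes the session has access to the V$ views used by DBMS_XPLAN.DISPLAY_CURSOR and is allowed to set events; the query against V$DIAG_INFO is just one way to locate the trace file on 11g and later:

```sql
-- Actual plan of the last statement executed in this session
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);

-- Extended SQL trace including bind values and wait events (level 12)
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';
-- ... run the query to be investigated here ...
ALTER SESSION SET EVENTS '10046 trace name context off';

-- Where the trace file for this session was written
SELECT value FROM v$diag_info WHERE name = 'Default Trace File';
```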

Oracle: is there any logical reason not to use parallel execution with subqueries in the SELECT list?

Is there any logical reason for Oracle not to use parallel execution with scalar subqueries in the SELECT list? Why shouldn't it use them?
A SELECT statement can be parallelized only if the following conditions are satisfied:
- The query includes a parallel hint specification (PARALLEL or PARALLEL_INDEX) or the schema objects referred to in the query have a PARALLEL declaration associated with them.
- At least one of the tables specified in the query requires one of the following:
  - A full table scan
  - An index range scan spanning multiple partitions
- No scalar subqueries are in the SELECT list.
Every item in that list is wrong.
(At least for Oracle 11gR2, and probably 10g as well. The list may be accurate for some obsolete versions of Oracle.)
I recommend using the official Oracle documentation whenever possible, but the parallel execution chapter is not very accurate.
And even when the manual isn't wrong, it is often misleading, because parallel execution is very complicated. If you go through all the documentation you'll find there are about 30 different variables that determine the degree of parallelism. If you ever see a short checklist of items, you should be very skeptical. Those checklists are usually just the most relevant items to consider in a very specific context.
Example:
SQL> --Create a table without any parallel settings
SQL> create table parallel_test(a number primary key, b number);
Table created.
SQL> --Create some test data
SQL> insert into parallel_test
2 select level, level from dual connect by level <= 100000;
100000 rows created.
SQL> commit;
Commit complete.
SQL> --Force the session to run the query in parallel
SQL> alter session force parallel query;
Session altered.
SQL> --Generate explain plan
SQL> explain plan for
2 select a
3 ,(
4 select a
5 from parallel_test parallel_test2
6 where parallel_test2.a = parallel_test.a
7 )
8 from parallel_test;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3823224058
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 116K| 1477K| 9 (0)| 00:00:01 | | | |
|* 1 | INDEX UNIQUE SCAN | SYS_C0028894 | 1 | 13 | 1 (0)| 00:00:01 | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 116K| 1477K| 9 (0)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
| 4 | PX BLOCK ITERATOR | | 116K| 1477K| 9 (0)| 00:00:01 | Q1,00 | PCWC | |
| 5 | INDEX FAST FULL SCAN| SYS_C0028894 | 116K| 1477K| 9 (0)| 00:00:01 | Q1,00 | PCWP | |
---------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("PARALLEL_TEST2"."A"=:B1)
Note
-----
- dynamic sampling used for this statement (level=2)
21 rows selected.
SQL>
No parallel hint, no parallel objects, no full table scans, no index range scans spanning multiple partitions, and a scalar subquery.
Not a single condition met, yet the query still uses parallelism. (I also verified v$px_process to make sure that the query really does use parallelism, and it's not just an explain plan failure.)
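Another way to check this from the same session, as an alternative to looking at v$px_process, is the per-session parallel query statistics view. A sketch, assuming the session has SELECT privilege on the view:

```sql
-- Run immediately after the query, in the same session.
-- LAST_QUERY should be 1 if the previous query ran in parallel.
SELECT statistic, last_query, session_total
FROM v$pq_sesstat
WHERE statistic = 'Queries Parallelized';
```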
This means the answer to your other question is wrong.
I'm not sure exactly what's going on in that case, but I think it has to do with the FAST DUAL optimization. In some contexts, DUAL isn't used as a table, so there's nothing to parallelize. This is probably a "bug", but if you're using DUAL then you really don't want parallelism anyway. (Although I assume you used DUAL for demonstration purposes, and your real query is more complicated. If so, you may need to update the query with a more realistic example.)

Total cost of a query through Oracle's explain plan

I am a bit new to Oracle and I have a question regarding Oracle's explain plan. I have used the 'auto-trace' feature for a particular query.
SQL> SELECT * from myTable;
11 rows selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 1233351234
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 330 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| MYTABLE| 11 | 330 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
My question is: if I want to calculate the 'total' cost of this query, is it 6 (3+3) or is it only 3? Suppose I had a larger query with more steps in the plan. Do I have to add up all the values in the cost column to get the total cost, or is the first value (ID=0) the total cost of the query?
The cost is 3. The plan is shown as a hierarchy, with the cost of the sub-components already included in the parent components.
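If the total is needed as a single value, it can be read from the plan table rather than summed by hand. A small sketch, assuming the standard PLAN_TABLE is available; the statement id 'cost_demo' is just an illustrative label:

```sql
EXPLAIN PLAN SET STATEMENT_ID = 'cost_demo' FOR
SELECT * FROM myTable;

-- The row with ID = 0 carries the total estimated cost of the statement;
-- child rows show costs that are already rolled up into it
SELECT id, operation, options, cost
FROM plan_table
WHERE statement_id = 'cost_demo'
ORDER BY id;
```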
You might also want to take a look at some of the responses to:
How do you interpret a query's explain plan?