How to optimize query? Explain Plan - sql

I have one table with 3 fields and I need to get the values of all fields. I have the following query:
SELECT COM.FIELD1, COM.FIELD2, COM.FIELD3
FROM OWNER.TABLE_NAME COM
WHERE COM.FIELD1 <> V_FIELD
ORDER BY COM.FIELD3 ASC;
I want to optimize it. These are the values from the explain plan:
Plan
SELECT STATEMENT CHOOSE Cost: 4 Bytes: 90 Cardinality: 6
2 SORT ORDER BY Cost: 4 Bytes: 90 Cardinality: 6
1 TABLE ACCESS FULL OWNER.TABLE_NAME Cost: 2 Bytes: 90 Cardinality: 6
Is there any way to avoid the TAF (Table Access Full)?
Thanks!

Since your WHERE condition is on the column FIELD1, an index on that column may help.
You may already have an index on that column. Even then, you will still see a full table access if the expected number of rows that don't have VAL1 (the value passed in V_FIELD) in that column is sufficiently large.
The only case in which you will NOT see a full table access is when you have an index on that column, the vast majority (at least, say, 80% to 90%) of rows in the table do have the value VAL1 in the column FIELD1, and statistics are up to date. You may also need a histogram, because in this case the distribution of values in FIELD1 would be very skewed.
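A minimal sketch of the setup described above, assuming no index exists yet on FIELD1. The index name is hypothetical, and OWNER and TABLE_NAME are the placeholders from the question. The `method_opt` string asks for default column statistics plus a histogram of up to 254 buckets on FIELD1.

```sql
-- Hypothetical index name; skip this step if FIELD1 is already indexed.
CREATE INDEX table_name_field1_idx ON owner.table_name (field1);

-- Refresh statistics and request a histogram on the skewed column.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'OWNER',
    tabname    => 'TABLE_NAME',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS SIZE 254 FIELD1',
    cascade    => TRUE);  -- also gather statistics for the table's indexes
END;
/
```

Even with this in place, the optimizer may still prefer the full scan if it estimates that too many rows qualify.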

I suppose that your table has a very large number of rows with a given key (let's call it 'B') and a very small number of rows with other keys.
Note that index access will work only for the condition FIELD1 <> 'B'. A predicate that excludes any other key will still return the 'B' rows, i.e. most of the table, and is therefore not suitable for index access.
Note also that if you have more than one large key, index access will not work for the same reason: you will never get only a few records, which is the case where an index can pay off.
As a starting point you can reformulate the predicate
FIELD1 <> V_FIELD
as
DECODE(FIELD1,V_FIELD,1,0) = 0
The DECODE returns 1 if FIELD1 = V_FIELD and 0 if FIELD1 <> V_FIELD. (It also returns 0 when FIELD1 is NULL, so the two predicates differ for NULL values.)
This transformation allows you to define a function based index with the DECODE expression.
Example
create table tt as
select
decode(mod(rownum,10000),1,'A','B') FIELD1
from dual connect by level <= 50000;
select field1, count(*) from tt group by field1;
FIELD1 COUNT(*)
------ ----------
A 5
B 49995
FBIndex
create index tti on tt(decode(field1,'B',1,0));
Use your large key for the index definition.
Access
To select FIELD1 <> 'B', use the reformulated predicate decode(field1,'B',1,0) = 0,
which leads nicely to an index access:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
SELECT * from tt where decode(field1,'B',1,0) = 0;
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 471 | 2355 | 24 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT | 471 | 2355 | 24 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TTI | 188 | | 49 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(DECODE("FIELD1",'B',1,0)=0)
To select FIELD1 <> 'A', use the reformulated predicate decode(field1,'A',1,0) = 0
Here you don't want index access, as nearly the whole table is returned, and the CBO chooses a FULL TABLE SCAN.
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
SELECT * from tt where decode(field1,'A',1,0) = 0;
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 47066 | 94132 | 26 (4)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TT | 47066 | 94132 | 26 (4)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(DECODE("FIELD1",'A',1,0)=0)
Bind Variables
This will work the same way even if you use bind variables (FIELD1 <> V_FIELD), provided you always pass the same value.
Bind variable peeking will look at the bind value during the first parse and generate the proper plan.
If you use more than one value for the bind variable (and therefore expect different plans for different values), you will get to know the adaptive cursor sharing feature.
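One way to observe adaptive cursor sharing, sketched under assumptions: the statement has been tagged with a marker comment (`acs_demo` is a hypothetical tag), it has been executed a few times with different bind values, and the session may query `V$SQL`. Whether a cursor is flagged at all depends on the predicate type and on the available histograms.

```sql
-- Statement assumed to have been run beforehand, several times, with different binds:
--   SELECT /* acs_demo */ com.field1, com.field2, com.field3
--   FROM   owner.table_name com
--   WHERE  com.field1 <> :v_field
--   ORDER  BY com.field3;

-- One row per child cursor; multiple children for the same SQL_ID with
-- IS_BIND_AWARE = 'Y' indicate that different plans were generated.
SELECT sql_id,
       child_number,
       plan_hash_value,
       is_bind_sensitive,
       is_bind_aware,
       executions
FROM   v$sql
WHERE  sql_text LIKE '%acs_demo%'
AND    sql_text NOT LIKE '%v$sql%'   -- exclude this monitoring query itself
ORDER  BY sql_id, child_number;
```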

The query is already optimized, don't spend any more time on it unless it's running noticeably slow. If you have a tuning checklist that says "avoid all full table scans" it might be time to change that checklist.
The cost of the full table scan is only 2. The exact meaning of the cost is tricky, and not always particularly helpful. But in this case it's probably safe to say that 2 means the full table scan will run quickly.
If the query is not running in less than a few microseconds, or is returning significantly more than the estimated 6 rows, then there may be a problem with the optimizer statistics. If that's the case, try gathering statistics like this:
begin
dbms_stats.gather_table_stats('OWNER', 'TABLE_NAME');
end;
/
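To judge whether the statistics are the problem, the stored figures can be compared with the real row count. This is a sketch that assumes the session can read `ALL_TABLES`; OWNER and TABLE_NAME are the placeholders from the question.

```sql
-- What the optimizer believes about the table, and when that was last refreshed.
SELECT num_rows, blocks, last_analyzed
FROM   all_tables
WHERE  owner = 'OWNER'
AND    table_name = 'TABLE_NAME';

-- The actual row count, for comparison with NUM_ROWS above.
SELECT COUNT(*) AS actual_rows
FROM   owner.table_name;
```

A large gap between NUM_ROWS and the actual count, or an old LAST_ANALYZED date, suggests the estimate of 6 rows may not be trustworthy.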
As @symcbean pointed out, a full table scan is not always a bad thing. If a table is incredibly small, like this one might be, all the data may fit inside a single block. (Oracle accesses data by block(s)-at-a-time, where the block is usually 8KB of data.) When the data structures are trivially small there won't be any significant difference between using a table or an index.
Also, full table scans can use multi-block reads, whereas most index access paths use single-block reads. For reading a large percentage of data it's faster to read the whole thing with multi-block reads than reading it one-block-at-a-time with an index. Since this query only has a <> condition, it looks likely that this query will read a large percentage of data and a full table scan is optimal.

Related

What is SYS_OP_UNDESCEND and SYS_OP_DESCEND in Oracle Explain Plan?

I have an Oracle explain plan that looks like this:
Plan hash value: 2484140766
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 180K| 84M| 5 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 180K| 84M| 5 (0)| 00:00:01 |
|* 3 | TABLE ACCESS BY INDEX ROWID | OSTRICH | 6500K| 793M| 5 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN DESCENDING| OSTRICH_ENDDATE_IDX_2 | 1 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=180000)
3 - filter("OSTRICH_STATUS_ID"=2)
4 - access(SYS_OP_DESCEND("END_DATE")>=SYS_OP_DESCEND(SYSDATE@!))
filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("END_DATE"))<=SYSDATE@!)
I have been trying to understand what is happening with these 2 lines at the bottom:
4 - access(SYS_OP_DESCEND("END_DATE")>=SYS_OP_DESCEND(SYSDATE@!))
filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("END_DATE"))<=SYSDATE@!)
What do SYS_OP_UNDESCEND and SYS_OP_DESCEND mean?
The index that the explain plan references is (I think) called a descending index. (I do not know a lot about Oracle indexing.) The DDL for that index is:
CREATE INDEX
OSTRICH_ENDDATE_IDX_2
ON
OSTRICH
(
"END_DATE" DESC
);
The actual query looks like this:
SELECT
l.id,
l.end_date,
l.status
FROM
(
SELECT
*
from OSTRICH l2
where END_DATE <= SYSDATE
and OSTRICH_STATUS_ID = 2
order by l2.END_DATE
) l
WHERE ROWNUM <= 180000;
What do SYS_OP_UNDESCEND and SYS_OP_DESCEND mean? This query is taking much longer than I would expect, and I am trying to understand what impact the descending and undescending has on the query?
Oracle implements the descending index "as if" it were a function-based index. Function-based indexes are invoked when a query uses the function call; thus an FBI on upper(col1) would be used when the WHERE clause filters on upper(col1) = 'WHATEVER'.
In this case I think SYS_OP_DESCEND is the "function" Oracle uses when creating a descending index. I think it is then invoking SYS_OP_UNDESCEND because your WHERE clause is unsuited to a descending index. It's not surprising the performance sucks.
There are very few use cases where a descending index is a good idea. Why are you using one on this column on this table?
Assuming there is a good reason for using the index and you can't just drop it, your best bet for improved performance would be to not use the index for this query. Doing something like this should prevent the optimiser from using the index:
SELECT
l.id,
l.end_date,
l.status
FROM
(
SELECT /*+ NO_INDEX(l2 OSTRICH_ENDDATE_IDX_2) */
*
from OSTRICH l2
where END_DATE <= SYSDATE
and OSTRICH_STATUS_ID = 2
order by l2.END_DATE
) l
WHERE ROWNUM <= 180000;
SYS_OP_UNDESCEND and SYS_OP_DESCEND are internal functions used by the CBO. They appear in the EXPLAIN PLAN when a function-based index is used or when a sort order has been specified inside an index definition.
In your case, you are using an index with a sort (DESC) clause:
CREATE INDEX
OSTRICH_ENDDATE_IDX_2
ON
OSTRICH
(
"END_DATE" DESC
);
Your plan shows these two operations:
access(SYS_OP_DESCEND("END_DATE")>=SYS_OP_DESCEND(SYSDATE@!))
filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("END_DATE"))<=SYSDATE@!)
The first operation is the access, based on the DESC clause of the index itself, and the second is the filter. Both appear because the query runs against the nature of the index.
I would never use this clause in an index unless the data is always accessed in that order, which is quite rare, because sorting in different ways is what SQL is normally used for.
There is also this bug (fixed in 20.1):
Bug 27589260 wrong sort order due to virtual column replacement in function based index
It degrades the performance of a query when a virtual column is present in the table and a function-based index is used.
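As an illustration of the alternative to a descending index, here is a hedged sketch of a plain ascending index that covers both predicates of the query. The index name and the column order are assumptions, not something taken from the question, and whether the optimizer picks it depends on the data and statistics.

```sql
-- Hypothetical index: equality column first, then the range/sort column.
-- An ascending B-tree index can be read in either direction, so no DESC is needed.
CREATE INDEX ostrich_status_enddate_idx
  ON ostrich (ostrich_status_id, end_date);

-- The original query, unchanged; with the index above the rows for status 2
-- can be read already ordered by END_DATE, which may avoid a separate sort.
SELECT l.id, l.end_date, l.status
FROM  (SELECT *
       FROM   ostrich l2
       WHERE  l2.end_date <= SYSDATE
       AND    l2.ostrich_status_id = 2
       ORDER  BY l2.end_date) l
WHERE ROWNUM <= 180000;
```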

Hint FIRST_ROWS(n) not giving optimized result for Order by clause

We have around 8 million records in a table with around 50 columns. We need to see a few records very quickly, so we are using the FIRST_ROWS(10) hint for this purpose, and it's working amazingly fast.
SELECT /*+ FIRST_ROWS(10) */ ABC.view_ABC.ID, ABC.view_ABC.VERSION, ABC.view_ABC.M_UUID, ABC.view_ABC.M_PROCESS_NAME FROM ABC.view_ABC
However, when we add an ORDER BY clause, e.g. on creationtime (which is almost unique for each row in that table), the query takes ages to return the result.
SELECT /*+ FIRST_ROWS(10) */ ABC.view_ABC.ID, ABC.view_ABC.VERSION, ABC.view_ABC.M_UUID, ABC.view_ABC.M_PROCESS_NAME FROM ABC.view_ABC ORDER BY ABC.view_ABC.CREATIONTIME DESC
One thing I noticed: if we ORDER BY a column like VERSION, which has the same value for multiple rows, the result comes back faster.
ORDER BY does not work efficiently for any unique column, such as the ID column in this table.
Another thing worth considering: if we reduce the number of columns fetched, e.g. 3 columns instead of 50, the results somehow come back faster.
P.S. Statistics are gathered on this table weekly, but data is pushed hourly. Only INSERT statements run against this table; there are no DELETE or UPDATE queries.
Also, there is a simple view created on this table, and the above queries are run against that view.
Without an order by clause the optimiser can perform whatever join operations your view is hiding and start returning data as soon as it has some. The hint is changing how it accesses the underlying tables so that it, for example, does a nested loop join instead of a merge join, which would allow it to find the first matching rows quickly but might be less efficient overall for returning all of the data. Your hint is telling the optimiser that you want it to prioritise the speed of the first batch of rows returned over the speed of the entire query.
When you add the order by clause then all of the data has to be found before it can be ordered. All of the join conditions have to be met and all of the nested loops/merges etc. completed, and then the entire result set has to be sorted into the order you specified, before any rows can be returned.
If the column you're ordering by is indexed and that index is being used (or can be used) by the optimiser to identify rows in the driving table then it's possible it could be incorporating that into the sort, but you can't rely on that as the optimiser can change the plan as the data and statistics change.
You may find it useful to look at the execution plans of your various queries, with and without the hint, to see what the optimiser is doing in each case, including where in the chain of steps it is doing the sort operation, and the types of joins it is doing.
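The comparison suggested above could look roughly like this. It is a sketch using the view and column names from the question; the statement IDs are arbitrary labels and the default `PLAN_TABLE` is assumed to be available.

```sql
-- Plan with the hint.
EXPLAIN PLAN SET STATEMENT_ID = 'with_hint' FOR
SELECT /*+ FIRST_ROWS(10) */ v.ID, v.VERSION, v.M_UUID, v.M_PROCESS_NAME
FROM   ABC.view_ABC v
ORDER  BY v.CREATIONTIME DESC;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'with_hint', 'TYPICAL'));

-- Plan without the hint, for comparison.
EXPLAIN PLAN SET STATEMENT_ID = 'no_hint' FOR
SELECT v.ID, v.VERSION, v.M_UUID, v.M_PROCESS_NAME
FROM   ABC.view_ABC v
ORDER  BY v.CREATIONTIME DESC;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'no_hint', 'TYPICAL'));
```

In the output, a SORT ORDER BY step sitting above a full scan means every row is read and sorted before the first one is returned; an index scan in the requested order with no sort step is what lets the first rows come back quickly.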
There is a multi-column index that includes this column (CREATION_TIME), but somehow the Oracle optimizer was not using it.
However, another column on the same table (TERMINATION_TIME) has an index of its own. So we used the same query but with this indexed column in the ORDER BY clause.
Below is the explain plan for the first query, with CREATION_TIME (part of the multi-column index) in the ORDER BY clause.
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7406K| 473M| | 308K (1)| 01:01:40 |
| 1 | SORT ORDER BY | | 7406K| 473M| 567M| 308K (1)| 01:01:40 |
| 2 | TABLE ACCESS FULL| Table_ABC | 7406K| 473M| | 189K (1)| 00:37:57 |
-------------------------------------------------------------------------------------------------------------
And this one is with TERMINATION_TIME in the ORDER BY clause.
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 670 | 10 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TABLE_ABC | 7406K| 473M| 10 (0)| 00:00:01 |
| 2 | INDEX FULL SCAN DESCENDING| XGN620150305000000 | 10 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
As you can see, there is a clear difference in the Cost, the Rows involved, the usage of temporary space (which is not used at all in the latter case) and finally the Time.
Now the query response time is much better.
Thanks.

Oracle SQL query running slow, full table scan on primary key, why?

I have a problem with a piece of code: I can't understand why the query below is doing a full table scan on the works table when wrk.cre_surr_id is the primary key. The stats on both tables are up to date. Below are the indexes on both tables.
TABLE INDEXES
WORKS
INDEX NAME       UNIQUE  LOGGING  COLUMN NAME                                    ORDER
WRK_I1           N       NO       LOGICALLY_DELETED_Y                            Asc
WRK_ICE_WRK_KEY  N       YES      ICE_WRK_KEY                                    Asc
WRK_PK           Y       NO       CRE_SURR_ID                                    Asc
WRK_TUNECODE_UK  Y       NO       TUNECODE                                       Asc
TLE_TITLE_TOKENS
INDEX NAME       UNIQUE  LOGGING  COLUMN NAME                                    ORDER
TTT_I1           N       YES      TOKEN_TYPE, SEARCH_TOKEN, DN_WRK_CRE_SURR_ID   Asc
TTT_TLE_FK_1     N       YES      TLE_SURR_ID
Problem query below. It has a cost of 245,876, which seems high. It's doing a FULL TABLE SCAN of the WORKS table, which has 21,938,384 rows. It is doing an INDEX RANGE SCAN of the TLE_TITLE_TOKENS table, which has 19,923,002 rows. The explain plan also shows an INLIST ITERATOR; I haven't a clue what it means, but I think it has to do with having an "in ('E','N')" in my SQL query.
SELECT wrk.cre_surr_id
FROM works wrk,
tle_title_tokens ttt
WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
AND wrk.logically_deleted_y IS NULL
AND ttt.token_type in ('E','N')
AND ttt.search_token LIKE 'BELIEVE'||'%'
When I break the query down and do a simple select from the TLE_TITLE_TOKENS table I get 280,000 records back.
select ttt.dn_wrk_cre_surr_id
from tle_title_tokens ttt
where ttt.token_type in ('E','N')
and ttt.search_token LIKE 'BELIEVE'||'%'
How do I stop it doing a FULL TABLE SCAN on the WORKS table? I could put a hint on the query, but I would have thought Oracle would be clever enough to know to use the index without a hint.
Also, on the TLE_TITLE_TOKENS table, would it be better to create a function-based index on the column SEARCH_TOKEN, as users seem to do LIKE % searches on this field? What would that function-based index look like?
I'm running on an Oracle 11g database.
Thanks in advance for any answers.
First, rewrite the query using a join:
SELECT wrk.cre_surr_id
FROM tle_title_tokens ttt JOIN
works wrk
ON ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
WHERE wrk.logically_deleted_y IS NULL AND
ttt.token_type in ('E', 'N') AND
ttt.search_token LIKE 'BELIEVE'||'%';
You should be able to speed up this query by using indexes. It is not clear what the best index is. I would suggest tle_title_tokens(search_token, token_type, dn_wrk_cre_surr_id) and works(cre_surr_id, logically_deleted_y).
Another possibility is to write the query using EXISTS, such as:
SELECT wrk.cre_surr_id
FROM works wrk
WHERE wrk.logically_deleted_y IS NULL AND
EXISTS (SELECT 1
FROM tle_title_tokens ttt
WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id AND
ttt.token_type IN ('N', 'E') AND
ttt.search_token LIKE 'BELIEVE'||'%'
) ;
For this version, you want indexes on works(logically_deleted_y, cre_surr_id) and tle_title_tokens(dn_wrk_cre_surr_id, token_type, search_token).
Try this:
SELECT /*+ leading(ttt) */ wrk.cre_surr_id
FROM works wrk,
tle_title_tokens ttt
WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
AND wrk.logically_deleted_y IS NULL
AND ttt.token_type in ('E','N')
AND ttt.search_token LIKE 'BELIEVE'||'%'
Out of the 19,923,002 rows in TLE_TITLE_TOKENS,
how many records have TOKEN_TYPE 'E', and how many have 'N'? Are there any other token types? If yes, how many rows do they account for together?
If E and N together form a small part of the total records, then check whether histogram statistics are up to date for that column.
The execution plan depends on how many records are selected from TLE_TITLE_TOKENS out of the 20M records for the given filters.
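The questions above can be answered with two queries. This is a sketch that assumes the table lives in the current schema (otherwise `ALL_TAB_COL_STATISTICS` with an owner filter would be needed).

```sql
-- Distribution of token types across the table.
SELECT token_type, COUNT(*) AS row_count
FROM   tle_title_tokens
GROUP  BY token_type
ORDER  BY row_count DESC;

-- Whether a histogram exists on TOKEN_TYPE and when it was last gathered.
-- HISTOGRAM = 'NONE' means the optimizer assumes an even distribution.
SELECT column_name, num_distinct, num_buckets, histogram, last_analyzed
FROM   user_tab_col_statistics
WHERE  table_name  = 'TLE_TITLE_TOKENS'
AND    column_name = 'TOKEN_TYPE';
```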
I'm assuming these index definitions:
create index works_idx on works (cre_surr_id,logically_deleted_y);
create index title_tokens_idx on tle_title_tokens(search_token,token_type,dn_wrk_cre_surr_id);
There are typically two possible ways to execute the join:
NESTED LOOPS, which accesses the inner table WORKS using an index, but repeatedly, in a loop, for each row in the outer table
HASH JOIN, which accesses WORKS using a FULL SCAN, but only once.
It is not possible to say that one option is bad and the other good.
Nested loops is better if there are only a few rows in the outer table (few loops), but it gets slower and slower as the number of records in the outer table (TOKEN) increases, and at some number of rows the HASH JOIN is better.
How to see which execution plan is better? Simply force Oracle, using hints, to run both scenarios and compare the elapsed time.
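Forcing both scenarios could be sketched as follows. The hints are standard Oracle hints; `SET TIMING ON` is a SQL*Plus/SQLcl command, so another client would need its own way of measuring elapsed time. Which index the nested loops variant uses on WORKS is left to the optimizer here.

```sql
SET TIMING ON

-- Scenario 1: nested loops, probing WORKS through an index for each token row.
SELECT /*+ LEADING(ttt) USE_NL(wrk) INDEX(wrk) */ wrk.cre_surr_id
FROM   tle_title_tokens ttt
JOIN   works wrk ON ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
WHERE  wrk.logically_deleted_y IS NULL
AND    ttt.token_type IN ('E', 'N')
AND    ttt.search_token LIKE 'BELIEVE' || '%';

-- Scenario 2: hash join, reading WORKS once with a full table scan.
SELECT /*+ LEADING(ttt) USE_HASH(wrk) FULL(wrk) */ wrk.cre_surr_id
FROM   tle_title_tokens ttt
JOIN   works wrk ON ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
WHERE  wrk.logically_deleted_y IS NULL
AND    ttt.token_type IN ('E', 'N')
AND    ttt.search_token LIKE 'BELIEVE' || '%';
```

Running each variant more than once helps separate the effect of the join method from the effect of a cold buffer cache.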
In your case you should see those two execution plans
HASH JOIN
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 207K| 10M| | 2439 (1)| 00:00:30 |
|* 1 | HASH JOIN | | 207K| 10M| 7488K| 2439 (1)| 00:00:30 |
|* 2 | INDEX RANGE SCAN | TITLE_TOKENS_IDX | 207K| 5058K| | 29 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| WORKS | 893K| 22M| | 431 (2)| 00:00:06 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TTT"."DN_WRK_CRE_SURR_ID"="WRK"."CRE_SURR_ID")
2 - access("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%')
filter("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%' AND ("TTT"."TOKEN_TYPE"='E' OR
"TTT"."TOKEN_TYPE"='N'))
3 - filter("WRK"."LOGICALLY_DELETED_Y" IS NULL)
NESTED LOOPS
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 207K| 10M| 414K (1)| 01:22:56 |
| 1 | NESTED LOOPS | | 207K| 10M| 414K (1)| 01:22:56 |
|* 2 | INDEX RANGE SCAN| TITLE_TOKENS_IDX | 207K| 5058K| 29 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN| WORKS_IDX | 1 | 26 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%')
filter("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%' AND
("TTT"."TOKEN_TYPE"='E' OR "TTT"."TOKEN_TYPE"='N'))
3 - access("TTT"."DN_WRK_CRE_SURR_ID"="WRK"."CRE_SURR_ID" AND
"WRK"."LOGICALLY_DELETED_Y" IS NULL)
My guess is that (with 280K loops) the hash join (i.e. FULL TABLE SCAN) will be better, but it could be that you find that nested loops should be used.
In that case the optimizer doesn't correctly recognise the switching point between nested loops and hash join.
A common cause of this is wrong or missing system statistics or improper optimizer parameters.
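Both possible causes can be inspected, assuming the session has access to `SYS.AUX_STATS$` and `V$PARAMETER` (typically a DBA privilege). The parameter list below is a selection of settings that commonly shift the balance between index access and full scans, not an exhaustive one.

```sql
-- System statistics the optimizer uses to cost single-block vs multi-block I/O.
SELECT pname, pval1
FROM   sys.aux_stats$
WHERE  sname = 'SYSSTATS_MAIN';

-- Optimizer parameters that influence the nested loops / hash join switching point.
SELECT name, value, isdefault
FROM   v$parameter
WHERE  name IN ('optimizer_index_cost_adj',
                'optimizer_index_caching',
                'db_file_multiblock_read_count',
                'optimizer_mode');
```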

Which sql statement is faster when counting nulls?

I need to make a query to determine if 3 columns are not filled out. Should I make a column in the table just as a flag to note that 3 columns are empty? My instinct tells me that I shouldn't make an extra column. I'm just wondering if I would get any performance boost from doing so. This is for oracle server.
select count(*) from my_table t where t.not_available = 1;
or
select count(*) from my_table t where t.col1 is null and t.col2 is null and t.col3 is null;
I think you are doing a premature optimization.
Adding an extra column into a table increases the size of each record. This would typically mean that a table would occupy more space on disk. Large table sizes imply longer full table scans.
Adding indexes might help. But, there is an associated cost with them. If an index would help, you don't need to add another column, because Oracle supports functional indexes. So, you can index on an expression.
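A minimal sketch of such an expression-based index for this question, using the table and column names from the question; the index name is hypothetical. The CASE expression is NULL for every row except those where all three columns are NULL, so only the rows of interest are stored in the index.

```sql
-- Hypothetical function-based index: only rows with all three columns NULL get an entry.
CREATE INDEX my_table_all_null_idx
  ON my_table (CASE WHEN col1 IS NULL AND col2 IS NULL AND col3 IS NULL THEN 1 END);

-- The query must use the same expression for the index to be considered.
SELECT COUNT(*)
FROM   my_table t
WHERE  CASE WHEN t.col1 IS NULL AND t.col2 IS NULL AND t.col3 IS NULL THEN 1 END = 1;
```

This gives the effect of the proposed not_available flag without storing an extra column in the table, at the cost of having to write the query with the CASE expression.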
In most cases, your query is going to do a full table scan or full index scan, unless some of the conditions are rare.
In other words, to have a chance of really answering your question requires understanding:
The record layout
The distribution of values in the three columns
Any additional factors that might affect access, such as partitioned columns
Only when performance leaves you with no other choice should you resort to an extra redundant column. In this case, you should probably avoid it. Just introduce an index on (col1,col2,col3,1) if performance of this statement is too poor.
Here is an example of why putting the 4th constant value 1 in the index is probably a good idea.
First a table with 1000 rows, out of which only 1 row (456) has all three columns NULL:
SQL> create table my_table (id,col1,col2,col3,fill)
2 as
3 select level
4 , nullif(level,456)
5 , nullif(level,456)
6 , nullif(level,456)
7 , rpad('*',100,'*')
8 from dual
9 connect by level <= 1000
10 /
Table created.
A row with three NULLS is not indexed by the index below:
SQL> create index my_table_i1 on my_table(col1,col2,col3)
2 /
Index created.
and will use a full table scan in my test case (likely a full index scan on your primary key index in your case)
SQL> exec dbms_stats.gather_table_stats(user,'my_table')
PL/SQL procedure successfully completed.
SQL> set autotrace on
SQL> select count(*) from my_table t where t.col1 is null and t.col2 is null and t.col3 is null
2 /
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 228900979
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 8 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | TABLE ACCESS FULL| MY_TABLE | 1 | 12 | 8 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T"."COL1" IS NULL AND "T"."COL2" IS NULL AND "T"."COL3"
IS NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
37 consistent gets
0 physical reads
0 redo size
236 bytes sent via SQL*Net to client
247 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
But if I add a constant 1 to the index:
SQL> set autotrace off
SQL> drop index my_table_i1
2 /
Index dropped.
SQL> create index my_table_i2 on my_table(col1,col2,col3,1)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'my_table')
PL/SQL procedure successfully completed.
Then it will use the index and your statement will fly
SQL> set autotrace on
SQL> select count(*) from my_table t where t.col1 is null and t.col2 is null and t.col3 is null
2 /
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 623815834
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | INDEX RANGE SCAN| MY_TABLE_I2 | 1 | 12 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."COL1" IS NULL AND "T"."COL2" IS NULL AND "T"."COL3"
IS NULL)
filter("T"."COL2" IS NULL AND "T"."COL3" IS NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
236 bytes sent via SQL*Net to client
247 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL execution speed depends on numerous factors you didn't list in your question.
Therefore I think you should check the execution plan on your specific server to get first-hand benchmarks of both situations.
See e.g. here for info on how to display the execution plan on Oracle.
If the column(s) in the WHERE clause allow the use of an index, then that would most likely be faster. However, if no columns are indexed, then I would expect the first query to be superior.
But checking the plan is always the best way to know.
I would create an index on not_available and then query that.
CREATE INDEX index_name
ON table_name (not_available)
Something like this might help you?
select count(NVL2(t.col1||t.col2||t.col3, NULL, 1)) FROM my_table t;

Single-row subqueries in Oracle -- what is the join plan?

I've just discovered that Oracle lets you do the following:
SELECT foo.a, (SELECT c
FROM bar
WHERE foo.a = bar.a)
from foo
As long as only one row in bar matches any row in foo.
The explain plan I get from PL/SQL developer is this:
SELECT STATEMENT, GOAL = ALL_ROWS
TABLE ACCESS FULL BAR
TABLE ACCESS FULL FOO
This doesn't actually specify how the tables are joined. A colleague asserted that this is more efficient than doing a regular join. Is that true? What is the join strategy on such a select statement, and why doesn't it show up in the explain plan?
Thanks.
The plan you have there does not provide much information at all.
Use SQL*Plus and use dbms_xplan to get a more detailed plan. Look for a script called utlxpls.sql.
This gives a bit more information:-
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1837 | 23881 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| BAR | 18 | 468 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| FOO | 1837 | 23881 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("BAR"."A"=:B1)
Note
-----
- dynamic sampling used for this statement
18 rows selected.
I didn't create any indexes or foreign keys or collect statistics on the tables, which would change the plan and the join mechanism chosen. Oracle is actually doing a NESTED LOOPS type join here. Step 1, your inline sub-select, is performed for every row returned from FOO.
This way of performing a SELECT is not quicker. It could be the same or slower. In general try and join everything in the main WHERE clause unless it becomes horribly unreadable.
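For comparison, the scalar subquery from the question can be written as an ordinary join. This sketch assumes, as the question does, that at most one row in bar matches each row in foo. An outer join is used because the scalar subquery returns NULL, rather than dropping the row, when bar has no match.

```sql
-- Join form of: SELECT foo.a, (SELECT c FROM bar WHERE foo.a = bar.a) FROM foo
SELECT foo.a, bar.c
FROM   foo
LEFT   JOIN bar ON foo.a = bar.a;
```

The two forms differ when several bar rows match: the scalar subquery raises ORA-01427, while the join returns one output row per match.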
If you create a normal index on bar(a), the CBO should be able to use it, but I'm pretty sure that it won't be able to do hash joins. These kinds of queries only make sense if you're using an aggregate function and you've got multiple single-row subqueries in your top SELECT. Even so, you can always rewrite the query as:
SELECT foo.a, bar1.c, pub1.d
FROM foo
JOIN (SELECT a, MIN(c) as c
FROM bar
GROUP BY a) bar1
ON foo.a = bar1.a
JOIN (SELECT a, MAX(d) as d
FROM pub
GROUP BY a) pub1
ON foo.a = pub1.a
This would enable the CBO to use more options, while at the same time it would enable you to easily retrieve multiple columns from the child tables without having to scan the same tables multiple times.