GATHER_PLAN_STATISTICS does not generate basic plan statistics

All,
I am learning to tune queries. When I run the following:
select /*+ gather_plan_statistics */ * from emp;
select * from table(dbms_xplan.display(FORMAT=>'ALLSTATS LAST'));
The result always says:
Warning: basic plan statistics not available. These are only collected when:
hint 'gather_plan_statistics' is used for the statement or
parameter 'statistics_level' is set to 'ALL', at session or system level
I also tried alter session set statistics_level = ALL; in SQL*Plus, but that did not change the result.
Could anyone please let me know what I might have missed?
Thanks so much.

The DISPLAY function shows the contents of PLAN_TABLE, which is filled by the EXPLAIN PLAN FOR command. You can therefore use it to generate and display an estimated (theoretical) plan, for example:
create table emp as select * from all_objects;
explain plan for
select /*+ gather_plan_statistics */ count(*) from emp where object_id between 100 and 150;
select * from table(dbms_xplan.display );
Plan hash value: 2083865914
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 351 (1)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| EMP | 12 | 60 | 351 (1)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_ID"<=150 AND "OBJECT_ID">=100)
The /*+ gather_plan_statistics */ hint does not save data into PLAN_TABLE. Instead, the execution statistics are collected with the cursor when the query actually runs, and they are exposed through the V$SQL_PLAN_STATISTICS_ALL performance view.
To display these data you can use the method described here: http://www.dba-oracle.com/t_gather_plan_statistics.htm, but this does not always work, because the second command must be executed immediately after the SQL query.
A better method is to query the V$SQL view to obtain the SQL_ID of the query, and then use the DISPLAY_CURSOR function, for example:
select /*+ gather_plan_statistics */ count(*) from emp where object_id between 100 and 150;
select sql_id, plan_hash_value, child_number, executions, fetches, cpu_time, elapsed_time, physical_read_requests, physical_read_bytes
from v$sql s
where sql_fulltext like 'select /*+ gather_plan_statistics */ count(*)%from emp%'
and sql_fulltext not like '%from v$sql%' ;
SQL_ID PLAN_HASH_VALUE CHILD_NUMBER EXECUTIONS FETCHES CPU_TIME ELAPSED_TIME PHYSICAL_READ_REQUESTS PHYSICAL_READ_BYTES
------------- --------------- ------------ ---------- ---------- ---------- ------------ ---------------------- -------------------
9jjm288hx7buz 2083865914 0 1 1 15625 46984 26 10305536
The above query returns SQL_ID=9jjm288hx7buz and CHILD_NUMBER=0 (the child number is just the number of the child cursor). Use these values to query the collected plan:
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('9jjm288hx7buz', 0, 'ALLSTATS'));
SQL_ID 9jjm288hx7buz, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from emp where object_id
between 100 and 150
Plan hash value: 2083865914
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | | 2 |00:00:00.05 | 10080 |
| 1 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:00.05 | 10080 |
|* 2 | TABLE ACCESS FULL| EMP | 2 | 47 | 24 |00:00:00.05 | 10080 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("OBJECT_ID"<=150 AND "OBJECT_ID">=100))

If all you ran were the two statements in your question:
select /*+ gather_plan_statistics */ * from emp;
select * from table(dbms_xplan.display(FORMAT=>'ALLSTATS LAST'));
Then I think your problem is your use of DBMS_XPLAN.DISPLAY. The way you are using it, you are printing the plan of the last statement you explained, not the last statement you executed. And "explain" will not execute the query, so it will not benefit from a gather_plan_statistics hint.
This works for me in 12c:
select /*+ gather_plan_statistics */ count(*) from dba_objects;
SELECT *
FROM TABLE (DBMS_XPLAN.display_cursor (null, null, 'ALLSTATS LAST'));
i.e., display_cursor instead of just display.
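One thing that can still trip this up in SQL*Plus (an assumption about the client, based on the question mentioning sqlplus): with SERVEROUTPUT switched on, the last statement executed in the session is SQL*Plus's own call to fetch DBMS_OUTPUT, so DISPLAY_CURSOR with a null SQL_ID may report on that call instead of your query. A minimal sketch, assuming the emp table from the question:

```sql
-- SQL*Plus: keep the hidden DBMS_OUTPUT call from becoming the "last" statement
set serveroutput off

-- Execute the statement; all rows must be fetched for complete statistics
select /*+ gather_plan_statistics */ * from emp;

-- Report on the last cursor executed in this session, including actual row counts
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

The two statements have to run back to back in the same session; anything executed in between becomes the "last" cursor.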

What I learned from the answers so far:
When a query is parsed, the optimizer estimates how many rows are produced during each step of the query plan. Sometimes it is necessary to check how good the prediction was. If the estimates are off by more than an order of magnitude, this might lead to the wrong plan being used.
To compare estimated and actual numbers, the following steps are necessary:
1. You need read access to V$SQL_PLAN, V$SESSION and V$SQL_PLAN_STATISTICS_ALL. These privileges are included in the SELECT_CATALOG_ROLE role.
2. Switch on statistics gathering, either by
ALTER SESSION SET STATISTICS_LEVEL = ALL;
or by using the hint /*+ gather_plan_statistics */ in the query.
There seems to be a certain performance overhead.
See for instance Jonathan's blog.
3. Run the query. You'll need to find it later, so it's best to include an arbitrary marker in the hint comment:
SELECT /*+ gather_plan_statistics HelloAgain */ * FROM scott.emp;
EXPLAIN PLAN FOR SELECT ... is not sufficient, as it only creates the estimates without running the actual query.
Furthermore, as @Matthew suggested (thanks!), it is important to actually fetch all rows. Most GUIs show only the first 50 rows or so. In SQL Developer, you can use the shortcut Ctrl+End in the query result window.
4. Find the query in the cursor cache and note its SQL_ID:
SELECT sql_id, child_number, sql_text
FROM V$SQL
WHERE sql_text LIKE '%HelloAgain%'
AND sql_text NOT LIKE '%V$SQL%';
dbqbqxp9srftn 0 SELECT /*+ gather_plan...
5. Format the result:
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('dbqbqxp9srftn',0,'ALLSTATS LAST'));
Steps 4 and 5 can be combined:
SELECT x.*
FROM v$sql s,
TABLE(DBMS_XPLAN.DISPLAY_CURSOR(s.sql_id, s.child_number, 'ALLSTATS LAST')) x
WHERE s.sql_text LIKE '%HelloAgain%'
AND s.sql_text NOT LIKE '%v$sql%';
The result shows the estimated rows (E-Rows) and the actual rows (A-Rows):
SQL_ID dbqbqxp9srftn, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics HelloAgain */ * FROM scott.emp
Plan hash value: 3956160932
------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 6 |
| 1 | TABLE ACCESS FULL| EMP | 1 | 14 | 14 |00:00:00.01 | 6 |
------------------------------------------------------------------------------------
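If you prefer raw numbers over the formatted report, the same figures can be read directly from V$SQL_PLAN_STATISTICS_ALL. The following is only a sketch: it reuses the SQL_ID and child number from the example above, and the column aliases are my own. Keep in mind that the estimate is per start of an operation, while the actual row count is the total over all starts.

```sql
-- Estimated vs. actual rows per plan step for the last execution of the cursor
SELECT id,
       operation,
       options,
       object_name,
       cardinality      AS e_rows,   -- optimizer estimate (per start)
       last_output_rows AS a_rows,   -- rows actually produced (all starts)
       last_starts      AS starts
FROM   v$sql_plan_statistics_all
WHERE  sql_id = 'dbqbqxp9srftn'
AND    child_number = 0
ORDER  BY id;
```

The LAST_* columns are only populated when statistics gathering was active for that execution, which is the same precondition as for ALLSTATS LAST.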

ALLSTATS LAST starts working after you have run the statement twice.

Related

Why does explain plan show the wrong number of rows?

I am trying to simulate this and to do that I have created the following procedure to insert a large number of rows:
create or replace PROCEDURE a_lot_of_rows is
i carte.cod%TYPE;
a carte.autor%TYPE := 'Author #';
t carte.titlu%TYPE := 'Book #';
p carte.pret%TYPE := 3.23;
e carte.nume_editura%TYPE := 'Penguin Random House';
begin
for i in 8..1000 loop
insert into carte
values (i, e, a || i, t || i, p, 'hardcover');
commit;
end loop;
for i in 1001..1200 loop
insert into carte
values (i, e, a || i, t || i, p, 'paperback');
commit;
end loop;
end;
I have created a bitmap index on the tip_coperta column (which can only have the values 'hardcover' and 'paperback') and then inserted 1200 more rows. However, the result given by the explain plan is the following (before the insert procedure, the table had 7 rows, of which 4 had the tip_coperta = 'paperback'):
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 284 | 34 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 4 | 284 | 34 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')

Motto: Bad Statistics are Worse than No Statistics
TL;DR: your statistics are stale and need to be regathered. When you create an index, the index statistics are gathered automatically, but not the table statistics, which are the ones relevant to your case.
Let's simulate your example with the following script, which creates the table and fills it with 1000 hardcovers and 200 paperbacks.
create table CARTE
(cod int,
autor VARCHAR2(100),
titlu VARCHAR2(100),
pret NUMBER,
nume_editura VARCHAR2(100),
tip_coperta VARCHAR2(100)
);
insert into CARTE
(cod,autor,titlu,pret,nume_editura,tip_coperta)
select rownum,
'Author #'||rownum ,
'Book #'||rownum,
3.23,
'Penguin Random Number',
case when rownum <=1000 then 'hardcover'
else 'paperback' end
from dual connect by level <= 1200;
commit;
This leaves the new table without optimizer object statistics, which you can verify with the following query, which returns only NULLs:
select NUM_ROWS, LAST_ANALYZED from user_tables where table_name = 'CARTE';
So, let's check what Oracle's impression of the table is:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
The script above produces the execution plan for your query asking for paperbacks, and you can see that the Rows estimate is fine (= 200). How is this possible?
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 46800 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 46800 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
The explanation is in the Note section of the plan output: dynamic sampling was used.
Basically, while parsing the query, Oracle executes an additional query to estimate the number of rows matching the filter predicate.
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Dynamic sampling is fine for tables that are seldom used, but if the table is queried regularly, we need optimizer statistics to save the overhead of dynamic sampling.
So let's gather statistics:
exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'CARTE' );
Now you can see that the statistics are gathered, the total number of rows is correct, and a frequency histogram has been created in the column statistics. This is important for estimating the number of records with a specific value!
select NUM_ROWS, LAST_ANALYZED from user_tables where table_name = 'CARTE';
NUM_ROWS LAST_ANALYZED
---------- -------------------
1200 09.01.2021 16:48:26
select NUM_DISTINCT,HISTOGRAM from user_tab_columns where table_name = 'CARTE' and column_name = 'TIP_COPERTA';
NUM_DISTINCT HISTOGRAM
------------ ---------------
2 FREQUENCY
Let's verify how the statistics work now in the execution plan:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
Basically we see the same correct result
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 12400 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 12400 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TIP_COPERTA"='paperback')
Now we delete all but four of the 'paperback' rows from the table:
delete from CARTE
where tip_coperta = 'paperback' and cod > 1004;
commit;
select count(*) from CARTE
where tip_coperta = 'paperback';
COUNT(*)
----------
4
With this action the statistics became stale, and the estimate is now wrong because it is based on obsolete data. The estimate will stay wrong until the statistics are regathered.
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from CARTE
where tip_coperta = 'paperback'
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 12400 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CARTE | 200 | 12400 | 5 (0)| 00:00:01 |
---------------------------------------------------------------------------
Set up a policy that keeps your statistics up to date!
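One way to spot this situation is to ask the data dictionary whether Oracle itself considers the statistics stale. This is only a sketch for the CARTE table used above. It assumes table monitoring is active (the default in current releases) and that your user is allowed to flush the monitoring information. By default a table is flagged once roughly 10 percent of its rows have changed, so deleting 196 of 1200 rows should be enough.

```sql
-- Write pending DML monitoring data to the dictionary so the flag is current
exec dbms_stats.flush_database_monitoring_info

-- STALE_STATS shows YES once the tracked changes exceed the staleness threshold
select table_name, num_rows, last_analyzed, stale_stats
from   user_tab_statistics
where  table_name = 'CARTE';

-- Regather the table statistics, then repeat the EXPLAIN PLAN from above
exec dbms_stats.gather_table_stats(ownname => user, tabname => 'CARTE')
```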

It is the table statistics that are important for this cardinality.
You will need to wait for the automatic stats gathering task to run and obtain the new statistics, or you can do it yourself:
exec dbms_stats.gather_table_stats(null,'carte',method_opt=>'for all columns size 1 for columns size 254 TIP_COPERTA')
This forces a histogram on the TIP_COPERTA column and not on the others (you may wish to use for all columns size skewonly or for all columns size auto, or even just let it default to whatever the preferred method_opt parameter is set to). Have a read of this article for details about this parameter.
In some of the later versions of Oracle, depending on where you are running it, you may also have Real-Time Statistics. This is where Oracle will be keeping your statistics up to date even after conventional DML.
It's important to remember that cardinality estimates do not need to be completely accurate for you to obtain reasonable execution plans. A common rule of thumb is that it should be within an order of magnitude, and even then you will probably be fine most of the time.
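To see which method_opt would apply by default and what the gathered histogram actually contains, something along these lines can be used. It is a sketch for the CARTE table from the question. Depending on the Oracle version, ENDPOINT_ACTUAL_VALUE may be empty for some histograms, which is why ENDPOINT_VALUE is selected as well.

```sql
-- The METHOD_OPT preference that applies to this table when none is passed
select dbms_stats.get_prefs('METHOD_OPT', user, 'CARTE') as method_opt
from   dual;

-- One row per histogram bucket; in a frequency histogram ENDPOINT_NUMBER
-- is the cumulative row count up to and including that value
select endpoint_number, endpoint_value, endpoint_actual_value
from   user_tab_histograms
where  table_name  = 'CARTE'
and    column_name = 'TIP_COPERTA'
order  by endpoint_number;
```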

To obtain an estimate for the number of rows, Oracle needs you to analyze the table (or index). When you create an index, there is an automatic analyze.

ORACLE SQL rewritten query

In an Oracle database, how can I see the SQL that is really executed?
Let's say I have a query that looks like this:
WITH numsA
AS (SELECT 1 num FROM DUAL
UNION
SELECT 2 FROM DUAL
UNION
SELECT 3 FROM DUAL)
SELECT *
FROM numsA
FULL OUTER JOIN (SELECT 3 num FROM DUAL
UNION
SELECT 4 FROM DUAL
UNION
SELECT 5 FROM DUAL) numsB
ON numsA.num = numsB.num
I suppose that the SQL engine will rewrite this SQL into something different before executing it.
Can someone tell me how I can see that rewritten query (with tkprof, maybe)?

As @Gordon already commented, Oracle does not execute the query text as such. Oracle creates an execution plan, and further processing is done using the best plan chosen by the optimizer.
If you are keen to see how the query is executed, you have to look at the execution plan.
Many tools can show the execution plan directly. If you want to produce it yourself, you can use the following technique (taking the simplest example, the query SELECT 1 FROM DUAL):
SQL> explain plan for
2 select 1 from dual;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1388734953
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------
8 rows selected.
SQL>
To understand the explain plan, you need to go through the details of how to read it.
I advise you to refer to the Oracle documentation on Reading Execution Plans.
Cheers!!
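If you want to see the transformed statement text rather than the plan, one commonly used approach is the optimizer trace (event 10053). Treat the following as a sketch under assumptions: you need the privilege to set events in your session and access to the server's trace directory, and the identifier rewrite_demo is just a label I made up. The trace is only written when the statement is hard parsed, which is why a fresh comment is added to make the text unique.

```sql
-- Tag the trace file name so it is easy to find on the server
alter session set tracefile_identifier = 'rewrite_demo';

-- Switch the optimizer trace on for this session
alter session set events '10053 trace name context forever, level 1';

-- The statement to inspect; the comment forces a new hard parse
select /* rewrite_demo_1 */ 1 from dual;

-- Switch the trace off again
alter session set events '10053 trace name context off';
```

The trace file typically contains a section showing the query after transformations, next to the cost calculations that led to the chosen plan.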

where column is null taking longer time to execute

I am executing a select statement like the one below, which takes more than 6 minutes to execute.
select * from table where col1 is null;
whereas:
select * from table;
returns results in a few seconds. The table contains 25 million records. No indexes are used. There is a composite PK, but not on the column used. The same query, when executed on a different table with 50 million records, returns results in a few seconds. Only this table poses a problem.
I rebuilt the table to check whether something was wrong with it, but I am still facing the same issue.
Can someone help me understand why it is taking so long?
datatype: VARCHAR2(40)
PLAN:
Plan hash value: 2838772322
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 794 | 60973 (16)| 00:00:03 |
|* 1 | TABLE ACCESS STORAGE FULL| table | 1 | 794 | 60973 (16)| 00:00:03 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - storage("column" IS NULL)
filter("column" IS NULL)

select * from table;
Oracle SQL Developer by default displays only the first 50 records unless that setting has been changed manually. So the entire 25 million records are not fetched, as not all of them are needed for display.
select * from table where col1 is null;
But when you filter for null values, the entire set of 25 million rows has to be scanned to apply the filter and get your 81 records satisfying that predicate. Hence your second query takes longer.
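To compare the two statements fairly, both have to fetch their complete result. In SQL*Plus this can be sketched as follows. Note that my_table is a placeholder for the real table name, and that autotrace requires the PLUSTRACE role or equivalent privileges.

```sql
set timing on
-- Fetch all rows but suppress their display; show only the run-time statistics
set autotrace traceonly statistics

select * from my_table;                    -- now really reads all rows
select * from my_table where col1 is null; -- scans all rows, returns only a few

set autotrace off
```

With both queries forced to read the whole table, the unfiltered one should no longer look faster, because it additionally has to transfer every row to the client.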

performance difference between to_char and to_date [duplicate]

This question already has answers here:
How to optimize an Oracle query that has to_char in where clause for date
(6 answers)
Closed 9 years ago.
I have a simple SQL query on Oracle 10g. I want to know the difference between these queries:
select * from employee where id = 123 and
to_char(start_date, 'yyyyMMdd') >= '2013101' and
to_char(end_date, 'yyyyMMdd') <= '20121231';
select * from employee where id = 123 and
start_date >= to_date('2013101', 'yyyyMMdd') and
end_date <= to_date('20121231', 'yyyyMMdd');
Questions:
1. Are these queries the same? start_date, end_date are indexed date columns.
2. Does one work better over the other?
Please let me know. thanks.

The latter is almost certain to be faster.
It avoids data type conversions on a column value.
Oracle will estimate better the number of possible values between two dates, rather than two strings that are representations of dates.
Note that neither will return any rows as the lower limit is probably intended to be higher than the upper limit according to the numbers you've given. Also you've missed a numeral in 2013101.
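For illustration, the second form with complete date strings could look like this. The actual bounds are my assumption (a start on or after 1 January 2013 and an end on or before 31 December 2013), since the intended range is not clear from the question.

```sql
-- Compare the DATE columns directly; only the constants are converted
select *
from   employee
where  id = 123
and    start_date >= to_date('20130101', 'yyyymmdd')  -- assumed lower bound
and    end_date   <= to_date('20131231', 'yyyymmdd'); -- assumed upper bound
```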

One of the biggest drawbacks of converting, casting, or wrapping a column in an expression (e.g. "NVL", "COALESCE") in the WHERE clause is that the CBO will not be able to use an ordinary index on that column. I slightly modified your example to show the difference:
SQL> create table t_test as
2 select * from all_objects;
Table created
SQL> create index T_TEST_INDX1 on T_TEST(CREATED, LAST_DDL_TIME);
Index created
We created a table and an index for our experiment.
SQL> execute dbms_stats.set_table_stats(ownname => 'SCOTT',
tabname => 'T_TEST',
numrows => 100000,
numblks => 10000);
PL/SQL procedure successfully completed
We are making the CBO think that our table is a fairly big one.
SQL> explain plan for
2 select *
3 from t_test tt
4 where tt.owner = 'SCOTT'
5 and to_char(tt.last_ddl_time, 'yyyyMMdd') >= '20130101'
6 and to_char(tt.created, 'yyyyMMdd') <= '20121231';
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2796558804
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 300 | 2713 (1)| 00:00:33 |
|* 1 | TABLE ACCESS FULL| T_TEST | 3 | 300 | 2713 (1)| 00:00:33 |
----------------------------------------------------------------------------
A full table scan is used, which would be costly on a big table.
SQL> explain plan for
2 select *
3 from t_test tt
4 where tt.owner = 'SCOTT'
5 and tt.last_ddl_time >= to_date('20130101', 'yyyyMMdd')
6 and tt.created <= to_date('20121231', 'yyyyMMdd');
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1868991173
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 300 | 4 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| T_TEST | 3 | 300 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T_TEST_INDX1 | 8 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
See, now it's an index range scan, and the cost is significantly lower.
SQL> drop table t_test;
Table dropped
Finally, cleaning up.

For output (display) purposes, use to_char.
For date handling (insert, update, compare, etc.), use to_date.
I don't have a performance reference to share, but the query using to_date should run faster!
With to_char, every column value first has to be converted to a string before the comparison can be made, so there will be a small performance loss.
With to_date, the column does not need to be converted first; its date type is used directly.

Oracle equivalent of Postgres EXPLAIN ANALYZE

Similar to this question.
I'd like to get a detailed query plan and actual execution in Oracle (10g) similar to EXPLAIN ANALYZE in PostgreSQL. Is there an equivalent?

The easiest way is autotrace in SQL*Plus.
SQL> set autotrace on exp
SQL> select count(*) from users ;
COUNT(*)
----------
137553
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=66 Card=1)
1 0 SORT (AGGREGATE)
2 1 INDEX (FAST FULL SCAN) OF 'SYS_C0062362' (INDEX (UNIQUE)
) (Cost=66 Card=137553)
Alternatively, Oracle does have an explain plan statement that you can execute, and then query the various plan tables. The easiest way is using the DBMS_XPLAN package:
SQL> explain plan for select count(*) from users ;
Explained.
SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY);
--------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 66 |
| 1 | SORT AGGREGATE | | 1 | |
| 2 | INDEX FAST FULL SCAN| SYS_C0062362 | 137K| 66 |
--------------------------------------------------------------
If you're old-school, you can query the plan table yourself:
SQL> explain plan set statement_id = 'my_statement' for select count(*) from users;
Explained.
SQL> column "query plan" format a50
SQL> column object_name format a25
SQL> select lpad(' ',2*(level-1))||operation||' '||options "query plan", object_name
from plan_table
start with id=0 and statement_id = '&statement_id'
connect by prior id=parent_id
and prior statement_id=statement_id
Enter value for statement_id: my_statement
old 3: start with id=0 and statement_id = '&statement_id'
new 3: start with id=0 and statement_id = 'my_statement'
SELECT STATEMENT
SORT AGGREGATE
INDEX FAST FULL SCAN SYS_C0062362
Oracle used to ship with a utility file utlxpls.sql that had a more complete version of that query. Check under $ORACLE_HOME/rdbms/admin.
For any of these methods, your DBA must have set up the appropriate plan tables already.
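Whether a plan table is already available can be checked from the data dictionary. The sketch below assumes the standard names: recent releases normally ship a shared PLAN_TABLE$ with a public synonym PLAN_TABLE, and the script utlxplan.sql in the same admin directory creates a private PLAN_TABLE in the current schema.

```sql
-- Is a plan table visible to the current user?
select owner, object_name, object_type
from   all_objects
where  object_name in ('PLAN_TABLE', 'PLAN_TABLE$');

-- If not, create one in the current schema
-- (SQL*Plus only; the ? is expanded to ORACLE_HOME)
@?/rdbms/admin/utlxplan.sql
```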
