Script to query a table's column analysis data - sql

e.g.:
select * from SomeView where tablename = 'tablename';
and the system is expected to return a result like this:
+-----------+------------+----------+------+----------+----------+------------+------------+------------------------+
| tablename | columnName | type     | size | minvalue | maxvalue | rows_count | avg_length | last_Analysis_Datetime |
+-----------+------------+----------+------+----------+----------+------------+------------+------------------------+
| xxxx      | xxxx       | nvarchar | 100  | null     | xxx      | 1000       | 3          | 2020-02-26             |
+-----------+------------+----------+------+----------+----------+------------+------------+------------------------+
What I've tried:
I can use EXEC SP_HELPSTATS plus DBCC SHOW_STATISTICS to get this information, but the output is not a single table result set.
EXEC SP_HELPSTATS 'tablename','ALL'
DBCC SHOW_STATISTICS(tablename,'STATISTIC_Name')

The official documentation below might help you.
DBCC SHOW_STATISTICS displays current query optimization statistics for a table or indexed view. The query optimizer uses statistics to estimate the cardinality or number of rows in the query result, which enables the query optimizer to create a high quality query plan. For example, the query optimizer could use cardinality estimates to choose the index seek operator instead of the index scan operator in the query plan, improving query performance by avoiding a resource-intensive index scan.
DBCC SHOW_STATISTICS (Transact-SQL)
B. Returning all statistics properties for a table
Edit: for further needs, a sketch of a single-query approach follows.
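If a single result set is needed, something along these lines may work as a starting point. This is a minimal sketch, assuming SQL Server 2008 R2 SP2 or later, where sys.dm_db_stats_properties is available; min/max values only come from the DBCC SHOW_STATISTICS histogram, so they are left out here:

SELECT
    t.name          AS tablename,
    c.name          AS columnName,
    ty.name         AS [type],
    c.max_length    AS size,
    sp.rows         AS rows_count,
    sp.last_updated AS last_Analysis_Datetime
FROM sys.tables t
JOIN sys.columns c ON c.object_id = t.object_id
JOIN sys.types ty ON ty.user_type_id = c.user_type_id
LEFT JOIN sys.stats_columns sc ON sc.object_id = c.object_id
                              AND sc.column_id = c.column_id
LEFT JOIN sys.stats s ON s.object_id = sc.object_id
                     AND s.stats_id = sc.stats_id
OUTER APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp -- NULL when no statistic exists
WHERE t.name = 'tablename';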

Related

How to optimize query? Explain Plan

I have one table with 3 fields and I need to get all the values of the fields. I have this query:
SELECT COM.FIELD1, COM.FIELD2, COM.FIELD3
FROM OWNER.TABLE_NAME COM
WHERE COM.FIELD1 <> V_FIELD
ORDER BY COM.FIELD3 ASC;
And I want to optimize it. Here is the explain plan:
Plan
SELECT STATEMENT CHOOSE   Cost: 4   Bytes: 90   Cardinality: 6
  2 SORT ORDER BY   Cost: 4   Bytes: 90   Cardinality: 6
    1 TABLE ACCESS FULL OWNER.TABLE_NAME   Cost: 2   Bytes: 90   Cardinality: 6
Is there any way to avoid the TAF (Table Access Full)?
Thanks!
Since your WHERE condition is on the column FIELD1, an index on that column may help.
You may already have an index on that column. Even then, you will still see a full table access if the expected number of rows that don't have VAL1 in that column is sufficiently large.
The only case in which you will NOT see a full table access is when you have an index on that column, the vast majority (say, at least 80% to 90%) of rows in the table do have the value VAL1 in FIELD1, and statistics are up to date. You may also need a histogram, because in that case the distribution of values in FIELD1 is very skewed.
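For example, a histogram on FIELD1 can be requested explicitly; this is standard DBMS_STATS syntax, with the owner, table, and bucket count as placeholders:

BEGIN
  dbms_stats.gather_table_stats(
    ownname    => 'OWNER',
    tabname    => 'TABLE_NAME',
    method_opt => 'FOR COLUMNS FIELD1 SIZE 254'); -- up to 254 histogram buckets
END;
/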
I suppose that your table has a very large number of rows with a given key (let's call it 'B') and a very small number of rows with other keys.
Note that index access will work only for the condition FIELD1 <> 'B'; all other predicates return the 'B' rows as well and are therefore not suitable for index access.
Note also that if you have more than one large key, index access will not work for the same reason: you will never get only the few records where an index pays off.
As a starting point you can reformulate the predicate
FIELD1 <> V_FIELD
as
DECODE(FIELD1,V_FIELD,1,0) = 0
DECODE(FIELD1,V_FIELD,1,0) returns 1 if FIELD1 = V_FIELD and 0 if FIELD1 <> V_FIELD.
This transformation allows you to define a function-based index on the DECODE expression.
Example
create table tt as
select decode(mod(rownum,10000), 1, 'A', 'B') FIELD1
from dual connect by level <= 50000;
select field1, count(*) from tt group by field1;
FIELD1   COUNT(*)
------ ----------
A               5
B           49995
Function-based index
create index tti on tt(decode(field1,'B',1,0));
Use your large key for the index definition.
Access
To select FIELD1 <> 'B', use the reformulated predicate decode(field1,'B',1,0) = 0,
which leads nicely to an index access:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
SELECT * from tt where decode(field1,'B',1,0) = 0;
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |   471 |  2355 |    24   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID | TT   |   471 |  2355 |    24   (0)| 00:00:01 |
|*  2 |  INDEX RANGE SCAN           | TTI  |   188 |       |    49   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(DECODE("FIELD1",'B',1,0)=0)
To select FIELD1 <> 'A', use the reformulated predicate decode(field1,'A',1,0) = 0.
Here you don't want index access, as nearly the whole table is returned, and the CBO chooses a FULL TABLE SCAN.
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
SELECT * from tt where decode(field1,'A',1,0) = 0;
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 47066 | 94132 |    26   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TT   | 47066 | 94132 |    26   (4)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(DECODE("FIELD1",'A',1,0)=0)
Bind Variables
This will work the same way even if you use a bind variable in FIELD1 <> V_FIELD, provided you always pass the same value.
Bind variable peeking will evaluate the correct plan on the first parse and generate the proper plan.
If you use more than one value for the bind variable (and therefore expect different plans for different values), you will want to learn about the adaptive cursor sharing feature.
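As a quick illustration (the V$SQL columns below exist from 11g onwards), you can check whether adaptive cursor sharing has kicked in for a statement; the LIKE filter is just one way to find the cursor:

SELECT sql_id, child_number, is_bind_sensitive, is_bind_aware, executions
FROM v$sql
WHERE sql_text LIKE 'SELECT * from tt where decode(field1%';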
The query is already optimized; don't spend any more time on it unless it's running noticeably slowly. If you have a tuning checklist that says "avoid all full table scans", it might be time to change that checklist.
The cost of the full table scan is only 2. The exact meaning of the cost is tricky, and not always particularly helpful. But in this case it's probably safe to say that 2 means the full table scan will run quickly.
If the query is not running in less than a few microseconds, or is returning significantly more than the estimated 6 rows, then there may be a problem with the optimizer statistics. If that's the case, try gathering statistics like this:
begin
    dbms_stats.gather_table_stats('OWNER', 'TABLE_NAME');
end;
/
As @symcbean pointed out, a full table scan is not always a bad thing. If a table is incredibly small, like this one might be, all the data may fit inside a single block. (Oracle accesses data a block or several blocks at a time; a block is usually 8KB of data.) When the data structures are trivially small, there won't be any significant difference between using a table or an index.
Also, full table scans can use multi-block reads, whereas most index access paths use single-block reads. For reading a large percentage of data it's faster to read the whole thing with multi-block reads than reading it one-block-at-a-time with an index. Since this query only has a <> condition, it looks likely that this query will read a large percentage of data and a full table scan is optimal.
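For reference, the multiblock read size is controlled by an initialization parameter, which you can inspect like this:

SELECT name, value
FROM v$parameter
WHERE name = 'db_file_multiblock_read_count';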

Hint FIRST_ROWS(n) not giving optimized result for Order by clause

We have around 8 million records in a table with around 50 columns. We need to see a few records very quickly, so we are using the FIRST_ROWS(10) hint for this purpose, and it's working amazingly fast.
SELECT /*+ FIRST_ROWS(10) */ ABC.view_ABC.ID, ABC.view_ABC.VERSION, ABC.view_ABC.M_UUID, ABC.view_ABC.M_PROCESS_NAME
FROM ABC.view_ABC
However, when we add an ORDER BY clause, e.g. on creationtime (which is an almost unique value for each row in that table), the query takes ages to return:
SELECT /*+ FIRST_ROWS(10) */ ABC.view_ABC.ID, ABC.view_ABC.VERSION, ABC.view_ABC.M_UUID, ABC.view_ABC.M_PROCESS_NAME
FROM ABC.view_ABC
ORDER BY ABC.view_ABC.CREATIONTIME DESC
One thing that I noticed: if we ORDER BY some column like VERSION, which has the same value for multiple rows, the results come back faster.
This ORDER BY does not work efficiently for any unique column, such as the ID column in this table.
Another thing worth considering: if we reduce the number of columns to be fetched, e.g. 3 columns instead of 50, the results come back faster.
P.S. Statistics are gathered on this table weekly, but data is pushed hourly. Only INSERT statements run on this table; no DELETE or UPDATE queries run on it.
Also, there is a simple view created on this table, and the above queries are run against that view.
Without an order by clause the optimiser can perform whatever join operations your view is hiding and start returning data as soon as it has some. The hint changes how it accesses the underlying tables so that it, for example, does a nested loop join instead of a merge join, which lets it find the first matching rows quickly but might be less efficient overall for returning all of the data. Your hint tells the optimiser that you want it to prioritise the speed of the first batch of rows over the speed of the entire query.
When you add the order by clause then all of the data has to be found before it can be ordered. All of the join conditions have to be met and all of the nested loops/merges etc. completed, and then the entire result set has to be sorted into the order you specified, before any rows can be returned.
If the column you're ordering by is indexed and that index is being used (or can be used) by the optimiser to identify rows in the driving table then it's possible it could be incorporating that into the sort, but you can't rely on that as the optimiser can change the plan as the data and statistics change.
You may find it useful to look at the execution plans of your various queries, with and without the hint, to see what the optimiser is doing in each case, including where in the chain of steps it is doing the sort operation, and the types of joins it is doing.
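For example, the standard pattern for capturing a plan (run it with and without the hint and compare):

EXPLAIN PLAN FOR
SELECT /*+ FIRST_ROWS(10) */ ID, VERSION, M_UUID, M_PROCESS_NAME
FROM ABC.view_ABC
ORDER BY CREATIONTIME DESC;

SELECT * FROM table(DBMS_XPLAN.DISPLAY);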
There is a multi-column index that contains this column (CREATION_TIME), but somehow the optimizer was not using it.
However, the same table had another column (TERMINATION_TIME) with an index of its own, so we ran the same query with that indexed column in the ORDER BY clause.
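A likely reason the multi-column index was not used: an index can generally only satisfy an ORDER BY without a sort when the ordered column is its leading column. A dedicated single-column index, sketched here with a hypothetical name, would presumably give CREATION_TIME the same option:

CREATE INDEX idx_abc_creationtime ON table_abc (creationtime);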
Below is the explain plan for the first query, with CREATION_TIME (part of the multi-column index) in the ORDER BY clause.
---------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |  7406K|   473M|       |   308K  (1)| 01:01:40 |
|   1 |  SORT ORDER BY     |           |  7406K|   473M|   567M|   308K  (1)| 01:01:40 |
|   2 |   TABLE ACCESS FULL| Table_ABC |  7406K|   473M|       |   189K  (1)| 00:37:57 |
---------------------------------------------------------------------------------------
And this one is with TERMINATION_TIME in the ORDER BY clause.
--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |    10 |   670 |    10   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TABLE_ABC          |  7406K|   473M|    10   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN DESCENDING| XGN620150305000000 |    10 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
As you can see, there is a clear difference in the cost, the number of rows involved, the usage of temporary space (not even needed in the latter case), and finally the time.
Now the query response time is much better.
Thanks.

Need to speed up my UPDATE QUERY based on the EXPLAIN PLAN

I am updating my table data using a temporary table, and it takes forever; it still has not completed. So I collected an explain plan for the query. Can someone advise me on how to tune the query or what indexes to build?
The query:
UPDATE w_product_d A
SET A.CREATED_ON_DT = (SELECT min(B.creation_date)
                       FROM mtl_system_items_b_temp B
                       WHERE to_char(B.inventory_item_id) = A.integration_id
                         AND B.organization_id IN ('102'))
WHERE A.CREATED_ON_DT IS NULL;
Explain plan:
Plan hash value: 1520882583

-----------------------------------------------------------------------------------------------
| Id  | Operation            | Name                    | Rows  | Bytes | Cost (%CPU)| Time      |
-----------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT     |                         | 47998 |   984K|    33M  (2)| 110:06:25 |
|   1 |  UPDATE              | W_PRODUCT_D             |       |       |            |           |
|*  2 |   TABLE ACCESS FULL  | W_PRODUCT_D             | 47998 |   984K|  9454   (1)| 00:01:54  |
|   3 |    SORT AGGREGATE    |                         |     1 |    35 |            |           |
|*  4 |     TABLE ACCESS FULL| MTL_SYSTEM_ITEMS_B_TEMP |  1568 | 54880 |   688   (2)| 00:00:09  |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."CREATED_ON_DT" IS NULL)
   4 - filter("B"."ORGANIZATION_ID"=102 AND TO_CHAR("B"."INVENTORY_ITEM_ID")=:B1)

Note
-----
   - dynamic sampling used for this statement (level=2)
For this query:
UPDATE w_product_d A
SET A.CREATED_ON_DT = (SELECT min(B.creation_date)
                       FROM mtl_system_items_b_temp B
                       WHERE to_char(B.inventory_item_id) = A.integration_id
                         AND B.organization_id IN ('102'))
WHERE A.CREATED_ON_DT IS NULL;
You have a problem. Why are you creating a temporary table with the wrong type for inventory_item_id? That is likely to slow down any access. So, let's fix the table first and then do the update:
alter table mtl_system_items_b_temp
    add better_inventory_item_id varchar2(255); -- or whatever the right type is

update mtl_system_items_b_temp
    set better_inventory_item_id = to_char(inventory_item_id);
Next, let's define the appropriate index:
create index idx_mtl_system_items_b_temp_3 on mtl_system_items_b_temp(better_inventory_item_id, organization_id, creation_date);
Finally, an index on w_product_d can also help:
create index idx_w_product_d_1 on w_product_d(CREATED_ON_DT);
Then, write the query as:
UPDATE w_product_d p
SET CREATED_ON_DT = (SELECT min(t.creation_date)
                     FROM mtl_system_items_b_temp t
                     WHERE t.better_inventory_item_id = p.integration_id AND
                           t.organization_id IN ('102')
                    )
WHERE p.CREATED_ON_DT is null;
Try a MERGE statement. It will likely go faster because it can read all the mtl_system_items_b_temp records at once rather than reading them over-and-over again for each row in w_product_d.
Also, your tables look like they're part of an Oracle E-Business Suite environment. In MTL_SYSTEM_ITEMS_B in such an environment, the INVENTORY_ITEM_ID and ORGANIZATION_ID columns are NUMBER. You seem to be using VARCHAR2 in your tables. Whenever you don't use the correct data types in your queries, you invite performance problems, because Oracle must implicitly convert to the correct data type and, in doing so, loses its ability to use indexes on the column. So make sure your queries treat each column correctly according to its datatype. (E.g., if a column is a NUMBER, use COLUMN_X = 123 instead of COLUMN_X = '123'.)
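As a hedged illustration using the question's tables: comparing the VARCHAR2 column integration_id to a number forces Oracle to wrap the column in TO_NUMBER(), which defeats a plain index on it, while a string literal leaves the index usable:

SELECT * FROM w_product_d WHERE integration_id = 123;   -- filter(TO_NUMBER("INTEGRATION_ID")=123): index unusable
SELECT * FROM w_product_d WHERE integration_id = '123'; -- access("INTEGRATION_ID"='123'): index usable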
Here's the MERGE example:
MERGE INTO w_product_d t
USING ( SELECT to_char(inventory_item_id) inventory_item_id_char,
               min(creation_date) min_creation_date
        FROM mtl_system_items_b_temp
        WHERE organization_id IN ('102') -- change this to IN (102) if organization_id is a NUMBER field!
        GROUP BY to_char(inventory_item_id)
      ) u
ON ( t.integration_id = u.inventory_item_id_char )
WHEN MATCHED THEN UPDATE SET t.created_on_dt = u.min_creation_date
  WHERE t.created_on_dt IS NULL -- columns referenced in the ON clause cannot be updated, so filter here

Oracle equivalent of Postgres EXPLAIN ANALYZE

Similar to this question.
I'd like to get a detailed query plan and actual execution in Oracle (10g) similar to EXPLAIN ANALYZE in PostgreSQL. Is there an equivalent?
The easiest way is autotrace in SQL*Plus.
SQL> set autotrace on exp
SQL> select count(*) from users ;

  COUNT(*)
----------
    137553

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=66 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'SYS_C0062362' (INDEX (UNIQUE)) (Cost=66 Card=137553)
Alternatively, Oracle does have an EXPLAIN PLAN statement that you can execute and then query the various plan tables. The easiest way is to use the DBMS_XPLAN package:
SQL> explain plan for select count(*) from users ;
Explained.
SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY);
--------------------------------------------------------------
| Id  | Operation             | Name         | Rows  | Cost  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |     1 |    66 |
|   1 |  SORT AGGREGATE       |              |     1 |       |
|   2 |  INDEX FAST FULL SCAN | SYS_C0062362 |  137K |    66 |
--------------------------------------------------------------
If you're old-school, you can query the plan table yourself:
SQL> explain plan set statement_id = 'my_statement' for select count(*) from users;
Explained.
SQL> column "query plan" format a50
SQL> column object_name format a25
SQL> select lpad(' ',2*(level-1))||operation||' '||options "query plan", object_name
from plan_table
start with id=0 and statement_id = '&statement_id'
connect by prior id=parent_id
and prior statement_id=statement_id
Enter value for statement_id: my_statement
old 3: start with id=0 and statement_id = '&statement_id'
new 3: start with id=0 and statement_id = 'my_statement'
SELECT STATEMENT
  SORT AGGREGATE
    INDEX FAST FULL SCAN                          SYS_C0062362
Oracle used to ship with a utility file utlxpls.sql that had a more complete version of that query. Check under $ORACLE_HOME/rdbms/admin.
For any of these methods, your DBA must have set up the appropriate plan tables already.
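One more option: the closest match to Postgres's EXPLAIN ANALYZE (actual row counts, not just estimates) is DBMS_XPLAN.DISPLAY_CURSOR combined with the GATHER_PLAN_STATISTICS hint, both available in 10g (you need SELECT privileges on the underlying V$ views):

SQL> SELECT /*+ GATHER_PLAN_STATISTICS */ count(*) FROM users;

SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
-- NULL, NULL means "the last statement executed in this session";
-- 'ALLSTATS LAST' shows estimated vs. actual rows for the last execution.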

How to optimize an UPDATE SQL that runs on an Oracle table with 700M rows

UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL
[TABLE] is an Oracle database table with more than 700 million rows. I cancelled the SQL execution after it had been running for 6 hours.
Is there any SQL hint that could improve performance? Or any other solution to speed that up?
EDIT: This query will be run once and then never again.
First of all, is this a one-time query or a recurrent query? If you only have to do it once, you may want to look into running the query in parallel mode. You will have to scan all rows anyway; you could either divide the workload yourself with ranges of ROWID (do-it-yourself parallelism) or use Oracle's built-in features.
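If your version has it (11gR2 and later), DBMS_PARALLEL_EXECUTE automates the do-it-yourself ROWID-range approach; a rough sketch, with table and task names as placeholders:

BEGIN
  dbms_parallel_execute.create_task('upd_field_nulls');
  dbms_parallel_execute.create_chunks_by_rowid(
    task_name   => 'upd_field_nulls',
    table_owner => USER,
    table_name  => 'BIG_TABLE',
    by_row      => TRUE,
    chunk_size  => 100000);
  dbms_parallel_execute.run_task(
    task_name      => 'upd_field_nulls',
    sql_stmt       => 'UPDATE big_table SET field = 0 WHERE field IS NULL AND rowid BETWEEN :start_id AND :end_id',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);
  dbms_parallel_execute.drop_task('upd_field_nulls');
END;
/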
Assuming you want to run it frequently and want to optimize this query: the number of rows where the field column is NULL will eventually be small compared to the total number of rows, and in that case an index could speed things up. However, Oracle doesn't index rows whose indexed columns are all NULL, so a plain index on field won't be used by your query (since you want to find exactly the rows where field is NULL).
Either:
create an index on (FIELD, 0); the 0 acts as a non-NULL pseudo-column, so all rows of the table get indexed.
create a function-based index on (CASE WHEN field IS NULL THEN 1 END); this only indexes the rows that are NULL (the index is therefore very compact). In that case you would have to rewrite your query:
UPDATE [TABLE] SET [FIELD]=0 WHERE (CASE WHEN field IS NULL THEN 1 END)=1
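For example, the matching function-based index would look like this (the table name is a placeholder); since only the NULL rows appear in it, it stays tiny even on a 700M-row table:

CREATE INDEX big_table_nulls_ix ON big_table (CASE WHEN field IS NULL THEN 1 END);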
Edit:
Since this is a one-time scenario, you may want to use the PARALLEL hint:
SQL> EXPLAIN PLAN FOR
2 UPDATE /*+ PARALLEL(test_table 4)*/ test_table
3 SET field=0
4 WHERE field IS NULL;
Explained
SQL> select * from table( dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 4026746538

--------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |            | 22793 |   289K|    12   (9)| 00:00:
|   1 |  UPDATE               | TEST_TABLE |       |       |            |
|   2 |   PX COORDINATOR      |            |       |       |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000   | 22793 |   289K|    12   (9)| 00:00:
|   4 |     PX BLOCK ITERATOR |            | 22793 |   289K|    12   (9)| 00:00:
|*  5 |      TABLE ACCESS FULL| TEST_TABLE | 22793 |   289K|    12   (9)| 00:00:
--------------------------------------------------------------------------------
Are other users updating the same rows in the table at the same time?
If so, you could be hitting lots of concurrency issues (waiting for locks) and it may be worth breaking it into smaller transactions.
DECLARE
  v_cnt NUMBER := 1;
BEGIN
  WHILE v_cnt > 0 LOOP
    UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL AND ROWNUM < 50000;
    v_cnt := SQL%ROWCOUNT;
    COMMIT;
  END LOOP;
END;
/
The smaller the ROWNUM limit the less concurrency/locking issues you'll hit, but the more time you'll spend in table scanning.
Vincent already answered your question perfectly, but I'm curious about the "why" behind this action. Why are you updating all NULL's to 0?
Regards,
Rob.
Some suggestions:
Drop any indexes that contain FIELD before running your UPDATE statement, and then re-add them later.
Write a PL/SQL procedure to do this that commits after every 1000 or 10000 rows.
Hope this helps.
You could avoid the problem for new rows without updating, by using ALTER TABLE to set the column's DEFAULT value to 0.
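That statement would look like the following; note, though, that a changed DEFAULT only applies to rows inserted afterwards without an explicit value, so it complements rather than replaces the one-time UPDATE of the existing NULLs:

ALTER TABLE big_table MODIFY (field DEFAULT 0); -- big_table/field are placeholders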