Are views that query from other views also refreshed automatically - SQL

Let's say I have a view that queries from another view:
create view another_view as (select * from my_table);
create view one_view as (select * from another_view);
select * from one_view;
When I issue the last statement (select * from one_view;), does that also refresh and query another_view?

Views are not persisted anywhere, so there is nothing to "refresh". When you query a view, the SQL engine rewrites your query using the view's query and selects directly from the underlying table(s), applying all the joins, filters, etc. from the view.
Given the setup:
CREATE TABLE my_table (value) AS
SELECT 1 FROM DUAL;
create view another_view as (select * from my_table);
create view one_view as (select * from another_view);
Then you can look at the explain plan for selecting from the view:
EXPLAIN PLAN FOR
select * from one_view;
Then:
SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
Which outputs:
PLAN_TABLE_OUTPUT
Plan hash value: 3804444429
 
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| MY_TABLE | 1 | 3 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------
Oracle does not select from any view; it rewrites the query to select directly from the underlying table. Therefore there is no concept of the view having to "refresh"; the result is always whatever is currently in the table.
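To illustrate with the setup above (a minimal sketch): any change to the base table is visible through both views immediately, with no refresh step, because the query is rewritten against MY_TABLE at execution time:
INSERT INTO my_table (value) VALUES (2);
COMMIT;
SELECT * FROM one_view;
-- returns both rows (1 and 2) even though neither view was touched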

Related

How to make a schema dependent on another schema

I've got a question and can't find the answer; can someone help me? Here is the situation:
I have a schema that is a template.
And I want to have 10 schemas based on this template.
But I want that every time I change the structure of the template schema, for example by adding a new column, the column is created in all the schemas related to the template schema.
Is this possible with Oracle?
As the others said, it is not possible in Oracle to do this by default. BUT, if you're on a recent version (12.2 and higher) and don't mind paying for the multitenant option, you can look into something called application containers. This trades your schemas in a single DB for the same schema in different PDBs. Application containers allow you to define the schema in a parent PDB (including tables, views, triggers, ...) and have every modification propagated to the PDBs (you sync each PDB when you want).
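A rough sketch of how that looks (all names here are illustrative; the install steps are run in the application root and the SYNC in each application PDB):
CREATE PLUGGABLE DATABASE template_root AS APPLICATION CONTAINER
  ADMIN USER app_admin IDENTIFIED BY some_password;

-- in the application root: define (or later upgrade) the shared objects
ALTER PLUGGABLE DATABASE APPLICATION template_app BEGIN INSTALL '1.0';
CREATE USER template_owner IDENTIFIED BY some_password;
CREATE TABLE template_owner.some_table (id NUMBER);
ALTER PLUGGABLE DATABASE APPLICATION template_app END INSTALL '1.0';

-- in each application PDB: pull in the changes when you decide to
ALTER PLUGGABLE DATABASE APPLICATION template_app SYNC;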
But I want that every time I change the structure of the template schema, for example by adding a new column, the column is created in all the schemas related to the template schema.
Is this possible with Oracle?
No, it is not. You would need to separately create the column in the table owned by each individual user (a.k.a. schema).
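If you do end up doing it manually, you could at least script it; a minimal sketch (the schema-name pattern, table, and column here are purely illustrative):
BEGIN
  FOR s IN (SELECT username
            FROM   all_users
            WHERE  username LIKE 'TEMPLATE_COPY_%') LOOP
    -- repeat the DDL applied to the template in each related schema
    EXECUTE IMMEDIATE 'ALTER TABLE "' || s.username || '".MY_TABLE ADD (NEW_COL NUMBER)';
  END LOOP;
END;
/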
As Justin Cave suggested, your problem is practically screaming for Oracle Partitioning.
If you do not have Partitioning licensed, there is still the old (but free!) approach of making a partitioned view.
In this approach, you would keep your historical tables unchanged (i.e., don't go back and add new columns to them). Instead, you would create a partitioned view that includes each historical table (concatenated together via UNION ALL). The view definition can provide values for newer columns that did not exist in the original table that year.
A partitioned view also has the benefits of
Making it easy to report across multiple years
"Partition pruning" -- skipping tables that are not of interest in a given query
Here is a walk through of the approach:
Create tables for 2019 and 2020 data
CREATE TABLE matt_data_2019
( id NUMBER NOT NULL,
creation_date DATE NOT NULL,
data_column1 NUMBER,
data_column2 VARCHAR2(200),
CONSTRAINT matt_data_2019 PRIMARY KEY ( id ),
CONSTRAINT matt_data_2019_c1 CHECK ( creation_date BETWEEN to_date('01-JAN-2019','DD-MON-YYYY') AND to_date('01-JAN-2020','DD-MON-YYYY') - interval '1' second )
);
CREATE TABLE matt_data_2020
( id NUMBER NOT NULL,
creation_date DATE NOT NULL,
data_column1 NUMBER,
data_column2 VARCHAR2(200),
data_column3 DATE, -- This is new for 2020
CONSTRAINT matt_data_2020 PRIMARY KEY ( id ),
CONSTRAINT matt_data_2020_c1 CHECK ( creation_date BETWEEN to_date('01-JAN-2020','DD-MON-YYYY') AND to_date('01-JAN-2021','DD-MON-YYYY') - interval '1' second )
);
Notice there is a new column for 2020 that does not exist in 2019.
Put some test data in to ensure accurate test results...
INSERT INTO matt_data_2019 ( id, creation_date, data_column1, data_column2 )
SELECT rownum id,
to_date('01-JAN-2019','DD-MON-YYYY') + (dbms_random.value(0, 365*24*60*60-1) / (365*24*60*60)), -- Some random date in 2019
dbms_random.value(0,1000),
lpad('2019',200,'X')
FROM dual
CONNECT BY rownum <= 100000;
INSERT INTO matt_data_2020 ( id, creation_date, data_column1, data_column2, data_column3 )
SELECT rownum id,
to_date('01-JAN-2020','DD-MON-YYYY') + (dbms_random.value(0, 365*24*60*60-1) / (365*24*60*60)), -- Some random date in 2020
dbms_random.value(0,1000),
lpad('2020',200,'X'),
to_date('01-JAN-2021','DD-MON-YYYY') + (dbms_random.value(0, 365*24*60*60-1) / (365*24*60*60)) -- Some random date in 2021
FROM dual
CONNECT BY rownum <= 100000;
Gather statistics on both tables for accurate test results ...
EXEC DBMS_STATS.GATHER_TABLE_STATS(user,'MATT_DATA_2019');
EXEC DBMS_STATS.GATHER_TABLE_STATS(user,'MATT_DATA_2020');
Create a view that includes all the tables.
You would need to modify this view every time a new table was created.
CREATE OR REPLACE VIEW matt_data_v AS
SELECT 2019 source_year,
id,
creation_date,
data_column1,
data_column2,
NULL data_column3 -- data_column3 did not exist in 2019
FROM matt_data_2019
UNION ALL
SELECT 2020 source_year,
id,
creation_date,
data_column1,
data_column2,
data_column3 -- new for 2020
FROM matt_data_2020;
Check how Oracle will process a query specifying a single year
EXPLAIN PLAN SET STATEMENT_ID = 'MM' FOR SELECT * FROM MATT_DATA_V WHERE SOURCE_YEAR = 2020;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE','MM'));
Plan hash value: 393585474
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 110K| 15M| 620 (2)| 00:00:01 |
| 1 | VIEW | MATT_DATA_V | 110K| 15M| 620 (2)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | FILTER | | | | | |
| 4 | TABLE ACCESS FULL| MATT_DATA_2019 | 71238 | 9530K| 596 (2)| 00:00:01 |
| 5 | TABLE ACCESS FULL | MATT_DATA_2020 | 110K| 15M| 620 (2)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(NULL IS NOT NULL)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Hmmm, it looks like Oracle is still including the 2019 table...
... but it isn't. That NULL IS NOT NULL filter condition will cause Oracle to skip the 2019 table completely.
Prove that Oracle is skipping the 2019 table when we ask for 2020 data ...
alter session set statistics_level = ALL;
SELECT * FROM MATT_DATA_V WHERE SOURCE_YEAR = 2020;
-- Be sure to fetch entire result set (e.g., scroll to the end in SQL*Developer)
SELECT *
FROM TABLE (DBMS_XPLAN.display_cursor (null, null,
'ALLSTATS LAST'));
SQL_ID 1u3nwcnxs20jb, child number 0
-------------------------------------
SELECT * FROM MATT_DATA_V WHERE SOURCE_YEAR = 2020
Plan hash value: 393585474
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 100K|00:00:00.21 | 5417 |
| 1 | VIEW | MATT_DATA_V | 1 | 110K| 100K|00:00:00.21 | 5417 |
| 2 | UNION-ALL | | 1 | | 100K|00:00:00.17 | 5417 |
|* 3 | FILTER | | 1 | | 0 |00:00:00.01 | 0 |
| 4 | TABLE ACCESS FULL| MATT_DATA_2019 | 0 | 71238 | 0 |00:00:00.01 | 0 |
| 5 | TABLE ACCESS FULL | MATT_DATA_2020 | 1 | 110K| 100K|00:00:00.09 | 5417 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(NULL IS NOT NULL)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
The results above show how Oracle skips the 2019 table when we don't ask for it.

Oracle index for a static like clause

I want to index this query clause -- note that the text is static.
SELECT * FROM tbl where flags LIKE '%current_step: complete%'
To re-iterate, the current_step: complete never changes in the query. I want to build an index that will effectively pre-calculate this boolean value, thereby preventing full table scans...
I would prefer not to add a boolean column to store the pre-calculated value as this would necessitate code changes in the application....
If you don't want to change the query, and it isn't just an issue of not changing the data maintenance (in which case a virtual column and/or index would do the job), you could use a materialised view that applies the filter, and let query rewrite take care of using that instead of the real table. This may well be overkill, but it is an option.
The original plan for a mocked-up version:
explain plan for
SELECT * FROM tbl where flags LIKE '%current_step: complete%';
select * from table(dbms_xplan.display);
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TBL | 2 | 60 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FLAGS" IS NOT NULL AND "FLAGS" LIKE '%current_step:
complete%')
A materialised view that will only hold the records your query is interested in (this is a simple example but you'd need to decide how to refresh and add a log if needed):
create materialized view mvw
enable query rewrite as
SELECT * FROM tbl where flags LIKE '%current_step: complete%';
Now your query hits the materialised view, thanks to query rewrite:
explain plan for
SELECT * FROM tbl where flags LIKE '%current_step: complete%';
select * from table(dbms_xplan.display);
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 3 (0)| 00:00:01 |
| 1 | MAT_VIEW REWRITE ACCESS FULL| MVW | 2 | 60 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
But any other query will still use the original table:
explain plan for
SELECT * FROM tbl where flags LIKE '%current_step: working%';
select * from table(dbms_xplan.display);
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 27 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TBL | 1 | 27 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FLAGS" LIKE '%current_step: success%' AND "FLAGS" IS NOT
NULL)
Of course a virtual column (with an index on it) would be simpler if you are allowed to modify the query...
A full text search index might be what you are looking for.
There are a few ways you can implement this:
Oracle has Oracle Text, where you can define which type of full text index you want (a rough sketch follows this list).
Lucene is a Java full text search framework.
Solr is a server product that provides full text search.
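As a very rough sketch of the Oracle Text option (this does change the query, the index name is made up, and matching the exact phrase 'current_step: complete' would need lexer/section configuration beyond this example):
CREATE INDEX tbl_flags_ctx ON tbl (flags) INDEXTYPE IS CTXSYS.CONTEXT;

SELECT * FROM tbl WHERE CONTAINS(flags, 'complete') > 0;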
I would prefer not to add a boolean column to store the pre-calculated value as this would necessitate code changes in the application
There are two ways I can suggest:
1.
If you are on 11g and up, you could have a VIRTUAL COLUMN that is generated as 1 when the value contains the 'complete' text and 0 otherwise. All you need to do then is:
select * from tbl where virtual_column = 1
To improve performance, you could put an index on it, which is equivalent to a function-based index, as sketched below.
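A minimal sketch of that (11g and up; the index name is made up and the column name just mirrors the example query above):
ALTER TABLE tbl ADD (
  virtual_column NUMBER GENERATED ALWAYS AS (
    CASE WHEN flags LIKE '%current_step: complete%' THEN 1 ELSE 0 END
  ) VIRTUAL
);

CREATE INDEX tbl_virtual_column_ix ON tbl (virtual_column);

-- the (modified) query then becomes
SELECT * FROM tbl WHERE virtual_column = 1;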
2.
Update: Perhaps I should be clearer about my second point (source):
There are instances where Oracle will use an index to resolve a like with the pattern of '%text%'. If the query can be resolved without having to go back to the table (rowid lookup), the index may be chosen. Example:
select distinct first_nm from person where first_nm like '%EV%';
And in the above case, Oracle will do an index fast full scan - a full scan of the smaller index.

Unwanted queries merge in Oracle 10g

I am working on Oracle Database 10g Release 10.2.0.5.0. I have a view like:
CREATE OR REPLACE VIEW some_view
(
A,
B
)
AS
SELECT A, B
FROM table_a
WHERE condition_a
UNION ALL
SELECT A, B
FROM table_b
WHERE condition_b;
and a database function some_db_package.foo(). My problem is that when I execute the query:
SELECT A, some_db_package.foo(B) val
FROM some_view
WHERE some_db_package.foo(B) = 0;
Oracle is merging conditions from query and some_view, so I am getting something like:
SELECT A, some_db_package.foo(B) val
FROM table_a
WHERE some_db_package.foo(B) = 0 AND condition_a
UNION ALL
SELECT A, some_db_package.foo(B) val
FROM table_b
WHERE some_db_package.foo(B) = 0 AND condition_b;
some_db_package.foo() executes on all rows from table_a and table_b, and I would like to execute some_db_package.foo() only on the rows already filtered by condition_a and condition_b. Is there any way to do that (e.g. by changing the SQL query or the some_view definition), assuming that I cannot use optimizer hints in the query?
Problem solved. Just to summarize:
some_db_package.foo() - for a given event and date range it counts the event's errors which occurred between the dates (foo() accesses tables), so it is deterministic only when sysdate > dateTo.
select * from ( SELECT A, some_db_package.foo(B) val FROM some_view ) does not make a difference.
Actually I do not need UNION ALL, and I did test with UNION, but still the same result.
with some_view_set as (select A, B from some_view) select * from ( select A, some_db_package.foo(B) val from some_view_set ) where val = 0 does not make a difference either.
I did test with optimizer hints and unfortunately Oracle ignored them.
Using ROWNUM >= 1 in some_view was the solution to my problem.
Thank you for the help, I really appreciate it.
ROWNUM is usually the best way to stop optimizer transformations. Hints are difficult to get right - the syntax is weird and buggy, and there are many potential transformations that need to be stopped. There are other ways to re-write the query, but ROWNUM is generally the best way because it is documented to work this way. Because ROWNUM has to be evaluated last in order to work in Top-N queries, you can always rely on it to prevent query blocks from being merged.
Sample schema
drop table table_a;
drop table table_b;
create table table_a(a number, b number);
create table table_b(a number, b number);
insert into table_a select level, level from dual connect by level <= 10;
insert into table_b select level, level from dual connect by level <= 10;
begin
dbms_stats.gather_table_stats(user, 'table_a');
dbms_stats.gather_table_stats(user, 'table_b');
end;
/
--FOO takes 1 second each time it is executed.
create or replace function foo(p_value number) return number is
begin
dbms_lock.sleep(1);
return 0;
end;
/
--BAR is fast, but the optimizer doesn't know it.
create or replace function bar(p_value number) return number is
begin
return p_value;
end;
/
--This view returns 2 rows.
CREATE OR REPLACE VIEW some_view AS
SELECT A, B
FROM table_a
WHERE a = bar(1)
UNION ALL
SELECT A, B
FROM table_b
WHERE a = bar(2);
Slow query
This query takes 20 seconds to run, implying the function is evaluated 20 times.
SELECT A, foo(B) val
FROM some_view
WHERE foo(B) = 0;
The explain plan shows the conditions are merged, and it appears that the conditions are evaluated from left to right (but don't rely on this always being true!).
explain plan for
SELECT A, foo(B) val
FROM some_view
WHERE foo(B) = 0;
select * from table(dbms_xplan.display);
Plan hash value: 4139878329
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 5 (0)| 00:00:01 |
| 1 | VIEW | SOME_VIEW | 1 | 6 | 5 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | TABLE ACCESS FULL| TABLE_A | 1 | 6 | 3 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| TABLE_B | 1 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("FOO"("B")=0 AND "A"="BAR"(1))
4 - filter("FOO"("B")=0 AND "A"="BAR"(2))
Note
-----
- automatic DOP: skipped because of IO calibrate statistics are missing
Fast query
Add a seemingly redundant ROWNUM predicate that does nothing except prevent transformations.
CREATE OR REPLACE VIEW some_view2 AS
SELECT A, B
FROM table_a
WHERE a = bar(1)
AND ROWNUM >= 1 --Prevent optimizer transformations, for performance.
UNION ALL
SELECT A, B
FROM table_b
WHERE a = bar(2)
AND ROWNUM >= 1 --Prevent optimizer transformations, for performance.
;
Now the query takes only 4 seconds; the function is run only 4 times.
SELECT A, foo(B) val
FROM some_view2
WHERE foo(B) = 0;
In the new explain plan it's clear that the FOO function is evaluated last, after most of the filtering is complete.
explain plan for
SELECT A, foo(B) val
FROM some_view2
WHERE foo(B) = 0;
select * from table(dbms_xplan.display);
Plan hash value: 4228269064
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 52 | 6 (0)| 00:00:01 |
|* 1 | VIEW | SOME_VIEW2 | 2 | 52 | 6 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
| 3 | COUNT | | | | | |
|* 4 | FILTER | | | | | |
|* 5 | TABLE ACCESS FULL| TABLE_A | 1 | 6 | 3 (0)| 00:00:01 |
| 6 | COUNT | | | | | |
|* 7 | FILTER | | | | | |
|* 8 | TABLE ACCESS FULL| TABLE_B | 1 | 6 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FOO"("B")=0)
4 - filter(ROWNUM>=1)
5 - filter("A"="BAR"(1))
7 - filter(ROWNUM>=1)
8 - filter("A"="BAR"(2))
Note
-----
- automatic DOP: skipped because of IO calibrate statistics are missing
Ben's idea to make the function DETERMINISTIC may also help reduce the function calls.
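For reference, that is just a matter of declaring the function DETERMINISTIC (a sketch based on the FOO above; Oracle does not verify the claim, so only do this if the function really returns the same output for the same input):
create or replace function foo(p_value number) return number
  deterministic
is
begin
  dbms_lock.sleep(1);
  return 0;
end;
/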

Processing 5000 records at a time from a select query is taking a long time in an Oracle database

Each time I want to process 5000 records, like below.
The first time I want to process rows 1 to 5000.
The second time I want to process rows 5001 to 10000.
The third time I want to process rows 10001 to 15000, and so on.
I don't want to go for a procedure or PL/SQL. I will change the rnum values in my code to fetch the 5000 records.
The given query is taking 3 minutes to fetch the records from the 3 joined tables. How can I reduce the time to fetch the records?
select * from (
SELECT to_number(AA.MARK_ID) as MARK_ID, AA.SUPP_ID as supplier_id, CC.supp_nm as SUPPLIER_NAME, CC.supp_typ as supplier_type,
CC.supp_lock_typ as supplier_lock_type, ROW_NUMBER() OVER (ORDER BY AA.MARK_ID) as rnum
from TABLE_A AA, TABLE_B BB, TABLE_C CC
WHERE
AA.MARK_ID=BB.MARK_ID AND
AA.SUPP_ID=CC.location_id AND
AA.char_id='160' AND
BB.VALUE_KEY=AA.VALUE_KEY AND
BB.VALUE_KEY=CC.VALUE_KEY
AND AA.VPR_ID IS NOT NULL)
where rnum >=10001 and rnum<=15000;
I have tried the scenarios below, but no luck:
I have tried the /*+ USE_NL(AA BB) */ hint.
I used EXISTS in the where conditions, but it is taking the same 3 minutes to fetch the records.
Below are the table details.
select count(*) from TABLE_B;
-----------------
2275
select count(*) from TABLE_A;
-----------------
2405276
select count(*) from TABLE_C;
-----------------
1269767
The total record count of my inner query is:
SELECT count(*)
from TABLE_A AA, TABLE_B BB, TABLE_C CC
WHERE
AA.MARK_ID=BB.MARK_ID AND
AA.SUPP_ID=CC.location_id AND
AA.char_id='160' AND
BB.VALUE_KEY=AA.VALUE_KEY AND
BB.VALUE_KEY=CC.VALUE_KEY
AND AA.VPR_ID IS NOT NULL;
-----------------
2027055
All the columns used in the where conditions are properly indexed.
The explain plan for the given query is:
Plan hash value: 3726328503
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2082K| 182M| | 85175 (1)| 00:17:03 |
|* 1 | VIEW | | 2082K| 182M| | 85175 (1)| 00:17:03 |
|* 2 | WINDOW SORT PUSHED RANK | | 2082K| 166M| 200M| 85175 (1)| 00:17:03 |
|* 3 | HASH JOIN | | 2082K| 166M| | 44550 (1)| 00:08:55 |
| 4 | TABLE ACCESS FULL | TABLE_C | 1640 | 49200 | | 22 (0)| 00:00:01 |
|* 5 | HASH JOIN | | 2082K| 107M| 27M| 44516 (1)| 00:08:55 |
|* 6 | VIEW | index$_join$_005 | 1274K| 13M| | 9790 (1)| 00:01:58 |
|* 7 | HASH JOIN | | | | | | |
| 8 | INLIST ITERATOR | | | | | | |
|* 9 | INDEX RANGE SCAN | TABLE_B_IN2 | 1274K| 13M| | 2371 (2)| 00:00:29 |
| 10 | INDEX FAST FULL SCAN| TABLE_B_IU1 | 1274K| 13M| | 4801 (1)| 00:00:58 |
|* 11 | TABLE ACCESS FULL | TABLE_A | 2356K| 96M| | 27174 (1)| 00:05:27 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RNUM">=10001 AND "RNUM"<=15000)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "A"."MARK_ID")<=15000)
3 - access("A"."SUPP_ID"="C"."LOC_ID" AND "A"."VALUE_KEY"="C"."VALUE_KEY")
5 - access("A"."MARK_ID"="A"."MARK_ID" AND "A"."VALUE_KEY"="A"."VALUE_KEY")
6 - filter("A"."MARK_CHN_IND"='C' OR "A"."MARK_CHN_IND"='D')
7 - access(ROWID=ROWID)
9 - access("A"."MARK_CHN_IND"='C' OR "A"."MARK_CHN_IND"='D')
11 - filter("A"."CHNL_ID"=160 AND "A"."VPR_ID" IS NOT NULL)
Could anyone please help me tune this query? I have been trying for the last 2 days.
Each query will take a long time because each query will have to join then sort all rows. The row_number analytic function can only return a result if the whole set has been read. This is highly inefficient. If the data set is large, you only want to sort and hash-join once.
You should fetch the whole set once, using batches of 5k rows. Alternatively, if you want to keep your existing code logic, you could store the result in a temporary table, for instance:
CREATE TABLE TMP AS <your above query, without the outer rnum filter>;
CREATE INDEX TMP_RNUM_IX ON TMP (rnum); -- an index name is required
And then replace the query in your code with
SELECT * FROM TMP WHERE rnum BETWEEN :x AND :y;
Obviously if your temp table is being reused periodically, just create it once and delete when done (or use a true temporary table).
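If you go the true temporary table route, a minimal sketch (the table, index name and column datatypes here are illustrative; ON COMMIT PRESERVE ROWS keeps the rows for the rest of your session):
CREATE GLOBAL TEMPORARY TABLE tmp_supplier_batch (
  rnum               NUMBER,
  mark_id            NUMBER,
  supplier_id        NUMBER,
  supplier_name      VARCHAR2(200),
  supplier_type      VARCHAR2(50),
  supplier_lock_type VARCHAR2(50)
) ON COMMIT PRESERVE ROWS;

CREATE INDEX tmp_supplier_batch_ix ON tmp_supplier_batch (rnum);

-- load the whole result set once per session (join and sort happen only here)
INSERT INTO tmp_supplier_batch (mark_id, supplier_id, supplier_name, supplier_type, supplier_lock_type, rnum)
SELECT to_number(AA.MARK_ID), AA.SUPP_ID, CC.supp_nm, CC.supp_typ, CC.supp_lock_typ,
       ROW_NUMBER() OVER (ORDER BY AA.MARK_ID)
FROM   TABLE_A AA, TABLE_B BB, TABLE_C CC
WHERE  AA.MARK_ID=BB.MARK_ID
AND    AA.SUPP_ID=CC.location_id
AND    AA.char_id='160'
AND    BB.VALUE_KEY=AA.VALUE_KEY
AND    BB.VALUE_KEY=CC.VALUE_KEY
AND    AA.VPR_ID IS NOT NULL;

-- each batch is then a cheap indexed range scan
SELECT * FROM tmp_supplier_batch WHERE rnum BETWEEN :x AND :y;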
How many unique MARK_ID values have you got in TABLE_A? I think you may get better performance if you limit the fetched ranges of records by MARK_ID instead of the artificial row number, because the latter is obviously not sargeable. Granted, you may not get exactly 5000 rows in each range but I have a feeling it's not as important as the query performance.
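As a rough sketch of that idea (the bind ranges are illustrative, and it assumes MARK_ID is indexed and its datatype makes the BETWEEN comparison meaningful):
SELECT to_number(AA.MARK_ID) as MARK_ID, AA.SUPP_ID as supplier_id,
       CC.supp_nm as SUPPLIER_NAME, CC.supp_typ as supplier_type,
       CC.supp_lock_typ as supplier_lock_type
FROM   TABLE_A AA, TABLE_B BB, TABLE_C CC
WHERE  AA.MARK_ID BETWEEN :mark_id_from AND :mark_id_to
AND    AA.MARK_ID=BB.MARK_ID
AND    AA.SUPP_ID=CC.location_id
AND    AA.char_id='160'
AND    BB.VALUE_KEY=AA.VALUE_KEY
AND    BB.VALUE_KEY=CC.VALUE_KEY
AND    AA.VPR_ID IS NOT NULL;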
Firstly, giving obfuscated table names makes it nearly impossible to deduce anything about the data distributions and relationships between tables, so potential answerers are crippled from the start.
However, if every row in table_a matches one row in the other tables then you can avoid some of the usage of 200Mb of temporary disk space that is probably crippling performance by pushing the ranking down into an inline view or common table expression.
Monitor V$SQL_WORKAREA to check the exact amount of space being used for the window function, and if it is still excessive consider modifying the memory management to increase available sort area size.
Something like:
with cte_table_a as (
SELECT
to_number(MARK_ID) as MARK_ID,
SUPP_ID as supplier_id,
ROW_NUMBER() OVER (ORDER BY MARK_ID) as rnum
from
TABLE_A
where
char_id='160' and
VPR_ID IS NOT NULL)
select ...
from
cte_table_a aa,
TABLE_B BB,
TABLE_C CC
WHERE
aa.rnum >= 10001 and
aa.rnum <= 15000 and
AA.MARK_ID = BB.MARK_ID AND
AA.SUPP_ID = CC.location_id AND
BB.VALUE_KEY = AA.VALUE_KEY AND
BB.VALUE_KEY = CC.VALUE_KEY
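To check the work area usage mentioned above, something along these lines (a sketch; V$SQL_WORKAREA is per cursor, so filter on your own statement's SQL_ID):
SELECT operation_type,
       policy,
       last_memory_used,
       last_tempseg_size,
       max_tempseg_size
FROM   v$sql_workarea
WHERE  sql_id = :sql_id;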

Why won't Oracle use my index unless I tell it to?

I have an index:
CREATE INDEX BLAH ON EMPLOYEE(SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4));
and an SQL STATEMENT:
SELECT COUNT(*)
FROM (SELECT COUNT(*)
FROM EMPLOYEE
GROUP BY SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4)
HAVING COUNT(*) > 100);
but it keeps doing a full table scan instead of using the index unless I add a hint.
EMPSHIRTNO is not the primary key, EMPNO is (which isn't used here).
Complex query
EXPLAIN PLAN FOR SELECT COUNT(*) FROM (SELECT COUNT(*) FROM EMPLOYEE
GROUP BY SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4)
HAVING COUNT(*) > 100);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1712471557
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 24 (9)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | | |
| 2 | VIEW | | 497 | | 24 (9)| 00:00:01 |
|* 3 | FILTER | | | | | |
| 4 | HASH GROUP BY | | 497 | 2485 | 24 (9)| 00:00:01 |
| 5 | TABLE ACCESS FULL| EMPLOYEE | 9998 | 49990 | 22 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(COUNT(*)>100)
17 rows selected.
ANALYZE INDEX BLAH VALIDATE STRUCTURE;
SELECT BTREE_SPACE, USED_SPACE FROM INDEX_STATS;
BTREE_SPACE USED_SPACE
----------- ----------
176032 150274
Simple query:
EXPLAIN PLAN FOR SELECT * FROM EMPLOYEE;
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2913724801
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9998 | 439K| 23 (5)| 00:00:01 |
| 1 | TABLE ACCESS FULL| EMPLOYEE | 9998 | 439K| 23 (5)| 00:00:01 |
------------------------------------------------------------------------------
8 rows selected.
Maybe it is because the NOT NULL constraint is enforced via a CHECK constraint rather than being defined originally in the table creation statement? It will use the index when I do:
SELECT * FROM EMPLOYEE WHERE SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4) = '1234';
For those suggesting that it needs to read all of the rows anyway (which I don't think it does as it is counting), the index is not used on this either:
SELECT SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4) FROM EMPLOYEE;
In fact, putting an index on EMPSHIRTNO and performing SELECT EMPSHIRTNO FROM EMPLOYEE; does not use the index either. I should point out that EMPSHIRTNO is not unique, there are duplicates in the table.
Because of the nature of your query it needs to scan every row of the table anyway, so Oracle is probably deciding that a full table scan is the most efficient way to do this. Because it is using a HASH GROUP BY, there is no nasty sort at the end like in the Oracle 7 days.
First it gets the count per SUBSTR(...) of the shirt number. This is the part of the query that has to scan the entire table:
SELECT COUNT(*)
FROM EMPLOYEE
GROUP BY SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4)
Next you want to discard the SUBSTR(...) groups where the count is <= 100. Oracle needs to scan all rows to verify this. Technically you could argue that once a group has reached 101 it doesn't need any more, but I don't think Oracle can work this out, especially as you are asking for the total number in the SELECT COUNT(*) of the subquery.
HAVING COUNT(*) > 100);
So basically, to give you the answer you want, Oracle needs to scan every row in the table, so an index is no help for filtering. Because it is using a hash group by, the index is no help for the grouping either. Using the index would therefore just slow your query down, which is why Oracle is not using it.
I think you may need to build a function-based index on SUBSTR(TO_CHAR(EMPSHIRTNO), 1,4); Functions in your SQL have a tendency to invalidate regular indexes on a column.
I believe @Codo is correct. Oracle cannot determine that the expression will always be non-null, and so it must assume that some rows may be missing from the index (rows whose indexed expression is NULL are not stored in a b-tree index).
(It seems like Oracle should be able to figure out that the expression is not nullable. In general, the chance of any random SUBSTR expression always being not null is probably very low; maybe Oracle just lumps all SUBSTR expressions together?)
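One way to see the nullability issue in action (a sketch, separate from the work-arounds below; whether the index is actually chosen still depends on costing): tell Oracle in the query itself that the indexed expression cannot be NULL, so every row of interest must be present in the index:
SELECT COUNT(*)
FROM (SELECT COUNT(*)
      FROM EMPLOYEE
      WHERE SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4) IS NOT NULL
      GROUP BY SUBSTR(TO_CHAR(EMPSHIRTNO), 1, 4)
      HAVING COUNT(*) > 100);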
You can make the index usable for your query with one of these work-arounds:
--bitmap index:
create bitmap index blah on employee(substr(to_char(empshirtno), 1, 4));
--multi-column index:
alter table employee add constraint blah primary key (id, empshirtno);
--indexed virtual column:
create table employee(
  empshirtno           varchar2(10) not null,
  empshirtno_for_index as (substr(empshirtno, 1, 4)) not null
);
create index blah on employee(empshirtno_for_index);