I have a table A with intervals (COL1, COL2):
CREATE TABLE A (
COL1 NUMBER(15) NOT NULL,
COL2 NUMBER(15) NOT NULL,
VAL1 ...,
VAL2 ...
);
ALTER TABLE A ADD CONSTRAINT COL1_BEFORE_COL2 CHECK (COL1 <= COL2);
The intervals are guaranteed to be "exclusive", i.e. they will never overlap. In other words, this query yields no rows:
SELECT *
FROM (
SELECT
LEAD(COL1, 1) OVER (ORDER BY COL1) NEXT,
COL2
FROM A
)
WHERE COL2 >= NEXT;
There is currently an index on (COL1, COL2). Now, my query is the following:
SELECT /*+FIRST_ROWS(1)*/ *
FROM A
WHERE :some_value BETWEEN COL1 AND COL2
AND ROWNUM = 1
This performs well (less than a ms for millions of records in A) for low values of :some_value, because they're very selective on the index. But it performs quite badly (almost a second) for high values of :some_value because of a lower selectivity of the access predicate.
The execution plan seems good to me. As the existing index already fully covers the predicate, I get the expected INDEX RANGE SCAN:
------------------------------------------------------
| Id | Operation | Name | E-Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | |
|* 1 | COUNT STOPKEY | | |
| 2 | TABLE ACCESS BY INDEX ROWID| A | 1 |
|* 3 | INDEX RANGE SCAN | A_PK | |
------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM=1)
3 - access("VAL2">=:some_value AND "VAL1"<=:some_value)
filter("VAL2">=:some_value)
In 3, it becomes obvious that the access predicate is selective only for low values of :some_value whereas for higher values, the filter operation "kicks in" on the index.
Is there any way to generally improve this query to be fast regardless of the value of :some_value? I can completely redesign the table if further normalisation is needed.
Your attempt is good, but misses a few crucial issues.
Let's start slowly. I'm assuming an index on COL1 and I actually don't mind if COL2 is included there as well.
Due to the constraints you have on your data (especially non-overlapping) you actually just want the row before the row where COL1 is <= some value....[--take a break--] it you order by COL1
This is a classic Top-N query:
select *
FROM ( select *
from A
where col1 <= :some_value
order by col1 desc
)
where rownum <= 1;
Please note that you must use ORDER BY to get a definite sort order. As WHERE is applied after ORDER BY you must now also wrap the top-n filter in an outer query.
That's almost done, the only reason why we actually need to filter on COL2 too is to filter out records that don't fall into the range at all. E.g. if some_value is 5 and you are having this data:
COL1 | COL2
1 | 2
3 | 4 <-- you get this row
6 | 10
This row would be correct as result, if COL2 would be 5, but unfortunately, in this case the correct result of your query is [empty set]. That's the only reason we need to filter for COL2 like this:
select *
FROM ( select *
FROM ( select *
from A
where col1 <= :some_value
order by col1 desc
)
where rownum <= 1
)
WHERE col2 >= :some_value;
Your approach had several problems:
missing ORDER BY - dangerous in connection with rownum filter!
applying the Top-N clause (rownum filter) too early. What if there is no result? Database reads index until the end, the rownum (STOPKEY) never kicks in.
An optimizer glitch. With the between predicate, my 11g installation doesn't come to the idea to read the index in descending order, so it was actually reading it from the beginning (0) upwards until it found a COL2 value that matched --OR-- the COL1 run out of the range.
.
COL1 | COL2
1 | 2 ^
3 | 4 | (2) go up until first match.
+----- your intention was to start here
6 | 10
What was actually happening was:
COL1 | COL2
1 | 2 +----- start at the beginning of the index
3 | 4 | Go down until first match.
V
6 | 10
Look at the execution plan of my query:
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 4 (0)| 00:00:01 |
|* 1 | VIEW | | 1 | 26 | 4 (0)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | VIEW | | 2 | 52 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID | A | 50000 | 585K| 4 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN DESCENDING| SIMPLE | 2 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Note the INDEX RANGE SCAN **DESCENDING**.
Finally, why didn't I include COL2 in the index? It's a single row-top-n query. You can save at most a single table access (irrespective of what the Rows estimation above says!) If you expect to find a row in most cases, you'll need to go to the table anyways for the other columns (probably) so you would not save ANYTHING, just consume space. Including the COL2 will only improve performance if you query doesn't return anything at all!
Related:
How to use index efficienty in mysql query
I answered a very similar question about this years ago. Same solution.
Use The Index, Lukas!
I think, because the ranges do not intersect, you can define col1 as primary key and execute the query like this:
SELECT *
FROM a
JOIN
(SELECT MAX (col1) AS col1
FROM a
WHERE col1 <= :somevalue) b
ON a.col1 = b.col1;
If there are gaps between the ranges you wil have to add:
Where col2 >= :somevalue
as last line.
Execution Plan:
SELECT STATEMENT
NESTED LOOPS
VIEW
SORT AGGREGATE
FIRST ROW
INDEX RANGE SCAN (MIN/MAX) PKU1
TABLE ACCESS BY INDEX A
INDEX UNIQUE SCAN PKU1
Maybe changing this heap table to IOT table would give better performance.
I didn't generate sample data to test this but you might want to give it a try.
ALTER TABLE A ADD COL3 NUMBER(15);
UPDATE A SET COL3 = COL2 - COL1;
Create index on COL3.
SELECT /*+FIRST_ROWS(1)*/ *
FROM A
WHERE :some_value < COL3
AND ROWNUM = 1;
Related
select * from Schem.Customer
where cust='20' and cust_id >= '890127'
and rownum between 1 and 2 order by cust, cust_id;
Execution time appr 2 min 10 sec
select * from Schem.Customer where cust='20'
and cust_id >= '890127'
order by cust, cust_id fetch first 2 rows only ;
Execution time appr 00.069 ms
The execution time is a huge difference but results are the same. My team is not adopting to later one. Don't ask why.
So what is the difference between Rownum and fetch first 2 rows and what should I do to improve or convince anyone to adopt.
DBMS : DB2 LUW
Although both SQL end up giving same resultset, it only happens for your data. There is a great chance that resultset would be different. Let me explain why.
I will make your SQL a little simpler to make it simple to understand:
SELECT * FROM customer
WHERE ROWNUM BETWEEN 1 AND 2;
In this SQL, you want only first and second rows. That's fine. DB2 will optimize your query and never look rows beyond 2nd. Because only first 2 rows qualify your query.
Then you add ORDER BY clause:
SELECT * FROM customer
WHERE ROWNUM BETWEEN 1 AND 2;
ORDER BY cust, cust_id;
In this case, DB2 first fetches 2 rows then order them by cust and cust_id. Then sends to client(you). So far so good. But what if you want to order by cust and cust_id first, then ask for first 2 rows? There is a great difference between them.
This is the simplified SQL for this case:
SELECT * FROM customer
ORDER BY cust, cust_id
FETCH FIRST 2 ROWS ONLY;
In this SQL, ALL rows qualify the query, so DB2 fetches all of the rows, then sorts them, then sends first 2 rows to client.
In your case, both queries give same results because first 2 rows are already ordered by cust and cust_id. But it won't work if first 2 rows would have different cust and cust_id values.
A hint about this is FETCH FIRST n ROWS comes after order by, that means DB2 orders the result then retrieves first n rows.
Excellent answer here:
https://blog.dbi-services.com/oracle-rownum-vs-rownumber-and-12c-fetch-first/
Now the index range scan is chosen, with the right cardinality estimation.
So which solution it the best one? I prefer row_number() for several reasons:
I like analytic functions. They have larger possibilities, such as setting the limit as a percentage of total number of rows for example.
11g documentation for rownum says:
The ROW_NUMBER built-in SQL function provides superior support for ordering the results of a query
12c allows the ANSI syntax ORDER BY…FETCH FIRST…ROWS ONLY which is translated to row_number() predicate
12c documentation for rownum adds:
The row_limiting_clause of the SELECT statement provides superior support
rownum has first_rows_n issues as well
PLAN_TABLE_OUTPUT
SQL_ID 49m5a3f33cmd0, child number 0
-------------------------------------
select /*+ FIRST_ROWS(10) */ * from test where contract_id=500
order by start_validity fetch first 10 rows only
Plan hash value: 1912639229
--------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | Buffers |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 | 15 |
|* 1 | VIEW | | 1 | 10 | 10 | 15 |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 10 | 10 | 15 |
| 3 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 10 | 11 | 15 |
|* 4 | INDEX RANGE SCAN | TEST_PK | 1 | | 11 | 4 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber" <=10)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "TEST"."START_VALIDITY") <=10 )
4 - access("CONTRACT_ID"=500)
I have one table with 3 fields and I neeed get all value of fields, I have next query:
SELECT COM.FIELD1, COM.FIELD2, COM.FIELD3
FROM OWNER.TABLE_NAME COM
WHERE COM.FIELD1 <> V_FIELD
ORDER BY COM.FIELD3 ASC;
And i want optimaze, I have next values of explain plan:
Plan
SELECT STATEMENT CHOOSECost: 4 Bytes: 90 Cardinality: 6
2 SORT ORDER BY Cost: 4 Bytes: 90 Cardinality: 6
1 TABLE ACCESS FULL OWNER.TABLE_NAME Cost: 2 Bytes: 90 Cardinality: 6
Any solution for not get TAF(Table Acces Full)?
Thanks!
Since your WHERE condition is on the column FIELD1, an index on that column many help.
You may already have an index on that column. Even then, you will still see a full table access, if the expected number of rows that don't have VAL1 in that column is sufficiently large.
The only case when you will NOT see full table access is if you have an index on that column, the vast majority (at least, say, 80% to 90%) of rows in the table do have the value VAL1 in the column FIELD1, and statistics are up to date AND, perhaps, you need to use a histogram (because in this case the distribution of values in FIELD1 would be very skewed).
I suppose that your table has a very large number of rows with a given key (let call it 'B') and a very small number of rows with other keys.
Note, that the index access will work only for conditions FIELD1 <> 'B', all other predicates will return 'B' and therefore are not suitable for index access.
Note also that if you have more that one large key, the index access will not work from the same reason - you will never get only a few record where index can profit.
As a starting point you can reformulte the predicate
FIELD1 <> V_FIELD
as
DECODE(FIELD1,V_FIELD,1,0) = 0
The DECODE return 1 if FIELD1 = V_FIELD and returns 0 if FIELD1 <> V_FIELD
This transformation allows you to define a function based index with the DECODE expression.
Example
create table tt as
select
decode(mod(rownum,10000),1,'A','B') FIELD1
from dual connect by level <= 50000;
select field1, count(*) from tt group by field1;
FIELD1 COUNT(*)
------ ----------
A 5
B 49995
FBIndex
create index tti on tt(decode(field1,'B',1,0));
Use your large key for the index definition.
Access
To select FIELD1 <> 'B' use reformulated predicate decode(field1,'B',1,0) = 0
Which leads nicely to an index access:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
SELECT * from tt where decode(field1,'B',1,0) = 0;
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 471 | 2355 | 24 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT | 471 | 2355 | 24 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TTI | 188 | | 49 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(DECODE("FIELD1",'B',1,0)=0)
To select FIELD1 <> 'A' use reformulated predicate decode(field1,'A',1,0) = 0
Here you don't want index access as nearly the whole table is returned- and the CBO opens FULL TABLE SCAN.
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
SELECT * from tt where decode(field1,'A',1,0) = 0;
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 47066 | 94132 | 26 (4)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TT | 47066 | 94132 | 26 (4)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(DECODE("FIELD1",'A',1,0)=0)
Bind Variables
This will work the same way even if you use bind variables FIELD1 <> V_FIELD - provided you pass always the same value.
The bind variable peeking will evaluate the correct plan in the first parse and generate the proper plan.
If you will use more that one values as bind variable (and therefore expect to get different plans for different values) - you will learn the feature of adaptive cursor sharing
The query is already optimized, don't spend any more time on it unless it's running noticeably slow. If you have a tuning checklist that says "avoid all full table scans" it might be time to change that checklist.
The cost of the full table scan is only 2. The exact meaning of the cost is tricky, and not always particularly helpful. But in this case it's probably safe to say that 2 means the full table scan will run quickly.
If the query is not running in less than a few microseconds, or is returning significantly more than the estimated 6 rows, then there may be a problem with the optimizer statistics. If that's the case, try gathering statistics like this:
begin
dbms_stats.gather_table_stats('OWNER', 'TABLE_NAME');
end;
/
As #symcbean pointed out, a full table scan is not always a bad thing. If a table is incredibly small, like this one might be, all the data may fit inside a single block. (Oracle accesses data by block(s)-at-a-time, where the block is usually 8KB of data.) When the data structures are trivially small there won't be any significant difference between using a table or an index.
Also, full table scans can use multi-block reads, whereas most index access paths use single-block reads. For reading a large percentage of data it's faster to read the whole thing with multi-block reads than reading it one-block-at-a-time with an index. Since this query only has a <> condition, it looks likely that this query will read a large percentage of data and a full table scan is optimal.
I'm exectuting command on large table. It has around 7 millons rows.
Command is like this:
select * from mytable;
Now I'm restriting the number of rows to around 3 millons. I'm using this command:
select * from mytable where timest > add_months( sysdate, -12*4 )
I have an index on timest column. But the costs are almost same. I would expect they will decrease. What am I doing wrong?
Any clue?
Thank you in advance!
Here explain plans:
using an index for 3 out of 7 mio. of rows would most probably be even more expensive, so oracle makes a full table scan for both queries, which is IMO correct.
You may try to do parallel FTS (Full Table Scan) - it should be faster, BUT it will put your Oracle server under higher load, so don't do it on heavy loaded multiuser DBs.
Here is an example:
select /*+full(t) parallel(t,4)*/ *
from mytable t
where timest > add_months( sysdate, -12*4 );
To select a very small number of records from a table use index. To select a non-trivial part use partitioning.
In your case an effective acces would be enabled with range partitioning on timest column.
The big advantage is that only relevant partitions are accessed.
Here an exammple
create table test(ts date, s varchar2(4000))
PARTITION BY RANGE (ts)
(PARTITION t1p1 VALUES LESS THAN (TO_DATE('2010-01-01', 'YYYY-MM-DD')),
PARTITION t1p2 VALUES LESS THAN (TO_DATE('2015-01-01', 'YYYY-MM-DD')),
PARTITION t1p4 VALUES LESS THAN (MAXVALUE)
);
Query
select * from test where ts < to_date('2009-01-01','yyyy-mm-dd');
will access only the paartition 1, i.e. only before '2010-01-01'.
See pstart an dpstop in execution plan
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 10055 | 9 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 5 | 10055 | 9 (0)| 00:00:01 | 1 | 1 |
|* 2 | TABLE ACCESS FULL | TEST | 5 | 10055 | 9 (0)| 00:00:01 | 1 | 1 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("TS"<TO_DATE(' 2009-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
There are (at least) two problems.
add_months( sysdate, -12*4 ) is a function. It's not just a constant, so optimizer can't use index here.
Choosing 3 mln. from 7 mln. rows by index is not good idea anyway. Yes, you (would) go by index tree fast, yet each time you have to go to the heap (cause you need * = all rows). That means that there's no sense in using this index.
Thus the index plays no role here.
I need to make a query to determine if 3 columns are not filled out. Should I make a column in the table just as a flag to note that 3 columns are empty? My instinct tells me that I shouldn't make an extra column. I'm just wondering if I would get any performance boost from doing so. This is for oracle server.
select count(*) from my_table t where t.not_available = 1;
or
select count(*) from my_table t where t.col1 is null and t.col2 is null and t.col3 is null;
I think you are doing a pre-mature optimization.
Adding an extra column into a table increases the size of each record. This would typically mean that a table would occupy more space on disk. Large table sizes imply longer full table scans.
Adding indexes might help. But, there is an associated cost with them. If an index would help, you don't need to add another column, because Oracle supports functional indexes. So, you can index on an expression.
In most cases, your query is going to do a full table scan or full index scan, unless some of the conditions are rare.
In other words, to have a change of really answering your question requires understanding:
The record layout
The distribution of values in the three columns
Any additional factors that might affect access, such as partitioned columns
Only when performance leaves you with no other choice should you resort to an extra redundant column. In this case, you should probably avoid it. Just introduce an index on (col1,col2,col3,1) if performance of this statement is too poor.
Here is an example of why putting the 4th constant value 1 in the index is probably a good idea.
First a table with 1000 rows, out of which only 1 row (456) has all three columns NULL:
SQL> create table my_table (id,col1,col2,col3,fill)
2 as
3 select level
4 , nullif(level,456)
5 , nullif(level,456)
6 , nullif(level,456)
7 , rpad('*',100,'*')
8 from dual
9 connect by level <= 1000
10 /
Table created.
A row with three NULLS is not indexed by the index below:
SQL> create index my_table_i1 on my_table(col1,col2,col3)
2 /
Index created.
and will use a full table scan in my test case (likely a full index scan on your primary key index in your case)
SQL> exec dbms_stats.gather_table_stats(user,'my_table')
PL/SQL procedure successfully completed.
SQL> set autotrace on
SQL> select count(*) from my_table t where t.col1 is null and t.col2 is null and t.col3 is null
2 /
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 228900979
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 8 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | TABLE ACCESS FULL| MY_TABLE | 1 | 12 | 8 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T"."COL1" IS NULL AND "T"."COL2" IS NULL AND "T"."COL3"
IS NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
37 consistent gets
0 physical reads
0 redo size
236 bytes sent via SQL*Net to client
247 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
But if I add a constant 1 to the index:
SQL> set autotrace off
SQL> drop index my_table_i1
2 /
Index dropped.
SQL> create index my_table_i2 on my_table(col1,col2,col3,1)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'my_table')
PL/SQL procedure successfully completed.
Then it will use the index and your statement will fly
SQL> set autotrace on
SQL> select count(*) from my_table t where t.col1 is null and t.col2 is null and t.col3 is null
2 /
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 623815834
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | INDEX RANGE SCAN| MY_TABLE_I2 | 1 | 12 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."COL1" IS NULL AND "T"."COL2" IS NULL AND "T"."COL3"
IS NULL)
filter("T"."COL2" IS NULL AND "T"."COL3" IS NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
236 bytes sent via SQL*Net to client
247 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL execution speed depends on numerous factors you didn't list in your question.
Therefore I think you should check the execution plan on your specific server to get first-hand benchmarks of both situtations.
See e.g. here for info on how to display the execution plan on Oracle..
If the column(s) in the WHERE clause allow the use of an index then most likely that would be the faster. However if no columns are indexed then I would expect the first query to be superior.
But checking the plan is the always the best way to know.
I would create on index on not_available and then query that.
CREATE INDEX index_name
ON table_name (not_available)
Something like this might help you?
select count(NVL2(t.col1||t.col2||t.col3),NULL,1) FROM my_table t;
I'm using Oracle 11gR2 and Hibernate 4.2.1.
My application is a searching application.
Only has SELECT operations and all of them are native queries.
Oracle uses case-sensitive sort by default.
I want to override it to case-insensitive.
I saw couple of option here http://docs.oracle.com/cd/A81042_01/DOC/server.816/a76966/ch2.htm#91066
Now I'm using this query before any search executes.
ALTER SESSION SET NLS_SORT='BINARY_CI'
If I execute above sql before execute the search query, hibernate takes about 15 minutes to return from search query.
If I do this in Sql Developer, It returns within couple of seconds.
Why this kind of two different behaviors,
What can I do to get rid of this slowness?
Note: I always open a new Hibernate session for each search.
Here is my sql:
SELECT *
FROM (SELECT
row_.*,
rownum rownum_
FROM (SELECT
a, b, c, d, e,
RTRIM(XMLAGG(XMLELEMENT("x", f || ', ') ORDER BY f ASC)
.extract('//text()').getClobVal(), ', ') AS f,
RTRIM(
XMLAGG(XMLELEMENT("x", g || ', ') ORDER BY g ASC)
.extract('//text()').getClobVal(), ', ') AS g
FROM ( SELECT src.a, src.b, src.c, src.d, src.e, src.f, src.g
FROM src src
WHERE upper(pp) = 'PP'
AND upper(qq) = 'QQ'
AND upper(rr) = 'RR'
AND upper(ss) = 'SS'
AND upper(tt) = 'TT')
GROUP BY a, b, c, d, e
ORDER BY b ASC) row_
WHERE rownum <= 400
) WHERE rownum_ > 0;
There are so may fields comes with LIKE operation, and it is a dynamic sql query. If I use order by upper(B) asc Sql Developer also takes same time.
But order by upper results are same as NLS_SORT=BINARY_CI. I have used UPPER('B') indexes, but nothings gonna work for me.
A's length = 10-15 characters
B's length = 34-50 characters
C's length = 5-10 characters
A, B and C are sort-able fields via app.
This SRC table has 3 million+ records.
We finally ended up with a SRC table which is a materialized view.
Business logic of the SQL is completely fine.
All of the sor-table fields and others are UPPER indexed.
UPPER() and BINARY_CI may produce the same results but Oracle cannot use them interchangeably. To use an index and BINARY_CI you must create an index like this:
create index src_nlssort_index on src(nlssort(b, 'nls_sort=''BINARY_CI'''));
Sample table and mixed case data
create table src(b varchar2(100) not null);
insert into src select 'MiXeD CAse '||level from dual connect by level <= 100000;
By default the upper() predicate can perform a range scan on the the upper() index
create index src_upper_index on src(upper(b));
explain plan for
select * from src where upper(b) = 'MIXED CASE 1';
select * from table(dbms_xplan.display(format => '-rows -bytes -cost -predicate
-note'));
Plan hash value: 1533361696
------------------------------------------------------------------
| Id | Operation | Name | Time |
------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| SRC | 00:00:01 |
| 2 | INDEX RANGE SCAN | SRC_UPPER_INDEX | 00:00:01 |
------------------------------------------------------------------
BINARY_CI and LINGUISTIC will not use the index
alter session set nls_sort='binary_ci';
alter session set nls_comp='linguistic';
explain plan for
select * from src where b = 'MIXED CASE 1';
select * from table(dbms_xplan.display(format => '-rows -bytes -cost -note'));
Plan hash value: 3368256651
---------------------------------------------
| Id | Operation | Name | Time |
---------------------------------------------
| 0 | SELECT STATEMENT | | 00:00:02 |
|* 1 | TABLE ACCESS FULL| SRC | 00:00:02 |
---------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NLSSORT("B",'nls_sort=''BINARY_CI''')=HEXTORAW('6D69786564
2063617365203100') )
Function based index on NLSSORT() enables index range scans
create index src_nlssort_index on src(nlssort(b, 'nls_sort=''BINARY_CI'''));
explain plan for
select * from src where b = 'MIXED CASE 1';
select * from table(dbms_xplan.display(format => '-rows -bytes -cost -note'));
Plan hash value: 478278159
--------------------------------------------------------------------
| Id | Operation | Name | Time |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| SRC | 00:00:01 |
|* 2 | INDEX RANGE SCAN | SRC_NLSSORT_INDEX | 00:00:01 |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(NLSSORT("B",'nls_sort=''BINARY_CI''')=HEXTORAW('6D69786564
2063617365203100') )
I investigated and found that The parameters NLS_COMP y NLS_SORT may affect how oracle make uses of execute plan for string ( when it is comparing or ordering).
Is not necesary to change NLS session. adding
ORDER BY NLSSORT(column , 'NLS_SORT=BINARY_CI')
and adding a index for NLS is enough
create index column_index_binary as NLSSORT(column , 'NLS_SORT=BINARY_CI')
I found a clue to a problem in this issue so i'm paying back.
Why oracle stored procedure execution time is greatly increased depending on how it is executed?