Is there any DB server that can optimize the following query?

Let's say I have the table my_table(id int not null primary key, datafield varchar(100)). Query
SELECT * from my_table where id = 100 performs an index seek. If I change it to
SELECT * from my_table where id+1 = 101
it scans the whole index (an index scan), at least in SQL Server and MySQL. Is there any DB server that 'understands' that id + 1 = 101 is the same as id = 101 - 1? I realize this is not a typical database operation, and the server doesn't have to perform any math in such cases, but I wonder whether it's implemented anywhere.
Thanks
UPDATE
So far I've tried SQL Server 2008 Enterprise and MySQL 5.1 and 5.5. SQL Server shows a clustered index seek and a clustered index scan, respectively. MySQL EXPLAIN shows ref: const, key: PRIMARY, rows: 1 for the first query, and ref: NULL, key: NULL, rows: (total number of rows in the table) for the second.

id +1 = 101 is the same as id = 101-1
No it isn't. What if the +1 overflows the id?

I tried this with PostgreSQL 9.0 and it does not use an index unless I create one on (id - 1).
So with the following index definition
create index idx_minus on my_table ( (id - 1) );
PostgreSQL uses an index for the query
select *
from my_table
where id - 1 = 12345
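The seek-versus-scan difference and the expression-index workaround can be reproduced in a small sketch. It uses Python's built-in sqlite3 module purely as a convenient stand-in: SQLite was not tested in any of the posts above, and the helper name plan is hypothetical. The plan wording differs between SQLite versions, so the sketch only looks for the SEARCH and SCAN keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same shape as the table in the question; SQLite is an assumption here.
conn.execute(
    "CREATE TABLE my_table (id INT NOT NULL PRIMARY KEY, datafield VARCHAR(100))"
)
conn.executemany(
    "INSERT INTO my_table (id, datafield) VALUES (?, ?)",
    [(i, "row %d" % i) for i in range(1, 20001)],
)

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Bare column on the left-hand side: the primary key index can be searched.
seek_plan = plan("SELECT * FROM my_table WHERE id = 100")
# Arithmetic on the column: the predicate is not rewritten, so the table is scanned.
scan_plan = plan("SELECT * FROM my_table WHERE id + 1 = 101")

# An index on the expression itself makes the expression searchable.
conn.execute("CREATE INDEX idx_minus ON my_table (id - 1)")
expr_plan = plan("SELECT * FROM my_table WHERE id - 1 = 12345")

print(seek_plan)
print(scan_plan)
print(expr_plan)
```

As in the PostgreSQL case, the index only helps when the query uses the same expression that was indexed.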

Interesting.
You can add Oracle Release 10.2.0.1.0 to your list (not able to rewrite the query).
create table t(
id
,x
,padding
,primary key (id)
) as
select rownum as id
,'x' as x
,lpad('x', 100, 'x') as padding
from dual
connect by level <= 50000;
Query 1.
select id
from t
where id = 100 + 1;
----------------------------------------+
| Id | Operation | Name |
-----------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX UNIQUE SCAN| SYS_C006659 |
-----------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ID"=101)
Query 2.
select id
from t
where id + 1 = 101;
--------------------------------------------
| Id | Operation | Name |
--------------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX FAST FULL SCAN| SYS_C006659 |
--------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"+1=101)
Query 3.
select x
from t
where id + 1 = 101;
------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
|* 1 | TABLE ACCESS FULL| T | 1 |
------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"+1=101)

Why not just do this instead (assuming you don't want the server to do the math for calculating the actual ID you're looking for)?
SELECT * FROM my_table WHERE id = (101 - 1)

Related

Execution plan of table in cluster expects one row when it should expect multiple

I have created a cluster and a table in the cluster with the following definitions:
create cluster roald_dahl_titles (
title varchar2(100)
);
create index i_roald_dahl_titles
on cluster roald_dahl_titles
;
create table ROALD_DAHL_NOVELS (
title varchar2(100),
published_year number
)
cluster roald_dahl_titles (title)
;
Notably, this index is not created with the unique constraint, and it's quite possible to insert duplicate values into the table ROALD_DAHL_NOVELS:
insert into roald_dahl_novels (title, published_year) values ('Esio Trot', 1990);
insert into roald_dahl_novels (title, published_year) values ('Esio Trot', 1990);
I then gather statistics on both the table and the index, and look at an execution plan that uses the index:
begin
dbms_stats.gather_table_stats(user, 'ROALD_DAHL_NOVELS');
dbms_stats.gather_INDEX_stats(user, 'I_ROALD_DAHL_TITLES');
end;
/
explain plan for
select published_year
from roald_dahl_novels
where title = 'Esio Trot';
select *
from table(dbms_xplan.display(format => 'ALL'));
The contents of the execution plan I find a bit confusing, though:
Plan hash value: 2187850431
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 28 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS CLUSTER| ROALD_DAHL_NOVELS | 2 | 28 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | I_ROALD_DAHL_TITLES | 1 | | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1 / ROALD_DAHL_NOVELS#SEL$1
2 - SEL$1 / ROALD_DAHL_NOVELS#SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TITLE"='Esio Trot')
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "ROALD_DAHL_NOVELS".ROWID[ROWID,10], "TITLE"[VARCHAR2,100],
"PUBLISHED_YEAR"[NUMBER,22]
2 - "ROALD_DAHL_NOVELS".ROWID[ROWID,10]
As part of operation 2, it performs an index unique scan, which tells me that 'Esio Trot' is expected to appear only once in the cluster. The execution plan also says that for that operation, it expects to return only one row.
The column projection information tells me that it expects to return a single column (which will be a ROWID for the table ROALD_DAHL_NOVELS), so this tells me that the total number of ROWIDs returned from that operation will be 1 (1 row at 1 ROWID per row). Since each of the two rows in the table ROALD_DAHL_NOVELS has a different ROWID, then this operation can only be used to return a single row from the table.
When the TABLE ACCESS CLUSTER operation is performed, the execution plan then (correctly) expects two rows to be returned, which is what I find confusing. If these rows are being accessed by ROWID, then I would expect the previous operation to return (at least) two ROWIDs. If they are not being accessed by ROWID, I would not expect the previous operation to return any ROWIDs.
Also, in the TABLE ACCESS CLUSTER, the ROWID of the table ROALD_DAHL_NOVELS is listed in the column projection information section. I am not attempting to select the ROWID, so I would not expect it to be returned from that operation. If anywhere, I would expect it to be in the predicate information section.
Additional investigation
I tried inserting the same row into the table repeatedly, until it contained 65536 identical copies of the same row. After gathering stats and querying USER_INDEXES for the index I_ROALD_DAHL_TITLES, we got the following:
UNIQUENESS DISTINCT_KEYS AVG_DATA_BLOCKS_PER_KEY
UNIQUE 1 109
As I understand it, this tells us:
The index is unique, so we expect each key to appear once in the index
The index has only one distinct key ('Esio Trot'), so must have exactly one entry
The index expects our one key to match to several rows in the table, across 109 blocks
This seems paradoxical - for one key to match to several rows in the table would mean that there must be several entries in the index for that key (each matching to a different ROWID), which would contradict the index being unique.
When checking USER_EXTENTS, the index only uses a single extent of 65536 bytes, which is not enough space to hold information for each of the ROWIDs in the table.
It's not a bug.
Run this query in your database:
select UNIQUENESS from dba_indexes where index_name = upper('i_roald_dahl_titles');
UNIQUENES
---------
UNIQUE
The reason for this is that B-tree cluster indexes store only the database block address of the cluster block that holds the data; they do not store full rowid values, as a normal index would.
So, while your various rows for title = 'Esio Trot' might have rowid values like:
select rowid row_id, title from roald_dahl_novels n;
ROW_ID TITLE
------------------ ----------------------------------------------------------------------------------------------------
ABocNnACmAABWsWAAL Esio Trot
ABocNnACmAABWsWAAM Esio Trot
ABocNnACmAABWsWAAN Esio Trot
The B-tree cluster index only stores one entry: "Esio Trot", with the corresponding database block address. You can confirm this in your database with:
select num_rows from dba_indexes where index_Name = 'I_ROALD_DAHL_TITLES';
NUM_ROWS
----------
1
That is why a UNIQUE SCAN is reported: as far as the index is concerned, that is exactly what it is doing.
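To make the "one index entry, many rows" idea concrete, here is a toy model in Python. It is only an illustration of the explanation above, not Oracle's actual on-disk structure: the dictionaries, the function names insert and lookup, and the block address "block-1" are all hypothetical.

```python
from collections import defaultdict

# Toy model only: the cluster index holds one entry per distinct key and
# points at a block address; the rows themselves live in that block.
cluster_index = {}            # key -> block address
blocks = defaultdict(list)    # block address -> rows stored in that block

def insert(title, published_year, block_address="block-1"):
    # A key that is already indexed keeps its single entry.
    cluster_index.setdefault(title, block_address)
    blocks[cluster_index[title]].append((title, published_year))

def lookup(title):
    # The "unique scan": at most one index entry exists for a key.
    address = cluster_index.get(title)
    if address is None:
        return []
    # The "table access cluster" step: every matching row in the block.
    return [row for row in blocks[address] if row[0] == title]

insert("Esio Trot", 1990)
insert("Esio Trot", 1990)

index_entries = len(cluster_index)
rows = lookup("Esio Trot")
print(index_entries, len(rows))
```

In this model the index lookup yields a single entry while the block access yields two rows, which mirrors the 1-row and 2-row estimates in the plan.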
There is the same issue with the actual execution plan (tested in 19.5).
Maybe it is a limitation or bug of the displayed execution plan for cluster objects. I would ask this question on asktom.oracle.com to have some kind of official (and free) answer from Oracle.
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID f41cf1x2zdyyr, child number 0
-------------------------------------
select published_year from roald_dahl_novels where title = 'Esio
Trot'
Plan hash value: 2187850431
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 1 (100)| | 2 |00:00:00.01 | 3 |
| 1 | TABLE ACCESS CLUSTER| ROALD_DAHL_NOVELS | 1 | 2 | 28 | 1 (0)| 00:00:01 | 2 |00:00:00.01 | 3 |
|* 2 | INDEX UNIQUE SCAN | I_ROALD_DAHL_TITLES | 1 | 1 | | 0 (0)| | 1 |00:00:00.01 | 1 |
--------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1 / ROALD_DAHL_NOVELS#SEL$1
2 - SEL$1 / ROALD_DAHL_NOVELS#SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TITLE"='Esio Trot')
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "ROALD_DAHL_NOVELS".ROWID[ROWID,10], "TITLE"[VARCHAR2,100], "PUBLISHED_YEAR"[NUMBER,22]
2 - "ROALD_DAHL_NOVELS".ROWID[ROWID,10]
32 rows selected.

Is a 3 column SQL index used when the middle column can be anything?

Trying to prove something out currently to see if adding an index is necessary.
If I have an index on columns A,B,C and I create a query that in the where clause is only explicitly utilizing A and C, will I get the benefit of the index?
In this scenario imagine the where clause is like this:
A = 'Q' AND (B is not null OR B is null) AND C='G'
I investigated this in Oracle using EXPLAIN PLAN and it doesn't seem to use the index. Also, from my understanding of how indexes are created and used it won't be able to benefit because the index can't leverage column B due to the lack of specifics.
Currently looking at this in either MSSQL or ORACLE. Not sure if one optimizes differently than the other.
Any advice is appreciated! Thank you!
Connected to Oracle Database 12c Enterprise Edition Release 12.1.0.2.0
SQL> create table t$ (a integer not null, b integer, c integer, d varchar2(100 char));
Table created
SQL> insert into t$ select rownum, rownum, rownum, lpad('0', '1', 100) from dual connect by level <= 1000000;
1000000 rows inserted
SQL> create index t$i on t$(a, b, c);
Index created
SQL> analyze table t$ estimate statistics;
Table analyzed
SQL> explain plan for select * from t$ where a = 128 and c = 128;
Explained
SQL> select * from table(dbms_xplan.display());
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3274478018
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4 (0)
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T$ | 1 | 13 | 4 (0)
|* 2 | INDEX RANGE SCAN | T$I | 1 | | 3 (0)
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"=128 AND "C"=128)
filter("C"=128)
15 rows selected
Any question?
If you look at the B+ tree structure of the index, the answer is as follows:
The leading columns of the index, up to and including the first inequality, become the Seek Predicate; the remaining conditions become an ordinary Predicate (a filter) in the query plan.
For an example, see http://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys
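The leading-column rule can be observed in a small sketch. It uses Python's built-in sqlite3 module as a stand-in (an assumption; the question is about Oracle and SQL Server), and the names t, t_abc and plan are hypothetical. SQLite lists only the columns used for the index seek in its plan detail, which makes the rule easy to see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER NOT NULL, b INTEGER, c INTEGER, d TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(i, i, i, "0") for i in range(1, 10001)],
)
conn.execute("CREATE INDEX t_abc ON t (a, b, c)")

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# B is unconstrained: only the leading column A can drive the index seek,
# and the condition on C is applied afterwards as a filter.
skip_b_plan = plan("SELECT * FROM t WHERE a = 128 AND c = 128")
# All three columns constrained: the whole key is used for the seek.
all_three_plan = plan("SELECT * FROM t WHERE a = 128 AND b = 128 AND c = 128")

matches = conn.execute("SELECT a, b, c FROM t WHERE a = 128 AND c = 128").fetchall()
print(skip_b_plan)
print(all_three_plan)
```

So the index is still used when the middle column is unconstrained, but only the part of the key before the gap narrows the range that is read.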

Need to speed up my UPDATE QUERY based on the EXPLAIN PLAN

I am updating my table from a temporary table; the update takes forever and has still not completed, so I collected an explain plan for the query. Can someone advise me on how to tune the query or which indexes to build?
The query:
UPDATE w_product_d A
SET A.CREATED_ON_DT = (SELECT min(B.creation_date)
FROM mtl_system_items_b_temp B
WHERE to_char(B.inventory_item_id) = A.integration_id
and B.organization_id IN ('102'))
where A.CREATED_ON_DT is null;
Explain plan:
Plan hash value: 1520882583
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 47998 | 984K| 33M (2)|110:06:25 |
| 1 | UPDATE | W_PRODUCT_D | | | | |
|* 2 | TABLE ACCESS FULL | W_PRODUCT_D | 47998 | 984K| 9454 (1)| 00:01:54 |
| 3 | SORT AGGREGATE | | 1 | 35 | | |
|* 4 | TABLE ACCESS FULL| MTL_SYSTEM_ITEMS_B_TEMP | 1568 | 54880 | 688 (2)| 00:00:09 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"."CREATED_ON_DT" IS NULL)
4 - filter("B"."ORGANIZATION_ID"=102 AND TO_CHAR("B"."INVENTORY_ITEM_ID")=:B1)
Note
-----
- dynamic sampling used for this statement (level=2)
For this query:
UPDATE w_product_d A
SET A.CREATED_ON_DT = (SELECT min(B.creation_date)
FROM mtl_system_items_b_temp B
WHERE to_char(B.inventory_item_id) = A.integration_id
and B.organization_id IN ('102'))
where A.CREATED_ON_DT is null;
You have a problem. Why are you creating a temporary table with the wrong type for inventory_item_id? That is likely to slow down any access. So, let's fix the table first and then do the update:
alter table mtl_system_items_b_temp
add better_inventory_item_id varchar2(255); -- or whatever the right type is
update mtl_system_items_b_temp
set better_inventory_item_id = to_char(inventory_item_id);
Next, let's define the appropriate index:
create index idx_mtl_system_items_b_temp_3 on mtl_system_items_b_temp(better_inventory_item_id, organization_id, creation_date);
Finally, an index on w_product_d can also help:
create index idx_w_product_d_1 on w_product_d(CREATED_ON_DT);
Then, write the query as:
UPDATE w_product_d p
SET CREATED_ON_DT = (SELECT min(t.creation_date)
FROM mtl_system_items_b_temp t
WHERE t.better_inventory_item_id = p.integration_id and
t.organization_id IN ('102')
)
WHERE p.CREATED_ON_DT is null;
Try a MERGE statement. It will likely go faster because it can read all the mtl_system_items_b_temp records at once rather than reading them over-and-over again for each row in w_product_d.
Also, your tables look like they're part of an Oracle e-BS environment. In the MTL_SYSTEM_ITEMS_B table in such an environment, the INVENTORY_ITEM_ID and ORGANIZATION_ID columns are NUMBER. You seem to be using VARCHAR2 in your tables. Whenever you don't use the correct data types in your queries, you invite performance problems, because Oracle must implicitly convert to the correct data type and, in doing so, can lose its ability to use indexes on the column. So make sure your queries treat each column according to its data type (e.g., if a column is a NUMBER, use COLUMN_X = 123 instead of COLUMN_X = '123').
Here's the MERGE example:
MERGE INTO w_product_d t
USING ( SELECT to_char(inventory_item_id) inventory_item_id_char, min(creation_date) min_creation_date
FROM mtl_system_items_b_temp
WHERE organization_id IN ('102') -- change this to IN (102) if organization_id is a NUMBER field!
GROUP BY to_char(inventory_item_id) -- one source row per item, so each target row matches at most once
) u
ON ( t.integration_id = u.inventory_item_id_char )
WHEN MATCHED THEN UPDATE SET t.created_on_dt = u.min_creation_date
-- the column being updated cannot appear in the ON clause (ORA-38104), so filter it here
WHERE t.created_on_dt IS NULL;
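The idea behind the MERGE, aggregating the temporary table once and then matching on an indexed key, can be sketched end to end. This uses Python's built-in sqlite3 module as a stand-in (an assumption; the real tables live in Oracle), with a hypothetical helper table min_dates, made-up sample rows, and dates stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE w_product_d (integration_id TEXT, created_on_dt TEXT);
CREATE TABLE mtl_system_items_b_temp (
    inventory_item_id INTEGER, organization_id INTEGER, creation_date TEXT);

INSERT INTO w_product_d VALUES ('1', NULL), ('2', NULL), ('3', '2001-01-01');
INSERT INTO mtl_system_items_b_temp VALUES
    (1, 102, '2010-05-01'), (1, 102, '2009-03-01'),
    (2, 101, '2008-01-01'), (3, 102, '1999-01-01');

-- Step 1: read the temporary table once and aggregate per item.
CREATE TABLE min_dates AS
SELECT CAST(inventory_item_id AS TEXT) AS item_id_char,
       MIN(creation_date)              AS min_creation_date
FROM mtl_system_items_b_temp
WHERE organization_id = 102
GROUP BY inventory_item_id;

-- Step 2: index the key in the same type as integration_id.
CREATE UNIQUE INDEX min_dates_key ON min_dates (item_id_char);

-- Step 3: update only rows that are still NULL and have a match.
UPDATE w_product_d
SET created_on_dt = (SELECT m.min_creation_date
                     FROM min_dates m
                     WHERE m.item_id_char = w_product_d.integration_id)
WHERE created_on_dt IS NULL
  AND EXISTS (SELECT 1 FROM min_dates m
              WHERE m.item_id_char = w_product_d.integration_id);
""")

rows = conn.execute(
    "SELECT integration_id, created_on_dt FROM w_product_d ORDER BY integration_id"
).fetchall()
print(rows)
```

Product '1' receives the earlier of its two dates, product '2' stays NULL because its only source row belongs to another organization, and product '3' keeps the date it already had.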

How to tune a range / interval query in Oracle?

I have a table A with intervals (COL1, COL2):
CREATE TABLE A (
COL1 NUMBER(15) NOT NULL,
COL2 NUMBER(15) NOT NULL,
VAL1 ...,
VAL2 ...
);
ALTER TABLE A ADD CONSTRAINT COL1_BEFORE_COL2 CHECK (COL1 <= COL2);
The intervals are guaranteed to be "exclusive", i.e. they will never overlap. In other words, this query yields no rows:
SELECT *
FROM (
SELECT
LEAD(COL1, 1) OVER (ORDER BY COL1) NEXT,
COL2
FROM A
)
WHERE COL2 >= NEXT;
There is currently an index on (COL1, COL2). Now, my query is the following:
SELECT /*+FIRST_ROWS(1)*/ *
FROM A
WHERE :some_value BETWEEN COL1 AND COL2
AND ROWNUM = 1
This performs well (less than a ms for millions of records in A) for low values of :some_value, because they're very selective on the index. But it performs quite badly (almost a second) for high values of :some_value because of a lower selectivity of the access predicate.
The execution plan seems good to me. As the existing index already fully covers the predicate, I get the expected INDEX RANGE SCAN:
------------------------------------------------------
| Id | Operation | Name | E-Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | |
|* 1 | COUNT STOPKEY | | |
| 2 | TABLE ACCESS BY INDEX ROWID| A | 1 |
|* 3 | INDEX RANGE SCAN | A_PK | |
------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM=1)
3 - access("COL2">=:some_value AND "COL1"<=:some_value)
filter("COL2">=:some_value)
In 3, it becomes obvious that the access predicate is selective only for low values of :some_value whereas for higher values, the filter operation "kicks in" on the index.
Is there any way to generally improve this query to be fast regardless of the value of :some_value? I can completely redesign the table if further normalisation is needed.
Your attempt is good, but misses a few crucial issues.
Let's start slowly. I'm assuming an index on COL1 and I actually don't mind if COL2 is included there as well.
Because of the constraints on your data (especially the non-overlapping intervals), you actually just want the last row where COL1 <= :some_value, if you order by COL1.
This is a classic Top-N query:
select *
FROM ( select *
from A
where col1 <= :some_value
order by col1 desc
)
where rownum <= 1;
Please note that you must use ORDER BY to get a definite sort order. Because the WHERE clause is applied before ORDER BY, you must now also wrap the top-n filter in an outer query.
That's almost done. The only reason we actually need to filter on COL2 too is to exclude records that don't fall into the range at all. E.g., if some_value is 5 and you have this data:
COL1 | COL2
1 | 2
3 | 4 <-- you get this row
6 | 10
This row would be the correct result if COL2 were 5 but, unfortunately, in this case the correct result of your query is [empty set]. That's the only reason we need to filter on COL2 like this:
select *
FROM ( select *
FROM ( select *
from A
where col1 <= :some_value
order by col1 desc
)
where rownum <= 1
)
WHERE col2 >= :some_value;
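The two-step logic (take the closest interval start at or below the value, then check the interval end) can be tried on the three sample rows. This sketch uses Python's built-in sqlite3 module as a stand-in (an assumption; the answer targets Oracle), with LIMIT 1 playing the role of the rownum filter. The function name find_interval and the index name a_col1 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE a (col1 INTEGER NOT NULL, col2 INTEGER NOT NULL, CHECK (col1 <= col2))"
)
conn.execute("CREATE INDEX a_col1 ON a (col1)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, 2), (3, 4), (6, 10)])

def find_interval(some_value):
    """Return the (col1, col2) interval containing some_value, or None."""
    return conn.execute(
        """
        SELECT col1, col2
        FROM (SELECT col1, col2
              FROM a
              WHERE col1 <= ?
              ORDER BY col1 DESC   -- closest interval start at or below the value
              LIMIT 1)             -- stand-in for the rownum <= 1 filter
        WHERE col2 >= ?            -- reject values that fall into a gap
        """,
        (some_value, some_value),
    ).fetchone()

print(find_interval(7))   # inside the interval 6..10
print(find_interval(5))   # in the gap between 3..4 and 6..10
```

The outer filter is what turns the gap case into an empty result instead of returning the row 3 | 4.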
Your approach had several problems:
Missing ORDER BY, which is dangerous in combination with a rownum filter!
Applying the Top-N clause (rownum filter) too early. What if there is no result? The database reads the index until the end, and the rownum (STOPKEY) never kicks in.
An optimizer glitch. With the BETWEEN predicate, my 11g installation doesn't think to read the index in descending order, so it was actually reading it from the beginning (0) upwards until it found a COL2 value that matched, or until COL1 ran out of the range.
What you intended:
COL1 | COL2
1 | 2 ^
3 | 4 | (2) go up until first match.
+----- your intention was to start here
6 | 10
What was actually happening was:
COL1 | COL2
1 | 2 +----- start at the beginning of the index
3 | 4 | Go down until first match.
V
6 | 10
Look at the execution plan of my query:
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 4 (0)| 00:00:01 |
|* 1 | VIEW | | 1 | 26 | 4 (0)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | VIEW | | 2 | 52 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID | A | 50000 | 585K| 4 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN DESCENDING| SIMPLE | 2 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Note the INDEX RANGE SCAN **DESCENDING**.
Finally, why didn't I include COL2 in the index? It's a single-row top-n query. You can save at most a single table access (irrespective of what the Rows estimate above says!). If you expect to find a row in most cases, you'll (probably) need to go to the table anyway for the other columns, so you would not save ANYTHING, just consume space. Including COL2 will only improve performance if your query doesn't return anything at all!
Related:
How to use index efficienty in mysql query
I answered a very similar question about this years ago. Same solution.
Use The Index, Lukas!
I think, because the ranges do not intersect, you can define col1 as primary key and execute the query like this:
SELECT *
FROM a
JOIN
(SELECT MAX (col1) AS col1
FROM a
WHERE col1 <= :somevalue) b
ON a.col1 = b.col1;
If there are gaps between the ranges, you will have to add:
WHERE col2 >= :somevalue
as the last line.
Execution Plan:
SELECT STATEMENT
NESTED LOOPS
VIEW
SORT AGGREGATE
FIRST ROW
INDEX RANGE SCAN (MIN/MAX) PKU1
TABLE ACCESS BY INDEX A
INDEX UNIQUE SCAN PKU1
Maybe changing this heap table to an index-organized table (IOT) would give better performance.
I didn't generate sample data to test this but you might want to give it a try.
ALTER TABLE A ADD COL3 NUMBER(15);
UPDATE A SET COL3 = COL2 - COL1;
Create index on COL3.
SELECT /*+FIRST_ROWS(1)*/ *
FROM A
WHERE :some_value < COL3
AND ROWNUM = 1;

Alter session slows down the query through Hibernate

I'm using Oracle 11gR2 and Hibernate 4.2.1.
My application is a searching application.
Only has SELECT operations and all of them are native queries.
Oracle uses case-sensitive sort by default.
I want to override it to case-insensitive.
I saw a couple of options here: http://docs.oracle.com/cd/A81042_01/DOC/server.816/a76966/ch2.htm#91066
Now I run this statement before any search executes:
ALTER SESSION SET NLS_SORT='BINARY_CI'
If I execute the above SQL before the search query, Hibernate takes about 15 minutes to return from the search query.
If I do the same in SQL Developer, it returns within a couple of seconds.
Why are there two such different behaviors?
What can I do to get rid of this slowness?
Note: I always open a new Hibernate session for each search.
Here is my sql:
SELECT *
FROM (SELECT
row_.*,
rownum rownum_
FROM (SELECT
a, b, c, d, e,
RTRIM(XMLAGG(XMLELEMENT("x", f || ', ') ORDER BY f ASC)
.extract('//text()').getClobVal(), ', ') AS f,
RTRIM(
XMLAGG(XMLELEMENT("x", g || ', ') ORDER BY g ASC)
.extract('//text()').getClobVal(), ', ') AS g
FROM ( SELECT src.a, src.b, src.c, src.d, src.e, src.f, src.g
FROM src src
WHERE upper(pp) = 'PP'
AND upper(qq) = 'QQ'
AND upper(rr) = 'RR'
AND upper(ss) = 'SS'
AND upper(tt) = 'TT')
GROUP BY a, b, c, d, e
ORDER BY b ASC) row_
WHERE rownum <= 400
) WHERE rownum_ > 0;
Many fields are compared with a LIKE operation, and the SQL is generated dynamically. If I use order by upper(B) asc, SQL Developer also takes the same time.
However, ordering by upper(B) gives the same results as NLS_SORT=BINARY_CI. I have created UPPER(B) indexes, but nothing has worked for me.
A's length = 10-15 characters
B's length = 34-50 characters
C's length = 5-10 characters
A, B and C are sort-able fields via app.
This SRC table has 3 million+ records.
We finally ended up with a SRC table which is a materialized view.
Business logic of the SQL is completely fine.
All of the sortable fields, and the others, have UPPER() indexes.
UPPER() and BINARY_CI may produce the same results but Oracle cannot use them interchangeably. To use an index and BINARY_CI you must create an index like this:
create index src_nlssort_index on src(nlssort(b, 'nls_sort=''BINARY_CI'''));
Sample table and mixed case data
create table src(b varchar2(100) not null);
insert into src select 'MiXeD CAse '||level from dual connect by level <= 100000;
By default, the upper() predicate can perform a range scan on the upper() index
create index src_upper_index on src(upper(b));
explain plan for
select * from src where upper(b) = 'MIXED CASE 1';
select * from table(dbms_xplan.display(format => '-rows -bytes -cost -predicate -note'));
Plan hash value: 1533361696
------------------------------------------------------------------
| Id | Operation | Name | Time |
------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| SRC | 00:00:01 |
| 2 | INDEX RANGE SCAN | SRC_UPPER_INDEX | 00:00:01 |
------------------------------------------------------------------
BINARY_CI and LINGUISTIC will not use the index
alter session set nls_sort='binary_ci';
alter session set nls_comp='linguistic';
explain plan for
select * from src where b = 'MIXED CASE 1';
select * from table(dbms_xplan.display(format => '-rows -bytes -cost -note'));
Plan hash value: 3368256651
---------------------------------------------
| Id | Operation | Name | Time |
---------------------------------------------
| 0 | SELECT STATEMENT | | 00:00:02 |
|* 1 | TABLE ACCESS FULL| SRC | 00:00:02 |
---------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NLSSORT("B",'nls_sort=''BINARY_CI''')=HEXTORAW('6D69786564
2063617365203100') )
Function based index on NLSSORT() enables index range scans
create index src_nlssort_index on src(nlssort(b, 'nls_sort=''BINARY_CI'''));
explain plan for
select * from src where b = 'MIXED CASE 1';
select * from table(dbms_xplan.display(format => '-rows -bytes -cost -note'));
Plan hash value: 478278159
--------------------------------------------------------------------
| Id | Operation | Name | Time |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| SRC | 00:00:01 |
|* 2 | INDEX RANGE SCAN | SRC_NLSSORT_INDEX | 00:00:01 |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(NLSSORT("B",'nls_sort=''BINARY_CI''')=HEXTORAW('6D69786564
2063617365203100') )
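The same principle, that an index helps only when it is built on the exact comparison the query performs, can be shown in a small sketch. It uses Python's built-in sqlite3 module as a stand-in (an assumption; SQLite has no NLS_SORT, so its COLLATE NOCASE plays the role of BINARY_CI here). The names src_nocase_index and plan are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (b VARCHAR(100) NOT NULL)")
conn.executemany(
    "INSERT INTO src VALUES (?)",
    [("MiXeD CAse %d" % i,) for i in range(1, 10001)],
)

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

upper_query = "SELECT * FROM src WHERE upper(b) = 'MIXED CASE 1'"
nocase_query = "SELECT * FROM src WHERE b = 'mixed case 1' COLLATE NOCASE"

# An upper() index serves the upper() predicate...
conn.execute("CREATE INDEX src_upper_index ON src (upper(b))")
upper_plan = plan(upper_query)
# ...but not a case-insensitive collation comparison on the bare column.
nocase_plan_before = plan(nocase_query)

# An index built with the matching collation makes that comparison searchable.
conn.execute("CREATE INDEX src_nocase_index ON src (b COLLATE NOCASE)")
nocase_plan_after = plan(nocase_query)
found = conn.execute(nocase_query).fetchall()

print(upper_plan)
print(nocase_plan_before)
print(nocase_plan_after)
```

Both predicates find the same row, yet each needs its own index, which is the situation described above for UPPER() and BINARY_CI.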
I investigated and found that the NLS_COMP and NLS_SORT parameters can affect which execution plan Oracle uses for strings (when it compares or orders them).
It is not necessary to change the session's NLS settings. Adding
ORDER BY NLSSORT(column, 'NLS_SORT=BINARY_CI')
and creating a matching NLSSORT index is enough:
create index column_index_binary on table_name (NLSSORT(column, 'NLS_SORT=BINARY_CI'));
I found a clue to my problem in this question, so I'm paying it back.
Why oracle stored procedure execution time is greatly increased depending on how it is executed?