Case insensitive searching in Oracle - sql

The default behaviour of LIKE and the other comparison operators, = etc is case-sensitive.
Is it possible to make them case-insensitive?

There are 3 main ways to perform a case-insensitive search in Oracle without using full-text indexes.
Ultimately what method you choose is dependent on your individual circumstances; the main thing to remember is that to improve performance you must index correctly for case-insensitive searching.
1. Case your column and your string identically.
You can force all your data to be the same case by using UPPER() or LOWER():
select * from my_table where upper(column_1) = upper('my_string');
or
select * from my_table where lower(column_1) = lower('my_string');
If column_1 is not indexed on upper(column_1) or lower(column_1), as appropriate, this may force a full table scan. In order to avoid this you can create a function-based index.
create index my_index on my_table ( lower(column_1) );
If you're using LIKE then you have to concatenate a % around the string you're searching for.
select * from my_table where lower(column_1) LIKE lower('my_string') || '%';
This SQL Fiddle demonstrates what happens in all these queries. Note the Explain Plans, which indicate when an index is being used and when it isn't.
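As a rough, runnable illustration of the function-based index idea, here is a minimal sketch that uses Python's bundled sqlite3 module as a stand-in for Oracle. This is an assumption made purely so the example can be executed anywhere: SQLite calls these "indexes on expressions" (available from SQLite 3.9), and its plan output looks nothing like an Oracle Explain Plan. The table and index names are reused from the answer above; the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (column_1 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("My_String",), ("other",)],  # made-up sample rows
)

query = "SELECT * FROM my_table WHERE lower(column_1) = ?"


def plan(sql, params):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Without a matching index, the function call on the column forces a scan.
plan_before = plan(query, ("my_string",))

# Index the same expression that the WHERE clause uses.
conn.execute("CREATE INDEX my_index ON my_table (lower(column_1))")
plan_after = plan(query, ("my_string",))

matches = conn.execute(query, ("my_string",)).fetchall()

print(plan_before)
print(plan_after)
print(matches)
```

The point to take away is only the shape of the technique: the indexed expression has to match the expression in the predicate, otherwise the index cannot be used.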
2. Use regular expressions.
From Oracle 10g onwards REGEXP_LIKE() is available. You can specify the _match_parameter_ 'i', in order to perform case-insensitive searching.
In order to use this as an equality operator you must specify the start and end of the string, which are denoted by the caret (^) and the dollar sign ($).
select * from my_table where regexp_like(column_1, '^my_string$', 'i');
In order to perform the equivalent of LIKE, these can be removed.
select * from my_table where regexp_like(column_1, 'my_string', 'i');
Be careful with this as your string may contain characters that will be interpreted differently by the regular expression engine.
This SQL Fiddle shows you the same example output except using REGEXP_LIKE().
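To make the warning about special characters concrete, here is a small sketch using Python's re module as a stand-in for Oracle's regular expression engine (an assumption: the two engines differ in details, but the dot metacharacter means "any character" in both). The helper names and sample values are hypothetical. Note that re.escape has no direct Oracle equivalent; in Oracle you would have to escape the metacharacters in the search string yourself.

```python
import re

# Made-up sample values standing in for column_1.
rows = ["my.string", "myXstring", "MY.STRING"]


def ci_equal_naive(value, needle):
    """Case-insensitive 'equality' with the search string used as a raw pattern."""
    return re.search("^" + needle + "$", value, re.IGNORECASE) is not None


def ci_equal_escaped(value, needle):
    """Same check, but with regex metacharacters in the search string escaped."""
    return re.search("^" + re.escape(needle) + "$", value, re.IGNORECASE) is not None


# The unescaped dot matches any character, so "myXstring" slips through.
naive = [r for r in rows if ci_equal_naive(r, "my.string")]

# With escaping, only the literal dot matches.
escaped = [r for r in rows if ci_equal_escaped(r, "my.string")]

print(naive)
print(escaped)
```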
3. Change it at the session level.
The NLS_SORT parameter governs the collation sequence for ordering and the various comparison operators, including = and LIKE. You can specify a binary, case-insensitive sort by altering the session. This means that every query run in that session will perform case-insensitive comparisons.
alter session set nls_sort=BINARY_CI
There's plenty of additional information around linguistic sorting and string searching if you want to specify a different language, or do an accent-insensitive search using BINARY_AI.
You will also need to change the NLS_COMP parameter; to quote:
The exact operators and query clauses that obey the NLS_SORT parameter
depend on the value of the NLS_COMP parameter. If an operator or
clause does not obey the NLS_SORT value, as determined by NLS_COMP,
the collation used is BINARY.
The default value of NLS_COMP is BINARY, but LINGUISTIC specifies that Oracle should pay attention to the value of NLS_SORT:
Comparisons for all SQL operations in the WHERE clause and in PL/SQL
blocks should use the linguistic sort specified in the NLS_SORT
parameter. To improve the performance, you can also define a
linguistic index on the column for which you want linguistic
comparisons.
So, once again, you need to alter the session
alter session set nls_comp=LINGUISTIC
As noted in the documentation you may want to create a linguistic index to improve performance
create index my_linguistic_index on my_table
(NLSSORT(column_1, 'NLS_SORT = BINARY_CI'));

Since 10gR2, Oracle allows you to fine-tune the behaviour of string comparisons by setting the NLS_COMP and NLS_SORT session parameters:
SQL> SET HEADING OFF
SQL> SELECT *
2 FROM NLS_SESSION_PARAMETERS
3 WHERE PARAMETER IN ('NLS_COMP', 'NLS_SORT');
NLS_SORT
BINARY
NLS_COMP
BINARY
SQL>
SQL> SELECT CASE WHEN 'abc'='ABC' THEN 1 ELSE 0 END AS GOT_MATCH
2 FROM DUAL;
0
SQL>
SQL> ALTER SESSION SET NLS_COMP=LINGUISTIC;
Session altered.
SQL> ALTER SESSION SET NLS_SORT=BINARY_CI;
Session altered.
SQL>
SQL> SELECT *
2 FROM NLS_SESSION_PARAMETERS
3 WHERE PARAMETER IN ('NLS_COMP', 'NLS_SORT');
NLS_SORT
BINARY_CI
NLS_COMP
LINGUISTIC
SQL>
SQL> SELECT CASE WHEN 'abc'='ABC' THEN 1 ELSE 0 END AS GOT_MATCH
2 FROM DUAL;
1
You can also create case insensitive indexes:
create index
nlsci1_gen_person
on
MY_PERSON
(NLSSORT
(PERSON_LAST_NAME, 'NLS_SORT=BINARY_CI')
)
;
This information was taken from Oracle case insensitive searches. The article mentions REGEXP_LIKE but it seems to work with good old = as well.
In versions older than 10gR2 it can't really be done and the usual approach, if you don't need accent-insensitive search, is to just UPPER() both the column and the search expression.

You could try using:
SELECT user_name
FROM user_master
WHERE upper(user_name) LIKE '%ME%'

From Oracle 12c R2 you can use the COLLATE operator:
The COLLATE operator determines the collation for an expression. This operator enables you to override the collation that the database would have derived for the expression using standard collation derivation rules.
The COLLATE operator takes one argument, collation_name, for which you can specify a named collation or pseudo-collation. If the collation name contains a space, then you must enclose the name in double quotation marks.
Demo:
CREATE TABLE tab1(i INT PRIMARY KEY, name VARCHAR2(100));
INSERT INTO tab1(i, name) VALUES (1, 'John');
INSERT INTO tab1(i, name) VALUES (2, 'Joe');
INSERT INTO tab1(i, name) VALUES (3, 'Billy');
--========================================================================--
SELECT /*csv*/ *
FROM tab1
WHERE name = 'jOHN' ;
-- no rows selected
SELECT /*csv*/ *
FROM tab1
WHERE name COLLATE BINARY_CI = 'jOHN' ;
/*
"I","NAME"
1,"John"
*/
SELECT /*csv*/ *
FROM tab1
WHERE name LIKE 'j%';
-- no rows selected
SELECT /*csv*/ *
FROM tab1
WHERE name COLLATE BINARY_CI LIKE 'j%';
/*
"I","NAME"
1,"John"
2,"Joe"
*/
db<>fiddle demo

The COLLATE operator also works if you put it at the end of the expression, and that seems cleaner to me.
So you can use this:
WHERE name LIKE 'j%' COLLATE BINARY_CI
instead of this:
WHERE name COLLATE BINARY_CI LIKE 'j%'
Anyhow, I like the COLLATE operator solution for the following reasons:
You put it only once in the expression, so you don't need to worry about multiple UPPER or LOWER calls and where to put them.
It is isolated to the exact statement and expression where you need it, unlike the ALTER SESSION solution, which applies to everything. Your query will also work consistently regardless of the database or session NLS_SORT setting.

select user_name
from my_table
where nlssort(user_name, 'NLS_SORT = Latin_CI') = nlssort('AbC', 'NLS_SORT = Latin_CI')

You can do something like this:
where regexp_like(name, 'string$', 'i');

Related

Case insensitive search without using a function in the where clause

Is there any way to make a search case-insensitive without using a function in the WHERE clause?
Please specify the database you are talking about when/if you reply. I am aware that MySQL is already case insensitive by default. What about Oracle or MSSQL or HANA?
select * from mytable WHERE upper(fieldname) = 'VALUE'
In SQL Server, use COLLATE SQL_Latin1_General_CP1_CS_AS.
The default collation is typically SQL_Latin1_General_CP1_CI_AS, which is case-insensitive. If you need a case-sensitive comparison, adding COLLATE SQL_Latin1_General_CP1_CS_AS to the predicate makes the search case-sensitive.
Query
select * from [mytable]
where [fieldname] = 'VALUE' collate SQL_Latin1_General_CP1_CS_AS;
Find a demo here
Since your question is tagged Oracle, I will provide a solution which works in Oracle.
You can set these session parameters for case insensitive searching
SQL> alter session set NLS_COMP=ANSI;
SQL> alter session set NLS_SORT=BINARY_CI;
SQL> select 1 from DUAL where 'abc' = 'ABC';
1
----------
1
Read more at Linguistic Sorting and String Searching
As @mathguy points out,
ALTER SESSION SET NLS_COMP=LINGUISTIC;
is more common than using ANSI
Making the column upper (or lower) case in order to compare it, as you are showing, is the standard way of making a case-insensitive comparison. (UPPER and LOWER are functions defined in the SQL standard.)
If you don't want to apply a function on the column, then you can of course write a recursive query to generate all upper/lower case permutations of the value ('VALUE', 'vALUE', 'VaLUE', ..., 'value') and check whether your column value is in this set. Standard SQL provides the SUBSTRING function for accessing substrings (e.g. the nth letter) and CHAR_LENGTH for getting the string's length.
It depends on the DBMS you are using and its version to what extent the standard is supported. In Oracle for example it's SUBSTR instead of SUBSTRING and LENGTH instead of CHAR_LENGTH. MySQL on the other hand features both SUBSTRING and CHAR_LENGTH directly, but only supports recursive queries as of version 8.0.
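To show what the "all upper/lower case permutations" set looks like, here is a minimal sketch in Python rather than a recursive SQL query (an assumption made for brevity; the function name case_variants is hypothetical). It also shows why the approach scales badly: a value with n letters yields 2^n variants.

```python
from itertools import product


def case_variants(value):
    """Return every upper/lower case permutation of value.

    Characters without distinct upper and lower forms contribute a single choice.
    """
    # For each character, collect its distinct case forms (one or two entries).
    choices = [sorted({ch.lower(), ch.upper()}) for ch in value]
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("value")

print(len(variants))  # 2 ** 5 permutations for a five-letter word
print(variants[:4])
```

The resulting list could be passed as the values of an IN (...) predicate, which avoids applying a function to the column at the cost of a very long list for anything but short strings.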
This will work in Oracle:
SELECT *
FROM mytable
WHERE REGEXP_LIKE (column_name, 'value', 'i');
Oracle 12c answer
select * from mytable WHERE fieldname='Value' collate binary_ci
SAP HANA does not seem to have a way other than using upper or lower.
SQL Server and MySQL do not distinguish between upper and lower case letters—they are case-insensitive by default.
One could use the CONTAINS function. For example, this Microsoft SQL Server query:
SELECT *
FROM TableName
WHERE ColumnName LIKE 'Abc%'
might be written in SAP HANA as:
SELECT *
FROM TableName
WHERE CONTAINS(ColumnName,'Abc%');
https://help.sap.com/viewer/05c9edaee7fe4d28ab3627d0b1583df6/2021_01_QRC/en-US/b45ff4c0e9ab4ba7a9e18a2552adeb3d.html

How to determine if SQLite column created with COLLATE NOCASE

A column in a SQLite db must be COLLATE NOCASE. I assume there is no way to add that capability to an existing table, so I'm prepared to recreate the table with it. How can I determine if the existing column is COLLATE NOCASE in order to avoid recreating the table every time it is opened?
How can I determine if the existing column is COLLATE NOCASE
The query
SELECT sql FROM sqlite_master WHERE type='table' AND tbl_name='my_table'
will give you the CREATE TABLE statement for that table. You could inspect the DDL to determine if the column is already defined as COLLATE NOCASE.
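One way to automate that inspection is sketched below with Python's sqlite3 module. The helper name column_is_nocase and the sample schema are hypothetical, and the parsing is deliberately crude: it splits the column list on commas, so it would be confused by definitions that themselves contain commas, such as DECIMAL(10,2) or multi-column constraints. Treat it as a starting point, not a DDL parser.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up schema: only the "name" column is declared COLLATE NOCASE.
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, name TEXT COLLATE NOCASE, note TEXT)"
)


def column_is_nocase(conn, table, column):
    """Best-effort check of the stored CREATE TABLE text for COLLATE NOCASE."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND tbl_name=?",
        (table,),
    ).fetchone()
    if row is None:
        return False  # no such table
    ddl = row[0]
    # Text between the outermost parentheses holds the column definitions.
    body = ddl[ddl.index("(") + 1 : ddl.rindex(")")]
    for definition in body.split(","):  # naive split, see caveat above
        words = definition.split()
        if words and words[0].strip('"[]`').lower() == column.lower():
            pattern = r"\bCOLLATE\s+NOCASE\b"
            return re.search(pattern, definition, re.IGNORECASE) is not None
    return False


print(column_is_nocase(conn, "my_table", "name"))
print(column_is_nocase(conn, "my_table", "note"))
```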
You might not need to do that at all if it is sufficient to change the collation in the query; you can simply override it there. It won't affect constraints or indexes, but depending on your use case, it might be good enough.
To be clear: the COLLATE clause in the table definition is just a default for queries. You can override it in individual queries.
e.g.
WHERE column = 'term' COLLATE NOCASE
or
ORDER BY column COLLATE NOCASE
However, note that SQLite's LIKE doesn't honor the COLLATE clause (use PRAGMA case_sensitive_like instead).
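The difference between = and LIKE can be sketched with Python's sqlite3 module. The table t and its rows are made up for the illustration. One assumption to flag: recent SQLite documentation marks PRAGMA case_sensitive_like as deprecated, so the last step depends on a build that still supports it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")  # default BINARY collation
conn.executemany(
    "INSERT INTO t VALUES (?)", [("Term",), ("term",), ("other",)]
)


def names(sql):
    """Run a query and return the first column as a plain list."""
    return [row[0] for row in conn.execute(sql)]


# "=" follows the collation: BINARY by default, NOCASE when overridden.
binary_eq = names("SELECT name FROM t WHERE name = 'term' ORDER BY rowid")
nocase_eq = names(
    "SELECT name FROM t WHERE name = 'term' COLLATE NOCASE ORDER BY rowid"
)

# LIKE ignores the collation and is case-insensitive for ASCII by default.
like_default = names("SELECT name FROM t WHERE name LIKE 'term' ORDER BY rowid")

# The pragma, not COLLATE, is what switches LIKE to case-sensitive matching.
conn.execute("PRAGMA case_sensitive_like = ON")
like_sensitive = names("SELECT name FROM t WHERE name LIKE 'term' ORDER BY rowid")

print(binary_eq, nocase_eq, like_default, like_sensitive)
```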
The easiest and most general way is to store a version number somewhere (in another table, or with PRAGMA user_version).
If you want to check the column itself, use a query with a comparison that is affected by the column's collation:
SELECT Col = upper(Col)
FROM (SELECT Col
FROM MyTable
WHERE 0 -- don't actually return any row from MyTable
UNION ALL
SELECT 'x' -- lowercase; same collation as Col
);

UPPER() and LOWER() not required?

For a while I thought, in order for the WHERE criteria to be evaluated correctly, I need to account for case sensitivity. I would use UPPER() and LOWER() when case didn't matter. However, I am finding the below queries produce the same result.
SELECT * FROM ATable WHERE UPPER(part) = 'SOMEPARTNAME'
SELECT * FROM ATable WHERE part = 'SOMEPARTNAME'
SELECT * FROM ATable WHERE part = 'somepartname'
SQL Case Sensitive String Compare explains to use case-sensitive collations. Is this the only way to force case sensitivity? Also, if you had a case-insensitive collation when would UPPER() and LOWER() be necessary?
Thanks for help.
The common SQL Server default of a case-insensitive collation means that UPPER() and LOWER() are not required when comparing strings.
In fact an expression such as
SELECT * FROM Table WHERE UPPER(part) = 'SOMEPARTNAME'
is also non-sargable, i.e. it won't use available indexes, because of the function applied to the part column on the left-hand side of the comparison.
The query below performs a case-sensitive search:
SELECT Column1
FROM Table1
WHERE Column1 COLLATE Latin1_General_CS_AS = 'casesearch'
UPPER() and LOWER() are just functions that change the case of letters, so if you have a case-insensitive collation, they are only useful in the SELECT list:
SELECT UPPER('qwerty'), LOWER('Dog')
returns
QWERTY, dog

Oracle DB: How can I write query ignoring case?

As the title says, I have a SQL query that runs on an Oracle DB, let's say:
SELECT * FROM TABLE WHERE TABLE.NAME Like 'IgNoReCaSe'
If I want the query to return "IGNORECASE", "ignorecase", or any mixed-case combination of them, how can this be done?
Select * from table where upper(table.name) like upper('IgNoreCaSe');
Alternatively, substitute lower for upper.
Use ALTER SESSION statements to set comparison to case-insensitive:
alter session set NLS_COMP=LINGUISTIC;
alter session set NLS_SORT=BINARY_CI;
If you're still using version 10gR2, use the below statements. See this FAQ for details.
alter session set NLS_COMP=ANSI;
alter session set NLS_SORT=BINARY_CI;
You can use either the lower or the upper function on both sides of the WHERE condition.
You could also use Regular Expressions:
SELECT * FROM TABLE WHERE REGEXP_LIKE (TABLE.NAME,'IgNoReCaSe','i');
You can use the upper() function in your query, and to increase performance you can use a function-based index:
CREATE INDEX upper_index_name ON table(upper(name))
You can convert both values to upper or lowercase using the upper or lower functions:
Select * from table where upper(table.name) like upper('IgNoreCaSe')
or
Select * from table where lower(table.name) like lower('IgNoreCaSe');
In version 12.2 and above, the simplest way to make the query case insensitive is this:
SELECT * FROM TABLE WHERE TABLE.NAME COLLATE BINARY_CI Like 'IgNoReCaSe'
...also do the conversion to upper or lower outside of the query:
tableName:= UPPER(someValue || '%');
...
Select * from table where upper(table.name) like tableName
Also don't forget the obvious, does the data in the tables need to have case? You could only insert rows already in lower case (or convert the existing DB rows to lower case) and be done with it right from the start.

How to change the collation of sqlite3 database to sort case insensitively?

I have a query for sqlite3 database which provides the sorted data. The data are sorted on the basis of a column which is a varchar column "Name". Now when I do the query
select * from tableNames Order by Name;
It provides the data like this.
Pen
Stapler
pencil
This means the sort is case-sensitive. What I want is the following:
Pen
pencil
Stapler
So what changes should I make in sqlite3 database for the necessary results?
Related How to set Sqlite3 to be case insensitive when string comparing?
To sort it Case insensitive you can use ORDER BY Name COLLATE NOCASE
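A quick way to see the effect is the following sketch, which uses Python's sqlite3 module with the table and values from the question (the in-memory database and the VARCHAR(50) column length are assumptions for the demo).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableNames (Name VARCHAR(50))")
conn.executemany(
    "INSERT INTO tableNames VALUES (?)",
    [("Pen",), ("Stapler",), ("pencil",)],
)

# Default BINARY collation: uppercase letters sort before lowercase ones.
default_order = [
    row[0] for row in conn.execute("SELECT Name FROM tableNames ORDER BY Name")
]

# NOCASE folds ASCII case before comparing.
nocase_order = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM tableNames ORDER BY Name COLLATE NOCASE"
    )
]

print(default_order)  # ['Pen', 'Stapler', 'pencil']
print(nocase_order)   # ['Pen', 'pencil', 'Stapler']
```

Keep in mind that NOCASE only folds the 26 ASCII letters, so non-ASCII characters still sort by their binary values.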
The SQLite Datatypes documentation discusses user-defined collation sequences. Specifically you use COLLATE NOCASE to achieve your goal.
They give an example:
CREATE TABLE t1(
a, -- default collation type BINARY
b COLLATE BINARY, -- default collation type BINARY
c COLLATE REVERSE, -- default collation type REVERSE
d COLLATE NOCASE -- default collation type NOCASE
);
and note that:
-- Grouping is performed using the NOCASE collation sequence (i.e. values
-- 'abc' and 'ABC' are placed in the same group).
SELECT count(*) FROM t1 GROUP BY d;
select * from tableNames Order by lower(Name);
Michael van der Westhuizen explains in his comment below why this is not a good way. I am leaving this answer up so as to preserve his comment and to serve as a warning to others who might have the same 'bright' idea I had ;-)
Use this statement in your SQLite database:
PRAGMA case_sensitive_like = false