When is comparing strings case-sensitive and when case-insensitive in SQL?

When is comparing strings case-sensitive and when case-insensitive in SQL? - sql

When is comparing strings case-sensitive and when case-insensitive in popular databases such as PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, and SQLite?
I mean comparison using operators like: 'ab' = 'AB', or comparison of strings performed inside string functions like: POSITION('b' IN 'ABC'), INSTR('ABC', 'b'), REPLACE('ABC', 'b', 'x'), TRANSLATE('ABC', 'b', 'x'), TRIM('XabcX', 'x').
I think that I can know the answer but I don't know if it is correct.
Additionally, does the SQL standard define the case-sensitivity of string comparison?
Unfortunately, I only found a question about case-sensitivity of SQL syntax, and not in string comparison.
Edit: I am asking about default setting of the RDBMS, without additional setting the collation of a database, table, nor column.
I am asking only about ASCII letters A-Z and a-z.

Oracle DB functions/operators are case sensitive, with some exceptions including regexp_like method that can be switched to case-sensitive mode ;
--case sensitive demo
select q.txt
from (select 'myPhrase' txt from dual) q
where regexp_like(q.txt, 'myphrase'); --empty result set
select q.txt
from (select 'myPhrase' txt from dual) q
where q.txt like 'myphrase'; --empty result set
select q.txt
from (select 'myPhrase' txt from dual) q
where q.txt like 'myPhrase'; --exact match; returns value
select q.txt
from (select 'myPhrase' txt from dual) q
where regexp_like(q.txt, 'myphrase', 'i'); --case insensitive switch 'i'; returns value
select q.txt
from (select 'myPhrase' txt from dual) q
where lower(q.txt) like 'myphrase'; --enforced match; returns value;
Here's reference to docs for Collation rules for different SQL Operations and for Data Type Comparison Rules
Once a pseudo-collation is determined as the collation to use, NLS_SORT and NLS_COMP session parameters are checked to provide the actual named collation to apply
Usually these two are set to BINARY
SELECT
t.parameter,
t.value
FROM
nls_database_parameters t
WHERE
t.parameter IN (
'NLS_COMP', 'NLS_SORT'
);

Related

DB2 efficient select query with like operator for many values (~200)

I have written the following query:
SELECT TBSPACE FROM SYSCAT.TABLES WHERE TYPE='T' AND (TABNAME LIKE '%_ABS_%' OR TABNAME LIKE '%_ACCT_%')
This gives me a certain amount of results. Now the problem is that I have multiple TABNAME to select using the LIKE operator (~200). Is there an efficient way to write the query for the 200 values without repeating the TABNAME LIKE part (because there are 200 such values which would result in a really huge query) ?
(If it helps, I have stored all required TABNAME values in a table TS to retrieve from)

If you are just looking for substrings, you could use LOCATE. E.g.
WITH SS(S) AS (
VALUES
('_ABS_')
, ('_ACCT_')
)
SELECT DISTINCT
TABNAME
FROM
SYSCAT.TABLES, SS
WHERE
TYPE='T'
AND LOCATE(S,TABNAME) > 0
or if your substrings are in table CREATE TABLE TS(S VARCHAR(64))
SELECT DISTINCT
TABNAME
FROM
SYSCAT.TABLES, TS
WHERE
TYPE='T'
AND LOCATE(S,TABNAME) > 0

You could try REGEXP_LIKE. E.g.
SELECT DISTINCT
TABNAME
FROM
SYSCAT.TABLES
WHERE
TYPE='T'
AND REGEXP_LIKE(TABNAME,'.*_((ABS)|(ACCT))_.*')

Just in case.
Note, that the '_' character has special meaning in a pattern-expression of the LIKE predicate:
The underscore character (_) represents any single character.
The percent sign (%) represents a string of zero or more characters.
Any other character represents itself.
So, if you really need to find _ABS_ substring, you should use something like below.
You get both rows in the result, if you use the commented out pattern instead, which may not be desired.
with
pattern (str) as (values
'%\_ABS\_%'
--'%_ABS_%'
)
, tables (tabname) as (values
'A*ABS*A'
, 'A_ABS_A'
)
select tabname
from tables t
where exists (
select 1
from pattern p
where t.tabname like p.str escape '\'
);

How to use regular expression in Oracle SQL query

I have a table with a column 'DESCRIPTION'.
I would like retrieve, by a regular expression, only rows with at least one lower case character.
I have tried
select * from MYTABLE t
WHERE REGEXP_LIKE (t.DESCRIPTION, '[a-z]');
but the result is equal to
select * from MYTABLE t

You may need to explicily force a case sensitive comparison:
select *
from MYTABLE t
WHERE REGEXP_LIKE (t.DESCRIPTION, '[a-z]', 'c')
From Oracle documentation:
If you omit match_parameter, then:
The default case sensitivity is determined by the value of the
NLS_SORT parameter

Flatten national characters in SQL Server

I have a column that contains pet names with national characters. How do I write the query to match them all in one condition?
|PetName|
Ćin
ćin
Ĉin
ĉin
Ċin
ċin
Čin
čin
sth like FLATTEN funciton here:
...WHERE LOWER(FLATTEN(PetName)) = 'cin'
Tried to cast it to from NVARCHAR to VARCHAR but it didn't help. I'd like to avoid using REPLACE for every character.

this should work because cyrillic collation base cases all diacritics like Đ,Ž,Ć,Č,Š,etc...
declare #t table(PetName nvarchar(100))
insert into #t
SELECT N'Ćin' union all
SELECT N'ćin' union all
SELECT N'Ĉin' union all
SELECT N'ĉin' union all
SELECT N'Ċin' union all
SELECT N'ċin' union all
SELECT N'Čin' union all
SELECT N'čin'
SELECT *
FROM #t
WHERE lower(PetName) = 'cin' COLLATE Cyrillic_General_CS_AI

You can change the collation used for the comparison:
WHERE PetName COLLATE Cyrillic_General_CI_AI = 'cin'

There isn't really a way or built-in function that will strip accents from characters.
If you are doing comparisons (LIKE, IN, PATINDEX etc), you can just force COLLATE if the column/db is not already accent insensitive.
Normally, a query like this
with test(col) as (
select 'Ćin' union all
select 'ćin')
select * from test
where col='cin'
will return both columns, since the default collation (unless you change it) is insensitive. This won't work for FULLTEXT indexes though.

SELECT from nothing?

Is it possible to have a statement like
SELECT "Hello world"
WHERE 1 = 1
in SQL?
The main thing I want to know, is can I SELECT from nothing, ie not have a FROM clause.

It's not consistent across vendors - Oracle, MySQL, and DB2 support dual:
SELECT 'Hello world'
FROM DUAL
...while SQL Server, PostgreSQL, and SQLite don't require the FROM DUAL:
SELECT 'Hello world'
MySQL does support both ways.

Try this.
Single:
SELECT * FROM (VALUES ('Hello world')) t1 (col1) WHERE 1 = 1
Multi:
SELECT * FROM (VALUES ('Hello world'),('Hello world'),('Hello world')) t1 (col1) WHERE 1 = 1
more detail here : http://modern-sql.com/use-case/select-without-from

In Oracle:
SELECT 'Hello world' FROM dual
Dual equivalent in SQL Server:
SELECT 'Hello world'

Here is the most complete list of database support of dual from https://blog.jooq.org/tag/dual-table/:
In many other RDBMS, there is no need for dummy tables, as you can
issue statements like these:
SELECT 1;
SELECT 1 + 1;
SELECT SQRT(2);
These are the RDBMS, where the above is generally possible:
H2
MySQL
Ingres
Postgres
SQLite
SQL Server
Sybase ASE
In other RDBMS, dummy tables are required, like in Oracle. Hence,
you’ll need to write things like these:
SELECT 1 FROM DUAL;
SELECT 1 + 1 FROM DUAL;
SELECT SQRT(2) FROM DUAL;
These are the RDBMS and their respective dummy tables:
DB2: SYSIBM.DUAL
Derby: SYSIBM.SYSDUMMY1
H2: Optionally supports DUAL
HSQLDB: INFORMATION_SCHEMA.SYSTEM_USERS
MySQL: Optionally supports DUAL
Oracle: DUAL
Sybase SQL Anywhere: SYS.DUMMY
Ingres has no DUAL, but would actually need it as in Ingres you cannot
have a WHERE, GROUP BY or HAVING clause without a FROM clause.

In SQL Server type:
Select 'Your Text'
There is no need for the FROM or WHERE clause.

You can. I'm using the following lines in a StackExchange Data Explorer query:
SELECT
(SELECT COUNT(*) FROM VotesOnPosts WHERE VoteTypeName = 'UpMod' AND UserId = #UserID AND PostTypeId = 2) AS TotalUpVotes,
(SELECT COUNT(*) FROM Answers WHERE UserId = #UserID) AS TotalAnswers
The Data Exchange uses Transact-SQL (the SQL Server proprietary extensions to SQL).
You can try it yourself by running a query like:
SELECT 'Hello world'

There is another possibility - standalone VALUES():
VALUES ('Hello World');
Output:
column1
Hello World
It is useful when you need to specify multiple values in compact way:
VALUES (1, 'a'), (2, 'b'), (3, 'c');
Output:
column1 column2
1 a
2 b
3 c
DBFiddle Demo
This syntax is supported by SQLite/PostgreSQL/DB LUW/MariaDB 10.3.

In Firebird, you can do this:
select "Hello world" from RDB$DATABASE;
RDB$DATABASE is a special table that always has one row.

I think it is not possible. Theoretically: select performs two sorts of things:
narrow/broaden the set (set-theory);
mapping the result.
The first one can be seen as a horizontal diminishing opposed to the where-clause which can be seen as a vertical diminishing. On the other hand, a join can augment the set horizontally where a union can augment the set vertically.
augmentation diminishing
horizontal join/select select
vertical union where/inner-join
The second one is a mapping. A mapping, is more a converter. In SQL it takes some fields and returns zero or more fields. In the select, you can use some aggregate functions like, sum, avg etc. Or take all the columnvalues an convert them to string. In C# linq, we say that a select accepts an object of type T and returns an object of type U.
I think the confusion comes by the fact that you can do: select 'howdy' from <table_name>. This feature is the mapping, the converter part of the select. You are not printing something, but converting! In your example:
SELECT "
WHERE 1 = 1
you are converting nothing/null into "Hello world" and you narrow the set of nothing / no table into one row, which, imho make no sense at all.
You may notice that, if you don't constrain the number of columns, "Hello world" is printed for each available row in the table. I hope, you understand why by now. Your select takes nothing from the available columns and creates one column with the text: "Hello world".
So, my answer is NO. You can't just leave out the from-clause because the select always needs table-columns to perform on.

In Standard SQL, no. A WHERE clause implies a table expression.
From the SQL-92 spec:
7.6 "where clause"
Function
Specify a table derived by the
application of a "search condition" to
the result of the preceding "from
clause".
In turn:
7.4 "from clause"
Function
Specify a table derived from one or more named tables.
A Standard way of doing it (i.e. should work on any SQL product):
SELECT DISTINCT 'Hello world' AS new_value
FROM AnyTableWithOneOrMoreRows
WHERE 1 = 1;
...assuming you want to change the WHERE clause to something more meaningful, otherwise it can be omitted.

For ClickHouse, the nothing is system.one
SELECT 1 FROM system.one

I know this is an old question but the best workaround for your question is using a dummy subquery:
SELECT 'Hello World'
FROM (SELECT name='Nothing') n
WHERE 1=1
This way you can have WHERE and any clause (like Joins or Apply, etc.) after the select statement since the dummy subquery forces the use of the FROM clause without changing the result.

For DB2:
`VALUES('Hello world')`
You can do multiple "rows" as well:
`VALUES('Hello world'),('Goodbye world');`
You can even use them in joins as long as the types match:
VALUES(1,'Hello world')
UNION ALL
VALUES(2,'Goodbye world');

I'm using firebird
First of all, create a one column table named "NoTable" like this
CREATE TABLE NOTABLE
(
NOCOLUMN INTEGER
);
INSERT INTO NOTABLE VALUES (0); -- You can put any value
now you can write this
select 'hello world' as name
from notable
you can add any column you want to be shown

Non-latin-characters ordering in database with "order by"

I just found some strange behavior of database's "order by" clause. In string comparison, I expected some characters such as '[' and '_' are greater than latin characters/digits such as 'I' or '2' considering their orders in the ASCII table. However, the sorting results from database's "order by" clause is different with my expectation. Here's my test:
SQLite version 3.6.23
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table products(name varchar(10));
sqlite> insert into products values('ipod');
sqlite> insert into products values('iphone');
sqlite> insert into products values('[apple]');
sqlite> insert into products values('_ipad');
sqlite> select * from products order by name asc;
[apple]
_ipad
iphone
ipod
select * from products order by name asc;
name
...
[B#
_ref
123
1ab
...
This behavior is different from Java's string comparison (which cost me some time to find this issue). I can verify this in both SQLite 3.6.23 and Microsoft SQL Server 2005. I did some web search but cannot find any related documentation. Could someone shed me some light on it? Is it a SQL standard? Where can I find some information about this? Thanks in advance.

The concept of comparing and ordering the characters in a database is called collation.
How the strings are stored depends on the collation which is usually set in the server, client or session properties.
In MySQL:
SELECT *
FROM (
SELECT 'a' AS str
UNION ALL
SELECT 'A' AS str
UNION ALL
SELECT 'b' AS str
UNION ALL
SELECT 'B' AS str
) q
ORDER BY
str COLLATE UTF8_BIN
--
'A'
'B'
'a'
'b'
and
SELECT *
FROM (
SELECT 'a' AS str
UNION ALL
SELECT 'A' AS str
UNION ALL
SELECT 'b' AS str
UNION ALL
SELECT 'B' AS str
) q
ORDER BY
str COLLATE UTF8_GENERAL_CI
--
'a'
'A'
'b'
'B'
UTF8_BIN sorts characters according to their unicode. Caps have lower unicodes and therefore go first.
UTF8_GENERAL_CI sorts characters according to their alphabetical position, disregarding case.
Collation is also important for indexes, since the indexes rely heavily on sorting and comparison rules.

The important keyword in this case is 'collation'. I have no experience with SQLite, but would expect it to be similar to other database engines in that you can define the collation to use for whole databases, single tables, per connection, etc.
Check your DB documentation for the options available to you.

The ASCII codes for lower-case characters such as 'i' are greater than the ones for '[' and '_':
'i': 105
'[': 91
'_': 95
However, try to insert upper-case characters, eg. try with "IPOD" or "Iphone", those will become before "_" and "[" with the default binary collation.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas