How to determine if SQLite column created with COLLATE NOCASE - sql

A column in a SQLite db must be COLLATE NOCASE. I assume there is no way to add that capability to an existing table, so I'm prepared to recreate the table with it. How can I determine if the existing column is COLLATE NOCASE in order to avoid recreating the table every time it is opened?

How can I determine if the existing column is COLLATE NOCASE
The query
SELECT sql FROM sqlite_master WHERE type='table' AND tbl_name='my_table'
will give you the CREATE TABLE statement for that table. You could inspect the DDL to determine if the column is already defined as COLLATE NOCASE.
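As a rough sketch of that inspection from Python's sqlite3 module: the helper name column_is_nocase and the regular expression are my own, not part of SQLite, and the match is only a heuristic (it assumes the column definition contains no commas before the COLLATE clause and that the column name does not also appear earlier in the DDL).

```python
import re
import sqlite3

def column_is_nocase(conn, table, column):
    """Heuristically check the stored DDL for '<column> ... COLLATE NOCASE'."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND tbl_name=?",
        (table,),
    ).fetchone()
    if row is None or row[0] is None:
        return False
    # Column name, then anything up to the next comma, then COLLATE NOCASE.
    pattern = rf"\b{re.escape(column)}\b[^,]*\bCOLLATE\s+NOCASE\b"
    return re.search(pattern, row[0], flags=re.IGNORECASE) is not None

# Demo on a throwaway in-memory database (table and column names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT COLLATE NOCASE, note TEXT)"
)
print(column_is_nocase(conn, "my_table", "name"))
print(column_is_nocase(conn, "my_table", "note"))
```

For anything beyond simple column definitions, a proper parse of the DDL would be safer than a regular expression.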

You might not need to do that at all if it is sufficient to change the collation in the query; you can simply override it there. It won't affect constraints or indexes, but depending on your use case, it might be good enough.
To be clear: the COLLATE clause in the table definition is just a default for queries. You can override it in each query.
e.g.
WHERE column = 'term' COLLATE NOCASE
or
ORDER BY column COLLATE NOCASE
However, note that SQLite's LIKE doesn't honor the COLLATE clause (use PRAGMA case_sensitive_like instead).
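To see the per-query override in action, here is a small sketch using Python's sqlite3 module. The table t and its contents are made up for the demo; the column has the default BINARY collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")  # no COLLATE clause, so BINARY applies
conn.executemany("INSERT INTO t VALUES (?)", [("Term",), ("term",), ("TERM",)])

# Default comparison: only the exact-case row matches.
binary_hits = conn.execute(
    "SELECT count(*) FROM t WHERE col = 'term'"
).fetchone()[0]

# Overriding the collation in the query makes the comparison case-insensitive.
nocase_hits = conn.execute(
    "SELECT count(*) FROM t WHERE col = 'term' COLLATE NOCASE"
).fetchone()[0]

print(binary_hits, nocase_hits)  # 1 3
```

Keep in mind that NOCASE only folds the 26 ASCII letters, so non-ASCII characters still compare case-sensitively.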

The easiest and most general way is to store a version number somewhere (in another table, or with PRAGMA user_version).
If you want to check the column itself, use a query with a comparison that is affected by the column's collation:
SELECT Col = upper(Col)
FROM (SELECT Col
FROM MyTable
WHERE 0 -- don't actually return any row from MyTable
UNION ALL
SELECT 'x' -- lowercase; same collation as Col
);
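The version-number approach could look roughly like this in Python. This is only a sketch under assumptions: TARGET_VERSION, ensure_schema and migrate_to_nocase are names I made up, version 1 is taken to mean "Col is COLLATE NOCASE", and MyTable is assumed to have just the one column.

```python
import sqlite3

TARGET_VERSION = 1  # hypothetical: 1 means "MyTable.Col is COLLATE NOCASE"

def migrate_to_nocase(conn):
    """Rebuild MyTable with a NOCASE column and stamp the schema version."""
    conn.executescript(
        f"""
        BEGIN;
        CREATE TABLE MyTable_new (Col TEXT COLLATE NOCASE);
        INSERT INTO MyTable_new SELECT Col FROM MyTable;
        DROP TABLE MyTable;
        ALTER TABLE MyTable_new RENAME TO MyTable;
        PRAGMA user_version = {TARGET_VERSION};
        COMMIT;
        """
    )

def ensure_schema(conn):
    """Return True if a rebuild was needed, False if the db was already current."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version < TARGET_VERSION:
        migrate_to_nocase(conn)
        return True
    return False

# Demo: start from an old-style table with the default BINARY collation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Col TEXT)")
conn.execute("INSERT INTO MyTable VALUES ('Hello')")
conn.commit()

first = ensure_schema(conn)   # rebuilds the table
second = ensure_schema(conn)  # nothing to do on later opens
matches = conn.execute(
    "SELECT count(*) FROM MyTable WHERE Col = 'hello'"
).fetchone()[0]
print(first, second, matches)  # True False 1
```

A real table would also need its indexes, constraints and triggers recreated during the rebuild.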

Related

Firebird order by collation

I have a strange problem with Firebird 2.5.
My database has default charset = utf8.
I have a column p_nname in patienten table:
CREATE TABLE PATIENTEN (
P_NNAME VARCHAR(25) DEFAULT '' NOT NULL COLLATE UNICODE_CI,
I expect the collation to work everywhere, that is, in both WHERE and ORDER BY clauses.
What I have is a working collation in WHERE. The two queries below return the same result, which is what I want.
select * from patienten where p_nname='adler'
select * from patienten where p_nname='ADler'
The problem is that the ORDER BY clause does not work as I expect.
This SQL works as if the column has no UNICODE_CI collation.
select * from patienten order by p_nname
To get the correct sort order I have to write:
select * from patienten order by p_nname collate unicode_ci
Is there a way to omit the COLLATE clause in ORDER BY?
Looks like a bug indeed, the documentation states:
The keyword COLLATE specifies the collation order for a string column
if you need a collation that is different from the normal one for this
column. The normal collation order will be either the default one for
the database character set or one that has been set explicitly in the
column's definition.
so it should work without specifying the COLLATE clause in ORDER BY. I suggest you file a bug report.

Is the LIKE operator case-sensitive with SQL Server?

The documentation for the LIKE operator says nothing about its case sensitivity. Is it case sensitive? How can I enable or disable that?
I am querying varchar(n) columns on a Microsoft SQL Server 2005 installation, if that matters.
It is not the operator that is case sensitive, it is the column itself.
When SQL Server is installed, a default collation is chosen for the instance. Unless specified otherwise (see the COLLATE clause below), a new database inherits the collation of the instance, and a new column inherits the collation of the database it belongs to.
A collation like sql_latin1_general_cp1_ci_as dictates how the content of the column should be treated. CI stands for case insensitive and AS stands for accent sensitive.
A complete list of collations is available at https://msdn.microsoft.com/en-us/library/ms144250(v=sql.105).aspx
(a) To check an instance collation
select serverproperty('collation')
(b) To check a database collation
select databasepropertyex('databasename', 'collation') sqlcollation
(c) To create a database using a different collation
create database exampledatabase
collate sql_latin1_general_cp1_cs_as
(d) To create a column using a different collation
create table exampletable (
examplecolumn varchar(10) collate sql_latin1_general_cp1_ci_as null
)
(e) To modify a column collation
alter table exampletable
alter column examplecolumn varchar(10) collate sql_latin1_general_cp1_ci_as null
It is possible to change the instance and database collations, but doing so does not affect previously created objects.
It is also possible to change a column's collation on the fly for string comparison, but this is strongly discouraged in a production environment because it is extremely costly.
select
column1 collate sql_latin1_general_cp1_ci_as as column1
from table1
All this talk about collation seems a bit over-complicated. Why not just use something like:
IF UPPER(@@VERSION) NOT LIKE '%AZURE%'
Then your check is case insensitive whatever the collation.
If you want a search whose case sensitivity differs from the collation of the column / database / server, you don't have to change that collation; you can always use the COLLATE clause. For example, a case-insensitive search on a case-sensitive column:
USE tempdb;
GO
CREATE TABLE dbo.foo(bar VARCHAR(32) COLLATE Latin1_General_CS_AS);
GO
INSERT dbo.foo VALUES('John'),('john');
GO
SELECT bar FROM dbo.foo
WHERE bar LIKE 'j%';
-- 1 row
SELECT bar FROM dbo.foo
WHERE bar COLLATE Latin1_General_CI_AS LIKE 'j%';
-- 2 rows
GO
DROP TABLE dbo.foo;
It works the other way, too, if your column / database / server is case insensitive and you want a case-sensitive search, e.g.
USE tempdb;
GO
CREATE TABLE dbo.foo(bar VARCHAR(32) COLLATE Latin1_General_CI_AS);
GO
INSERT dbo.foo VALUES('John'),('john');
GO
SELECT bar FROM dbo.foo
WHERE bar LIKE 'j%';
-- 2 rows
SELECT bar FROM dbo.foo
WHERE bar COLLATE Latin1_General_CS_AS LIKE 'j%';
-- 1 row
GO
DROP TABLE dbo.foo;
You have an option to define collation order at the time of defining your table. If you define a case-sensitive order, your LIKE operator will behave in a case-sensitive way; if you define a case-insensitive collation order, the LIKE operator will ignore character case as well:
CREATE TABLE Test (
CI_Str VARCHAR(15) COLLATE Latin1_General_CI_AS -- Case-insensitive
, CS_Str VARCHAR(15) COLLATE Latin1_General_CS_AS -- Case-sensitive
);
Here is a quick demo on sqlfiddle showing the results of collation order on searches with LIKE.
The like operator takes two strings. These strings have to have compatible collations, which is explained here.
In my opinion, things then get complicated. The following query returns an error saying that the collations are incompatible:
select *
from INFORMATION_SCHEMA.TABLES
where 'abc' COLLATE SQL_Latin1_General_CP1_CI_AS like 'ABC' COLLATE SQL_Latin1_General_CP1_CS_AS
On a random machine here, the default collation is SQL_Latin1_General_CP1_CI_AS. The following query is successful, but returns no rows:
select *
from INFORMATION_SCHEMA.TABLES
where 'abc' like 'ABC' COLLATE SQL_Latin1_General_CP1_CS_AS
The values "abc" and "ABC" do not match in a case-sensitive world.
In other words, there is a difference between having no collation and using the default collation. When one side has no collation, then it is "assigned" an explicit collation from the other side.
(The results are the same when the explicit collation is on the left.)
Try running,
SELECT SERVERPROPERTY('COLLATION')
Then find out if your collation is case sensitive or not.
You can change the collation in the properties of each item.
You can easily change the collation in Microsoft SQL Server Management Studio:
Right-click the table -> Design.
Choose your column and scroll down in Column Properties to Collation.
Set your sort preference by checking "Case Sensitive".

How to change the collation of sqlite3 database to sort case insensitively?

I have a query for sqlite3 database which provides the sorted data. The data are sorted on the basis of a column which is a varchar column "Name". Now when I do the query
select * from tableNames Order by Name;
It provides the data like this.
Pen
Stapler
pencil
That is, the sort is case sensitive. The order I want is:
Pen
pencil
Stapler
So what changes should I make in sqlite3 database for the necessary results?
Related How to set Sqlite3 to be case insensitive when string comparing?
To sort case-insensitively, you can use ORDER BY Name COLLATE NOCASE
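A quick way to compare the two orderings, sketched with Python's sqlite3 module and the three names from the question (the in-memory table is just for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableNames (Name VARCHAR)")
conn.executemany(
    "INSERT INTO tableNames VALUES (?)",
    [("pencil",), ("Stapler",), ("Pen",)],
)

# Default BINARY collation: uppercase letters sort before lowercase ones.
default_order = [
    row[0] for row in conn.execute("SELECT Name FROM tableNames ORDER BY Name")
]

# NOCASE collation applied only in this query's ORDER BY.
nocase_order = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM tableNames ORDER BY Name COLLATE NOCASE"
    )
]

print(default_order)  # ['Pen', 'Stapler', 'pencil']
print(nocase_order)   # ['Pen', 'pencil', 'Stapler']
```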
The SQLite Datatypes documentation discusses user-defined collation sequences. Specifically you use COLLATE NOCASE to achieve your goal.
They give an example:
CREATE TABLE t1(
a, -- default collation type BINARY
b COLLATE BINARY, -- default collation type BINARY
c COLLATE REVERSE, -- default collation type REVERSE
d COLLATE NOCASE -- default collation type NOCASE
);
and note that:
-- Grouping is performed using the NOCASE collation sequence (i.e. values
-- 'abc' and 'ABC' are placed in the same group).
SELECT count(*) FROM t1 GROUP BY d;
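The grouping behaviour can be checked with a cut-down version of that table, again sketched with Python's sqlite3 module. Only columns a and d are kept here, and the sample rows are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a, d COLLATE NOCASE)")  # a uses BINARY by default
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [("abc", "abc"), ("ABC", "ABC"), ("Abc", "Abc")],
)

# Grouping on the BINARY column keeps the three spellings apart.
groups_a = conn.execute("SELECT count(*) FROM t1 GROUP BY a").fetchall()

# Grouping on the NOCASE column puts them all in one group.
groups_d = conn.execute("SELECT count(*) FROM t1 GROUP BY d").fetchall()

print(len(groups_a), groups_d)  # 3 [(3,)]
```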
select * from tableNames Order by lower(Name);
Michael van der Westhuizen explains in his comment below why this is not a good way. I am leaving this answer up so as to preserve his comment and to serve as a warning to others who might have the same 'bright' idea I had ;-)
Use this statement in your SQLite database (note that it affects LIKE comparisons only, not ORDER BY):
PRAGMA case_sensitive_like = false

Achieving properties of binary and collation at the same time

I have a varchar field in my database which I use for two significantly different things. In one scenario I compare it case-sensitively to ensure no duplicates are inserted. To achieve this I've set the comparison to binary. However, I also want to be able to search case-insensitively on the same column values. Is there any way I can do this without simply creating a redundant column with a case-insensitive collation instead of binary?
CREATE TABLE t_search (value VARCHAR(50) NOT NULL COLLATE UTF8_BIN PRIMARY KEY);
INSERT
INTO t_search
VALUES ('test');
INSERT
INTO t_search
VALUES ('TEST');
SELECT *
FROM t_search
WHERE value = 'test' COLLATE UTF8_GENERAL_CI;
The SELECT will return both rows.
Note, however, that anything with an explicit COLLATE applied to it has the lowest coercibility value, so its collation wins.
This means that it is the column value that will be converted to UTF8_GENERAL_CI for the comparison, not the other way round, so the index on value will not be used for the search and the condition will not be sargable.
If you need good performance on case-insensitive searching, you should create an additional column with case-insensitive collation, index it and use in the searches.
You can use the COLLATE clause to change the collation of a column in a query. See this manual page for extensive examples.

Unique constraint on table column

I have a table (an existing table with data in it) and that table has a column UserName.
I want this UserName to be unique.
So I add a constraint like this:
ALTER TABLE Users
ADD CONSTRAINT [IX_UniqueUserUserName] UNIQUE NONCLUSTERED ([UserName])
Now I keep getting an error saying that duplicate users exist in this table.
But I have checked the database using the following query:
SELECT COUNT(UserId) as NumberOfUsers, UserName
FROM Users
GROUP BY UserName, UserId
ORDER BY UserName
This results in a list of users all having 1 as a NumberOfUsers. So no duplicates there.
But when I check the username it fails on, I see the following result:
beluga
béluga
So apparently it fails to distinguish "e" from "é" or "è"; it's as if the accents are ignored. Is there any way to make SQL not ignore these accents when adding the unique key constraint?
SOLUTION:
THX to you guys I've found the solution.
This fixed the problem:
ALTER TABLE Users
ALTER COLUMN UserName nvarchar(250) COLLATE SQL_Latin1_General_CP1_CI_AS
The collation you are using most likely ignores case and accents when comparing. You'll need to change the collation.
Latin1_General_CI_AI Ignores case and accents
Latin1_General_CI_AS will not ignore accents
List of SQL server collation names here.
Your query groups by UserID too - you don't want to be doing that.
Use:
SELECT COUNT(*) as NumberOfUsers, UserName
FROM Users
GROUP BY UserName
ORDER BY UserName
Your query would only show up users with the same name and same user ID. Or, maybe, order the data by COUNT(*) so the last row that shows up is most likely the troublemaker?
You could also have problems with collation as others have suggested, but normally, GROUP BY would be self-consistent.
Presumably UserId is your primary key. Since it's part of what you are grouping by, you are guaranteed to get a single row per group. Take the "userId" column out of your group by.
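To illustrate why the original check could not find the duplicates, here is a sketch in Python with an in-memory SQLite database. It is only an analogy: the question is about a different engine, and SQLite's NOCASE collation (which ignores case, not accents) stands in for the accent-insensitive collation. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (UserId INTEGER PRIMARY KEY, UserName TEXT COLLATE NOCASE)"
)
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [(1, "beluga"), (2, "Beluga"), (3, "orca")],
)

# Grouping by the primary key as well: every group holds exactly one row.
per_pair = conn.execute(
    "SELECT COUNT(UserId), UserName FROM Users GROUP BY UserName, UserId"
).fetchall()

# Grouping by UserName alone: names that are equal under the collation collapse.
per_name = conn.execute(
    "SELECT COUNT(*) AS NumberOfUsers, lower(UserName) FROM Users "
    "GROUP BY UserName HAVING COUNT(*) > 1"
).fetchall()

print([count for count, _ in per_pair])  # [1, 1, 1]
print(per_name)                          # [(2, 'beluga')]
```

The HAVING clause is an addition of mine to list only the offending names; ordering by the count, as suggested above, would serve the same purpose.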
As Andrew Barrett says, the default collation in MySQL does not recognize accents correctly.
Change the collation of your fields to UTF8_unicode_ci and it should see accents properly.
ci means case insensitive, and you can use a different collation if case is important.
You can create a new table with the new collation, then copy * from the existing table into the new one.
Also note that you can create just the table you are interested in with the relevant collation (instead of changing it server-wide). So you could also do something like:
CREATE TABLE Users (c1 varchar (10), .., COLLATE Latin1_General_CI_AS NULL )