Compare s, t with ş, ţ in SQL Server

I followed the post "How do I perform an accent insensitive compare (e with è, é, ê and ë) in SQL Server?", but it doesn't help me with the "ş" and "ţ" characters.
This doesn't return anything if the city name is "iaşi":
SELECT *
FROM City
WHERE Name COLLATE Latin1_general_CI_AI LIKE '%iasi%' COLLATE Latin1_general_CI_AI
This also doesn't return anything if the city name is " iaşi " (notice the foreign ş in the LIKE pattern):
SELECT *
FROM City
WHERE Name COLLATE Latin1_general_CI_AI LIKE '%iaşi%' COLLATE Latin1_general_CI_AI
I'm using SQL Server Management Studio 2012.
My database and column collation is "Latin1_General_CI_AI", column type is nvarchar.
How can I make it work?

The characters you've specified aren't part of the Latin1 codepage, so they can't ever be compared in any other way than ordinal in Latin1_General_CI_AI. In fact, I assume that they don't really work at all in the given collation.
If you're only using one collation, simply use the correct collation (for example, if your data is Turkish, use Turkish_CI_AI). If your data comes from many different languages, you have to use Unicode and a suitable collation.
However, there's an additional issue. In languages like Romanian or Turkish, ş is not an accented s but a completely separate letter; see http://collation-charts.org/mssql/mssql.0418.1250.Romanian_CI_AI.html. Contrast this with, e.g., š, which is an accented form of s.
If you really need ş to equal s, you have to replace the original character manually.
Also, when you're using Unicode columns (nvarchar and friends), make sure you're also using Unicode literals, i.e. use N'%iasi%' rather than '%iasi%'.
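If the manual replacement is done in application code before the search term or the stored value reaches SQL Server, one possible approach is to decompose each character and drop its combining marks. This is only a sketch in Python, not something the answer above prescribes; the function name fold_to_base_letters is made up for illustration, and it folds every accented letter, not just ş and ţ.

```python
import unicodedata


def fold_to_base_letters(text: str) -> str:
    """Decompose each character (NFD) and drop the combining marks.

    Hypothetical helper: it turns letters such as U+015F into a plain 's'.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "ia\u015fi" uses s with cedilla, "ia\u0219i" uses s with comma below.
print(fold_to_base_letters("ia\u015fi"))
print(fold_to_base_letters("ia\u0219i"))
```

Both spellings of the city name fold to the same plain-ASCII string, so the folded value can be compared with an ordinary search term.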

In SQL Server 2008 collations versioned 100 were introduced.
Collation Latin1_General_100_CI_AI seems to do what you want.
The following should work:
SELECT * FROM City WHERE Name LIKE '%iasi%' COLLATE Latin1_General_100_CI_AI

Not the tidiest solution, I guess, but if you know that only the "ş" and "ţ" characters are the problem, would it be acceptable to do a replace?
SELECT *
FROM City
WHERE replace(replace(Name, N'ş', N's'), N'ţ', N't') COLLATE Latin1_General_CI_AI LIKE N'%iasi%' COLLATE Latin1_General_CI_AI

You just need to change the collation of the name field before the LIKE operation. Check the test code below:
DECLARE @city TABLE ( NAME NVARCHAR(20) )
INSERT INTO @city
VALUES ( N'iaşi' )
SELECT *
FROM @city
WHERE name LIKE 'iasi'
--No return
SELECT *
FROM @city
WHERE name COLLATE Latin1_general_CI_AI LIKE '%iasi%'
--Return 1 row

This problem was haunting me for some time, until now, when I've finally figured it out.
Presuming your table or column uses the SQL_Latin1_General_CP1_CI_AS collation, if you do:
update MyTable
set myCol = replace(myCol, N'ș', N's')
and
update MyTable
set myCol = replace(myCol, N'ț', N't')
the replace function will not find these characters, because the "ș" typed on your keyboard (Romanian Standard layout) differs from the "ş" or "ţ" found in your database.
As a comparison: ţț and şș. You can see that they differ in how close the mark sits to the "t" or "s".
Instead, you must do:
update MyTable
set myCol = replace(myCol, N'ş', N's')
and
update MyTable
set myCol = replace(myCol, N'ţ', N't')
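The visual difference described above corresponds to two distinct Unicode code points per letter: a cedilla form and a comma-below form. A small Python check, offered only as a diagnostic sketch (the helper name describe is hypothetical), prints the code point and official name of each character so you can tell which form your data or keyboard produces.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Cedilla forms (U+015F, U+0163) and comma-below forms (U+0219, U+021B).
for ch in ("\u015f", "\u0219", "\u0163", "\u021b"):
    print(ch, describe(ch))
```

Pasting a character copied from the database into describe shows whether a replace call needs the cedilla form or the comma-below form.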

Related

SQL LIKE using special character

How do I write a query that finds a word whether or not it contains the special character?
E.g., I have this data: "NÃO", and if I search by typing "NAO", it should return this row to me. And the converse too: if I have "ANTONIO" and I type "ANTÓNIO", ANTONIO should be returned.
I use this code but it does not work:
SELECT * FROM PESSOA WHERE NOME like '%' + @PROCURAR + '%'
Accent-insensitive searching can be done by using the Latin1_General_CI_AI collation,
i.e. ÃNTONIO and ANTONIO are the same when the comparison is accent-insensitive.
In the query below, Latin1_General_CI_AI can be broken down into the following parts:
Latin1_General selects the Latin1 General sorting rules (code page 1252 for non-Unicode data).
CI specifies case-insensitive, so 'ABC' equals 'abc'.
AI specifies accent-insensitive, so 'ü' equals 'u'.
Your query should be as follows:
SELECT * FROM table_name WHERE field_name COLLATE Latin1_general_CI_AI Like '%ANTONIO%' COLLATE Latin1_general_CI_AI
Expected Result is as follows:
Id name
1 ÃNTONIO
2 ANTÓNIO
3 ANTONIO
4 ANTÓNIÓ
SELECT *
FROM PESSOA
WHERE NOME COLLATE Latin1_General_CI_AI Like '%' + @PROCURAR + '%'
COLLATE Latin1_General_CI_AI
For more on Latin1_General_CI_AI, see the question
"In SQL Server, what is Latin1_General_CI_AI versus Latin1_General_CI_AS?"
See also:
https://www.mssqltips.com/sqlservertip/4395/understanding-the-collate-databasedefault-clause-in-sql-server/

� IN SQL Server database

In my database I have this character: �. I want to locate the rows that contain it with a query:
Select *
from Sometable
where somecolumn like '%�%'
This gets me no result.
I think it is ANSI encoding.
Use the N prefix, as below:
where col like N'%�%'
Why you need the N prefix:
Prefix Unicode character string constants with the letter N. Without the N prefix, the string is converted to the default code page of the database. This default code page may not recognize certain characters.
Thanks to Martin Smith: I had tested with only one character earlier and it worked, but as Martin pointed out, it returns all characters.
The query below works and returns only the intended row:
select * from #demo where id like N'%�%'
COLLATE Latin1_General_100_BIN
Demo:
create table #demo
(
id nvarchar(max)
)
insert into #demo
values
(N'ﬗ'),
( N'�')
To learn more about Unicode, see the links below:
http://kunststube.net/encoding/
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
This is the Unicode replacement character symbol.
It could match any of 2,048 invalid code points in the UCS-2 encoding (or the single character U+FFFD for the symbol itself).
You can use a range and a binary collate clause to match them all (demo).
WITH T(N)
AS
(
SELECT TOP 65536 NCHAR(ROW_NUMBER() OVER (ORDER BY @@SPID))
FROM master..spt_values v1,
master..spt_values v2
)
SELECT N
FROM T
WHERE N LIKE '%[' + NCHAR(65533) + NCHAR(55296) + '-' + NCHAR(57343) + ']%' COLLATE Latin1_General_100_BIN
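The decimal values in the LIKE range above are easier to read in hexadecimal. The following Python sketch (the constant and function names are made up for illustration) mirrors the same test over 16-bit code units: U+FFFD itself, plus the surrogate range U+D800 to U+DFFF.

```python
REPLACEMENT_CHARACTER = 0xFFFD  # NCHAR(65533)
SURROGATE_FIRST = 0xD800        # NCHAR(55296)
SURROGATE_LAST = 0xDFFF         # NCHAR(57343)


def is_replacement_or_surrogate(code_unit: int) -> bool:
    """Same membership test as the [..] character class in the LIKE pattern."""
    return (code_unit == REPLACEMENT_CHARACTER
            or SURROGATE_FIRST <= code_unit <= SURROGATE_LAST)


# Scan every 16-bit code unit, as the query does with its generated numbers.
matches = [cu for cu in range(1, 0x10000) if is_replacement_or_surrogate(cu)]
print(len(matches))  # 2048 surrogates + U+FFFD = 2049
```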
You can use ASCII to find out the ASCII code for that character:
Select ascii('�')
And use CHAR to retrieve the character from that code and combine it in a LIKE expression:
Select * from Sometable
where somecolumn like '%'+CHAR(63)+'%'
Note that the collation you use can affect the result. It also depends on the encoding your application uses to feed in the data (UTF-8, Unicode, etc.). How you store it, VARCHAR or NVARCHAR, also has the final say on what you see.
There's more in this similar question.
EDIT
@Mark
Try this simple test:
create table sometable(somecolumn nvarchar(100) not null)
GO
insert into sometable
values
('12345')
,('123�45')
,('12345')
GO
select * from sometable
where somecolumn like '%'+CHAR(63)+'%'
GO
This only means that the character was stored as a "?" in this test.
When you see a �, it means the app you are viewing it in isn't quite sure what to print out.
It also means the OP probably needs to find out which character it is by using a query.
Also note that a string displayed as ��� can be formed by three different characters.
CHAR(63) was just an example, but you are right: in the ASCII table this is an ordinary question mark.
EDIT
@Bridge
I don't have time to dig into it right now, but the test below didn't work:
Select ascii('�'), CHAR(ascii('�')), UNICODE(N'�'), CHAR(UNICODE(N'�'))
GO
create table sometable(somecolumn nvarchar(100) not null)
GO
insert into sometable
values
('12345')
,('123�45')
,('12345')
,('12'+NCHAR(UNICODE(N'�'))+'345')
GO
select * from sometable
where somecolumn like '%'+CHAR(63)+'%'
select * from sometable
where somecolumn like '%'+NCHAR(UNICODE(N'�'))+'%'
GO

query collation on foreign language field in Latin table

I have a series of tables that each have a dedicated column to a foreign language. Languages vary from Japanese, Thai, English, Italian, French, more than 20 in all.
All of these tables are set up with Latin Case Insensitive collation. DB works fine.
But now I am trying to query against the specific foreign language column of each table. Let's take Japanese for starters. I'd like a foreign language user to enter foreign text and find the record based on the foreign language column.
DECLARE @myVar nvarchar(max);
SET @myVar = 'エンジン ストップ リレー'; -- 'Engine Stop Relay' in English
Select *
FROM tableJapanese
WHERE langString = @myVar;
I have tried a multitude of collation combinations. I even copied the table and changed the collation of the column to Japanese_CI_AI and tried to query it that way.
None of these WHERE clauses work with either table/column collation, whether the column was Latin or Japanese:
WHERE lang_String collate Japanese_CI_AI = @myVar;
WHERE lang_String = @myVar collate Japanese_CI_AI;
WHERE lang_String collate Japanese_CI_AI = @myVar collate Japanese_CI_AI;
I would like to leave the columns/database as Latin collation and code the queries for each language if possible.
This seems like one of those problems that if were a snake I'd been bitten already. Can anyone see what I am missing?
MSSQL Express 2008 R2
SOLUTION:
Add N in front of the string literal; it tells SQL Server that the literal is Unicode:
Select *
FROM tblLangJAP_test
WHERE lang_String = N'エンジン ストップ リレー';
Works flawlessly.
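To see why the literal without N never matched, it can help to look at what happens when the Japanese text is forced through a single-byte Latin code page. The Python sketch below is only an analogy: it assumes code page 1252 (the code page behind Latin1_General collations) and uses Python's "replace" error handler, which is not necessarily identical to SQL Server's own conversion rules.

```python
text = "エンジン ストップ リレー"

# Characters that code page 1252 cannot represent become question marks.
as_code_page_1252 = text.encode("cp1252", errors="replace")
print(as_code_page_1252)

# UTF-16, which nvarchar and N'...' literals use, round-trips without loss.
as_utf16 = text.encode("utf-16-le")
print(as_utf16.decode("utf-16-le") == text)
```

After the lossy conversion only question marks and spaces remain, so an equality test against the stored Unicode value has nothing to match.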

How to find values in all caps in SQL Server?

How can I find column values that are in all caps? Like LastName = 'SMITH' instead of 'Smith'
Here is what I was trying...
SELECT *
FROM MyTable
WHERE FirstName = UPPER(FirstName)
You can force a case-sensitive collation:
select * from T
where fld = upper(fld) collate SQL_Latin1_General_CP1_CS_AS
Try
SELECT *
FROM MyTable
WHERE FirstName = UPPER(FirstName) COLLATE SQL_Latin1_General_CP1_CS_AS
This collation allows case sensitive comparisons.
If you want to change the collation of your database so that you don't need to specify a case-sensitive collation in your queries, you need to do the following (from MSDN):
1) Make sure you have all the information or scripts needed to re-create your user databases and all the objects in them.
2) Export all your data using a tool such as the bcp Utility.
3) Drop all the user databases.
4) Rebuild the master database specifying the new collation in the SQLCOLLATION property of the setup command. For example:
Setup /QUIET /ACTION=REBUILDDATABASE /INSTANCENAME=InstanceName
/SQLSYSADMINACCOUNTS=accounts /[ SAPWD= StrongPassword ]
/SQLCOLLATION=CollationName
5) Create all the databases and all the objects in them.
6) Import all your data.
You need to use a server collation which is case sensitive like so:
SELECT *
FROM MyTable
WHERE FirstName = UPPER(FirstName) Collate SQL_Latin1_General_CP1_CS_AS
By default, SQL comparisons are case-insensitive.
Try
SELECT *
FROM MyTable
WHERE FirstName = LOWER(FirstName)
Could you try using this as your where clause?
WHERE PATINDEX(FirstName + '%',UPPER(FirstName)) = 1
Have a look here.
It seems you have a few options:
cast the string to VARBINARY(length)
use COLLATE to specify a case-sensitive collation
calculate the BINARY_CHECKSUM() of the strings to compare
change the table column’s COLLATION property
use computed columns (implicit calculation of VARBINARY)
Try This
SELECT *
FROM MyTable
WHERE UPPER(FirstName) COLLATE Latin1_General_CS_AS = FirstName COLLATE Latin1_General_CS_AS
You can find good example in Case Sensitive Search: Fetching lowercase or uppercase string on SQL Server
I created a simple UDF for that:
create function dbo.fnIsStringAllUppercase(@input nvarchar(max)) returns bit
as
begin
    -- Not numeric, not blank, and identical to its upper-cased form under a case-sensitive collation
    if (ISNUMERIC(@input) = 0 AND RTRIM(LTRIM(@input)) > '' AND @input = UPPER(@input COLLATE Latin1_General_CS_AS))
        return 1;
    return 0;
end
Then you can easily use it on any column in the WHERE clause.
To use the OP example:
SELECT *
FROM MyTable
WHERE dbo.fnIsStringAllUppercase(FirstName) = 1
Simple way to answer this question is to use collation. Let me try to explain:
SELECT *
FROM MyTable
WHERE FirstName COLLATE SQL_Latin1_General_CP1_CS_AS = 'SMITH'
In the above query I have used COLLATE and didn't use any built-in SQL functions like UPPER, because using built-in functions has its own performance impact.
To understand this better, see the linked question:
performance impact of upper and collate

sql strictly equals, is there something? [duplicate]

Possible Duplicate:
SQL server ignore case in a where expression
Basically, I need to check something like this:
select * from users where name = @name and pass = @pass
The problem is that 'pass' = 'pAsS' evaluates to true.
Is there something stricter for string comparison in SQL (MS SQL Server)?
It's down to your collation, which it would seem is case-insensitive. For example, the standard collation is Latin1_General_CI_AS, where the CI means case-insensitive. You can force a different collation for a particular comparison:
select *
from users
where name = @name
and pass COLLATE Latin1_General_CS_AS = @pass COLLATE Latin1_General_CS_AS
Incidentally, you shouldn't be storing passwords in your database - you should be salting and hashing them.
As several others have already posted you can use collations in your query or change the collation of your "pass" column to be case sensitive. You may also change your query to use the VARBINARY type instead of changing collation:
SELECT * FROM users
WHERE name = @name
AND pass = @pass
AND CAST(pass AS VARBINARY(50)) = CAST(@pass AS VARBINARY(50))
Note that I left in the pass = @pass clause. Leaving this line in the query allows SQL Server to use any index on the pass column.
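The difference between the two predicates in that query can be pictured outside SQL Server as well. In this Python sketch (the function names are hypothetical, and casefold only approximates what a CI collation does), the first comparison ignores case, while the second compares the raw UTF-16 bytes, which is roughly what casting an nvarchar value to VARBINARY exposes.

```python
def case_insensitive_equal(a: str, b: str) -> bool:
    """Rough stand-in for a comparison under a case-insensitive collation."""
    return a.casefold() == b.casefold()


def binary_equal(a: str, b: str) -> bool:
    """Compare the UTF-16 little-endian bytes, as a binary comparison would."""
    return a.encode("utf-16-le") == b.encode("utf-16-le")


print(case_insensitive_equal("pass", "pAsS"))
print(binary_equal("pass", "pAsS"))
print(binary_equal("pAsS", "pAsS"))
```

Only the exact spelling passes the byte comparison.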
You need to use a case sensitive collation for the comparison:
SELECT * FROM users
WHERE name = @name
AND pass = @pass COLLATE SQL_Latin1_General_CP1_CS_AS
See this article for more details.
It's all to do with database collation.
This should help you:
select * from users where name = @name and pass = @pass COLLATE SQL_Latin1_General_CP1_CS_AS
There is some information here regarding collations in SQL Server
For a case-sensitive comparison you need to specify the collation in your query. Something like:
select * from users where name = @name and pass = @pass COLLATE SQL_Latin1_General_CP1_CS_AS
Use a binary collation to ensure an exact match.
WHERE pass = @pass COLLATE Latin1_General_BIN