SQL Server - How to search case-insensitively where COLLATE Latin1_General_CS_AS is set

This follows on from a parent question. Thanks to Iamdave, part of the problem is solved; the remaining challenge is to make the search case-insensitive in a database where the collation COLLATE Latin1_General_CS_AS is already set.
I am using this query and it is not working as intended: it matches only TEST, and fails to match test or Test.
UPDATE dbo.BODYCONTENT
SET BODY = LTRIM(RTRIM(REPLACE(
REPLACE(
REPLACE(N' ' + CAST(BODY AS NVARCHAR(MAX))
+ N' ', ' ', '<>'), '>TEST<', '>Prod<'), '<>', ' ')))
FROM dbo.BODYCONTENT
WHERE BODY COLLATE Latin1_General_CI_AS LIKE '%TEST%' COLLATE Latin1_General_CI_AS;
How can I make the search string in the REPLACE function match case-insensitively?
Other queries and results:
UPDATE dbo.BODYCONTENT SET BODY =
ltrim(rtrim(replace(replace(
replace(N' ' + cast(BODY as nvarchar(max)) + N' ', ' ', '<>')
, '>Test<', '>Prod<'), '<>', ' ')))
from dbo.BODYCONTENT WHERE lower(BODY) like '%test%';
result: Argument data type ntext is invalid for argument 1 of lower function.

Based on the comments, it'd be easier to just use LOWER:
where lower(body) like '%test%'

What you have there should work, unless some assumption is being left out of the question (such as the column not actually being collated the way you think, or the test rows actually being absent).
You can do this a couple of ways. As scsimon pointed out, you could simply do a lower-case comparison. That's probably the most straightforward.
You can also explicitly collate the column, as you're doing. You shouldn't need to specifically collate the '%TEST%' string, though (unless I'm mistaken; on my machine it wasn't necessary, although I suppose default DB settings might negate this argument).
Finally, another option is to have a computed column on the table which is the case insensitive version of the field. That's essentially the same as the previous method, but it's part of the table definition instead.
declare @t table
(
    body nvarchar(max) collate Latin1_General_CS_AS,
    -- computed column that re-collates the base column case-insensitively
    body_Insensitive as body collate Latin1_General_CI_AS
)

insert into @t
values ('test'), ('Test'), ('TEST')

-- each of these returns all three rows
select * from @t where body collate Latin1_General_CI_AS like '%test%' collate Latin1_General_CI_AS;
select * from @t where lower(body) like '%test%'
select * from @t where body_Insensitive like '%TeSt%'
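None of the above changes the REPLACE itself, though. REPLACE performs its matching using the collation of its input, so to make the replacement case-insensitive you can re-collate the input expression. A minimal sketch, assuming the same token-wrapping approach as the question:

UPDATE dbo.BODYCONTENT
SET BODY = LTRIM(RTRIM(REPLACE(
               REPLACE(
                   REPLACE(N' ' + CAST(BODY AS NVARCHAR(MAX)) COLLATE Latin1_General_CI_AS + N' ',
                           ' ', '<>'),
                   N'>TEST<', N'>Prod<'),  -- now also matches >test<, >Test<, ...
               '<>', ' ')))
FROM dbo.BODYCONTENT
WHERE BODY COLLATE Latin1_General_CI_AS LIKE '%TEST%';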

Related

SQL select statement whitespace issue

The values in the table are USERNAME = 'ADMIN' and PASSWORD = 'ADMIN'.
1) The following query works fine:
SELECT * FROM TBL_USER
WHERE USERNAME='ADMIN'
AND PASSWORD COLLATE LATIN1_GENERAL_CS_AS=N'ADMIN'
2) If I add a space in front of the password:
SELECT * FROM TBL_USER
WHERE USERNAME='ADMIN'
AND PASSWORD COLLATE LATIN1_GENERAL_CS_AS=N' ADMIN'
This is also correct, as it returns a message saying the password is incorrect.
3) If I add a space at the end of the password:
SELECT * FROM TBL_USER
WHERE USERNAME='ADMIN'
AND PASSWORD COLLATE LATIN1_GENERAL_CS_AS=N'ADMIN '
This query should fail, but it doesn't: it retrieves data.
Can anyone help me with this? The third condition should fail, since the value in the table is 'ADMIN' and the value provided is 'ADMIN ' (with whitespace at the end).
Instead of using the = operator, use LIKE (without the % wildcard):
SELECT * FROM TBL_USER WHERE USERNAME='ADMIN'
AND PASSWORD COLLATE LATIN1_GENERAL_CS_AS LIKE N'ADMIN '
And here's why: SQL WHERE clause matching values with trailing spaces
This is the expected behaviour of trailing spaces:
SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2, <comparison predicate>, General rules #3) on how to compare strings
with spaces. The ANSI standard requires padding for the character
strings used in comparisons so that their lengths match before
comparing them. The padding directly affects the semantics of WHERE
and HAVING clause predicates and other Transact-SQL string
comparisons. For example, Transact-SQL considers the strings 'abc' and
'abc ' to be equivalent for most comparison operations.
The only exception to this rule is the LIKE predicate. When the right
side of a LIKE predicate expression features a value with a trailing
space, SQL Server does not pad the two values to the same length
before the comparison occurs. Because the purpose of the LIKE
predicate, by definition, is to facilitate pattern searches rather
than simple string equality tests, this does not violate the section
of the ANSI SQL-92 specification mentioned earlier.
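A quick illustration of that rule (a sketch you can run as-is):

-- '=' pads the shorter string before comparing; LIKE does not:
SELECT CASE WHEN 'abc' = 'abc '    THEN 'equal' ELSE 'not equal' END AS equals_check,  -- 'equal'
       CASE WHEN 'abc' LIKE 'abc ' THEN 'match' ELSE 'no match'  END AS like_check;    -- 'no match'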
I suggest you add another condition to your where clause:
And DATALENGTH(Password) = DATALENGTH(N'ADMIN ')
This will add another check to ensure the input value length is the same as the Database value.
Full example:
Declare @tblUser table
(
    Username nvarchar(50),
    Password nvarchar(50)
)

Insert into @tblUser
Values (N'ADMIN', N'ADMIN')

-- 1) exact match: returns the row
select *
From @tblUser
Where Username = N'ADMIN'
And Password Collate LATIN1_GENERAL_CS_AS = N'ADMIN'

-- 2) leading space: returns nothing
select *
From @tblUser
Where Username = N'ADMIN'
And Password Collate LATIN1_GENERAL_CS_AS = N' ADMIN'

-- 3) trailing space: returns nothing once the DATALENGTH check is added
select *
From @tblUser
Where Username = N'ADMIN'
And Password Collate LATIN1_GENERAL_CS_AS = N'ADMIN '
And DATALENGTH(Password) = DATALENGTH(N'ADMIN ')
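Why the DATALENGTH check works: unlike LEN, DATALENGTH counts trailing spaces (it returns bytes, so nvarchar values report two bytes per character). A quick check:

SELECT LEN(N'ADMIN ') AS len_result,               -- 5: LEN ignores trailing spaces
       DATALENGTH(N'ADMIN ') AS datalength_result; -- 12: 6 characters x 2 bytes (nvarchar)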
This will also work for you:
SELECT * FROM TBL_USER
WHERE USERNAME='ADMIN' AND PASSWORD COLLATE LATIN1_GENERAL_CS_AS=N'ADMIN ' AND LEN(PASSWORD) = LEN(REPLACE('admin ', ' ', '_'))
It fails if the user puts spaces at the end of the password: LEN ignores trailing spaces, but counts the underscores that replace them.
You can use the TRIM function (available from SQL Server 2017):
SELECT * FROM TBL_USER WHERE USERNAME=TRIM('ADMIN') AND PASSWORD COLLATE LATIN1_GENERAL_CS_AS=TRIM(N'ADMIN')
You can do a right trim in your check.
SELECT * FROM TBL_USER
WHERE USERNAME='ADMIN' AND PASSWORD COLLATE LATIN1_GENERAL_CS_AS=RTRIM(N'ADMIN ')

How do I identify the column(s) responsible for “String or binary data would be truncated.”

I have an INSERT statement which looks like this:
INSERT INTO CLIENT_TABLE
SELECT NAME, SURNAME, AGE FROM CONTACT_TABLE
My example above is a basic one, but is there a way to pass in a SELECT statement and then check the returned column values against the actual field sizes?
Checking LEN against every column isn't practical. I am looking for something automated.
My debugging approach for this kind of problem is to remove columns from the SELECT one by one; when the error stops, you know which column caused the truncation. Here are some tips on debugging:
Option 1: Start with the columns that hold the most characters, such as the VARCHAR ones. In your case I think NAME and SURNAME are the likely causes, since AGE is an integer and does not hold many characters.
Option 2: Investigate the columns in your final output. The final SELECT returns all columns and their values, so you can cross-check whether the values match what was entered in the UI, etc.
For example, comparing the Expected vs. Actual Output (shown as screenshots in the original answer), the truncated string turns out to be SURNAME.
NOTE: You can only use Option 2 if the query did not raise an execution error - that is, when the truncation did not produce an error but created an unexpectedly cut-off string, which we don't want.
If the query returns an error, your best choice is Option 1. It consumes more time, but it is worth it, because it is the surest way to find the exact column causing the truncation problem.
Once you have found the columns causing the problem, you can adjust the column sizes, or alternatively limit the user's input: you can add validation for users to avoid the truncation problem. It is up to you how the program should work, depending on your requirements.
My suggestions are based on my experience with this kind of situation.
Hope this answer will help you. :)
Check the max length for each field; this way you can identify the fields that exceed the char limit specified in your table, e.g. CLIENT_TABLE:
SELECT Max(Len(NAME)) MaxNamePossible
, Max(Len(SURNAME)) MaxSurNamePossible
, Max(Len(AGE)) MaxAgePossible
FROM CONTACT_TABLE
Compare the result with the CLIENT_TABLE design. For example, if "Name" in CLIENT_TABLE is of type Varchar(50) and the validation query above returns more than 50 chars, then the "Name" field is causing the overflow.
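To avoid reading the table definition by hand, a small metadata query (a sketch; it assumes the destination table is dbo.CLIENT_TABLE in the current database) lists the declared sizes to compare against:

-- Declared column sizes of the destination table, to set against the MAX(LEN(...)) results
SELECT c.name AS ColumnName,
       t.name AS TypeName,
       c.max_length  -- in bytes: nvarchar uses 2 bytes per character, -1 means MAX
FROM sys.columns AS c
JOIN sys.types AS t ON c.user_type_id = t.user_type_id
WHERE c.[object_id] = OBJECT_ID(N'dbo.CLIENT_TABLE')
ORDER BY c.column_id;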
There is a great answer by Aaron Bertrand to the question:
Retrieve column definition for stored procedure result set
If you use SQL Server 2012+, you can use sys.dm_exec_describe_first_result_set. Here is a nice article with examples. But even in SQL Server 2008 it is possible to retrieve the types of the columns of a query; Aaron's answer explains it in detail.
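For illustration, on 2012+ that route looks roughly like this (a sketch using the question's SELECT):

SELECT name, system_type_name, max_length
FROM sys.dm_exec_describe_first_result_set(
    N'SELECT NAME, SURNAME, AGE FROM CONTACT_TABLE', NULL, 0);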
In fact, in your case it is easier, since you have a SELECT statement that you can copy and paste, not something hidden in a stored procedure. I assume that your SELECT is a complex query returning columns from many tables. If it were just one table, you could use sys.columns with that table directly.
So, create an empty #tmp1 table based on your complex SELECT:
SELECT TOP(0)
NAME, SURNAME, AGE
INTO #tmp1
FROM CONTACT_TABLE;
Create a second #tmp2 table based on the destination of your complex SELECT:
SELECT TOP(0)
NAME, SURNAME, AGE
INTO #tmp2
FROM CLIENT_TABLE;
Note, that we don't need any rows, only columns for metadata, so TOP(0) is handy.
Once those #tmp tables exist, we can query their metadata using sys.columns and compare it:
WITH
CTE1
AS
(
SELECT
c.name AS ColumnName
,t.name AS TypeName
,c.max_length
,c.[precision]
,c.scale
FROM
tempdb.sys.columns AS c
INNER JOIN tempdb.sys.types AS t ON
c.system_type_id = t.system_type_id
AND c.user_type_id = t.user_type_id
WHERE
c.[object_id] = OBJECT_ID('tempdb.dbo.#tmp1')
)
,CTE2
AS
(
SELECT
c.name AS ColumnName
,t.name AS TypeName
,c.max_length
,c.[precision]
,c.scale
FROM
tempdb.sys.columns AS c
INNER JOIN tempdb.sys.types AS t ON
c.system_type_id = t.system_type_id
AND c.user_type_id = t.user_type_id
WHERE
c.[object_id] = OBJECT_ID('tempdb.dbo.#tmp2')
)
SELECT *
FROM
CTE1
FULL JOIN CTE2 ON CTE1.ColumnName = CTE2.ColumnName
WHERE
CTE1.TypeName <> CTE2.TypeName
OR CTE1.max_length <> CTE2.max_length
OR CTE1.[precision] <> CTE2.[precision]
OR CTE1.scale <> CTE2.scale
;
Another possible way to compare:
WITH
... as above ...
SELECT * FROM CTE1
EXCEPT
SELECT * FROM CTE2
;
Finally
DROP TABLE #tmp1;
DROP TABLE #tmp2;
You can tweak the comparison to suit your needs.
A manual solution is very quick if you are using SQL Server Management Studio (SSMS). First capture the table structure of your SELECT statement into a working table:
SELECT COL1, COL2, ... COL99 INTO dbo.zz_CONTACT_TABLE
FROM CONTACT_TABLE WHERE 1=0;
Then in SSMS, right-click your original destination table (CLIENT_TABLE) and script it as create to a new SSMS window. Then right-click your working table (zz_CONTACT_TABLE) and script the creation of this table to a second SSMS window. Arrange both windows side by side and check the columns of zz_CONTACT_TABLE against CLIENT_TABLE. Differences in length and out-of-order columns will be immediately seen, even if there are hundreds of output columns.
Finally drop your working table:
DROP TABLE dbo.zz_CONTACT_TABLE;
Regarding an automated solution, it is difficult to see how this could work. Basically you are comparing a destination table (or a subset of columns in a destination table) against the output of a SELECT statement. I suppose you could write a stored procedure that takes two varchar parameters: the name of the destination table and the SELECT statement that would populate it. But this would not handle the case where only some columns of the destination are populated, and it would be more work than the manual solution above.
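One more possibility: on recent SQL Server builds the engine can identify the column for you. SQL Server 2019 includes the table name, column name, and truncated value in the error message by default, and on SQL Server 2016 SP2/2017 (later cumulative updates) the same verbose message can be switched on with trace flag 460. A sketch, assuming the INSERT from the question:

-- Verbose truncation message (default on SQL Server 2019; needs trace flag 460 on 2016 SP2/2017):
DBCC TRACEON (460);
INSERT INTO CLIENT_TABLE
SELECT NAME, SURNAME, AGE FROM CONTACT_TABLE;
-- The error now reads: String or binary data would be truncated in table '...', column '...'. Truncated value: '...'.
DBCC TRACEOFF (460);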
Here is some code that compares the columns of two row-producing SQL statements. It takes as parameters two row-sets, each specified by server name, database name, and T-SQL query, so it can compare data in different databases and even on different SQL Servers.
--setup parameters
declare @Server1 as varchar(128)
declare @Database1 as varchar(128)
declare @Query1 as varchar(max)
declare @Server2 as varchar(128)
declare @Database2 as varchar(128)
declare @Query2 as varchar(max)
set @Server1 = '(local)'
set @Database1 = 'MyDatabase'
set @Query1 = 'select * from MyTable' --use a select
set @Server2 = '(local)'
set @Database2 = 'MyDatabase2'
set @Query2 = 'exec MyTestProcedure....' --or use a procedure
--calculate statement column differences
declare @SQLStatement1 as varchar(max)
declare @SQLStatement2 as varchar(max)
--escape embedded single quotes before building the dynamic SQL
set @Server1 = replace(@Server1,'''','''''')
set @Database1 = replace(@Database1,'''','''''')
set @Query1 = replace(@Query1,'''','''''')
set @Server2 = replace(@Server2,'''','''''')
set @Database2 = replace(@Database2,'''','''''')
set @Query2 = replace(@Query2,'''','''''')
CREATE TABLE #Qry1Columns(
[colorder] [smallint] NULL,
[ColumnName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[TypeName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[prec] [smallint] NULL,
[scale] [int] NULL,
[isnullable] [int] NULL,
[collation] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
CREATE TABLE #Qry2Columns(
[colorder] [smallint] NULL,
[ColumnName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[TypeName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[prec] [smallint] NULL,
[scale] [int] NULL,
[isnullable] [int] NULL,
[collation] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
set @SQLStatement1 =
'SELECT *
INTO #Qry1
FROM OPENROWSET(''SQLNCLI'',
''server=' + @Server1 + ';database=' + @Database1 + ';trusted_connection=yes'',
''select top 0 * from (' + @Query1 + ') qry'')
select colorder, syscolumns.name ColumnName, systypes.name TypeName, syscolumns.prec, syscolumns.scale, syscolumns.isnullable, syscolumns.collation
from tempdb.dbo.syscolumns
join tempdb.dbo.systypes
on syscolumns.xtype = systypes.xtype
where id = OBJECT_ID(''tempdb.dbo.#Qry1'')
order by 1'
insert into #Qry1Columns
exec(@SQLStatement1)
set @SQLStatement2 =
'SELECT *
INTO #Qry1
FROM OPENROWSET(''SQLNCLI'',
''server=' + @Server2 + ';database=' + @Database2 + ';trusted_connection=yes'',
''select top 0 * from (' + @Query2 + ') qry'')
select colorder, syscolumns.name ColumnName, systypes.name TypeName, syscolumns.prec, syscolumns.scale, syscolumns.isnullable, syscolumns.collation
from tempdb.dbo.syscolumns
join tempdb.dbo.systypes
on syscolumns.xtype = systypes.xtype
where id = OBJECT_ID(''tempdb.dbo.#Qry1'')
order by 1'
insert into #Qry2Columns
exec(@SQLStatement2)
select ISNULL( #Qry1Columns.colorder, #Qry2Columns.colorder) ColumnNumber,
#Qry1Columns.ColumnName ColumnName1,
#Qry1Columns.TypeName TypeName1,
#Qry1Columns.prec prec1,
#Qry1Columns.scale scale1,
#Qry1Columns.isnullable isnullable1,
#Qry1Columns.collation collation1,
#Qry2Columns.ColumnName ColumnName2,
#Qry2Columns.TypeName TypeName2,
#Qry2Columns.prec prec2,
#Qry2Columns.scale scale2,
#Qry2Columns.isnullable isnullable2,
#Qry2Columns.collation collation2
from #Qry1Columns
join #Qry2Columns
on #Qry1Columns.colorder=#Qry2Columns.colorder
You can tweak the final SELECT statement to highlight any differences you wish. You can also wrap this up in a procedure and put a nice little user interface on it, so that results are literally a cut and paste away.

compare s, t with ş, ţ in SQL Server

I followed this post How do I perform an accent insensitive compare (e with è, é, ê and ë) in SQL Server? but it doesn't help me with the "ş" and "ţ" characters.
This doesn't return anything if the city name is "iaşi":
SELECT *
FROM City
WHERE Name COLLATE Latin1_general_CI_AI LIKE '%iasi%' COLLATE Latin1_general_CI_AI
This also doesn't return anything if the city name is "iaşi" (notice the foreign ş in the LIKE pattern):
SELECT *
FROM City
WHERE Name COLLATE Latin1_general_CI_AI LIKE '%iaşi%' COLLATE Latin1_general_CI_AI
I'm using SQL Server Management Studio 2012.
My database and column collation is "Latin1_General_CI_AI", column type is nvarchar.
How can I make it work?
The characters you've specified aren't part of the Latin1 codepage, so they can't ever be compared in any other way than ordinal in Latin1_General_CI_AI. In fact, I assume that they don't really work at all in the given collation.
If you're only using one collation, simply use the correct collation (for example, if your data is turkish, use Turkish_CI_AI). If your data is from many different languages, you have to use unicode, and the proper collation.
However, there's an additional issue. In languages like Romanian or Turkish, ş is not an accented s, but rather a completely separate character - see http://collation-charts.org/mssql/mssql.0418.1250.Romanian_CI_AI.html. Contrast with eg. š which is an accented form of s.
If you really need ş to equal s, you have to replace the original character manually.
Also, when you're using unicode columns (nvarchar and the bunch), make sure you're also using unicode literals, ie. use N'%iasi%' rather than '%iasi%'.
In SQL Server 2008 collations versioned 100 were introduced.
Collation Latin1_General_100_CI_AI seems to do what you want.
The following should work:
SELECT * FROM City WHERE Name LIKE '%iasi%' COLLATE Latin1_General_100_CI_AI
Not the tidiest solution I guess, but if you know that it's just the "ş" and "ţ" characters that are the problem, would it be acceptable to do a replace?
SELECT *
FROM City
WHERE replace(replace(Name, N'ş', N's'), N'ţ', N't') COLLATE Latin1_general_CI_AI LIKE '%iasi%'
You just need to change the collation of the Name field before the LIKE operation. Check the test code below:
DECLARE @city TABLE ( NAME NVARCHAR(20) )

INSERT INTO @city
VALUES ( N'iaşi' )

SELECT *
FROM @city
WHERE name LIKE 'iasi'
--No rows returned

SELECT *
FROM @city
WHERE name COLLATE Latin1_general_CI_AI LIKE '%iasi%'
--Returns 1 row
This problem was haunting me for some time, until now, when I've finally figured it out.
Presuming your table or column is of SQL_Latin1_General_CP1_CI_AS collation, if you do:
update MyTable
set myCol = replace(myCol, N'ș', N's')
and
update MyTable
set myCol = replace(myCol, N'ț', N't')
the REPLACE function will not find these characters, because the "ș" produced by your keyboard (Romanian Standard layout) differs from the "ş" or "ţ" found in your database.
As a comparison, put them side by side: ţț and şș. You can see that they differ because in one variant the mark sits closer to the "s" or "t" character.
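A quick way to confirm that the two characters really are distinct (a sketch using UNICODE() to inspect the code points):

-- Visually similar, but different code points:
SELECT UNICODE(N'ş') AS s_cedilla,      -- 351 (U+015F, s with cedilla)
       UNICODE(N'ș') AS s_comma_below;  -- 537 (U+0219, s with comma below)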
Instead, you must do:
update MyTable
set myCol = replace(myCol, N'ş', N's')
and
update MyTable
set myCol = replace(myCol, N'ţ', N't')

How to find values in all caps in SQL Server?

How can I find column values that are in all caps? Like LastName = 'SMITH' instead of 'Smith'
Here is what I was trying...
SELECT *
FROM MyTable
WHERE FirstName = UPPER(FirstName)
You can force a case-sensitive collation:
select * from T
where fld = upper(fld) collate SQL_Latin1_General_CP1_CS_AS
Try
SELECT *
FROM MyTable
WHERE FirstName = UPPER(FirstName) COLLATE SQL_Latin1_General_CP1_CS_AS
This collation allows case sensitive comparisons.
If you want to change the collation of your database so you don't need to specify a case-sensitive collation in your queries, you need to do the following (from MSDN):
1) Make sure you have all the information or scripts needed to re-create your user databases and all the objects in them.
2) Export all your data using a tool such as the bcp Utility.
3) Drop all the user databases.
4) Rebuild the master database specifying the new collation in the SQLCOLLATION property of the setup command. For example:
Setup /QUIET /ACTION=REBUILDDATABASE /INSTANCENAME=InstanceName
/SQLSYSADMINACCOUNTS=accounts /[ SAPWD= StrongPassword ]
/SQLCOLLATION=CollationName
5) Create all the databases and all the objects in them.
6) Import all your data.
You need to use a case-sensitive collation, like so:
SELECT *
FROM MyTable
WHERE FirstName = UPPER(FirstName) Collate SQL_Latin1_General_CP1_CS_AS
By default, SQL comparisons are case-insensitive.
Try
SELECT *
FROM MyTable
WHERE FirstName = LOWER(FirstName)
Could you try using this as your where clause?
WHERE PATINDEX(FirstName + '%',UPPER(FirstName)) = 1
Have a look here. It seems you have a few options (the VARBINARY cast is sketched after this list):
cast the string to VARBINARY(length)
use COLLATE to specify a case-sensitive collation
calculate the BINARY_CHECKSUM() of the strings to compare
change the table column’s COLLATION property
use computed columns (implicit calculation of VARBINARY)
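A sketch of the VARBINARY option from that list (the cast length of 200 is an arbitrary choice; it must cover the column's declared size):

-- Binary comparison is inherently case-sensitive:
SELECT *
FROM MyTable
WHERE CAST(FirstName AS VARBINARY(200)) = CAST(UPPER(FirstName) AS VARBINARY(200))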
Try This
SELECT *
FROM MyTable
WHERE UPPER(FirstName) COLLATE Latin1_General_CS_AS = FirstName COLLATE Latin1_General_CS_AS
You can find a good example in Case Sensitive Search: Fetching lowercase or uppercase string on SQL Server.
I created a simple UDF for that:
create function dbo.fnIsStringAllUppercase(@input nvarchar(max)) returns bit
as
begin
    -- not purely numeric, not blank, and unchanged by a case-sensitive comparison with its UPPER form
    if (ISNUMERIC(@input) = 0 AND RTRIM(LTRIM(@input)) > '' AND @input = UPPER(@input COLLATE Latin1_General_CS_AS))
        return 1;
    return 0;
end
Then you can easily use it on any column in the WHERE clause.
To use the OP example:
SELECT *
FROM MyTable
WHERE dbo.fnIsStringAllUppercase(FirstName) = 1
A simple way to answer this question is to use a case-sensitive collation. Let me try to explain:
SELECT *
FROM MyTable
WHERE FirstName COLLATE SQL_Latin1_General_CP1_CS_AS = 'SMITH'
In the query above I used COLLATE and didn't use any built-in SQL functions like UPPER, because using built-in functions on the column has its own performance impact.
Please see the link below to understand this better:
performance impact of upper and collate

Writing SQL code: same functionality as Yell.com

Can anyone help me with trying to write SQL (MS SQL Server)? I must admit this is not my best skill.
What I want is exactly the same functionality as the search boxes on the Yell website, i.e.:
Search for a company type
AND/OR a company name
in a location
If anyone can suggest the SQL code needed to get the same functionality as Yell, that would be great.
Typically, one does something like this:
-- All these are NULL unless provided
DECLARE @CompanyType AS varchar(100) -- a bare 'varchar' would default to a single character
DECLARE @CompanyName AS varchar(100)
DECLARE @Town AS varchar(100)

SELECT *
FROM TABLE_NAME
WHERE (@CompanyType IS NULL OR COMPANY_TYPE_COLUMN LIKE '%' + @CompanyType + '%')
AND (@CompanyName IS NULL OR COMPANY_NAME_COLUMN LIKE '%' + @CompanyName + '%')
AND (@Town IS NULL OR TOWN_COLUMN LIKE '%' + @Town + '%')
Or this (matching only the start of the columns with the wildcard):
-- All these are NULL unless provided
DECLARE @CompanyType AS varchar(100)
DECLARE @CompanyName AS varchar(100)
DECLARE @Town AS varchar(100)

SELECT *
FROM TABLE_NAME
WHERE (@CompanyType IS NULL OR COMPANY_TYPE_COLUMN LIKE @CompanyType + '%')
AND (@CompanyName IS NULL OR COMPANY_NAME_COLUMN LIKE @CompanyName + '%')
AND (@Town IS NULL OR TOWN_COLUMN LIKE @Town + '%')
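A common refinement to this catch-all pattern (a suggestion, assuming SQL Server 2008 SP1 CU5 or later, where OPTION (RECOMPILE) handles variable values properly): recompiling per execution lets the optimizer discard the branches whose variables are NULL.

SELECT *
FROM TABLE_NAME
WHERE (@CompanyType IS NULL OR COMPANY_TYPE_COLUMN LIKE @CompanyType + '%')
AND (@CompanyName IS NULL OR COMPANY_NAME_COLUMN LIKE @CompanyName + '%')
AND (@Town IS NULL OR TOWN_COLUMN LIKE @Town + '%')
OPTION (RECOMPILE)  -- plan is rebuilt each call, using only the predicates that actually apply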
Can you provide the database layout (schema) that the SQL would run against? That would be necessary to give you an exact answer.
But generally speaking what you are looking for is
SELECT * FROM tablename WHERE companyType = 'type' OR companyName = 'companyName'
What you need first is not SQL code, but a database design. Only then does it make any sense to start writing SQL.
A simple table schema that matches Yell's functionality might be something like:
CREATE TABLE Company (
company_id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
company_name VARCHAR(255) NOT NULL,
location VARCHAR(255) NOT NULL
)
and then you'd search for it by name with SQL like:
SELECT * FROM Company WHERE company_name like '%text%'
or by location like:
SELECT * FROM Company WHERE location = 'Location'
Of course, a real-world location search would have to use either exact city and state, or a zip code lookup, or some intelligent combination thereof. And a real table would then have lots more fields, like descriptions, etc. But that's the basic idea.