"SELECT DISTINCT" ignores different cases

I need MS SQL Server 2000 to select distinct values from a table (the column in question is of type nvarchar).
Sometimes the same value appears with different casing, for example (pseudocode):
SELECT DISTINCT * FROM ("A", "a", "b", "B")
would return
A,b
But I do want (and do expect)
A,a,b,B
because they actually are different values.
How to solve this problem?

The column's collation is most likely set to a case-insensitive one.
You need to specify a case-sensitive collation in the query, something like this:
Select distinct col1 COLLATE SQL_Latin1_General_CP1_CS_AS
From dbo.myTable
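
The effect of the collation on DISTINCT can be tried out without a SQL Server instance. The following is only a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in: its NOCASE collation plays the role of a case-insensitive collation and BINARY the role of a case-sensitive one. The table and column names are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT COLLATE NOCASE)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)", [("A",), ("a",), ("b",), ("B",)]
)

# With the column's case-insensitive collation, 'A'/'a' and 'b'/'B' collapse.
insensitive = conn.execute("SELECT DISTINCT col1 FROM my_table").fetchall()

# Overriding the collation inside the query keeps all four values apart.
sensitive = conn.execute(
    "SELECT DISTINCT col1 COLLATE BINARY FROM my_table"
).fetchall()

distinct_insensitive_count = len(insensitive)
distinct_sensitive = sorted(row[0] for row in sensitive)
print(distinct_insensitive_count, distinct_sensitive)
```

In SQL Server the same idea is expressed with a collation name such as the one in the answer above.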

Not sure about MS SQL, but in MySQL you can use BINARY for this operation (PostgreSQL has no BINARY operator; its text comparisons are case-sensitive by default). Cast the column to binary like so:
SELECT DISTINCT BINARY(column1) from table1;
Just change column1 and table1 as per your schema.
Full example that works for me in MySQL 5.7, should work for others:
SELECT DISTINCT BINARY(gateway) from transactions;
Cheers!

SELECT DISTINCT
CasedTheColumn
FROM
(
SELECT TheColumn COLLATE LATIN1_GENERAL_BIN AS CasedTheColumn
FROM myTAble
)FOO
WHERE
CasedTheColumn IN ('A', 'a'...)

Try setting the collation of the column in question to something binary, e.g. utf8_bin. You can either do that in the SELECT statement itself or by changing your table structure directly (which means the collation doesn't have to be mapped each time the query is run, since the column is stored with the right collation internally).

Related

what is the maximum value we can use with IN operator in sql [duplicate]

I'm using the following code:
SELECT * FROM table
WHERE Col IN (123,123,222,....)
However, if I put more than ~3000 numbers in the IN clause, SQL throws an error.
Does anyone know if there's a size limit or anything similar?!!

Depending on the database engine you are using, there can be limits on the length of an instruction.
SQL Server has a very large limit:
http://msdn.microsoft.com/en-us/library/ms143432.aspx
Oracle, on the other hand, has a limit that is very easy to reach.
So, for large IN clauses, it's better to create a temp table, insert the values and do a JOIN. It works faster also.
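
The temp-table approach can be sketched as follows. This is only an illustration, assuming SQLite through Python's sqlite3 module rather than SQL Server or Oracle; the orders and wanted tables and their columns are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (col INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(n, f"row {n}") for n in range(1, 20001)],
)

# 5,000 ids: more than is comfortable to put in a literal IN (...) list.
wanted_ids = list(range(2, 10001, 2))

# Load the ids into a temporary table, then JOIN instead of using IN.
conn.execute("CREATE TEMP TABLE wanted (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted VALUES (?)", [(i,) for i in wanted_ids])

rows = conn.execute(
    "SELECT o.col, o.note FROM orders AS o JOIN wanted AS w ON w.id = o.col"
).fetchall()
matched = len(rows)
print(matched)
```

The statement text stays the same size no matter how many ids are loaded, which is what sidesteps the limit.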

There is a limit, but you can split your values into separate IN () blocks:
Select *
From table
Where Col IN (123,123,222,....)
or Col IN (456,878,888,....)
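
A variation on the same idea is to do the splitting in application code. The sketch below is an assumption-laden example (SQLite through Python's sqlite3, a hypothetical items table, an arbitrary chunk size of 500); unlike the query above, it runs one statement per block instead of joining the blocks with OR.

```python
import sqlite3


def fetch_in_chunks(conn, ids, chunk_size=500):
    """Run one IN (...) query per chunk of ids and collect all rows."""
    rows = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # One "?" placeholder per id in this chunk, e.g. "?,?,?".
        placeholders = ",".join("?" * len(chunk))
        query = f"SELECT col FROM items WHERE col IN ({placeholders})"
        rows.extend(conn.execute(query, chunk).fetchall())
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (col INTEGER)")
conn.executemany("INSERT INTO items VALUES (?)", [(n,) for n in range(1, 5001)])

found = fetch_in_chunks(conn, list(range(1, 3001)))
print(len(found))
```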

Parameterize the query and pass the ids in using a Table Valued Parameter.
For example, define the following type:
CREATE TYPE IdTable AS TABLE (Id INT NOT NULL PRIMARY KEY)
Along with the following stored procedure:
CREATE PROCEDURE sp__Procedure_Name
@OrderIDs IdTable READONLY
AS
SELECT *
FROM table
WHERE Col IN (SELECT Id FROM @OrderIDs)

Why not do a where IN a sub-select...
Pre-query into a temp table or something...
CREATE TABLE SomeTempTable AS
SELECT YourColumn
FROM SomeTable
WHERE UserPickedMultipleRecordsFromSomeListOrSomething
then...
SELECT * FROM OtherTable
WHERE YourColumn IN ( SELECT YourColumn FROM SomeTempTable )

Depending on your version, use a table valued parameter in 2008, or some approach described here:
Arrays and Lists in SQL Server 2005

For MS SQL 2016, passing ints into the in, it looks like it can handle close to 38,000 records.
select * from user where userId in (1,2,3,etc)

I solved this by simply using ranges
WHERE Col >= 123 AND Col <= 10000
then removed the unwanted records in that range by looping in the application code. It worked well for me because I was looping over the records anyway, and ignoring a couple of thousand records didn't make any difference.
Of course, this is not a universal solution, but it can work when most of the values between the min and max are required.

You did not specify the database engine in question; in Oracle, an option is to use tuples like this:
SELECT * FROM table
WHERE (Col, 1) IN ((123,1),(123,1),(222,1),....)
This ugly hack only works in Oracle SQL, see https://asktom.oracle.com/pls/asktom/asktom.search?tag=limit-and-conversion-very-long-in-list-where-x-in#9538075800346844400
However, a much better option is to use stored procedures and pass the values as an array.

You can use tuples like this:
SELECT * FROM table
WHERE (Col, 1) IN ((123,1),(123,1),(222,1),....)
There are no restrictions on number of these. It compares pairs.

Verify if the second character is a letter in SQL

I want to add a condition to my query requiring that the second character of a column's value is a letter.
How can I achieve this?
I've tried _[A-Z]% in the WHERE clause, but it is not working. I've also tried [A-Z]%.
Any input, please?

I think you want a MySQL query, like this:
SELECT * FROM table WHERE column REGEXP '^.[A-Za-z]'
or, for SQL Server:
select * from table where column like '_[a-zA-Z]%'

You can use regular expression matching in your query. For example:
SELECT * FROM `test` WHERE `name` REGEXP '^.[a-zA-Z].*';
That would match the name column from the test table against a regex that verifies if the second character is either a lowercase or uppercase alphabet letter.
Also see this SQL Fiddle for an example of data it does and doesn't match.

I agree with @Gordon Linoff: your ('_[A-Z]%') should work.
If it does not, kindly add some sample data to your question.
Declare @Table Table
(
TextCol Varchar(20)
)
Insert Into @Table(TextCol) Values
('23423cvxc43f')
,('2eD97S9')
,('sAgsdsf')
,('3Ss08008')
Select *
From @Table As t
Where t.TextCol Like '_[A-Z]%'
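
The same check can be tried on this sample data outside SQL Server. The following is a rough sketch, assuming SQLite through Python's sqlite3 module; SQLite's LIKE has no character classes, so the example uses its GLOB operator instead, where ? matches one character and * matches any run of characters. The table name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (text_col TEXT)")
conn.executemany(
    "INSERT INTO sample VALUES (?)",
    [("23423cvxc43f",), ("2eD97S9",), ("sAgsdsf",), ("3Ss08008",)],
)

# Second character is any ASCII letter.
any_letter = [
    r[0]
    for r in conn.execute(
        "SELECT text_col FROM sample WHERE text_col GLOB '?[A-Za-z]*' "
        "ORDER BY text_col"
    )
]

# Second character is an uppercase ASCII letter (GLOB is case sensitive).
upper_letter = [
    r[0]
    for r in conn.execute(
        "SELECT text_col FROM sample WHERE text_col GLOB '?[A-Z]*' "
        "ORDER BY text_col"
    )
]
print(any_letter, upper_letter)
```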

The use of '%[A-Z]%' suggests that you are using SQL Server. If so, you can do this using LIKE:
where col like '_[A-Z]%'
For LIKE patterns, _ represents any single character. If the first character needs to be a digit:
where col like '[0-9][A-Z]%'
EDIT:
The above doesn't work in DB2. Instead:
where substr(col, 2, 1) between 'A' and 'Z'

How to find rows that have a value that contains a lowercase letter

I'm looking for an SQL query that gives me all rows where ColumnX contains any lowercase letter (e.g. "1234aaaa5789"). Same for uppercase.

SELECT * FROM my_table
WHERE UPPER(some_field) != some_field
This should work with funny characters like åäöøüæï. You might need to use a language-specific utf-8 collation for the table.
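
A small sketch of this comparison, assuming SQLite through Python's sqlite3 module and a hypothetical my_table. One caveat for this stand-in: SQLite's built-in UPPER and LOWER only convert ASCII letters, so the example sticks to ASCII data and does not cover the accented characters mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (some_field TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("1234aaaa5789",), ("1234AAAA5789",), ("12345789",), ("MiXed",)],
)

# Rows that change when upper-cased must contain a lowercase letter.
has_lower = [
    r[0]
    for r in conn.execute(
        "SELECT some_field FROM my_table "
        "WHERE UPPER(some_field) != some_field ORDER BY some_field"
    )
]

# Rows that change when lower-cased must contain an uppercase letter.
has_upper = [
    r[0]
    for r in conn.execute(
        "SELECT some_field FROM my_table "
        "WHERE LOWER(some_field) != some_field ORDER BY some_field"
    )
]
print(has_lower, has_upper)
```

This works here because SQLite compares text byte by byte by default; on a database with a case-insensitive default collation the comparison itself needs a case-sensitive collation.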

SELECT * FROM my_table WHERE my_column = 'my string'
COLLATE Latin1_General_CS_AS
This would make a case sensitive search.
EDIT
As stated in kouton's and tormuto's comments, anyone who has problems with the collation below
COLLATE Latin1_General_CS_AS
should first check the default collation of their SQL Server instance, the database, and the column in question, and pass in the default collation with the query expression.

SELECT * FROM Yourtable
WHERE UPPER([column_NAME]) COLLATE Latin1_General_CS_AS !=[Column_NAME]

This is how I did it for utf8 encoded table and utf8_unicode_ci column, which doesn't seem to have been posted exactly:
SELECT *
FROM table
WHERE UPPER(column) != BINARY(column)

To search for all rows containing a lowercase letter:
SELECT *
FROM Test
WHERE col1
LIKE '%[abcdefghijklmnopqrstuvwxyz]%'
collate Latin1_General_CS_AS
Thanks Manesh Joseph

In MS SQL Server, use the COLLATE clause.
SELECT Column1
FROM Table1
WHERE Column1 COLLATE Latin1_General_CS_AS = 'casesearch'
Adding COLLATE Latin1_General_CS_AS makes the search case sensitive.
The default collation of a SQL Server installation, SQL_Latin1_General_CP1_CI_AS, is not case sensitive.
To change the collation of a column permanently, run the following query.
ALTER TABLE Table1
ALTER COLUMN Column1 VARCHAR(20)
COLLATE Latin1_General_CS_AS
To find out the collation of a table's columns, run the following stored procedure.
EXEC sp_help 'Table1'
Source : SQL SERVER – Collate – Case Sensitive SQL Query Search

I've done something like this to find the values that are entirely lowercase (that is, contain no uppercase letters).
SELECT *
FROM YourTable
where BINARY_CHECKSUM(lower(ColumnName)) = BINARY_CHECKSUM(ColumnName)

mysql> SELECT '1234aaaa578' REGEXP '^[a-z]';

I had to add BINARY to ColumnX to make the match case sensitive (without the ^ anchor, so that a lowercase letter anywhere in the value matches):
SELECT * FROM MyTable WHERE BINARY(ColumnX) REGEXP '[a-z]';

I'm not an expert on MySQL, but I would suggest you look at REGEXP. Note that the ^ anchor means this only matches values that start with a lowercase letter:
SELECT * FROM MyTable WHERE ColumnX REGEXP '^[a-z]';

In PostgreSQL you could use the ~ operator.
For example, you could search for all rows where col_a contains any lowercase letter:
select * from your_table where col_a ~ '[a-z]';
You could modify the regular expression according to your needs.
Regards,

--For Sql
SELECT *
FROM tablename
WHERE tablecolumnname LIKE '%[a-z]%';

Logically speaking Rohit's solution should have worked, but it didn't. I think SQL Management Studio messed up when trying to optimize this.
But by modifying the string before comparing them I was able to get the right results. This worked for me:
SELECT [ExternalId]
FROM [EquipmentSerialsMaster] where LOWER('0'+[ExternalId]) COLLATE Latin1_General_CS_AS != '0'+[ExternalId]

This works in Firebird SQL, it should work in any SQL queries I believe, unless the underlying connection is not case sensitive.
To find records with any lower case letters:
select * from tablename where upper(fieldname) <> fieldname
To find records with any upper case letters:
select * from tablename where lower(fieldname) <> fieldname

Sql trying to change case letter and group similar nvarchar values

I am using SQL Server 2008 and I'm trying to build a query that displays some overall results from a single table.
I want to display count(fieldname) for each date. For example, I want to know how often the name "izla" appears in the table for each date, but it could also be "IZLA" or "Izla", so I must find a way to group these together as one and get a single count for all three.
The problem is that if I try converting to uppercase or lowercase so that they are automatically considered the same, I run into this: when izla is converted to upper it becomes İZLA, and when IZLA is converted to lowercase it is displayed as ızla.
The big question is: how can I group this data together? Maybe the problem comes from using nvarchar, but I need the column type to stay like that (I can't change it).

When you group, you should use an Accent Insensitive collation. You can add this directly to your group by clause. The following is an example:
Declare @Temp Table(Data nvarchar(100))
Insert Into @Temp Values(N'izla')
Insert Into @Temp Values(N'İZLA')
Insert Into @Temp Values(N'IZLA')
Insert Into @Temp Values(N'Izla')
Select Data,
Count(*)
From @Temp
Group By Data
Select Data Collate Latin1_General_CI_AI,
Count(*)
From @Temp
Group By Data Collate Latin1_General_CI_AI
When you run this example, you will see that the first query creates two rows (with count 3 and count 1). The second query uses an accent-insensitive collation for the grouping, so all 4 items are grouped together.
I used Latin1_General_CI_AI in my example. I suggest you examine the collation of the column you are using and then use a collation that most closely matches by changing the AS on the end to AI.
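
Grouping under a collation, rather than converting case first, can be sketched per date as follows. This is only an approximation of the answer above, assuming SQLite through Python's sqlite3 module: its NOCASE collation is case-insensitive for ASCII letters only and is not accent-insensitive, so the dotted İ from the example is left out. The visits table, its columns, and the dates are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (name TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("izla", "2024-01-01"),
        ("IZLA", "2024-01-01"),
        ("Izla", "2024-01-01"),
        ("izla", "2024-01-02"),
        ("bora", "2024-01-02"),
    ],
)

# Group by day and by name under a case-insensitive collation, so the
# three spellings fall into one group without calling UPPER or LOWER.
counts = conn.execute(
    "SELECT day, COUNT(*) FROM visits "
    "GROUP BY day, name COLLATE NOCASE "
    "ORDER BY day, COUNT(*) DESC"
).fetchall()
print(counts)
```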

Try replacing ı and similar characters with their English equivalents after lowercasing.

This all comes down to collation, which is the way that the system sorts string data.
You could say something like:
SELECT *, COUNT(*) OVER (PARTITION BY fieldname COLLATE Latin1_General_CI_AI), COUNT(*) OVER (PARTITION BY fieldname COLLATE Latin1_General_CI_AS)
FROM yourtable
This will provide some nice figures for you around how many times each name appeared in the various formats. There are many collations, and you can search in Books Online for a complete list. You may also be interested in Latin1_General_BIN for example.
Rob
