I need to compare table counts for an Oracle schema to a SQL Server database. However, when I make my query, the results are always off because of the way each handles the underscore ('_') in terms of ordering. I've included an example of what I'm seeing below.
In Oracle:
SELECT FIELD1 FROM ORACLE_ORDER ORDER BY FIELD1 ASC;
Result:
'ABC'
'ABCD'
'ABC_D'
In SQL Server:
SELECT FIELD1 FROM SQL_ORDER ORDER BY FIELD1 ASC;
Result:
'ABC'
'ABC_D'
'ABCD'
As you can see above, Oracle and SQL Server treat the underscore differently when ordering. How can I modify either query (or environment) so that both order the same way?
On the SQL Server side, use the following:
Select * from SQL_ORDER
ORDER BY FIELD1 Collate SQL_Latin1_General_CP850_BIN
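SQL_Latin1_General_CP850_BIN is a binary collation, so strings are compared by their character code values rather than by dictionary rules. In this case the ASCII code of the underscore is 95, while A is 65 and Z is 90, so the underscore sorts after every upper-case letter. Remember that lower-case "a" has a higher value than upper-case "A", and so on.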
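To see why a binary comparison reproduces the Oracle ordering from the question, here is a minimal Python sketch. It assumes that comparing by code point, which is what Python's `sorted` does for `str` values, is a fair stand-in for a binary collation over plain ASCII data; it does not run against either database.

```python
# Code values that decide the order under a binary comparison.
codes = {ch: ord(ch) for ch in ("A", "Z", "_", "a")}

# Python compares str values code point by code point, which for plain
# ASCII data mirrors what a binary collation does.
values = ["ABC", "ABC_D", "ABCD"]
binary_order = sorted(values)

print(codes)
print(binary_order)
```

Because 'D' (68) is lower than '_' (95), 'ABCD' lands before 'ABC_D', as in the Oracle result.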
A simple way to achieve this is to apply the COLLATE SQL_Latin1_General_CP850_BIN clause in the ORDER BY:
SELECT * FROM (
SELECT 'ABC' AS TAB UNION
SELECT 'ABC_D' UNION
SELECT 'ABCD' UNION
SELECT 'ABC_' UNION
SELECT 'ABC' UNION
SELECT 'A_C' UNION
SELECT 'ABC_DE_FGH' UNION
SELECT 'ABCXDEYFGH') AS X
ORDER BY X.TAB COLLATE SQL_Latin1_General_CP850_BIN
I am currently working with an MS SQL database on Windows Server 2012.
I need to query only one column from a table to which I have read-only access; I cannot make any kind of changes.
The problem is that the name of the column is "Value".
My code is this:
SELECT 'Value' FROM table
If I add
`ORDER BY 'Value'`
the query returns an empty list of results.
Things I've tried already:
I tried replacing ' with " but this didn't work either.
I also tried writing SELECT * instead of SELECT Value.
Using the table name in the SELECT or ORDER BY clauses didn't help either.
You are claiming that this query:
SELECT 'Value'
FROM table
ORDER BY 'Value'
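is returning no rows. That's not quite correct. It returns an error, because SQL Server does not allow constant expressions as keys for ORDER BY (or GROUP BY, for that matter).
Do not use single quotes around a column name; they turn it into a string literal. If you really do want the literal, give it an alias and order by the alias: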
SELECT 'Value' as val
FROM table
ORDER BY val;
Or, if value is a column in the table:
SELECT t.Value
FROM table t
ORDER BY t.Value;
Value is not a reserved word in SQL Server, but if it were, you could escape it:
SELECT t.[Value]
FROM table t
ORDER BY t.[Value];
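The difference between the quoted literal and the column reference can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: the table name `demo` and its two rows are made up, and SQLite stands in for SQL Server here, so it shows only how the quoting is interpreted, not SQL Server's ORDER BY error.

```python
import sqlite3

# Hypothetical in-memory table with a column named Value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (Value TEXT)")
conn.executemany("INSERT INTO demo (Value) VALUES (?)", [("b",), ("a",)])

# Single quotes make a string literal: one copy of 'Value' per row.
literal_rows = conn.execute("SELECT 'Value' FROM demo").fetchall()

# An unquoted, qualified name refers to the column itself.
column_rows = conn.execute(
    "SELECT t.Value FROM demo t ORDER BY t.Value"
).fetchall()

print(literal_rows)
print(column_rows)
```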
It looks like your table has NULL values, and because of the ORDER BY, all the NULL values come first.
Try adding a filter like this:
select Value FROM table
where Value is not null and Value <> ''
order by Value
I have a table with a column 'DESCRIPTION'.
I would like to retrieve, using a regular expression, only the rows with at least one lower-case character.
I have tried
select * from MYTABLE t
WHERE REGEXP_LIKE (t.DESCRIPTION, '[a-z]');
but the result is equal to
select * from MYTABLE t
You may need to explicitly force a case-sensitive comparison:
select *
from MYTABLE t
WHERE REGEXP_LIKE (t.DESCRIPTION, '[a-z]', 'c')
From the Oracle documentation:
If you omit match_parameter, then:
The default case sensitivity is determined by the value of the NLS_SORT parameter
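The effect of case-insensitive matching on a pattern like `[a-z]` can be sketched in Python. This uses Python's `re` module rather than Oracle's REGEXP_LIKE, and the sample descriptions are made up, so treat it as an analogy for what the 'c' match parameter changes.

```python
import re

# Hypothetical sample values for the DESCRIPTION column.
descriptions = ["ALL CAPS", "Mixed Case", "lower", "1234"]

# Case-sensitive: only values containing a lower-case letter match.
case_sensitive = [d for d in descriptions if re.search(r"[a-z]", d)]

# Case-insensitive: [a-z] also matches upper-case letters, so every
# value containing any letter matches.
case_insensitive = [
    d for d in descriptions if re.search(r"[a-z]", d, re.IGNORECASE)
]

print(case_sensitive)
print(case_insensitive)
```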
I have a query to find duplicates in my data, but it also reports rows that are not duplicates, because it treats values that differ only in case as the same. For example, 'AABWcFABmAAAyWJAAb' and 'AABWcFABmAAAyWJAAB' are actually unique, since the two IDs hold different data. I have tried using COLLATE, but it didn't help. Please let me know if there is a built-in function or some other logic I can use.
Thank you in advance for the help.
select distinct
npa
,npanxx_row_id
,count(*)
from kjm.audit_309
where npanxx_row_id
COLLATE Latin1_General_CS_AS in (npanxx_row_id) --and NPANXX_ROW_ID = 'AABWcFABmAAAyWJAAB'
group by npa,npanxx_row_id
having count(*) >1
order by npa
Another option in SQL Server would potentially be BINARY_CHECKSUM(). This will detect differences in case.
select
your_column
, BINARY_CHECKSUM(your_column)
, COUNT(*)
FROM your_table
GROUP BY
your_column
, BINARY_CHECKSUM(your_column)
HAVING count(*) >1
If you're looking for duplicates, then this should work:
CREATE TABLE #case_sensitivity_training (my_str VARCHAR(40) NOT NULL)
INSERT INTO #case_sensitivity_training (my_str)
VALUES ('AABWcFABmAAAyWJAAb'), ('AABWcFABmAAAyWJAAB')
SELECT
my_str COLLATE SQL_Latin1_General_CP1_CS_AS,
COUNT(*)
FROM
#case_sensitivity_training
GROUP BY
my_str COLLATE SQL_Latin1_General_CP1_CS_AS
HAVING
COUNT(*) > 1
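The same idea, putting the collation on the GROUP BY key, can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite's NOCASE and BINARY collations stand in for SQL Server's case-insensitive and case-sensitive ones, and the table name `ids` and helper `duplicate_groups` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ids (my_str TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO ids (my_str) VALUES (?)",
    [("AABWcFABmAAAyWJAAb",), ("AABWcFABmAAAyWJAAB",)],
)

def duplicate_groups(collation):
    """Return the sizes of groups with more than one row under a collation."""
    sql = (
        "SELECT COUNT(*) FROM ids "
        f"GROUP BY my_str COLLATE {collation} HAVING COUNT(*) > 1"
    )
    return [row[0] for row in conn.execute(sql)]

# Case-insensitive grouping merges the two IDs into one false duplicate.
insensitive = duplicate_groups("NOCASE")
# Case-sensitive (binary) grouping keeps them apart: no duplicates.
sensitive = duplicate_groups("BINARY")

print(insensitive, sensitive)
```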
You've put the collation in the wrong spot if you want to use DISTINCT.
The following should work on SQL Server:
SELECT DISTINCT
(ColumnName) COLLATE sql_latin1_general_cp1_cs_as
From TableName
But the name of that collation may vary from RDBMS to RDBMS.
I have a scalar-valued function that returns a varchar containing the ASCII unit separator, CHAR(31). I am using this result as part of an ORDER BY clause and attempting to sort in ascending order.
My scalar-valued function returns results like the following (non-printable character spelled out for reference):
ABC
ABC (CHAR(31)) DEF
ABC (CHAR(31)) DEF (CHAR(31)) HIJ
I would expect that when I order by ascending the results would be the following:
ABC
ABCDEF
ABCDEFHIJ
Instead, I am seeing the complete opposite:
ABCDEFHIJ
ABCDEF
ABC
Now I am fairly certain that this has to do with the non-printable characters, but I am not sure why. Any idea as to why that is the case?
Thanks
The sort order is influenced by your collation settings. The following script, which explicitly uses Latin1_General_CI_AS as the collation, orders the items as you would expect.
;WITH q (Col) AS (
SELECT 'ABC' UNION ALL
SELECT 'ABC' + CHAR(31) + 'DEF' UNION ALL
SELECT 'ABC' + CHAR(31) + 'DEF' + CHAR(31) + 'HIJ'
)
SELECT *
FROM q
ORDER BY
Col COLLATE Latin1_General_CI_AS
What collation are you using? You can verify your current database collation settings with
SELECT DATABASEPROPERTYEX('master', 'Collation') SQLCollation;
I am able to duplicate this behavior in SQL Server 2008 R2 with collation set to SQL_Latin1_General_CP1_CI_AS.
If you cannot change your collation settings, set the field to nvarchar instead of varchar. This solved the issue for me.
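For reference, here is what a plain code-point comparison produces for these strings, sketched in Python. It illustrates only ordinal ordering, where a string sorts before any longer string it is a prefix of; it is not a model of how any particular SQL Server collation weights CHAR(31).

```python
# CHAR(31), the ASCII unit separator.
sep = chr(31)

rows = [
    "ABC" + sep + "DEF" + sep + "HIJ",
    "ABC",
    "ABC" + sep + "DEF",
]

# Python orders str values by code point, so a prefix sorts first.
ordinal = sorted(rows)

# Drop the separator to show the order the way the question prints it.
stripped = [r.replace(sep, "") for r in ordinal]
print(stripped)
```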
I just found some strange behavior in a database's "order by" clause. In string comparison, I expected characters such as '[' and '_' to be greater than Latin characters and digits such as 'I' or '2', given their order in the ASCII table. However, the sorting results from the "order by" clause differ from my expectation. Here's my test:
SQLite version 3.6.23
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table products(name varchar(10));
sqlite> insert into products values('ipod');
sqlite> insert into products values('iphone');
sqlite> insert into products values('[apple]');
sqlite> insert into products values('_ipad');
sqlite> select * from products order by name asc;
[apple]
_ipad
iphone
ipod
select * from products order by name asc;
name
...
[B#
_ref
123
1ab
...
This behavior is different from Java's string comparison (it cost me some time to track this issue down). I can reproduce it in both SQLite 3.6.23 and Microsoft SQL Server 2005. I did some web searching but could not find any related documentation. Could someone shed some light on it? Is it part of the SQL standard? Where can I find some information about this? Thanks in advance.
The concept of comparing and ordering the characters in a database is called collation.
How the strings are sorted depends on the collation, which is usually set in the server, client or session properties.
In MySQL:
SELECT *
FROM (
SELECT 'a' AS str
UNION ALL
SELECT 'A' AS str
UNION ALL
SELECT 'b' AS str
UNION ALL
SELECT 'B' AS str
) q
ORDER BY
str COLLATE UTF8_BIN
--
'A'
'B'
'a'
'b'
and
SELECT *
FROM (
SELECT 'a' AS str
UNION ALL
SELECT 'A' AS str
UNION ALL
SELECT 'b' AS str
UNION ALL
SELECT 'B' AS str
) q
ORDER BY
str COLLATE UTF8_GENERAL_CI
--
'a'
'A'
'b'
'B'
UTF8_BIN sorts characters according to their Unicode code points. Capital letters have lower code points and therefore go first.
UTF8_GENERAL_CI sorts characters according to their alphabetical position, disregarding case.
Collation is also important for indexes, since the indexes rely heavily on sorting and comparison rules.
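The two orderings above can be approximated in Python. This is a rough analogy rather than MySQL's implementation: plain `sorted` compares code points, much like a binary collation, and `str.casefold` as the sort key stands in for a case-insensitive collation.

```python
letters = ["a", "A", "b", "B"]

# Code point order: all capitals come before all lower-case letters.
binary_like = sorted(letters)

# Case-insensitive order: 'a' and 'A' compare equal, so Python's stable
# sort keeps them in input order. A database makes no such promise
# about the relative order of rows that compare equal.
ci_like = sorted(letters, key=str.casefold)

print(binary_like)
print(ci_like)
```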
The important keyword in this case is 'collation'. I have no experience with SQLite, but would expect it to be similar to other database engines in that you can define the collation to use for whole databases, single tables, per connection, etc.
Check your DB documentation for the options available to you.
The ASCII codes for lower-case characters such as 'i' are greater than the ones for '[' and '_':
'i': 105
'[': 91
'_': 95
However, try inserting upper-case characters, e.g. "IPOD" or "Iphone": those will come before "_" and "[" with the default binary collation.
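This can be checked against SQLite itself through Python's built-in sqlite3 module. The sketch below rebuilds the table from the question and adds one upper-case row, "IPOD", as suggested above; it relies on SQLite's default BINARY collation for the column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name VARCHAR(10))")
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [("ipod",), ("iphone",), ("[apple]",), ("_ipad",), ("IPOD",)],
)

# Default BINARY collation: 'I' (73) < '[' (91) < '_' (95) < 'i' (105).
ordered = [
    row[0]
    for row in conn.execute("SELECT name FROM products ORDER BY name ASC")
]
print(ordered)
```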