DB2 SQL Case Insensitive

I'm executing the below DB2 SQL via SQL Server (so needs to be in DB2 SQL):
exec ('
select
TRIM (vhitno) AS "Item",
TRIM (mmitds) AS "Description",
TRIM (SUBSTRING (vhitno,12,4)) AS "Size",
vhalqt AS "Available"
from m3fdbtest.oagrln
left outer join m3fdbtest.mdeohe
on vhcono = uwcono
and vhcuno = uwcuno
and vhagno = uwagno
and vhitno = uwobv1
left outer join m3fdbtest.mitmas
ON vhcono = mmcono
AND vhitno = mmitno
where uwcono = 1
and uwstdt >= ?
and uwlvdt <= ?
and uwcuno = ''JBHE0001''
and uwagst = ''20''
and (vhitno LIKE ''%'' || ? || ''%''
or mmitds LIKE ''%'' || ? || ''%'')',
@From, @To, @Search, @Search) at M3_TEST_ODBC
However, DB2 is case sensitive - how do I make the two LIKES on mmitds and vhitno case insensitive?

You could use something like this:
where UPPER(mycol) like '%' || UPPER(?) || '%'
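Applied to the two predicates in your query it would look something like this (a sketch written as plain DB2 SQL; inside the exec('...') string the quotes would need doubling, as in the question):
and (UPPER(vhitno) LIKE '%' || UPPER(?) || '%'
or UPPER(mmitds) LIKE '%' || UPPER(?) || '%')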
Beware: This could affect index selection, but you can create an index like this:
create index MYINDEX on MYTABLE (UPPER(mycol))
If you were using SQL embedded in RPG, you could set the program to use case insensitive sorts and comparisons with
SET OPTION SRTSEQ=*LANGIDSHR;
To do this with JDBC, you need to set the following driver properties:
"sort" = "language"
"sort language" = Your language code, I use "ENU"
"sort weight" = "shared"
For an ODBC connection you need to have the following connection properties set:
SORTTYPE = 2
LANGUAGE = your language code, I use ENU
SORTWEIGHT = 0

This is a frequently asked question, so it is worth reading further; many articles cover it, and various approaches exist. The same principles apply to Db2 on IBM i (i-series) as to Db2 on Linux/Unix/Windows, even if the implementations vary.
If you lack access to make table changes (e.g. to add columns, indexes, etc.) then you may suffer the performance penalty of using UPPER() or LOWER() on the predicate columns: indexes on those columns may no longer be usable, and performance will be worse.
You should first verify if the relevant columns in the Db2 tables really have mixed-case values, and if they only have a single case then alter your query to ensure you compare against that case.
If the columns have mixed-case values and no fixed-case column (or UDF) exists, and if your query will be frequently run for a vital business purpose, then best advice is to ensure the table has an appropriate design (to support case insensitive comparisons) via any of a number of methods.
If regular-expression functions are available in your version of Db2, you might also consider using REGEXP_LIKE and a suitable regular expression.
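For example, a sketch of such a predicate (this assumes your Db2 level has REGEXP_LIKE, and that the optional third and fourth arguments are the start position and the match flags, with 'i' requesting case-insensitive matching - check the documentation for your version):
where REGEXP_LIKE(vhitno, ?, 1, 'i')
or REGEXP_LIKE(mmitds, ?, 1, 'i')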

Database setting
There is a database config setting you can set at database creation. It's based on unicode, though.
CREATE DATABASE yourDB USING COLLATE UCA500R1_S1
The default Unicode Collation Algorithm is implemented by the UCA500R1 keyword without any attributes. Since the default UCA cannot simultaneously encompass the collating sequence of every language supported by Unicode, optional attributes can be specified to customize the UCA ordering. The attributes are separated by the underscore (_) character. The UCA500R1 keyword and any attributes form a UCA collation name.
The Strength attribute determines whether accent or case is taken into account when collating or comparing text strings. In writing systems without case or accent, the Strength attribute controls similarly important features.
The possible values are: primary (1), secondary (2), tertiary (3), quaternary (4), and identity (I). To ignore:
accent and case, use the primary strength level
case only, use the secondary strength level
neither accent nor case, use the tertiary strength level
Almost all characters can be distinguished by the first three strength levels, therefore in most locales the default Strength attribute is set at the tertiary level. However if the Alternate attribute (described below) is set to shifted, then the quaternary strength level can be used to break ties among white space characters, punctuation marks, and symbols that would otherwise be ignored. The identity strength level is used to distinguish among similar characters, such as the MATHEMATICAL BOLD SMALL A character (U+1D41A) and the MATHEMATICAL ITALIC SMALL A character (U+1D44E).
Setting the Strength attribute to a higher level will slow down text string comparisons and increase the length of the sort keys.
Examples:
UCA500R1_S1 will collate "role" = "Role" = "rôle"
UCA500R1_S2 will collate "role" = "Role" < "rôle"
UCA500R1_S3 will collate "role" < "Role" < "rôle"
This worked for me. As you can see, ..._S2 ignores case, too.
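With such a case-insensitive collation in place, ordinary predicates ignore case without any UPPER() calls. A sketch with a made-up table and column:
SELECT * FROM mytable WHERE mycol = 'role'
-- under ..._S1 this also matches 'Role' and 'rôle'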
Using a newer standard version, it should look like this:
CREATE DATABASE yourDB USING COLLATE CLDR181_S1
Collation keywords:
UCA400R1 = Unicode Standard 4.0 = CLDR version 1.2
UCA500R1 = Unicode Standard 5.0 = CLDR version 1.5.1
CLDR181 = Unicode Standard 5.2 = CLDR version 1.8.1
If your database is already created, there is supposed to be a way to change the setting.
CALL SYSPROC.ADMIN_CMD( 'UPDATE DB CFG USING DB_COLLNAME UCA500R1_S1 ' );
I do have problems executing this, but for all I know it is supposed to work.
Generated table column
Another option is to generate an upper-case column:
CREATE TABLE t (
id INTEGER NOT NULL PRIMARY KEY,
str VARCHAR(500),
ucase_str VARCHAR(500) GENERATED ALWAYS AS ( UPPER(str) )
)#
INSERT INTO t(id, str)
VALUES ( 1, 'Some String' )#
SELECT * FROM t#
ID STR UCASE_STR
----------- ------------------------------------ ------------------------------------
1 Some String SOME STRING
1 record(s) selected.
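To keep the case-insensitive search indexable, you can also index the generated column and compare against it (a sketch continuing the example above; the index name is made up):
CREATE INDEX t_ucase_str_ix ON t (ucase_str)#
SELECT * FROM t WHERE ucase_str = UPPER('some string')#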

For me, using DB2/400 and connecting via PHP/PDO, I added a DSN to /QOpenSys/etc/odbc.ini with the following subset of connection options for shared weight, as specified by jmarkmurphy:
[DSN]
SortSequence = 2
LanguageID = ENU
SortWeight = 0
The IBM ODBC connection options are documented in IBM's ODBC driver documentation.

Related

Compare strings with trailing spaces in Firebird SQL?

I have an existing database with a table with a string[16] key field.
There are rows whose key ends with a space: "16 ".
I need to allow user to change from "16 " to e.g. "16" but also do a unique key check (i.e. the table does not have already a record with key="16").
I run the following query:
select * from plu__ where store=100 and plu_num = '16'
It returns the row with key="16 "!
How do I check for unique key so that keys with trailing spaces are not included?
EDIT: The DDL and the char_length
CREATE TABLE PLU__
(
PLU_NUM Varchar(16),
CAPTION Varchar(50),
...
string[16] - there is no such datatype in Firebird. There are CHAR(16) and VARCHAR(16) (and BLOB SUB_TYPE TEXT, but that is improbable here). So you are omitting some crucial points about your system: you are not working with Firebird directly, but through some undisclosed intermediate layer, and no one knows how opaque or transparent it is.
I suspect you or your system chose the CHAR datatype instead of VARCHAR, in which all data is right-padded with spaces to the maximum length. Or maybe the COLLATION of the column/table/database is such that trailing spaces do not matter.
Additionally, you may just be wrong. You claim that the row being selected contains the trailing blank, but I do not see it. For example, add CHAR_LENGTH(plu_num) to the columns in your SELECT and see what it returns.
Additionally, if plu_num is a number - should it not be an integer or int64 rather than text?
The bottom of your screenshot shows "(NONE)". I suspect that is the "connection charset". This is allowed for backward compatibility with programs made 20 years ago, but it is quite dangerous today. You have to consult your system documentation on how to set the connection charset to UTF-8, Windows-1250, or something else meaningful.
"How do I check for unique key so that keys with trailing spaces are not included?" - you do not. You simply cannot do this reliably, because different transactions and different programs make simultaneous connections. You would check, decide you are clear, but right before you insert your row some other connection could insert the same key. The gap between your two commands - checking and inserting - cannot be closed that way; anyone else can act inside it. This is called a race condition.
You have to ask the server to do the checks.
For example, you have to introduce UNIQUE CONSTRAINT on the pair of columns (store, plu_num). That way the server would refuse to store two rows with the same values in those columns, visible in the same transaction.
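A sketch of such a constraint, using the column names from your query (the constraint name is made up, and this assumes the store column shown in your SELECT exists on the table):
ALTER TABLE plu__ ADD CONSTRAINT uq_plu_store_num UNIQUE (store, plu_num);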
Additionally, is it even normal to have values with spaces? Convert the field to integer datatype and be safe.
Or if you want to keep it textual and non-numeric you still can
Introduce a CHECK constraint that trim(plu_num) is not distinct from plu_num (or, if plu_num is declared to the server as a NOT NULL column, then trim(plu_num) = plu_num). That way the server would refuse to store any value with spaces before or after the text.
In case the datatype or the collation of the column makes no difference when comparing texts with and without trailing spaces (and in case you cannot change that datatype or collation), you may try adding tokens, like ('+' || trim(plu_num) || '+') = ('+' || plu_num || '+')
Or, instead of that CHECK constraint, you can proactively remove those spaces: add a BEFORE INSERT OR UPDATE trigger on the table that does NEW.plu_num = TRIM(NEW.plu_num). Both options are sketched below.
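A sketch of those two options in Firebird syntax (the constraint and trigger names are made up; this assumes Firebird 2.x or later for TRIM and IS NOT DISTINCT FROM):
ALTER TABLE plu__ ADD CONSTRAINT chk_plu_num_trimmed
CHECK (TRIM(plu_num) IS NOT DISTINCT FROM plu_num);

SET TERM ^ ;
CREATE TRIGGER plu__bi_trim FOR plu__
ACTIVE BEFORE INSERT OR UPDATE POSITION 0
AS
BEGIN
  -- strip leading/trailing spaces before the row is stored
  NEW.plu_num = TRIM(NEW.plu_num);
END^
SET TERM ; ^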
Documentation:
https://www.firebirdsql.org/refdocs/langrefupd20-distinct.html
http://www.firebirdtest.com/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-ddl-tbl.html#fblangref25-ddl-tbl-constraints
http://www.firebirdtest.com/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-ddl-tbl.html#fblangref25-ddl-tbl-altradd
http://www.firebirdtest.com/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-ddl-trgr.html
http://www.firebirdtest.com/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-datatypes-chartypes.html
Also, a bit more verbose (in Russian; run it through http://www.translate.ru if needed):
http://firebirdsql.su/doku.php?id=constraint
http://firebirdsql.su/doku.php?id=alter_table
You may also check http://www.firebirdfaq.org/cat3/
Additionally, if you add the constraints to an existing table containing invalid data entered before you introduced those checks, you might trap yourself in a "non-restorable backup" situation. You would have to check for that and sanitize your old data so it abides by the newly introduced constraints.
Option #4 is explained in detail below. But this by itself seems like bad database design! One should not just "let people edit the number to remove trailing blanks"; one should design the database so that there are no numbers with trailing blanks and no way to insert them.
CREATE TABLE "_NEW_TABLE" (
ID INTEGER NOT NULL,
TXT VARCHAR(10)
);
Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
ID TXT CONCATENATION CHAR_LENGTH
1 1 _1_ 1
2 2 _2_ 1
4 1 _1 _ 2
5 2 _2 _ 2
7 1 _ 1_ 2
8 2 _ 2_ 2
Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
where txt = '2'
ID TXT CONCATENATION CHAR_LENGTH
2 2 _2_ 1
5 2 _2 _ 2
Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
where txt || '+' = '2+' -- WARNING - this PROHIBITS index use on txt column, if there is any
ID TXT CONCATENATION CHAR_LENGTH
2 2 _2_ 1
Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
where txt = '2' and char_length(txt) = char_length('2')

How do I get around case sensitive fields in SQL

My code is:
CURSOR get_party_description is
select party_name
from ifsapp.IDENTITY_PAY_INFO_ALL
where party_type = :NEW.PARTY_TYPE
and identity = identity_
:NEW.PARTY_TYPE is 'SUPPLIER' while the value in the field is 'Supplier'. This code pulls back no records, but if I change it to 'Supplier', it finds the record.
How do I change the search so that it does not have to match the case?
You can convert both the variable and the field to upper or lower case.
where UPPER(party_type) = UPPER(:NEW.PARTY_TYPE)
This might cause a table scan, as the index on the field would be case sensitive.
You can get around this by adding a generated column that is upper case and indexing that.
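Since the :NEW syntax suggests Oracle, a function-based index gives the same effect without adding a column; a sketch (the index name is made up, and this assumes IDENTITY_PAY_INFO_ALL is an indexable table rather than an IFS view over one):
CREATE INDEX identity_pay_info_ci_ix
ON ifsapp.IDENTITY_PAY_INFO_ALL (UPPER(party_type));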
Change both of your values to upper case. Example:
CURSOR get_party_description is
select party_name
from ifsapp.IDENTITY_PAY_INFO_ALL
where UPPER(party_type) = UPPER('SUPPLIER')
and identity = identity_
Besides converting both strings to the same case (upper or lower) and then comparing them for equality, some SQL dialects allow a case-insensitive comparison via the LIKE operator (whether LIKE ignores case depends on the collation; in Oracle it is case sensitive by default), as follows:
CURSOR get_party_description is
select party_name
from ifsapp.IDENTITY_PAY_INFO_ALL
where party_type LIKE :NEW.PARTY_TYPE
and identity = identity_

PostgreSQL upper function on the ascii 152 character ("ÿ")

On a Windows 7 platform, with PostgreSQL version 9.3.9, using PgAdmin as a client, the result of select upper on a column containing e.g. "ÿÿÿ" returns null. If four values are stored, e.g.,
"ada"
"john"
"mole"
"ÿÿÿ"
they all come back in upper case, except the row containing "ÿÿÿ"; this row
gives nothing back, null...
The database encoding scheme is UTF8 / UNICODE. The setting "client_encoding" has the same value, UNICODE.
Is this a setting issue in the database, an operating system issue, or a bug
in the database? Are there some recommended workarounds?
The result of:
select thecol, upper(thecol), upper(thecol) is null, convert_to(thecol, 'UTF8'), current_setting('server_encoding') from thetable where ...
is:
"Apps";"APPS";f;"Apps";"UTF8"
"All";"ALL";f;"All";"UTF8"
"Test";"TEST";f;"Test";"UTF8"
"ÿÿÿ";"";f;"\303\277\303\277\303\277";"UTF8"
The lc_ parts of pg_settings are:
"lc_collate";"Swedish_Sweden.1252";"Shows the collation order locale."
"lc_ctype";"Swedish_Sweden.1252";"Shows the character classification and case conversion locale."
"lc_messages";"Swedish_Sweden.1252";"Sets the language in which messages are displayed."
"lc_monetary";"Swedish_Sweden.1252";"Sets the locale for formatting monetary amounts."
"lc_numeric";"Swedish_Sweden.1252";"Sets the locale for formatting numbers."
The output of select * from pg_database is:
"template1";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";t;t;-1;12130;668;1‌​;1663;"{=c/postgres,postgres=CTc/postgres}"
"template0";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";t;f;-1;12130;668;1‌​;1663;"{=c/postgres,postgres=CTc/postgres}"
"postgres";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";f;t;-1;12130;668;1;‌​1663;""
The actual create database statement, for the 9.4.4 version, is:
CREATE DATABASE postgres
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'Swedish_Sweden.1252'
LC_CTYPE = 'Swedish_Sweden.1252'
CONNECTION LIMIT = -1;
My guess is that the upper function uses the LC_CTYPE setting of your database. The uppercase of LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF) is LATIN CAPITAL LETTER Y WITH DIAERESIS (U+0178), which isn't part of the Windows 1252 code page.
If you convert the string to a Unicode format first, the upper function might work as expected:
SELECT upper(convert_to(thecol, 'UTF8')) ...
You should probably use a different value for LC_CTYPE and LC_COLLATE. On Linux, you'd use sv_SE.UTF-8.
Nevertheless, I'd consider this a bug in Postgres. It would be better to leave ÿ as is if the upper case version can't be represented in the target character set.

How to compare strings in sql ignoring case?

How do I write a query in Oracle ignoring the case of the strings being compared? For example "angel", "Angel", "ANGEL", "angel", "AngEl" would all be equal when compared.
If you are matching the full value of the field use
WHERE UPPER(fieldName) = 'ANGEL'
EDIT: From your comment you want to use:
SELECT
RPAD(a.name, 10,'=') "Nombre del Cliente"
, RPAD(b.name, 12,'*') "Nombre del Consumidor"
FROM
s_customer a,
s_region b
WHERE
a.region_id = b.id
AND UPPER(a.name) LIKE '%SPORT%'
You could use the UPPER function:
SELECT *
FROM Customers
WHERE UPPER(LastName) = UPPER('AnGel')
You can use:
select * from your_table where upper(your_column) like '%ANGEL%'
Otherwise, you can use:
select * from your_table where upper(your_column) = 'ANGEL'
This will be more efficient if you are looking for an exact match, with no additional characters before or after the your_column value, as Gary Ray suggested in his comments.
Before comparing two or more strings, first execute the following commands:
alter session set NLS_COMP=LINGUISTIC;
alter session set NLS_SORT=BINARY_CI;
After those two statements are executed, string comparisons become case insensitive. For example, take two strings s1 = 'Apple' and s2 = 'apple': compared before executing the above statements they are treated as two different strings, but compared after executing the two ALTER statements, s1 and s2 are treated as the same string.
Reasons for using those two statements:
We need to set NLS_COMP=LINGUISTIC and NLS_SORT=BINARY_CI in order to use 10gR2 case insensitivity. Since these are session modifiable, it is not as simple as setting them in the initialization parameters. We can set them in the initialization parameters but they then only affect the server and not the client side.
More detail on Mr Dredel's answer and tuinstoel's comment.
The data in the column will be stored in its specific case, but you can change your session's case-sensitivity for matching.
You can change either the session or the database to use linguistic or case insensitive searching. You can also set up indexes to use particular sort orders.
eg
ALTER SESSION SET NLS_SORT=BINARY_CI;
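To let such comparisons still use an index, Oracle also supports linguistic indexes built with NLSSORT; a sketch with made-up table and column names:
CREATE INDEX emp_last_name_ci_ix
ON employees (NLSSORT(last_name, 'NLS_SORT=BINARY_CI'));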
Once you start getting into non-english languages, with accents and so on, there's additional support for accent-insensitive.
Some of the capabilities vary by version, so check out the Globalization documentation for your particular version of Oracle. The latest (11g) is here
SELECT STRCMP("string1", "string2");
this returns 0 if the strings are equal.
If string1 = string2, this function returns 0 (ignoring the case)
If string1 < string2, this function returns -1
If string1 > string2, this function returns 1
https://www.w3schools.com/sql/func_mysql_strcmp.asp
To avoid string conversions in the comparison, use COLLATE SQL_Latin1_General_CP1_CI_AS (SQL Server).
EXAMPLE:
SELECT UserName FROM Users
WHERE UserName COLLATE SQL_Latin1_General_CP1_CI_AS = 'Angel'
That will return any usernames, whether ANGEL, angel, or Angel, etc.
I don't recall the exact syntax, but you may be able to set the table column to be case insensitive. But be careful, because then you won't be able to match based on case any more; if you WANT 'cool' not to match 'CoOl', it will no longer be possible.

What is the best way to select string fields based on character ranges?

I need to add the ability for users of my software to select records by character ranges.
How can I write a query that returns all widgets from a table whose name falls in the range Ba-Bi for example?
Currently I'm using greater than and less than operators, so the above example would become:
select * from widget
where name >= 'ba' and name < 'bj'
Notice how I have "incremented" the last character of the upper bound from i to j so that "bike" would not be left out.
Is there a generic way to find the next character after a given character based on the field's collation or would it be safer to create a second condition?
select * from widget
where name >= 'ba'
and (name < 'bi' or name like 'bi%')
My application needs to support localization. How sensitive is this kind of query to different character sets?
I also need to support both MSSQL and Oracle. What are my options for ensuring that character casing is ignored no matter what language appears in the data?
Let's skip directly to localization. Would you say "aa" >= "ba" ? Probably not, but that is where it sorts in Sweden. Also, you simply can't assume that you can ignore casing in any language. Casing is explicitly language-dependent, with the most common example being Turkish: uppercase i is İ. Lowercase I is ı.
Now, your SQL DB defines the result of <, == etc by a "collation order". This is definitely language specific. So, you should explicitly control this, for every query. A Turkish collation order will put those i's where they belong (in Turkish). You can't rely on the default collation.
As for the "increment part", don't bother. Stick to >= and <=.
For MSSQL see this thread: http://bytes.com/forum/thread483570.html .
For Oracle, it depends on your Oracle version, as Oracle 10 now supports regex(p) like queries: http://www.psoug.org/reference/regexp.html (search for regexp_like ) and see this article: http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/rischert_regexp_pt1.html
HTH
Frustratingly, the Oracle substring function is SUBSTR(), whilst in SQL Server it's SUBSTRING().
You could write a simple wrapper around one or both of them so that they share the same function name + prototype.
Then you can just use
MY_SUBSTRING(name, 2) >= 'ba' AND MY_SUBSTRING(name, 2) <= 'bi'
or similar.
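A sketch of the Oracle side of such a wrapper (the name MY_SUBSTRING and its "first n characters" meaning are assumptions; a matching function would be needed on the SQL Server side):
CREATE OR REPLACE FUNCTION MY_SUBSTRING(p_value VARCHAR2, p_len NUMBER)
RETURN VARCHAR2
IS
BEGIN
  -- return the first p_len characters of p_value
  RETURN SUBSTR(p_value, 1, p_len);
END;
/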
You could use this...
select * from widget
where name Like 'b[a-i]%'
This will match any row where the name starts with b, the second character is in the range a to i, and any other characters follow.
I think that I'd go with something simple like appending a high-sorting string to the end of the upper bound. Something like:
select * from widget
where name >= 'ba' and name <= 'bi' || '~'
I'm not sure that would survive EBCDIC conversion though
You could also do it like this:
select * from widget
where left(name, 2) between 'ba' and 'bi'
If your criteria length changes (as you seemed to indicate in a comment you left), the query would need to have the length as an input also:
declare @CriteriaLength int
set @CriteriaLength = 4
select * from widget
where left(name, @CriteriaLength) between 'baaa' and 'bike'