PostgreSQL upper function on the ascii 152 character ("ÿ") - sql

On a Windows 7 platform, with PostgreSQL version 9.3.9, using PgAdmin as a client, the result of select upper on a column containing e.g. "ÿÿÿ", returns null. If three values are stored, e.g.,
"ada"
"john"
"mole"
"ÿÿÿ"
they all come back in upper case, except the row containing "ÿÿÿ"; this row
gives nothing back, null...
The database encoding scheme is UTF8 / UNICODE. The setting "client_encoding" has the same value, UNICODE.
Is this a setting issue in the database, an operating system issue, or a bug
in the database? Are there some recommended workarounds?
The result of:
select thecol, upper(thecol), upper(thecol) is null, convert_to(thecol, 'UTF8'), current_setting('server_encoding') from thetable where ...
is:
"Apps";"APPS";f;"Apps";"UTF8"
"All";"ALL";f;"All";"UTF8"
"Test";"TEST";f;"Test";"UTF8"
"ÿÿÿ";"";f;"\303\277\303\277\303\277";"UTF8"
The lc_ parts of pg_settings are:
"lc_collate";"Swedish_Sweden.1252";"Shows the collation order locale."
"lc_ctype";"Swedish_Sweden.1252";"Shows the character classification and case conversion locale."
"lc_messages";"Swedish_Sweden.1252";"Sets the language in which messages are displayed."
"lc_monetary";"Swedish_Sweden.1252";"Sets the locale for formatting monetary amounts."
"lc_numeric";"Swedish_Sweden.1252";"Sets the locale for formatting numbers."
The output of select * from pg_database is:
"template1";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";t;t;-1;12130;668;1‌​;1663;"{=c/postgres,postgres=CTc/postgres}"
"template0";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";t;f;-1;12130;668;1‌​;1663;"{=c/postgres,postgres=CTc/postgres}"
"postgres";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";f;t;-1;12130;668;1;‌​1663;""
The actual create database statement, for the 9.4.4 version, is:
CREATE DATABASE postgres
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'Swedish_Sweden.1252'
LC_CTYPE = 'Swedish_Sweden.1252'
CONNECTION LIMIT = -1;

My guess is that the upper function uses the LC_CTYPE setting of your database. The uppercase of LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF) is LATIN CAPITAL LETTER Y WITH DIAERESIS (U+0178) which isn't part of the Windows 1252 code page.
If you convert the string to a Unicode format first, the upper function might work as expected:
SELECT upper(convert_to(thecol, 'UTF8')) ...
You should probably use a different value for LC_CTYPE and LC_COLLATE. On Linux, you'd use sv_SE.UTF-8.
Nevertheless, I'd consider this a bug in Postgres. It would be better to leave ÿ as is if the upper case version can't be represented in the target character set.
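The code points and bytes involved can be checked outside the database. Below is a small Python sketch of that check; it uses Python's own Unicode tables, not PostgreSQL's locale-dependent upper(), and the variable names are made up for illustration.

```python
# Illustrative check of the characters discussed above. This is Python's
# Unicode handling, not PostgreSQL's locale-dependent upper().
lower = "\u00ff"           # LATIN SMALL LETTER Y WITH DIAERESIS
upper = lower.upper()      # Unicode uppercase mapping

print(f"U+{ord(upper):04X}")   # code point of the uppercase form

# UTF-8 bytes of the stored value, rendered as octal escapes in the same
# style as the convert_to(thecol, 'UTF8') output in the question.
utf8_bytes = (lower * 3).encode("utf-8")
octal_dump = "".join(f"\\{b:03o}" for b in utf8_bytes)
print(octal_dump)

# The uppercase form does not fit in ISO 8859-1 (Latin-1) ...
try:
    upper.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# ... while Python's cp1252 codec does assign it a single byte.
cp1252_byte = upper.encode("cp1252")
print(fits_latin1, cp1252_byte)
```

Since Python's cp1252 codec can encode U+0178, the exact failure inside the Windows locale functions may differ from the guess above; the sketch only confirms which code points and bytes are in play.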

Related

How to cast hex data string to a string db2 sql

How would you decode a hex string to get the value in text format by using a select statement?
For example my data in hex is:
4f004e004c005900200046004f00520020004200410043004b002d005500500020004f004e0020004c004500560045004c0020004f004e004500200046004f00520020004300520041004e00450053002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000
0000
I want to decode it to get the string value using a select statement.
The value of the above is "ONLY FOR BACK-UP ON LEVEL ONE FOR CRANES"
what I have tried is :
SELECT CAST('4f004e004c005900200046004f00520020004200410043004b002d005500500020004f004e0020004c004500560045004c0020004f004e004500200046004f00520020004300520041004e0045005300200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200020002000200
02000200020000000'
AS VARCHAR(30000) CCSID 37) from myschema.atable
The above sql returns the exact same hex string and not the decoded text string of "ONLY FOR BACK-UP ON LEVEL ONE FOR CRANES" what I expected.
Is it possible to do this with a cast? If it is what will the syntax be?
My problem that I have is a system stores text data in a blob field and I want to use a select statement to see what the text data is in the blob field.
Db : Db2 on Ibm
Edit:
I have managed to convert the string to the hex value by using:
select hex(cast('ONLY FOR BACK-UP ON LEVEL ONE FOR CRANES' as varchar(100) ccsid 1208))
FROM myschema.atable
This gives me the string in hex :
4F4E4C5920464F52204241434B2D5550204F4E204C4556454C204F4E4520464F52204352414E4553
Now somehow I need to do the inverse and get the value.
Thanks.
Edit
Using the answer from Daniel Lema, I tried the unhex function, but the result I got was:
|+<ßã|êâ ä.í&|+<áîá<|+áã|êäê +áë
Is this something to do with a CCSID? Or how should I convert the above to a readable string?
In case it helps, this is the table field definition; the field holding my data is GDTXFT, a BLOB:
I was able to take your shortened hex string and convert it to a valid EBCDIC string.
The problem I ran into is that the original hex code you receive comes in UTF-16LE (Thanks Tom Blodget). IBM's CCSID system does not have a distinction between UTF-16BE and UTF-16LE so I am at a loss there on how to convert it properly.
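To see why the original value looks like UTF-16LE, here is a minimal Python sketch (outside Db2, purely for illustration) that decodes the first few bytes of the hex string from the question. The sample is a shortened, hypothetical excerpt: the first four characters plus a little of the trailing padding.

```python
# Shortened excerpt of the hex string from the question: "ONLY", three
# padding spaces (2000) and the terminating 0000.
hex_sample = "4f004e004c005900" + "2000" * 3 + "0000"
raw = bytes.fromhex(hex_sample)

# Each character occupies two bytes with the low byte first (little-endian).
decoded = raw.decode("utf-16-le")
text = decoded.rstrip(" \x00")     # drop the padding and the terminator

# Reading the same bytes as big-endian gives unrelated characters.
wrong_order = raw.decode("utf-16-be")

print(repr(text))
```

The x'00' after every letter is simply the high byte of each UTF-16 code unit.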
If it is in UTF-8 as you generated later, the following would work for you. It's not the prettiest but throw it in a couple functions and it will work.
Create or replace function unpivothex (in_ varchar(30000))
returns table (Hex_ char(2), Position_ int)
return
with returnstring (ST , POS )
as
(Select substring(STR,1,2), 1
from table(values in_) as A(STR)
union all
Select nullif(substring(STR,POS+2,2),'00'), POS+2
from returnstring, table(values in_) as A(STR)
where POS+2 <= length(in_)
)
Select ST, POS
from returnstring
;
Create or replace function converthextostring
(in_string char(30000))
returns varchar(30000)
return
(select listagg(char(varbinary_format(B.Hex_),1)) within group(order by In_table.Position_)
from table(unpivothex(upper(in_string))) in_table
join table(unpivothex(hex(cast('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz ' as char(53) CCSID 1208)))) A on In_table.Hex_ = A.Hex_
join table(unpivothex(hex(cast('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz ' as char(53) CCSID 37)))) B on A.Position_ = B.Position_
);
Here is a version if you're not on at least V7R2 TR6 or V7R3 TR2.
Create or replace function converthextostring
(in_string char(30000))
returns varchar(30000)
return
(select xmlserialize(
xmlagg(
xmltext(cast(char(varbinary_format(B.Hex_),1) as char(1) CCSID 37))
order by In_table.Position_)
as varchar(30000))
from table(unpivothex(upper(in_string))) in_table
join table(unpivothex(hex(cast('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz ' as char(53) CCSID 1208)))) A on In_table.Hex_ = A.Hex_
join table(unpivothex(hex(cast('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz ' as char(53) CCSID 37)))) B on A.Position_ = B.Position_
);
I tried the following solution I found published by Marcin Rudzki at Convert HEX value to CHAR on DB2, tested in my own Db2 for LUW v11 with a small modification.
The solution consists of creating a function, just as Marcin suggested:
CREATE FUNCTION unhex(in VARCHAR(32000) FOR BIT DATA)
RETURNS VARCHAR(32000)
LANGUAGE SQL
CONTAINS SQL
DETERMINISTIC NO EXTERNAL ACTION
BEGIN ATOMIC
RETURN in;
END
To test the solution, let's create a HEXSAMPLE table with a HEXSTRING column loaded with the string representation of a hex sequence:
INSERT INTO HEXSAMPLE (HEXSTRING) VALUES ('4F4E4C5920464F52204241434B2D5550204F4E204C4556454C204F4E4520464F52204352414E4553')
Then execute the following query (this is where it differs from the original proposal):
SELECT UNHEX(CAST(HEXTORAW(HEXSTRING) AS VARCHAR(2000) FOR BIT DATA)) as TEXT, HEXSTRING FROM HEXSAMPLE
With result:
TEXT HEXSTRING
---------------------------------------- --------------------------------------------------------------------------------
ONLY FOR BACK-UP ON LEVEL ONE FOR CRANES 4F4E4C5920464F52204241434B2D5550204F4E204C4556454C204F4E4520464F52204352414E4553
I hope someone else can find a more direct solution. Also, if someone can explain why it works, it will be very interesting.
I question why you need to do this...
There are valid reasons to convert a hex string back to its character equivalent... for instance, somebody sends you a 32-byte string UUID and you want it back in its 16-byte binary form.
But there's no reason ONLY FOR BACK-UP ON LEVEL ONE FOR CRANES should have been transformed to hex.
I suspect you need to post a new question asking why you're not getting readable strings in the first place.
However, in answer to this question... IBM i has an MI function Convert Character to Hex (CVTCH) that is easily called from any ILE language. You could wrap that function call up into a user defined function in order to use it from SQL.
Note that you'll need to know what the hex string represents, EBCDIC, ASCII or Unicode, because you'll need to be able to tell the system what you've started with. From there, there are ways to convert between encodings.
Here's an article that shows how to call the MI function from RPG.
Utilizing MI Functions in RPG Programs
A more modern free form version of the prototype that takes advantage of enhancements to the CCSID keyword might look like
dcl-pr FromHex extproc('cvtch');
charString char(32767) ccsid(*UTF8) options(*varsize);
hexString char(65534) ccsid(*HEX) const options(*varsize);
hexStringLen int(10) value;
end-pr;
With the above prototype, the system will treat the character string that comes back as UTF8 (ccsid 1208). But all I'm doing is telling the system how to interpret the bytes that come back. If the string was actually EBCDIC, I'm going to get garbage.
I think you could even define the cvtch function directly as an external UDF without needing an ILE wrapper. I'd have to play around with that...
Disregard that idea...cvtch only has parameters, not a return value. Using an ILE wrapper is the best way to move the output parameter to a return value for use as a UDF.
The problem is that your original string is in ASCII format (actually with x'00' byte after each letter), and you have to convert it to EBCDIC.
Below is the solution for latin capital letters only:
select cast(translate(replace(mycol, x'00', x'')
, x'C1C2C3C4C5C6C7C8C9D1D2D3D4D5D6D7D8D9E2E3E4E5E6E7E8E940'
, x'4142434445464748494A4B4C4D4E4F505152535455565758595A20'
) as varchar(500) ccsid 37)
from mytab;
Every ASCII character is translated to the corresponding EBCDIC one.
x'00' symbols are removed.
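The ASCII versus EBCDIC mismatch can be reproduced outside Db2. The sketch below uses Python's cp037 codec as a stand-in for CCSID 37 (an assumption for illustration) on the first eight bytes of the shortened hex string from the question's edit.

```python
# First eight bytes ("ONLY FOR") of the hex string from the question's edit.
ascii_hex = "4F4E4C5920464F52"
raw = bytes.fromhex(ascii_hex)

as_ascii = raw.decode("ascii")    # the bytes really are ASCII
as_ebcdic = raw.decode("cp037")   # same bytes misread as EBCDIC: garbage

# A proper conversion re-encodes the text, which changes every byte.
ebcdic_hex = as_ascii.encode("cp037").hex().upper()

print(as_ascii)
print(as_ebcdic[:4])
print(ebcdic_hex)
```

The garbage produced by the misread starts with the same characters as the unhex result shown in the question, and the re-encoded bytes line up with the EBCDIC side of the translate table (for example D6 for O and 40 for space).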
cast (col_name as varchar(2000) ccsid ascii for sbcs data)

How can I get rid of having to prefix a WHERE query with 'N' for Unicode strings?

When searching for a string in our database where the column is of type nvarchar, specifying the 'N' prefix in the query nets some results. Leaving it out does not. I am trying the search for a Simplified Chinese string in a database that previously did not store any Chinese strings yet.
The EntityFramework application that uses the database correctly retrieves the strings, and the LINQ queries also work in the application. However, in SQL Server 2014 Management Studio, when I run an SQL query for the string, it does not show up unless I specify the 'N' prefix for Unicode (even though the column is of type nvarchar).
Works:
var text = from asd in Translations.TranslationStrings
where asd.Text == "嗄法吖无上几"
select asd;
MessageBox.Show(text.FirstOrDefault().Text);
Does not work:
SELECT *
FROM TranslationStrings
where Text = '嗄法吖无上几'
If I prefix the Chinese characters with 'N' it works.
Works:
SELECT *
FROM TranslationStrings
where Text = N'嗄法吖无上几'
Please excuse the Chinese characters, I just typed something random. My question is, is there something I can do to not have to include the 'N' prefix when doing a query?
Thank you very much!
As @sworkalot has mentioned below:
The default for .Net is Unicode, that's why you don't need to specify
it. This is not the case for Sql Manager.
If not specified Sql will assume that you work with ASCII according to
the collation specified in your DB.
Hence, when working from Sql Server you need to use N'
https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/
Check out these examples, pay close attention to the data types and the values being assigned:
DECLARE @Varchar VARCHAR(100) = '嗄'
DECLARE @VarcharWithN VARCHAR(100) = N'嗄' -- Has N prefix
DECLARE @NVarchar NVARCHAR(100) = '嗄'
DECLARE @NVarcharWithN NVARCHAR(100) = N'嗄' -- Has N prefix
SELECT
Varchar = @Varchar,
VarcharWithN = @VarcharWithN,
NVarchar = @NVarchar,
NVarcharWithN = @NVarcharWithN
SELECT
Varchar = CONVERT(VARBINARY, @Varchar),
VarcharWithN = CONVERT(VARBINARY, @VarcharWithN),
NVarchar = CONVERT(VARBINARY, @NVarchar),
NVarcharWithN = CONVERT(VARBINARY, @NVarcharWithN)
Results:
Varchar  VarcharWithN  NVarchar  NVarcharWithN
?        ?             ?         嗄

Varchar  VarcharWithN  NVarchar  NVarcharWithN
0x3F     0x3F          0x3F00    0xC455
The NVARCHAR data type stores 2 bytes for each character while VARCHAR only stores 1 (you can see this in the VARBINARY conversion in the 2nd SELECT). Since Chinese characters need 2 bytes each to be represented, you have to use NVARCHAR to store them. If you try to stuff them into a VARCHAR, they will be stored as ? and you will lose the original character information. This also happens in the 3rd example, because the literal doesn't have the N prefix, so it is converted to VARCHAR before the value is actually assigned to the variable.
This is why you need to add the N prefix when typing these characters as literals: it tells the SQL engine that you are typing characters that need a 2-byte representation. So if you are comparing against an NVARCHAR column, always add the N prefix. You can change the database collation, but it's recommended to always use the proper data type independent of the collation, so you don't run into problems when the same code runs on different databases.
If you could explain the reason why you want to omit the N prefix we might address that, although I believe there is no work around in this particular case.
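The byte values in the results above can be reproduced outside SQL Server. The following Python sketch assumes the database's default collation uses code page 1252 (an assumption, the question does not say) and mimics the lossy conversion with Python codecs; it is an illustration, not how SQL Server is implemented.

```python
ch = "\u55c4"   # the character 嗄 used in the examples above

# Literal without N: converted to the (assumed) code page 1252, where the
# character does not exist, so it degrades to a question mark.
as_varchar = ch.encode("cp1252", errors="replace")

# Literal with N assigned to NVARCHAR: stored as UTF-16 little-endian.
as_nvarchar = ch.encode("utf-16-le")

# Literal without N assigned to NVARCHAR: the already-lost '?' is widened.
lossy_nvarchar = as_varchar.decode("cp1252").encode("utf-16-le")

print(as_varchar.hex(), as_nvarchar.hex(), lossy_nvarchar.hex())
```

The three hex values correspond to the 0x3F, 0xC455 and 0x3F00 columns in the VARBINARY output.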
The default for .Net is Unicode, that's why you don't need to specify it.
This is not the case for Sql Manager.
If not specified, Sql will assume that you are working with ASCII according to the collation specified in your DB.
Hence, when working from Sql Server you need to use N'
https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/

DB2 SQL Case Insensitive

I'm executing the below DB2 SQL via SQL Server (so needs to be in DB2 SQL):
exec ('
select
TRIM (vhitno) AS "Item",
TRIM (mmitds) AS "Description",
TRIM (SUBSTRING (vhitno,12,4)) AS "Size",
vhalqt AS "Available"
from m3fdbtest.oagrln
left outer join m3fdbtest.mdeohe
on vhcono = uwcono
and vhcuno = uwcuno
and vhagno = uwagno
and vhitno = uwobv1
left outer join m3fdbtest.mitmas
ON vhcono = mmcono
AND vhitno = mmitno
where uwcono = 1
and uwstdt >= ?
and uwlvdt <= ?
and uwcuno = ''JBHE0001''
and uwagst = ''20''
and (vhitno LIKE ''%'' || ? || ''%''
or mmitds LIKE ''%'' || ? || ''%'')',
@From, @To, @Search, @Search) at M3_TEST_ODBC
However, DB2 is case sensitive - how do I make the two LIKES on mmitds and vhitno case insensitive?
You could use something like this:
where UPPER(mycol) like '%' || UPPER(?) || '%'
Beware: This could affect index selection, but you can create an index like this:
create index MYINDEX on MYTABLE (UPPER(mycol))
If you were using SQL embedded in RPG, you could set the program to use case insensitive sorts and comparisons with
SET OPTION SRTSEQ=*LANGIDSHR;
To do this with JDBC, you need to set the following driver properties:
"sort" = "language"
"sort language" = Your language code, I use "ENU"
"sort weight" = "shared"
For an ODBC connection you need to have the following connection properties set:
SORTTYPE = 2
LANGUAGE = your language code, I use ENU
SORTWEIGHT = 0
This is a FAQ, so maybe you should read more; for example, this article is one of many, and various approaches exist. The same principles apply to i-series as to Linux/Unix/Windows, even if the implementations vary.
If you lack access to make table-changes (e.g. to add columns, indexes etc) then you might suffer the performance penalties of using UPPER() or LOWER() on the predicate columns. This may result in indexes on those columns being unable to be used and worse performance.
You should first verify if the relevant columns in the Db2 tables really have mixed-case values, and if they only have a single case then alter your query to ensure you compare against that case.
If the columns have mixed-case values and no fixed-case column (or UDF) exists, and if your query will be frequently run for a vital business purpose, then best advice is to ensure the table has an appropriate design (to support case insensitive comparisons) via any of a number of methods.
If Regular expression functions are available in your version of Db2, you might also consider using REGEXP_LIKE and a suitable regular expression.
Database setting
There is a database config setting you can set at database creation. It's based on unicode, though.
CREATE DATABASE yourDB USING COLLATE UCA500R1_S1
The default Unicode Collation Algorithm is implemented by the UCA500R1 keyword without any attributes. Since the default UCA cannot simultaneously encompass the collating sequence of every language supported by Unicode, optional attributes can be specified to customize the UCA ordering. The attributes are separated by the underscore (_) character. The UCA500R1 keyword and any attributes form a UCA collation name.
The Strength attribute determines whether accent or case is taken into account when collating or comparing text strings. In writing systems without case or accent, the Strength attribute controls similarly important features.
The possible values are: primary (1), secondary (2), tertiary (3), quaternary (4), and identity (I). To ignore:
accent and case, use the primary strength level
case only, use the secondary strength level
neither accent nor case, use the tertiary strength level
Almost all characters can be distinguished by the first three strength levels, therefore in most locales the default Strength attribute is set at the tertiary level. However if the Alternate attribute (described below) is set to shifted, then the quaternary strength level can be used to break ties among white space characters, punctuation marks, and symbols that would otherwise be ignored. The identity strength level is used to distinguish among similar characters, such as the MATHEMATICAL BOLD SMALL A character (U+1D41A) and the MATHEMATICAL ITALIC SMALL A character (U+1D44E).
Setting the Strength attribute to higher level will slow down text string comparisons and increase the length of the sort keys.
Examples:
UCA500R1_S1 will collate "role" = "Role" = "rôle"
UCA500R1_S2 will collate "role" = "Role" < "rôle"
UCA500R1_S3 will collate "role" < "Role" < "rôle"
This worked for me. As you can see, ..._S2 ignores case, too.
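To make the three strength levels concrete, here is a rough Python approximation of the comparison keys. It is not the real Unicode Collation Algorithm and does not model ordering, only which strings compare equal at each level; the function names are invented for this sketch.

```python
import unicodedata


def strip_accents(s):
    """Remove combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def primary_key(s):
    """Roughly like _S1: ignores both accent and case."""
    return strip_accents(s).casefold()


def secondary_key(s):
    """Roughly like _S2: ignores case only."""
    return unicodedata.normalize("NFC", s).casefold()


def tertiary_key(s):
    """Roughly like _S3: ignores neither accent nor case."""
    return unicodedata.normalize("NFC", s)


words = ["role", "Role", "rôle"]
for key in (primary_key, secondary_key, tertiary_key):
    # Number of distinct keys at each level
    print(key.__name__, len({key(w) for w in words}))
```

With these keys the three words collapse to one value at the primary level, two at the secondary level and three at the tertiary level, matching the examples above.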
Using a newer standard version, it should look like this:
CREATE DATABASE yourDB USING COLLATE CLDR181_S1
Collation keywords:
UCA400R1 = Unicode Standard 4.0 = CLDR version 1.2
UCA500R1 = Unicode Standard 5.0 = CLDR version 1.5.1
CLDR181 = Unicode Standard 5.2 = CLDR version 1.8.1
If your database is already created, there is supposed to be a way to change the setting.
CALL SYSPROC.ADMIN_CMD( 'UPDATE DB CFG USING DB_COLLNAME UCA500R1_S1 ' );
I do have problems executing this, but for all I know it is supposed to work.
Generated table column
Another option is, e.g., generating an upper-case column:
CREATE TABLE t (
id INTEGER NOT NULL PRIMARY KEY,
str VARCHAR(500),
ucase_str VARCHAR(500) GENERATED ALWAYS AS ( UPPER(str) )
)#
INSERT INTO t(id, str)
VALUES ( 1, 'Some String' )#
SELECT * FROM t#
ID          STR                                  UCASE_STR
----------- ------------------------------------ ------------------------------------
          1 Some String                          SOME STRING

  1 record(s) selected.
For me, using db2/400 and connecting via php/pdo, I added a DSN to odbc.ini (/QOpenSys/etc/odbc.ini) with the following subset of connection options for shared weight, as specified by jmarkmurphy:
[DSN]
SortSequence = 2
LanguageID = ENU
SortWeight = 0
The IBM odbc connection options can be found here

Cannot insert character '≤' in SQL Server 2008

I have a SQL Server 2008 database and a nvarchar(256) field of a table. The crazy problem is that when I run this query:
update ruds_values_short_text
set value = '≤ asjdklasd'
where rud_id=12202 and field_code='detection_limit'
and then
select * from ruds_values_short_text
where rud_id=12202 and field_code='detection_limit'
I get this result:
12202 detection_limit = asjdklasd 11
You can see that the character ≤ has been transformed into =
It's an encoding-related problem, for sure; in fact, if I paste '≤' into Notepad++ it pastes '=', but I get '≤' when I convert from ANSI to UTF-8.
So I think I should write the query in UTF-8, but how? Thanks.
You need to use the N prefix so the literal is treated as Unicode rather than being treated as character data in the code page of your database's default collation.
update ruds_values_short_text
set value = N'≤ asjdklasd'
where rud_id=12202 and field_code='detection_limit'
Try update ruds_values_short_text set value = N'≤ asjdklasd' where rud_id=12202 and field_code='detection_limit'. The N prefix indicates that you are providing a national language (Unicode) literal, so the encoding is respected.
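As a quick check of why the literal without N cannot survive, here is a small Python sketch. It assumes the database's default collation uses code page 1252, which the question does not state.

```python
symbol = "\u2264"   # the character ≤

# Assumption: the default collation's code page is 1252.
try:
    symbol.encode("cp1252")
    representable = True
except UnicodeEncodeError:
    representable = False

# With the N prefix the literal stays Unicode (UTF-16 little-endian).
as_nvarchar = symbol.encode("utf-16-le")

print(representable, as_nvarchar.hex())
```

Python raises an error where SQL Server, judging by the result in the question, substitutes the similar-looking '=' instead; either way the original character is gone before it reaches the nvarchar column.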

Arabic SQL query (on Oracle DB) returns empty result

I have this query (that runs on Oracle 10g database):
SELECT ge.*, ge.concept AS glossarypivot
FROM s_glossary_entries ge
WHERE (ge.glossaryid = '161' OR ge.sourceglossaryid = '161')
AND (ge.approved != 0 OR ge.userid = 361)
AND concept like 'م%' ORDER BY ge.concept
The query must display all words that begin with the arabic letter "م"
but unfortunately, it returns empty result ..
However, if I run the same query on the same database which runs on MYSQL, it works well and displays the correct result ..
and also, if I run the same query with an english letter (m), like this:
SELECT ge.*, ge.concept AS glossarypivot
FROM s_glossary_entries ge
WHERE (ge.glossaryid = '161' OR ge.sourceglossaryid = '161')
AND (ge.approved != 0 OR ge.userid = 361)
AND concept like 'm%' ORDER BY ge.concept
it displays result correctly and not empty !!
What should I do in order to get this query working the right way on oracle 10 database?
P.S. the oracle database character set is : "AL32UTF8"
thank you so much in advance
Sure that this works in MySQL? I would do this part:
AND concept = 'م'
like this:
AND concept LIKE 'م%'
or, because it's Arabic and the first character is the rightmost one, like this:
AND concept LIKE '%م'
But I have no idea whether Oracle even has LIKE; I have never worked with Oracle.
if I put a UTF8 character : " ظ… " instead of the arabic character "م" it will work on oracle ...
The obvious question is, do you have matching data.
You can use SELECT DUMP(concept), DUMP('م') FROM ... to see the bytes that actually form the value. My database gives me 217/133. I believe there are some characters which can have different bytes in UTF-8 but the same physical appearance, though I couldn't say whether this is one of them.
Also, consult the Globalization guide.
I think it's a mismatch in your Oracle client code page. It should be defined with the same character set as the database, otherwise some character conversion will take place.
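The byte values and the odd " ظ… " string mentioned earlier fit together if the client is using an Arabic Windows code page. The Python sketch below assumes Windows-1256 on the client side (a guess, the thread does not say) and is only an illustration outside Oracle.

```python
letter = "\u0645"   # the Arabic letter م

# UTF-8 bytes, as DUMP() would report them for an AL32UTF8 database.
utf8_bytes = letter.encode("utf-8")
print(list(utf8_bytes))

# Assumption: the client interprets those bytes as Windows-1256.
misread = utf8_bytes.decode("cp1256")
print(misread)

# In Windows-1256 the letter itself is a single, different byte.
cp1256_byte = letter.encode("cp1256")
print(cp1256_byte.hex())
```

Under that assumption, the two-character string that "works" is just the UTF-8 bytes of the letter viewed through the wrong code page, which points towards the client character set setting rather than the data.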