Cannot insert character '≤' in SQL Server 2008

I have a SQL Server 2008 database with an nvarchar(256) column in one of its tables. The crazy problem is that when I run this query:
update ruds_values_short_text
set value = '≤ asjdklasd'
where rud_id=12202 and field_code='detection_limit'
and then
select * from ruds_values_short_text
where rud_id=12202 and field_code='detection_limit'
I get this result:
12202 detection_limit = asjdklasd 11
You can see that the character ≤ has been transformed into =.
It's surely an encoding problem: if I paste '≤' into Notepad++ it pastes '=', but I get '≤' after converting the file from ANSI to UTF-8.
So I think I should write the query in UTF-8, but how? Thanks.

You need to use the N prefix so the literal is treated as Unicode rather than being treated as character data in the code page of your database's default collation.
update ruds_values_short_text
set value = N'≤ asjdklasd'
where rud_id=12202 and field_code='detection_limit'

Try update ruds_values_short_text set value = N'≤ asjdklasd' where rud_id=12202 and field_code='detection_limit'. The N prefix marks the literal as national character (Unicode) data, so the character is preserved instead of being converted to the database's code page.
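To see why the literal without N loses the character, here is a minimal Python sketch. It assumes the database's default collation uses code page 1252, which the question does not state; fits_code_page is a hypothetical helper name. Python raises an error for an unmappable character, whereas SQL Server appears to substitute a best-fit look-alike ('='), so the sketch only shows that '≤' has no byte in that code page, not the substitution itself.

```python
# Assumption: the database's default collation uses Windows code page 1252.
def fits_code_page(text: str, code_page: str = "cp1252") -> bool:
    """Return True if every character of text has a byte in the code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False

print(fits_code_page("≤ asjdklasd"))   # False: '≤' (U+2264) is not in cp1252
print(fits_code_page("= asjdklasd"))   # True: plain ASCII always fits
print("≤".encode("utf-16-le").hex())   # 6422: the two-byte form nvarchar can hold
```

With the N prefix the literal stays in the two-byte form from the start, so no lossy conversion takes place.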

Related

How can I get rid of having to prefix a WHERE query with 'N' for Unicode strings?

When searching for a string in our database where the column is of type nvarchar, specifying the 'N' prefix in the query nets some results. Leaving it out does not. I am trying to search for a Simplified Chinese string in a database that previously did not store any Chinese strings.
The EntityFramework application that uses the database correctly retrieves the strings, and the LINQ queries also work in the application. However, in SQL Server 2014 Management Studio, when I run an SQL query for the string, it does not show up unless I specify the 'N' prefix for Unicode (even though the column is of nvarchar type).
Works:
var text = from asd in Translations.TranslationStrings
where asd.Text == "嗄法吖无上几"
select asd;
MessageBox.Show(text.FirstOrDefault().Text);
Does not work:
SELECT *
FROM TranslationStrings
where Text = '嗄法吖无上几'
If I prefix the Chinese characters with 'N' it works.
Works:
SELECT *
FROM TranslationStrings
where Text = N'嗄法吖无上几'
Please excuse the Chinese characters, I just typed something random. My question is, is there something I can do to not have to include the 'N' prefix when doing a query?
Thank you very much!
As @sworkalot has mentioned below:
The default for .NET is Unicode, that's why you don't need to specify
it. This is not the case for SQL Manager.
If not specified, SQL Server will assume that you work with ASCII according to
the collation specified in your DB.
Hence, when working from SQL Server you need to use N'
https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/
Check out these examples, pay close attention to the data types and the values being assigned:
DECLARE @Varchar VARCHAR(100) = '嗄'
DECLARE @VarcharWithN VARCHAR(100) = N'嗄' -- Has N prefix
DECLARE @NVarchar NVARCHAR(100) = '嗄'
DECLARE @NVarcharWithN NVARCHAR(100) = N'嗄' -- Has N prefix
SELECT
Varchar = @Varchar,
VarcharWithN = @VarcharWithN,
NVarchar = @NVarchar,
NVarcharWithN = @NVarcharWithN
SELECT
Varchar = CONVERT(VARBINARY, @Varchar),
VarcharWithN = CONVERT(VARBINARY, @VarcharWithN),
NVarchar = CONVERT(VARBINARY, @NVarchar),
NVarcharWithN = CONVERT(VARBINARY, @NVarcharWithN)
Results:
Varchar  VarcharWithN  NVarchar  NVarcharWithN
?        ?             ?         嗄
Varchar  VarcharWithN  NVarchar  NVarcharWithN
0x3F     0x3F          0x3F00    0xC455
The NVARCHAR data type stores 2 bytes per character (one UTF-16 code unit), while VARCHAR stores 1 byte per character in a single-byte code page (you can see this in the VARBINARY casts of the 2nd SELECT). Chinese characters cannot be represented in such a code page, so you have to use NVARCHAR to store them. If you try to stuff them into a VARCHAR, they are stored as ? and the original character information is lost. This also happens in the 3rd example: the literal has no N prefix, so it is converted to VARCHAR before the value is assigned to the variable.
This is why you need to add the N prefix when typing these characters as literals, so the SQL engine knows that the literal needs the 2-byte representation. So if you are comparing against an NVARCHAR column, always add the N prefix. You can change the database collation, but it's recommended to always use the proper data type regardless of the collation, so you don't run into problems when running the same code against different databases.
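The byte values in the results above can be reproduced outside SQL Server. The Python sketch below is only an illustration: as_varchar and as_nvarchar are hypothetical helper names, and code page 1252 is an assumed stand-in for whatever code page the database collation actually uses.

```python
# Assumption: code page 1252 stands in for the database's VARCHAR code page.
def as_varchar(text: str, code_page: str = "cp1252") -> bytes:
    """Encode like a VARCHAR value; unmappable characters become '?'."""
    return text.encode(code_page, errors="replace")

def as_nvarchar(text: str) -> bytes:
    """Encode like an NVARCHAR value (UTF-16, little-endian)."""
    return text.encode("utf-16-le")

ch = "\u55c4"  # the character 嗄

varchar_bytes = as_varchar(ch)
# A literal without N is narrowed to the code page first, and only then
# widened again when it is assigned to an NVARCHAR variable.
nvarchar_no_prefix = as_nvarchar(varchar_bytes.decode("cp1252"))
nvarchar_with_prefix = as_nvarchar(ch)

print(varchar_bytes.hex())         # 3f
print(nvarchar_no_prefix.hex())    # 3f00
print(nvarchar_with_prefix.hex())  # c455
```

The middle case is the one that surprises people: the target type is wide enough, but the character was already replaced by '?' before the assignment happened.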
If you could explain the reason why you want to omit the N prefix we might address that, although I believe there is no work around in this particular case.
The default for .NET is Unicode, that's why you don't need to specify it.
This is not the case for SQL Manager.
If not specified, SQL Server will assume that you work with ASCII according to the collation specified in your DB.
Hence, when working from SQL Server you need to use N'
https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/

PostgreSQL upper function on the ascii 152 character ("ÿ")

On a Windows 7 platform, with PostgreSQL version 9.3.9, using PgAdmin as a client, the result of select upper on a column containing e.g. "ÿÿÿ", returns null. If three values are stored, e.g.,
"ada"
"john"
"mole"
"ÿÿÿ"
they all come back in upper case, except the row containing "ÿÿÿ"; this row
gives nothing back, null...
The database encoding scheme is UTF8 / UNICODE. The setting "client_encoding" has the same value, UNICODE.
Is this a setting issue in the database, an operating system issue, or a bug
in the database? Are there some recommended workarounds?
The result of:
select thecol, upper(thecol), upper(thecol) is null, convert_to(thecol, 'UTF8'), current_setting('server_encoding') from thetable where ...
is:
"Apps";"APPS";f;"Apps";"UTF8"
"All";"ALL";f;"All";"UTF8"
"Test";"TEST";f;"Test";"UTF8"
"ÿÿÿ";"";f;"\303\277\303\277\303\277";"UTF8"
The lc_ parts of pg_settings are:
"lc_collate";"Swedish_Sweden.1252";"Shows the collation order locale."
"lc_ctype";"Swedish_Sweden.1252";"Shows the character classification and case conversion locale."
"lc_messages";"Swedish_Sweden.1252";"Sets the language in which messages are displayed."
"lc_monetary";"Swedish_Sweden.1252";"Sets the locale for formatting monetary amounts."
"lc_numeric";"Swedish_Sweden.1252";"Sets the locale for formatting numbers."
The output of select * from pg_database is:
"template1";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";t;t;-1;12130;668;1;1663;"{=c/postgres,postgres=CTc/postgres}"
"template0";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";t;f;-1;12130;668;1;1663;"{=c/postgres,postgres=CTc/postgres}"
"postgres";10;6;"Swedish_Sweden.1252";"Swedish_Sweden.1252";f;t;-1;12130;668;1;1663;""
The actual create database statement, for the 9.4.4 version, is:
CREATE DATABASE postgres
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'Swedish_Sweden.1252'
LC_CTYPE = 'Swedish_Sweden.1252'
CONNECTION LIMIT = -1;
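My guess is that the upper function uses the LC_CTYPE setting of your database. The uppercase of LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF) is LATIN CAPITAL LETTER Y WITH DIAERESIS (U+0178), which lies outside the Latin-1 range (U+0000 to U+00FF). Windows-1252 itself does include that character, at byte 0x9F, so the code page alone may not be the whole explanation.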
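If you convert the string to a Unicode format first, the upper function might work as expected (note that convert_to returns bytea while upper expects text, so this expression may need a conversion back to text before it runs):
SELECT upper(convert_to(thecol, 'UTF8')) ...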
You should probably use a different value for LC_CTYPE and LC_COLLATE. On Linux, you'd use sv_SE.UTF-8.
Nevertheless, I'd consider this a bug in Postgres. It would be better to leave ÿ as is if the upper case version can't be represented in the target character set.
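The code points involved can be checked with a few lines of Python. This is only a sketch of the Unicode side of the problem: Python's str.upper uses Unicode case mapping, not the Windows locale routines PostgreSQL would call, and encodable is a hypothetical helper name.

```python
def encodable(ch: str, encoding: str) -> bool:
    """Return True if ch can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

lower = "\u00ff"        # ÿ, LATIN SMALL LETTER Y WITH DIAERESIS
upper = lower.upper()   # Unicode case mapping, independent of any locale

print(f"U+{ord(upper):04X}")          # U+0178
print(lower.encode("utf-8"))          # b'\xc3\xbf', i.e. \303\277 in octal
print(encodable(upper, "latin-1"))    # False: outside ISO 8859-1
print(encodable(upper, "cp1252"))     # True: Windows-1252 has it at 0x9F
```

So the stored bytes in the question's output are a valid UTF-8 ÿ, and the uppercase form is the first one in this data set that leaves the single-byte Latin-1 range.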

String input matched against a binary field in SQL WHERE

Here is the scenario:
I have a SQL select statement that returns a binary data object as a string. This cannot be changed; it is outside the area of what I can modify.
So for example it would return '1628258DB0DD2F4D9D6BC0BF91D78652'.
If I manually add a 0x in front of this string in a query I will retrieve the results I'm looking for so for example:
SELECT a, b FROM mytable WHERE uuid = 0x1628258DB0DD2F4D9D6BC0BF91D78652
My result set is correct.
However, I need to find a Microsoft SQL Server 2008 compatible means to do this programmatically. Simply concatenating 0x to the string variable does not work. Obvious, but I did try it.
Help please :)
Thank you
Mark
My understanding of your question is that you have a column uuid, which is binary.
You are trying to select rows with a particular value in uuid, but you are trying to use a string like so:
SELECT a, b FROM mytable WHERE uuid = '0x1628258DB0DD2F4D9D6BC0BF91D78652'
which does not work. If this is correct, you can use the CONVERT function with a style of 2 to have SQL Server treat the string as hex and not require a '0x' as the first characters:
SELECT a, b
FROM mytable
WHERE uuid = CONVERT(binary(16), '1628258DB0DD2F4D9D6BC0BF91D78652', 2)
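If the query is issued from application code, the same hex-to-binary step can also be done on the client before the value is passed as a binary parameter. Below is a minimal Python sketch of that conversion only; hex_to_binary is a hypothetical helper, and how the bytes are then bound to the query depends on the database driver in use, which the question does not mention.

```python
def hex_to_binary(hex_text: str) -> bytes:
    """Turn a hex string, with or without a leading 0x, into raw bytes."""
    if hex_text[:2].lower() == "0x":
        hex_text = hex_text[2:]
    return bytes.fromhex(hex_text)

uuid_bytes = hex_to_binary("1628258DB0DD2F4D9D6BC0BF91D78652")
print(len(uuid_bytes))            # 16, matching binary(16)
print(uuid_bytes.hex().upper())   # round-trips to the original string
```

This mirrors what CONVERT with style 2 does on the server: pairs of hex digits become single bytes, with no 0x prefix required.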

searching sql DB table for japanese terms

I have a table with nvarchar columns that hold text in different languages. The problem is that when I search for these terms, particularly Japanese/Chinese ones, the typical select statement does not work:
select * from jtable where searchterm = 'ろくでなし'
It returns 0 rows, which is incorrect since the term is definitely in the table. Someone mentioned using cast(....) but I'm not sure how to do this.
You need an N prefix to make the string literal Unicode.
select * from jtable where searchterm = N'ろくでなし'
Without the N, the 'ろくでなし' literal is implicitly varchar and is seen as '?????'.
See my related answer about Khmer text for examples of why: Khmer Unicode, English and Microsoft SQL Server 2008 results in questionmarks

Arabic SQL query (on Oracle DB) returns empty result

I have this query (that runs on Oracle 10g database):
SELECT ge.*, ge.concept AS glossarypivot
FROM s_glossary_entries ge
WHERE (ge.glossaryid = '161' OR ge.sourceglossaryid = '161')
AND (ge.approved != 0 OR ge.userid = 361)
AND concept like 'م%' ORDER BY ge.concept
The query should display all words that begin with the Arabic letter "م",
but unfortunately it returns an empty result.
However, if I run the same query on the same database running on MySQL, it works well and displays the correct result.
Also, if I run the same query with an English letter (m), like this:
SELECT ge.*, ge.concept AS glossarypivot
FROM s_glossary_entries ge
WHERE (ge.glossaryid = '161' OR ge.sourceglossaryid = '161')
AND (ge.approved != 0 OR ge.userid = 361)
AND concept like 'm%' ORDER BY ge.concept
it displays the result correctly, not an empty one!
What should I do to get this query working the right way on the Oracle 10g database?
P.S. The Oracle database character set is "AL32UTF8".
Thank you so much in advance.
Are you sure that this works in MySQL? I would do this part:
AND concept = 'م'
like this:
AND concept LIKE 'م%'
or, because it's Arabic and the first character is the rightmost one, like this:
AND concept LIKE '%م'
But I have no idea whether Oracle even has LIKE; I have never worked with Oracle.
If I put the characters "ظ…" in place of the Arabic character "م", it works on Oracle ...
The obvious question is, do you have matching data.
You can use SELECT DUMP(concept), DUMP('م') FROM ... to see the bytes that actually form the value. My database gives me 217/133. I believe there are some characters which can have different bytes in UTF-8 but the same physical appearance, though I couldn't say whether this is one of them.
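The 217/133 bytes and the "ظ…" text from the earlier comment look like two views of the same data. The Python sketch below shows one way that could happen, assuming the client interprets bytes as Windows-1256 (Arabic); that client code page is a guess and is not stated anywhere in the question.

```python
letter = "\u0645"  # م, ARABIC LETTER MEEM

utf8_bytes = letter.encode("utf-8")
print(list(utf8_bytes))   # [217, 133], the same values DUMP reports

# Assumption: the client reads those bytes as Windows-1256 text.
misread = utf8_bytes.decode("cp1256")
print(misread)            # ظ…
```

If that is what is going on, the literal sent by the client is never converted to the database character set, which would fit the suggestion that the client-side character set setting does not match what the client really sends.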
Also, consult the Globalization guide.
I think it's a mismatch in your Oracle client code page. It should be set to the same character set as the database; otherwise some character conversion will take place.