Three questions with the following scenario:
SQL Server 2005 production db with a Latin1 codepage and showing "?" for invalid chars in Management Studio.
SomeCompanyApp client as a service that populates the data from servers and workstations.
SomeCompanyApp management console that shows "?" for Asian characters.
Since this is a prod db I will not write to it.
I don't know if the client app that is storing the data in the database is actually storing it correctly as Unicode and it simply doesn't show because they are using Latin1 for the console.
Q1: As I understand it, SQL Server stores nvarchar text as Unicode regardless of the codepage. Or am I completely wrong, and if the codepage is Latin1, everything that is not in that codepage gets converted to "?"?
Q2: Is it the same with a text column?
Q3: Is there a way using SQL Server Management Studio or Visual Studio and some code (don't care which language :)) to query the db and show me if the chars really do show up as Japanese, Chinese, Korean, etc.?
My final goal is to extract data from the db and store it in another db using UTF-8 to show Japanese and other Asian chars as what they are in my own client webapp. I will settle for an answer to Q3. I can code in several languages and at the very least understand some others but I'm just not knowledgeable enough about Unicode. In case you want to know my webapp will be using pyodbc and cassandra but for these questions that doesn't matter.
When inserting into an NVARCHAR column in SSMS, you need to make absolutely sure you're prefixing your string with an N:
This will NOT work:
INSERT INTO dbo.MyTable(NVarcharColumn) VALUES('Some Text with Special Char')
SQL Server will interpret your string in the VALUES(..) as VARCHAR, convert it to the database's code page, and replace any character that code page cannot represent with "?".
You need this:
INSERT INTO dbo.MyTable(NVarcharColumn) VALUES(N'Some Text with Special Char')
Prefixing your text literal with an N'..' tells SQL Server to treat this as NVARCHAR all the way.
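The effect of the missing N can be imitated outside the database. The sketch below is only an analogy: it uses Python's cp1252 codec as a stand-in for a Latin1 code page and UTF-16LE as a stand-in for NVARCHAR storage. The function names are made up for illustration, and SQL Server's own conversion may differ in details.

```python
def as_varchar_literal(text, code_page="cp1252"):
    """Imitate a literal without N: convert to the code page, '?' for misses."""
    return text.encode(code_page, errors="replace").decode(code_page)


def as_nvarchar_literal(text):
    """Imitate an N'...' literal: round-trip through UTF-16LE, nothing is lost."""
    return text.encode("utf-16-le").decode("utf-16-le")


sample = "café 日本語"
print(as_varchar_literal(sample))   # café ???
print(as_nvarchar_literal(sample))  # café 日本語
```

Note that é survives because it exists in the code page; only the characters outside it are replaced.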
Does this help you solve your Q3?
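For Q3 itself, one option is to stop trusting what the console renders and look at the code points of the values a query returns. The following is a minimal sketch in Python; the hard-coded strings stand in for values that a driver such as pyodbc might return from an nvarchar column, and the helper names are hypothetical.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def looks_replaced(text):
    """True if the value consists only of '?' characters, ignoring spaces."""
    stripped = text.replace(" ", "")
    return bool(stripped) and set(stripped) == {"?"}


# Stand-ins for column values fetched from the database.
for value in ["日本", "??"]:
    print(describe(value), looks_replaced(value))
```

If the stored value really is Japanese, the names come back as CJK ideographs or kana; if the data was already damaged on insert, every character is U+003F QUESTION MARK and no display setting will bring it back.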
Related
I have a subject table with a theme field that contains the following rows:
theme
-----
pays
économie
associée
And I have this basic query :
SELECT * FROM SUBJECT WHERE THEME='associée';
The query runs fine in SQL Developer and returns the expected row.
In SQL*Plus, on the other hand, it returns 0 rows (which is not normal).
I have the impression that the query does not recognize accented characters in SQL*Plus. I suspect an NLS_LANG problem, but I am not familiar with it. Please help.
Thank you in advance.
Set your OS session's NLS_LANG variable to the value of, e.g., ENGLISH_AMERICA.AL32UTF8 and restart your SQL Developer. Retry afterwards.
If that didn't help, try also running your query as follows:
SELECT * FROM SUBJECT WHERE THEME = n'associée';
Notice the n before the string literal. That's an nvarchar2 string literal modifier. Depending on your DB charset/national charset settings, you may need to state explicitly that the value you are querying for is in the "national charset", not just the "regular charset".
If that didn't help, there's actually a multitude of additional variables that come into play when working with accented characters against an Oracle DB.
Explanation:
Your SQL Developer does recognize accents... provided that your Oracle DB session uses a character set compatible with your database character set. Your Oracle DB session's character set can be set either at the OS level (via an OS environment variable) or, possibly(!), directly in SQL Developer's options. Alas, the multitude of other factors may include (though not exclusively):
your OS regional settings,
your OS Unicode support,
your Oracle client software's (SQL Developer) Unicode support,
your Java JDK/JRE's Unicode support,
your JDBC driver's Unicode support,
your other *DBC drivers' Unicode support, if there are any more in chain.
Sad thing is that the more interfaces you have between your keyboard and your Oracle database, the more likely is one of them to fiddle with your charset conversions badly.
So, let's just hope that the first two hints work for you, otherwise I can't help you (that easily).
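To see why a mismatched client character set makes an equality test fail, here is a small illustration in Python. It is not Oracle-specific: it assumes, purely for the sake of the example, that the client sends UTF-8 bytes while the session is declared as Windows-1252. The function name and its parameters are hypothetical.

```python
def misread_as(text, actual="utf-8", declared="cp1252"):
    """Encode text with the charset really in use, decode with the declared one."""
    return text.encode(actual).decode(declared)


stored = "associée"
received = misread_as(stored)

print(received)            # associÃ©e
print(received == stored)  # False, so WHERE THEME = '...' finds no row
```

When the declared and actual charsets agree, the literal arrives unchanged and the comparison succeeds.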
Our ETL team is sending us some data with Chinese descriptions. When we load that data into our SQL Server database, those descriptions come up blank.
We tried changing the column type to nvarchar, but that doesn't help.
Can you please help.
Thanks
You must use the N prefix when dealing with NVARCHAR.
INSERT INTO table (column) VALUES (N'chinese characters')
Prefix a Unicode character string constants with the letter N to
signal UCS-2 or UTF-16 input, depending on whether an SC collation is
used or not. Without the N prefix, the string is converted to the
default code page of the database that may not recognize certain
characters. Starting with SQL Server 2019 preview, when a UTF-8
enabled collation is used, the default code page is capable of storing
UNICODE UTF-8 character set.
Source: https://learn.microsoft.com/en-us/sql/t-sql/data-types/nchar-and-nvarchar-transact-sql?view=sql-server-2017
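The UCS-2 versus UTF-16 distinction in the quote only matters for supplementary characters, which lie outside the Basic Multilingual Plane and which UTF-16 stores as a surrogate pair. A short Python sketch (the helper name is made up for illustration) makes the difference visible by counting 16-bit code units:

```python
def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


print(utf16_code_units("中"))          # 1: a BMP character, one code unit
print(utf16_code_units("\U00020000"))  # 2: a supplementary CJK character, surrogate pair
```

Most everyday Chinese text falls in the first case, so the N prefix and an nvarchar column are usually what decides whether the data survives.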
I have a SQL Server 2005 database with collation SQL_Latin1_General_CP1_CI_AS and I want to run a query from Delphi XE via ADO. The data in SQL Server contains Greek and Latin characters, but in Delphi I get unreadable character strings. How can I handle this in Delphi XE?
Since you say that you have both Greek and Latin characters in the db I guess that you are already using nvarchar in the db.
In Delphi you should then use TWideStringField for nvarchar fields. TStringField is for varchar (ansistring).
Field1 contains "γειά σου"
StringField := ADODataSet1.FieldByName('Field1') as TStringField;
ShowMessage(StringField.Value);
ShowMessage shows "?e??s??"
This works fine
WideStringField := ADODataSet1.FieldByName('Field1') as TWideStringField;
ShowMessage(WideStringField.Value);
Edit 1
If you have varchar fields in the db, you should use TStringField, and you need to make sure that the "Language for non-Unicode programs" is Greek (Greece).
"Control Panel - Region and Language - Administrative - Change system locale..."
I have found that sometimes UTF-8 is stored in databases in VarChar fields, usually from Java programs.
If you see things like â€", there's a good chance that's what is going on.
You could try
// Delphi 2009+
UTF8ToUnicodeString(RawByteString( db_value ))
// Delphi 2007 and older
UTF8Decode( db_value )
If this is the case, you can also use a SQL function to convert the VarChar fields to NVarChar.
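The same repair can be sketched outside Delphi. The Python version below assumes the UTF-8 bytes were misread as Windows-1252, which is a guess that has to be checked against the real data; the function name is hypothetical. Values that do not fit the pattern are returned unchanged.

```python
def repair_utf8_in_cp1252(value):
    """Undo 'UTF-8 bytes read as Windows-1252'; leave other values untouched."""
    try:
        return value.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value


print(repair_utf8_in_cp1252("Ã©conomie"))  # économie
print(repair_utf8_in_cp1252("économie"))   # économie (already correct, left alone)
```

Pure ASCII text passes through unchanged, since ASCII is encoded identically in both charsets.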
I'm working on SQL Server 2005, in which I have a database. When I use Japanese characters in my application, they are stored as question marks in the database. I would like to know which collation I should use to save the Japanese characters properly.
Note: Additional info (if it helps): in MySQL, we have used UTF8 as the default character set in the startup variable and it works fine.
Thank you,
Pavan
Japanese_90 appears to be the new collation name.
http://msdn.microsoft.com/en-us/library/bb330962%28v=sql.90%29.aspx#intlftrql2005_topic24
Note, you might want to consider the _KS (kana-sensitive) suffix if you want Hiragana and Katakana to be distinguished when sorting.
Like Marc_S says, you will also want to ensure your column datatype is nvarchar.
I can see the Japanese text in the Excel cells. I've built the insert query using ADO. It does the insert in the DB, but Japanese characters are simply represented as "????"
Any help would be appreciated.
Is it the Sybase client where you are seeing the Japanese characters misrepresented? If you are lucky then it's just a mix-up between the server and a client. You can try running:
set char_convert off
in the Sybase client, which will turn off the automatic character conversion that Sybase attempts to do.
If the above doesn't work then you have to find out what your Sybase server's default charset is. You can do this with:
sp_default_charset
This will return the default charset for your Sybase server (e.g. roman8). Check that the charset your server returns supports Japanese characters.
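Whether a given charset can hold Japanese at all can be checked quickly by trying to encode a sample string. The sketch below uses Python codec names, which are not the same as Sybase charset names, so mapping the value reported by the server to a Python codec is left as an assumption; the helper name is made up.

```python
def can_represent(text, codec):
    """True if every character in text can be encoded with the given Python codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


sample = "こんにちは"
for codec in ("latin-1", "shift_jis", "utf-8"):
    print(codec, can_represent(sample, codec))
```

A single-byte Western charset fails this test, which would explain the "????" no matter what the client does.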