Unicode- VARCHAR and NVARCHAR - sql

-- Creating Table
Create Table Test1
(
id Varchar(8000)
)
-- Inserting a record
Insert into Test1 Values ('我們的鋁製車架採用最新的合金材料所製成,不但外型輕巧、而且品質優良。為了達到強化效果,骨架另外經過焊接和高溫處理。創新的設計絕對能充分提升踏乘舒適感和單車性能。');
I defined the data type of id as Varchar, and the data is stored as ?????.
Do I have to use NVARCHAR? What is the difference between VARCHAR and NVARCHAR? Please explain Unicode as well.

The column type nvarchar allows you to store Unicode characters, which basically means almost any character from almost any language (including modern languages and some obsolete languages), and a good number of symbols too.

You also need to prefix the value with N. For example:
Insert into Test1 Values (N'我們的鋁製車架採用最新的合金材料所製成,不但外型輕巧、而且品質優良。為了達到強化效果,骨架另外經過焊接和高溫處理。創新的設計絕對能充分提升踏乘舒適感和單車性能。');
Alternatively, from application code, use a prepared statement with bind values when inserting or updating text in a national character set.

Nvarchar supports Unicode, so yes: the column needs to be nvarchar, not varchar.

Regardless of the collation of your database, use nvarchar to store Unicode.
Embed your Unicode value in N'[value]':
INSERT INTO ... VALUES
('Azerbaijani (Cyrillic)', N'Aзәрбајҹан (кирил әлифбасы)', 'az-cyrl')
In DB: 59 Azerbaijani (Cyrillic) Aзәрбајҹан (кирил әлифбасы) az-cyrl
The N prefix is the important part!
This is valid for MS SQL 2014, which I am using. Hope this helps.

Yes, you have to use nvarchar, or use a collation for the language set you want. But nvarchar is preferred. Google can tell you what this stuff means.

Varchar stores text in the code page of the column's collation. For the common Latin1 collations that is Windows-1252, a superset of ASCII that has no Chinese characters.
As others have noted, nvarchar allows the storage of Unicode characters.
You can get the ASCII code values from either data type, as shown here:
IF OBJECT_ID('TEST1') IS NOT NULL
DROP TABLE TEST1
GO
CREATE TABLE TEST1(VARCHARTEST VARCHAR(8000), NVARCHARTEST NVARCHAR(4000))
-- Inserting a record
INSERT INTO TEST1 VALUES ('ABC','DEF')
SELECT
VARCHARTEST
,NVARCHARTEST
,ASCII(SUBSTRING(VARCHARTEST,1,1))
,ASCII(SUBSTRING(VARCHARTEST,2,1))
,ASCII(SUBSTRING(VARCHARTEST,3,1))
,ASCII(SUBSTRING(NVARCHARTEST,1,1))
,ASCII(SUBSTRING(NVARCHARTEST,2,1))
,ASCII(SUBSTRING(NVARCHARTEST,3,1))
FROM
TEST1
DROP TABLE TEST1
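
The question marks from the original question can be reproduced outside the database. Below is a minimal Python sketch, not SQL Server itself. It assumes the varchar column uses a Windows-1252 (Latin1) collation and uses Python's "replace" error handler as a stand-in for the server's substitution of unmappable characters.

```python
# Sketch: forcing Chinese text into a single-byte code page.
# Assumption: the varchar column uses a Windows-1252 (Latin1) collation.
text = "我們的鋁製車架"

# errors="replace" substitutes "?" for every character the code page lacks,
# which mirrors what ends up stored in the varchar column.
stored = text.encode("cp1252", errors="replace").decode("cp1252")
print(stored)  # ???????

# UTF-16, the encoding behind nvarchar, can represent every character,
# so the round trip is lossless.
roundtrip = text.encode("utf-16-le").decode("utf-16-le")
print(roundtrip == text)  # True
```

Once the characters have been replaced, the original text cannot be recovered from the stored value, which is why the fix has to happen at insert time.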

Related

How to save Russian character in Oracle Database [duplicate]

I have a database with one column of the type nvarchar. If I write
INSERT INTO table VALUES ("玄真")
It shows ¿¿ in the table. What should I do?
I'm using SQL Developer.
Use single quotes, rather than double quotes, to create a text literal; for an NVARCHAR2/NCHAR text literal you also need to prefix it with N.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( value NVARCHAR2(20) );
INSERT INTO table_name VALUES (N'玄真');
Query 1:
SELECT * FROM table_name
Results:
| VALUE |
|-------|
| 玄真 |
First, using NVARCHAR might not even be necessary.
The 'N' character data types are for storing data that doesn't 'fit' in the database's defined character set. There's an auxiliary character set defined as the NCHAR Character set. It's kind of a band aid - once you create a database it can be difficult to change its character set. Moral of this story - take great care in defining the Character Set when creating your database and do not just accept the defaults.
Here's a scenario (LiveSQL) where we're storing a Chinese string in both NVARCHAR and VARCHAR2.
CREATE TABLE SO_CHINESE ( value1 NVARCHAR2(20), value2 varchar2(20 char));
INSERT INTO SO_CHINESE VALUES (N'玄真', '我很高興谷歌翻譯。' );
select * from SO_CHINESE;
Note that both character sets are in the Unicode family. Note also that I declared my VARCHAR2 column as (20 char), so it holds 20 characters. That matters because some characters require up to 4 bytes of storage. With byte length semantics, a definition of (20) would leave room for only 5 such characters.
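
To make the characters-versus-bytes point concrete, here is a small Python sketch. It assumes the database character set is UTF-8 based (AL32UTF8) and simply counts characters and encoded bytes for the string used above. The emoji is my own illustrative example of a 4-byte character, not something from the original test.

```python
# Sketch: character count vs. UTF-8 byte count for the VARCHAR2 example.
# Assumption: the database character set is AL32UTF8 (UTF-8).
value = "我很高興谷歌翻譯。"

chars = len(value)                       # number of characters
utf8_bytes = len(value.encode("utf-8"))  # number of bytes when stored as UTF-8
print(chars, utf8_bytes)  # 9 27

# VARCHAR2(20 CHAR) limits characters; a 20-byte limit counts bytes instead.
fits_20_char = chars <= 20
fits_20_byte = utf8_bytes <= 20
print(fits_20_char, fits_20_byte)  # True False

# A character outside the Basic Multilingual Plane needs 4 bytes in UTF-8,
# so a 20-byte column holds at most 5 of them.
emoji = "\U0001F600"
bytes_per_emoji = len(emoji.encode("utf-8"))
print(bytes_per_emoji, 20 // bytes_per_emoji)  # 4 5
```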
Let's look at the same scenario using SQL Developer and my local database.
And to confirm the character sets:
SQL> clear screen
SQL> set echo on
SQL> set sqlformat ansiconsole
SQL> select *
2 from database_properties
3 where PROPERTY_NAME in
4 ('NLS_CHARACTERSET',
5 'NLS_NCHAR_CHARACTERSET');
PROPERTY_NAME            PROPERTY_VALUE   DESCRIPTION
NLS_NCHAR_CHARACTERSET   AL16UTF16        NCHAR Character set
NLS_CHARACTERSET         AL32UTF8         Character set
First of all, you should set a Chinese character encoding on your database, for example:
UTF-8, Chinese_Hong_Kong_Stroke_90_BIN, Chinese_PRC_90_BIN, Chinese_Simplified_Pinyin_100_BIN ...
Here is an example with SQL Server 2008 (Management Studio), which includes all of these collations; other databases (MySQL, SQLite, MongoDB, MariaDB...) offer comparable character encodings.
Create the database with Chinese_PRC_90_BIN (you can choose another collation):
Select a Page (left header) > Options > Collation > choose the collation
Create a table with the same collation.
Execute the insert statement:
INSERT INTO ChineseTable VALUES ('玄真');

Special Characters like ❷ ❶ ❸ are not storing correctly in SQL even when the type is nVarchar

I'm facing a weird issue in SQL. I'm trying to save the characters ❷❶❸ into SQL, but they are stored as question marks (?). The field is nVarchar.
This is my update query
update mytable set keywords='key1❶,key2❶,key3❶,key4❶' where id=50543
The column should be created as:
CREATE TABLE mytable (columnname NVARCHAR(40) COLLATE SQL_Latin1_General_CP1253_CI_AI)
Then, when inserting, prefix the Unicode string literal with N:
INSERT INTO mytable (columnname) VALUES (N'❷❶❸')

Select cyrillic character in SQL

When a user inserts a Russian word like 'пример' into the database, the database saves it as '??????'. If they insert it with the 'N' prefix, or I select it with the 'N' prefix, e.g. exec Table_Name N'иытание', there is no problem. But I don't want to use 'N' in every query, so is there any solution for this? I will use a stored procedure, by the way.
UPDATE:
Now I can use Russian letters by altering the collation. But I can't alter the collation for every language, so I want to know whether there is a trigger or function that automatically adds N in front of the text. That is, when I insert 'пример', SQL should treat it as N'пример' automatically.
To insert Unicode letters, the column's datatype has to be NVARCHAR, and you also have to use N'value' when inserting.
You can test it with the following:
CREATE TABLE #test
(
varcharCol varchar(40),
nvarcharCol nvarchar(40)
)
INSERT INTO #test VALUES (N'иытание', N'иытание')
SELECT * FROM #test
OUTPUT
varcharCol nvarcharCol
??????? иытание
As you can see, the varchar column returns question marks (???????) and the nvarchar column returns the Russian characters иытание.
UPDATE
The problem is that your database collation does not support Russian letters. To change it:
In Object Explorer, connect to an instance of the SQL Server Database Engine, expand that instance, and then expand Databases.
Right-click the database that you want and click Properties.
Click the Options page, and select a collation from the Collation drop-down list.
After you are finished, click OK.
MORE INFO
It would be very difficult to put this in a comment, so I would recommend this link: Info
-- A table variable is declared with @, not #
DECLARE @test TABLE
(
Col1 varchar(40),
Col2 varchar(40),
Col3 nvarchar(40),
Col4 nvarchar(40)
)
INSERT INTO @test VALUES
('иытание',N'иытание','иытание',N'иытание')
SELECT * FROM @test
RESULT
To store and select Unicode characters in the database you have to use NVARCHAR instead of VARCHAR. To insert Unicode data you have to prefix the literal with N.
See this link https://technet.microsoft.com/en-us/library/ms191200%28v=sql.105%29.aspx
The n prefix for these data types comes from the ISO standard for National (Unicode) data types.
Change type of your columns (containing Russian) from varchar to nvarchar.
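
Why both the N prefix and the nvarchar column are needed can be sketched as a two-step conversion. This is a simplified Python model, not SQL Server's actual implementation. It assumes a Windows-1252 code page for both the database default and the varchar column, and the helper names to_code_page and store are made up for illustration.

```python
# Simplified model of how a string literal reaches a column.
# Assumption: database and varchar column both use code page Windows-1252.
CODE_PAGE = "cp1252"


def to_code_page(text: str) -> str:
    """Convert text to the code page, replacing unmappable characters with '?'."""
    return text.encode(CODE_PAGE, errors="replace").decode(CODE_PAGE)


def store(text: str, n_prefix: bool, nvarchar_column: bool) -> str:
    """Return what would end up in the column under this model."""
    # Step 1: a literal without N is converted to the code page first.
    value = text if n_prefix else to_code_page(text)
    # Step 2: a varchar column converts the value to its code page again.
    return value if nvarchar_column else to_code_page(value)


word = "иытание"
for n_prefix in (False, True):
    for nvarchar_column in (False, True):
        literal = "N'...'" if n_prefix else "'...'"
        column = "nvarchar" if nvarchar_column else "varchar"
        print(f"{literal:7} into {column:8}: {store(word, n_prefix, nvarchar_column)}")
```

In this model only the combination of an N-prefixed literal and an nvarchar column keeps the Russian text; in every other combination the characters are replaced at one of the two steps.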

Finding character values outside ASCII range in an NVARCHAR column

Is there a simple way of finding rows in an Oracle table where a specific NVARCHAR2 column has one or more characters which wouldn't fit into the standard ASCII range?
(I'm building a warehousing and data extraction process which takes the Oracle data, drags it into SQL Server -- UCS-2 NVARCHAR -- and then exports it to a UTF-8 XML file. I'm pretty sure I'm doing all the translation properly, but I'd like to find a bunch of real data to test with that's more likely to cause problems.)
Not sure how to tackle this in Oracle, but here is something I've done in MS-SQL to deal with the same issue...
create table #temp (id int, descr nvarchar(200))
insert into #temp values(1,'Now is a good time')
insert into #temp values(2,'So is yesterday')
insert into #temp values(3,'But not '+NCHAR(2012))
select *
from #temp
where CAST(descr as varchar(200)) <> descr
drop table #temp
Sparky's example for SQL Server was enough to lead me to a pretty simple Oracle solution, once I'd found the handy ASCIISTR() function.
SELECT
*
FROM
test_table
WHERE
test_column != ASCIISTR(test_column)
...seems to find any data outside the standard 7-bit ASCII range, and appears to work for NVARCHAR2 and VARCHAR2.
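
If the data is already in application code, the same check can be done there before export. The following is a minimal Python sketch. The rows list is made-up sample data mirroring the temp table above, and str.isascii() requires Python 3.7 or later.

```python
# Sketch: find rows whose text contains characters outside 7-bit ASCII.
# The sample rows are hypothetical and mirror the #temp example.
rows = [
    (1, "Now is a good time"),
    (2, "So is yesterday"),
    (3, "But not " + chr(2012)),  # chr(2012) plays the role of NCHAR(2012)
]

# str.isascii() is True only when every character is below code point 128.
non_ascii_rows = [row for row in rows if not row[1].isascii()]
print([row_id for row_id, _ in non_ascii_rows])  # [3]

# List the offending code points, which helps when building test data.
for row_id, text in non_ascii_rows:
    offenders = [f"U+{ord(ch):04X}" for ch in text if ord(ch) > 127]
    print(row_id, offenders)  # 3 ['U+07DC']
```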

Storing Symbols like ϱπΩ÷√νƞµΔϒᵨλθ→%° in SQL Server XML

I ran these queries in my SQL Server:
select cast('<Answers>
<AnswerDescription> ϱπΩ÷√νƞµΔϒᵨλθ→%° </AnswerDescription>
</Answers>' as xml)
select ' ϱπΩ÷√νƞµΔϒᵨλθ→%°'
And got the following results
<Answers>
<AnswerDescription> ?pO÷v??µ??????%° </AnswerDescription>
</Answers>
and
" ?pO÷v??µ??????%°"
How can I make my SQL Server store or display these values as they are sent from the application?
In SQL Server, a string literal without the N prefix is treated as VARCHAR, so characters outside the database's code page are replaced with question marks.
Your example can be made to work by indicating that the strings should be treated as NVARCHAR by adding N before the opening single quote:
select cast(N'<Answers>
<AnswerDescription> ϱπΩ÷√νƞµΔϒᵨλθ→%° </AnswerDescription>
</Answers>' as xml)
select N' ϱπΩ÷√νƞµΔϒᵨλθ→%°'
If these strings are being incorrectly stored in the database, it is likely that they are being implicitly cast to VARCHAR at some point during insertion (e.g. INSERT). It's also possible that they are being stored correctly and are cast to VARCHAR on retrieval (e.g. SELECT).
If you add some code to the question showing how you're inserting data and the datatypes of the target tables, it should be possible to provide more detailed assistance.
I believe it's a problem with an incorrectly set character set;
change the character set to UTF8.
I just tested it on my MySQL database. I changed the character set to utf8 with the utf8_bin collation using
ALTER TABLE `tab1` CHANGE `test` `test` VARCHAR( 255 ) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL
and it worked without any problem.