MS SQL Server 2008 Encoding - sql

I have a table in a database with the Lithuanian_100_CI_AS collation. Some rows have text fields that contain random symbols instead of Lithuanian letters. Is it possible to change the encoding so that I can see the letters I need? Changing the collation does nothing at all.

If the data is already stored like this (mangled), you cannot really recover it by changing the collation. Setting the right collation can still help data get written to your database correctly from now on, so it is more relevant for the future.

No, the data is random.
You need to:
use nvarchar to store this data correctly
ensure the client is using nvarchar for parameters
ensure all string constants have N in front (example: N'foobar')
Collation is not encoding: it determines how strings are compared and sorted, and it also determines the code page for non-Unicode columns (Unicode columns are the nvarchar ones).
Note that the data types "text" and "ntext" are deprecated in SQL Server. Use varchar(max) and nvarchar(max) instead.
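To see why changing the collation alone cannot repair text that is already stored, it helps to look at what a code page does to bytes. The following is a minimal sketch in Python, used only because it runs without a server; it does not show SQL Server internals. The sample letters and the choice of cp1257 (Windows Baltic) and cp1252 (Western European) as stand-ins for the code pages involved are assumptions for illustration.

```python
# Sketch: how a code page mismatch turns Lithuanian letters into other symbols.
# Assumptions: cp1257 stands in for the code page of a Lithuanian collation,
# cp1252 for a Western European one. Neither is taken from the question.

text = "ąčęėįšųūž"

# Bytes as a non-Unicode (varchar-like) column under a Baltic code page would hold them.
baltic_bytes = text.encode("cp1257")

# Reading the same bytes under the same code page gives the text back.
round_trip = baltic_bytes.decode("cp1257")

# Reading them under a different code page yields other, random-looking symbols.
misread = baltic_bytes.decode("cp1252", errors="replace")

# Writing through the wrong code page is worse: letters it cannot represent
# become '?', and the original characters are lost.
lossy = "ąčę".encode("cp1252", errors="replace")

# A Unicode (nvarchar-like) representation keeps every letter, two bytes each here.
unicode_bytes = text.encode("utf-16-le")

print(round_trip)
print(misread)
print(lossy)
print(unicode_bytes.decode("utf-16-le"))
```

If the stored values look like `misread`, the bytes are intact and can be reinterpreted. If they look like `lossy`, the original letters were never stored.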

Related

HANA: Unknown Characters in Database column of datatype BLOB

I need help resolving characters of an unknown type in a database field into a readable format. I need to overwrite this value at database level with another valid value (in the exact format the application stores it in) to automate system copy activities.
I have a proprietary application that also allows users to configure it via the frontend. This configuration data is stored in a table, and the values of a configuration property are stored in a column of type "BLOB". For the value in question, I provide a valid URL in the application frontend (like http://myserver:8080). However, what gets stored in the database is not readable (some square characters). I tried all sorts of HANA conversion functions (HEX, binary), both on their own and cascaded (e.g. first to binary, then to varchar), to make it readable. I also tried it the other way around, converting the value I want to insert into the correct format (conversion to BLOB via hex or binary), but this does not work either. I copied the value to the clipboard and compared it to all sorts of character set tables (although I am not sure whether this can work at all).
My conversion attempts look roughly like this:
SELECT TO_ALPHANUM('') FROM DUMMY;
where the quotes would contain the characters in question. I can't even print them here.
How can one approach this and maybe find out the character set that is used by this application? I would be grateful for some more ideas.
What you have in your BLOB column is a series of bytes. As you mentioned, these bytes have been written by an application that uses an unknown character set.
To interpret those bytes correctly, you need to know the character set, because the character set is literally the mapping of bytes to characters or character identifiers (e.g. code points in Unicode).
Now, HANA doesn't come with a whole lot of options to work on LOB data in the first place and for C(haracter)LOB data most manipulations implicitly perform a conversion to a string data type.
So, what I would recommend is to write a custom application that reads the BLOB bytes and performs the conversion itself. Once the bytes are successfully converted into a string, you can store the data in a new NCLOB field that keeps it as Unicode text.
You will have to know the character set in the first place, though. No way around that.
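Since the value typed into the frontend is known, one way to narrow down the character set is to read the raw bytes and test which encodings turn them back into that value. Here is a minimal sketch of such a helper in Python. The `blob` bytes, the `CANDIDATES` shortlist and the function name `find_encodings` are all made up for illustration; how the bytes are fetched from HANA is left out.

```python
# Sketch: try a shortlist of candidate encodings on raw BLOB bytes.
# Assumptions: `blob` is a made-up stand-in for the bytes read from the column,
# and CANDIDATES is an arbitrary shortlist, not something HANA or the app defines.

CANDIDATES = ["utf-8", "utf-16-le", "utf-16-be", "latin-1"]


def find_encodings(blob, expected):
    """Return the candidate encodings under which `blob` decodes to `expected`."""
    matches = []
    for name in CANDIDATES:
        try:
            decoded = blob.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        if decoded == expected:
            matches.append(name)
    return matches


# Hypothetical case: the application happened to store the URL as UTF-16LE.
blob = "http://myserver:8080".encode("utf-16-le")

print(blob.hex())  # look at the raw bytes first
print(find_encodings(blob, "http://myserver:8080"))
```

If no candidate matches, the value may be compressed, encrypted or serialized rather than plain text, and a hex dump of the bytes is the better starting point.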
I assume you are on Oracle. You can convert BLOB to CLOB as described here.
http://www.dba-oracle.com/t_convert_blob_to_clob_script.htm
In case of your example try this query:
select UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(<your_blob_value>)) from dual;
Obviously this only works for values below 32767 characters.

SQL NVARCHAR(MAX) returning ASCII and Weird Characters instead of Text

I have an SQL Table and I'm trying to return the values as a string.
The values should be city names like Sydney, Melbourne, Port Macquarie etc.
But when I run a select, I get either blank results or a strange backwards-L character. The column is an NVARCHAR(MAX).
SELECT ctGlobalName FROM Crm.Cities
Then I tried using MSSQL's "Edit Top 200 Rows" feature and I could see the names of the cities, but also all these weird ASCII characters.
Now I didn't create the database, I'm just running queries on it. Some things I've read have suggested it is a problem with the Collation. But the table is SQL_Latin1_General_CP1_CI_AS which matches the server collation.
I'm sure there must be something I can add to my select query to return the values as an ordinary string. Is there something I can do to my select query to return the expected format without the weird characters?
An NVARCHAR datatype can store Unicode characters, which are needed for languages that the ASCII character set does not cover, such as Chinese or Arabic. If your SQL Server or Windows doesn't have that language installed then you might see strange-looking representations of the data.
On the other hand, it could also be that the application that updates this table has just stored bad data in that column.
Either way you might need to do some string manipulation to strip out the characters you don't want.
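As an illustration of that kind of string manipulation, here is a minimal Python sketch that first shows which code points a value contains and then drops the non-printable ones. The sample value and the helper name `strip_non_printable` are assumptions. The actual stray characters in the table are not known from the question, so inspecting the code points comes first.

```python
import unicodedata


def strip_non_printable(value):
    """Drop characters in the Unicode 'C*' categories (control, format, unassigned...)."""
    return "".join(
        ch for ch in value if not unicodedata.category(ch).startswith("C")
    )


# Made-up sample: a city name followed by a few control characters.
raw = "Sydney\x00\x0c\x1f"

# Inspect the code points before deciding what to remove.
print([hex(ord(ch)) for ch in raw])

print(repr(strip_non_printable(raw)))
```

Note that this also removes tabs and line breaks, since they are control characters too. Ordinary spaces, as in "Port Macquarie", are kept.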

SQL column collation change

I would like to change a column collation to some Polish collation and be able to view Polish characters properly. All three, original column, original table and original database, use SQL_Scandinavian_CP850_CS_AS.
For column collation change I tried:
SELECT CAST([ColumnName] AS nvarchar(50)) COLLATE Polish_CI_AS FROM t1
These 3 example letters appear in Scandinavian table:
SELECT 'ØùÒ' COLLATE Polish_CI_AS
This should return łŚń in the results. Instead it shows 'OuO'.
Unfortunately SQL Server does not support OEM code page 852 which is what you need to convert code page 850 data into if you want to convert 'ØùÒ' to 'łŚń'. You can change the collation of data without SQL Server doing character mapping by CASTing through varbinary, but this only works with supported collations.
An alternative approach might be to create a user-defined function that takes a string and maps characters one at a time, so Ø maps to ł etc. It is fiddly, since there are up to 128 characters to map (the byte values 0x80 to 0xFF), but not difficult.
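The character mapping itself can be sketched outside SQL Server. The Python snippet below recovers the bytes that the displayed text has in code page 850 and reads them as code page 852, then builds the same conversion as a per-character lookup table. It assumes the column really holds CP852 bytes that are being displayed as CP850; the helper name `reinterpret` is made up.

```python
def reinterpret(text, stored_as, meant_as):
    """Recover the bytes `text` has under `stored_as`, then read them as `meant_as`."""
    return text.encode(stored_as).decode(meant_as)


shown = "ØùÒ"  # what the Scandinavian (CP850) column displays

fixed = reinterpret(shown, "cp850", "cp852")
print(fixed)

# The same idea as a lookup table for the upper half of the code page
# (0x80-0xFF), which is what a hand-written mapping function would encode.
high_bytes = bytes(range(0x80, 0x100))
table = str.maketrans(high_bytes.decode("cp850"), high_bytes.decode("cp852"))
print(shown.translate(table))
```

The contents of `table` could serve as the source for the pairs in a user-defined mapping function, rather than typing each pair by hand.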

SQL Server : whitespaces in rows

I have a problem where values in my database get lots of empty spaces after the text.
In my email column I have "something#mail.com+_________________________________________" (lots of spaces).
My email column is nchar(255),
and this is happening in all of my tables.
Can anyone explain why this is happening and how to fix it?
CHAR and NCHAR will automatically right-pad a string with spaces to meet the defined length. Use NVARCHAR(255) instead of NCHAR(255).
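The difference between the two storage behaviours can be emulated in a few lines. This Python sketch is only a model of fixed-width versus variable-width columns, not of SQL Server itself, and the helper names `as_nchar` and `as_nvarchar` are made up.

```python
def as_nchar(value, length):
    """Emulate a fixed-width column: right-pad with spaces to exactly `length`."""
    if len(value) > length:
        raise ValueError("value is longer than the column")
    return value.ljust(length)


def as_nvarchar(value, length):
    """Emulate a variable-width column: keep the value as given, up to `length`."""
    if len(value) > length:
        raise ValueError("value is longer than the column")
    return value


email = "something#mail.com"

padded = as_nchar(email, 255)
unpadded = as_nvarchar(email, 255)

print(len(padded), len(unpadded))
print(padded.rstrip(" ") == email)  # trimming the padding gives the value back
```

Changing the column type does not remove padding that is already stored, so existing values would still need their trailing spaces trimmed once, for example with RTRIM.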
You should use nvarchar.
SQL Server provides both datatypes to store character information. For the most part the two datatypes are identical in how you would work with them within SQL Server or from an application. The difference is that nvarchar is used to store unicode data, which is used to store multilingual data in your database tables. Other languages have an extended set of character codes that need to be saved and this datatype allows for this extension. If your database will not be storing multilingual data you should use the varchar datatype instead. The reason for this is that nvarchar takes twice as much space as varchar, this is because of the need to store the extended character codes for other languages from

Problem with SQL Collation

I'm making an Arabic website. After I created the database and started writing Arabic text into it, it just shows ????, so I changed the collation of my database from SQL_Latien to Arabic_CI_AI.
But I'm still getting ???? in my fields, and when I check the properties of the field I find it is still SQL_Latien; it hasn't changed.
So what should I do to fix this problem without rebuilding the database?
please reply as soon as you can
Thanks in Advance
Database collation is just the default setting for new columns.
To change the collation of an existing column, you'd have to alter table. For example:
alter table YourTable alter column col1 varchar(10) collate Arabic_CI_AI
The collation sequence is the order in which characters appear when you sort (i.e. when you use the 'ORDER BY' clause). Different collations will result in different sort orders.
This is obviously NOT what you are looking for. Your problem is storing and retrieving Unicode characters outside the ASCII range (i.e. Arabic characters). To do that, the data types storing this data must support Unicode instead of a single-byte code page. Simply put, when defining a column, use the data types nchar or nvarchar instead of char, varchar and text (ntext also stores Unicode, but it is deprecated in favor of nvarchar(max)).
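The ???? symptom can be reproduced without a database. The Python sketch below models a varchar-like column under a Latin code page (cp1252 is an assumption standing in for the Latin collation's code page) and shows that the question marks are what actually gets stored, so changing the collation afterwards cannot bring the letters back.

```python
# Sketch: why Arabic text written through a Latin code page ends up as '????'.
# Assumptions: cp1252 models the Latin collation's code page, cp1256 the Arabic one.

# Sample Arabic word ("marhaba"), written with escapes to avoid
# right-to-left rendering issues in the source.
arabic = "\u0645\u0631\u062d\u0628\u0627"

# varchar-like storage under a Latin code page: every letter becomes '?'.
latin_bytes = arabic.encode("cp1252", errors="replace")

# Switching to an Arabic code page later only reinterprets the stored '?' bytes.
after_collation_change = latin_bytes.decode("cp1256")

# Had the text been written through the Arabic code page, it would have survived...
arabic_round_trip = arabic.encode("cp1256").decode("cp1256")

# ...and a Unicode (nvarchar-like) representation works whatever the collation is.
unicode_round_trip = arabic.encode("utf-16-le").decode("utf-16-le")

print(latin_bytes)
print(after_collation_change)
print(arabic_round_trip == arabic, unicode_round_trip == arabic)
```

Rows that already contain ???? therefore have to be re-entered after the column type or collation is fixed.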