SQL Server: whitespace in rows

My database stores a lot of trailing spaces after my text.
For example, my email column contains "something#mail.com+_________________________________________", that is, the value followed by lots of spaces.
The email column is nchar(255),
and the same thing happens in all of my tables.
Can anyone explain why this is happening and how to fix it?

CHAR and NCHAR will automatically right-pad a string with spaces to meet the defined length. Use NVARCHAR(255) instead of NCHAR(255).
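A minimal T-SQL sketch of the padding and one possible cleanup. The table dbo.Users and column Email are hypothetical names, not taken from the question, and the ALTER may need extra steps if the column is part of an index or constraint.

```sql
-- Part 1: see the padding. nchar always stores the full declared length.
DECLARE @fixed nchar(10) = N'abc', @variable nvarchar(10) = N'abc';
SELECT DATALENGTH(@fixed)    AS fixed_bytes,     -- 20: 10 characters x 2 bytes, space-padded
       DATALENGTH(@variable) AS variable_bytes;  -- 6: 3 characters x 2 bytes

-- Part 2: convert a column (dbo.Users.Email is a hypothetical name).
-- State NULL / NOT NULL explicitly to match the existing column definition.
ALTER TABLE dbo.Users ALTER COLUMN Email nvarchar(255) NULL;

-- Values stored while the column was nchar keep their padding, so trim them once.
UPDATE dbo.Users SET Email = RTRIM(Email);
```

Changing the type only stops new padding; the UPDATE is what removes the spaces already stored.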

You should use nvarchar.
SQL Server provides both varchar and nvarchar to store character data. Unlike char and nchar, neither one pads values with trailing spaces. For the most part the two datatypes are identical in how you work with them within SQL Server or from an application. The difference is that nvarchar stores Unicode data, which you need for multilingual text in your tables: other languages have an extended set of character codes, and nvarchar has room for them. If your database will not store multilingual data, you can use varchar instead, because nvarchar takes twice as much space as varchar in order to hold those extended character codes.

Related

SQL NVARCHAR(MAX) returning ASCII and Weird Characters instead of Text

I have an SQL Table and I'm trying to return the values as a string.
The values should be city names like Sydney, Melbourne, Port Macquarie etc.
But when I run a select I either get blank results or, as detailed in the first picture, some strange backwards L character. The column is an NVARCHAR(MAX).
SELECT ctGlobalName FROM Crm.Cities
Then I tried using the "Edit Top 200 Rows" feature in SQL Server Management Studio and I could see the names of the cities, but also all these weird characters.
Now I didn't create the database, I'm just running queries on it. Some things I've read have suggested it is a problem with the Collation. But the table is SQL_Latin1_General_CP1_CI_AS which matches the server collation.
I'm sure there must be something I can add to my select query to return the values as an ordinary string. Is there something I can do to my select query to return the expected format without the weird characters?
An NVARCHAR column can store Unicode characters, which are needed for languages whose scripts fall outside the ASCII character set, such as Chinese. If the machine displaying the data lacks support (for example, fonts) for that language, you might see strange-looking representations of the data.
On the other hand, it could also be that the application that updates this table has just stored bad data in that column.
Either way you might need to do some string manipulation to strip out the characters you don't want.
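Before stripping anything, it can help to see what is actually stored. The following is a sketch only: Crm.Cities and ctGlobalName come from the question, while the TOP (20) limit and the column aliases are illustrative choices.

```sql
-- Inspect the stored values rather than trusting how the client renders them.
SELECT TOP (20)
       ctGlobalName,
       LEN(ctGlobalName)                    AS char_count,        -- characters, ignoring trailing spaces
       DATALENGTH(ctGlobalName)             AS byte_count,        -- bytes actually stored
       UNICODE(LEFT(ctGlobalName, 1))       AS first_code_point,  -- code point of the first character
       CONVERT(varbinary(64), ctGlobalName) AS leading_bytes      -- first 64 raw bytes
FROM Crm.Cities;
```

A first_code_point below 32, or a byte_count far larger than twice the visible length, would suggest that the writing application stored control characters or binary data, which no collation setting will change.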

MS SQL Server 2008 Encoding

I have a table in a database with the Lithuanian_100_CI_AS collation. Some rows have text fields that contain random symbols instead of Lithuanian letters. Is it possible to change the encoding so that I see the letters I need? Changing the collation does nothing at all.
If the data was already stored in this mangled form, changing the collation cannot really rescue it. Setting the right collation can, however, help new data get written to the database correctly, so it matters more for the future.
No, the data is random.
You need to:
use nvarchar to store this data correctly
ensure the client is using nvarchar for parameters
ensure all string constants have N in front (example: N'foobar')
The collation is not an encoding: it determines how strings are compared and sorted, and for non-Unicode (char/varchar) columns it also determines the code page. Unicode (nchar/nvarchar) columns are not tied to a code page.
Note: the data types "text" and "ntext" are deprecated in SQL Server. Use varchar(max) and nvarchar(max) instead.
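A short sketch of why the N prefix matters. The literals are illustrative. In a database whose default collation uses a code page without these letters, the first column is expected to come back with question marks or look-alike substitutes, while a Lithuanian-collation database may show both columns intact.

```sql
-- Without N the literal is varchar and is converted to the database's code page
-- before it ever reaches an nvarchar column; with N it stays Unicode.
SELECT 'ąčęėįšųūž'  AS without_n_prefix,
       N'ąčęėįšųūž' AS with_n_prefix;

-- The same rule applies to variables and parameters: declare them as nvarchar.
DECLARE @name nvarchar(50) = N'Žemaitė';
SELECT @name AS unicode_value;
```

The damage happens at conversion time, which is why changing the collation afterwards does not bring the original letters back.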

SQL - Numeric data type with leading zeros

I need to store Medicare APC codes. I believe the format requires 4 numbers. Leading zeros are relevant. Is there any way to store this data type with verification? How should I store this data (varchar(4), int)?
Storing numbers with leading zeros that must be treated as numeric values in some scenarios (e.g. sorting) and as text in others (e.g. addresses) is always a pain, and no single answer is best for all users. At my company we have a database that stores numbers as text for codes (not Medicare APC codes), and we must pad them with zeros so they sort properly in an ORDER BY.
Do not use a numeric data type for this, because the item is not a true number but textual data that uses numeric characters. You will not be performing calculations or aggregates on the codes, so the only benefit of storing them as numbers would be correct sorting, and you get that from text too by padding the code with zeros where needed. If you use a numeric data type, then any time the code is combined with other textual values you will have to convert it explicitly to CHAR/VARCHAR or let SQL Server convert it implicitly. Since implicit conversions are best avoided, that means extra work for you and the query processor every time the code is used.
Assuming you go with a textual data type, the question is whether to use VARCHAR or CHAR. Many who have posted say VARCHAR, but I would suggest CHAR with a length of 4. Why?
The VARCHAR data type is for textual data whose size (the length or number of characters) is not known in advance. For this Medicare code we know the length will be exactly 4 for the foreseeable future. SQL Server stores CHAR and VARCHAR data differently. SQL Server’s BOL (Books Online) says:
Use CHAR when the size of the column data entries are consistent
Use VARCHAR when the size of the column data varies considerably.
I can’t say for certain this is true for SQL Server 2008 and up, but in earlier versions a VARCHAR column carries an extra overhead of 2 bytes per value to record its length. If the data stored is always the same size, and in your scenario it sounds like it is, then those extra bytes are a waste.
In the end it’s up to you as to whether you like CHAR or VARCHAR better but definitely don’t use a numeric data type to store a fixed length code.
That's not numeric data; it's textual data that happens to contain digits.
Use a VARCHAR.
I agree with using
CHAR(4)
For the check constraint, use
check( APC_CODE LIKE '[0-9][0-9][0-9][0-9]' )
This forces the column to accept only 4-digit values.
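Put together, the CHAR(4) column and the LIKE constraint might look like the sketch below. The table name dbo.ApcCodeDemo and the constraint name are hypothetical; only the column name APC_CODE and the pattern come from the answers.

```sql
-- Hypothetical demo table: a fixed-length code that must be exactly four digits.
CREATE TABLE dbo.ApcCodeDemo (
    APC_CODE char(4) NOT NULL
        CONSTRAINT CK_ApcCodeDemo_FourDigits
        CHECK (APC_CODE LIKE '[0-9][0-9][0-9][0-9]')
);

INSERT INTO dbo.ApcCodeDemo (APC_CODE) VALUES ('0012');  -- accepted, leading zeros kept

-- Each statement below is expected to fail the check constraint:
-- INSERT INTO dbo.ApcCodeDemo (APC_CODE) VALUES ('12');    -- padded to '12  ', spaces are not digits
-- INSERT INTO dbo.ApcCodeDemo (APC_CODE) VALUES ('12A4');  -- contains a letter
```

Because CHAR(4) pads short input with spaces, the four-position pattern also rejects codes that were entered without their leading zeros.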
varchar(4)
optionally, you can still add a check constraint to ensure the data is numeric with leading zeros. This example will throw exceptions in Oracle. In other RDBMS, you could use regular expression checks:
alter table X add constraint C
check (cast(APC_CODE as int) = cast(APC_CODE as int))
If you are certain that the APC codes will always be numeric (that is if it wouldn't change in the near future), a better way would be to leave the database column as is, and handle the formatting (to include leading zeros) at places where you use this field values.
If you need leading 0s, then you must use a varchar or other string data type.
There are ways to format the output for leading 0s without compromising your actual data.
See this blog entry for an easy method.
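The blog entry is not reproduced here, so the following is only one common way to do it, not necessarily the method it describes: pad on output when the value is held as a number. The variable @code is illustrative.

```sql
-- Pad an integer to four digits at display time.
DECLARE @code int = 12;

SELECT RIGHT('0000' + CAST(@code AS varchar(4)), 4) AS padded_code;  -- '0012'

-- On SQL Server 2012 and later, FORMAT offers an alternative:
-- SELECT FORMAT(@code, '0000') AS padded_code;
```

This keeps the stored value numeric, at the cost of repeating the formatting wherever the code is displayed or joined to text.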
CHAR(4) seems more appropriate to me (if I understood you right, and the code is always 4 digits).
What you want to use is a VARCHAR data type with a CHECK constraint, using LIKE with a pattern to check for numeric values.
In T-SQL:
check( isnumeric(APC_CODE) = 1)
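One caution, shown as a sketch: ISNUMERIC is documented to return 1 for some inputs that are not plain digit strings (signs, currency symbols, scientific notation), so it is looser than the LIKE pattern suggested in the other answers. The sample values are illustrative.

```sql
-- Compare ISNUMERIC with a strict four-digit pattern on a few sample strings.
SELECT v AS candidate,
       ISNUMERIC(v) AS isnumeric_result,
       CASE WHEN v LIKE '[0-9][0-9][0-9][0-9]' THEN 1 ELSE 0 END AS four_digits
FROM (VALUES ('0012'), ('1e4'), ('$'), ('-'), ('12.5')) AS t(v);
-- Only '0012' satisfies the LIKE test; ISNUMERIC is expected to accept several of the others.
```

For a fixed four-digit code, the LIKE pattern is therefore the stricter check.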

Problem with SQL Collation

I'm building an Arabic website. After creating the database I started writing Arabic text into it, but it just shows ????, so I changed the collation of my database from SQL_Latien to Arabic_CI_AI,
but I'm still getting ???? in my fields, and when I check the properties of a field its collation is still SQL_Latien; it didn't change.
What should I do to fix this problem without rebuilding the database?
Please reply as soon as you can.
Thanks in advance.
Database collation is just the default setting for new columns.
To change the collation of an existing column, you'd have to alter table. For example:
alter table YourTable alter column col1 varchar(10) collate Arabic_CI_AI
The collation sequence is the order in which characters appear when you sort (i.e. use the ORDER BY clause). Different collations will result in different sort orders.
This is NOT what you are looking for. Your problem is storing and retrieving Unicode characters outside the ASCII range (i.e. Arabic characters). To do that, the columns holding this data must use Unicode data types instead of code-page-based ones. Simply, when defining a column, use nchar or nvarchar (or nvarchar(max) in place of the deprecated ntext) instead of char, varchar and text.
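A sketch of that change, reusing the placeholder names YourTable and col1 from the earlier answer. It assumes the table's other columns are nullable or have defaults, and the Arabic literal is only sample data.

```sql
-- Switch the column to a Unicode type. State NULL / NOT NULL to match your schema.
ALTER TABLE YourTable ALTER COLUMN col1 nvarchar(10) NULL;

-- Insert with the N prefix so the literal is treated as Unicode.
INSERT INTO YourTable (col1) VALUES (N'مرحبا');

SELECT col1 FROM YourTable;
```

Rows that already contain ???? were converted when they were written, so they have to be re-entered; changing the column type does not restore them.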

varchar or nvarchar

I am storing first name and last name with up to 30 characters each. Which is better: varchar or nvarchar?
I have read that nvarchar takes up twice as much space compared to varchar and that nvarchar is used for internationalization.
So which do you suggest I use: nvarchar or varchar?
Also, please let me know about the performance of both. Do they perform the same, or do they differ? Space is not a big issue for me; performance is.
Basically, nvarchar means you can handle lots of alphabets, not just regular English. Technically, it means unicode support, not just ANSI. This means double-width characters or approximately twice the space. These days disk space is so cheap you might as well use nvarchar from the beginning rather than go through the pain of having to change during the life of a product.
If you're certain you'll only ever need to support one language you could stick with varchar, otherwise I'd go with nvarchar.
This has been discussed on SO before here.
EDITED: changed ascii to ANSI as noted in comment.
First of all, to clarify, nvarchar stores unicode data while varchar stores ANSI (8-bit) data. They function identically but nvarchar takes up twice as much space.
Generally, I prefer storing user names using varchar datatypes unless those names have characters which fall out of the boundary of characters which varchar can store.
It also depends on the database collation. For example, you will not be able to store Russian characters in a varchar field if your database collation is LATIN_CS_AS. But if you are working on a local application that will be used only in Russia, you could set the database collation to a Russian one. That lets you store Russian characters in a varchar field, saving some space.
But nowadays most applications being developed are international, so you have to decide for yourself which users will be signing up and choose the data type based on that.
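The collation dependence can be seen with two varchar columns that differ only in collation. This is a sketch: the column names and the sample word are illustrative, and the two collations are standard Windows collation names chosen for the example.

```sql
-- Same Unicode input, two varchar columns with different code pages.
DECLARE @demo TABLE (
    latin_col    varchar(20) COLLATE Latin1_General_CI_AS,    -- Western European code page
    cyrillic_col varchar(20) COLLATE Cyrillic_General_CI_AS   -- Cyrillic code page
);

INSERT INTO @demo (latin_col, cyrillic_col) VALUES (N'Привет', N'Привет');

-- latin_col is expected to show question marks; cyrillic_col should keep the text.
SELECT latin_col, cyrillic_col FROM @demo;
```

An nvarchar column would keep the text under either collation, which is the trade-off this answer describes: less space with varchar, but only for the scripts the code page covers.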
I have read that nvarchar takes twice as much space as varchar.
Yes.
nvarchar is used for internationalization.
Yes.
What do you suggest I use: nvarchar or varchar?
It depends on the application.
By default go with nvarchar. There is very little reason to go with varchar these days, and every reason to go with nvarchar (allows international characters; as discussed).
varchar is 1 byte per character, nvarchar is 2 bytes per character.
You will use more space with nvarchar but there are many more allowable characters. The extra space is negligible, but you may miss those extra characters in the future. Even if you don't expect to require internationalization, people will often have non-English characters (e.g. é, ñ or ö) in their names.
I would suggest you use nvarchar.
I have read that nvarchar takes twice as much space as varchar
Yes. According to Microsoft: "Storage size, in bytes, is two times the number of characters entered + 2 bytes" (http://msdn.microsoft.com/en-us/library/ms186939(SQL.90).aspx).
But storage is cheap; I never worry about a few extra bytes.
Also, save yourself trouble in the future and set the maximum widths to something more generous, like 100 characters. There is absolutely no storage overhead to this when you're using varchar or nvarchar (as opposed to char/nchar). You never know when you're going to encounter a triple-barrelled surname or some long foreign name which exceeds 30 characters.
nvarchar is used for internationalization.
nvarchar can store any unicode character, such as characters from non-Latin scripts (Arabic, Chinese, etc). I'm not sure how your application will be taking data (via the web, via a GUI toolkit, etc) but it's likely that whatever technology you're using supports unicode out of the box. That means that for any user-entered data (such as name) there is always the possibility of receiving non-Latin characters, if not now then in the future.
If I was building a new application, I would use nvarchar. Call it "future-proofing" if you like.
The nvarchar type is Unicode, so it can handle just about any character that exists in any language on the planet. The characters are stored as 2-byte UCS-2/UTF-16 code units, so each character uses two bytes (characters outside the Basic Multilingual Plane take two code units, i.e. four bytes).
The varchar type uses an 8-bit character set, so it's limited to the 256 characters of the code page that you choose for the field. Different code pages cover different character groups, so it's usually sufficient for text local to a country or a region.
If varchar works for what you want to do, you should use that. It's a bit less data, so it's overall slightly faster. If you need to handle a wide variety of characters, use nvarchar.
on performance:
A reason to use varchar over nvarchar is that you can have twice as many characters in your indexes! Index keys are limited to 900 bytes (SQL Server 2016 raised this to 1,700 bytes for nonclustered indexes).
on usability:
If the application is only ever intended for an English audience and contains only English names, use varchar.
Data to store: "Sunil"
varchar(5) takes 7B
nvarchar(5) takes 12B
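The data part of those figures can be checked with DATALENGTH, as in this small sketch. DATALENGTH reports only the bytes of the value itself; the 7 and 12 above add the 2 bytes of per-value overhead that variable-length columns carry in a row.

```sql
-- Same five-character value in both types.
DECLARE @v varchar(5)  = 'Sunil',
        @n nvarchar(5) = N'Sunil';

SELECT DATALENGTH(@v) AS varchar_bytes,   -- 5  (+ 2 bytes overhead when stored in a row = 7)
       DATALENGTH(@n) AS nvarchar_bytes;  -- 10 (+ 2 bytes overhead when stored in a row = 12)
```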