What is the max value of a CHAR? - sql

I was wondering what the maximum char value is in SQL. In C# it is \uFFFF, but when I use that value in a string comparison, SQL seems to treat it as an empty string.
The table is in SQL_Latin1_General_CP1_CI_AS if that matters.

There is a deep misconception about what ASCII is...
ASCII is a 7-bit code (0 to 127) in which the characters are fixed.
The 8th bit adds a second range of the same size (128 to 255). In this range the characters depend on code pages and collations.
Thinking of CHAR as a BYTE (8 bits in memory) is misleading...
Try this; both return a capital A:
SELECT CHAR(65) COLLATE Latin1_General_CI_AS
SELECT CHAR(65) COLLATE Arabic_CI_AS
The code 255 renders as ÿ with Latin1_General_CI_AS; with the Arabic collation there seems to be no printable character, hence the question mark.
SELECT CHAR(255) COLLATE Latin1_General_CI_AS
SELECT CHAR(255) COLLATE Arabic_CI_AS
So in short: it is not true that SQL renders it as an empty string. The result depends on your settings.
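The same effect can be reproduced outside SQL Server. The following is only a sketch: it assumes that Python's cp1252 and cp1256 codecs are reasonable stand-ins for the code pages behind the Latin1_General and Arabic collations, and decode_byte is a hypothetical helper, not anything SQL Server provides. What Python decodes for 255 under cp1256 need not match what a SQL client displays.

```python
# Assumption: these Python codecs approximate the code pages used by the
# two collations in the queries above.
CODE_PAGES = ("cp1252", "cp1256")


def decode_byte(value: int, code_page: str) -> str:
    """Decode one 8-bit code unit under the given code page."""
    return bytes([value]).decode(code_page, errors="replace")


for cp in CODE_PAGES:
    # 65 lies in the fixed 7-bit range; 255 lies in the code-page-dependent range.
    print(cp, repr(decode_byte(65, cp)), repr(decode_byte(255, cp)))
```

The value 65 decodes to the same letter under both code pages, while 255 does not, which is the point made above: the upper half of the range has no meaning without a code page.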

Did you check the documentation? It clearly says:
char [ ( n ) ]
Fixed-length, non-Unicode string data. n defines the
string length and must be a value from 1 through 8,000. The storage
size is n bytes. The ISO synonym for char is character.

Numerically, the answer is 255. CHAR has a potential range of 0 to 255. It is an 8-bit code unit for the character encoding configured for the field (which it might inherit from the table or database).
Whether 255 is a valid code unit and is a complete codepoint, and which character it represents, and its sort order (is that what you meant by max?), depends on the collation. (A collation specifies a character encoding and sort order.)
Oh, if you are going to compare SQL datatypes to others, NVARCHAR and C#'s char and .NET's Char all use UTF-16 as the character encoding.
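To make the UTF-16 remark concrete, here is a small sketch in Python. The helper utf16_code_units is a hypothetical name introduced for illustration. It counts 16-bit code units, which is the unit that a C# char or one NVARCHAR position holds.

```python
def utf16_code_units(text: str) -> int:
    """Return how many 16-bit code units the text needs in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


# Largest value a single 16-bit code unit can hold; this is the \uFFFF
# mentioned in the question.
MAX_CODE_UNIT = 0xFFFF

print(utf16_code_units("A"))           # fits in one code unit
print(utf16_code_units("\uffff"))      # still one code unit
print(utf16_code_units("\U0001F600"))  # outside the 16-bit range: a surrogate pair
```

So \uFFFF is the largest value of one code unit, not the largest character: code points above it are stored as two code units.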

Related

Printing out box drawing characters (extended-ascii) in SSMS

I want to print box-drawing character in Output messages in SSMS. It includes characters like e.g. ║ or ░ or ╬.
The full list of characters which I have in my mind can be found here.
When I am trying the following: PRINT '╬' it returns simply + while I am expecting ╬.
When I am executing SELECT ASCII('╬') it returns 43, but when I am executing SELECT CHAR(43) it returns (not surprisingly) +.
Is it related to collation? If so, how can I find which collation to use?
A plain literal in SQL Server is by default a CHAR / VARCHAR type. This type is 1-byte-encoded extended ASCII: the lower half is the plain Latin character set, while the upper half depends on the collation. This means there is very little support for non-standard characters.
The second character type is NCHAR / NVARCHAR. This is (almost) Unicode, very close to UTF-16. The actual encoding is two-byte encoded UCS-2. The support for non-standard characters is (almost) complete. Any literal starting with an N is treated as NCHAR / NVARCHAR:
Try this:
SELECT '╬',N'╬';
DECLARE @str1a VARCHAR(10)='╬';
DECLARE @str1b VARCHAR(10)=N'╬'; --The NVARCHAR literal is changed to VARCHAR
DECLARE @str2 NVARCHAR(10)=N'╬';
SELECT @str1a,@str1b,@str2;
The functions to get the code point and, vice versa, to get the character also come in pairs:
SELECT ASCII('a'), UNICODE(N'a')
,ASCII('╬'), UNICODE(N'╬')
,CHAR(97),NCHAR(97),CHAR(43),NCHAR(43)
,NCHAR(9580)--does not work with `CHAR`
You then need to print them as Unicode, i.e. prefix the literal with N:
PRINT N'╬'
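Why the non-N literal loses the character can be sketched in Python. This is an illustration under assumptions: cp1252 is taken as the code page of a typical Latin1 collation, and to_code_page is a hypothetical helper. Python substitutes a question mark for an unmappable character, whereas SQL Server in the question above picked a similar-looking +, so only the failure to round-trip carries over, not the exact replacement.

```python
BOX = "\u256c"  # the box-drawing character ╬

# The code point as a decimal number, comparable to UNICODE(N'╬').
print(ord(BOX))


def to_code_page(text: str, code_page: str) -> bytes:
    """Encode text into a single-byte code page, replacing what does not fit."""
    return text.encode(code_page, errors="replace")


print(to_code_page(BOX, "cp1252"))  # not representable in this code page
print(to_code_page(BOX, "cp437"))   # the old DOS code page does contain it
print(BOX.encode("utf-16-le"))      # two bytes, as in an NVARCHAR value
```

The character exists in some single-byte code pages but not in others, which is why the answer depends on the collation for VARCHAR and not for NVARCHAR.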

Select truncated string from Postgres

I have some large varchar values in Postgres that I want to SELECT and move somewhere else. The place they are going to uses VARCHAR(4095) so I only need at most 4095 bytes (I think that's bytes) and some of these varchars are quite big, so a performance optimization would be to SELECT a truncated version of them.
How can I do that?
Something like:
SELECT TRUNCATED(my_val, 4095) ...
I don't think it's a character length though, it needs to be a byte length?
The n in varchar(n) is the number of characters, not bytes. The manual:
SQL defines two primary character types: character varying(n) and
character(n), where n is a positive integer. Both of these types can
store strings up to n characters (not bytes) in length.
Bold emphasis mine.
The simplest way to "truncate" a string would be with left():
SELECT left(my_val, 4095)
Or just cast:
SELECT my_val::varchar(4095)
The manual once more:
If one explicitly casts a value to character varying(n) or
character(n), then an over-length value will be truncated to n
characters without raising an error. (This too is required by the SQL standard.)
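Since the question wondered about bytes while left() and the cast count characters, the difference can be shown with a short Python sketch. Both function names are hypothetical, and UTF-8 is assumed as the encoding. The byte-based variant drops a multi-byte character that would otherwise be cut in half.

```python
def left_chars(text: str, n: int) -> str:
    """Keep at most n characters, as left(my_val, n) does."""
    return text[:n]


def left_bytes(text: str, max_bytes: int, encoding: str = "utf-8") -> str:
    """Keep as many whole characters as fit into max_bytes encoded bytes."""
    clipped = text.encode(encoding)[:max_bytes]
    # errors="ignore" discards an incomplete trailing byte sequence.
    return clipped.decode(encoding, errors="ignore")


sample = "h\u00e9llo"  # "héllo": 5 characters, 6 bytes in UTF-8
print(left_chars(sample, 2))
print(left_bytes(sample, 2))
print(left_bytes(sample, 3))
```

If the destination really limits bytes, a character-based truncation can still overflow it when the data contains multi-byte characters.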

Are there any limits on length of string in mysql?

I am using MySQL data base with Rails. I have created a field of type string. Are there any limits to its length? What about type text?
Also as text is variable sized, I believe there would be extra costs associated with using text objects. How important can they get, if at all?
CHAR
A fixed-length string that is always right-padded with spaces to the specified length when stored. The range of Length is 1 to 255 characters. Trailing spaces are removed when the value is retrieved. CHAR values are sorted and compared in case-insensitive fashion according to the default character set unless the BINARY keyword is given.
VARCHAR
A variable-length string. Note: Trailing spaces are removed when the value is stored (this differs from the ANSI SQL specification).
The range of Length is 1 to 255 characters. VARCHAR values are sorted and compared in case-insensitive fashion unless the BINARY keyword is given.
TINYBLOB, TINYTEXT
A TINYBLOB or TINYTEXT column with a maximum length of 255 (2^8 - 1) characters.
BLOB, TEXT
A BLOB or TEXT column with a maximum length of 65,535 (2^16 - 1) characters; in bytes, 64 KiB.
MEDIUMBLOB, MEDIUMTEXT
A MEDIUMBLOB or MEDIUMTEXT column with a maximum length of 16,777,215 (2^24 - 1) characters; in bytes, 16 MiB.
LONGBLOB, LONGTEXT
A LONGBLOB or LONGTEXT column with a maximum length of 4,294,967,295 (2^32 - 1) characters; in bytes, 4 GiB.
See MySQL Data Types Quick Reference Table for more info.
You can also see MySQL - String Type Overview.
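The maximum lengths in the list above are all of the form "two to the power of some number of bits, minus one". A quick Python check follows. The mapping from type to length-prefix size (1 to 4 bytes) is taken from the MySQL storage documentation rather than from this answer, and max_length is a hypothetical helper.

```python
# Assumption: each type stores its length in a prefix of this many bytes.
LENGTH_PREFIX_BYTES = {
    "TINYTEXT": 1,
    "TEXT": 2,
    "MEDIUMTEXT": 3,
    "LONGTEXT": 4,
}


def max_length(prefix_bytes: int) -> int:
    """Largest length that fits in an unsigned prefix of prefix_bytes bytes."""
    return 2 ** (8 * prefix_bytes) - 1


for type_name, prefix in LENGTH_PREFIX_BYTES.items():
    print(f"{type_name}: {max_length(prefix):,}")
```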
String, in general, should be used for short text. For example, it is a VARCHAR(255) under MySQL.
Text uses the larger text from the database, like, in MySQL, the type TEXT.
For information on how this works and the internals in MySQL and limits and such, see the other answer by Pekka.
If you are requesting, say, a paragraph, I would use text. If you are requesting a username or email, use string.
See the MySQL manual on String Types.
Varchar (String):
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
Text: See storage requirements
If you want a fixed size text field, use CHAR which can be 255 characters in length maximum. VARCHAR and TEXT both have variable length.
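How the row-size limit and the character set interact for VARCHAR can be estimated with simple arithmetic. The sketch below is a rough model only. It assumes a table with a single VARCHAR column, a 2-byte length prefix, and one extra byte when the column is nullable. Real limits also depend on the storage engine and on the other columns, and max_varchar_chars is a hypothetical helper.

```python
MAX_ROW_BYTES = 65_535   # maximum row size quoted from the manual above
LENGTH_PREFIX_BYTES = 2  # assumption: prefix size for long VARCHAR values


def max_varchar_chars(bytes_per_char: int, nullable: bool = False) -> int:
    """Estimate the largest n for VARCHAR(n) in a one-column table."""
    available = MAX_ROW_BYTES - LENGTH_PREFIX_BYTES - (1 if nullable else 0)
    return available // bytes_per_char


print(max_varchar_chars(1))  # single-byte character set such as latin1
print(max_varchar_chars(3))  # character set with up to 3 bytes per character
print(max_varchar_chars(4))  # character set with up to 4 bytes per character
```

The wider the character set, the fewer characters a VARCHAR can declare, even though the byte budget stays the same.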

What is the purpose of putting an 'N' in front of function parameters in TSQL?

What is the purpose of putting an 'N' in front of function parameters in TSQL?
For example, what does the N mean in front of the function parameter in the following code:
object_id(N'dbo.MyTable')
It indicates a "nationalized", a.k.a. Unicode, string constant.
http://support.microsoft.com/kb/239530
When dealing with Unicode string constants in SQL Server you must precede all Unicode strings with a capital letter N, as documented in the SQL Server Books Online topic "Using Unicode Data".
http://msdn.microsoft.com/en-us/library/aa276823%28SQL.80%29.aspx
nchar and nvarchar
Character data types that are either fixed-length (nchar) or variable-length (nvarchar) Unicode data and use the UNICODE UCS-2 character set.
nchar(n)
Fixed-length Unicode character data of n characters. n must be a value from 1 through 4,000. Storage size is two times n bytes. The SQL-92 synonyms for nchar are national char and national character.
nvarchar(n)
Variable-length Unicode character data of n characters. n must be a value from 1 through 4,000. Storage size, in bytes, is two times the number of characters entered. The data entered can be 0 characters in length. The SQL-92 synonyms for nvarchar are national char varying and national character varying.

Difference between BYTE and CHAR in column datatypes

In Oracle, what is the difference between :
CREATE TABLE CLIENT
(
NAME VARCHAR2(11 BYTE),
ID_CLIENT NUMBER
)
and
CREATE TABLE CLIENT
(
NAME VARCHAR2(11 CHAR), -- or even VARCHAR2(11)
ID_CLIENT NUMBER
)
Let us assume the database character set is UTF-8, which is the recommended setting in recent versions of Oracle. In this case, some characters take more than 1 byte to store in the database.
If you define the field as VARCHAR2(11 BYTE), Oracle can use up to 11 bytes for storage, but you may not actually be able to store 11 characters in the field, because some of them take more than one byte to store, e.g. non-English characters.
By defining the field as VARCHAR2(11 CHAR) you tell Oracle it can use enough space to store 11 characters, no matter how many bytes it takes to store each one. A single character may require up to 4 bytes.
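The two length semantics can be mimicked in Python as a sketch, assuming UTF-8 as the database character set, as the answer does. Both function names are hypothetical and only model the size check, not Oracle's actual behaviour on insert.

```python
def fits_byte_limit(text: str, limit: int, encoding: str = "utf-8") -> bool:
    """Model VARCHAR2(limit BYTE): the encoded size must not exceed limit."""
    return len(text.encode(encoding)) <= limit


def fits_char_limit(text: str, limit: int) -> bool:
    """Model VARCHAR2(limit CHAR): the character count must not exceed limit."""
    return len(text) <= limit


ascii_name = "abcdefghijk"      # 11 characters, 11 bytes in UTF-8
accented_name = "\u00e9" * 11   # 11 times é: 11 characters, 22 bytes in UTF-8

for name in (ascii_name, accented_name):
    print(len(name), fits_byte_limit(name, 11), fits_char_limit(name, 11))
```

Both values are 11 characters long, but only the plain ASCII one fits under byte semantics.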
One has exactly space for 11 bytes, the other for exactly 11 characters. Some charsets such as Unicode variants may use more than one byte per char, therefore the 11 byte field might have space for less than 11 chars depending on the encoding.
See also http://www.joelonsoftware.com/articles/Unicode.html
Depending on the system configuration, the size of a CHAR measured in BYTES can vary. In your examples:
The first limits the field to 11 BYTEs.
The second limits the field to 11 CHARacters.
Conclusion: 1 CHAR is not equal to 1 BYTE.
I am not sure since I am not an Oracle user, but I assume that the difference lies when you use multi-byte character sets such as Unicode (UTF-16/32). In this case, 11 Bytes could account for less than 11 characters.
Also those field types might be treated differently in regard to accented characters or case, for example 'binaryField(ete) = "été"' will not match while 'charField(ete) = "été"' might (again not sure about Oracle).
In simple words when you write NAME VARCHAR2(11 BYTE) then only 11 Byte can be accommodated in that variable.
This holds no matter which character set you are using. For example, if you are using Unicode (UTF-16), where a character takes at least two bytes, NAME can hold only about half as many characters as its size in bytes.
On the other hand, if you write NAME VARCHAR2(11 CHAR) then NAME can accommodate 11 CHAR regardless of their character encoding.
BYTE is the default if you do not specify BYTE or CHAR (unless the NLS_LENGTH_SEMANTICS parameter has been set to CHAR).
So if you write NAME VARCHAR2(4000 BYTE) and use Unicode(UTF-16) character encoding then only 2000 characters can be accommodated in NAME
That means the size limit on the variable is applied in BYTES, and how many characters can be accommodated in that variable depends on the character encoding.