text or varchar? - sql

I have two columns containing text: one will be at most 150 characters long and the other at most 700 characters long.
My question is: should I use varchar for both, or should I use text for the 700-character column? And why?
Thanks,

The varchar data type in MySQL < 5.0.3 cannot hold data longer than 255 characters, while in MySQL >= 5.0.3 it has a maximum of 65,535 characters (subject to the overall row-size limit).
So it depends on the MySQL versions you're targeting and your deployment requirements. If you want to be sure it will work on MySQL versions older than 5.0.3, go with a text type for your longer column.
http://dev.mysql.com/doc/refman/5.0/en/char.html
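A minimal sketch, assuming MySQL 5.0.3 or later and made-up table and column names, of how the two columns could be declared:
CREATE TABLE profiles (
    id        INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    bio_short VARCHAR(150),
    bio_long  VARCHAR(700)  -- switch this to TEXT if MySQL < 5.0.3 must be supported
);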

An important consideration is that with varchar the database stores the data inline in the table row, whereas with text the database stores a pointer to a separate storage area where the data lives. So, unless you run into the row-length limit (64K in MySQL, 4-32K in DB2, 8K in SQL Server 2000), I would normally use varchar.

Not sure about MySQL specifically, but in MS SQL Server you should definitely use a VARCHAR for anything under 8,000 characters if you want to be able to run comparisons on the value in the field. For example, this is possible with a VARCHAR:
select your_column from your_table
where your_column like '%dogs%'
With the legacy TEXT type, however, LIKE and IS NULL are the only comparisons allowed; ordinary operators such as = and most string functions do not work on it.
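A hedged illustration of that restriction (hypothetical table and column names; the exact error text may vary by version):
CREATE TABLE notes (
    note_varchar VARCHAR(8000),
    note_text    TEXT            -- legacy LOB type
);

SELECT note_varchar FROM notes WHERE note_varchar = 'dogs';  -- works
-- SELECT note_text FROM notes WHERE note_text = 'dogs';
-- fails: text can only be compared with IS NULL or LIKE, not with =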
More information about the TEXT type and the VARCHAR type can be found in the MySQL reference manual.

I'd pick the option based on the type of data being entered. For example, if it's a (potentially) very long username and a biography, I'd use varchar and text respectively. Sure, you're limiting the bio to 700 characters, but what if you add HTML formatting down the road, keeping the 700-character limit for the visible text while allowing HTML tags for formatting?
Twitter may well use text for its tweets, since it could be quicker to add metadata to the posts (e.g. URL hrefs and #user primary keys) and cache that additional data instead of recalculating it on every render.

sql server data length [duplicate]

What is the best way to store a large amount of text in a table in SQL server?
Is varchar(max) reliable?
In SQL 2005 and higher, VARCHAR(MAX) is indeed the preferred method. The TEXT type is still available, but primarily for backward compatibility with SQL 2000 and lower.
I like using VARCHAR(MAX) (or actually NVARCHAR) because it works like a standard VARCHAR field. Since its introduction, I use it rather than TEXT fields whenever possible.
Varchar(max) is available only in SQL 2005 or later. This will store up to 2GB and can be treated as a regular varchar. Before SQL 2005, use the "text" type.
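A minimal sketch (made-up table and column names) showing that an NVARCHAR(MAX) column behaves like any other varchar in ordinary queries (SQL Server 2005+):
CREATE TABLE articles (
    id    INT IDENTITY(1,1) PRIMARY KEY,
    title NVARCHAR(150),
    body  NVARCHAR(MAX)   -- up to ~2 GB of Unicode text
);

INSERT INTO articles (title, body)
VALUES (N'Hello', N'A very long article body...');

SELECT id, LEN(body) AS body_length
FROM articles
WHERE body LIKE N'%long%';   -- string functions and LIKE work as usual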
According to the documentation, varbinary(max) is the way to go. You'll be able to store approximately 2 GB of data.
Split the text into chunks that your database can actually handle, and put the chunked text in another table. Use the id from the text_chunk table as text_chunk_id in your original table. You might want another column in your original table to keep any text that fits within your largest regular text data type.
CREATE TABLE text_chunk (
    id INT,
    chunk_sequence INT,
    chunk_text VARCHAR(8000))
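A hedged usage sketch for the chunking approach above (STRING_AGG needs SQL Server 2017+; on older versions you would loop or use FOR XML to stitch the pieces back together):
-- store a long value as ordered chunks
INSERT INTO text_chunk (id, chunk_sequence, chunk_text)
VALUES (1, 1, 'first part of a very long document...'),
       (1, 2, '...second part of the document');

-- reassemble it, casting to VARCHAR(MAX) so the result is not capped at 8000 characters
SELECT STRING_AGG(CAST(chunk_text AS VARCHAR(MAX)), '')
           WITHIN GROUP (ORDER BY chunk_sequence) AS full_text
FROM text_chunk
WHERE id = 1;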
In a BLOB
BLOBs are very large variable binary or character data, typically documents (.txt, .doc) and pictures (.jpeg, .gif, .bmp), which can be stored in a database. In SQL Server, BLOBs can be of the text, ntext, or image data types; for character data you can use the text type:
text
Variable-length non-Unicode data, stored in the code page of the server, with a maximum length of 2^31 - 1 (2,147,483,647) characters.
Depending on your situation, a design alternative to consider is saving the text as a .txt file on the server and storing the file path in your database.
Use nvarchar(max) to store the whole chat conversation thread in a single record. Each individual text message (or block) is identified within the content by inserting markers.
Example:
{{UserId: Date and time}}<Chat Text>.
At display time, the UI should be intelligent enough to understand these markers and display them correctly. This way one record should suffice for a single conversation, as long as the size limit is not reached.
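A rough sketch of appending one message block with such a marker (table, column, and variable names are assumptions; CONCAT needs SQL Server 2012+):
DECLARE @conversation_id INT = 42,
        @user_id         INT = 7,
        @message_text    NVARCHAR(MAX) = N'Hello!';

UPDATE conversations
SET thread_text = CONCAT(thread_text,
                         N'{{', @user_id, N': ',
                         CONVERT(VARCHAR(19), SYSDATETIME(), 120), N'}}',
                         @message_text)
WHERE conversation_id = @conversation_id;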

SQL Server NText field limited to 43,679 characters?

I am working with a SQL Server database in order to store a very long Unicode string. The field is of type 'ntext', which theoretically should be limited to 2^30 Unicode characters.
From MSDN documentation:
ntext
Variable-length Unicode data with a maximum string length of 2^30 - 1 (1,073,741,823) bytes. Storage size, in bytes, is two times the string length that is entered. The ISO synonym for ntext is national text.
I made this test:
Generate a 50,000-character string.
Run an UPDATE SQL statement:
UPDATE [table]
SET Response='... 50,000 character string...'
WHERE ID='593BCBC0-EC1E-4850-93B0-3A9A9EB83123'
Check the result - what is actually stored in the field at the end.
The result was that the field [Response] contains only 43,679 characters. All the characters at the end of the string were thrown away.
Why does this happen? How can I fix it?
If this really is the capacity limit of this data type (ntext), which other data type can store a longer Unicode string?
Based on what I've seen, you may just only be able to copy 43,679 characters. It is storing all the characters - they're in the DB (check this with Select Len(Response) From [table] Where ... to verify) - but SSMS has problems copying or displaying more than that when you go to look at the full data.
NTEXT datatype is deprecated and you should use NVARCHAR(MAX).
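If you can alter the schema, a minimal migration sketch (using the table and column names from the question) would be:
ALTER TABLE [table] ALTER COLUMN Response NVARCHAR(MAX);

-- optionally rewrite existing values so they are stored using the newer format
UPDATE [table] SET Response = Response;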
I see two possible explanations:
The ODBC driver you use to connect to the database truncates the parameter value when it is too long (try using SSMS instead).
You write that you generate your input string; I suspect you generate CHAR(0), which is the null character.
If the second is your case, make sure you cannot generate the \0 character.
EDIT:
I don't know how you check the length, but keep in mind that LEN does not count trailing whitespace:
SELECT LEN('aa     ') AS length -- 2
      ,DATALENGTH('aa     ') AS datalength -- 7
The last possible explanation I can see is that you do something like:
SELECT 'aa    aaaa'
-- shown in SSMS as `aa aaaa`: the multiple whitespaces are collapsed, so when you count the displayed text you lose them
Check whether the query below returns the full 100k:
SELECT DATALENGTH(ntext_column)
to see all the bytes. To look at the complete value, right-click the grid result and choose to save the result to a file.
Can confirm. The actual limit is 43,679. We had a problem with a subscription service for a week. All the data looked good, but it still gave us an error that one of the fields had invalid values, even though it was getting correct values in. It turned out that the parameters were stored in NText and it maxed out at 43,679 characters. And because we cannot change the database design, we had to make two different subscriptions for the same thing and put half of the entities in the other one.

Text datatype in SQL got trimmed to 8000 characters

In SQL Server I have a table with around 2,000 rows, and I store XML in a column. The data type of the column is text. Around 80 rows are currently trimmed to exactly 8,000 characters, even though the values in that column have more than 8,000 characters. What could be the issue?
8,000 characters is actually the maximum number of characters that SSMS will display (by default); it will return the full amount to your front end - at least in the case of varchar(max). I think that is the case with text as well, but I don't use the text type anymore :P
It should be noted that you shouldn't be using the text data type in SQL Server, as it is in the process of being deprecated.
See: https://msdn.microsoft.com/en-us/library/ms143729.aspx for information on text's deprecation.
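A hedged migration sketch (table and column names are placeholders); moving off text also sidesteps the display quirks mentioned above:
ALTER TABLE dbo.YourTable ALTER COLUMN XmlData VARCHAR(MAX);

-- DATALENGTH reports the bytes actually stored, regardless of what SSMS displays
SELECT MAX(DATALENGTH(XmlData)) AS longest_value_in_bytes
FROM dbo.YourTable;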

SQL - Numeric data type with leading zeros

I need to store Medicare APC codes. I believe the format requires 4 digits, and leading zeros are significant. Is there any way to store this data type with verification? How should I store this data (varchar(4), int)?
This kind of issue - storing zero-leading numbers that need to be treated as numeric values in some scenarios (e.g. sorting) and as textual values in others (e.g. addresses) - is always a pain, and there is no one answer that is best for all users. At my company we have a database that stores numbers as text for codes (not Medicare APC codes), and we must pad them with zeros so they sort properly when used in an ORDER BY.
Do not use a numeric data type for this, because the item is not a true number but textual data that uses numeric characters. You will not be performing any calculations or aggregates on the codes, so the only benefit of storing them as a number would be proper sorting, and that can be achieved with the code stored as text by padding it with zeros where needed. If you use a numeric data type, then any time the code is combined with other textual values you will have to explicitly convert it to CHAR/VARCHAR or let SQL Server do it, and since implicit conversions should always be avoided, that means a lot of extra work for you and the query processor every time the code is used.
Assuming you decide to go with a textual data type, the question then is whether you should use VARCHAR or CHAR, and while many who have posted say VARCHAR, I would suggest you go with CHAR at a length of 4. Why?
The VARCHAR data type is for textual data in which the size (the length or number of characters) is unknown in advance. For this Medicare code we know the length will always be 4, at least for the foreseeable future. SQL Server handles the storage of the data differently between CHAR and VARCHAR. SQL Server's BOL (Books Online) says:
Use CHAR when the sizes of the column data entries are consistent.
Use VARCHAR when the sizes of the column data entries vary considerably.
I can't say for certain this still holds for SQL Server 2008 and up, but in earlier versions the use of a VARCHAR data type carries an extra couple of bytes of overhead per row for each VARCHAR column in the table. If the data stored is always the same size, and in your scenario it sounds like it is, then that extra overhead is a waste.
In the end it's up to you whether you like CHAR or VARCHAR better, but definitely don't use a numeric data type to store a fixed-length code.
That's not numeric data; it's textual data that happens to contain digits.
Use a VARCHAR.
I agree. Use
CHAR(4)
and for the check constraint use
check( APC_CODE LIKE '[0-9][0-9][0-9][0-9]' )
This will force only a 4-digit number to be accepted...
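Putting the two together, a minimal sketch (hypothetical table name, T-SQL pattern syntax):
CREATE TABLE apc_codes (
    apc_code CHAR(4) NOT NULL,
    CONSTRAINT ck_apc_code_digits
        CHECK (apc_code LIKE '[0-9][0-9][0-9][0-9]')
);

INSERT INTO apc_codes (apc_code) VALUES ('0012');  -- accepted, leading zero preserved
-- INSERT INTO apc_codes (apc_code) VALUES ('12'); -- rejected: padded to '12  ', fails the pattern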
varchar(4)
Optionally, you can still add a check constraint to ensure the data is numeric with leading zeros. This example will throw an exception in Oracle if the value is not numeric; in other RDBMSs you could use regular-expression checks:
alter table X add constraint C
check (cast(APC_CODE as int) = cast(APC_CODE as int))
If you are certain that the APC codes will always be numeric (that is, if that won't change in the near future), a better way would be to leave the database column as is and handle the formatting (including leading zeros) wherever you use this field's values.
If you need leading 0s, then you must use varchar or another string data type.
There are ways to format the output with leading 0s without compromising your actual data.
See below for an easy method.
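For example, a small sketch of padding at output time if the code were stored as an INT (T-SQL; FORMAT requires SQL Server 2012+):
DECLARE @apc_code INT = 12;

SELECT RIGHT('0000' + CAST(@apc_code AS VARCHAR(4)), 4) AS padded_code,   -- '0012'
       FORMAT(@apc_code, '0000')                        AS padded_code_2; -- '0012'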
CHAR(4) seems more appropriate to me (if I understood you right, and the code is always 4 digits).
What you want is a VARCHAR data type with a CHECK constraint, either using LIKE with a digit pattern or checking that the value is numeric.
In T-SQL:
check( isnumeric(APC_CODE) = 1)

Why does VARCHAR need length specification?

Why do we always need to specify VARCHAR(length) instead of just VARCHAR? It is dynamic anyway.
UPD: I'm puzzled specifically by the fact that it is mandatory (e.g. in MySQL).
The "length" of the VARCHAR is not the length of the contents, it is the maximum length of the contents.
The max length of a VARCHAR is not dynamic, it is fixed and therefore has to be specified.
If you don't want to define a maximum size for it then use VARCHAR(MAX).
First off, it is not needed in all databases. Look at SQL Server, where it is optional.
Regardless, it defines a maximum size for the content of the field. Not a bad thing in itself, and it conveys meaning (for example - phone numbers, where you do not want international numbers in the field).
You can see it as a constraint on your data. It ensures that you don't store data that violates your constraint. It is conceptually similar to, e.g., a check constraint on an integer column that ensures only positive values are entered.
The more the database knows about the data it is storing, the more optimisations it can make when searching/adding/updating data with requests.
The answer is you don't need to, it's optional.
It's there if you want to ensure that strings do not exceed a certain length.
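A minimal sketch (made-up table) of the declared length acting as a constraint; the exact error text depends on the engine and its strictness settings:
CREATE TABLE phone_numbers (phone VARCHAR(20));

INSERT INTO phone_numbers (phone) VALUES ('+1-555-0100');  -- fits
INSERT INTO phone_numbers (phone)
VALUES ('this value is clearly longer than twenty characters');
-- SQL Server: "String or binary data would be truncated"
-- MySQL (strict mode): "Data too long for column 'phone'"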
From Wikipedia:
Varchar fields can be of any size up to the limit. The limit differs between types of databases: an Oracle 9i Database has a limit of 4000 bytes, a MySQL Database has a limit of 65,535 bytes (for the entire row) and Microsoft SQL Server 2005 has a limit of 8000 bytes (unless varchar(max) is used, which has a maximum storage capacity of 2,147,483,648 bytes).
The most dangerous thing for programmers, as @DimaFomin pointed out in the comments, is the default length enforced when no length is specified.
How SQL Server enforces the default length:
declare @v varchar = '123'
select @v
result:
1