Getting max length of a varchar(max) from syscolumns in SQL Server

select c.name, t.name, c.length
from syscolumns c
join systypes t on t.xusertype = c.xusertype
c.length gives me -1 for any column declared with max, e.g. varchar(max).
What should I do to get the length?

The length column of syscolumns (max_length in sys.columns) is a smallint, whilst the maximum length of a varchar(max) is 2^31-1 bytes (about 2.1 billion), so it cannot hold the real length. The documentation states that -1 denotes varchar(max), varbinary(max), nvarchar(max) and xml.
http://msdn.microsoft.com/en-us/library/ms176106(v=sql.100).aspx
If you really need the number, then you would need a CASE expression to replace -1 with 2147483647, i.e. 2^31-1. Write the literal: in T-SQL the ^ operator is bitwise XOR, not exponentiation.
If you want the length of the physical data, then you need to take the MAX / MIN / AVG of the actual lengths in the tables that hold the data, depending on what you need that information for. When querying the length of a field, DATALENGTH returns the number of bytes used and LEN returns the character count (excluding trailing spaces).
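The replacement of -1 described above can be sketched in Python, just to show the arithmetic. The function name `declared_max_bytes` is made up for this illustration; the only facts it relies on are the ones stated above (-1 marks a MAX type, and the limit is 2^31-1 bytes).

```python
# Sentinel that the catalog reports as the length of a MAX-type column.
MAX_SENTINEL = -1
# Upper bound for a MAX type: 2^31 - 1 bytes.
MAX_TYPE_BYTES = 2**31 - 1


def declared_max_bytes(reported_length: int) -> int:
    """Translate the reported column length into a byte limit.

    Hypothetical helper: -1 becomes 2^31 - 1, anything else is returned as-is.
    """
    if reported_length == MAX_SENTINEL:
        return MAX_TYPE_BYTES
    return reported_length


print(declared_max_bytes(-1))    # 2147483647
print(declared_max_bytes(8000))  # 8000
```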

-1 means that the column is of a MAX type. The maximum length is then that of the MAX type, as per the documentation: 2 GB if the FILESTREAM attribute is not specified, or otherwise a size limited only by the available disk space:
The sizes of the BLOBs are limited only by the volume size of the file
system. The standard varbinary(max) limitation of 2-GB file sizes does
not apply to BLOBs that are stored in the file system.
Therefore your question really doesn't have an answer. What you can ask for is the actual size of any actual value in the table, using DATALENGTH.

As the documentation for varchar states:
Variable-length, non-Unicode character data. n can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size is the actual length of data entered + 2 bytes. The data entered can be 0 characters in length. The ISO synonyms for varchar are char varying or character varying.
In other words, max = 2147483647 bytes if all the possible space is occupied.

The length of the column in each row could vary, hence the result of -1.

Related

Numeric Data Type - Storage

According to the Microsoft documentation, a value of type numeric(10,2), where 10 is the precision, should take 9 bytes.
But when I do this:
DECLARE @var AS numeric(10,0) = 2147483649;
SELECT @var, DATALENGTH(@var);
DATALENGTH(@var) returns 5 bytes instead of 9. Can someone explain why?
The documentation specifies:
Maximum storage sizes vary, based on the precision.
The storage is not constant for a given precision. The actual storage depends on the value.
As a note, this has nothing to do with integerness. The following also returns 5:
declare @var numeric(11, 1) = 214483649.8;
In actual fact, SQL Server seems to use the amount of storage needed for the value, not for the maximum value of the type. You can readily see this by changing the "10" to "20" and noting that the data length does not change.
EDIT:
You can see the dependence on the value if you run:
declare @a numeric(20, 1) = '123.1';
declare @b numeric(20, 1) = '1234567890123456789.0';
select datalength(@a), datalength(@b);
The two lengths are not the same.
The other answer, by @GordonLinoff, is wrong, or at least misleading.
Numeric is not stored with a variable number of bytes, but with a fixed size for a specific precision.
Trying this on SQL Server 2017 gave the same results you got.
The documentation you linked to originally, for numeric, is correct about how many bytes it takes to store a numeric of varying precisions.
This storage requirement is based only on the precision of the numeric column. In other words, that's how many bytes of storage are used. It is not a maximum that depends on the value in that row.
All rows use the same number of bytes for that column.
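As a rough illustration of "storage depends only on precision", here is a small Python sketch of the precision-to-bytes table from the decimal/numeric documentation (1-9: 5 bytes, 10-19: 9, 20-28: 13, 29-38: 17). The function name is made up, and the sketch assumes plain uncompressed on-disk storage.

```python
def numeric_storage_bytes(precision: int) -> int:
    """On-disk bytes for a numeric/decimal column of the given precision.

    Hypothetical helper mirroring the documented table; the stored value
    plays no part in the result.
    """
    if not 1 <= precision <= 38:
        raise ValueError("precision must be between 1 and 38")
    if precision <= 9:
        return 5
    if precision <= 19:
        return 9
    if precision <= 28:
        return 13
    return 17


for p in (10, 11, 20):
    print(p, numeric_storage_bytes(p))
# 10 9
# 11 9
# 20 13
```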
The key to this discrepancy is that the documentation for DATALENGTH says this function
Returns the number of bytes used to represent any expression.
It appears that DATALENGTH does not mean 'represent' as in 'represent on disk', but rather 'represent in memory'.
The other documentation regarding numeric is talking about the on-disk storage of numeric.
This is probably because DATALENGTH is intended primarily for var* types or the other BLOB types.
So although a numeric(20,1) requires 13 bytes of storage, depending on the value, SQL Server can represent it in a smaller number of bytes when in memory, which is when DATALENGTH evaluates it.
As I pointed out in my other comment, although numeric comes in different sizes, it is a fixed-size data type, because for a specific column in a specific table, every value takes up the same amount of storage.
Roughly, a SQL Server row has 4 parts:
1. 4-byte header
2. Fixed-size data
3. Offsets into variable-size data
4. Variable-size data
Numerics and other fixed-size types are stored in part 2, var* types are stored in part 4, with their lengths in part 3.
This script displays the metadata for a table with some fixed & variable columns.
-- DATALENGTH on variables varies with the value
declare @a numeric(20, 1) = '123.1';
declare @b numeric(20, 1) = '1234567890123456789.0';
select datalength(@a) union select datalength(@b);
-- the same values stored in a temp table
create table #numeric(num1 numeric(20,1), text1 varchar(10), char2 char(6));
insert into #numeric(num1, text1, char2) values ('123.1', 'hello', 'first'), ('1234567890123456789.0', 'there', '2nd');
select datalength(num1) from #numeric;
-- physical column metadata for the temp table
select
    t.name as table_name,
    c.name as column_name,
    pc.partition_column_id,
    pc.max_inrow_length,
    pc.max_length,
    pc.precision,
    pc.scale,
    pc.collation_name,
    pc.leaf_offset
from tempdb.sys.tables as t
join tempdb.sys.partitions as p
    on (t.object_id = p.object_id)
join tempdb.sys.system_internals_partition_columns as pc
    on (pc.partition_id = p.partition_id)
join tempdb.sys.columns as c
    on ((c.object_id = p.object_id) and (c.column_id = pc.partition_column_id))
where (t.object_id = object_id('tempdb..#numeric'));
drop table #numeric;
Notice the leaf_offset column. This indicates the starting position of the value in the raw binary data.
The first column starts immediately after the 4 byte header.
The second fixed column starts 13 bytes later, as per the SQL documentation.
The varchar column has an offset of -1, indicating it is a variable-length column and its position in the byte array isn't fixed.
In this case it could be fixed, since there's only one var column, but an ALTER TABLE statement could add another column and shift things.
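The offset arithmetic described above can be sketched in Python. This is a simplified model, not the storage engine's actual logic: it assumes fixed-size columns are simply packed one after another behind the 4-byte header, and the helper name is made up. The sizes used are 13 bytes for numeric(20,1) and 6 bytes for char(6).

```python
ROW_HEADER_BYTES = 4  # the 4-byte row header mentioned above


def fixed_column_offsets(columns):
    """Return {name: offset} for fixed-size columns packed after the header.

    `columns` is a list of (name, size_in_bytes) pairs in storage order.
    Variable-length columns are not modelled; they are the ones reported
    with an offset of -1.
    """
    offsets = {}
    position = ROW_HEADER_BYTES
    for name, size in columns:
        offsets[name] = position
        position += size
    return offsets


layout = fixed_column_offsets([("num1", 13), ("char2", 6)])
print(layout)  # {'num1': 4, 'char2': 17}
```

Under these assumptions the second fixed column lands 13 bytes after the first, which is the gap the answer points out.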
If you want to research further, the best source is a book called SQL Server Internals, by Kalen Delaney. She was part of the team that wrote SQL Server.

SQL Server size difference for a column

I have a table in SQL Server, say "Temp", with columns Addr1, Addr2, Addr3 and Addr4, plus some additional columns.
Addr1, Addr2, Addr3 and Addr4 are of type nvarchar. When I check the size of these columns in Object Explorer, it shows all of them as nvarchar(100).
But when I check them using Alt + F1, the Result pane shows the length as 200.
Why is there a difference?
If I enter more than 100 characters, I get truncation errors, so it seems to accept only 100 characters.
Can you please let me know what the length value specifies?
Thanks,
Prakash.
That is because the size listed in Object Explorer is the number of characters, whereas the size listed in the result of sp_help is the number of bytes.
VARCHAR values in SQL use 1 byte per character, whereas NVARCHAR values use 2 bytes per character. Both also need a 2 byte overhead - see below. So because you are looking at NVARCHAR columns, these need 200 (well actually 202) bytes to store 100 characters, where a VARCHAR would only require 100 (really 102).
References:
MSDN: char and varchar
The storage size is the actual length of the data entered + 2 bytes.
MSDN: nchar and nvarchar:
The storage size, in bytes, is two times the actual length of data entered + 2 bytes.
(emphasis mine)
MSDN: sp_help:
Reports information about a database object (any object listed in the sys.sysobjects compatibility view), a user-defined data type, or a data type.
/------------------------------------------------------------------------\
| Column name | Data type | Description |
|-------------+-----------+----------------------------------------------|
| Length | smallint | Physical length of the data type (in bytes). |
\------------------------------------------------------------------------/
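To make the characters-versus-bytes arithmetic concrete, here is a small Python sketch. The helper names are invented for the illustration, and it assumes 1 byte per character for varchar and 2 bytes per character for nvarchar, as described above.

```python
# Bytes per character assumed for this sketch.
BYTES_PER_CHAR = {"varchar": 1, "nvarchar": 2}
# Length overhead quoted from the documentation above.
LENGTH_OVERHEAD_BYTES = 2


def sp_help_length(type_name: str, declared_chars: int) -> int:
    """Length in bytes, i.e. the number sp_help shows for the column."""
    return declared_chars * BYTES_PER_CHAR[type_name]


def max_storage_bytes(type_name: str, declared_chars: int) -> int:
    """Bytes needed for a completely filled value, including the overhead."""
    return sp_help_length(type_name, declared_chars) + LENGTH_OVERHEAD_BYTES


print(sp_help_length("nvarchar", 100), max_storage_bytes("nvarchar", 100))  # 200 202
print(sp_help_length("varchar", 100), max_storage_bytes("varchar", 100))    # 100 102
```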

What data type to be used for description of an object?

A table in SQL Server has a description field. If I use varchar, its maximum limit is 8000 characters, but a description can be larger than that. I am using this field for job descriptions, which are normally lengthy.
A varchar(max) has no (effective) limit. Use that.
Use VARCHAR(MAX). It can store more than 8k chars; in fact it can store up to 2 GB.
varchar [ ( n | max ) ]
Variable-length, non-Unicode string data. n defines the string length and can be a value from 1 through 8,000. MAX indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size is the actual length of the data entered + 2 bytes. The ISO synonyms for varchar are char varying or character varying.
Read more at http://msdn.microsoft.com/en-us/library/ms176089.aspx
Please note that the TEXT and NTEXT types can store nearly the same amount of data, but they are deprecated and support fewer operations. Use VARCHAR(MAX) and NVARCHAR(MAX) instead of TEXT and NTEXT.
You can use the varchar(max) or text data type.

What does the specified number mean in a VARCHAR() clause?

Just to clarify: does specifying something like VARCHAR(45) mean it can take up to 45 characters? I remember hearing from someone a few years ago that the number in the parentheses doesn't refer to the number of characters; the person then tried to explain something quite complicated that I didn't understand and have already forgotten.
And what is the difference between CHAR and VARCHAR? I searched around a bit and saw that CHAR gives you the max of the size of the column, and that it is better to use it if your data has a fixed size and to use VARCHAR if your data size varies.
But if it gives you the max size of all the data in the column, isn't it better to use it when your data size varies? Especially if you don't know how big your data is going to be. VARCHAR needs you to specify the size (CHAR doesn't really need it, right?), so isn't it more troublesome?
You also have to specify the size with CHAR. With CHAR, column values are padded with spaces to fill the size you specified, whereas with VARCHAR, only the actual value you specified is stored.
For example:
CREATE TABLE test (
    char_value CHAR(10),
    varchar_value VARCHAR(10)
);
INSERT INTO test VALUES ('a', 'b');
SELECT * FROM test;
The above will select "a" padded with trailing spaces to 10 characters for char_value, and "b" for varchar_value.
If all your values are about the same size, CHAR is possibly a better choice because it will often require less storage space than VARCHAR. This is because VARCHAR stores both the length of the value and the value itself, whereas CHAR can just store the (fixed-size) value.
The MySQL documentation gives a good explanation of the storage requirements of the various data types.
In particular, for a string of length L, a CHAR(M) column will take up (M x c) bytes, where c is the number of bytes required to store a character (this depends on the character set in use).
A VARCHAR(M) will take up (L + 1) or (L + 2) bytes, depending on whether M is <= 255 or > 255.
So it really depends on how long you expect your strings to be and how much their length will vary.
NB: The documentation doesn't discuss the impact of character sets on the storage requirements of a VARCHAR type. I've tried to quote it accurately, but my guess is that you would need to multiply the string length by the character byte-width as well to get the storage requirement.
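A quick Python sketch of these formulas, to compare the two types side by side. The function names are made up, lengths are taken in bytes, and the 1-versus-2-byte prefix is decided by the largest byte length a value may need, which is the wording the MySQL manual uses.

```python
def char_storage_bytes(m: int, bytes_per_char: int = 1) -> int:
    """CHAR(M): always M x c bytes, whatever the stored value is."""
    return m * bytes_per_char


def varchar_storage_bytes(value_bytes: int, max_value_bytes: int) -> int:
    """VARCHAR: the value's bytes plus a 1- or 2-byte length prefix.

    `max_value_bytes` is the largest byte length a value in the column
    may require; up to 255 needs a 1-byte prefix, more needs 2 bytes.
    """
    prefix = 1 if max_value_bytes <= 255 else 2
    return value_bytes + prefix


# A 2-byte value in a single-byte character set:
print(char_storage_bytes(10))         # 10  (CHAR(10))
print(varchar_storage_bytes(2, 10))   # 3   (VARCHAR(10))
print(varchar_storage_bytes(2, 300))  # 4   (VARCHAR(300))
# A value that fills the column completely:
print(varchar_storage_bytes(10, 10))  # 11  (one byte more than CHAR(10))
```

So CHAR only wins when the values are consistently close to the declared length.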
The complicated stuff you don't remember is that the 45 can refer to bytes, not chars. The two are not the same if you are using a multibyte character encoding. In Oracle you can specify bytes or chars explicitly:
varchar2(45 BYTE)
or
varchar2(45 CHAR)
See Difference between BYTE and CHAR in column datatypes
The choice between CHAR and VARCHAR actually becomes irrelevant if you have even one variable-length field in your table, like a VARCHAR or TEXT: MySQL will automatically change all CHAR columns to VARCHAR.
Fixed-length records can give you extra performance, but then you can't use any variable-length field types. The reason is that it is quicker and easier for MySQL to find the next record.
For example, if you do a SELECT * FROM table LIMIT 10, MySQL has to scan the table file for the tenth record. This means finding the end of each record until you find the end of the 10th record. But if your table has fixed-length records, MySQL just needs to know the record size and then skip 10 x the record size in bytes.
If you know a column will contain a small, fixed number of chars, use CHAR; otherwise use VARCHAR. A CHAR column is padded to the max length.
VARCHAR has a small overhead (4-8 bytes depending on RDBMS), but only uses the overhead + the actual number of chars stored.
For values you know are going to be of constant length, for example phone numbers, zip codes etc., CHAR is the optimal choice.

Which data type saves more space TINYTEXT or VARCHAR for variable data length in MySQL?

I need to store data in MySQL. Its length is not fixed; it could be 255 or 2 characters long. Should I use TINYTEXT or VARCHAR in order to save space (speed is irrelevant)?
When using VARCHAR, you need to specify the maximum number of characters that will be stored in that column. So, if you declare a column to be VARCHAR(255), it means that you can store text of up to 255 characters. The important thing here is that if you insert two characters, only those two characters will be stored, i.e. the allocated space will be for 2 characters, not 255.
TINYTEXT is one of the four TEXT types. They are very similar to VARCHAR, but there are a few differences (these depend on the MySQL version you are using). For version 5.5, there are some limitations when it comes to TEXT types. The first is that you have to specify an index prefix length for indexes on TEXT. The other is that TEXT columns can't have default values.
In general, TEXT should be used for (extremely) long values. If you will be storing strings of up to 255 characters, then you should use VARCHAR.
Hope that this helps.
As for data storage space, VARCHAR(255) and TINYTEXT are equivalent:
VARCHAR(M): L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes.
TINYTEXT: L + 1 bytes, where L < 2^8.
Source: MySQL Reference Manual: Data Storage Requirements.
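For what it's worth, the two formulas can be compared with a short Python sketch. The function names are invented, L is taken in bytes, and the `bytes_per_char` parameter is an assumption added to show where the one-byte prefix of VARCHAR(255) stops applying.

```python
TINYTEXT_MAX_BYTES = 2**8 - 1  # L < 2^8


def tinytext_bytes(value_bytes: int) -> int:
    """TINYTEXT: L + 1 bytes."""
    if value_bytes > TINYTEXT_MAX_BYTES:
        raise ValueError("value does not fit in TINYTEXT")
    return value_bytes + 1


def varchar255_bytes(value_bytes: int, bytes_per_char: int = 1) -> int:
    """VARCHAR(255): L + 1 bytes if values need at most 255 bytes, else L + 2."""
    max_bytes = 255 * bytes_per_char
    if value_bytes > max_bytes:
        raise ValueError("value does not fit in VARCHAR(255)")
    prefix = 1 if max_bytes <= 255 else 2
    return value_bytes + prefix


# Single-byte character set: identical storage for short and long values.
for length in (2, 255):
    print(length, tinytext_bytes(length), varchar255_bytes(length))
# 2 3 3
# 255 256 256
```

With a multibyte character set (say `bytes_per_char=4`) the VARCHAR(255) column may need more than 255 bytes, so under the formula above it pays a second prefix byte.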
Storage space being equal, you may want to check out the following Stack Overflow posts for further reading on when you should use one or the other:
What’s the difference between VARCHAR(255) and TINYTEXT string types in MySQL?
varchar(255) v tinyblob v tinytext