In SQL, how does a fixed-length data type take up space in memory? - sql

I want to know how a fixed-length data type takes up space in memory in SQL. As far as I know, for varchar, if we specify a length of 20 and the user enters 15 characters, it takes 20 by padding with spaces. For varchar2, if we specify a length of 20 and the user enters 15 characters, it takes only 15 in memory. So how much space does a fixed-length data type take? I searched Google but did not find an explanation with an example. Please explain with an example. Thanks in advance.

A fixed length data field always consumes its full size.
In the old days (FORTRAN), it was padded at the end with space characters. Modern databases might do that too, but either trim the trailing blanks implicitly or leave it to the query to do so explicitly.
Variable-length fields are relative newcomers to databases; they probably became widespread in the 1970s or 1980s.
It is considerably easier to manage fixed-length record offsets and sizes than to compute the offset of each data item in a record that has variable-length fields. Furthermore, a fixed-length data record is easily addressed in a data file: the byte offset of its beginning is the record size multiplied by the record number (plus the length of whatever fixed header data is at the beginning of the file).
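The offset arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not how any particular database lays out its files: the 16-byte header, the single 20-byte field per record, and all function names are assumptions made up for the example.

```python
import io

HEADER_SIZE = 16          # hypothetical fixed file header, in bytes
FIELD_SIZE = 20           # a CHAR(20)-style field
RECORD_SIZE = FIELD_SIZE  # one field per record in this sketch


def pack_record(value: str) -> bytes:
    """Pad the value with trailing spaces to exactly FIELD_SIZE bytes."""
    raw = value.encode("ascii")
    if len(raw) > FIELD_SIZE:
        raise ValueError("value does not fit in the fixed-length field")
    return raw.ljust(FIELD_SIZE, b" ")


def record_offset(record_number: int) -> int:
    """Header size plus record size times the zero-based record number."""
    return HEADER_SIZE + RECORD_SIZE * record_number


def read_record(f, record_number: int) -> str:
    """Seek straight to the record and trim the trailing blanks."""
    f.seek(record_offset(record_number))
    return f.read(RECORD_SIZE).decode("ascii").rstrip(" ")


def build_file(values) -> io.BytesIO:
    """Write a header followed by one fixed-length record per value."""
    f = io.BytesIO()
    f.write(b"\x00" * HEADER_SIZE)
    for v in values:
        f.write(pack_record(v))
    return f


data_file = build_file(["alice", "bob", "carol"])
print(len(pack_record("bob")))     # every record occupies the full field size
print(read_record(data_file, 1))   # direct access, no scan of earlier records
```

Note that a 3-character value still occupies the full 20 bytes, which is exactly the cost the question asks about.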

Related

Should I define a column type from actual length or nth power of 2 (SQL Server)?

Should I define a column type from the actual length or round up to the nth power of 2?
In the first case, I have a table column that stores no more than 7 characters.
Should I use NVARCHAR(8), since there may be an implicit conversion inside SQL
Server that allocates 8 spaces and truncates automatically (I heard this somewhere)?
If not, which should it be, NCHAR(7) or NCHAR(8) (assuming the fixed length is 7)?
Is there any performance difference between these two cases?
You should use the actual length of the string. Now, if you know that the value will always be exactly 7 characters, then use CHAR(7) rather than VARCHAR(7).
The reason you see powers-of-2 is for columns that have an indeterminate length -- a name or description that may not be fixed. In most databases, you need to put in some maximum length for the varchar(). For historical reasons, powers-of-2 get used for such things, because of the binary nature of the underlying CPUs.
Although I almost always use powers of 2 in these situations, I can think of no real performance difference, with one exception: in some databases the actual length of a varchar(255) is stored using 1 byte, whereas a varchar(256) uses 2 bytes. That is a pretty minor difference, even when multiplied over millions of rows.

Strings or integers for Steam IDs

What is the preferred data type for storing Steam IDs? These IDs are very similar to credit card numbers, but the use cases are different. Until now I have been using an unsigned big integer, but I'm not 100% sure yet. If an ID starts with a zero, can that cause issues? Example ID: 76561197960287930
In general, numbers take less space on disk and in transfer from the database to the application than strings do. For the same reason they are faster to compare, e.g. in the WHERE clause of a query.
Have a look here for the bytes needed to store numbers and bytes to store strings.
In the database, numbers are stored without leading zeros. If the numbers always have a fixed size, you could pad them with leading zeros in your application after loading them from the database.
But if the numbers can have leading zeros, strings are easier to handle, because you do not have to implement additional logic for edge cases like leading zeros.
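As a rough illustration of the size difference, the Python sketch below packs the example ID from the question as an unsigned 64-bit integer and compares it with the same digits stored as text. The helper `restore_leading_zeros` is a hypothetical name for the application-side padding mentioned above; exact storage sizes in a real database depend on the engine and column type.

```python
import struct

steam_id = 76561197960287930  # example ID from the question

# An unsigned 64-bit integer always packs into 8 bytes.
as_integer = struct.pack("<Q", steam_id)

# The same ID stored as ASCII text needs one byte per digit.
as_text = str(steam_id).encode("ascii")

print(len(as_integer), len(as_text))


def restore_leading_zeros(number: int, width: int) -> str:
    """Re-add leading zeros in the application for fixed-width IDs."""
    return str(number).zfill(width)


print(restore_leading_zeros(42, 6))
```

The integer form is less than half the size here, and the padding helper only works if every ID has the same known width.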

Memory reserved according to the defined field size or just the size of the data inside?

In HANA, there's a column of type NVARCHAR(4000) holding the value ThisISaString. Is the RAM being used 4000 or 13?
If it reserves 4000, then this space could really add up when you have a lot of records.
I am trying to decide how big I should make my text fields.
As I read your question in its current form, it asks how SAP HANA handles variable-length strings when presenting them to the client (I take that from your intention to reserve a buffer).
Thus, I'm not going to discuss what happens inside HANA when you enter a value into a table; this is rather complex and depends on the table type used (column, row, external, temporary...).
So, for the client application, a (N)VARCHAR will result in a string with the length of the stored value, i.e. no padding (with spaces at the end) will happen.

Is varchar(128) better than varchar(100)

Quick question. Does it matter for storing data whether I use decimal field limits or power-of-two ones (say 16, 32, 64 instead of 10, 20, 50)?
I ask because I wonder whether this has anything to do with clusters on the HDD.
Thanks!
VARCHAR(128) is better than VARCHAR(100) if you need to store strings longer than 100 bytes.
Otherwise, there is very little to choose between them; you should choose the one that better fits the maximum length of the data you might need to store. You won't be able to measure the performance difference between them. All else apart, the DBMS probably only stores the data you send, so if your average string is, say, 16 bytes, it will only use 16 (or, more likely, 17 - allowing 1 byte for storing the length) bytes on disk. The bigger size might affect the calculation of how many rows can fit on a page - detrimentally. So choosing the smallest size that is adequate makes sense - waste not, want not.
So, in summary, there is precious little difference between the two in terms of performance or disk usage, and aligning to convenient binary boundaries doesn't really make a difference.
If it were a C program, I'd spend some time thinking about that, too. But with a database I'd leave it to the DB engine.
DB programmers have spent a lot of time thinking about the best memory layout, so just tell the database what you need and it will store the data in the way that suits the DB engine best (usually).
If you want to align your data, you'll need exact knowledge of the internal data organization: How is the string stored? Are one, two or four bytes used to store the length? Is it stored as a plain byte sequence or encoded in UTF-8, UTF-16 or UTF-32? Does the DB need extra bytes to identify NULL or > MAXINT values? Maybe the string is stored as a NUL-terminated byte sequence, in which case one more byte is needed internally.
Also, with VARCHAR it is not necessarily true that the DB will always allocate 100 (128) bytes for your string. Maybe it stores just a pointer to where the space for the actual data is.
So I'd strongly suggest using VARCHAR(100) if that is your requirement. If the DB decides to align it somehow, there's room for extra internal data, too.
Other way around: Let's assume you use VARCHAR(128) and all things come together: The DB allocates 128 bytes for your data. Additionally it needs 2 bytes more to store the actual string length - makes 130 bytes - and then it could be that the DB aligns the data to the next (let's say 32 byte) boundary: The actual data needed on the disk is now 160 bytes 8-}
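The arithmetic in that hypothetical scenario can be checked with a short Python sketch. The 2-byte length prefix and the 32-byte alignment are the assumptions from the paragraph above, not properties of any specific engine, and `aligned_size` is an illustrative name.

```python
def aligned_size(payload_bytes: int, length_prefix_bytes: int, alignment: int) -> int:
    """Round payload plus length prefix up to the next alignment boundary."""
    raw = payload_bytes + length_prefix_bytes
    # Ceiling division without floats, then scale back up to bytes.
    return -(-raw // alignment) * alignment


# VARCHAR(128) fully allocated: 128 + 2 = 130, rounded up to 160.
print(aligned_size(128, 2, 32))

# VARCHAR(100) under the same assumptions: 100 + 2 = 102, rounded up to 128.
print(aligned_size(100, 2, 32))
```

Under these assumptions the "round" binary size is the one that spills over a boundary, while the decimal size leaves room for the engine's own bookkeeping.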
Yes, but it's not that simple. Sometimes 128 can be better than 100, and sometimes it's the other way around.
So what is going on? varchar only allocates space as necessary so if you store hello world in a varchar(100) it will take exactly the same amount of space as in a varchar(128).
The question is: If you fill up the rows, will you hit a "block" limit/boundary or not?
Databases store their data in blocks. These have a fixed size, for example 512 bytes (this value can be configured for some databases). So the question is: How many blocks does the DB have to read to fetch each row? Rows that span several blocks will need more I/O, so this will slow you down.
But again: This doesn't depend on the theoretical maximum size of the columns but on a) how many columns you have (each column needs a little bit of space even when it's empty or null), b) how many fixed width columns you have (number/decimal, char), and finally c) how much data you have in variable columns.
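To make the three factors concrete, here is a simplified Python sketch. It assumes one byte of overhead per column and ignores block headers and other engine-specific details, so the numbers are illustrative only; `row_size` and `blocks_needed` are made-up names.

```python
import math


def row_size(fixed_widths, variable_values, per_column_overhead=1):
    """Estimate row bytes: fixed-width columns, actual variable data, per-column overhead."""
    columns = len(fixed_widths) + len(variable_values)
    variable_bytes = sum(len(v.encode("utf-8")) for v in variable_values)
    return sum(fixed_widths) + variable_bytes + columns * per_column_overhead


def blocks_needed(row_bytes, block_bytes=512):
    """Number of blocks that must be read to fetch one row of this size."""
    return max(1, math.ceil(row_bytes / block_bytes))


# Two fixed-width columns (8 and 4 bytes) plus one variable-length string.
size = row_size([8, 4], ["hello world"])
print(size, blocks_needed(size))
```

The declared maximum of the variable column (100 or 128) never appears in the calculation; only the data actually stored does.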

[My]SQL VARCHAR Size and Null-Termination

Disclaimer: I'm very new to SQL and databases in general.
I need to create a field that will store a maximum of 32 characters of text data. Does "VARCHAR(32)" mean that I have exactly 32 characters for my data? Do I need to reserve an extra character for null-termination?
I conducted a simple test and it seems that this is a WYSIWYG buffer. However, I wanted to get a concrete answer from people who actually know what they're doing.
I have a C[++] background, so this question is raising alarm bells in my head.
Yes, you have 32 characters at your disposal. SQL does not concern itself with NUL-terminated strings the way some programming languages do.
Your VARCHAR size specification is the maximum size of your data, so in this case, 32 characters. However, VARCHAR is a dynamic field, so the actual physical storage used is only the size of your data, plus one or two bytes.
If you put a 10-character string into a VARCHAR(32), the physical storage will be 11 or 12 bytes (the manual will tell you the exact formula).
However, when MySQL is dealing with result sets (i.e. after a SELECT), 32 bytes will be allocated in memory for that field for every record.
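The "11 or 12 bytes" on-disk figure can be sketched as follows. This assumes the rule, as I understand it from the MySQL manual, that a VARCHAR column uses a 1-byte length prefix when its values can never need more than 255 bytes and a 2-byte prefix otherwise. The function name and parameters are illustrative, and the result is an estimate rather than an exact account of any storage engine.

```python
def varchar_storage_bytes(value, declared_chars, encoding="latin-1", max_bytes_per_char=1):
    """Estimate on-disk bytes for one VARCHAR value: data bytes plus length prefix."""
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared VARCHAR size")
    # The prefix size depends on the largest value the column could ever hold.
    max_column_bytes = declared_chars * max_bytes_per_char
    prefix = 1 if max_column_bytes <= 255 else 2
    return len(value.encode(encoding)) + prefix


ten_chars = "0123456789"
print(varchar_storage_bytes(ten_chars, 32))    # small column: 1-byte prefix
print(varchar_storage_bytes(ten_chars, 256))   # larger column: 2-byte prefix
```

No byte is reserved for a terminator: the length prefix plays the role that the trailing NUL plays in a C string.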