Assigning an empty binary value to a varbinary(MAX) column creates a value 8000 bytes long - SQL

I have a table with a varbinary(max) column. I am trying to assign a zero-length binary buffer to that column, but instead of getting a zero-length value in the table, I am getting an 8000-byte value filled with zeros:
* the dataSize column in the shown query was added using DATALENGTH(data) ("SELECT _index, dataSize=DATALENGTH(data), data FROM....") and shows the actual size of the value in the table
Where does the 8000-byte empty buffer come from? Is this some kind of default behavior?

If your source column is binary(8000), then DATALENGTH(data) will return 8000 (it is fully padded) and data will contain the full 8000 bytes.
But since you are using
SELECT _index, dataSize=DATALENGTH(data), data FROM
It cannot be a binary(8000) column, because a fixed-size column would report the same DATALENGTH for every row. It is likely the data was copied there from a BINARY(8000) variable, or by some similar means, at some point in the past.
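A minimal T-SQL sketch of both behaviors (table and variable names here are hypothetical): an empty varbinary literal stores 0 bytes, while a fixed-length BINARY(8000) variable is zero-padded and stores the full 8000:

CREATE TABLE dbo.BufferDemo (_index int, data varbinary(max));

DECLARE @empty varbinary(max) = 0x;   -- variable-length: stays 0 bytes
DECLARE @padded binary(8000) = 0x;    -- fixed-length: zero-padded to 8000 bytes

INSERT INTO dbo.BufferDemo VALUES (1, @empty), (2, @padded);

SELECT _index, dataSize = DATALENGTH(data) FROM dbo.BufferDemo;
-- _index 1 -> dataSize 0, _index 2 -> dataSize 8000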

Related

database padding space if the value inserted has smaller length than column size- DB2

I was checking whether the DB pads a column with spaces if the inserted string has fewer characters than the designated length of the column. Example:
let's say the size of <column1> is 10 but the value entered is abc - is it then abc_______ that the DB stores, where _ represents a space?
I am asking because I used LTRIM-RTRIM while INSERTing the values, and on fetching the value again the very next minute I got the result abc_______.
You are using the CHAR (or CHARACTER) datatype for the column. CHAR is a fixed-length datatype, and values are padded with spaces at the end to fill the column size.
You can use VARCHAR to avoid the padding with spaces at the end of the values.
Note: Make sure you are using CHARACTER_LENGTH on CHARACTER columns to get the correct character length (without padding spaces). The result of LENGTH also includes the padding spaces.
demo on dbfiddle.uk
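A minimal sketch of the difference (generic SQL with illustrative names; length-function names vary slightly between DBMSs):

CREATE TABLE pad_demo (
    fixed_col CHAR(10),     -- fixed length: padded with trailing spaces
    var_col   VARCHAR(10)   -- variable length: stored as entered
);

INSERT INTO pad_demo VALUES ('abc', 'abc');

SELECT LENGTH(fixed_col) AS fixed_len,   -- 10: the padding is counted
       LENGTH(var_col)   AS var_len      -- 3: no padding
FROM pad_demo;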

need help for sql server

When I save a byte[] in SQL Server, the value changes and a 0x0 is added at the start of the value.
The correct value: 0xFFD8FFE000104A46
The incorrect value: 0x0FFD8FFE000104A46494600010102004C
0xF and 0x0F are the same number; both are hexadecimal notations of the decimal number 15. A byte contains two hexadecimal 'digits'. If the left-most digit is 0, it doesn't affect the value, just as 015 is the same as 15. The notation with the leading 0 just prints all the bytes; the one without strips the leading zeros.
Where the 494600010102004C part is coming from I don't know.
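That much is easy to verify with a quick T-SQL check: a leading zero changes only the printed form, not the value:

SELECT CAST(0x0F AS int)   AS two_digits,    -- 15
       CAST(0x000F AS int) AS four_digits;   -- still 15: leading zeros add nothing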

MS Access 2010 SQL query is rounding automatically to whole numbers

All,
I'm running the SQL query below in MS Access 2010. Everything works fine except that the "a.trans_amt" column is rounded to a whole number (i.e. the query returns 12.00 instead of 12.15, or 96.00 instead of 96.30). Any ideas? I'd like it to display 2 decimal places. I tried using the ROUND function but didn't have any success.
Thanks!
INSERT INTO [2-Matched Activity] ( dbs_eff_date, batch_id_r1, jrnl_name,
ledger, entity_id_s1, account_s2, intercompany_s6, trans_amt,
dbs_description, icb_name, fdt_key, combo )
SELECT a.dbs_eff_date,
a.batch_id_r1,
a.jrnl_name,
a.ledger,
a.entity_id_s1,
a.account_s2,
a.intercompany_s6,
a.trans_amt,
a.dbs_description,
a.icb_name,
a.fdt_key,
a.combo
FROM [1-ICB Daily Activity] AS a
INNER JOIN
(
SELECT
b.dbs_eff_date,
b.batch_id_r1,
b.jrnl_name,
sum(b.trans_amt) AS ["trans_amt"],
b.icb_name
FROM [1-ICB Daily Activity] AS b
GROUP BY dbs_eff_date, batch_id_r1, jrnl_name, icb_name
HAVING sum(trans_amt) = 0
) AS b
ON (a.dbs_eff_date = b.dbs_eff_date) AND (a.batch_id_r1 = b.batch_id_r1) AND
(a.jrnl_name = b.jrnl_name) AND (a.icb_name = b.icb_name);
Essentially, you are attempting to append decimal-precise values to an integer column. While MS Access does not raise a type exception, it implicitly reduces the precision to fit the destination storage. To avoid these undesired results, define the column with the appropriate type ahead of time.
According to MSDN docs, the MS Access database engine maintains the following numeric types:
| Type     | Storage  | Description                                                  |
|----------|----------|--------------------------------------------------------------|
| REAL     | 4 bytes  | A single-precision floating-point value with a range of ... |
| FLOAT    | 8 bytes  | A double-precision floating-point value with a range of ... |
| SMALLINT | 2 bytes  | A short integer between -32,768 and 32,767.                  |
| INTEGER  | 4 bytes  | A long integer between -2,147,483,648 and 2,147,483,647.     |
| DECIMAL  | 17 bytes | An exact numeric data type that holds values ...             |
The MS Access GUI exposes these as Field Sizes in the table design interface, where the default Field Size for a Number field is Long Integer.
Byte — For integers that range from 0 to 255. Storage requirement is a single byte.
Integer — For integers that range from -32,768 to +32,767. Storage requirement is two bytes.
Long Integer — For integers that range from -2,147,483,648 to +2,147,483,647 ...
Single — For numeric floating point values that range from -3.4 x 10^38 to ...
Double — For numeric floating point values that range from -1.797 x 10^308 to ...
Replication ID — For storing a GUID that is required for replication...
Decimal — For numeric values that range from -9.999... x 10^27 to +9.999...
Therefore, in designing your database, schema, and tables, select the appropriate values to accommodate your needed precision. If not using the MS Access GUI program, you can define the type in a DDL command:
CREATE TABLE [2-Matched Activity] (
...
trans_amt DOUBLE,
...
)
If the table already exists, consider altering the design with another DDL command:
ALTER TABLE [2-Matched Activity] ALTER COLUMN trans_amt DOUBLE
Do note: if you run CREATE and ALTER commands in the Query Design window, no prompts or confirmations will appear, but the changes will take effect.
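The rounding itself is easy to reproduce; here is a sketch in Access SQL using a hypothetical table (shown bare, since Access SQL has no comment syntax):

CREATE TABLE RoundingDemo (trans_amt INTEGER);
INSERT INTO RoundingDemo (trans_amt) VALUES (12.15);
SELECT trans_amt FROM RoundingDemo;

The SELECT returns 12, because INTEGER in Access DDL is the 4-byte Long Integer. After ALTER TABLE RoundingDemo ALTER COLUMN trans_amt DOUBLE, the same INSERT stores 12.15 intact.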

nvarchar(4001)?

MSDN has this to say on the subject:
nvarchar [ ( n | max ) ]
Variable-length Unicode character data. n can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size, in bytes, is two times the number of characters entered + 2 bytes. The data entered can be 0 characters in length. The ISO synonyms for nvarchar are national char varying and national character varying.
This leaves me confused. I can define a column as being 1 - 4,000 long, or 2,147,483,647 long, but nothing in between? Is my understanding correct? Why can't I be explicit about values in between?
NVARCHAR(MAX) covers everything else (not just 2 billion characters). If you need more than 4,000 characters the data is most certainly going to be off-page, so as far as behavior is concerned it doesn't matter if you've used 4,001 characters, 10,000 characters, or 10,000,000 characters. It only occupies the space you need, so don't think that you are wasting (2 billion characters - the length of your actual string).
MAX will accept values between 4,001 and 1,073,741,823 characters (bear in mind the storage size is approximately 2x the length of the actual string).
The restriction is basically that anything over 4000 characters must be a MAX.
Because 4,000 characters or less has one behavior in terms of storage, and MAX has another. And you really don't want to start forcing string-length calculations on things that are 1M characters long, do you? My current understanding is that up to 4,000 characters is stored in-table and MAX is stored out-of-table.
Also, NVARCHAR(MAX) and VARCHAR(MAX) are the replacements for text and ntext.
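A small T-SQL check of that behavior (the scratch table name is hypothetical):

CREATE TABLE dbo.NvarcharDemo (
    short_text nvarchar(4000),   -- stored in-row when it fits
    long_text  nvarchar(max)     -- up to 2^31-1 bytes; pushed off-page when large
);

INSERT INTO dbo.NvarcharDemo
VALUES (N'abc', REPLICATE(CAST(N'x' AS nvarchar(max)), 10000));

SELECT DATALENGTH(short_text) AS short_bytes,  -- 6: only the space actually used
       DATALENGTH(long_text)  AS long_bytes    -- 20000
FROM dbo.NvarcharDemo;

The CAST inside REPLICATE matters: without it, REPLICATE's result is capped at 8,000 bytes (4,000 characters).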

Why & When should I use SPARSE COLUMN? (SQL SERVER 2008)

After going through some tutorials on SQL Server 2008's new "SPARSE COLUMN" feature, I have found that it doesn't take any space if the column value is 0 or NULL, but that when there is a value it takes 4 times the space a regular (non-sparse) column holds.
If my understanding is correct, why would I go for that at database design time?
And if I do use it, in what situations will it benefit me?
Also, out of curiosity: how is it that no space gets reserved when a column is defined as a sparse column (I mean to say, what is the internal implementation for that)?
A sparse column doesn't use 4x the amount of space to store a value, it uses a (fixed) 4 extra bytes per non-null value. (As you've already stated, a NULL takes 0 space.)
So a non-null value stored in a bit column would be 1 bit + 4 bytes = 4.125 bytes. But if 99% of these are NULL, it is still a net savings.
A non-null value stored in a GUID (UniqueIdentifier) column is 16 bytes + 4 bytes = 20 bytes. So if only 50% of these are NULL, that's still a net savings.
So the "expected savings" depends strongly on what kind of column we're talking about, and your estimate of what ratio will be null vs non-null. Variable width columns (varchars) are probably a little more difficult to predict accurately.
This Books Online Page has a table showing what percentage of different data types would need to be null for you to end up with a benefit.
So when should you use a Sparse Column? When you expect a significant percentage of the rows to have a NULL value. Some examples that come to mind:
A "Order Return Date" column in an order table. You would hope that a very small percent of sales would result in returned products.
A "4th Address" line in an Address table. Most mailing addresses, even if you need a Department name and a "Care Of" probably don't need 4 separate lines.
A "Suffix" column in a customer table. A fairly low percent of people have a "Jr." or "III" or "Esquire" after their name.
Storing a null in a sparse column takes up no space at all.
To any external application, the column behaves the same.
Sparse columns work really well with filtered indexes, since you will typically only want to index the rows where the sparse attribute is non-empty.
You can create a column set over the sparse columns that returns an XML clip of all of the non-null data from the columns covered by the set. The column set behaves like a column itself. Note: you can only have one column set per table (see the sketch after this list).
Change Data Capture and Transactional replication both work, but not the column sets feature.
Downsides
If a sparse column has data in it, it takes 4 more bytes than a normal column, e.g. even a bit (normally 0.125 bytes) becomes 4.125 bytes, and a uniqueidentifier rises from 16 bytes to 20 bytes.
Not all data types can be sparse: text, ntext, image, timestamp, user-defined data types, geometry, geography, or varbinary(max) with the FILESTREAM attribute cannot be sparse.
Computed columns can't be sparse (although sparse columns can take part in the calculation of another computed column).
You can't apply rules or have default values.
Sparse columns cannot form part of a clustered index. If you need to do that, use a computed column based on the sparse column and create the clustered index on that (which sort of defeats the object).
Merge replication doesn't work.
Data compression doesn't work.
Access (read and write) to sparse columns is more expensive, but I haven't been able to find any exact figures on this.
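A minimal column-set sketch (hypothetical table and column names), as promised in the list above:

CREATE TABLE dbo.Products (
    product_id int PRIMARY KEY,
    weight_kg  decimal(9,3) SPARSE NULL,
    volume_l   decimal(9,3) SPARSE NULL,
    all_attrs  xml COLUMN_SET FOR ALL_SPARSE_COLUMNS   -- at most one per table
);

INSERT INTO dbo.Products (product_id, weight_kg) VALUES (1, 2.5);

-- all_attrs surfaces only the non-NULL sparse values as XML,
-- e.g. <weight_kg>2.500</weight_kg>
SELECT product_id, all_attrs FROM dbo.Products;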
You're reading it wrong - it never takes 4x the space.
Specifically, it says 4* (4 bytes, see the footnote), not 4x (multiply by 4). The only case where it's exactly 4x the space is char(4), which would still see savings if NULLs exist more than 64% of the time.
"*The length is equal to the average of the data that is contained in the type, plus 2 or 4 bytes."
| datetime NULL | datetime SPARSE NULL | datetime SPARSE NULL |
|--------------------|----------------------|----------------------|
| 20171213 (8 bytes) | 20171213 (12 bytes) | 20171213 (12 bytes) |
| NULL (8 bytes) | 20171213 (12 bytes) | 20171213 (12 bytes) |
| 20171213 (8 bytes) | NULL (0 bytes) | NULL (0 bytes) |
| NULL (8 bytes) | NULL (0 bytes) | NULL (0 bytes) |
You lose the 4 bytes not just once per row, but once for every cell in the row that is not null.
From SQL SERVER – 2008 – Introduction to SPARSE Columns – Part 2 by Pinal Dave:
All SPARSE columns are stored as one XML column in the database. Let us see some of the advantages and disadvantages of SPARSE columns.
Advantages of SPARSE columns are:
INSERT, UPDATE, and DELETE statements can reference sparse columns by name. A SPARSE column can also work as one XML column.
SPARSE columns can take advantage of filtered indexes, where data are filled in the row.
SPARSE columns save a lot of database space when there are zero or null values in the database.
Disadvantages of SPARSE columns are:
A SPARSE column cannot have the IDENTITY or ROWGUIDCOL property.
A SPARSE column cannot be applied to text, ntext, image, timestamp, geometry, geography, or user-defined datatypes.
A SPARSE column cannot have a default value or a rule, and cannot be a computed column.
A clustered index or a unique primary key index cannot be applied to a SPARSE column, and a SPARSE column cannot be part of a clustered index key.
A table containing a SPARSE column can have a maximum row size of 8018 bytes instead of the regular 8060 bytes. A table operation that involves a SPARSE column takes a performance hit over a regular column.