Why does an int column default to 0 when passed an empty string? - sql

I have a table with an integer column.
CREATE TABLE [dbo].[tble1](
[id] [int] NOT NULL,
[test] [nchar](10) NULL
)
When I try to insert some values and pass an empty string to the id column like below, it gets inserted and the value of the id column is 0 by default.
INSERT INTO [dbo].[tble1]
([id],[test])
VALUES
('','a')
I couldn't find any satisfactory reasoning behind it. Could someone please share your thoughts on this?

What is happening is that '' is being converted to an integer. The rules are that a string can be converted, based on the digit characters in the string.
If a string is empty, it gets converted to 0.
So, the conversion is happening at the very "top" level. The types don't match so SQL Server attempts an implicit conversion.
Unfortunately, the documentation is not really clear on the topic:
Character expressions that are being converted to an exact numeric
data type must consist of digits, a decimal point, and an optional
plus (+) or minus (-). Leading blanks are ignored. Comma separators,
such as the thousands separator in 123,456.00, are not allowed in the
string.
To be honest, I would interpret the "must consist of digits" as saying that there must be at least one digit (although technically in English "zero" is treated as a plural, I don't necessarily think of plurals as including zero elements). However, the empty string has been used -- pretty much for forever -- as a valid value for any type across a broad range of databases.
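If you want to see the documented rules in action, here is a quick check (TRY_CONVERT is available from SQL Server 2012 onward and returns NULL instead of raising an error when a conversion fails):
SELECT CONVERT(INT, '')        AS empty_string,   -- 0
       CONVERT(INT, '   ')     AS blanks_only,    -- 0
       CONVERT(INT, ' 42 ')    AS padded_digits,  -- 42, leading blanks ignored
       TRY_CONVERT(INT, 'abc') AS not_numeric     -- NULL; plain CONVERT would raise an error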

SQL Server will try to convert '' to an integer, and the conversion succeeds:
SELECT CONVERT(INT, '')
Output
0

You are getting the default value 0 because you have NOT NULL defined for the column; if the column allowed NULLs and you passed NULL, it would store NULL.
Also, if you want the value populated automatically, set up the column as an identity column.
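For completeness, a minimal sketch of the identity suggestion (the table name tble1_identity is made up for this example):
CREATE TABLE [dbo].[tble1_identity](
[id] [int] IDENTITY(1,1) NOT NULL,
[test] [nchar](10) NULL
)
-- id is generated automatically, so only test is supplied:
INSERT INTO [dbo].[tble1_identity] ([test]) VALUES ('a')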


How to get data from a SQL Server table if values contain some special character [duplicate]

I have seen the prefix N in some INSERT T-SQL queries. Many people use N before inserting a value into a table.
I searched, but I was not able to understand the purpose of including the N before inserting strings into a table.
INSERT INTO Personnel.Employees
VALUES(N'29730', N'Philippe', N'Horsford', 20.05, 1),
What purpose does this 'N' prefix serve, and when should it be used?
It's declaring the string as the nvarchar data type, rather than varchar.
You may have seen Transact-SQL code that passes strings around using
an N prefix. This denotes that the subsequent string is in Unicode
(the N actually stands for National language character set). Which
means that you are passing an NCHAR, NVARCHAR or NTEXT value, as
opposed to CHAR, VARCHAR or TEXT.
To quote from Microsoft:
Prefix Unicode character string constants with the letter N. Without
the N prefix, the string is converted to the default code page of the
database. This default code page may not recognize certain characters.
If you want to know the difference between these two data types, see this SO post:
What is the difference between varchar and nvarchar?
Let me tell you an annoying thing that happened with the N' prefix - I wasn't able to fix it for two days.
My database collation is SQL_Latin1_General_CP1_CI_AS.
It has a table with a column called MyCol1, which is an nvarchar.
This query fails to match the exact value that exists:
SELECT TOP 1 * FROM myTable1 WHERE MyCol1 = 'ESKİ'
-- 0 results
using prefix N'' fixes it
SELECT TOP 1 * FROM myTable1 WHERE MyCol1 = N'ESKİ'
-- 1 result - found!
Why? Because Latin1_General doesn't have the capital dotted İ, which is why it fails, I suppose.
1. Performance:
Assume your where clause is like this:
WHERE NAME='JON'
If the NAME column is of any type other than nvarchar or nchar, then you should not specify the N prefix. However, if the NAME column is of type nvarchar or nchar and you do not specify the N prefix, then 'JON' is treated as non-Unicode. The data types of the NAME column and the string 'JON' then differ, so SQL Server implicitly converts one operand's type to the other. If SQL Server converts the literal's type to the column's type there is no issue, but if it converts the column's type to the literal's type, performance suffers because the column's index (if available) won't be used.
2. Character set:
If the column is of type nvarchar or nchar, then always use the prefix N while specifying the character string in the WHERE criteria/UPDATE/INSERT clause. If you do not do this and one of the characters in your string is unicode (like international characters - example - ā) then it will fail or suffer data corruption.
We use the N'' prefix only when the target value is of an nvarchar (or nchar) type.
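A small demonstration of the character set point (a sketch; the exact mangled result depends on your database's default code page):
DECLARE @WithoutN varchar(20)  = 'ā',   -- no N prefix: the literal is varchar
        @WithN    nvarchar(20) = N'ā'   -- N prefix: the literal is nvarchar
SELECT @WithoutN AS without_N,  -- typically comes back as plain 'a' on a Latin1 code page
       @WithN    AS with_N      -- 'ā' is preserved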

MS SQL does not allow VARCHAR over 128 characters, why?

I have a table with a column configured to hold nvarchar data type.
I am trying to add a row using
INSERT INTO TABLE_NAME VALUES (value1, value2...)
SQL Server chokes on a 180-character string that I am trying to assign to the nvarchar column, returning:
Error: The identifier that starts with [part of string] is too long.
Maximum length is 128.
I don't understand why this is happening, since nvarchar(max) should hold up to 2 GB of storage, as I read here: What is the maximum characters for the NVARCHAR(MAX)?
Any ideas of what I've got wrong here?
UPDATE:
The table was created with this:
CREATE TABLE MED_DATA (
MED_DATA_ID INT
,ORDER_ID INT
,GUID NVARCHAR
,INPUT_TXT NVARCHAR
,STATUS_CDE CHAR
,CRTE_DTM DATETIME
,MOD_AT_DTM DATETIME
,CHG_IN_REC_IND CHAR
,PRIMARY KEY (MED_DATA_ID)
)
And my actual INSERT statement is as follows:
INSERT INTO MED_DATA
VALUES (
5
,12
,"8fd9924"
,"{'firstName':'Foo','lastName':'Bar','guid':'8fd9924','weightChanged':false,'gender':'Male','heightFeet':9,'heightInches':9,'weightPounds':999}"
,"PENDING"
,"2017-09-02 00:00:00.000"
,"2017-09-02 00:00:00.000"
,NULL
)
By default, double quotes in T-SQL do not delimit a string. They delimit an identifier. So you cannot use double quotes here. You could change the default but shouldn't.
If this is being directly written in a query window, use single quotes for strings and then double up quotes within the string to escape them:
INSERT INTO MED_DATA VALUES (5, 12, '8fd9924', '{''firstName'':''Foo'',''lastName'':''Bar'',''guid'':''8fd9924'',''weightChanged'':false,''gender'':''Male'',''heightFeet'':9,''heightInches'':9,''weightPounds'':999}', 'PENDING', '2017-09-02T00:00:00.000', '2017-09-02T00:00:00.000', NULL)
But if, instead, you're passing this string across from another program, it's time to learn how to use parameterized queries. That'll also allow you to pass the dates across as dates and not rely on string parsing to reconstruct them correctly.
Also, as noted, you need to fix your table definitions: they currently declare NVARCHAR with no length, which means the same as NVARCHAR(1).
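For reference, a corrected version of the DDL might look something like this; the lengths are only guesses for illustration, so size them to your actual data:
CREATE TABLE MED_DATA (
MED_DATA_ID INT NOT NULL
,ORDER_ID INT
,GUID NVARCHAR(50)        -- was NVARCHAR, i.e. NVARCHAR(1)
,INPUT_TXT NVARCHAR(MAX)  -- the JSON payload is far longer than one character
,STATUS_CDE CHAR(10)      -- was CHAR, i.e. CHAR(1)
,CRTE_DTM DATETIME
,MOD_AT_DTM DATETIME
,CHG_IN_REC_IND CHAR(1)
,PRIMARY KEY (MED_DATA_ID)
)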
Are you aware of what an identifier is? Here is a hint - it is a NAME. SQL Server is not complaining about your data, it is complaining about a field or table name. Somehow your SQL must be totally borked so that part of the text is parsed as the name of a field or table. And yes, those are limited to 128 characters.
This is clear in the error message:
Error: The identifier
It clearly states that this is an identifier issue.
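If you want to see the distinction for yourself, a quick sketch (assuming the default QUOTED_IDENTIFIER ON setting):
-- Double quotes name an identifier, so this looks for a column called PENDING and fails
-- with "Invalid column name 'PENDING'":
SELECT "PENDING"
GO
-- Single quotes delimit a character literal, so this simply returns the string:
SELECT 'PENDING'
GO
-- A double-quoted "string" longer than 128 characters is an identifier that breaks the
-- 128-character identifier limit, which is exactly the error message in the question.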

Alter column from varchar to decimal when nulls exist

How do I alter a SQL varchar column to a decimal column when there are nulls in the data?
I thought:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6)
But I just get an error, I assume because of the nulls:
Error converting data type varchar to numeric. The statement has been terminated.
So I thought to remove the nulls I could just set them to zero:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6) NOT NULL DEFAULT 0
but I don't seem to have the correct syntax.
What's the best way to convert this column?
edit
People have suggested it's not the nulls that are causing the problem, but non-numeric data. Is there an easy way to find the non-numeric data and either disregard it, or highlight it so I can correct it?
If it were just the presence of NULLs, I would just opt for doing this before the alter column:
update table1 set data = '0' where data is null
That would ensure all nulls are gone and you could successfully convert.
However, I wouldn't be too certain of your assumption. It seems to me that your new column is perfectly capable of handling NULL values since you haven't specified not null for it.
What I'd be looking for is values that aren't NULL but also aren't something you could turn in to a real numeric value, such as what you get if you do:
insert into table1 (data) values ('paxdiablo is good-looking')
though some may argue that should be treated as 0, a falsey value :-)
The presence of non-NULL, non-numeric data seems far more likely to be causing your specific issue here.
As to how to solve that, you're going to need a where clause that can recognise whether a varchar column is a valid numeric value and, if not, change it to '0' or NULL, depending on your needs.
I'm not sure if SQL Server has regex support but, if so, that'd be the first avenue I'd investigate.
Alternatively, provided you understand the limitations (a), you could use isnumeric() with something like:
update table1 set data = NULL where isnumeric(data) = 0
This will force all non-numeric values to NULL before you try to convert the column type.
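Another option, if you're on SQL Server 2012 or later, is TRY_CONVERT, which returns NULL whenever the conversion fails; a sketch against the question's table and column:
-- List the non-NULL values that would block the ALTER COLUMN:
SELECT data
FROM table1
WHERE data IS NOT NULL
  AND TRY_CONVERT(decimal(19,6), data) IS NULL
-- Or blank them out in one go, as with the isnumeric() approach above:
UPDATE table1
SET data = NULL
WHERE TRY_CONVERT(decimal(19,6), data) IS NULL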
And, please, for the love of whatever deities you believe in, back up your data before attempting any of these operations.
If none of the above solutions works, it may be worth adding a brand new column and populating it bit by bit. In other words, set it to NULL to start with, and then find a series of updates that will copy data into this new column.
Once you're happy that all the data has been copied, you will then have a series of updates you can run in a single transaction if you want to do the conversion in one fell swoop. Drop the new column you created while testing, then do the whole lot in a single operation:
create new column;
perform all updates to copy data;
drop old column;
rename new column to old name.
(a) From the linked page:
ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).
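A quick way to see that limitation for yourself:
-- ISNUMERIC returns 1 for all of these, so it is not a reliable "will this cast to decimal?" test:
SELECT ISNUMERIC('$') AS dollar,
       ISNUMERIC('+') AS plus_sign,
       ISNUMERIC('-') AS minus_sign,
       ISNUMERIC(',') AS comma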
Possible solution:
CREATE TABLE test
(
data VARCHAR(100)
)
GO
INSERT INTO test VALUES ('19.01');
INSERT INTO test VALUES ('23.41');
ALTER TABLE test ADD data_new decimal(19,6)
GO
UPDATE test SET data_new = CAST(data AS decimal(19,6));
ALTER TABLE test DROP COLUMN data
GO
EXEC sp_RENAME 'test.data_new' , 'data', 'COLUMN'
As people have said, that error doesn't come from nulls; it comes from varchar values that can't be converted to decimal. The most typical reason for this I've found (after checking that the column doesn't contain any obviously invalid values, like non-digit characters or double commas) is varchar values that use a comma as the decimal separator instead of a period.
For instance, if you run the following:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
SELECT @T, CAST(@T AS DEC(32,2))
You will get an error.
Instead:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
-- Let's change the comma to a period
SELECT @T = REPLACE(@T,',','.')
SELECT @T, CAST(@T AS DEC(32,2)) -- Now it works!
It should be easy enough to check whether your column has these cases and run the appropriate update before your ALTER COLUMN, if this is the cause.
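If that turns out to be the cause, a column-wide version of the same fix might look like this (a sketch using the question's table1.data):
UPDATE table1
SET data = REPLACE(data, ',', '.')
WHERE data LIKE '%,%'
  AND data NOT LIKE '%.%'  -- leave values that already contain a period alone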
You could also use a similar idea and do a regex search on the column for all values that don't match a digit / digit+'.'+digit pattern, but I suck with regex, so someone else can help with that. :)
Also, the American system uses thousands separators, so a number like '123100.5' would appear as '123,100.5'; in those cases you might want to just replace the commas with empty strings and try again.

Determining Nvarchar length

I've read all about varchar versus nvarchar. But I didn't see an answer to what I think is a simple question. How do you determine the length of your nvarchar column? For varchar it's very simple: my Description, for example, can have 100 characters, so I define varchar(100). Now I'm told we need to internationalize and support any language. Does this mean I need to change my Description column to nvarchar(200), i.e. simply double the length? (And I'm ignoring all the other issues that are involved with internationalization for the moment.)
Is it that simple?
Generally it is the same as for varchar, really. The number is still the maximum number of characters, not the data length.
nvarchar(100) allows 100 characters (which would potentially consume 200 bytes in SQL Server).
You might want to allow for the fact that different cultures may take more characters to express the same thing though.
An exception to this, however, is if you are using an SC collation (one which supports supplementary characters). In that case a single character can potentially take up to 4 bytes.
So worst case would be to double the character value declared.
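A quick way to convince yourself of the characters-versus-bytes distinction (LEN counts characters, DATALENGTH counts bytes):
DECLARE @Description nvarchar(100) = N'Описание'  -- 8 Cyrillic characters
SELECT LEN(@Description)        AS character_count,  -- 8
       DATALENGTH(@Description) AS byte_count        -- 16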
From the Microsoft web site:
A common misconception is to think that with NCHAR(n) and NVARCHAR(n), the n defines the number of characters. But in NCHAR(n) and NVARCHAR(n) the n defines the string length in byte-pairs (0-4,000). n never defines numbers of characters that can be stored. This is similar to the definition of CHAR(n) and VARCHAR(n).
The misconception happens because when using characters defined in the Unicode range 0-65,535, one character can be stored per each byte-pair. However, in higher Unicode ranges (65,536-1,114,111) one character may use two byte-pairs. For example, in a column defined as NCHAR(10), the Database Engine can store 10 characters that use one byte-pair (Unicode range 0-65,535), but less than 10 characters when using two byte-pairs (Unicode range 65,536-1,114,111). For more information about Unicode storage and character ranges, see
https://learn.microsoft.com/en-us/sql/t-sql/data-types/nchar-and-nvarchar-transact-sql?view=sql-server-ver15
@Musa Calgar - exactly right. That link has the information for the answer to this question.
But to make sure the question itself is clear, we are talking about the 'length' attribute we see when we look at the column definition for a given table, right? That is the storage allocated per column. On the other hand, if we want to know the number of characters for a given string in the table at a given moment you can:
"SELECT myColumn, LEN(myColumn) FROM myTable"
But if the storage length is desired, you can drag the table name into the query window using SSMS, highlight it, and use 'Alt-F1' to see the defined lengths of each column.
So as an example, I created a table like this, specifying collations (Latin1_General_100_CI_AS_SC allows for supplementary characters - that is, characters that take more than just 2 bytes):
CREATE TABLE [dbo].[TestTable1](
[col1] [varchar](10) COLLATE Latin1_General_100_CI_AS,
[col2] [nvarchar](10) COLLATE Latin1_General_100_CI_AS_SC,
[col3] [nvarchar](10) COLLATE Latin1_General_100_CI_AS
) ON [PRIMARY]
The lengths show up like this (Highlight in query window and Alt-F1):
Column_Name Type Length [...] Collation
col1 varchar 10 Latin1_General_100_CI_AS
col2 nvarchar 20 Latin1_General_100_CI_AS_SC
col3 nvarchar 20 Latin1_General_100_CI_AS
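(Those lengths come from sp_help, which is what the Alt-F1 shortcut runs by default in SSMS; you can also call it directly:)
EXEC sp_help 'dbo.TestTable1'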
If you insert ASCII characters into the varchar and nvarchar fields, it will allow you to put 10 characters into all of them. There will be an error if you try to put more than 10 characters into those fields:
"String or binary data would be truncated.
The statement has been terminated."
If you insert non-ASCII characters like 'ā' you can still put 10 of them into each one, but SQL Server will convert the values going into col1 to the closest known character that fits into 1-byte. In this case, 'ā' will be converted to 'a'.
However, if you insert characters that require 4 bytes to store, like for example, '𠜎', you will only be allowed to put FIVE of them into the varchar and nvarchar fields. Any more than that will result in the truncation error shown above. The varchar field will show question marks because it has no single-byte character that it can convert that input to.
So when you insert five of these '𠜎', do a select of that row using len(<colname>) and you will see this:
col1 len(col1) col2 len(col2) col3 len(col3)
?????????? 10 𠜎𠜎𠜎𠜎𠜎 5 𠜎𠜎𠜎𠜎𠜎 10
So the length of col2 shows 5 characters since supplemental characters were defined when the table was created (see above CREATE TABLE DDL statement). However, col3 did not have _SC for its collation, so it is showing length 10 for the five characters we inserted.
Note that col1 has ten question marks. If we had defined the col1 varchar using the _SC collation instead of the non-supplemental one, it would behave the same way.

(NOT) NULL for NVARCHAR columns

Allowing NULL values on a column is normally done to allow the absence of a value to be represented. When using NVARCHAR there is already a possibility to have an empty string, without setting the column to NULL. In most cases I cannot see a semantic difference between an NVARCHAR with an empty string and a NULL value for such a column.
Setting the column as NOT NULL saves me from having to deal with the possibility of NULL values in the code, and it feels better not to have two different representations of "no value" (NULL or an empty string).
Will I run into any other problems by setting my NVARCHAR columns to NOT NULL? Performance? Storage size? Anything I've overlooked on the usage of the values in the client code?
A NULL indicates that the value in the column is missing/inapplicable. A blank string is different because that is an actual value. A NULL technically has no data type, whereas a blank string, in this case, is nvarchar.
I can't see any issues with having default values rather than NULL values.
In fact, it would probably be beneficial, as you wouldn't have to worry about catering for NULL values in any of your queries,
e.g.
Select Sum(TotalPrice) as 'TotalPrice' From myTable Where CountOfItems > 10
much easier than
Select Sum(IsNull(TotalPrice,0)) as 'TotalPrice' From myTable Where IsNull(CountOfItems,0) > 10
You should use default constraints in your DDL to ensure that no rogue NULLs appear in your data.
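A minimal sketch of that suggestion (the table and column names here are made up for illustration):
CREATE TABLE dbo.Customers (
    CustomerId int NOT NULL PRIMARY KEY,
    Notes nvarchar(200) NOT NULL
        CONSTRAINT DF_Customers_Notes DEFAULT (N'')  -- empty string instead of NULL
)
-- Rows inserted without a Notes value get the empty string, never NULL:
INSERT INTO dbo.Customers (CustomerId) VALUES (1)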
The concept of the NULL value is a common source of confusion. NULL is not the same as an empty string, or a value of zero.
Conceptually, NULL means "a missing unknown value" and it is treated somewhat differently from other values. For example, to test for NULL, you cannot use the arithmetic comparison operators such as =, <, or <>.
If you have columns that may contain "a missing unknown value", you have to set them to accept NULLs. On the other hand, an empty string simply means that the value is known, but is empty.
For example: If a "Middle Name" field in a "Users" table is set to NULL, it should mean that the middle name of that user is not known. The user might or might not have a middle name. However, if the "Middle Name" field is set to an empty string, this should mean that the user is known to have no middle name.
If anything I'd say that it'll be easier to use the table if you don't allow NULLs, since you won't have to check for NULLs everywhere in code, so I'd only set a column to allow NULLs if I need to handle unknown rather than empty values.
What ho1 said. But you'd be ill-advised to define a column NOT NULL and then have a special value for 'Unknown'.
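To see the practical difference between the two "no value" representations in queries, here is a throwaway sketch:
CREATE TABLE #Users (MiddleName nvarchar(50) NULL)
INSERT INTO #Users VALUES (N'Ann'), (N''), (NULL)
SELECT COUNT(*)          AS total_rows,       -- 3
       COUNT(MiddleName) AS non_null_values,  -- 2: aggregates skip NULL, but not ''
       SUM(CASE WHEN MiddleName = N'' THEN 1 ELSE 0 END) AS empty_strings  -- 1
FROM #Users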