Unexpected behavior of binary conversions (COALESCE vs. ISNULL) - sql

Can you comment on what approach shown below is preferable? I hope the question will not be blocked as "opinionated". I would like to believe there is an explanation that makes that clear.
Context: I have code that mirrors the contents of a 3rd-party table into my own table (an optimization). It worked flawlessly for some time, until the size/modification of the database reached some threshold.
The optimization is based on the row version values of several source tables, and on remembering the maximum of those values. This way I am able to update my local table incrementally, which is much faster than rebuilding it from scratch from time to time.
The problem started to appear when the row-version value exceeded what fits into 4 bytes. After some effort, I spotted that the upper 4 bytes of the binary(8) value were set to 0. Later, the culprit was found to have the form COALESCE(MAX(row_version), 1).
The COALESCE was used to cover the case when the local table is fresh, containing no data -- so that the MAX(row_version) of the source tables is compared with something meaningful.
The examples to show the bug: To simulate the last mentioned situation, I want to convert the NULL value of the binary(8) column to 1. I am adding also the ISNULL usage that was added later. The original code contained the COALESCE only.
DECLARE @bin8null binary(8) = NULL
SELECT 'bin NULL' AS the_variable, @bin8null AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8null, 1) AS good_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8null, 1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8null, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8null, 0x1) AS bad_value
(There is a bug in the image coalesce 0x1 + convert fixed later in the code to coalesce 1 + convert, but not fixed in the image.)
The application bug appeared when the binary value was bigger than the part that could be stored in 4 bytes. Here the 0xAAAAAAAA was used. (Actually, the 0x00000001 was the case, and it was difficult to spot that the single 1 was changed to 0.)
DECLARE @bin8 binary(8) = 0xAAAAAAAA01BB3A35
SELECT 'bin' AS the_variable, @bin8 AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8, 1) AS bad_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8, 1)) AS bad_value
SELECT 'coalesce 0x1 + convert ' AS op, CONVERT(binary(8), COALESCE(@bin8, 0x1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8, 0x1) AS good_value
When executed in Microsoft SQL Server Management Studio on MS-SQL Server 2014, the result looks like this:
Description -- my understanding: The COALESCE() seems to derive the type of the result from the type of the last processed argument. This way, the non-NULL binary(8) was converted to int, and that led to the loss of the upper 4 bytes. (See the 2nd and 3rd red bad_value on the picture. The difference between the two cases is only in the decimal/hexadecimal form of display.)
On the other hand, the ISNULL() seems to preserve the type of the first argument, and converts the second value to that type. One should be careful to understand that binary(8) is more like a series of bytes. Reading it as one large integer is only an interpretation. Hence, 0x1 as the default value is not expanded like an 8-byte integer and produces a bad value.
My solution: So, I have fixed the bug using ISNULL(MAX(row_version), 1). Is that correct?

This is not a bug. They're documented to handle data type precedence differently. COALESCE determines the data type of the output based on examining all of the arguments, while ISNULL has a more simplistic approach of inspecting only the first argument. (Both still need to contain values which are all compatible, meaning they are all possible to convert to the determined output type.)
From the COALESCE topic:
Returns the data type of expression with the highest data type precedence.
The ISNULL topic does not make this distinction in the same way, but implicitly states that the first expression determines the type:
replacement_value must be of a type that is implicitly convertible to the type of check_expression.
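One way to see which type each function settles on is SQL_VARIANT_PROPERTY. The following is only a minimal sketch that reuses the binary(8) value from the question; the column aliases are made up for illustration.

```sql
-- Sketch: inspect the result type that COALESCE and ISNULL choose.
DECLARE @bin8 binary(8) = 0xAAAAAAAA01BB3A35;

SELECT
    -- int: the highest-precedence type among the arguments
    SQL_VARIANT_PROPERTY(COALESCE(@bin8, 1), 'BaseType') AS coalesce_type,
    -- binary: the type of the first argument
    SQL_VARIANT_PROPERTY(ISNULL(@bin8, 1), 'BaseType')   AS isnull_type;
```

The same check can be pointed at any expression whose result type is in doubt, as long as the value itself is not NULL.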
I have a similar example (and describe several other differences between COALESCE and ISNULL) here. Basically:
DECLARE @int int, @datetime datetime;
SELECT COALESCE(@int, CURRENT_TIMESTAMP);
-- works because datetime has a higher precedence than int, so datetime is the chosen output type
2020-08-20 09:39:41.763
GO
DECLARE @int int, @datetime datetime;
SELECT ISNULL(@int, CURRENT_TIMESTAMP);
-- fails because int, the first (and chosen) output type, has a lower precedence than datetime
Msg 257, Level 16, State 3
Implicit conversion from data type datetime to int is not allowed. Use the CONVERT function to run this query.

Let me start off by saying:
This is not a "bug".
ISNULL and COALESCE are not the same function, and operate quite differently.
ISNULL takes 2 parameters, and returns the second parameter if the first has the value NULL. If the 2 parameters have different data types, then the data type of the first parameter is returned (implicitly casting the second value).
COALESCE takes 2+ parameters, and returns the first non-NULL parameter. COALESCE is a shorthand CASE expression, and uses Data Type Precedence to determine the returned data type.
This is why ISNULL returns what you expect: there is no implicit conversion in your query for the non-NULL variable.
For the COALESCE there is implicit conversion. binary has the lowest precedence of all the data types, with a rank of 30 (at time of writing). The value 1 is an int, which has a rank of 16 and therefore a far higher precedence than binary.
As a result COALESCE(@bin8, 1) will implicitly convert the value 0xAAAAAAAA01BB3A35 to an int and then return that value. You can see this because SELECT CONVERT(int,0xAAAAAAAA01BB3A35) returns 29047349, which is your first "bad" value; it's not "bad", it's correct for what you wrote.
Then for the latter "bad" value, we can convert that int value (29047349) back to a binary, which results in 0x0000000001BB3A35, which is, again the result you get.
TL;DR: checking the return types of functions is important. ISNULL returns the data type of the first parameter and will implicitly convert the second if needed. COALESCE uses Data Type Precedence, and will implicitly convert the returned value to the data type with the highest precedence of all the possible return values.
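If COALESCE is preferred, one option is to keep every argument in the binary family so that no int conversion takes place. This is a sketch under that assumption, not the asker's actual code; the variable names follow the question and the aliases are illustrative.

```sql
-- Sketch: an explicit 8-byte default keeps every COALESCE argument binary.
DECLARE @bin8     binary(8) = 0xAAAAAAAA01BB3A35;
DECLARE @bin8null binary(8) = NULL;

SELECT COALESCE(@bin8,     CONVERT(binary(8), 1)) AS kept_value,     -- 0xAAAAAAAA01BB3A35
       COALESCE(@bin8null, CONVERT(binary(8), 1)) AS default_value;  -- 0x0000000000000001

-- The truncation from the question, reproduced step by step.
SELECT CONVERT(int, @bin8)                     AS as_int,      -- 29047349 (low 4 bytes only)
       CONVERT(binary(8), CONVERT(int, @bin8)) AS round_trip;  -- 0x0000000001BB3A35
```

Converting the integer 1 explicitly also avoids the 0x1 pitfall from the question, where a short binary literal is padded on the right rather than on the left.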

Related

Select case returning an error when both elements not varchar in some cases

I wanted to return a value formatted with commas at every thousand if a number or just the value if it wasn't a number
I used the following statement which returned the error:
Conversion failed when converting the nvarchar value '1,000' to data type int.
Declare @QuantityToDelete int = 1000
SELECT CASE
WHEN ISNUMERIC(@QuantityToDelete)=1
THEN format(cast(@QuantityToDelete as int),'N0')
ELSE @QuantityToDelete
END [Result]
I can get it to work by using the following
SELECT CASE
WHEN ISNUMERIC(@QuantityToDelete)=1
THEN format(cast(@QuantityToDelete as int),'N0')
ELSE cast(@QuantityToDelete as varchar)
END [Result]
Result=1,000
Why doesn't the first example work when the ELSE @QuantityToDelete part of the statement isn't returned?
If I use the below, switching the logic condition
SELECT CASE
WHEN ISNUMERIC(@QuantityToDelete)=0
THEN format(cast(@QuantityToDelete as int),'N0')
ELSE @QuantityToDelete
END [Result]
Result=1000
That is the expected result, but why is there no error? The case statement still has mismatched return types (an nvarchar and an int), as in the first example, just with different logic.
The important point to note is that a case expression returns a single scalar value, and that value has a single data type.
A case expression is fixed, it must evaluate the same and work the same for that query at runtime no matter what data flows through the query - in other words, the result of the case expression cannot be an int for some rows and a string for others.
Remember that the result of a query can be thought of, and used as, a table - so just like a table where you define a column as being a specific data type, you cannot have a column where the data type can be different for different rows of data.
Therefore with a case expression, SQL Server must determine at compile time what the resulting data type will be, which it does (if necessary) using data type precedence. If the case expression has different data types returned in different execution paths then it will attempt to implicitly cast them to the type with the highest precedence.
Hence your case expression that attempts to return two different data types fails because it's trying to return both a nvarchar and int and SQL Server is implicitly casting the nvarchar value to an int - and failing.
The second one works because you are controlling the casting and both paths result in the same varchar data type which works fine.
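The compile-time choice of type can be observed directly with SQL_VARIANT_PROPERTY. This is a minimal sketch that reuses the variable from the question; the conditions are picked only so that the int branch is the one evaluated, and the aliases are illustrative.

```sql
-- Sketch: the CASE result type is fixed by data type precedence, not by the branch taken.
DECLARE @QuantityToDelete int = 1000;

-- nvarchar (from FORMAT) and int are mixed, so int wins.
SELECT SQL_VARIANT_PROPERTY(
           CASE WHEN @QuantityToDelete < 0
                THEN FORMAT(@QuantityToDelete, 'N0')  -- nvarchar branch, not taken here
                ELSE @QuantityToDelete                -- int branch
           END, 'BaseType') AS mixed_case_type;       -- int

-- Both branches are character types, so the result stays a string.
SELECT SQL_VARIANT_PROPERTY(
           CASE WHEN @QuantityToDelete >= 0
                THEN FORMAT(@QuantityToDelete, 'N0')
                ELSE CAST(@QuantityToDelete AS varchar(20))
           END, 'BaseType') AS string_case_type;      -- nvarchar
```

The first query raises no error only because the branch that is actually evaluated already returns an int; the formatted string never has to be converted.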
Also note that when defining a varchar it's good practice to specify its length. It is easy to get complacent because it works here: the default length is 30 when casting, but the default is 1 otherwise (for example, in a variable declaration).
See the relevant part of the documentation
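The two default lengths are easy to confirm. A short sketch, with made-up values and aliases:

```sql
-- Sketch: varchar without a length behaves differently in DECLARE and in CAST.
DECLARE @v varchar = 'abcdef';                                   -- declared length defaults to 1

SELECT @v AS declared_value,                                     -- 'a' (silently truncated)
       LEN(CAST(REPLICATE('x', 40) AS varchar)) AS cast_length;  -- 30
```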

Is ISNULL() specific to integers?

This has been bothering me with my coding continuously and I can't seem to google a good workaround.
I have a number of columns which are data type nvarchar(255). Pretty standard I would assume.
Anyway, I want to run:
DELETE FROM Ranks WHERE ISNULL(INST,0) = 0
where INST is nvarchar(255). I am thrown the error:
Conversion failed when converting the nvarchar value 'Un' to data type int.
which is the first non-null value in the column. However, I don't care about that value; doesn't the error just show that it's not null? I just want to delete the nulls!
Is there something simple I'm missing.
Any help would be fab!
An expression may only be of one type.
Expression ISNULL(INST,0) involves two source types, nvarchar(255) and int. However, no type change happens at this point, because ISNULL is documented to return the type of its first argument (nvarchar), and will convert the second argument to that type if needed, so the entire original expression is equivalent to ISNULL(INST, '0').
Next step is the comparison expression, ISNULL(INST, '0') = 0. It again has nvarchar(255) and int as the source data types, but this time nothing can stop the conversion - in fact, it must happen for the comparison operator, =, to even work. According to the data type precedence list, the int wins, and is chosen as the resulting type of the comparison expression. Hence all values from column INST must be converted to int before the comparison = 0 is made.
If you "just want to delete the nulls", then just delete the nulls:
DELETE FROM Ranks WHERE INST IS NULL
If for some reason you absolutely have to use isnull in this fashion, which there is no real reason for, then you should have stayed in the realm of strings:
DELETE FROM Ranks WHERE ISNULL(INST, '') = ''
That would have deleted null entries and entries with empty strings (''), just like the WHERE ISNULL(INST, 0) = 0 would have deleted null entries and entries with '0's if all values in INST could have been converted to int.
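The two steps described above can be tried on a small table variable. This is only a sketch: @Ranks stands in for the real Ranks table and the sample rows are made up.

```sql
-- Sketch: ISNULL keeps nvarchar; only the later comparison with an int forces a conversion.
DECLARE @Ranks TABLE (INST nvarchar(255) NULL);
INSERT INTO @Ranks (INST) VALUES (N'Un'), (NULL), (N'');

-- The replacement value 0 is converted to N'0', so the expression stays nvarchar.
SELECT INST,
       ISNULL(INST, 0) AS replaced,
       SQL_VARIANT_PROPERTY(ISNULL(INST, 0), 'BaseType') AS base_type  -- nvarchar
FROM @Ranks;

-- IS NULL needs no replacement value, so no conversion is involved at all.
DELETE FROM @Ranks WHERE INST IS NULL;
SELECT COUNT(*) AS remaining_rows FROM @Ranks;  -- 2
```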
With ISNULL(INST,0) you are saying: If the string INST is null, replace it with the string 0. But 0 isn't a string, so this makes no sense.
With WHERE ISNULL(INST,0) = 0 you'd access all rows where INST is either NULL or 0 (but as mentioned a string is not an integer).
So what do you want to achieve? Delete all rows where INST is null? That would be
DELETE FROM ranks WHERE inst IS NULL;

Returning a varchar value from a coalesced int calculation

I'm a newbie learning my way around T-SQL using the AdventureWorks2012 database. I'm using SQL Server 2014, though a solution that would also work with 2008 would be great. I've been given the below exercise:
Write a query using the Sales.SpecialOffer table. Display the difference between the MinQty and MaxQty columns along with the SpecialOfferID and Description columns.
Thing is, MaxQty allows for null values, so I'm trying to come up with a real world solution for an output that doesn't involve leaving nulls in there. However, when I try to use coalesce to return 'No Max' (yes, I get that I could just leave NULL in there but I'm trying to see if I can figure this out), I get the message that the varchar value 'No Max' couldn't be converted to data type int. I'm assuming this is because MaxQty - MinQty as an int takes precedence?
select
specialofferid
, description
, coalesce((maxqty - minqty),'No Max') 'Qty_Difference'
from
sales.specialoffer;
Error:
Msg 245, Level 16, State 1, Line 135
Conversion failed when converting the varchar value 'No max' to data type int.
I thought about just returning a nonsense integer (0 or a negative) but that doesn't seem perfect - if return 0 I'm obscuring situations where the result is actually zero, etc.
Thoughts?
You just need to make sure that all the parameters of the COALESCE function call have consistent data types. Because you can't get around the fact No Max is a string, then you have to make sure that the maxqty - minqty part is also treated as a string by casting the expression.
select specialofferid
, description
, coalesce(cast(maxqty - minqty as varchar),'No Max') 'Qty_Difference'
from sales.specialoffer;
EDIT: A few more details on the cause of the error
Without the explicit cast, the reason why the COALESCE function attempts to convert the No Max string to an int can be explained by the following documented rule:
Data type determination of the resulting expression is different. ISNULL uses the data type of the first parameter, COALESCE follows the CASE expression rules and returns the data type of value with the highest precedence.
And if you check the precedence of the different types, as documented here, then you will see that int has higher precedence than varchar.
So as soon as you have a mix of data types in the call to COALESCE, SQL Server will try to convert all mismatching parameters to the data type with highest precedence, in this case int. To override that default behavior, explicit type casting is required.
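The effect of the explicit cast can be checked without AdventureWorks. In this sketch a table variable stands in for Sales.SpecialOffer, and the two sample rows are invented for illustration.

```sql
-- Sketch: cast the int expression first so both COALESCE arguments are varchar.
DECLARE @SpecialOffer TABLE (SpecialOfferID int NOT NULL, MinQty int NOT NULL, MaxQty int NULL);
INSERT INTO @SpecialOffer (SpecialOfferID, MinQty, MaxQty) VALUES (1, 0, NULL), (2, 11, 14);

SELECT SpecialOfferID,
       COALESCE(CAST(MaxQty - MinQty AS varchar(12)), 'No Max') AS Qty_Difference
FROM @SpecialOffer
ORDER BY SpecialOfferID;
-- SpecialOfferID 1 -> 'No Max', SpecialOfferID 2 -> '3'
```

A length of 12 is enough for any int, including the sign, and avoids relying on the default length of an unsized varchar.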
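I would use a case expression so you can do whatever you want. Both branches must return a string, so the difference is cast, and the column alias goes after END.
select specialofferid
, description
, CASE
WHEN maxqty is null THEN 'No Max'
ELSE cast(maxqty - minqty as varchar(12))
END AS Qty_Difference
from sales.specialoffer;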

Why IsNull(LTrim(RTrim(Lower(null))), -1) is *?

Today I was testing something at work and came across this one
Case 1:
Declare @a nvarchar(20)
Set @a = null
Select IsNull(LTrim(RTrim(Lower(@a))), -1)
Case 2:
Select IsNull(LTrim(RTrim(Lower(null))), -1)
The result in case 1 is -1 but * in case 2
I was expecting same results in both cases. Any reason?
Without the declaration of data type, null in this case is declared as varchar(1). You can observe this by selecting the results into a #temp table:
Select IsNull(LTrim(RTrim(Lower(null))), -1) as x INTO #x;
EXEC tempdb..sp_help '#x';
Among the results you'll see:
Column_name Type Length
----------- ------- ------
x varchar 1
Since -1 can't fit in a varchar(1), you are getting * as output. This is similar to:
SELECT CONVERT(VARCHAR(1), -1);
If you want to collapse to a string, then I suggest enclosing the integer in single quotes so there is no confusion caused by integer <-> string conversions that aren't intended:
SELECT CONVERT(VARCHAR(1), '-1'); -- yields "-"
SELECT CONVERT(VARCHAR(30), '-1'); -- yields "-1"
I would not make any assumptions about how SQL Server will handle a "value" explicitly provided as null, especially when complex expressions make it difficult to predict which evaluation rules might trump data type precedence.
In SQL Server, there are "typed NULLs" and "untyped NULLs".
In the first case, the NULL is typed—it is aware that NULL is a varchar(20) and so as your functions wrap the inner value, that data type is propagated throughout the expression.
In the second case, the NULL is untyped, so it has to infer the NULL's type from the surrounding expressions. The IsNull function evaluates the data type of the first operand and applies that to the whole expression, and thus the NULL defaults to varchar(1):
PRINT sql_variant_property(IsNull(LTrim(NULL), -1), 'BaseType'); -- varchar
PRINT sql_variant_property(IsNull(LTrim(NULL), -1), 'MaxLength'); -- 1
Another complication is that IsNull does not do type promotion in the same way that Coalesce does (though Coalesce has its own problems due to not being a function—it is expanded to a CASE expression, sometimes causing unexpected side-effects due to repeat expression evaluation). Look:
SELECT Coalesce(LTrim(NULL), -1);
This results in -1 with data type int!
Check out Sql Server Data Type Precedence and you'll see that int is much higher than varchar, so the whole expression becomes int.
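Giving the NULL an explicit type makes both cases behave the same way. A minimal sketch that puts the two forms side by side; the aliases are illustrative:

```sql
-- Sketch: an untyped NULL defaults to varchar(1), a typed NULL keeps its declared length.
SELECT ISNULL(LTRIM(RTRIM(LOWER(NULL))), -1)                       AS untyped_null,  -- '*'
       ISNULL(LTRIM(RTRIM(LOWER(CAST(NULL AS nvarchar(20))))), -1) AS typed_null;    -- '-1'
```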
The naked NULL is being passed to LOWER(), which expects a character. This is being defaulted to one character wide. The value "-1" doesn't fit in this field, so it is returning "*".
You can get the same effect with:
select isnull(CAST(NULL as varchar(1)), -1)
The following code also causes the problem:
declare @val varchar;
set @val = -1
select @val
Note that COALESCE() does not cause this problem.
I'm pretty sure this is fully documented behavior.

Multiplication with NULL and empty column values in SQL

This was my Interview Question
there are two columns called Length and Breadth in Area table
Length  Breadth  Length*Breadth
20      NULL     ?
30               ?
21.2    1        ?
I tried running the same question on MySQL. To insert an empty value I tried the query below. Am I missing anything when inserting empty values in MySQL?
insert into test.new_table values (30,);
Answers: With Null,Result is Null.
With float and int multiplication result is float
As per your question the expected results would be as below.
SELECT LENGTH,BREADTH,LENGTH*BREADTH AS CALC_AREA FROM AREA;
LENGTH  BREADTH  CALC_AREA
20
30      0        0
21.2    1        21.2
For the first record, in SQL Server, a computation with NULL gives NULL.
For the second record, in SQL Server, the product of a non-empty value and an empty value is zero, as the empty value is treated as zero.
For the third record, in SQL Server, a computation between two non-empty values gives a non-empty value.
Check SQL Fiddle for reference - http://sqlfiddle.com/#!3/f250a/1
That blank Breadth (second row) cannot happen unless Breadth is VARCHAR. Assuming that, the answers will be:
NULL (since NULL times anything is NULL)
Throws error (since an empty string is not a number. In Sql Server, the error is "Error converting data type varchar to numeric.")
21.20 (since in Sql Server, for example, conversion to a numeric type is automatic, so SELECT 21.2 * '1' returns 21.20).
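Whether an empty string turns into zero or raises an error in SQL Server depends on the numeric type it is converted to. A small sketch of both outcomes; TRY_CONVERT is used for the decimal case so that the batch keeps running, and it requires SQL Server 2012 or later.

```sql
-- Sketch: converting an empty string to different numeric types.
SELECT CAST('' AS int)                AS empty_as_int,      -- 0
       TRY_CONVERT(decimal(10,2), '') AS empty_as_decimal;  -- NULL (a plain CAST raises an error here)
```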
Assuming that Length and Breadth are numerical types of some kind the second record does not contain possible values — Breadth must be either 0 or NULL.
In any event, any mathematical operation in SQL involving a NULL value will return the value NULL, indicating that the expression cannot be evaluated. The answers are NULL, impossible, and 21.2.
The product of any value and NULL is NULL. This is called "NULL propagation" if you want to Google it. To score points in an interview, you might want to mention that NULL isn't a value; it's a special marker.
The fact that the column Breadth has one entry "NULL" and one entry that's blank (on the second row) is misleading. A numeric column that doesn't have a value in a particular row means that row is NULL. So the second column should also show "NULL".
The answer to the third row, 21.2 * 1, depends on the data type of the column "Length*Breadth". If it's a data type like float, double, or numeric(16,2), the answer is 21.2. If it's an integer column (integer, long, etc.), the answer is 21.
A more snarky answer might be "There's no answer. The string "Length*Breadth" isn't a legal SQL column name."
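NULL propagation is easy to reproduce in SQL Server. In this sketch a table variable stands in for the Area table; the column types and the decision to treat a missing Breadth as 0 are assumptions made only for illustration.

```sql
-- Sketch: a product with NULL is NULL unless the NULL is replaced first.
DECLARE @Area TABLE (Length decimal(10,2) NOT NULL, Breadth int NULL);
INSERT INTO @Area (Length, Breadth) VALUES (20, NULL), (21.2, 1);

SELECT Length,
       Breadth,
       Length * Breadth            AS area,          -- NULL when Breadth is NULL
       Length * ISNULL(Breadth, 0) AS area_or_zero   -- a missing Breadth counts as 0
FROM @Area
ORDER BY Length;
```

Whether 0 is a sensible replacement depends on what a missing Breadth means in the data; leaving the result NULL is often the more honest answer.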
In standard SQL they would all generate errors because you are comparing values (or nulls) of different types:
CAST ( 20 AS FLOAT ) * CAST ( NULL AS INTEGER ) -- mismatched types error
CAST ( '' AS INTEGER ) -- type conversion error
CAST ( AS INTEGER ) -- type conversion error
CAST ( 21.2 AS FLOAT ) * CAST ( 2 AS INTEGER ) -- mismatched types error
On the other hand, most SQL products would implicitly cast values when comparing values (or nulls) of different types according to type precedence, e.g. comparing a float value to an integer value would in effect cast the integer to float and result in a float. At the product level, the most interesting question is what happens when you compare a null of type integer with a value (or even a null) of type float...
...but, frankly, not terribly interesting. In an interview you are presented with a framework (in the form of questions asked of you) on which to present your knowledge, skills and experience. The 'answer' here is to discuss nulls (e.g. point out that nulls are tricky to define and behave in unintuitive ways, which leads to frequent bugs and a desire to avoid nulls entirely, etc) and whether implicit casting is a good thing.