Why IsNull(LTrim(RTrim(Lower(null))), -1) is *? - sql

Today I was testing something at work and came across this:
Case 1:
Declare @a nvarchar(20)
Set @a = null
Select IsNull(LTrim(RTrim(Lower(@a))), -1)
Case 2:
Select IsNull(LTrim(RTrim(Lower(null))), -1)
The result in case 1 is -1, but case 2 returns *.
I was expecting the same result in both cases. Why the difference?

Without the declaration of data type, null in this case is declared as varchar(1). You can observe this by selecting the results into a #temp table:
Select IsNull(LTrim(RTrim(Lower(null))), -1) as x INTO #x;
EXEC tempdb..sp_help '#x';
Among the results you'll see:
Column_name Type Length
----------- ------- ------
x varchar 1
Since -1 can't fit in a varchar(1), you are getting * as output. This is similar to:
SELECT CONVERT(VARCHAR(1), -1);
If you want to collapse to a string, then I suggest enclosing the integer in single quotes so there is no confusion caused by integer <-> string conversions that aren't intended:
SELECT CONVERT(VARCHAR(1), '-1'); -- yields "-"
SELECT CONVERT(VARCHAR(30), '-1'); -- yields "-1"
I would not make any assumptions about how SQL Server will handle a "value" explicitly provided as null, especially when complex expressions make it difficult to predict which evaluation rules might trump data type precedence.
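A minimal sketch of one way to avoid the surprise, assuming you want a string result: give the NULL an explicit type so that ISNULL has room for the replacement. The nvarchar(20) length is borrowed from the question; any sufficiently wide type should behave the same way.

```sql
-- Typing the NULL explicitly makes ISNULL return nvarchar(20),
-- so the two-character replacement is no longer truncated to "*".
SELECT ISNULL(LTRIM(RTRIM(LOWER(CAST(NULL AS nvarchar(20))))), '-1') AS x;  -- yields "-1"
```

The replacement is passed as the string '-1' rather than the integer -1, in line with the suggestion above to keep integer/string conversions out of the expression.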

In SQL Server, there are "typed NULLs" and "untyped NULLs".
In the first case, the NULL is typed—it is aware that NULL is a varchar(20) and so as your functions wrap the inner value, that data type is propagated throughout the expression.
In the second case, the NULL is untyped, so it has to infer the NULL's type from the surrounding expressions. The IsNull function evaluates the data type of the first operand and applies that to the whole expression, and thus the NULL defaults to varchar(1):
PRINT sql_variant_property(IsNull(LTrim(NULL), -1), 'BaseType'); -- varchar
PRINT sql_variant_property(IsNull(LTrim(NULL), -1), 'MaxLength'); -- 1
Another complication is that IsNull does not do type promotion in the same way that Coalesce does (though Coalesce has its own problems due to not being a function—it is expanded to a CASE expression, sometimes causing unexpected side-effects due to repeat expression evaluation). Look:
SELECT Coalesce(LTrim(NULL), -1);
This results in -1 with data type int!
Check out Sql Server Data Type Precedence and you'll see that int is much higher than varchar, so the whole expression becomes int.

The naked NULL is being passed to LOWER(), which expects a character. This is being defaulted to one character wide. The value "-1" doesn't fit in this field, so it is returning "*".
You can get the same effect with:
select isnull(CAST(NULL as varchar(1)), -1)
The following code also causes the problem:
declare @val varchar;
set @val = -1
select @val
Note that COALESCE() does not cause this problem.
I'm pretty sure this is fully documented behavior.

Related

Unexpected behavior of binary conversions (COALESCE vs. ISNULL)

Can you comment on what approach shown below is preferable? I hope the question will not be blocked as "opinionated". I would like to believe there is an explanation that makes that clear.
Context: I have code for mirroring the contents of a 3rd-party table into my own table (an optimization). It worked flawlessly for some time, until the size/modification of the database reached some threshold.
The optimization is based on the row version values of several tables, and on remembering the maximum of those values from the source tables. This way I am able to update my local table incrementally, much faster than rebuilding it from scratch from time to time.
The problem started to appear when the row-version value no longer fit in 4 bytes. After some effort, I spotted that the upper 4 bytes of the binary(8) value were set to 0. Later, the culprit was found to have the form COALESCE(MAX(row_version), 1).
The COALESCE was used to cover the case when the local table is fresh, containing no data -- for comparing the MAX(row_version) of source tables with something meaningful.
The examples to show the bug: To simulate the last mentioned situation, I want to convert the NULL value of the binary(8) column to 1. I am adding also the ISNULL usage that was added later. The original code contained the COALESCE only.
DECLARE @bin8null binary(8) = NULL
SELECT 'bin NULL' AS the_variable, @bin8null AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8null, 1) AS good_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8null, 1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8null, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8null, 0x1) AS bad_value
(There is a bug in the image coalesce 0x1 + convert fixed later in the code to coalesce 1 + convert, but not fixed in the image.)
The application bug appeared when the binary value was bigger than the part that could be stored in 4 bytes. Here the 0xAAAAAAAA was used. (Actually, the 0x00000001 was the case, and it was difficult to spot that the single 1 was changed to 0.)
DECLARE @bin8 binary(8) = 0xAAAAAAAA01BB3A35
SELECT 'bin' AS the_variable, @bin8 AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8, 1) AS bad_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8, 1)) AS bad_value
SELECT 'coalesce 0x1 + convert ' AS op, CONVERT(binary(8), COALESCE(@bin8, 0x1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8, 0x1) AS good_value
When executed in Microsoft SQL Server Management Studio on MS-SQL Server 2014, the result looks like this:
Description -- my understanding: The COALESCE() seems to derive the type of the result from the type of the last processed argument. This way, the non-NULL binary(8) was converted to int, and that led to the loss of the upper 4 bytes. (See the 2nd and 3rd red bad_value on the picture. The difference between the two cases is only in decimal/hexadecimal form of display.)
On the other hand, the ISNULL() seems to preserve the type of the first argument, and converts the second value to that type. One should be careful to understand that binary(8) is more like a series of bytes. The interpretation as one large integer is only an interpretation. Hence, 0x1 as the default value is not expanded like an 8-byte integer and produces a bad value.
My solution: So, I have fixed the bug using ISNULL(MAX(row_version), 1). Is that correct?
This is not a bug. They're documented to handle data type precedence differently. COALESCE determines the data type of the output based on examining all of the arguments, while ISNULL has a more simplistic approach of inspecting only the first argument. (Both still need to contain values which are all compatible, meaning they are all possible to convert to the determined output type.)
From the COALESCE topic:
Returns the data type of expression with the highest data type precedence.
The ISNULL topic does not make this distinction in the same way, but implicitly states that the first expression determines the type:
replacement_value must be of a type that is implicitly convertible to the type of check_expression.
I have a similar example (and describe several other differences between COALESCE and ISNULL) here. Basically:
DECLARE @int int, @datetime datetime;
SELECT COALESCE(@int, CURRENT_TIMESTAMP);
-- works because datetime has a higher precedence than int, so the output type is datetime
2020-08-20 09:39:41.763
GO
DECLARE @int int, @datetime datetime;
SELECT ISNULL(@int, CURRENT_TIMESTAMP);
-- fails because int, the first (and chosen) output type, has a lower precedence than datetime
Msg 257, Level 16, State 3
Implicit conversion from data type datetime to int is not allowed. Use the CONVERT function to run this query.
Let me start off by saying:
This is not a "bug".
ISNULL and COALESCE are not the same function, and operate quite differently.
ISNULL takes 2 parameters, and returns the second parameter if the first has a value of NULL. If the 2 parameters have different data types, then the data type of the first parameter is returned (implicitly casting the second value).
COALESCE takes 2+ parameters, and returns the first non-NULL parameter. COALESCE is a shorthand CASE expression, and uses Data Type Precedence to determine the returned data type.
As a result, this is why ISNULL returns what you expect, there is no implicit conversion in your query for the non-NULL variable.
For the COALESCE there is implicit conversion. binary has the lowest precedence of all the data types, with a rank of 30 (at time of writing). The value 1 is an int, and has a precedence of 16; far higher than 30.
As a result COALESCE(@bin8, 1) will implicitly convert the value 0xAAAAAAAA01BB3A35 to an int and then return that value. You can see this because SELECT CONVERT(int,0xAAAAAAAA01BB3A35) returns 29047349, which is your first "bad" value; it's not "bad", it's correct for what you wrote.
Then for the latter "bad" value, we can convert that int value (29047349) back to a binary, which results in 0x0000000001BB3A35, which is, again, the result you get.
TL;DR: checking return types of functions is important. ISNULL returns the data type of the first parameter and will implicitly convert the second if needed. COALESCE uses Data Type Precedence, and will implicitly convert the returned value to the data type with the highest precedence of all the possible return values.
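One way to sidestep the difference altogether is to give the default the same type as the column, so that neither function has anything to convert. A minimal sketch, reusing the values from the question; the @one variable is a hypothetical name introduced here for illustration:

```sql
DECLARE @bin8null binary(8) = NULL;
DECLARE @bin8     binary(8) = 0xAAAAAAAA01BB3A35;
-- Hypothetical helper: the default, already typed as binary(8).
DECLARE @one      binary(8) = CONVERT(binary(8), 1);  -- 0x0000000000000001

-- Both arguments are binary(8), so COALESCE and ISNULL agree on the result type.
SELECT COALESCE(@bin8null, @one) AS coalesce_null,   -- 0x0000000000000001
       COALESCE(@bin8, @one)     AS coalesce_value,  -- 0xAAAAAAAA01BB3A35
       ISNULL(@bin8, @one)       AS isnull_value;    -- 0xAAAAAAAA01BB3A35
```

With the types aligned, the choice between ISNULL and COALESCE no longer affects the upper 4 bytes.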

Conversion failed when converting the varchar value 'x' to data type int

I have a WHERE condition which works for a column of type STRING but fails for one of type INT
isnull(emp.name, 'x') <> isnull(mst.name, 'x') -- works
But this condition below throws an error:
isnull(emp.age, 'x') <> isnull(mst.age, 'x') -- fails
Conversion failed when converting the varchar value 'x' to data type int.
where name is a STRING and age is of INT type.
How to rectify this?
Don't use isnull() for this purpose -- or even coalesce(). Just expand out the logic:
where (emp.age = mst.age or emp.age is null and mst.age is null)
You could put in a fake value and use coalesce(), but the types need to be consistent. However, I think you are better off with explicit logic that does what you want and works for all data types.
ISNULL will attempt to convert the datatype of second parameter to that of first parameter. The string x cannot be converted to int (assuming age is int). Use an integer value that, ideally, does not exist in your data:
isnull(emp.age, -1) <> isnull(mst.age, -1)
Note that in your code, 'x' will effectively be considered the same as NULL, which is unlikely to be what you want. Fixing the error you get for age will not correct that problem.
Consider instead using the operator IS DISTINCT FROM, which is similar to <> but considers NULL as a "known value" (NULL IS DISTINCT FROM NULL = FALSE for example).
emp.name IS DISTINCT FROM mst.name
emp.age IS DISTINCT FROM mst.age
If your database engine does not support IS DISTINCT FROM then this related question will be helpful: How to rewrite IS DISTINCT FROM and IS NOT DISTINCT FROM?
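On SQL Server, IS DISTINCT FROM only arrived in SQL Server 2022. For earlier versions, a commonly used NULL-safe substitute relies on INTERSECT, which treats two NULLs as equal. A minimal sketch, assuming hypothetical table variables @emp and @mst joined on a made-up id column:

```sql
-- Hypothetical sample data standing in for the emp and mst tables.
DECLARE @emp TABLE (id int, age int);
DECLARE @mst TABLE (id int, age int);
INSERT INTO @emp VALUES (1, 30), (2, NULL), (3, NULL);
INSERT INTO @mst VALUES (1, 31), (2, NULL), (3, 40);

-- NOT EXISTS (... INTERSECT ...) is true when the two values are distinct,
-- counting NULL vs NULL as "the same" and NULL vs a value as "different".
SELECT e.id
FROM @emp AS e
JOIN @mst AS m ON m.id = e.id
WHERE NOT EXISTS (SELECT e.age INTERSECT SELECT m.age);
-- returns ids 1 and 3
```

Because no placeholder value is involved, the same predicate works for int, varchar, and other comparable column types.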

SELECT vs UPDATE, Unexpected rounding when using ABS function

Attached is a code sample to run in SQL Server. This seems like unexpected behavior. ABS should only remove the negative sign from the number, but when the same function is used in an UPDATE, it takes the absolute value and also rounds the number. Why is this?
DECLARE @TEST TABLE (TEST varchar(2048));
INSERT INTO @TEST VALUES (' -29972.95');
SELECT TEST FROM @TEST;
SELECT ABS(TEST) FROM @TEST;
UPDATE @TEST SET TEST = ABS(TEST);
SELECT TEST FROM @TEST;
Below are the results of that code.
-29972.95
29972.95
29973
This seems more a "feature" of the CONVERT function than anything to do with SELECT or UPDATE (only reason it is different is because the UPDATE implicitly converts the FLOAT(8) returned by ABS(...) back into VARCHAR).
The compute scalar in the update plan contains the expression
[Expr1003] = Scalar Operator(CONVERT_IMPLICIT(varchar(2048),
abs(CONVERT_IMPLICIT(float(53),[TEST],0))
,0) /*<-- style used for convert from float*/
)
Value - Output
0 (default) - A maximum of 6 digits. Use in scientific notation, when appropriate.
1 - Always 8 digits. Always use in scientific notation.
2 - Always 16 digits. Always use in scientific notation.
From MSDN: https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-2017
This can be seen in the example below:
SELECT
[# Digits],
CONVERT(FLOAT(8), CONVERT(VARCHAR(20), N)) AS [FLOAT(VARCHAR(N))],
CONVERT(FLOAT(8), CONVERT(VARCHAR(20), N, 0)) AS [FLOAT(VARCHAR(N, 0))],
CONVERT(FLOAT(8), CONVERT(VARCHAR(20), N, 1)) AS [FLOAT(VARCHAR(N, 1))]
FROM (SELECT '6 digits', ABS('9972.95') UNION ALL SELECT '7 digits', ABS('29972.95')) T ([# Digits], N)
This returns the following results:
# Digits FLOAT(VARCHAR(N)) FLOAT(VARCHAR(N, 0)) FLOAT(VARCHAR(N, 1))
-------- ----------------- -------------------- --------------------
6 digits 9972.95 9972.95 9972.95
7 digits 29973 29973 29972.95
This proves the UPDATE was using CONVERT(VARCHAR, ABS(...)) effectively with the default style of "0". This limited the FLOAT from the ABS to 6 digits. Taking 1 character away so it does not overflow the implicit conversion, you retain the actual values in this scenario.
Taking this back to the OP:
The ABS function in this case is returning a FLOAT(8) in the example.
The UPDATE then caused an implicit conversion that was effectively CONVERT(VARCHAR(2048), ABS(...), 0), which then overflowed the max digits of the default style.
To get around this behavior (if this is related to a practical issue), you need to specify the style of 1 or 2 (or even 3 to get 17 digits) to avoid this truncation (but be sure to handle the scientific notation used, since it is now always returned in this case).
(some preliminary testing deleted for brevity)
It definitely has to do with silent truncating during INSERT/UPDATEs.
If you change the value insertion to this:
INSERT INTO @TEST SELECT ABS(' -29972.95')
You immediately get the same rounding/truncation without doing an UPDATE.
Meanwhile, SELECT ABS(' -29972.95') produces expected results.
Further testing supports the theory of an implicit float conversion, and indicates that the culprit lies with the conversion back to varchar:
DECLARE @Flt float = ' -29972.95'
SELECT @Flt;
SELECT CAST(@Flt AS varchar(2048))
Produces:
-29972.95
-29972
Probably final edit:
I was sniffing up the same tree as Martin. I found this.
Which made me try this:
DECLARE @Flt float = ' -29972.95'
SELECT @Flt;
SELECT CONVERT(varchar(2048),@Flt,128)
Which produced this:
-29972.95
-29972.95
So I'm gonna call this kinda documented since the 128 style is a legacy style that is deprecated and may go away in a future release. But none of the currently documented styles produce the same result. Very interesting.
ABS() is supposed to operate on numeric values and varchar input is converted to float. Most likely explanation for this behavior is that float has highest precedence among all numeric data types such as decimal, int, bit.
Your SELECT statement simply returns the float result. However the UPDATE statement implicitly converts the float back to varchar producing unexpected results:
SELECT
test,
ABS(test) AS test_abs,
CAST(ABS(test) AS VARCHAR(100)) AS test_abs_str
FROM (VALUES
('-29972.95'),
('-29972.94'),
('-29972.9')
) AS test(test)
test | test_abs | test_abs_str
----------|----------|-------------
-29972.95 | 29972.95 | 29973
-29972.94 | 29972.94 | 29972.9
-29972.9 | 29972.9 | 29972.9
I would suggest that you use explicit conversion and exact numeric datatype to avoid this and other potential problems with implicit conversions / floats:
SELECT
test,
ABS(CAST(test AS DECIMAL(18, 2))) AS test_abs,
CAST(ABS(CAST(test AS DECIMAL(18, 2))) AS VARCHAR(100)) AS test_abs_str
FROM (VALUES
('-29972.95'),
('-29972.94'),
('-29972.9')
) AS test(test)
test | test_abs | test_abs_str
----------|----------|-------------
-29972.95 | 29972.95 | 29972.95
-29972.94 | 29972.94 | 29972.94
-29972.9 | 29972.90 | 29972.90
ABS is a mathematical function, which means it is designed to work with numeric values; you cannot expect proper behavior when passing other data types such as VARCHAR. I suggest casting to a numeric data type before applying ABS, as follows:
UPDATE @TEST SET TEST = ABS(CAST(TEST AS DECIMAL(18,2)))
After this your query will output
29972.95
This does not explain how it is possible that ABS works fine when selecting but not when updating a value. Maybe it is a bug in SQL Server, but it is also really bad practice to skip casting to the data types that functions require. Maybe an implicit cast occurs when a SELECT is performed but is ignored on UPDATE because Microsoft expects you to do the right thing.

t-sql Different datatype possible in a case?

I have this query
SELECT
CASE WHEN dbo.CFE_PPHY.P77 IS NOT NULL OR dbo.CFE_PPHY.P77 <>''
THEN MONTH(dbo.CFE_PPHY.P77)
WHEN dbo.CFE_PPHY.P70 IS NOT NULL OR dbo.CFE_PPHY.P70 <>''
THEN MONTH(dbo.CFE_SERVICE_EVTS.C10_2)
ELSE COALESCE(CONVERT(VARCHAR,dbo.CFE_PPHY.P77)+
CONVERT(VARCHAR,dbo.CFE_SERVICE_EVTS.C10_2),'toto') END
AS CFELiasse_DateEffetEIRL_MM_N
FROM CFE_PPHY LEFT JOIN CFE_SERVICE_EVTS ON CFE_PPHY.colA = CFE_SERVICE_EVTS.colB
The ELSE part is giving me headaches.
The columns CFE_PPHY.P77 and CFE_SERVICE_EVTS.C10_2 have date time format. I'm turning them into varchar. Yet when I'm running the query, I have the following error
Msg 245, Level 16, State 1, Line 1 Conversion failed when converting the varchar value 'toto' to data type int.
Obviously, I cannot turn toto to an integer. Fair enough. However, from my point of view, I've converted the datetime format to a varchar format, so it should do the work.
Where am I wrong?
Thanks
You have to convert all of your CASE branches to varchar. SQL Server is deciding to cast the result as int, so 'toto' is invalid. If all branches are converted to varchar, this error should go away.
http://blog.sqlauthority.com/2010/10/08/sql-server-simple-explanation-of-data-type-precedence/
Have a closer look at your case expression: in the first and second conditional branches you're returning MONTH(... which is obviously integer.
But in the third branch you're returning varchar, so SQL Server tries to convert it to int, following the data type of the previous branches, and fails.
Try like this,
SELECT CASE
WHEN dbo.CFE_PPHY.P77 IS NOT NULL
OR dbo.CFE_PPHY.P77 <> ''
THEN convert(VARCHAR, MONTH(dbo.CFE_PPHY.P77))
WHEN dbo.CFE_PPHY.P70 IS NOT NULL
OR dbo.CFE_PPHY.P70 <> ''
THEN convert(VARCHAR, MONTH(dbo.CFE_SERVICE_EVTS.C10_2))
ELSE COALESCE(CONVERT(VARCHAR, dbo.CFE_PPHY.P77) + CONVERT(VARCHAR, dbo.CFE_SERVICE_EVTS.C10_2), 'toto')
END AS CFELiasse_DateEffetEIRL_MM_N
FROM CFE_PPHY
LEFT JOIN CFE_SERVICE_EVTS ON CFE_PPHY.colA = CFE_SERVICE_EVTS.colB
First, when converting to a string, always include a length (in SQL Server). The default length varies by context and may not be correct.
Second, the comparison of date/time values to '' is not necessary. This is not really valid value for a date/time -- although it does get converted to a 0 which is 1900-01-01. The NULL comparison should be sufficient. Otherwise, be explicit.
Third, string concatenation will return NULL if any of the arguments are NULL.
Fourth, table aliases make a query easier to write and to read.
As far as I can tell, your CASE is a bit overcomplicated. In the ELSE, we know that dbo.CFE_PPHY.P77 is NULL, because of the first condition. So, how about:
SELECT (CASE WHEN p.P77 IS NOT NULL
THEN CAST(MONTH(p.P77) as VARCHAR(255))
WHEN p.P70 IS NOT NULL
THEN CAST(MONTH(e.C10_2) as VARCHAR(255))
ELSE 'toto'
END) AS CFELiasse_DateEffetEIRL_MM_N
FROM CFE_PPHY p LEFT JOIN
CFE_SERVICE_EVTS e
ON p.colA = e.colB;

How does one filter based on whether a field can be converted to a numeric?

I've got a report that has been in use quite a while - in fact, the company's invoice system rests in a large part upon this report (Disclaimer: I didn't write it). The filtering is based upon whether a field of type VarChar(50) falls between two numeric values passed in by the user.
The problem is that the field the data is being filtered on now not only has simple non-numeric values such as '/A', 'TEST' and a slew of other non-numeric data, but also has numeric values that seem to be defying any type of numeric conversion I can think of.
The following (simplified) test query demonstrates the failure:
Declare @StartSummary Int,
@EndSummary Int
Select @StartSummary = 166285,
@EndSummary = 166289
Select SummaryInvoice
From Invoice
Where IsNull(SummaryInvoice, '') <> ''
And IsNumeric(SummaryInvoice) = 1
And Convert(int, SummaryInvoice) Between @StartSummary And @EndSummary
I've also attempted conversions using bigint, real and float and all give me similar errors:
Msg 8115, Level 16, State 2, Line 7
Arithmetic overflow error converting
expression to data type int.
I've tried other larger numeric datatypes such as BigInt with the same error. I've also tried using sub-queries to sidestep the conversion issue by only extracting fields that have numeric data and then converting those in the wrapper query, but then I get other errors which are all variations on a theme indicating that the value stored in the SummaryInvoice field can't be converted to the relevant data type.
Short of extracting only those records with numeric SummaryInvoice fields to a temporary table and then querying against the temporary table, is there any one-step solution that would solve this problem?
Edit: Here's the field data that I suspect is causing the problem:
SummaryInvoice
11111111111111111111111111
IsNumeric states that this field is numeric - which it is. But attempting to convert it to BigInt causes an arithmetic overflow. Any ideas? It doesn't appear to be an isolated incident, there seems to have been a number of records populated with data that causes this issue.
It seems that you are going to have problems with the ISNUMERIC function, since it returns 1 if the value can be cast to any numeric type (including ., ,, e0, etc). If you have numbers larger than 2^63-1, you can use DECIMAL or NUMERIC. I'm not sure whether PATINDEX can do a regex-style check on SummaryInvoice, but if it can, then you should try this:
SELECT SummaryInvoice
FROM Invoice
WHERE ISNULL(SummaryInvoice, '') <> ''
AND CASE WHEN PATINDEX('%[^0-9]%',SummaryInvoice) = 0 THEN CONVERT(DECIMAL(30,0), SummaryInvoice) ELSE -1 END
BETWEEN @StartSummary And @EndSummary
You can't guarantee what order the WHERE clause filters will be applied.
One ugly option to decouple inner and outer.
SELECT
*
FROM
(
Select TOP 2000000000
SummaryInvoice
From Invoice
Where IsNull(SummaryInvoice, '') <> ''
And IsNumeric(SummaryInvoice) = 1
ORDER BY SummaryInvoice
) foo
WHERE
Convert(int, SummaryInvoice) Between @StartSummary And @EndSummary
Another using CASE
Select SummaryInvoice
From Invoice
Where IsNull(SummaryInvoice, '') <> ''
And
CASE WHEN IsNumeric(SummaryInvoice) = 1 THEN Convert(int, SummaryInvoice) ELSE -1 END
Between @StartSummary And @EndSummary
YMMV
Edit: after question update
use decimal(38,0) not int
Change ISNUMERIC(SummaryInvoice) to ISNUMERIC(SummaryInvoice + '0e0')
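On SQL Server 2012 or later, TRY_CONVERT may be a simpler way to express the same idea, since it returns NULL rather than raising an error when a value cannot be converted. A minimal sketch, assuming a hypothetical @Invoice table variable standing in for the real Invoice table, with sample values taken from the question:

```sql
DECLARE @StartSummary int = 166285,
        @EndSummary   int = 166289;

-- Hypothetical stand-in for the Invoice table.
DECLARE @Invoice TABLE (SummaryInvoice varchar(50));
INSERT INTO @Invoice VALUES
    ('166286'), ('/A'), ('TEST'), ('11111111111111111111111111'), (NULL);

-- Values that cannot be converted become NULL and drop out of the BETWEEN test,
-- so the order in which the optimizer applies the predicates no longer matters.
SELECT SummaryInvoice
FROM @Invoice
WHERE TRY_CONVERT(decimal(38,0), SummaryInvoice) BETWEEN @StartSummary AND @EndSummary;
-- only the '166286' row qualifies
```

The decimal(38,0) target follows the suggestion above, so the 26-digit value converts without overflow and is simply outside the requested range.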
ANDing with IsNumeric(SummaryInvoice) = 1 will not short-circuit in SQL Server.
But maybe you can use
AND (CASE WHEN IsNumeric(SummaryInvoice) = 1 THEN Convert(int, SummaryInvoice) ELSE 0 END)
Between @StartSummary And @EndSummary
Your first issue is to fix your database structure so bad data cannot get into the field. You are putting a band-aid on a wound that needs stitches and wondering why it doesn't heal.
Database refactoring is not fun, but it needs to be done when there is a data integrity problem. I assume you aren't really invoicing someone for 11,111,111,111,111,111,111,111,111 or 'test'. So don't allow those values to ever get entered (if you can't change the structure to the correct data type, consider a trigger to prevent bad data from going in) and delete the ones you do have that are bad.