SQL And NULL Values in where clause - sql

So I have a simple query that returns a listing of products
SELECT Model, CategoryID
FROM Products
WHERE (Model = '010-00749-01')
This returns
010-00749-01 00000000-0000-0000-0000-000000000000
010-00749-01 NULL
Which is correct. I then wanted only the products whose CategoryID is not '00000000-0000-0000-0000-000000000000', so I wrote
SELECT Model, CategoryID
FROM Products
WHERE (Model = '010-00749-01')
AND (CategoryID <> '00000000-0000-0000-0000-000000000000')
But this returns no result. So I changed the query to
SELECT Model, CategoryID
FROM Products
WHERE (Model = '010-00749-01')
AND ((CategoryID <> '00000000-0000-0000-0000-000000000000') OR (CategoryID IS NULL))
Which returns expected result
010-00749-01 NULL
Can someone explain this behavior to me?
MS SQL Server 2008

Check out the full reference on Books Online. By default ANSI_NULLS is ON, meaning you need to use the approach you have taken. Otherwise, you could SET ANSI_NULLS OFF at the start of the query to switch the behaviour round.
When SET ANSI_NULLS is ON, a SELECT
statement that uses WHERE column_name
= NULL returns zero rows even if there are null values in column_name. A
SELECT statement that uses WHERE
column_name <> NULL returns zero rows
even if there are nonnull values in
column_name.
...
When SET ANSI_NULLS
is ON, all comparisons against a null
value evaluate to UNKNOWN. When SET
ANSI_NULLS is OFF, comparisons of all
data against a null value evaluate to
TRUE if the data value is NULL.
Here's a simple example to demonstrate the behaviour with regard to comparisons against NULL:
-- This will print TRUE
SET ANSI_NULLS OFF;
IF NULL <> 'A'
PRINT 'TRUE'
ELSE
PRINT 'FALSE'
-- This will print FALSE
SET ANSI_NULLS ON;
IF NULL <> 'A'
PRINT 'TRUE'
ELSE
PRINT 'FALSE'

In general, you have to remember that NULL generally means UNKNOWN. That means if you say CategoryID <> '00000000-0000-0000-0000-000000000000' you have to assume that the query will only return values that it KNOWS will meet your criteria. Since there is a NULL (UNKNOWN) result, it does not actually know if that record meets your criteria and therefore will not be returned in the dataset.
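A minimal sketch of that rule, using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made so the example runs anywhere; SQLite follows the same three-valued logic as SQL Server with ANSI_NULLS ON for these comparisons). The table and rows mirror the question; CategoryID is stored as text here for simplicity.

```python
import sqlite3

ZERO = "00000000-0000-0000-0000-000000000000"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Model TEXT, CategoryID TEXT)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("010-00749-01", ZERO), ("010-00749-01", None)],
)

def run(extra_condition):
    """Run the question's query with one extra condition appended."""
    sql = (
        "SELECT Model, CategoryID FROM Products "
        "WHERE Model = '010-00749-01' AND " + extra_condition
    )
    return conn.execute(sql, {"zero": ZERO}).fetchall()

# The predicate itself: 0 (false) for the zero GUID, None (UNKNOWN) for NULL.
predicate = conn.execute(
    "SELECT CategoryID, CategoryID <> :zero FROM Products", {"zero": ZERO}
).fetchall()

# WHERE keeps only rows whose predicate is TRUE, so neither row survives.
plain = run("CategoryID <> :zero")

# Adding an explicit IS NULL test brings the NULL row back.
with_null = run("(CategoryID <> :zero OR CategoryID IS NULL)")

print(predicate)
print(plain)
print(with_null)
```

The first print shows why the second query in the question comes back empty: the comparison for the NULL row is neither true nor false, and WHERE discards anything that is not true.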

Basically, a NULL is the absence of any value. So comparing the NULL in CategoryID to a value in the query never evaluates to true: the result is UNKNOWN, which WHERE treats the same way as false.
You might want to try using the COALESCE function, something like:
SELECT Model, CategoryID
FROM Products
WHERE (Model = '010-00749-01')
AND ( COALESCE( CategoryID, '' ) <> '00000000-0000-0000-0000-000000000000' )
EDIT
As noted by AdaTheDev, wrapping CategoryID in COALESCE can prevent the optimizer from using any index that exists on that column, which can affect the query plan and performance.

Look at this:
1=1 --true
1=0 --false
null=null --unknown (not true)
null=1 --unknown (not true)
1<>1 --false
1<>0 --true
null<>null --unknown (not true)
null<>1 --unknown (not true) <<<--why you don't get the row with: AND (CategoryID <> '00000000-0000-0000-0000-000000000000')

Null gets special treatment. You need to explicitly test for null. See http://msdn.microsoft.com/en-us/library/ms188795.aspx

You may try using the Coalesce function to set a default value for fields that have null:
SELECT Model , CategoryID
FROM Products
WHERE Model = '010-00749-01'
AND Coalesce(CategoryID,'') <> '00000000-0000-0000-0000-000000000000'
I think the problem lies in your understanding of NULL which basically means "nothing." You can't compare anything to nothing, much like you can't divide a number by 0. It's just rules of math/science.
Edit:
As Ada has pointed out, this could cause an indexed field to no longer use an index.
Solutions:
You can index the coalesced expression. In SQL Server that means adding a computed column defined with COALESCE and indexing it; some other databases accept an expression directly in CREATE INDEX.
You can add a NOT NULL constraint to prevent NULLs from ever appearing.
A de facto standard of mine is to always assign default values and never allow NULLs.

Related

SQL HASHBYTES function returns weird output when used in CASE WHEN/IIF

I have written a stored procedure that hashes the value of a certain column. I need to use this HASHBYTES function in a CASE WHEN or IIF statement, like this:
DECLARE @Hash varchar(255) = 'testvalue'
SELECT IIF(1=1, HASHBYTES('SHA1',@Hash), @Hash)
SELECT CASE WHEN 1=1 THEN HASHBYTES('SHA1',@Hash) END AS Hashcolumn
I can't get my head around why I get different outputs from above queries? it seems that whenever I add an ELSE in the CASE WHEN / IIF statement, it returns a string of weird characters (like ü<þ+OUL'RDOk{­\Ìø in above example).
Can anyone tell me why this is happening? I need to use the CASE WHEN or IIF.
Thanks guys
IIF returns the data type with the highest precedence from the types in true_value and false_value. In this case, it's @Hash, which is varchar(255), so your result is getting cast to varchar(255). See below.
DECLARE @Hash varchar(255) = 'testvalue'
SELECT cast(HASHBYTES('SHA1',@Hash) as varchar(255))
Similarly, CASE works the same way. However, if you don't add an ELSE or another WHEN that would conflict with the data type, it will work. This is because an ELSE NULL is implied. i.e.
SELECT CASE WHEN 1=1 THEN HASHBYTES('SHA1',@Hash) END
However, if you add another check, then precedence kicks in, and it will be converted.
SELECT CASE WHEN 1=1 THEN HASHBYTES('SHA1',@Hash) WHEN 1=2 THEN @Hash END AS Hashcolumn
SELECT CASE WHEN 1=1 THEN HASHBYTES('SHA1',@Hash) ELSE @Hash END AS Hashcolumn
The output of a SELECT query is a virtual table. In a relational database a column of a table is constrained to a single data type, so the server engine performs an implicit conversion in order to render a single type, and hence weird characters are returned.
As @scsimon says, the conversion follows data type precedence.
The following query should help.
DECLARE @Hash varchar(255) = 'testvalue'
SELECT IIF(1=1, CONVERT(VARCHAR(255),HASHBYTES('SHA1',@Hash),2), @Hash)
SELECT CASE WHEN 1=2 THEN CONVERT(VARCHAR(255),HASHBYTES('SHA1',@Hash),2)
ELSE @Hash END AS Hashcolumn
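To see why the implicit cast produces unreadable characters while CONVERT with style 2 produces clean text, here is a small sketch using Python's hashlib as a stand-in for HASHBYTES. Decoding the bytes as latin-1 only approximates what SQL Server does when it casts varbinary to varchar under its own code page, so treat that line as an illustration rather than an exact reproduction.

```python
import hashlib

value = "testvalue"

# SHA-1 always yields 20 raw bytes; this plays the role of the varbinary result.
raw = hashlib.sha1(value.encode("ascii")).digest()

# Reinterpreting those bytes as characters, roughly what an implicit
# varbinary -> varchar cast does: 20 mostly unprintable characters.
as_text = raw.decode("latin-1")

# Rendering each byte as two hex digits, comparable to CONVERT(..., 2):
# 40 readable characters with no 0x prefix.
hex_text = raw.hex().upper()

print(repr(as_text))
print(hex_text)
```

The hex form is lossless: the original bytes can be recovered from it, which is not reliably true of the character reinterpretation.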

Why date field is getting saved as null?

Why is @OpeningDate getting saved as NULL even though I am doing this?
PROCEDURE [dbo].[InsertCaseANDHearingDetails]
@HearingDate datetime,
@IsOpeningDate bit= null,
@OpeningDate date= null
AS
Begin
IF(@IsOpeningDate = 0)
Begin
Set @OpeningDate= (Select Convert(varchar, @HearingDate, 106))
End
Insert Into Hearings
values (@HearingDate, @OpeningDate)
End
The hearing date is not null and I am calculating the value, so why is OpeningDate getting saved as NULL?
@HearingDate != NULL will not work. The result of this comparison is always unknown. Use @HearingDate IS NOT NULL instead.
Because the variable @HearingDate is not initialized, it would have a null value.
Also, the variable @OpeningDate wouldn't be set to Select Convert(varchar, @HearingDate, 106) because the if condition evaluates to unknown.
Hence, when you select values from the table they would be null.
Edit:
@IsOpeningDate bit= null
...
IF(@IsOpeningDate = 0)
This condition evaluates to unknown too as this is doing 0 = null. You cannot compare with null.
It will work if you use the following at the start of the Proc.
SET ANSI_NULLS OFF
Basically NULL can't be compared, not even with itself, because it is not a value. If you really want the engine to treat it as one, you need to turn the ANSI_NULLS setting off. That said, I would prefer to go with @vkp's answer any day.
Based on your edit, it looks like the below condition is not met:
IF(@IsOpeningDate = 0)
and hence, it remains NULL.
In the IF condition, why don't you check for NULL instead of comparing to 0?
Your query
IF(@IsOpeningDate = 0)
My suggestion
IF(@IsOpeningDate IS NULL)
Sorry, it was my mistake not to mention the parameters in the INSERT statement; since NULLs were allowed, it inserted NULLs.
Thanks for your help as it led me to figure out that what's actually going on.

Why doesn't the Select statement assigns an empty string or null value if it doesn't return a result?

I have the following code:
declare @testValue nvarchar(50) = 'TEST';
select @testValue = 'NOTUSED' where 1 > 2;
select @testValue; -- Outputs 'TEST'
select @testValue = 'USED' where 2 > 1;
select @testValue; -- Outputs 'USED'
With the above, the first assignment is never used because the where clause fails. The second one is done properly and used is returned.
Why doesn't SQL assign NULL to @testValue after the first assignment, where the where clause fails?
This is the expected behavior:
"If the SELECT statement returns no rows, the variable retains its present value. If expression is a scalar subquery that returns no value, the variable is set to NULL."
https://msdn.microsoft.com/en-us/library/ms187330.aspx
You can get around this in your example by using a subquery in the right side.
SELECT @testValue = (SELECT 'NOTUSED' where 1 > 2);
As for why it is this way, I cannot say for certain. Perhaps the entire @testValue = 'NOTUSED' is equating to NULL instead of only the right side 'NOTUSED' portion of the statement, and this prevents the variable from being set. Not directly related, but I can say it took me some time to grow confident with writing queries when NULLs are involved. You need to be aware of and familiar with the ANSI NULL spec and associated behavior.
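The quoted documentation comes down to the difference between a query that returns no rows and a scalar subquery that returns one row holding NULL. A small sketch of that difference, again using Python's sqlite3 as a stand-in (SQLite has no T-SQL variables, so the Python variable test_value plays the role of @testValue; that mapping is an assumption of this example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A SELECT whose WHERE clause fails produces zero rows.
no_rows = conn.execute("SELECT 'NOTUSED' WHERE 1 > 2").fetchall()

# Wrapped as a scalar subquery, the same query produces one row
# containing NULL (None in Python).
scalar = conn.execute("SELECT (SELECT 'NOTUSED' WHERE 1 > 2)").fetchall()

# Emulating SELECT @testValue = ... : assign once per returned row.
test_value = "TEST"
for (value,) in no_rows:
    test_value = value  # never runs, so 'TEST' is retained

# Emulating SELECT @testValue = (subquery): there is always one row.
test_value_subquery = "TEST"
for (value,) in scalar:
    test_value_subquery = value  # runs once and assigns None

print(no_rows, test_value)
print(scalar, test_value_subquery)
```

Seen this way, the variable keeps its value simply because there is no row to drive the assignment, not because NULL was evaluated and rejected.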
This is the default behavior of SELECT.
When assigning a value to a variable using SELECT, if there is no value returned, SELECT will not make the assignment at all so the variable's value will not be changed.
On the other hand, SET will assign NULL to the variable if there is no value returned.
NULL may be the value you would like, but the SQL engine cannot guess that: someone else may want an empty string, ' ', 0 or 1 in that situation. So no single default value is assigned. It is best to set your own default value, as you can see below.
DECLARE @testValue NVARCHAR(50) = 'TEST';
SELECT @testValue = 'NOTUSED' WHERE 2 > 1; -- condition holds, so the assignment happens
IF 2 <> 1
SELECT @testValue = NULL;
SELECT @testValue; -- Outputs NULL
SELECT @testValue = 'USED' WHERE 1 > 2; -- no row, so no assignment
SELECT @testValue; -- Still outputs NULL
NULL in SQL is used to denote missing data or an unknown value. In this case the data is not missing, the value of @testValue is known, it is just failing an assignment condition, so it gets no new value.
If you were to change your initial assignment to be like this
declare @testValue nvarchar(50)
You would get NULL like below :
select @testValue = 'NOTUSED' where 1 > 2;
select @testValue; -- Outputs NULL
select @testValue = 'USED' where 2 > 1;
select @testValue; -- Outputs 'USED'
Don't be too disappointed that you're not getting NULL back in your example. NULL is not easy to handle.
For example, you cannot compare two NULL values with =, because the comparison never evaluates to true. Consequently you need special operators like IS NULL to check for it.
In general, NULL as a programming construct should be avoided, in my opinion. This is a bit of an area of contention across programming languages. But consider this: even the creator of the null reference, Tony Hoare, calls it his 'billion dollar mistake'.

NULL comparison in SQL server 2008

I know that in SQL when we compare two NULL values, result is always false. Hence, statements like
SELECT case when NULL = NULL then '1' else '0' end
will always print '0'. My question is how functions like ISNULL determine whether value is null or not. Because, as per my understanding (and explained in above query) comparison of two null values is always FALSE.
You need to set ansi_nulls off and then check your result. NULL can be thought of as an unknown value, and when you compare two unknown values the result is itself unknown, which CASE treats as not true. The comparison NULL = NULL is undefined.
set ansi_nulls off
SELECT case when NULL = NULL then '1' else '0' end
Result:-
1
From MSDN
When SET ANSI_NULLS is OFF, the Equals (=) and Not Equal To (<>)
comparison operators do not follow the ISO standard. A SELECT
statement that uses WHERE column_name = NULL returns the rows that
have null values in column_name. A SELECT statement that uses WHERE
column_name <> NULL returns the rows that have nonnull values in the
column. Also, a SELECT statement that uses WHERE column_name <>
XYZ_value returns all rows that are not XYZ_value and that are not
NULL.
As correctly pointed by Damien in comments the behavior of NULL = NULL is unknown or undefined.
Your initial assumption appears to be that ISNULL is an alias for existing functionality which can be implemented directly within SQL statements, in the same way that a SQL function can. You are then asking how that function works.
This is an incorrect starting point, hence the confusion. Instead, like similar commands such as IN and LIKE, ISNULL is parsed and run within the database engine itself; its actual implementation is most likely written in C.
If you really want to look into the details of an implementation, you could take a look at MySQL instead. It's open source, so you may be able to search through the code to see how ISNULL is implemented there. They even provide a guided tour of the code if required.
... or {2} are you literally asking how the ISNULL function in SQL
Server itself works?
Actually I am asking for the second{2}. How ISNULL function in SQL server
works. If comparison of two nulls is not defined/unknown then how
isnull function compares two null values to return appropriate
results?
Null is a special marker used in Structured Query Language (SQL) to indicate that a data value does not exist in the database. ... NULL (SQL)
ISNULL ( check_expression , replacement_value ) is not concerned with comparison of values at all. It is concerned purely with the existence of value in the first parameter.
It tests if the check_expression has any value. If it does have any value that value is returned. If check_expression has no value the ISNULL function returns the second parameter replacement_value.
It does NOT compare the two values. It tests for the existence of a value in the first parameter only.
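A short sketch of that distinction, run against SQLite through Python's sqlite3 module. ISNULL(check_expression, replacement_value) is specific to SQL Server, so the standard COALESCE function is used here as a stand-in for it; that substitution is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

row = conn.execute(
    """
    SELECT NULL = NULL,                    -- a comparison: result is UNKNOWN
           NULL IS NULL,                   -- an existence test: result is true
           COALESCE(NULL, 'fallback'),     -- no value present, replacement returned
           COALESCE('kept', 'fallback')    -- value present, returned unchanged
    """
).fetchone()

equals_result, is_null_result, replaced, kept = row
print(row)
```

Only the first column involves a comparison, and it is the only one whose result is unknown. The other three ask whether a value is present at all, which always has a definite answer.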
set ansi_nulls off
SELECT case when NULL = NULL then '1' else '0' end
result=1
set ansi_nulls on
SELECT case when NULL = NULL then '1' else '0' end
result=0
so that is the difference
I hope it works
SELECT CASE WHEN ISNULL(NULL,NULL) = NULL THEN 1 ELSE 0 END
SELECT case when 'NULL' = 'NULL' then '1' else '0' end
SELECT case when isnull(columnname,'NULL')='NULL' then '1' else '0' end
SET ANSI_NULLS OFF
SELECT case when NULL = NULL then '1' else '0' end

sql server: MERGE has unexpected results

The way these rows usually come into the target table the first time is with a sparse number of columns populated with mostly text data and the remainder of the columns set to NULL. On subsequent passes, the fresh data populates existing known (non null) and unknown (NULL) data. I've ascertained that the fresh data (#pld) do indeed contain different data, yet the data in the target does not appear to change. Here's what I have:
BEGIN TRANSACTION
BEGIN TRY
MERGE INTO [metro].listings AS metroList
USING #pld as listnew
ON metroList.id = listnew.id
AND metroList.sid = listnew.sid
WHEN MATCHED AND (
metroList.[User] != listnew.[User]
or metroList.Email != listnew.Email
or metroList.LocName != listnew.LocName
) THEN
UPDATE SET
metroList.[User] = listnew.[User],
metroList.Email = listnew.Email,
metroList.LocName = listnew.LocName
WHEN NOT MATCHED THEN
INSERT
( [User],
Email,
LocName
)
VALUES
(
listnew.[User],
listnew.Email,
listnew.LocName
);
COMMIT TRANSACTION
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK TRANSACTION;
END CATCH
I've tried replacing the != in the update portion of the statement with <>. Same results. This has to be related to a comparison of a possible (likely) NULL value against a string, maybe even another NULL? Anyway, I'm calling on all SQL geeks to untangle this.
You can also use the NULLIF() function.
NULLIF returns the first expression if the two expressions are not equal. If the expressions are equal, NULLIF returns a null value of the type of the first expression.
WHEN MATCHED AND (
NULLIF(ISNULL(metroList.[User],''), listnew.[User]) IS NOT NULL
OR NULLIF(ISNULL(metroList.Email, ''), listnew.Email) IS NOT NULL
OR NULLIF(ISNULL(metroList.LocName, ''), listnew.LocName) IS NOT NULL
)
THEN
Comparing NULL with an empty string will not work.
If either side could be NULL, you could do something like:
WHEN MATCHED AND (
COALESCE(metroList.[User], '') <> COALESCE(listnew.[User], '')
or COALESCE(metroList.Email, '') <> COALESCE(listnew.Email, '')
or COALESCE(metroList.LocName, '') <> COALESCE(listnew.LocName, '')
) THEN
Of course, this assumes that you're fine with NULL meaning the same as an empty string (which may not be appropriate).
Take a look at this BOL article on NULL comparisons.
As I understand the question you are looking for an expression that emulates IS DISTINCT FROM.
In that case the answer you have accepted is not correct:
WITH metroList([User])
AS (SELECT CAST(NULL AS VARCHAR(10))),
listnew([User])
AS (SELECT 'Foo')
SELECT *
FROM metroList
JOIN listnew
ON NULLIF(metroList.[User], listnew.[User]) IS NOT NULL
Returns zero rows, despite the values under comparison being NULL and Foo.
I would use the technique from this article: Undocumented Query Plans: Equality Comparisons
WHEN MATCHED AND EXISTS (
SELECT metroList.[User], metroList.Email,metroList.LocName
EXCEPT
SELECT listnew.[User], listnew.Email,listnew.LocName
)
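To compare the techniques from the answers side by side, here is a sketch that evaluates each "has the value changed?" test on a few value pairs. It runs on SQLite via Python's sqlite3 as a stand-in for SQL Server; both treat NULLs as equal inside EXCEPT, and the NULLIF form is the plain one from the counterexample join. The labels and sample pairs are illustrative choices, not part of the original posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each query returns 1 when the test says "different", 0 when "same",
# and None when the test result is UNKNOWN.
CHECKS = {
    "a <> b": "SELECT :a <> :b",
    "NULLIF": "SELECT NULLIF(:a, :b) IS NOT NULL",
    "COALESCE": "SELECT COALESCE(:a, '') <> COALESCE(:b, '')",
    "EXCEPT": "SELECT EXISTS (SELECT :a EXCEPT SELECT :b)",
}

PAIRS = [
    (None, "Foo"),   # NULL replaced by a value
    ("Foo", None),   # value replaced by NULL
    (None, None),    # both NULL: nothing changed
    ("", None),      # empty string versus NULL
    ("Foo", "Foo"),
    ("Foo", "Bar"),
]

def compare(a, b):
    """Evaluate every check for one (old, new) pair."""
    params = {"a": a, "b": b}
    return {name: conn.execute(sql, params).fetchone()[0]
            for name, sql in CHECKS.items()}

results = {pair: compare(*pair) for pair in PAIRS}

for pair, outcome in results.items():
    print(pair, outcome)
```

In this run the plain comparison is unknown whenever either side is NULL, NULLIF misses the change from NULL to 'Foo', and COALESCE cannot tell an empty string from NULL. Only the EXISTS ... EXCEPT form reports a difference exactly when the two values are distinct.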