NULLs and SET ANSI_NULLS OFF - sql

I have a bit of SQL that queries a table that has one column that can take NULL.
In the code below the #term_type_id could be NULL, and the column term_type_id could have NULL values.
So the code below works if #term_type_id has a value, but does not work if #term_type_id is NULL.
Setting SET ANSI_NULLS OFF is a way around the problem, but I do know that this will be deprecated at some stage.
SELECT [date],value FROM history
WHERE id=#id
AND data_type_id=#data_type_id
AND quote_type_id=#quote_type_id
AND update_type_id=#update_type_id
AND term_type_id=#term_type_id
AND source_id=#source_id
AND ([date]>=#temp_from_date and [date]<=#temp_to_date)
ORDER BY [date]
What I have done in the past is to have something like this
if #term_type_id is NULL
BEGIN
SELECT ......
WHERE .....
AND term_type_id IS NULL
END
ELSE
BEGIN
SELECT ......
WHERE .....
AND term_type_id = #term_type_id
END
While this works, it is very verbose and makes the code hard to read and maintain.
Does anyone have a better solution than using SET ANSI_NULLS OFF or having to write conditional code just to manage the case when something could be a value or NULL?
BTW - When I use SET ANSI_NULLS OFF, I only do it for the specific query then turn it back on afterwards. I do understand the reasons why this is frowned upon, but it is at the expense of writing pointless code to get around a 'pure' view of NULL.
Ben

Since both the column and the parameter can be null, you should treat both cases:
SELECT [date],value FROM history
WHERE id=#id
AND data_type_id=#data_type_id
AND quote_type_id=#quote_type_id
AND update_type_id=#update_type_id
AND ((term_type_id IS NULL AND #term_type_id IS NULL) OR term_type_id = #term_type_id)
AND source_id=#source_id
AND ([date]>=#temp_from_date and [date]<=#temp_to_date)
ORDER BY [date]
Note that a row is returned only when the column and the parameter are both null, or when both are non-null and equal.
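The behaviour of the predicate in the query above can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable; the NULL comparison rules involved are standard SQL), and the three-row history table and the matching_ids helper are hypothetical.

```python
import sqlite3

# Hypothetical miniature of the history table; only the nullable column matters here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER, term_type_id INTEGER)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [(1, 10), (2, 20), (3, None)],
)

# Same shape as the predicate in the answer: NULL matches NULL, values match values.
NULL_SAFE = """
    SELECT id FROM history
    WHERE (term_type_id IS NULL AND :term IS NULL)
       OR term_type_id = :term
    ORDER BY id
"""

def matching_ids(term):
    """Return the ids whose term_type_id matches term, treating NULL as equal to NULL."""
    return [row[0] for row in conn.execute(NULL_SAFE, {"term": term})]

print(matching_ids(10))    # parameter has a value: only the row holding 10
print(matching_ids(None))  # parameter is NULL: only the row whose column is NULL
```

A non-null parameter never picks up the NULL row, and a NULL parameter never picks up a row holding a value.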

Ben, a better solution to if #term_type_id is NULL SELECT #term_type_id=-1
would be to use isnull(#term_type_id,-1)

Ahh - yes - thank you for the prompt responses. I think I have just come up with a better solution (well in this case)...
if #term_type_id is NULL SELECT #term_type_id=-1
SELECT [date],value FROM history
WHERE id=#id
AND data_type_id=#data_type_id
AND quote_type_id=#quote_type_id
AND update_type_id=#update_type_id
AND ISNULL(term_type_id,-1)=#term_type_id
AND source_id=#source_id
AND ([date]>=#temp_from_date and [date]<=#temp_to_date)
ORDER BY [date]
This works in this case as term_type_id is the result of an identity (1,1) and thus can not be -1.

Try this; it works in both cases.
SELECT ......
WHERE .....
AND ISNULL(term_type_id,-1) = ISNULL(#term_type_id,-1)
You can use any static value instead of -1, as long as it can never occur in the column.
Or you can use something like below
SELECT ......
WHERE .....
AND ( (#term_type_id IS NULL AND term_type_id IS NULL)
OR term_type_id = #term_type_id
)
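To see why the choice of sentinel matters, here is a rough comparison of the two forms above. It is a sketch run against SQLite through Python's sqlite3 module rather than SQL Server, and it uses COALESCE in place of ISNULL because SQLite has no two-argument ISNULL; the table contents and the ids helper are made up, and the row holding -1 is there on purpose to force a collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER, term_type_id INTEGER)")
# Row 3 deliberately stores the sentinel value -1 as real data.
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [(1, 10), (2, None), (3, -1)],
)

# Sentinel form: both sides are mapped to -1 when NULL.
SENTINEL = """
    SELECT id FROM history
    WHERE COALESCE(term_type_id, -1) = COALESCE(:term, -1)
    ORDER BY id
"""

# Explicit form: NULL is handled with IS NULL, no sentinel involved.
EXPLICIT = """
    SELECT id FROM history
    WHERE (term_type_id IS NULL AND :term IS NULL)
       OR term_type_id = :term
    ORDER BY id
"""

def ids(sql, term):
    """Run one of the two queries with the given parameter and return the ids."""
    return [row[0] for row in conn.execute(sql, {"term": term})]

sentinel_for_null = ids(SENTINEL, None)  # also picks up the row that really holds -1
explicit_for_null = ids(EXPLICIT, None)  # only the row whose column is NULL
print(sentinel_for_null, explicit_for_null)
```

When the column is an identity that can never be -1, as in the question, the two forms return the same rows; the explicit form simply does not depend on that guarantee.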

If it's the only nullable column among your search criteria, the best way would be to split conditions within a single UNION statement:
select date, value from dbo.History
where #term_type_id is null and term_type_id is null
-- Remaining search criteria
and ...
union all
select date, value from dbo.History
where term_type_id = #term_type_id
-- Remaining search criteria
and ...
This is the fastest code possible in your case, basically because SQL Server doesn't have a particular knack for OR-ed conditions. However, another nullable column will turn this into a rather unpleasant mess.
If you think you can sacrifice performance, there is a useful function in T-SQL that does exactly that - NULLIF():
select date, value from dbo.History
where nullif(#term_type_id, term_type_id) is null
-- Remaining search criteria
and ...
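However, this type of condition will most likely be non-SARGable. Also note that the order of arguments matters in NULLIF(): it returns NULL when the two arguments are equal and the first argument otherwise, so with the order shown a NULL parameter matches every row rather than only the rows where the column is NULL. Alternatively, you can devise CASE constructs of various complexity that might be semantically more suitable to your exact requirements.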
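The effect of the argument order in NULLIF() is easy to get wrong, so here is a small sketch that runs both orders. It assumes SQLite (via Python's sqlite3) as a stand-in for SQL Server; NULLIF behaves the same way in both for this purpose. The table contents and the ids helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER, term_type_id INTEGER)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [(1, 10), (2, 20), (3, None)],
)

# Parameter first, as in the answer.
PARAM_FIRST = """
    SELECT id FROM history
    WHERE NULLIF(:term, term_type_id) IS NULL
    ORDER BY id
"""

# Column first, the swapped order.
COLUMN_FIRST = """
    SELECT id FROM history
    WHERE NULLIF(term_type_id, :term) IS NULL
    ORDER BY id
"""

def ids(sql, term):
    """Run one of the two queries with the given parameter and return the ids."""
    return [row[0] for row in conn.execute(sql, {"term": term})]

for term in (10, None):
    print(term, ids(PARAM_FIRST, term), ids(COLUMN_FIRST, term))
```

With the parameter first, a NULL parameter behaves like "no filter" and returns every row. With the column first, a NULL parameter returns only the NULL row, but a non-null parameter then also returns the NULL row. Neither order reproduces "NULL matches only NULL, value matches only value" on its own.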

Related

Passing in parameter to where clause using IS NULL or Coalesce

I would like to pass in a parameter #CompanyID into a where clause to filter results. But sometimes this value may be null so I want all records to be returned. I have found two ways of doing this, but am not sure which one is the safest.
Version 1
SELECT ProductName, CompanyID
FROM Products
WHERE (#CompanyID IS NULL OR CompanyID = #CompanyID)
Version 2
SELECT ProductName, CompanyID
FROM Products
WHERE CompanyID = COALESCE(#CompanyID, CompanyID)
I have found that the first version is the quickest, but I have also found in other tables using a similar method that I get different result sets back. I don't quite understand the difference between the two.
Can anyone please explain?
Well, both queries are handling the same two scenarios -
In one scenario #CompanyID contains a value,
and in the second #CompanyID contains NULL.
For both queries, the first scenario will return the same result set - since
if #CompanyId contains a value, both will return all rows where companyId = #CompanyId, however the first query might return it faster (more on that at the end of my answer).
The second scenario, however, is where the queries start to behave differently.
First, this is why you get different result sets:
Difference in result sets
Version 1
WHERE (#CompanyID IS NULL OR CompanyID = #CompanyID)
When #CompanyID is null, the where clause will not filter out any rows whatsoever, and all the records in the table will be returned.
Version 2
WHERE CompanyID = COALESCE(#CompanyID, CompanyID)
When #CompanyID is null, the where clause will filter out all the rows where CompanyID is null, since the result of null = null is actually unknown, and a row is returned only when the where clause evaluates to true for it, unless ANSI_NULLS is set to OFF (which you really should not do since it's deprecated).
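The difference between the two versions can be reproduced with a tiny table that contains one row with a NULL CompanyID. This is only a sketch: it runs on SQLite through Python's sqlite3 module instead of SQL Server, and the product rows and the names helper are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, CompanyID INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Anvil", 1), ("Bolt", 2), ("Crate", None)],
)

# Version 1 from the question.
VERSION_1 = """
    SELECT ProductName FROM Products
    WHERE (:company IS NULL OR CompanyID = :company)
    ORDER BY ProductName
"""

# Version 2 from the question.
VERSION_2 = """
    SELECT ProductName FROM Products
    WHERE CompanyID = COALESCE(:company, CompanyID)
    ORDER BY ProductName
"""

def names(sql, company):
    """Return the product names selected by the given query and parameter."""
    return [row[0] for row in conn.execute(sql, {"company": company})]

print(names(VERSION_1, None))  # every row, including the one with a NULL CompanyID
print(names(VERSION_2, None))  # the NULL row is dropped: NULL = NULL is unknown
```

With a non-null parameter both versions return the same rows, which matches the explanation above: the result sets only diverge when the parameter is NULL and the column contains NULLs.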
Index usage
You might get faster results from the first version, since wrapping a column in a function in the where clause generally prevents SQL Server from seeking on an index that you might have on this column.
You can read more about it in this article on MSSQLTips.
Conclusion
Version 1 is better than version 2.
Even if you do not want to return records where companyId is null it's still better to write as WHERE (#CompanyID IS NULL OR CompanyID = #CompanyID) AND CompanyID IS NOT NULL than to use the second version.
It's worth noting that using the syntax ([Column] = #Value OR [Column] IS NULL) is a much better idea than using ISNULL([Column],#Value) = #Value (or using COALESCE).
This is because using the function causes the query to become un-SARGable; so indexes won't be used. The first expression is SARGable, and thus, will perform better.
I'm just adding this because the OP states "I have found that the first version is the quickest", and I wanted to elaborate on why (even though the statement is currently incomplete; I am guessing this was more due to user error and ignorance).
The second version is not correct SQL (for SQL Server). It needs an operator. Presumably:
SELECT ProductName, CompanyID
FROM Products
WHERE COALESCE(#CompanyID, CompanyID) = CompanyID;
The first version is correct as written. If you have an index on CompanyID, you might find this faster:
SELECT *
FROM Products
WHERE CompanyID = #CompanyID
UNION ALL
SELECT *
FROM Products
WHERE #CompanyID IS NULL;

Why Does One SQL Query Work and the Other Does Not?

Please disregard the obvious problems with the manipulation of data in the where clause. I know! I'm working on it. While working on it, though, I discovered that this query runs:
SELECT *
FROM PatientDistribution
WHERE InvoiceNumber LIKE'PEX%'
AND ISNUMERIC(CheckNumber) = 1
AND CONVERT(BIGINT,CheckNumber) <> TransactionId
And this one does not:
SELECT *
FROM PatientDistribution
WHERE InvoiceNumber LIKE'PEX%'
AND CONVERT(BIGINT,CheckNumber) <> TransactionId
AND ISNUMERIC(CheckNumber) = 1
The only difference between the two queries is the order of items in the WHERE clause. I was under the impression that the SQL Server query optimizer would save me from having to worry about that.
The error returned is: Error converting data type varchar to bigint.
You are right, the order of the conditions shouldn't matter.
If AND ISNUMERIC(CheckNumber) = 1 is checked first and non-matching rows thus dismissed, then AND CONVERT(BIGINT,CheckNumber) <> TransactionId will work (for exceptions see scsimon's answer).
If AND CONVERT(BIGINT,CheckNumber) <> TransactionId is processed before AND ISNUMERIC(CheckNumber) = 1 then you may get an error.
That your first query worked and the second not was a matter of luck. It could just as well have been vice versa.
You can try to force one condition to be evaluated before the other with a derived table, although the optimizer is still free to merge the two predicates:
SELECT *
FROM
(
SELECT *
FROM PatientDistribution
WHERE InvoiceNumber LIKE 'PEX%'
AND ISNUMERIC(CheckNumber) = 1
) num_only
WHERE CONVERT(BIGINT,CheckNumber) <> TransactionId;
You just got lucky that the first one worked, since you are correct that the order of the conditions in the where clause does not matter. SQL is a declarative language, meaning that you tell the engine what should happen, not how. So I suspect your two queries were not executed with the same query plan. Granted, you can affect what the optimizer does to a certain extent. You'll also notice this type of issue when using a CTE. For example:
declare #table table(columnName varchar(64))
insert into #table
values
('1')
,('1e4')
;with cte as(
select columnName
from #table
where isnumeric(columnName) = 1)
select
cast(columnName as decimal(32,16))
from cte
Looking at the snippet above, you would assume that the outer select runs only on the subset returned by the CTE. However, you can't ensure this will happen, and you could still get a conversion error from the cast.
More importantly, you should know that ISNUMERIC() is widely misused. People often think that if it returns 1 then the value can be converted to a decimal or int, but this isn't the case. It only checks that the string is valid for at least one numeric type, which includes money and float. For example:
select
isnumeric('1e4')
,isnumeric('$')
,isnumeric('1,123,456')
As you can see, these all return 1, but they would fail the conversion you put in your post.
Side note: your indexes are likely the reason why the first query didn't actually error out.

Ignore other results if a resultset has been found

To start, take this snippet as an example:
SELECT *
FROM StatsVehicle
WHERE ((ReferenceMakeId = #referenceMakeId)
OR #referenceMakeId IS NULL)
This will fetch and filter the records if the variable #referenceMakeId is not null, and if it is null, will fetch all the records. In other words, it is taking the first one into consideration if #referenceMakeId is not null.
I would like to add a further restriction to this, how can I achieve this?
For instance
(ReferenceModelId = #referenceModeleId) OR
(
(ReferenceMakeId = #referenceMakeId) OR
(#referenceMakeId IS NULL)
)
If #referenceModelId is not null, it will only need to filter by ReferenceModelId, and ignore the other statements inside it. If I actually do this as such, it returns all the records. Is there anything that can be done to achieve such a thing?
Maybe something like this?
SELECT * FROM StatsVehicle WHERE
(
-- Removed the following, as it's not clear if this is beneficial
-- (#referenceModeleId IS NOT NULL) AND
(ReferenceModelId = #referenceModeleId)
) OR
(#referenceModeleId IS NULL AND
(
(ReferenceMakeId = #referenceMakeId) OR
(#referenceMakeId IS NULL)
)
)
This should do the trick.
SELECT * FROM StatsVehicle
WHERE ReferenceModelId = #referenceModeleId OR
(
#referenceModeleId IS NULL AND
(
#referenceMakeId IS NULL OR
ReferenceMakeId = #referenceMakeId
)
)
However, you should note that these types of queries (known as catch-all queries) tend to be less efficient than writing a single query for every case.
This is because SQL Server caches the first query plan, which might not be optimal for other parameter values.
You might want to consider using the OPTION (RECOMPILE) query hint, or breaking the stored procedure down into pieces that each handle a specific condition (i.e. one select for null variables, one select for non-null).
For more information, read this article.
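The precedence rule in the query above (a model id, when supplied, wins over the make id) can be compared with the question's original attempt in a short sketch. It assumes SQLite via Python's sqlite3 as a stand-in for SQL Server, and the three vehicle rows and the ids helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StatsVehicle (Id INTEGER, ReferenceMakeId INTEGER, ReferenceModelId INTEGER)"
)
conn.executemany(
    "INSERT INTO StatsVehicle VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 11), (3, 2, 20)],
)

# The question's attempt: the make branch is not guarded by "model IS NULL".
NAIVE = """
    SELECT Id FROM StatsVehicle
    WHERE ReferenceModelId = :model
       OR (ReferenceMakeId = :make OR :make IS NULL)
    ORDER BY Id
"""

# The answer's version: the make branch only applies when no model was supplied.
GUARDED = """
    SELECT Id FROM StatsVehicle
    WHERE ReferenceModelId = :model
       OR (:model IS NULL AND (:make IS NULL OR ReferenceMakeId = :make))
    ORDER BY Id
"""

def ids(sql, model, make):
    """Return the vehicle ids selected for the given model/make parameters."""
    return [row[0] for row in conn.execute(sql, {"model": model, "make": make})]

print(ids(NAIVE, 10, None))    # every row: the NULL make switches the filter off
print(ids(GUARDED, 10, None))  # only the vehicle with model 10
print(ids(GUARDED, None, 1))   # no model given, so the make filter applies
```

This only checks the logic of the predicate; it says nothing about plan caching, which is specific to SQL Server.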
If #referenceModelId is not null, it will only need to filter by
ReferenceModelId, and ignore the other statements inside it. If I
actually do this as such, it returns all the records. Is there
anything that can be done to achieve such a thing?
You can also consider using a CASE expression as a short-circuit mechanism:
WHERE
CASE
WHEN #referenceModeleId is not null AND ReferenceModelId = #referenceModeleId THEN 1
WHEN #referenceModeleId is null AND #referenceMakeId is not null AND ReferenceMakeId = #referenceMakeId THEN 1
WHEN #referenceModeleId is null AND #referenceMakeId is null THEN 1
ELSE 0
END = 1

mismatch not picked up when one value is null

I have a simple SQL query where a comparison is done between two tables for mismatching value.
Yesterday, we picked up an issue where one field was null and the other wasn't, but a mismatch was not detected.
As far as I can determine, the logic has been working all along until yesterday.
Here is the logic of the SQL:
CREATE TABLE Table1
(UserID INT,PlayDate DATETIME)
CREATE TABLE Table2
(UserID INT,PlayDate DATETIME)
INSERT INTO Table1 (UserID)
SELECT 5346
INSERT INTO Table2 (UserID,PlayDate)
SELECT 5346,'2012-11-01'
SELECT a.UserID
FROM Table1 a
INNER JOIN
Table2 b
ON a.UserID = b.UserID
WHERE a.PlayDate <> b.PlayDate
No values are returned even though the PlayDate values are different.
I have now updated the WHERE to read:
WHERE ISNULL(a.PlayDate,'') <> ISNULL(b.PlayDate,'')
Is there a setting in SQL which someone could have changed to cause the original code to no longer pick up the difference in fields?
Thanks
NULL <> anything evaluates to unknown, not true. SQL uses three-valued logic (false/true/unknown), and the predicate needs to evaluate to true in a where clause for the row to be returned.
In fact in standard SQL any comparison with NULL except for IS [NOT] NULL yields unknown. Including WHERE NULL = NULL
You don't state the RDBMS, but if it supports IS DISTINCT FROM you could use that; if you are using MySQL, it has a null-safe equality operator <=> that you could negate.
You say you think it previously behaved differently. If you are on SQL Server you might be using a different setting for ANSI_NULLS somehow but this setting is deprecated and you should rewrite any code that depends on it anyway.
You can simulate IS DISTINCT FROM in SQL Server with WHERE EXISTS (SELECT a.PlayDate EXCEPT SELECT b.PlayDate)
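The missed mismatch can be reproduced with the question's own tables. The sketch below runs on SQLite through Python's sqlite3 module, which is an assumption made for runnability, and instead of IS DISTINCT FROM or the EXCEPT trick it spells the null-safe comparison out with plain IS NULL tests, which should behave the same on any engine. The two extra user rows and the user_ids helper are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (UserID INTEGER, PlayDate TEXT)")
conn.execute("CREATE TABLE Table2 (UserID INTEGER, PlayDate TEXT)")
# 5346 is the case from the question; users 1 and 2 are extra control rows.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(5346, None), (1, "2012-11-01"), (2, None)],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?)",
    [(5346, "2012-11-01"), (1, "2012-11-01"), (2, None)],
)

JOIN = "SELECT a.UserID FROM Table1 a INNER JOIN Table2 b ON a.UserID = b.UserID WHERE "

# Original predicate: unknown whenever either side is NULL.
PLAIN = JOIN + "a.PlayDate <> b.PlayDate ORDER BY a.UserID"

# Null-safe predicate written out by hand: different values, or exactly one side NULL.
NULL_SAFE = JOIN + """
       a.PlayDate <> b.PlayDate
    OR (a.PlayDate IS NULL AND b.PlayDate IS NOT NULL)
    OR (a.PlayDate IS NOT NULL AND b.PlayDate IS NULL)
    ORDER BY a.UserID
"""

def user_ids(sql):
    """Return the UserIDs reported as mismatching by the given query."""
    return [row[0] for row in conn.execute(sql)]

print(user_ids(PLAIN))      # the NULL-vs-date mismatch is not reported
print(user_ids(NULL_SAFE))  # reported; the NULL-vs-NULL pair is not
```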
Not even a NULL can be equal to NULL.
Here are two common queries that just don’t work:
select * from table where column = null;
select * from table where column <> null;
There is no concept of equality or inequality, greater than or less
than with NULLs. Instead, one can only say “is” or “is not”
(without the word “equal”) when discussing NULLs.
The correct way to write these queries is:
select * from table where column IS NULL;
select * from table where column IS NOT NULL;
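A quick way to convince yourself of this is to count the rows each form returns. The sketch below uses SQLite through Python's sqlite3 module as an assumed stand-in for whichever engine you are on; the table t, its two rows, and the count helper are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col INTEGER)")
# One row with a value, one row with NULL.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,)])

def count(predicate):
    """Count the rows of t for which the given where-clause predicate is true."""
    return conn.execute("SELECT COUNT(*) FROM t WHERE " + predicate).fetchone()[0]

# The comparison itself yields NULL (unknown), which Python sees as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(null_equals_null)
print(count("col = NULL"), count("col <> NULL"))        # neither matches any row
print(count("col IS NULL"), count("col IS NOT NULL"))   # each matches one row
```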

SQL procedure where clause to list records

I have a stored procedure that will query and return records based on whether items are available or not. That part is easy; I'm simply passing in a variable to check whether available is yes or no.
But what I want to know is how do I add an "everything" option in there (i.e. available, not available, or everything)?
The where clause right now is
where availability = #availability
The values of availability are either 1 or 0, nothing else.
You can use NULL to represent everything.
WHERE (#availability IS NULL OR availability = #availability)
SELECT *
FROM mytable
WHERE #availability IS NULL
UNION ALL
SELECT *
FROM mytable
WHERE availability = #availability
Passing a NULL value will select everything.
This query will use an index on availability if any.
Don't know the type of #availability, but assuming -1 = everything then you could simply do a
where #availability = -1 OR availability = #availability
There are multiple ways of doing it. The simplest way is to set the default value of #availability to null, and then your where clause would look like this:
WHERE (#availability IS NULL OR availability = #availability)
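From the calling side, the "NULL means everything" convention maps naturally onto an optional argument. The following is a minimal sketch under stated assumptions: SQLite via Python's sqlite3 stands in for the real database, a plain function stands in for the stored procedure, and the items table and the list_items name are hypothetical.

```python
import sqlite3
from typing import List, Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, availability INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("a", 1), ("b", 0), ("c", 1)],
)

def list_items(availability: Optional[int] = None) -> List[str]:
    """Return item names; leaving availability as None (NULL) disables the filter."""
    sql = """
        SELECT name FROM items
        WHERE (:availability IS NULL OR availability = :availability)
        ORDER BY name
    """
    return [row[0] for row in conn.execute(sql, {"availability": availability})]

print(list_items())   # no argument: everything
print(list_items(1))  # available only
print(list_items(0))  # not available only
```

Defaulting the parameter to None mirrors giving the stored procedure parameter a default of NULL, so existing callers that pass 1 or 0 keep working unchanged.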