sql server: MERGE has unexpected results - sql

The way these rows usually come into the target table the first time are with a sparse number of columns populated with mostly text data with the remainder of the columns set to NULL. On subsequent passes, the fresh data populates existing known (non null) and unknown (NULL) data. I've ascertained that the fresh data ( #pld) do indeed contain different data. The data does not appear to change. Here's what I have:
BEGIN TRANSACTION
BEGIN TRY
MERGE INTO [metro].listings AS metroList
USING #pld as listnew
ON metroList.id = listnew.id
AND metroList.sid = listnew.sid
WHEN MATCHED AND (
metroList.User != listnew.User
or metroList.Email != listnew.Email
or metroList.LocName != listnew.LocName
) THEN
UPDATE SET
metroList.User = listnew.User,
metroList.Email = listnew.Email,
metroList.LocName = listnew.LocName,
WHEN NOT MATCHED THEN
INSERT
( User,
Email,
LocName
)
VALUES
(
listnew.User,
listnew.Email,
listnew.LocName
);
COMMIT TRANSACTION
END TRY
IF ##TRANCOUNT > 0
ROLLBACK TRANSACTION;
END CATCH
I've tried replacing the != to under the update portion of the statement with <> . Same results. This has to be related to a comparison of a possible (likely) null value against a string--maybe even another null? Anyway, I'm calling on all sql-geeks to untangle this.

Also you can use option with NULLIF() function.
NULLIF returns the first expression if the two expressions are not equal. If the expressions are equal, NULLIF returns a null value of the type of the first expression.
WHEN MATCHED AND (
NULLIF(ISNULL(metroList.[User],''), listnew.[User]) IS NOT NULL
OR NULLIF(ISNULL(metroList.Email, ''), listnew.Email) IS NOT NULL
OR NULLIF(ISNULL(metroList.LocName, ''), listnew.LocName) IS NOT NULL
)
THEN

Comparing NULL with an empty string will not work.
If either side could be NULL, you could do something like:
WHEN MATCHED AND (
COALESCE(metroList.User, '') <> COALESCE(listnew.User, '')
or COALESCE(metroList.Email, '') <> COALESCE(listnew.Email, '')
or COALESCE(metroList.LocName, '') <> COALESCE(listnew.LocName, '')
) THEN
Of course, this assumes that you're fine with NULL meaning the same as an empty string (which may not be appropriate).
Take a look at this BOL article on NULL comparisons.

As I understand the question you are looking for an expression that emulates IS DISTINCT FROM.
The answer you have accepted is not correct then
WITH metroList([User])
AS (SELECT CAST(NULL AS VARCHAR(10))),
listnew([User])
AS (SELECT 'Foo')
SELECT *
FROM metroList
JOIN listnew
ON NULLIF(metroList.[User], listnew.[User]) IS NOT NULL
Returns zero rows. Despite the values under comparison being NULL and Foo.
I would use the technique from this article: Undocumented Query Plans: Equality Comparisons
WHEN MATCHED AND EXISTS (
SELECT metroList.[User], metroList.Email,metroList.LocName
EXCEPT
SELECT listnew.[User], listnew.Email,listnew.LocName
)

Related

NULL comparison in SQL server 2008

I know that in SQL when we compare two NULL values, result is always false. Hence, statements like
SELECT case when NULL = NULL then '1' else '0' end
will always print '0'. My question is how functions like ISNULL determine whether value is null or not. Because, as per my understanding (and explained in above query) comparison of two null values is always FALSE.
You need to set the set ansi_nulls off and then check your result. Null can be thought of as an unknown value and when you are comparing two unknown values then you will get the result as false only. The comparisons null = null is undefined.
set ansi_nulls off
SELECT case when NULL = NULL then '1' else '0' end
Result:-
1
From MSDN
When SET ANSI_NULLS is OFF, the Equals (=) and Not Equal To (<>)
comparison operators do not follow the ISO standard. A SELECT
statement that uses WHERE column_name = NULL returns the rows that
have null values in column_name. A SELECT statement that uses WHERE
column_name <> NULL returns the rows that have nonnull values in the
column. Also, a SELECT statement that uses WHERE column_name <>
XYZ_value returns all rows that are not XYZ_value and that are not
NULL.
As correctly pointed by Damien in comments the behavior of NULL = NULL is unknown or undefined.
Your initial assumption appears to be that ISNULL is an alias for existing functionality which can be implemented directly within SQL statements, in the same way that a SQL function can. You are then asking how that function works.
This is an incorrect starting point, hence the confusion. Instead, like similar commands such as IN and LIKE, ISNULL is parsed and run within the database engine itself; its actual implementation is most likely written in C.
If you really want to look into the details of the implementation, you could take a look instead at mySQL - it's open source, so you may be able to search through the code to see how ISNULL is implemented there. They even provide a guided tour of the code if required.
... or {2} are you literally asking how the ISNULL function in SQL
Server itself works?
Actually I am asking for the second{2}. How ISNULL function in SQL server
works. If comparison of two nulls is not defined/unknown then how
isnull function compares two null values to return appropriate
results?
Null is a special marker used in Structured Query Language (SQL) to indicate that a data value does not exist in the database. ... NULL (SQL)
ISNULL ( check_expression , replacement_value ) is not concerned with comparison of values at all. It is concerned purely with the existence of value in the first parameter.
It tests if the check_expression has any value. If it does have any value that value is returned. If check_expression has no value the ISNULL function returns the second parameter replacement_value.
It does NOT compare the two values. It tests forthe existence of value in the first parameter only.
set ansi_nulls off
SELECT case when NULL = NULL then '1' else '0' end
result=1
set ansi_nulls on
SELECT case when NULL = NULL then '1' else '0' end
result=0
so that is the difference
I hope it works
SELECT CASE WHEN ISNULL(NULL,NULL) = NULL THEN 1 ELSE 0 END
SELECT case when 'NULL' = 'NULL' then '1' else '0' end
SELECT case when isnull(columnname,'NULL')='NULL' then '1' else '0' end
SET ANSI_NULLS OFF
SELECT case when NULL = NULL then '1' else '0' end

Comparing 2 variables in SQL stored procedure where 1 of them could be NULL [duplicate]

I must to check if two values, X and Y are different. If both are null, they must be considered as equal.
The unique way I found is:
select 1 as valueExists
where (#X is null and #Y is not null)
or (#Y is null and #X is not null)
or (#X <> #Y)
Is there a smart way to write this expression?
Thanks!
I think you could use COALESCE for that
WHERE coalesce(#X, '') <> coalesce(#Y, '')
What it does it returns an empty string if one of variables is null, so if two variables are null the two empty strings become equal.
I typically use a technique I picked up from here
SELECT 1 AS valuesDifferent
WHERE EXISTS (SELECT #X
EXCEPT
SELECT #Y)
WHERE EXISTS returns true if the sub query it contains returns a row. This will happen in this case if the two values are distinct. null is treated as a distinct value for the purposes of this operation.
You could try using NULLIF like this:
WHERE NULLIF(#X,#Y) IS NOT NULL OR NULLIF(#Y,#X) IS NOT NULL
You can use ISNULL
WHERE ISNULL(#X,'') <> ISNULL(#Y,'')

Oracle/PL SQL/SQL null comparison on where clause

Just a question about dealing will null values in a query.
For example I have the following table with the following fields and values
TABLEX
Column1
1
2
3
4
5
---------
Column2
null
A
B
C
null
I'm passing a variableY on a specific procedure. Inside the procedure is a cursor like this
CURSOR c_results IS
SELECT * FROM TABLEX where column2 = variableY
now the problem is variableY can be either null, A, B or C
if the variableY is null i want to select all record where column2 is null, else where column2 is either A, B or C.
I cannot do the above cursor/query because if variableY is null it won't work because the comparison should be
CURSOR c_results IS
SELECT * FROM TABLEX where column2 IS NULL
What cursor/query should I use that will accomodate either null or string variable.
Sorry if my question is a bit confusing. I'm not that good in explaining things. Thanks in advance.
Either produce different SQL depending on the contents of that parameter, or alter your SQL like this:
WHERE (column2 = variableY) OR (variableY IS NULL AND column2 IS NULL)
Oracle's Ask Tom says:
where decode( col1, col2, 1, 0 ) = 0 -- finds differences
or
where decode( col1, col2, 1, 0 ) = 1 -- finds sameness - even if both NULL
Safely Comparing NULL Columns as Equal
You could use something like:
SELECT * FROM TABLEX WHERE COALESCE(column2, '') = COALESCE(variableY, '')
(COALESCE takes the first non NULL value)
Note this will only work when you the column content cannot be '' (empty string). Else this statement will fail because NULL will match '' (empty string).
(edit)
You could also consider:
SELECT * FROM TABLEX WHERE COALESCE(column2, 'a string that never occurs') = COALESCE(variableY, 'a string that never occurs')
This will fix the '' fail hypothesis.
Below is similar to "top" answer but more concise:
WHERE ((column2 = variableY ) or COALESCE( column2, variableY) IS NULL)
May not be appropriate depending on the data you're looking at, but one trick I've seen (and used) is to compare NVL(fieldname,somenonexistentvalue).
For example, if AGE is an optional column, you could use:
if nvl(table1.AGE,-1) = nvl(table2.AGE,-1)
This relies on there being a value that you know will never be allowed. Age is a good example, salary, sequence numbers, and other numerics that can't be negative. Strings may be trickier of course - you may say that you'll never have anyone named 'xyzzymaryhadalittlelamb" or something like that, but the day you run with that assumption you KNOW they'll hire someone with that name!!
All that said: "where a = b or (a is null and b is null)" is the traditional way to solve it. Which is unfortunate, as even experienced programmers forget that part of it sometimes.
Try using the ISNULL() function. you can check if the variable is null and if so, set a default return value. camparing null to null is not really possible. remember: null <> null
WHERE variableY is null or column2 = variableY
for example:
create table t_abc (
id number(19) not null,
name varchar(20)
);
insert into t_abc(id, name) values (1, 'name');
insert into t_abc(id, name) values (2, null);
commit;
select * from t_abc where null is null or name = null;
--get all records
select * from t_abc where 'name' is null or name = 'name';
--get one record with name = 'name'
You could use DUMP:
SELECT *
FROM TABLEX
WHERE DUMP(column2) = DUMP(variableY);
DBFiddle Demo
Warning: This is not SARG-able expression so there will be no index usage.
With this approach you don't need to search for value that won't exists in your data (like NVL/COALESCE).

Using a function in a where clause that includes null search

I currently have a prepared statement in Java which uses the following SQL statement in the WHERE clause of my query, but I would like to re-write this into a function to limit the user parameters passed to it and possibly make it more efficient.
(
(USER_PARAM2 IS NULL AND
( COLUMN_NAME = nvl(USER_PARAM1, COLUMN_NAME) OR
(nvl(USER_PARAM1, COLUMN_NAME) IS NULL)
)
)
OR
(USER_PARAM2 IS NOT NULL AND COLUMN_NAME IS NULL)
)
USER_PARAM1 and USER_PARAM2 are passed into the prepared statement by the user.
USER_PARAM1 represents what the application user wants to search this particular COLUMN_NAME for. If the user does not include this parameter, it will default to NULL.
USER_PARAM2 was my way to allow a user to request a NULL value only search on this COLUMN_NAME. Additionally I have some server logic that sets USER_PARAM2 to 'true' if passed in by the user or NULL if it wasn't specified by the user.
The intended behavior is that if USER_PARAM2 was declared then only COLUMN_NAME values of NULL are returned. If USER_PARAM2 wasn't declared and USER_PARAM1 was declared then only COLUMN_NAME = USER_PARAM1 are returned. If neither user params are declared then all rows are returned.
Could anyone help me out on this?
Thanks in advance...
EDIT:
Just to clarify this is how my current query looks (without the other WHERE clause statements..)
SELECT *
FROM TABLE_NAME
WHERE (
(USER_PARAM2 IS NULL AND
( COLUMN_NAME = nvl(USER_PARAM1, COLUMN_NAME) OR
(nvl(USER_PARAM1, COLUMN_NAME) IS NULL)
)
)
OR
(USER_PARAM2 IS NOT NULL AND COLUMN_NAME IS NULL)
)
... and this is where I would like to get to...
SELECT *
FROM TABLE_NAME
WHERE customSearchFunction(USER_PARAM1, USER_PARAM2, COLUMN_NAME)
EDIT #2:
OK, so another co-worker helped me out with this...
CREATE OR REPLACE function searchNumber (pVal IN NUMBER, onlySearchForNull IN CHAR, column_value IN NUMBER)
RETURN NUMBER
IS
BEGIN
IF onlySearchForNull IS NULL THEN
IF pVal IS NULL THEN
RETURN 1;
ELSE
IF pVal = column_value THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END IF;
ELSE
IF column_value IS NULL THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END IF;
END;
... this seems to work in my initial trials..
SELECT *
FROM TABLE_NAME
WHERE 1=searchNumber(USER_PARAM1, USER_PARAM2, COLUMN_NAME);
... the only issues I have with it would be
1)possible performance concerns vs the complex SQL statement I started with.
2)that I would have to create similar functions for each data type.
However, the latter would be less of an issue for me.
EDIT #3 2012.02.01
So we ended up going with the solution I chose below, while using the function based approach where code/query cleanliness trumps performance. We found that the function based approach performed roughly 6x worse than using pure SQL.
Thanks everyone for the great input everyone!
EDIT #4 2012.02.14
So looking back I noticed that applying the virtual table concept in #Alan's solution with the clarity of #danihp's solution gives a very nice overall solution in terms of clarity and performance. Here's what I now have
WITH params AS (SELECT user_param1 AS param, user_param2 AS param_nullsOnly FROM DUAL)
SELECT *
FROM table_name, params p
WHERE ( nvl(p.param_nullsOnly, p.param) IS NULL --1)
OR p.param_nullsOnly IS NOT NULL AND column_name IS NULL --2)
OR p.param IS NOT NULL AND column_name = p.param --3)
)
-- 1) Test if all rows should be returned
-- 2) Test if only NULL values should be returned
-- 3) Test if param equals the column value
Thanks again for the suggestions and comments!
There's a simple way of to pass your parameters only once and refer to them as many times as needed, using common-table expressions:
WITH params AS (SELECT user_param1 AS up1, user_param2 AS up2 FROM DUAL)
SELECT *
FROM table_name, params p
WHERE ((p.up2 IS NULL
AND (column_name = NVL(p.up1, column_name)
OR (NVL(p.up1, column_name) IS NULL)))
OR (p.up2 IS NOT NULL AND column_name IS NULL))
In effect, you're creating a virtual table, where the columns are your parameters, that is populated with a single row.
Conveniently, this also ensures that all of your parameters are collected in the same place and can be specified in an arbitrary order (as opposed to the order that the naturally appear in the query).
There are a couple big advantages to this over a function-based approach. First, this will not prevent the use of indexes (as pointed out by #Bob Jarvis). Second, this keeps the query's logic in the query, rather than hidden in functions.
I don't know if my approach has more performance, but it has best readability:
Sending 2 additionals parameters to query you can rewrite query like:
where
( P_ALL_RESULTS is not null
OR
P_ONLY_NULLS is not null AND COLUMN_NAME IS NULL
OR
P_USE_P1 is not null AND COLUMN_NAME = USER_PARAM1
)
Disclaimer: answered before OP question clarification

SQL And NULL Values in where clause

So I have a simple query that returns a listing of products
SELECT Model, CategoryID
FROM Products
WHERE (Model = '010-00749-01')
This returns
010-00749-01 00000000-0000-0000-0000-000000000000
010-00749-01 NULL
Which is correct, so I wanted only the products whose CategoryID is not '00000000-0000-0000-0000-000000000000' so I have
SELECT Model, CategoryID
FROM Products
WHERE (Model = '010-00749-01')
AND (CategoryID <> '00000000-0000-0000-0000-000000000000')
But this returns no result. So I changed the query to
SELECT Model, CategoryID
FROM Products
WHERE (Model = '010-00749-01')
AND ((CategoryID <> '00000000-0000-0000-0000-000000000000') OR (CategoryID IS NULL))
Which returns expected result
010-00749-01 NULL
Can someone explain this behavior to me?
MS SQL Server 2008
Check out the full reference on Books Online - by default ANSI_NULLS is on meaning you'd need to use the approach you have done. Otherwise, you could switch that setting OFF at the start of the query to switch the behaviour round.
When SET ANSI_NULLS is ON, a SELECT
statement that uses WHERE column_name
= NULL returns zero rows even if there are null values in column_name. A
SELECT statement that uses WHERE
column_name <> NULL returns zero rows
even if there are nonnull values in
column_name.
...
When SET ANSI_NULLS
is ON, all comparisons against a null
value evaluate to UNKNOWN. When SET
ANSI_NULLS is OFF, comparisons of all
data against a null value evaluate to
TRUE if the data value is NULL.
Here's a simple example to demonstrate the behaviour with regard to comparisons against NULL:
-- This will print TRUE
SET ANSI_NULLS OFF;
IF NULL <> 'A'
PRINT 'TRUE'
ELSE
PRINT 'FALSE'
-- This will print FALSE
SET ANSI_NULLS ON;
IF NULL <> 'A'
PRINT 'TRUE'
ELSE
PRINT 'FALSE'
In general, you have to remember that NULL generally means UNKNOWN. That means if you say CategoryID <> '00000000-0000-0000-0000-000000000000' you have to assume that the query will only return values that it KNOWS will meet your criteria. Since there is a NULL (UNKNOWN) result, it does not actually know if that record meets your criteria and therefore will not be returned in the dataset.
Basically, a NULL is the absence of any value. So trying to compare the NULL in CategoryId to a varchar value in the query will always result in a false evaluation.
You might want to try using the COALESCE function, something like:
SELECT ModelId, CategoryID
FROM Products
WHERE (ModelId = '010-00749-01')
AND ( COALESCE( CategoryID, '' ) <> '00000000-0000-0000-0000-000000000000' )
EDIT
As noted by AdaTheDev the COALESCE function will negate any indices that may exist on the CategoryID column, which can affect the query plan and performance.
look at this:
1=1 --true
1=0 --false
null=null --false
null=1 --false
1<>1 --false
1<>0 --true
null<>null --false
null<>1 --false <<<--why you don't get the row with: AND (CategoryID <> '00000000-0000-0000-0000-000000000000')
Null gets special treatment. You need to explicitly test for null. See http://msdn.microsoft.com/en-us/library/ms188795.aspx
You may try using the Coalesce function to set a default value for fields that have null:
SELECT Model , CategoryID
FROM Products
WHERE Model = '010-00749-01'
AND Coalesce(CategoryID,'') <> '00000000-0000-0000-0000-000000000000'
I think the problem lies in your understanding of NULL which basically means "nothing." You can't compare anything to nothing, much like you can't divide a number by 0. It's just rules of math/science.
Edit:
As Ada has pointed out, this could cause an indexed field to no longer use an index.
Solution:
You can create an index using the coalesce function: eg create index ... coalesce(field)
You can add a not null constraint to prevent NULLs from ever appearing
A de facto standard of mine is to always assign default values and never allow nulls