Simplest SQL expression to check if two columns have same value accounting for NULL [duplicate]

Simplest SQL expression to check if two columns have same value accounting for NULL [duplicate] - sql

This question already has answers here:
Is there better Oracle operator to do null-safe equality check?
(3 answers)
Closed 7 years ago.
I am trying to figure out the simplest generalized SQL expression that can check if two columns a and b are the same. In other words, an expression that evaluates to true when:
a is NULL and b is NULL; or
a is not NULL and b is not NULL and a = b
Assume columns a and b have exactly the same data type.
The most obvious solution, which I'm using in the below example, is horribly convoluted, particularly because I need to repeat this clause 15x in a 15-column table:
SELECT * FROM (
SELECT 'x' a, 'x' b FROM dual
UNION ALL
SELECT 'x' a, NULL b FROM dual
UNION ALL
SELECT NULL a, 'x' b FROM dual
UNION ALL
SELECT NULL a, NULL b FROM dual
UNION ALL
SELECT 'x' a, 'y' b FROM dual
UNION ALL
SELECT 'x' a, NULL b FROM dual
UNION ALL
SELECT NULL a, 'y' b FROM dual
UNION ALL
SELECT NULL a, NULL b FROM dual
)
WHERE (a IS NULL AND b IS NULL) OR
(a IS NOT NULL AND b IS NOT NULL AND a = b)
/
And the expected result is:
+--------+--------+
| a | b |
+--------+--------+
| x | x |
| (null) | (null) |
| (null) | (null) |
+--------+--------+
tl;dr - Can I simplify my WHERE clause, ie make it more compact, while keeping it logically correct?
P.S.: I couldn't give a damn about any SQL purist insistence that "NULL is not a value". For my practical purposes, if a contains NULL and b does not, then a differs from b. It is not "unknown" whether they differ. So please, in advance, no arguments up that alley!
P.P.S.: My SQL flavour is Oracle 11g.
P.P.P.S.: Someone decided this question is a duplicate of "Is there better Oracle operator to do null-safe equality check?" but a cursory check in that question will show that the answers are less helpful than the ones posted on this thread and do not satisfy my particular, and explicitly-stated criteria. Just because they are similar doesn't make them duplicates. I've never understood why people on SO work so hard to force my problem X to be someone else's problem Y.

You can readily simplify it as:
WHERE (a IS NULL AND b IS NULL) OR
(a = b)
The IS NOT NULL is not needed.
If you have a "safe" value (i.e. one that is never used), you can do this:
WHERE COALESCE(a, ' ') = COALESCE(b, ' ')
This assumes that ' ' is not a valid value.

I have found the Ask Tom article "Safely Comparing NULL Columns As Equal" to be the most helpful. In Oracle, you can use the DECODE function to do this:
WHERE 1 = DECODE(a, b, 1, 0)
And this is the most compact solution I have seen so far.

Simple is not necessarily performant.
consider this possibility.
WHERE X || 'x' = Y || 'x'
If you want to really push the envelope, use the SYS_OP_MAP_NONNULL

SELECT * FROM (
SELECT 'x' a, 'x' b FROM dual
UNION ALL
SELECT 'x' a, NULL b FROM dual
UNION ALL
SELECT NULL a, 'x' b FROM dual
UNION ALL
SELECT NULL a, NULL b FROM dual
UNION ALL
SELECT 'x' a, 'y' b FROM dual
UNION ALL
SELECT 'x' a, NULL b FROM dual
UNION ALL
SELECT NULL a, 'y' b FROM dual
UNION ALL
SELECT NULL a, NULL b FROM dual
)
WHERE NVL(a,'1')=NVL(b,'1')

Related

How to check in SQL if multi columnar set is in the table (without string concatenation)

Let's assume I've 3 columns in a table with values like this:
table_1:
A | B | C
-----------------------
'xx' | '' | 'y'
'x' | 'y' | 'x'
'x' | 'x' | 'y'
'x' | 'yy' | ''
'x' | '' | 'yy'
'x' | 'y' | 'y'
I've a result set (result of an SQL SELECT statement) which I want to identify in the above table if it exists there:
[
('x', 'x', 'y')
('x', 'y', 'y')
]
This result set would match for 5 (of 6) rows in instead of the 2 from the table above if I've compared the results of simple string concatenation, e.g. I would simply compare the results of this: SELECT concat(A, B, C) FROM table_1
I could solve this problem with comparing the results of more complex string concatenation functions like this: SELECT concat('A=', A, '_', 'B=', B, '_', 'C=', C )
BUT:
I don't want to use any hardcoded special separator in a string concatenation like _ or =
because any character might be in the data
e.g.: somewhere in column B there might be this value: xx_C=yy
it's not a clean solution
I don't want to use string concatenation at all, because it's an ugly solution
it makes the "distance" between the attributes disappear
not general enough
maybe I've columns with different datatypes I don't want to convert to a STRING based column
Question:
Is it possible to solve somehow this problem without using string concatenation?
Is there a simple solution for this multi column value checking problem?
I want to solve this in BiqQuery, but I'm interested in a general solution for every relational databse/datawarehouse.
Thank you!
CREATE TABLE test.table_1 (
A STRING,
B STRING,
C STRING
) AS
SELECT * FROM (
SELECT 'xx', '', 'y'
UNION ALL
SELECT 'x', 'y', 'x'
UNION ALL
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'yy', ''
UNION ALL
SELECT 'x', '', 'yy'
UNION ALL
SELECT 'x', 'y', 'y'
)
SELECT A, B, C
FROM test.table_1
WHERE (A, B, C) IN ( -> I need this functionality
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'y', 'y'
);

Below is the most generic way I can think of (BigQuery Standard SQL):
#standardSQL
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
You can test, play with above using sample data from your question as in below example
#standardSQL
WITH `project.test.table1` AS (
SELECT 'xx' a, '' b, 'y' c UNION ALL
SELECT 'x', 'y', 'x' UNION ALL
SELECT 'x', 'x', 'y' UNION ALL
SELECT 'x', 'yy', '' UNION ALL
SELECT 'x', '', 'yy' UNION ALL
SELECT 'x', 'y', 'y'
), `project.test.table2` AS (
SELECT 'x' a, 'x' b, 'y' c UNION ALL
SELECT 'x', 'y', 'y'
)
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
with output
Row a b c
1 x x y
2 x y y

Use join:
SELECT t1.*
FROM test.table_1 t1 JOIN
(SELECT 'x' as a, 'x' as b, 'y' as c
UNION ALL
SELECT 'x', 'y', 'y'
) t2
USING (a, b, c);

Is there a concept which is the 'opposite' of SQL NULL?

Is there a concept (with an implementation - in Oracle SQL for starters) which behaves like a 'universal' matcher ?
What I mean is; I know NULL is not equal to anything - including NULL.
Which is why you have to be careful to 'IS NULL' rather than '=NULL' in SQL expressions.
I also know it is useful to use the NVL (in Oracle) function to detect a NULL and replace it with something in the output.
However: what you replace the NULL with using NVL has to match the datatype of the underlying column; otherwise you'll (rightly) get an error.
An example:
I have a table with a NULLABLE column 'name' of type VARCHAR2; and this contains a NULL row.
I can fetch out the NULL and replace it with an NVL like this:
SELECT NVL(name, 'NullyMcNullFace’) from my_table;
Great.
But if the column happens to a NUMBER (say 'age'), then I have to change my NVL:
SELECT NVL(age, 32) from my_table;
Also great.
Now if the column happens to be a DATE (say 'somedate'), then I have to change my NVL again:
SELECT NVL(somedate, sysdate) from my_table;
What I'm getting at here : is that in order to deal with NULLs you have to replace with a specific something ; and that specific something has to 'fit' the data-type.
So is there a construct/concept of (for want of a better word) like 'ANY' here.
Where 'ANY' would fit into a column of any datatype (like NULL), but (unlike NULL and unlike all other specific values) would match ANYTHING (including NULL - ? probably urghhh dunno).
So that I could do:
SELECT NVL(whatever_column, ANY) from my_table;
I think the answer is probably no; and probably 'go away, NULLs are bad enough - never mind this monster you have half-thought of'.

No, there's no "universal acceptor" value in SQL that is equal to everything.
What you can do is raise the NVL into your comparison. Like if you're trying to do a JOIN:
SELECT ...
FROM my_table AS m
JOIN other_table AS o ON o.name = NVL(m.name, o.name)
So if m.name is NULL, then the join will compare o.name to o.name, which is of course always true.
For other uses of NULL, you might have to use another technique that suits the situation.

Adressing the question in the comment on Bill Karwin's answer:
I want to output a 1 if the NEW and OLD value differ and a 0 if they are the same. But (for my purposes) I want to also return 0 for two NULLS.
select
Case When (:New = :Old) or
(:New is NULL and :Old is NULL) then 0
Else
1
End
from dual

In a WHERE CLAUSE you can put a condition like this,
WHERE column1 LIKE NVL(any_column_or_param, '%')

Perhaps DECODE() would suit your purpose here?
WITH t1 AS (SELECT 1 ID, NULL val FROM dual UNION ALL
SELECT 2 ID, NULL val FROM dual UNION ALL
SELECT 3 ID, 1 val FROM dual UNION ALL
SELECT 4 ID, 2 val FROM dual UNION ALL
SELECT 5 ID, 5 val FROM dual),
t2 AS (SELECT 1 ID, NULL val FROM dual UNION ALL
SELECT 2 ID, 3 val FROM dual UNION ALL
SELECT 3 ID, 1 val FROM dual UNION ALL
SELECT 4 ID, 4 val FROM dual UNION ALL
SELECT 6 ID, 5 val FROM dual)
SELECT t1.id t1_id,
t1.val t1_val,
t2.id t2_id,
t2.val t2_val,
DECODE(t1.val, t2.val, 0, 1) different_vals
FROM t1
FULL OUTER JOIN t2 ON t1.id = t2.id
ORDER BY COALESCE(t1.id, t2.id);
T1_ID T1_VAL T2_ID T2_VAL DIFFERENT_VALS
---------- ---------- ---------- ---------- --------------
1 1 0
2 2 3 1
3 1 3 1 0
4 2 4 4 1
5 5 1
6 5 1

SQL query : how to check existence of multiple rows with one query

I have this table MyTable:
PROG VALUE
-------------
1 aaaaa
1 bbbbb
2 ccccc
4 ddddd
4 eeeee
now I'm checking the existence of a tuple with a certain id with a query like
SELECT COUNT(1) AS IT_EXISTS
FROM MyTable
WHERE ROWNUM = 1 AND PROG = {aProg}
For example I obtain with aProg = 1 :
IT_EXISTS
---------
1
I get with aProg = 3 :
IT_EXISTS
---------
0
The problem is that I must do multiple queries, one for every value of PROG to check.
What I want is something that with a query like
SELECT PROG, ??? AS IT_EXISTS
FROM MyTable
WHERE PROG IN {1, 2,3, 4, 5} AND {some other condition}
I can get something like
PROG IT_EXISTS
------------------
1 1
2 1
3 0
4 1
5 0
The database is Oracle...
Hope I'm clear
regards
Paolo

Take a step back and ask yourself this: Do you really need to return the rows that don't exist to solve your problem? I suspect the answer is no. Your application logic can determine that records were not returned which will allow you to simplify your query.
SELECT PROG
FROM MyTable
WHERE PROG IN (1, 2, 3, 4, 5)
If you get a row back for a given PROG value, it exists. If not, it doesn't exist.
Update:
In your comment in the question above, you stated:
the prog values are from others tables. The table of the question has only a subset of the all prog values
This suggests to me that a simple left outer join could do the trick. Assuming your other table with the PROG values you're interested in is called MyOtherTable, something like this should work:
SELECT a.PROG,
CASE WHEN b.PROG IS NOT NULL THEN 1 ELSE 0 END AS IT_EXISTS
FROM MyOtherTable AS a
LEFT OUTER JOIN MyTable AS b ON b.PROG = a.PROG
A WHERE clause could be tacked on to the end if you need to do some further filtering.

I would recommend something like this. If at most one row can match a prog in your table:
select p.prog,
(case when t.prog is null then 0 else 1 end) as it_exists
from (select 1 as prog from dual union all
select 2 as prog from dual union all
select 3 as prog from dual union all
select 4 as prog from dual union all
select 5 as prog from dual
) p left join
mytable t
on p.prog = t.prog and <some conditions>;
If more than one row could match, you'll want to use aggregation to avoid duplicates:
select p.prog,
max(case when t.prog is null then 0 else 1 end) as it_exists
from (select 1 as prog from dual union all
select 2 as prog from dual union all
select 3 as prog from dual union all
select 4 as prog from dual union all
select 5 as prog from dual
) p left join
mytable t
on p.prog = t.prog and <some conditions>
group by p.prog
order by p.prog;

One solution is to use (arguably abuse) a hierarchical query to create an arbitrarily long list of numbers (in my example, I've set the largest number to max(PROG), but you could hardcode this if you knew the top range you were looking for). Then select from that list and use EXISTS to check if it exists in MYTABLE.
select
PROG
, case when exists (select 1 from MYTABLE where PROG = A.PROG) then 1 else 0 end IT_EXISTS
from (
select level PROG
from dual
connect by level <= (select max(PROG) from MYTABLE) --Or hardcode, if you have a max range in mind
) A
;

It's still not very clear where you get the prog values to check. But if you can read them from a table, and assuming that the table doesn't contain duplicate prog values, this is the query I would use:
select a.prog, case when b.prog is null then 0 else 1 end as it_exists
from prog_values_to_check a
left join prog_values_to_check b
on a.prog = b.prog
and exists (select null
from MyTable t
where t.prog = b.prog)
If you do need to hard code the values, you can do it rather simply by taking advantage of the SYS.DBMS_DEBUG_VC2COLL function, which allows you to convert a comma-delimited list of values into rows.
with prog_values_to_check(prog) as (
select to_number(column_value) as prog
from table(SYS.DBMS_DEBUG_VC2COLL(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)) -- type your values here
)
select a.prog, case when b.prog is null then 0 else 1 end as it_exists
from prog_values_to_check a
left join prog_values_to_check b
on a.prog = b.prog
and exists (select null
from MyTable t
where t.prog = b.prog)
Note: The above queries take into account that the MyTable table may have multiple rows with the same prog value, but that you only want one row in the result. I make this assumption based the WHERE ROWNUM = 1 condition in your question.

Detect all columns in an oracle table which have the same value in each row

Every day, the requests get weirder and weirder.
I have been asked to put together a query to detect which columns in a table contain the same value for all rows. I said "That needs to be done by program, so that we can do it in one pass of the table instead of N passes."
I have been overruled.
So long story short. I have this very simple query which demonstrates the problem. It makes 4 passes over the test set. I am looking for ideas for SQL Magery which do not involve adding indexes on every column, or writing a program, or taking a full human lifetime to run.
And sigh It needs to be able to work on any table.
Thanks in advance for your suggestions.
WITH TEST_CASE AS
(
SELECT 'X' A, 5 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 3 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 7 B, 'TUE' C, NULL D FROM DUAL
),
KOUNTS AS
(
SELECT SQRT(COUNT(*)) S, 'Column A' COLUMNS_WITH_SINGLE_VALUES
FROM TEST_CASE P, TEST_CASE Q
WHERE P.A = Q.A OR (P.A IS NULL AND Q.A IS NULL)
UNION ALL
SELECT SQRT(COUNT(*)) S, 'Column B' COLUMNS_WITH_SINGLE_VALUES
FROM TEST_CASE P, TEST_CASE Q
WHERE P.B = Q.B OR (P.B IS NULL AND Q.B IS NULL)
UNION ALL
SELECT SQRT(COUNT(*)) S, 'Column C' COLUMNS_WITH_SINGLE_VALUES
FROM TEST_CASE P, TEST_CASE Q
WHERE P.C = Q.C OR (P.C IS NULL AND Q.C IS NULL)
UNION ALL
SELECT SQRT(COUNT(*)) S, 'Column D' COLUMNS_WITH_SINGLE_VALUES
FROM TEST_CASE P, TEST_CASE Q
WHERE P.D = Q.D OR (P.D IS NULL AND Q.D IS NULL)
)
SELECT COLUMNS_WITH_SINGLE_VALUES
FROM KOUNTS
WHERE S = (SELECT COUNT(*) FROM TEST_CASE)

do you mean something like this?
WITH
TEST_CASE AS
(
SELECT 'X' A, 5 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 3 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 7 B, 'TUE' C, NULL D FROM DUAL
)
select case when min(A) = max(A) THEN 'A'
when min(B) = max(B) THEN 'B'
when min(C) = max(C) THEN 'C'
when min(D) = max(D) THEN 'D'
else 'No one'
end
from TEST_CASE
Edit
this works:
WITH
TEST_CASE AS
(
SELECT 'X' A, 5 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 3 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 7 B, 'TUE' C, NULL D FROM DUAL
)
select case when min(nvl(A,0)) = max(nvl(A,0)) THEN 'A ' end ||
case when min(nvl(B,0)) = max(nvl(B,0)) THEN 'B ' end ||
case when min(nvl(C,0)) = max(nvl(C,0)) THEN 'C ' end ||
case when min(nvl(D,0)) = max(nvl(D,0)) THEN 'D ' end c
from TEST_CASE
Bonus: I have also added the check for the null values, so the result now is: A and D
And the SQLFiddle demo for you.

Optimizer statistics can easily identify columns with more than one distinct value. After statistics are gathered a simple query against the data dictionary will return the results almost instantly.
The results will only be accurate on 10g if you use ESTIMATE_PERCENT = 100. The results will be accurate on 11g+ if you use ESTIMATE_PERCENT = 100 or AUTO_SAMPLE_SIZE.
Code
create table test_case(a varchar2(1), b number, c varchar2(3),d number,e number);
--I added a new test case, E. E has null and not-null values.
--This is a useful test because null and not-null values are counted separately.
insert into test_case
SELECT 'X' A, 5 B, 'FRI' C, NULL D, NULL E FROM DUAL UNION ALL
SELECT 'X' A, 3 B, 'FRI' C, NULL D, NULL E FROM DUAL UNION ALL
SELECT 'X' A, 7 B, 'TUE' C, NULL D, 1 E FROM DUAL;
--Gather stats with default settings, which uses AUTO_SAMPLE_SIZE.
--One advantage of this method is that you can quickly get information for many
--tables at one time.
begin
dbms_stats.gather_schema_stats(user);
end;
/
--All columns with more than one distinct value.
--Note that nulls and not-nulls are counted differently.
--Not-nulls are counted distinctly, nulls are counted total.
select owner, table_name, column_name
from dba_tab_columns
where owner = user
and num_distinct + least(num_nulls, 1) <= 1
order by column_name;
OWNER TABLE_NAME COLUMN_NAME
------- ---------- -----------
JHELLER TEST_CASE A
JHELLER TEST_CASE D
Performance
On 11g, this method might be about as fast as mucio's SQL statement. Options like cascade => false would improve performance by not analyzing indexes.
But the great thing about this method is that it also produces useful statistics. If the system is already gathering statistics at regular intervals the hard work may already be done.
Details about AUTO_SAMPLE_SIZE algorithm
AUTO_SAMPLE_SIZE was completely changed in 11g. It does not use sampling for estimating number of distinct values (NDV). Instead it scans the whole table and uses a hash-based distinct algorithm. This algorithm does not require large amounts of memory or temporary tablespace. It's much faster to read the whole table than to sort even a small part of it. The Oracle Optimizer blog has a good description of the algorithm here. For even more details, see this presentation by Amit Podder. (You'll want to scan through that PDF if you want to verify the details in my next section.)
Possibility of a wrong result
Although the new algorithm does not use a simple sampling algorithm it still does not count the number of distinct values 100% correctly. It's easy to find cases where the estimated number of distinct values is not the same as the actual. But if the number of distinct values are clearly inaccurate, how can they be trusted in this solution?
The potential inaccuracy comes from two sources - hash collisions and synopsis splitting. Synopsis splitting is the main source of inaccuracy but does not apply here. It only happens when there are 13864 distinct values. And it never throws out all of the values, the final estimate will certainly be much larger than 1.
The only real concern is what are the chances of there being 2 distinct values with a hash collision. With a 64-bit hash the chance could be as low as 1 in 18,446,744,073,709,551,616. Unfortunately I don't know the details of their hashing algorithm and don't know the real probability. I was unable to produce any collisions from some simple testing and from previous real-life tests. (One of my tests was to use large values, since some statistics operations only use the first N bytes of data.)
Now also consider that this will only happen if all of the distinct values in the table collide. What are the chances of there being a table with only two values that just happen to collide? Probably much less than the chance of winning the lottery and getting struck by a meteorite at the same time.

If you can live with the result on a single line, this should only scan once;
WITH TEST_CASE AS
(
SELECT 'X' A, 5 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 3 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 7 B, 'TUE' C, NULL D FROM DUAL
)
SELECT
CASE WHEN COUNT(DISTINCT A) +
COUNT(DISTINCT CASE WHEN A IS NULL THEN 1 END) = 1
THEN 1 ELSE 0 END SAME_A,
CASE WHEN COUNT(DISTINCT B) +
COUNT(DISTINCT CASE WHEN B IS NULL THEN 1 END) = 1
THEN 1 ELSE 0 END SAME_B,
CASE WHEN COUNT(DISTINCT C) +
COUNT(DISTINCT CASE WHEN C IS NULL THEN 1 END) = 1
THEN 1 ELSE 0 END SAME_C,
CASE WHEN COUNT(DISTINCT D) +
COUNT(DISTINCT CASE WHEN D IS NULL THEN 1 END) = 1
THEN 1 ELSE 0 END SAME_D
FROM TEST_CASE
An SQLfiddle to test with.

this will be done in a single scan
WITH
TEST_CASE AS
(
SELECT 'X' A, 5 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 3 B, 'FRI' C, NULL D FROM DUAL UNION ALL
SELECT 'X' A, 7 B, 'TUE' C, NULL D FROM DUAL
)
select decode(count(distinct nvl(A,0)),1,'SINGLE','MULTP') COL_A,
decode(count(distinct nvl(B,0)),1,'SINGLE','MULTP') COL_B,
decode(count(distinct nvl(C,0)),1,'SINGLE','MULTP') COL_C,
decode(count(distinct nvl(D,0)),1,'SINGLE','MULTP') COL_D
from TEST_CASE

Is there better Oracle operator to do null-safe equality check?

According to this question, the way to perform an equality check in Oracle, and I want null to be considered equal null is something like
SELECT COUNT(1)
FROM TableA
WHERE
wrap_up_cd = val
AND ((brn_brand_id = filter) OR (brn_brand_id IS NULL AND filter IS NULL))
This can really make my code dirty, especially if I have a lot of where like this and the where is applied to several column. Is there a better alternative for this?

Well, I'm not sure if this is better, but it might be slightly more concise to use LNNVL, a function (that you can only use in a WHERE clause) which returns TRUE if a given expression is FALSE or UNKNOWN (NULL). For example...
WITH T AS
(
SELECT 1 AS X, 1 AS Y FROM DUAL UNION ALL
SELECT 1 AS X, 2 AS Y FROM DUAL UNION ALL
SELECT 1 AS X, NULL AS Y FROM DUAL UNION ALL
SELECT NULL AS X, 1 AS Y FROM DUAL
)
SELECT
*
FROM
T
WHERE
LNNVL(X <> Y);
...will return all but the row where X = 1 and Y = 2.

As an alternative you can use NVL function and designated literal which will be returned if a value is null:
-- both are not nulls
SQL> with t1(col1, col2) as(
2 select 123, 123 from dual
3 )
4 select 1 res
5 from t1
6 where nvl(col1, -1) = nvl(col2, -1)
7 ;
RES
----------
1
-- one of the values is null
SQL> with t1(col1, col2) as(
2 select null, 123 from dual
3 )
4 select 1 res
5 from t1
6 where nvl(col1, -1) = nvl(col2, -1)
7 ;
no rows selected
-- both values are nulls
SQL> with t1(col1, col2) as(
2 select null, null from dual
3 )
4 select 1 res
5 from t1
6 where nvl(col1, -1) = nvl(col2, -1)
7 ;
RES
----------
1
As #Codo has noted in the comment, of course, above approach requires choosing a literal comparing columns will never have. If comparing columns are of number datatype(for example) and are able to accept any value, then choosing -1 of course won't be an option. To eliminate that restriction we can use decode function(for numeric or character datatypes) for that:
with t1(col1, col2) as(
2 select null, null from dual
3 )
4 select 1 res
5 from t1
6 where decode(col1, col2, 'same', 'different') = 'same'
7 ;
RES
----------
1

With the LNNVL function, you still have a problem when col1 and col2 (x and y in the answer) are both null. With nvl it works but it is inefficient (not understood by the optimizer) and you have to find a value that cannot appear in the data (and the optimizer should know it cannot).
For strings you can choose a value that have more characters than the maximum of the columns but it is dirty.
The true efficient way to do it is to use the (undocumented) function SYS_OP_MAP_NONNULL().
like this:
where SYS_OP_MAP_NONNULL(col1) <> SYS_OP_MAP_NONNULL(col2)
SYS_OP_MAP_NONNULL(a) is equivalent to nvl(a,'some internal value that cannot appear in the data but that is not null')

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Simplest SQL expression to check if two columns have same value accounting for NULL [duplicate] - sql

You can readily simplify it as: WHERE (a IS NULL AND b IS NULL) OR (a = b) The IS NOT NULL is not needed. If you have a "safe" value (i.e. one that is never used), you can do this: WHERE COALESCE(a, ' ') = COALESCE(b, ' ') This assumes that ' ' is not a valid value.

I have found the Ask Tom article "Safely Comparing NULL Columns As Equal" to be the most helpful. In Oracle, you can use the DECODE function to do this: WHERE 1 = DECODE(a, b, 1, 0) And this is the most compact solution I have seen so far.

Simple is not necessarily performant. consider this possibility. WHERE X || 'x' = Y || 'x' If you want to really push the envelope, use the SYS_OP_MAP_NONNULL

Related

How to check in SQL if multi columnar set is in the table (without string concatenation)

Is there a concept which is the 'opposite' of SQL NULL?

SQL query : how to check existence of multiple rows with one query

Detect all columns in an oracle table which have the same value in each row

Is there better Oracle operator to do null-safe equality check?

Categories

Resources