SQL: Select (null = null); - sql

This question is out of a databases exam i was looking at, so it might be nothing one can ever use in a real situation...
What output does the following valid SQL statement produce? Explain your answer!
SELECT (NULL = NULL);
I can easily produce the output of the statement, which is
school=> select (null = null);
?column?
----------
(1 row)
in psql 8.4.11, but how and why this is the answer i don't know... I would have guessed it tries to evaluate the expression inside the brackets and comes up with true/false but apparently it doesn't.
Any hints on why it behaves like it does?
Thanks

NULL stands for "Unknown Value". It is not known whether a value is true or false or anything else.
So, when comparing two unknown values, what would the answer be? Is UnknownA equal to UnknownB?
Answer? Unknown...
Here is an example for SQL Server:
IF (NULL = NULL)
PRINT 'Equals'
IF (NULL != NULL)
PRINT 'Not Equals'
IF (NULL IS NULL)
PRINT 'IS'
IF (NULL IS NOT NULL)
PRINT 'IS NOT'
The only thing that gets printed: IS

I suppose, the expected answer is NULL. That's what MySQL does. That's what PostgreSQL returns as well, actually. Probably, your SQL client just doesn't show the single row where the only returned value is a NULL.
MS SQL server doesn't have a boolean type, though, so it cannot just select a result of boolean expression. So the result in MS SQL is an error.
The point of the question is to check your understanding of NULL logic in SQL. Instead of =, you should use the IS operator when comparing to NULL (unless the ANSI_NULLS option is set to true in MySQL).

Related

Handle null and empty in SQL Server [duplicate]

This is obviously not going to return a row...
select 1 where null = ''
But why does this also not return a row?
select 1 where null <> ''
How can both of those WHEREs be "false"?
"How can both of those WHEREs be "false"?"
It's not!
The answer is not "true" either!
The answer is "we don't know".
Think of NULL as a value you don't know yet.
Would you bet it's '' ?
Would you bet it's not '' ?
So, safer is to declare you don't know yet. The answer to both questions, therefore, is not false but I don't know, e.g. NULL in SQL.
Because this is an instance of SQL Server conforming to ANSI SQL ;-)
NULL in SQL is somewhat similar to IEEE NaN in comparison rules: NaN != NaN and NaN == NaN are both false. It takes a special operator IS NULL in SQL (or "IsNaN" for IEEE FP) to detect these special values. (There are actually multiple ways to detect these special values: IS NULL/"IsNaN" are just clean and simple methods.)
However, NULL = x goes one step further: the result of NULL =/<> x is not false. Rather, the result of the expression is itself NULL UNKNOWN. So NOT(NULL = '') is also NULL UNKNOWN (or "false" in a where context -- see comment). Welcome to the world of SQL tri-state logic ;-)
Since the question is about SQL Server, then for completeness: If running run with "SET ANSI_NULLS OFF" -- but see remarks/warnings at top -- then the "original" TSQL behavior can be attained.
"Original" behavior (this is deprecated):
SET ANSI_NULLS OFF;
select 'eq' where null = ''; -- no output
select 'ne' where null <> ''; -- output: ne
select 'not' where not(null = ''); -- output: not; null = '' -> False, not(False) -> True
ANSI-NULLs behavior (default in anything recent, please use):
SET ANSI_NULLS ON;
select 'eq' where null = ''; -- no output
select 'ne' where null <> ''; -- no output
select 'not' where not(null = ''); -- no output; null = '' -> Unknown, not(Unknown) -> Unknown
Happy coding.
Null has some very odd behaviour when comparing - the wikipedia article explains it quite well. In a nutshell, as well as true and false, there is an unknown value, which SQL returns when doing a comparison.
The SQL standard specifies that NULL = x is false for all x (even if x is itself NULL) and SQL Server is just following the standard. If you want to check if something is or is not NULL, then you have to use x IS NULL or x IS NOT NULL.
Databses have what is known as three valued logic. The values of true, false, and unknown.
Read this http://www.simple-talk.com/sql/learn-sql-server/sql-and-the-snare-of-three-valued-logic/
Because any comparison with NULL returns false (or to be precise: returns NULL)
As NULL is the absence of information you cannot tell whether is equal to or not equal to something.
Probably getting a little repetitive here, but my two cents anyway:
A. a_horse_with_no_name's example (comment, above) is a very good one!
B. In non=mathmatical terms, NULL is an unknown value. An empty String is a string with a length of zero - hence, a "known" value. This is why NULL does not compare equally or not equally to an empty string.
C. Since NULL represents unknown, it is impossible to compare two NULL values for equality. If you don't know the value of X, and you don't know the value of Y, then you don't know if they are equal or not.
See SQL-92 8.2 comparison predicate saying:
General Rules
Let X and Y be any two corresponding <row value constructor element>s. Let XV and YV be the values represented by X and Y, respectively.
Case:
a) If XV or YV is the null value, then "X <comp op> Y" is unknown.

In SQL, is there a difference between "IS" and "=" when returning values in where statements?

I am currently learning SQL utilizing Codecademy and am curious if there is a difference between using "IS" or "=".
In the current lesson, I wrote this code:
SELECT *
FROM nomnom
WHERE neighborhood IS 'Midtown'
OR neighborhood IS 'Downtown'
OR neighborhood IS 'Chinatown';
Which ran perfectly fine. I always like to look at the answer after to see if there was something I did wrong or could improve on. The answer had this code:
SELECT *
FROM nomnom
WHERE neighborhood = 'Midtown'
OR neighborhood = 'Downtown'
OR neighborhood = 'Chinatown';
Do IS and = function the same?
All that you want to know you can find it here:
The IS and IS NOT operators work like = and != except when one or both
of the operands are NULL. In this case, if both operands are NULL,
then the IS operator evaluates to 1 (true) and the IS NOT operator
evaluates to 0 (false). If one operand is NULL and the other is not,
then the IS operator evaluates to 0 (false) and the IS NOT operator is
1 (true). It is not possible for an IS or IS NOT expression to
evaluate to NULL. Operators IS and IS NOT have the same precedence as
=.
taken from: SQL As Understood By SQLite.
The important part is: ...except when one or both of the operands are NULL... because when using = or != (<>) and 1 (or both) of the operands is NULL then the result is also NULL and this is the difference to IS and IS NOT.
They work the same but "IS" is a keyword in MySQL and is generally used while comparing NULL values. While comparing NULL values "=" does not work.
SELECT * FROM nomnom WHERE neighborhood IS NULL
The above statement would run perfectly fine but
SELECT * FROM nomnom WHERE neighborhood = NULL
would result in an error.
They are the same for these cases, but further down the line you will discover one nifty little value called NULL.
NULL is a pain because... it doesn't exist.
0 = NULL returns FALSE;
Date <> [Column] will not return lines with NULL, only those with a value that is different.
Hell, even NULL = NULL returns false. And NULL <> NULL also returns false. That is why "IS" exists. Because NULL IS NULL will return true.
So as a general rule, use = for values.
Keep "IS" for null.
[Column] IS NULL
or
[Column] IS NOT NULL
And remember to always check if your column is nullable that you need to plan for null values in your WHERE or ON clauses.

Oracle: True value in WHERE

I am building a SQL query inside a function. The functions parameters are my criterias for the WHERE clause. Parameters can be null too.
foo (1,2,3) => SELECT x FROM y WHERE a=1 AND b=2 AND c=3;
foo (null, 2, null) => SELECT x FROM y WHERE b=2;
My approach to do that in code is to add a very first alltime true in the WHERE-clause (e.g. 1=1 or NULL is NULL or 2 > 1)
So I do not need to handle the problem that the 1st WHERE condition is after a "WHERE" and all others are after a "AND".
String sql="SELECT x FROM y WHERE 1=1";
if (a!=null)
{
sql += " AND a="+a;
}
Is there a better term than 1=1 or my other samples to EXPLICITLY have a always true value? TRUE and FALSE is not working.
Oracle does not support a boolean type in SQL. It exists in PL/SQL, but can't be used in a query. The easiest thing to do is 1=1 for true and 0=1 for false. I'm not sure if the optimizer will optimize these out or not, but I suspect the performance impact is negligible. I've used this where the number of predicates is unknown until runtime.
I think this construct is superior to anything using nvl, coalesce, decode, or case, because those bypass normal indexes and I've had them lead to complicated and slow execution plans.
So, yes I'd do something like you have (not sure what language you're building this query in; this is not valid PL/SQL. Also you should use bind variables but that's another story):
sql = "SELECT x FROM y WHERE 1=1";
if (a != null)
{
sql += " AND a=" + a;
}
if (b != null)
{
sql += " AND b=" + b;
}
... etc.
I think this is better than Gordon Linoff's suggestion, because otherwise this would be more complicated because you'd have to have another clause for each predicate checking if there was a previous predicate, i.e. whether you needed to include the AND or not. It just makes the code more verbose to avoid a single, trivial clause in the query.
I've never really understood the approach of mandating a where clause even when there are no conditions. If you are constructing the logic, then combine the conditions. If the resulting string is empty, then leave out the where clause entirely.
Many application languages have the equivalent of concat_ws() -- string concatenation with a separator (join in Python, for instance). So leaving out the where clause does not even result in code that is much more complicated.
As for your question, Oracle doesn't have boolean values, so 1=1 is probably the most common approach.
use NVL so you can set a default value. Look up NVL or NVL2. Either might work for you.
check out: http://www.dba-oracle.com/t_nvl_vs_nvl2.htm
In sql the standard way to "default null" is to use coalesce. So lets say your input is #p1, #p2, and #p3
Then
select *
from table
where a = coalesce(#p1,a) and
b = coalesce(#p2,b) and
c = coalesce(#p3,c)
Thus if (as an example) #p1 is null then it will compare a = a (which is always true). If #p1 is not null then it will filter on a equals that value.
In this way null becomes a "wildcard" for all records.
you could also use decode to get the results you need since it is oracle
Select
decode(a, 1, x,null),
decode(b, 2, x,null),
decode(c, 3, x,null)
from y
What about using OR inside of the parentheses while ANDs exist at outside of them
SELECT x
FROM y
WHERE ( a=:prm1 OR :prm1 is null)
AND ( b=:prm2 OR :prm2 is null)
AND ( c=:prm3 OR :prm3 is null);

is null vs. equals null

I have a rough understanding of why = null in SQL and is null are not the same, from questions like this one.
But then, why is
update table
set column = null
a valid SQL statement (at least in Oracle)?
From that answer, I know that null can be seen as somewhat "UNKNOWN" and therefore and sql-statement with where column = null "should" return all rows, because the value of column is no longer an an unknown value. I set it to null explicitly ;)
Where am I wrong/ do not understand?
So, if my question is maybe unclear:
Why is = null valid in the set clause, but not in the where clause of an SQL statement?
SQL doesn't have different graphical signs for assignment and equality operators like languages such as c or java have. In such languages, = is the assignment operator, while == is the equality operator. In SQL, = is used for both cases, and interpreted contextually.
In the where clause, = acts as the equality operator (similar to == in C). I.e., it checks if both operands are equal, and returns true if they are. As you mentioned, null is not a value - it's the lack of a value. Therefore, it cannot be equal to any other value.
In the set clause, = acts as the assignment operator (similar to = in C). I.e., it sets the left operand (a column name) with the value of the right operand. This is a perfectly legal statement - you are declaring that you do not know the value of a certain column.
They completely different operators, even if you write them the same way.
In a where clause, is a comparsion operator
In a set, is an assignment operator
The assigment operator allosw to "clear" the data in the column and set it to the "null value" .
In the set clause, you're assigning the value to an unknown, as defined by NULL. In the where clause, you're querying for an unknown. When you don't know what an unknown is, you can't expect any results for it.

why is null not equal to null false

I was reading this article:
Get null == null in SQL
And the consensus is that when trying to test equality between two (nullable) sql columns, the right approach is:
where ((A=B) OR (A IS NULL AND B IS NULL))
When A and B are NULL, (A=B) still returns FALSE, since NULL is not equal to NULL. That is why the extra check is required.
What about when testing inequalities? Following from the above discussion, it made me think that to test inequality I would need to do something like:
WHERE ((A <> B) OR (A IS NOT NULL AND B IS NULL) OR (A IS NULL AND B IS NOT NULL))
However, I noticed that that is not necessary (at least not on informix 11.5), and I can just do:
where (A<>B)
If A and B are NULL, this returns FALSE. If NULL is not equal to NULL, then shouldn't this return TRUE?
EDIT
These are all good answers, but I think my question was a little vague. Allow me to rephrase:
Given that either A or B can be NULL, is it enough to check their inequality with
where (A<>B)
Or do I need to explicitly check it like this:
WHERE ((A <> B) OR (A IS NOT NULL AND B IS NULL) OR (A IS NULL AND B IS NOT NULL))
REFER to this thread for the answer to this question.
Because that behavior follows established ternary logic where NULL is considered an unknown value.
If you think of NULL as unknown, it becomes much more intuitive:
Is unknown a equal to unknown b? There's no way to know, so: unknown.
relational expressions involving NULL actually yield NULL again
edit
here, <> stands for arbitrary binary operator, NULL is the SQL placeholder, and value is any value (NULL is not a value):
NULL <> value -> NULL
NULL <> NULL -> NULL
the logic is: NULL means "no value" or "unknown value", and thus any comparison with any actual value makes no sense.
is X = 42 true, false, or unknown, given that you don't know what value (if any) X holds? SQL says it's unknown. is X = Y true, false, or unknown, given that both are unknown? SQL says the result is unknown. and it says so for any binary relational operation, which is only logical (even if having NULLs in the model is not in the first place).
SQL also provides two unary postfix operators, IS NULL and IS NOT NULL, these return TRUE or FALSE according to their operand.
NULL IS NULL -> TRUE
NULL IS NOT NULL -> FALSE
All comparisons involving null are undefined, and evaluate to false. This idea, which is what prevents null being evaluated as equivalent to null, also prevents null being evaluated as NOT equivalent to null.
The short answer is... NULLs are weird, they don't really behave like you'd expect.
Here's a great paper on how NULLs work in SQL. I think it will help improve your understanding of the topic. I think the sections on handling null values in expressions will be especially useful for you.
http://www.oracle.com/technology/oramag/oracle/05-jul/o45sql.html
The default (ANSI) behaviour of nulls within an expression will result in a null (there are enough other answers with the cases of that).
There are however some edge cases and caveats that I would place when dealing with MS Sql Server that are not being listed.
Nulls within a statement that is grouping values together will be considered equal and be grouped together.
Null values within a statement that is ordering them will be considered equal.
Null values selected within a statement that is using distinct will be considered equal when evaluating the distinct aspect of the query
It is possible in SQL Server to override the expression logic regarding the specific Null = Null test, using the SET ANSI_NULLS OFF, which will then give you equality between null values - this is not a recommended move, but does exist.
SET ANSI_NULLS OFF
select result =
case
when null=null then 'eq'
else 'ne'
end
SET ANSI_NULLS ON
select result =
case
when null=null then 'eq'
else 'ne'
end
Here is a Quick Fix
ISNULL(A,0)=ISNULL(B,0)
0 can be changed to something that can never happen in your data
"Is unknown a equal to unknown b? There's no way to know, so: unknown."
The question was : why does the comparison yield FALSE ?
Given three-valued logic, it would indeed be sensible for the comparison to yield UNKNOWN (not FALSE). But SQL does yield FALSE, and not UNKNOWN.
One of the myriads of perversities in the SQL language.
Furthermore, the following must be taken into account :
If "unkown" is a logical value in ternary logic, then it ought to be the case that an equality comparison between two logical values that both happen to be (the value for) "unknown", then that comparison ought to yield TRUE.
If the logical value is itself unknown, then obviously that cannot be represented by putting the value "unknown" there, because that would imply that the logical value is known (to be "unknown"). That is, a.o., how relational theory proves that implementing 3-valued logic raises the requirement for a 4-valued logic, that a 4 valued logic leads to the need for a 5-valued logic, etc. etc. ad infinitum.