why does 10/NULL evaluate to null? - sql

In SQL, why does 10/NULL evaluate to NULL (or unknown)? Example:
IF (10/NULL) IS NULL THEN
  DBMS_OUTPUT.PUT_LINE('Null.');
END IF;
However, 1 = NULL, being a comparison, is considered FALSE. Shouldn't 10/NULL also be considered FALSE?
I am referring to SQL only, not any DBMS in particular. It might be a duplicate, but I didn't know what keywords to search for.

Shouldn't 10/NULL also be considered as FALSE?
No, because:
Any arithmetic expression containing a null always evaluates to null. For example, null added to 10 is null. In fact, all operators (except concatenation) return null when given a null operand.
Emphasis mine, taken from the Oracle manual: http://docs.oracle.com/cd/E11882_01/server.112/e26088/sql_elements005.htm#i59110
And this is required by the SQL standard.
Edit, as the question was for RDBMS in general:
SQL Server
When SET ANSI_NULLS is ON, an operator that has one or two NULL expressions returns UNKNOWN
Link to the manual:
MySQL
An expression that contains NULL always produces a NULL value unless otherwise indicated in the documentation for a particular function or operator
Link to the manual
DB2
if either operand can be null, the result can be null, and if either is null, the result is the null value
Link to the manual:
PostgreSQL
Unfortunately I could not find such an explicit statement in the PostgreSQL manual, although I am sure it behaves the same.
Warning: The "(except concatenation)" is an Oracle only and non-standard exception. (The empty string and NULL are almost identical in Oracle). Concatenating nulls gives null in all other DBMS.
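A quick way to check the "null propagates through arithmetic" rule is to run it. The sketch below uses Python's built-in sqlite3 module purely as a convenient engine (SQLite is an assumption here, it is not one of the DBMS quoted above), and the helper name scalar is made up for illustration. SQL NULL comes back to Python as None.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a one-row, one-column query and return its single value."""
    return conn.execute(sql).fetchone()[0]

division = scalar("SELECT 10 / NULL")             # None (SQL NULL)
addition = scalar("SELECT 10 + NULL")             # None
concatenation = scalar("SELECT 'abc' || NULL")    # None: SQLite is not Oracle
is_null_check = scalar("SELECT (10 / NULL) IS NULL")  # 1 (true)
```

Note that concatenation also yields NULL here, which matches the warning above that Oracle's exception is non-standard.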

1 = null is not null. It is actually unknown, as is the result of any other operation on null.

The equality predicate 1 = NULL evaluates to NULL. But NULL in a boolean comparison is considered false.
If you do something like NOT( 1 = NULL ), 1 = NULL evaluates to NULL, NOT( NULL ) evaluates to NULL and so the condition as a whole ends up evaluating to false.
Oracle has a section in their documentation on handling NULL values in comparisons and conditional statements; other databases handle things in a very similar manner.
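To see the difference between "evaluates to NULL" and "is treated as false by the condition", here is a small sketch using Python's sqlite3 module (SQLite is an assumption, chosen only because it ships with Python). The comparison itself returns NULL, and so does its negation, yet neither WHERE clause lets a row through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The predicate's value, selected directly: NULL comes back as None.
comparison = conn.execute("SELECT 1 = NULL").fetchone()[0]
negated = conn.execute("SELECT NOT (1 = NULL)").fetchone()[0]

# The same predicates used as filters: a row is kept only when the
# condition is true, so NULL/unknown filters the row out both times.
rows_eq = conn.execute("SELECT 'kept' WHERE 1 = NULL").fetchall()
rows_not = conn.execute("SELECT 'kept' WHERE NOT (1 = NULL)").fetchall()
```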

10/something means that you are counting how much "something" will be in 10
in this case you're counting how much "nothing" will be in 10 - that's infinity, unknown..
1 = NULL is false because one does not equal nothing
The NULLIF function accepts two parameters. If the first parameter is equal to the second parameter, NULLIF returns Null. Otherwise, the value of the first parameter is returned.
NULLIF(value1, value2)
NVL
The NVL function accepts two parameters. It returns the first non-NULL parameter or NULL if all parameters are NULL.
Also check these conditional outcomes:
This "null equals UNKNOWN truth value" proposition introduces an inconsistency into SQL 3VL. One major problem is that it contradicts a basic property of nulls, the property of propagation. Nulls, by definition, propagate through all SQL expressions. The Boolean truth values do not have this property. Consider the following scenarios in SQL:1999, in which two Boolean truth values are combined into a compound predicate. According to the rules of SQL 3VL, and as shown in the 3VL truth table shown earlier in this article, the following statements hold:
( TRUE OR UNKNOWN ) → TRUE
( FALSE AND UNKNOWN ) → FALSE
However, because nulls propagate, treating null as UNKNOWN results in the following logical inconsistencies in SQL 3VL:
( TRUE OR NULL ) → NULL ( = UNKNOWN )
( FALSE AND NULL ) → NULL ( = UNKNOWN )
The SQL:1999 standard does not define how to deal with this inconsistency, and results could vary between implementations. Because of these inconsistencies and lack of support from vendors the SQL Boolean datatype did not gain widespread acceptance. Most SQL DBMS platforms now offer their own platform-specific recommendations for storing Boolean-type data.
Note that in the PostgreSQL implementation of SQL, the null value is used to represent all UNKNOWN results and the following evaluations occur:
( TRUE OR NULL ) → TRUE
( FALSE AND NULL ) → FALSE
( FALSE OR NULL ) IS NULL → TRUE
( TRUE AND NULL ) IS NULL → TRUE
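The PostgreSQL results listed above are what Kleene's three-valued logic predicts. As a minimal sketch, the truth tables can be written in plain Python with None standing in for UNKNOWN; the function names and3, or3 and not3 are made up for this illustration.

```python
def and3(a, b):
    """Three-valued AND: a single False decides the result."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None  # UNKNOWN
    return True

def or3(a, b):
    """Three-valued OR: a single True decides the result."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None  # UNKNOWN
    return False

def not3(a):
    """Three-valued NOT: UNKNOWN stays UNKNOWN."""
    return None if a is None else (not a)
```

For example, or3(True, None) is True and and3(False, None) is False, while or3(False, None) and and3(True, None) both stay None.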

Related

Handle null and empty in SQL Server [duplicate]

This is obviously not going to return a row...
select 1 where null = ''
But why does this also not return a row?
select 1 where null <> ''
How can both of those WHEREs be "false"?
"How can both of those WHEREs be "false"?"
It's not!
The answer is not "true" either!
The answer is "we don't know".
Think of NULL as a value you don't know yet.
Would you bet it's '' ?
Would you bet it's not '' ?
So it is safer to declare that you don't know yet. The answer to both questions, therefore, is not false but "I don't know", i.e. NULL in SQL.
Because this is an instance of SQL Server conforming to ANSI SQL ;-)
NULL in SQL is somewhat similar to IEEE NaN in comparison rules: NaN == NaN is false, and so is any ordered comparison such as NaN < x (NaN != NaN, however, is true). It takes a special operator IS NULL in SQL (or "IsNaN" for IEEE FP) to detect these special values. (There are actually multiple ways to detect these special values: IS NULL/"IsNaN" are just clean and simple methods.)
However, NULL = x goes one step further: the result of NULL =/<> x is not false. Rather, the result of the expression is itself UNKNOWN. So NOT(NULL = '') is also UNKNOWN (or "false" in a where context -- see comment). Welcome to the world of SQL tri-state logic ;-)
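The NaN analogy can be tried directly in Python, whose floats follow IEEE 754. This is only an illustration of the comparison: NaN comparisons return an ordinary boolean, whereas a SQL comparison with NULL returns a third value. One detail worth noting is that NaN != NaN evaluates to True in IEEE arithmetic.

```python
import math

nan = math.nan

nan_equals_itself = (nan == nan)        # False
nan_differs_from_itself = (nan != nan)  # True
nan_less_than_one = (nan < 1.0)         # False
detected = math.isnan(nan)              # True: the dedicated test, like IS NULL
```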
Since the question is about SQL Server, then for completeness: if running with "SET ANSI_NULLS OFF" -- but see the remarks/warnings at the top of its documentation -- then the "original" TSQL behavior can be attained.
"Original" behavior (this is deprecated):
SET ANSI_NULLS OFF;
select 'eq' where null = ''; -- no output
select 'ne' where null <> ''; -- output: ne
select 'not' where not(null = ''); -- output: not; null = '' -> False, not(False) -> True
ANSI-NULLs behavior (default in anything recent, please use):
SET ANSI_NULLS ON;
select 'eq' where null = ''; -- no output
select 'ne' where null <> ''; -- no output
select 'not' where not(null = ''); -- no output; null = '' -> Unknown, not(Unknown) -> Unknown
Happy coding.
Null has some very odd behaviour when comparing - the wikipedia article explains it quite well. In a nutshell, as well as true and false, there is an unknown value, which SQL returns when doing a comparison.
The SQL standard specifies that NULL = x is never true for any x (even if x is itself NULL); strictly speaking the result is UNKNOWN, which a WHERE clause treats like false. SQL Server is just following the standard. If you want to check if something is or is not NULL, then you have to use x IS NULL or x IS NOT NULL.
Databases have what is known as three-valued logic: the values true, false, and unknown.
Read this http://www.simple-talk.com/sql/learn-sql-server/sql-and-the-snare-of-three-valued-logic/
Because any comparison with NULL returns false (or to be precise: returns NULL)
As NULL is the absence of information, you cannot tell whether it is equal to or not equal to something.
Probably getting a little repetitive here, but my two cents anyway:
A. a_horse_with_no_name's example (comment, above) is a very good one!
B. In non-mathematical terms, NULL is an unknown value. An empty string is a string with a length of zero - hence, a "known" value. This is why NULL does not compare equally or not equally to an empty string.
C. Since NULL represents unknown, it is impossible to compare two NULL values for equality. If you don't know the value of X, and you don't know the value of Y, then you don't know if they are equal or not.
See SQL-92 8.2 comparison predicate saying:
General Rules
Let X and Y be any two corresponding <row value constructor element>s. Let XV and YV be the values represented by X and Y, respectively.
Case:
a) If XV or YV is the null value, then "X <comp op> Y" is unknown.

Why doesn't NULL propagate in boolean expression in SQL?

In SQL, every operation which involves an operand with NULL yields NULL (with the obvious exceptions of IS NULL or IS NOT NULL operators). However, NULL does not propagate with AND or OR operators which may return TRUE or FALSE. For example, the following in MariaDB 10.4 returns NULL and 0 respectively:
select 0 & null, 0 and null
The difference is that the first is a bitwise AND, the second is a boolean AND. Why does NULL not propagate in boolean operations?
A NULL value has a whole series of possible meanings. IIRC Chris Date found about 7 different interpretations.
A very common interpretation of NULL is: "I don't know". Another one is: "Not applicable".
So let's try to evaluate a condition with the "I don't know" interpretation of a NULL value.
As an example suppose there are two persons. And you want to compare their age. Person A happens to be 31 years old. In case of the other person, person B, you don't know.
The question if A is as old as B cannot be answered positively. But it can't be denied either. In fact, you don't know. Hence the truth value here is NULL.
If we add the ages of both persons, we run into the same problem. We don't have a clue about the sum of their ages. Again the resulting value is NULL.
This is why you'll have to define how to treat NULL values. A database system cannot know this.
We have 0 and null. In MariaDB, 0 and FALSE are synonymous. So we have FALSE AND NULL. But FALSE AND <anything> is always FALSE - there's no doubt that no matter what value might be substituted here, nothing can make this statement TRUE now.
So we short-circuit and return the FALSE/0 result. Similarly, 1 OR NULL should return 1.
NULL has the semantics of "unknown" value. It does not have the semantics of "missing". This is a nuance.
But, it is "propagated" by AND and OR, just not as you might expect. So:
true AND NULL --> NULL, because the value would depend on what NULL is
false AND NULL --> false, because the first value requires that the result is false
In WHERE and WHEN clauses, NULL is treated as "false". However, in CHECK constraints, NULL is treated as "true" -- that is, only explicitly false values fail the CHECK constraint.
Otherwise, you are correct that almost all operations with NULL return NULL. The & operator is a bitwise operator that has nothing to do with boolean values. It is just another "mathematical" operator, such as +, or *, so the value is NULL when any operand is NULL.
One very important exception is the NULL-safe comparison operator, <=>.
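The points above can be tried out with Python's sqlite3 module. This is a sketch under the assumption that SQLite stands in for MariaDB: SQLite has no <=> operator, so its IS operator is used as the NULL-safe comparison, and the table t with its CHECK constraint is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Bitwise AND propagates NULL; boolean AND can decide on FALSE alone.
bitwise, boolean = conn.execute("SELECT 0 & NULL, 0 AND NULL").fetchone()
one_or_null = conn.execute("SELECT 1 OR NULL").fetchone()[0]
true_and_null = conn.execute("SELECT 1 AND NULL").fetchone()[0]

# A CHECK constraint only rejects rows where the condition is false.
conn.execute("CREATE TABLE t (n INTEGER CHECK (n > 0))")
conn.execute("INSERT INTO t VALUES (NULL)")  # NULL > 0 is unknown: accepted
try:
    conn.execute("INSERT INTO t VALUES (-1)")  # -1 > 0 is false: rejected
    negative_rejected = False
except sqlite3.IntegrityError:
    negative_rejected = True

# NULL-safe comparison: always 1 or 0, never NULL.
null_safe = conn.execute("SELECT NULL IS NULL, 1 IS NULL").fetchone()
```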

In SQL, is there a difference between "IS" and "=" when returning values in where statements?

I am currently learning SQL utilizing Codecademy and am curious if there is a difference between using "IS" or "=".
In the current lesson, I wrote this code:
SELECT *
FROM nomnom
WHERE neighborhood IS 'Midtown'
OR neighborhood IS 'Downtown'
OR neighborhood IS 'Chinatown';
Which ran perfectly fine. I always like to look at the answer after to see if there was something I did wrong or could improve on. The answer had this code:
SELECT *
FROM nomnom
WHERE neighborhood = 'Midtown'
OR neighborhood = 'Downtown'
OR neighborhood = 'Chinatown';
Do IS and = function the same?
Everything you want to know can be found here:
The IS and IS NOT operators work like = and != except when one or both
of the operands are NULL. In this case, if both operands are NULL,
then the IS operator evaluates to 1 (true) and the IS NOT operator
evaluates to 0 (false). If one operand is NULL and the other is not,
then the IS operator evaluates to 0 (false) and the IS NOT operator is
1 (true). It is not possible for an IS or IS NOT expression to
evaluate to NULL. Operators IS and IS NOT have the same precedence as
=.
taken from: SQL As Understood By SQLite.
The important part is: ...except when one or both of the operands are NULL... because when using = or != (<>) and 1 (or both) of the operands is NULL then the result is also NULL and this is the difference to IS and IS NOT.
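Since the quoted documentation is SQLite's, the difference can be checked with Python's sqlite3 module. In this sketch the contents of the nomnom table (and its name column) are invented for illustration; only the table name and the neighborhood column come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nomnom (name TEXT, neighborhood TEXT)")
conn.executemany(
    "INSERT INTO nomnom VALUES (?, ?)",
    [("A", "Midtown"), ("B", "Downtown"), ("C", None)],  # C has no neighborhood
)

def count(where):
    """Count the rows of nomnom matching a WHERE condition."""
    sql = f"SELECT COUNT(*) FROM nomnom WHERE {where}"
    return conn.execute(sql).fetchone()[0]

eq_value = count("neighborhood = 'Midtown'")        # 1
is_value = count("neighborhood IS 'Midtown'")       # 1: same as = for non-NULLs
eq_null = count("neighborhood = NULL")              # 0: never true
is_null = count("neighborhood IS NULL")             # 1
ne_value = count("neighborhood != 'Midtown'")       # 1: the NULL row is skipped
is_not_value = count("neighborhood IS NOT 'Midtown'")  # 2: the NULL row counts
```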
They work the same but "IS" is a keyword in MySQL and is generally used while comparing NULL values. While comparing NULL values "=" does not work.
SELECT * FROM nomnom WHERE neighborhood IS NULL
The above statement would run perfectly fine but
SELECT * FROM nomnom WHERE neighborhood = NULL
would return no rows at all (rather than raise an error), because neighborhood = NULL is never true.
They are the same for these cases, but further down the line you will discover one nifty little value called NULL.
NULL is a pain because... it doesn't exist.
0 = NULL returns FALSE;
Date <> [Column] will not return lines with NULL, only those with a value that is different.
Hell, even NULL = NULL returns false. And NULL <> NULL also returns false. That is why "IS" exists. Because NULL IS NULL will return true.
So as a general rule, use = for values.
Keep "IS" for null.
[Column] IS NULL
or
[Column] IS NOT NULL
And remember to always check whether your column is nullable, so that you can plan for null values in your WHERE or ON clauses.

Simple where clause condition involving NULL

I have a query that needs to exclude both Null and Blank Values, but for some reason I can't work out this simple logic in my head.
Currently, my code looks like this:
WHERE [Imported] = 0 AND ([Value] IS NOT NULL **OR** [Value] != '')
However, should my code look like this to exclude both condition:
WHERE [Imported] = 0 AND ([Value] IS NOT NULL **AND** [Value] != '')
For some reason I just can't sort this in my head properly. To me it seems like both would work.
In your question you wrote the following:
have a query that needs to exclude both Null and Blank Values
So you have answered yourself, the AND query is the right query:
WHERE [Imported] = 0 AND ([Value] IS NOT NULL AND [Value] != '')
Here is an extract from the ANSI SQL Draft 2003 that I borrowed from this question:
6.3.3.3 Rule evaluation order
[...]
Where the precedence is not determined by the Formats or by
parentheses, effective evaluation of expressions is generally
performed from left to right. However, it is
implementation-dependent whether expressions are actually evaluated left to right, particularly when operands or operators might
cause conditions to be raised or if the results of the expressions
can be determined without completely evaluating all parts of the
expression.
You don't specify what kind of database system you are using but the concept of short-circuit evaluation which is explained in the previous paragraph applies to all major SQL versions (T-SQL, PL/SQL etc...)
Short-circuit evaluation means that once an expression has been successfully evaluated it will immediately exit the condition and stop evaluating the other expressions, applied to your question:
If value is null you want to exit the condition, that's why it should be the first expression (from left to right) but if it isn't null it should also not be empty, so it has to be NOT NULL and NOT EMPTY.
This case is a bit tricky, because the OR condition does not do the same job. An empty string is not NULL, so [Value] IS NOT NULL is already true for it and the row is kept; blank values are therefore not excluded.
When the value is NULL, [Value] IS NOT NULL is false and [Value] != '' is still evaluated (value is not null or value is not an empty string).
In this second case no exception is raised: comparing NULL with '' yields UNKNOWN, FALSE OR UNKNOWN is UNKNOWN, and the row is filtered out.
So I think AND is the right answer. Hope it helps.
If the value was numeric and you didn't want either 1 or 2, you would write that condition as
... WHERE value != 1 AND value != 2
An OR would always be true in this case. For instance a value of 1 would return true for the check against 2 - and then the OR-check would return true, as at least one of the conditions evaluated to true.
When you also want to check against null values, the situation is a bit more complicated. A check against a null value always fails: value != '' is false when value is null. That is why there is a special IS NULL or IS NOT NULL test.
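The two candidate WHERE clauses from the question can be compared on a tiny data set. This is a sketch using Python's sqlite3 module rather than SQL Server, with a hypothetical table items whose columns imported and val stand in for [Imported] and [Value]. In SQLite, at least, the OR variant lets the empty string through, because an empty string is not NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (imported INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(0, "abc"), (0, ""), (0, None), (1, "xyz")],
)

def values_where(condition):
    """Return the val column of all rows matching the condition, sorted."""
    sql = f"SELECT val FROM items WHERE {condition} ORDER BY val"
    return [row[0] for row in conn.execute(sql)]

# AND: the row must be both non-NULL and non-blank.
with_and = values_where("imported = 0 AND (val IS NOT NULL AND val != '')")

# OR: any non-NULL value passes the first test, including ''.
with_or = values_where("imported = 0 AND (val IS NOT NULL OR val != '')")
```

Here with_and is ['abc'], while with_or is ['', 'abc'].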

why is null not equal to null false

I was reading this article:
Get null == null in SQL
And the consensus is that when trying to test equality between two (nullable) sql columns, the right approach is:
where ((A=B) OR (A IS NULL AND B IS NULL))
When A and B are NULL, (A=B) still returns FALSE, since NULL is not equal to NULL. That is why the extra check is required.
What about when testing inequalities? Following from the above discussion, it made me think that to test inequality I would need to do something like:
WHERE ((A <> B) OR (A IS NOT NULL AND B IS NULL) OR (A IS NULL AND B IS NOT NULL))
However, I noticed that that is not necessary (at least not on informix 11.5), and I can just do:
where (A<>B)
If A and B are NULL, this returns FALSE. If NULL is not equal to NULL, then shouldn't this return TRUE?
EDIT
These are all good answers, but I think my question was a little vague. Allow me to rephrase:
Given that either A or B can be NULL, is it enough to check their inequality with
where (A<>B)
Or do I need to explicitly check it like this:
WHERE ((A <> B) OR (A IS NOT NULL AND B IS NULL) OR (A IS NULL AND B IS NOT NULL))
REFER to this thread for the answer to this question.
Because that behavior follows established ternary logic where NULL is considered an unknown value.
If you think of NULL as unknown, it becomes much more intuitive:
Is unknown a equal to unknown b? There's no way to know, so: unknown.
relational expressions involving NULL actually yield NULL again
edit
here, <> stands for arbitrary binary operator, NULL is the SQL placeholder, and value is any value (NULL is not a value):
NULL <> value -> NULL
NULL <> NULL -> NULL
the logic is: NULL means "no value" or "unknown value", and thus any comparison with any actual value makes no sense.
is X = 42 true, false, or unknown, given that you don't know what value (if any) X holds? SQL says it's unknown. is X = Y true, false, or unknown, given that both are unknown? SQL says the result is unknown. and it says so for any binary relational operation, which is only logical (even if having NULLs in the model is not in the first place).
SQL also provides two unary postfix operators, IS NULL and IS NOT NULL, these return TRUE or FALSE according to their operand.
NULL IS NULL -> TRUE
NULL IS NOT NULL -> FALSE
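For the rephrased question (is A <> B enough when either side can be NULL?), a small experiment helps. The sketch below uses Python's sqlite3 module instead of Informix, and the table pairs is hypothetical. A plain <> misses the rows where exactly one side is NULL; the explicit form from the question catches them, as does SQLite's IS NOT operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(1, 1), (1, 2), (1, None), (None, None)],
)

def count(where):
    """Count the rows of pairs matching a WHERE condition."""
    return conn.execute(f"SELECT COUNT(*) FROM pairs WHERE {where}").fetchone()[0]

# Only (1, 2) qualifies: comparisons with NULL are unknown, not true.
plain = count("a <> b")

# (1, 2) and (1, NULL) qualify; (NULL, NULL) counts as "not different".
explicit = count(
    "(a <> b) OR (a IS NOT NULL AND b IS NULL) OR (a IS NULL AND b IS NOT NULL)"
)

# SQLite's NULL-safe inequality gives the same answer as the explicit form.
null_safe = count("a IS NOT b")
```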
All comparisons involving null are undefined, and evaluate to false. This idea, which is what prevents null being evaluated as equivalent to null, also prevents null being evaluated as NOT equivalent to null.
The short answer is... NULLs are weird, they don't really behave like you'd expect.
Here's a great paper on how NULLs work in SQL. I think it will help improve your understanding of the topic. I think the sections on handling null values in expressions will be especially useful for you.
http://www.oracle.com/technology/oramag/oracle/05-jul/o45sql.html
The default (ANSI) behaviour of nulls within an expression will result in a null (there are enough other answers with the cases of that).
There are however some edge cases and caveats that I would place when dealing with MS Sql Server that are not being listed.
Nulls within a statement that is grouping values together will be considered equal and be grouped together.
Null values within a statement that is ordering them will be considered equal.
Null values selected within a statement that is using distinct will be considered equal when evaluating the distinct aspect of the query
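The grouping and DISTINCT behaviour is not specific to SQL Server. As a sketch, the same can be observed with Python's sqlite3 module (SQLite is the assumption here, and the table t is hypothetical): NULLs collapse into one group, even though an equality join still refuses to match them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(None,), (None,), (1,)])

# DISTINCT treats the two NULLs as one value: NULL and 1.
distinct_values = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT x FROM t)"
).fetchone()[0]

# GROUP BY puts both NULLs into a single group.
group_sizes = dict(conn.execute("SELECT x, COUNT(*) FROM t GROUP BY x").fetchall())

# An equality join does not pair NULL with NULL; only 1 = 1 matches.
self_join_matches = conn.execute(
    "SELECT COUNT(*) FROM t AS l JOIN t AS r ON l.x = r.x"
).fetchone()[0]
```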
It is possible in SQL Server to override the expression logic regarding the specific Null = Null test, using the SET ANSI_NULLS OFF, which will then give you equality between null values - this is not a recommended move, but does exist.
SET ANSI_NULLS OFF
select result =
case
when null=null then 'eq'
else 'ne'
end
SET ANSI_NULLS ON
select result =
case
when null=null then 'eq'
else 'ne'
end
Here is a Quick Fix
ISNULL(A,0)=ISNULL(B,0)
0 can be changed to something that can never happen in your data
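The caveat about choosing the replacement value matters more than it may seem. In this sketch, COALESCE in SQLite stands in for T-SQL's ISNULL (an assumption, since the answer targets SQL Server), and the helper sentinel_equal is made up for illustration. If the sentinel can occur in real data, a NULL and a genuine value compare as equal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def sentinel_equal(a, b, sentinel=0):
    """Compare a and b after replacing NULLs with a sentinel value."""
    row = conn.execute(
        "SELECT COALESCE(?, ?) = COALESCE(?, ?)", (a, sentinel, b, sentinel)
    ).fetchone()
    return row[0]

both_null = sentinel_equal(None, None)             # 1: the intended effect
null_vs_five = sentinel_equal(None, 5)             # 0: correctly different
null_vs_zero = sentinel_equal(None, 0)             # 1: false match, 0 is real data
safer = sentinel_equal(None, 0, sentinel=-1)       # 0, if -1 never occurs in data
```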
"Is unknown a equal to unknown b? There's no way to know, so: unknown."
The question was : why does the comparison yield FALSE ?
Given three-valued logic, it would indeed be sensible for the comparison to yield UNKNOWN (not FALSE). But SQL does yield FALSE, and not UNKNOWN.
One of the myriads of perversities in the SQL language.
Furthermore, the following must be taken into account :
If "unknown" is a logical value in ternary logic, then an equality comparison between two logical values that both happen to be (the value for) "unknown" ought to yield TRUE.
If the logical value is itself unknown, then obviously that cannot be represented by putting the value "unknown" there, because that would imply that the logical value is known (to be "unknown"). That is, a.o., how relational theory proves that implementing 3-valued logic raises the requirement for a 4-valued logic, that a 4 valued logic leads to the need for a 5-valued logic, etc. etc. ad infinitum.