How to treat NULLs like '' (empty data) in comparison WHERE statements - sql

I know that if you compare a NULL value to something else, the NULL makes the whole comparison evaluate to NULL.
table.ColumnA = NULL
table.ColumnB = 'something'
If I do something like
where table.ColumnA <> table.ColumnB
It will not work.
I know you can do something like,
(NOT (a <> b OR a IS NULL OR b IS NULL) OR (a IS NULL AND b IS NULL))
But my query is already pretty long and complicated. Trying to add this routine for every column would be a nightmare.
I was hoping there was an easier way to just temporarily treat NULL as ''

The logic for a <> b with NULL values is:
where (a <> b or (a is null and b is not null) or (a is not null and b is null))
Unfortunately, SQL Server versions before 2022 don't support the standard IS DISTINCT FROM predicate, which would make this much simpler.
If there is a value that you know will never appear in the data, you can use it as a sentinel with coalesce():
where coalesce(a, '<null>') <> coalesce(b, '<null>')
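To see how the three forms behave, here is a minimal sketch that runs them against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here, and the table t, its id column and the sample rows are made up for illustration. The NULL comparison rules it exercises are standard SQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, None, "something"),  # one side NULL
        (2, "x", "x"),           # equal values
        (3, "x", "y"),           # different values
        (4, None, None),         # both NULL
    ],
)


def ids(where):
    """Return the ids of the rows matched by the given WHERE clause."""
    sql = "SELECT id FROM t WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# Plain <> silently drops every row where either side is NULL.
naive = ids("a <> b")
# Spelled-out logic: a NULL on exactly one side counts as "different".
explicit = ids(
    "a <> b OR (a IS NULL AND b IS NOT NULL) OR (a IS NOT NULL AND b IS NULL)"
)
# Sentinel version; only safe if '<null>' never occurs as real data.
sentinel = ids("COALESCE(a, '<null>') <> COALESCE(b, '<null>')")
# Treating NULL as '' (what the question asks for).
as_empty = ids("COALESCE(a, '') <> COALESCE(b, '')")

print(naive, explicit, sentinel, as_empty)
```

With this data the plain comparison finds only row 3, while the other three forms also find row 1. Row 4 (both NULL) is treated as "not different" by all of them.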

Be careful about blanket-replacing NULLs, but if you need a quick way to treat NULL as an empty string while preserving non-NULL values, you can use the coalesce function:
COALESCE(columnname, '')
Docs here:
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/coalesce-transact-sql?view=sql-server-ver15

Related

Concise way to count where two columns are non-null (in select clause)

Consider a table with two nullable columns a and b of any type, and some other arbitrary columns.
I can count cases where one column is not null with:
select count(a) from ...
I can count cases where either column is not null with:
select count(coalesce(a, b)) from ...
But the only way I've been able to figure out how to count cases where both columns are not null is the rather clunky:
select sum(iif(a is not null and b is not null, 1, 0)) from ...
Is there a more concise way to count if both are not null? If there's no general way, is there a way if both columns are int, or if both columns are nvarchar?
The reason I don't want to do it in a where clause, e.g.:
select count(*) from ... where a is not null and b is not null
Is that I'm selecting multiple counts from the same subquery at once:
select count(*)
,count(a)
,count(b)
,sum(iif(a is not null and b is not null, 1, 0))
from ...
And the other reason it needs to take this form is too long to explain here but basically boils down to this being part of a rather complicated query with a very specific structure related to performance.
This question is more out of curiosity, as sum(iif(...)) does work, I'm just wondering if there is something as concise as coalesce(a, b) for the and case.
This is SQL Server 2016, SP1.
In the special case where both columns are nvarchar, you could try
COUNT(a + b)
If the data type is integer, then use
Count(a/2 + b/2)
to avoid an overflow error.
Note: a + b is not null only when both a and b are not null.
Following @JasonC's suggestion, here is another solution for types that support bitwise operators:
Count(a & b)
A CASE expression wrapped in SUM should work well here:
SELECT SUM(CASE
WHEN a IS NULL OR b IS NULL
THEN 0
ELSE 1
END)
FROM ...
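As a quick check of the counting variants above, here is a small sketch using an in-memory SQLite database from Python. The table t and its rows are invented, both columns are integers, and SQLite stands in for SQL Server 2016. The COUNT(CASE ... END) column is one more variant added for comparison; it relies on COUNT ignoring the NULL that a CASE without ELSE produces.

```python
import sqlite3

# Made-up integer data; SQLite is used only as a convenient stand-in.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 2), (None, 2), (3, None), (None, None), (4, 5)],
)

counts = conn.execute(
    """
    SELECT COUNT(*),                 -- all rows
           COUNT(a),                 -- a is not null
           COUNT(b),                 -- b is not null
           COUNT(COALESCE(a, b)),    -- either is not null
           SUM(CASE WHEN a IS NULL OR b IS NULL THEN 0 ELSE 1 END),
           COUNT(CASE WHEN a IS NOT NULL AND b IS NOT NULL THEN 1 END),
           COUNT(a & b)              -- NULL if either operand is NULL
    FROM t
    """
).fetchone()

print(counts)
```

The last three columns all come out as 2 for this data, so which one to use is mostly a question of readability.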

Where with two AND

I'm trying to delete rows from a database that don't have a phone, a cellphone, and an email (all 3 missing together)
DELETE *
FROM TEST
WHERE (((TEST.EMAIL1) IS Null)
AND ((TEST.PHONE3) IS Null)
AND ((TEST.PHONE1) IS Null));
For some reason, if I use any two of the conditions (email1, phone1 or phone3) it works, but when I add the second 'AND' it stops working. Any advice, please?
Honestly the WHERE clause looks okay in this example, but the '*' should be left out:
DELETE FROM TEST
WHERE (((TEST.EMAIL1) IS Null)
AND ((TEST.PHONE3) IS Null)
AND ((TEST.PHONE1) IS Null));
If you have trouble after you delete the asterisk, re-copy-paste it back in, and we can see if there's a problem with parentheses or something. But the above bracketing looks okay (even if not necessary).
I'm not sure what is going on with your case but I've cleaned up your statement a bit, as you shouldn't need all those () as far as I know and DELETE doesn't require the * as it should delete anything that matches your criteria. Try it out and let me know!
DELETE
FROM TEST
WHERE EMAIL1 IS Null
AND PHONE3 IS Null
AND PHONE1 IS Null
DELETE *
FROM test
WHERE (((test.EMAIL1) ="") OR ((test.EMAIL1) IS NULL))
AND (((test.PHONE1) = "") OR ((test.phone1) IS NULL))
AND (((test.PHONE3) ="") OR ((test.phone3) IS NULL));
Worked for me. Thanks to all..
With all ANDs, the parentheses are not really necessary:
DELETE
FROM TEST
WHERE TEST.EMAIL1 IS Null
AND TEST.PHONE3 IS Null
AND TEST.PHONE1 IS Null;
However, ensure that the values you expect to get deleted actually contain NULL and not something like an empty string or the literal text 'null'.
You can check what information will be deleted by changing your statement to a select statement instead:
SELECT *
FROM TEST
WHERE TEST.EMAIL1 IS Null
AND TEST.PHONE3 IS Null
AND TEST.PHONE1 IS Null;
Based on your own answer to your question, you were running into empty strings, not nulls. Another way to write what you wrote and to avoid ORs would be:
SELECT *
FROM TEST
WHERE isnull(TEST.EMAIL1, '') = ''
AND isnull(TEST.PHONE3, '') = ''
AND isnull(TEST.PHONE1, '') = '';
In the above we're stating that any null TEST.EMAIL1 we encounter should be treated as an empty string, and then we check that the value is an empty string.
So basically: a row matches if all three of those fields are null OR an empty string. Same as your answer, just another way to write it.
You likely have no rows where all three are null. Check your data.
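Here is a minimal sketch of that "check your data first" advice, run against an in-memory SQLite database from Python. The ID column and the sample rows are invented, SQLite is only a stand-in for whatever engine the question uses, and COALESCE is used as the portable spelling of the two-argument isnull.

```python
import sqlite3

# Made-up sample data; ID is a hypothetical key column added for readability.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TEST (ID INTEGER, EMAIL1 TEXT, PHONE1 TEXT, PHONE3 TEXT)"
)
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?, ?)",
    [
        (1, None, None, None),           # all three NULL
        (2, "", "", ""),                 # all three empty strings
        (3, None, "", None),             # mix of NULL and empty
        (4, "a@example.com", None, ""),  # has an email, must survive
        (5, None, "555-0100", None),     # has a phone, must survive
    ],
)

NULL_ONLY = "EMAIL1 IS NULL AND PHONE1 IS NULL AND PHONE3 IS NULL"
NULL_OR_EMPTY = (
    "COALESCE(EMAIL1, '') = '' "
    "AND COALESCE(PHONE1, '') = '' "
    "AND COALESCE(PHONE3, '') = ''"
)


def ids(where):
    """Preview which rows a WHERE clause would hit, without deleting."""
    sql = "SELECT ID FROM TEST WHERE " + where + " ORDER BY ID"
    return [row[0] for row in conn.execute(sql)]


null_only = ids(NULL_ONLY)          # misses rows holding empty strings
null_or_empty = ids(NULL_OR_EMPTY)  # treats NULL and '' alike

# Only delete once the preview looks right.
deleted = conn.execute("DELETE FROM TEST WHERE " + NULL_OR_EMPTY).rowcount
remaining = [row[0] for row in conn.execute("SELECT ID FROM TEST ORDER BY ID")]

print(null_only, null_or_empty, deleted, remaining)
```

The IS NULL version matches only row 1, which is the same symptom as in the question: rows that look blank but actually hold empty strings are skipped.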

Returning varchars that are not null or empty SQL

I have a column in SQL that is varchar. I need it to return anything with a value.
Example...
select * from students where StudentID <> ''
Is this the correct way of doing it? I've tried is not null but then it returns anything that is empty as well.
Thanks
I would suggest using coalesce:
select * from students where coalesce(StudentID, '') <> ''
This will turn nulls into empty strings and disallow them. This has the added bonus of restricting empty strings as well.
A null is not equal to anything, not even another null, so a simple <> doesn't work.
select * from students where StudentID <> '' AND StudentID IS NOT NULL
This way you exclude both empty strings and nulls.
There's something called IS NOT NULL:
SELECT LastName,FirstName,Address FROM Persons WHERE Address IS NOT NULL
Does that help?
try this:
select * from students where StudentID is not null
You have to handle the not-null case separately because you cannot compare with null values. They are neither equal nor unequal to any other value (including null). NULL means "unknown value", so a comparison with any actual value cannot give a true or false answer.
....
WHERE StudentID IS NOT NULL AND StudentID <> ''
You could use ISNULL(StudentID,'') <> '' (or COALESCE), but I think the version above is more efficient.
Try using ISNULL()
select * from student where isnull(studentid,'') <> ''
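To make the difference between the filters in these answers concrete, here is a small sketch against an in-memory SQLite database, driven from Python. The sample StudentID values are invented and SQLite is only a stand-in; the behaviour shown follows from standard three-valued logic.

```python
import sqlite3

# Made-up data: two real ids, one empty string, one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (StudentID TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?)",
    [("S1",), ("",), (None,), ("S2",)],
)


def student_ids(where):
    """Return the StudentID values kept by the given WHERE clause."""
    sql = "SELECT StudentID FROM students WHERE " + where + " ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]


# IS NOT NULL alone keeps the empty string, as the question observed.
not_null = student_ids("StudentID IS NOT NULL")
# <> '' alone already drops NULLs: NULL <> '' is unknown, not true.
not_empty = student_ids("StudentID <> ''")
# The explicit and COALESCE forms return the same rows as <> ''.
explicit = student_ids("StudentID IS NOT NULL AND StudentID <> ''")
coalesced = student_ids("COALESCE(StudentID, '') <> ''")

print(not_null, not_empty, explicit, coalesced)
```

So the query in the question was already correct; the extra IS NOT NULL or COALESCE only makes the intent more visible.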

isnull vs is null

I have noticed that a number of queries at work and on SO use conditions of the form:
isnull(name,'') <> ''
Is there a particular reason why people do that and not the more terse
name is not null
Is it a legacy or a performance issue?
where isnull(name,'') <> ''
is equivalent to
where name is not null and name <> ''
which in turn is equivalent to
where name <> ''
(if name IS NULL that final expression would evaluate to unknown and the row not returned)
The use of the ISNULL pattern will result in a scan and is less efficient as can be seen in the below test.
SELECT ca.[name],
[number],
[type],
[low],
[high],
[status]
INTO TestTable
FROM [master].[dbo].[spt_values]
CROSS APPLY (SELECT [name]
UNION ALL
SELECT ''
UNION ALL
SELECT NULL) ca
CREATE NONCLUSTERED INDEX IX_TestTable ON dbo.TestTable(name)
GO
SELECT name FROM TestTable WHERE isnull(name,'') <> ''
SELECT name FROM TestTable WHERE name is not null and name <> ''
/*Can be simplified to just WHERE name <> '' */
Which should give you the execution plan you need.
is not null
Will only check if the field is not null. If the field contains an empty string, then the field is no longer null.
isnull(name, '') <> ''
Checks for both a null and an empty string.
isnull(name,'') <> :name is shorthand for (name is null or name <> :name) (assuming that :name never contains the empty string, thus why shorthands like this can be bad).
Performance-wise, it depends. or statements in where clauses can give extremely bad performance. However, functions on columns impair index usage. As usual: profile.
isnull(name,'') <> name
Well, I can see them using this because this way, if the name doesn't match or is null, the row comes back as a failed comparison. This really means: name is null or name <> name
Whereas this one, name is not null, just checks that the name is not null.
They don't mean the same thing.
name is not null
This checks for records where the name field is not null
isnull(name,'') <> name
This one changes the value of null fields to the empty string so they can be used in a comparison. In SQL Server (but not in Oracle, I think), if a value is null and it is used in an equality or inequality comparison, the row will not be considered, because null means "I don't know the value" and thus is not an actual value. So if you want to make sure the null records are considered when doing the comparison, you need ISNULL or COALESCE (which is the ANSI standard function to use, as ISNULL doesn't work in all databases).
What you should be looking at is the difference between
isnull(a.name,'') <> b.name
a.name <> b.name
then you will understand why the ISNULL is needed to get correct results.
I apparently misread your question. So let me strike my first answer and try this one:
isnull(name,'') <> ''
is a misguided shortcut for
name is not null and name <> ''
Others have pointed out the functional difference. As to the performance issue: Postgres has a function "coalesce" that is the equivalent of the "isnull" found in some other SQL dialects, and in Postgres I've found that saying
where coalesce(foobar,'')=''
is significantly faster than
where foobar is null or foobar=''
Also, it can be dramatically faster to say
where foobar>''
over
where foobar!=''
A greater than test can use the index and thus skip over all the blanks, while a not-equal test has to do a full file read. (Assuming you have an index on the field and no other index is used in preference.)
Also if you want to make use of the index on that column, use
name is not null and name <> ''
These two queries are not the same. For example, I do not have a middle name, this is a known fact, which can be stored as
MiddleName=''
However, if we don't know someone's middle name, we can store NULL.
So, ISNULL(MiddleName, '') = '' means "persons without a known middle name", which lumps the two cases together.
It is to handle both the empty string and NULL. While it is good to be able to do this with one expression, isnull is proprietary syntax. I would write this using portable Standard SQL as
NULLIF(name, '') IS NOT NULL
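The forms discussed in this thread can be compared side by side with a small sketch like the following, using an in-memory SQLite database from Python. The table people and its rows are hypothetical, and COALESCE stands in for the two-argument isnull, which SQLite does not have. This only checks which rows come back, not the execution plans.

```python
import sqlite3

# Hypothetical table: one real name, one empty string, one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany("INSERT INTO people VALUES (?)", [("Ann",), ("",), (None,)])

predicates = [
    "COALESCE(name, '') <> ''",         # the isnull(name,'') <> '' pattern
    "name IS NOT NULL AND name <> ''",  # explicit, index-friendly form
    "name <> ''",                       # simplified form
    "NULLIF(name, '') IS NOT NULL",     # portable NULLIF form
]

# Map each predicate to the list of names it keeps.
results = {
    p: [row[0] for row in conn.execute("SELECT name FROM people WHERE " + p)]
    for p in predicates
}

# For contrast: IS NOT NULL on its own also keeps the empty string.
not_null_only = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM people WHERE name IS NOT NULL ORDER BY rowid"
    )
]

print(results, not_null_only)
```

All four predicates keep only 'Ann', while name IS NOT NULL also keeps the empty string, which is the functional difference the answers describe.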

Any better ways to ascertain whether a column in a table is empty or not?

I have a table say T in SQL Server 2005 database and it has two columns say A and B, which more often than not won't have any values in them.
How to check whether A and B are empty (has all zero length strings) or not?
I have this naive way of doing it -
select count(*) as A_count from T where A <> ''
Let's assume A has data type varchar.
I was wondering whether I can get the same information using a system table, and if so would that be faster than this query?
cheers
Your method is essentially correct, although the wording in your question is imprecise. Does "empty" include NULL, or a string that contains only whitespace?
You could handle those cases with:
select count(*) as A_count from T where isnull(rtrim(ltrim(A)), '') <> ''
Also, make sure there is an index on column A.
If your column is nullable, you can make the handling of nulls explicit as follows:
select count(*) as A_count from T where COALESCE(A, '') <> ''
Note that A <> '' on its own already leaves nulls out of the count, because comparing NULL with '' evaluates to unknown.
If we are talking about zero-length strings, then this is the way that I would do it:
select count(*) as A_count from T where LEN(A) > 0
Keep in mind, however, that if A could be null, then these rows will not be caught by either LEN(A) > 0 or LEN(A) = 0, and you will have to wrap an ISNULL around A in that case.
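If both columns need checking, one possible sketch is to count the non-empty values of A and B in a single pass with COUNT(NULLIF(...)). It is shown here against an in-memory SQLite database from Python; the rows are invented, SQLite stands in for SQL Server 2005, and length() is SQLite's spelling of LEN. SQL Server's LEN ignores trailing spaces while SQLite's length() does not, so whitespace-only values are deliberately left out of the sample data.

```python
import sqlite3

# Made-up rows for the table T with columns A and B from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (A TEXT, B TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [("x", ""), ("", ""), (None, None), ("y", None)],
)

# NULLIF turns '' into NULL, and COUNT skips NULLs, so each column's
# count is the number of rows holding a non-empty value.
a_count, b_count = conn.execute(
    "SELECT COUNT(NULLIF(A, '')), COUNT(NULLIF(B, '')) FROM T"
).fetchone()

# The query from the question and the length-based variant, for comparison.
(a_naive,) = conn.execute("SELECT COUNT(*) FROM T WHERE A <> ''").fetchone()
(a_length,) = conn.execute(
    "SELECT COUNT(*) FROM T WHERE length(A) > 0"
).fetchone()

print(a_count, b_count, a_naive, a_length)
```

A count of zero means the column holds nothing but NULLs and empty strings, which is the case for B in this sample.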