Delete rows where many columns are null - SQL

I have a table with 34 columns.
The first 4 columns form the primary key (so they never contain NULLs), and the remaining 30 columns contain data as well as NULLs.
I wish to delete all the rows where the other 30 columns are null.
One way would be:
delete from my_table
where column_30 is null
and column_29 is null
and column_28 is null
and column_27 is null
...
...
...
Is there any easy way to do this without mentioning all 30 column names?

No, you can't avoid naming the columns. You could shorten the code a bit, at the cost of sargability, by using coalesce:
delete from my_table
where coalesce(column_30, column_29, column_28, ...) is null
As Damien wrote in his comment - this will only work if all columns are of compatible types, meaning SQL Server can implicitly convert between them.
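If the types are not implicitly compatible, one possible workaround (my sketch, not part of the original answer) is to cast each column to a common type such as sql_variant before coalescing; note that sql_variant cannot hold every type (xml and the (max) string types, for example):
-- Sketch: cast the columns so COALESCE accepts the mix of types.
delete from my_table
where coalesce(cast(column_30 as sql_variant),
               cast(column_29 as sql_variant),
               cast(column_28 as sql_variant)) is null   -- ...and so on for the remaining columns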

COALESCE receives a list of expressions and returns the first non-null one, or NULL if they are all null, so you can use it as a sort of shorthand:
DELETE FROM mytable
WHERE COALESCE(column_30, column_29, column_28, ...) IS NULL
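A minimal standalone illustration of that behaviour (my own example, not tied to the table above):
SELECT COALESCE(NULL, NULL, 3, 5);                      -- returns 3, the first non-null argument
SELECT COALESCE(CAST(NULL AS int), CAST(NULL AS int));  -- returns NULL, because every argument is null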

Another approach, assuming that all the columns have compatible data types:
delete t
from my_table t
where (select count(v.x)
       from (values (t.column_30),
                    (t.column_29),
                    (t.column_28),
                    (t.column_27)) v(x)) = 0
-- count(v.x) ignores NULLs, so the subquery returns 0 only when every listed column is NULL
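To sanity-check either variant before touching the real table, here is a cut-down sketch (hypothetical table with one key column and only four nullable data columns):
-- Hypothetical miniature of the 34-column table.
CREATE TABLE my_table_demo (id int PRIMARY KEY,
                            column_27 int, column_28 int, column_29 int, column_30 int);
INSERT INTO my_table_demo VALUES (1, NULL, NULL, NULL, NULL),   -- all data columns null: should go
                                 (2, 7,    NULL, NULL, NULL);   -- partly null: should stay

DELETE FROM my_table_demo
WHERE COALESCE(column_30, column_29, column_28, column_27) IS NULL;

SELECT * FROM my_table_demo;   -- only row 2 remains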

Related

Remove all NULL-valued rows from a table?

My SQL query is working fine, but I have another issue: some rows in my table have NULL values. I want to remove all NULL-valued rows. Any recommendations?
A DELETE statement should work:
DELETE FROM your_table
WHERE your_column IS NULL;
For a multi-column NULL check (all of the listed columns NULL), I suggest using COALESCE:
DELETE FROM your_table
WHERE COALESCE(your_column1, your_column2, your_column3) IS NULL;
If you want to delete rows where all column values are NULL, then use the following query:
DELETE FROM your_table
where your_column1 IS NULL
AND your_column2 IS NULL
AND your_column3 IS NULL;
But if you want to delete rows where any column value is NULL, then use the following query:
DELETE FROM your_table
where your_column1 IS NULL
OR your_column2 IS NULL
OR your_column3 IS NULL;
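A small demonstration of the difference between the two (hypothetical sample data):
-- Three rows: no NULLs, one NULL, all NULLs.
CREATE TABLE your_table (your_column1 int, your_column2 int, your_column3 int);
INSERT INTO your_table VALUES (1, 2, 3), (1, NULL, 3), (NULL, NULL, NULL);

DELETE FROM your_table
WHERE your_column1 IS NULL
  AND your_column2 IS NULL
  AND your_column3 IS NULL;       -- removes only the all-NULL row

SELECT COUNT(*) FROM your_table;  -- 2 rows left; the OR version would have left only 1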
You could start by leaving out the columns you don't need or won't use:
SELECT column1, column2, column3
FROM table
Then, to keep NULL values out of your results, use the IS NOT NULL predicate. Remember that, depending on how your tables and data formats have been set up, "missing" values may have been stored as zeros rather than NULLs. In the example below, assume column2 is a numeric column.
WHERE column1 IS NOT NULL AND column2 > 0 AND column3 IS NOT NULL
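Putting the two fragments together into one runnable query (hypothetical object names, with column2 assumed numeric):
SELECT column1, column2, column3
FROM some_table              -- "table" itself is a reserved word, so a concrete name is used here
WHERE column1 IS NOT NULL
  AND column2 > 0            -- also filters out NULLs, since NULL > 0 does not evaluate to true
  AND column3 IS NOT NULL;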

Concise way to count where two columns are non-null (in select clause)

Consider a table with two nullable columns a and b of any type, and some other arbitrary columns.
I can count cases where one column is not null with:
select count(a) from ...
I can count cases where either column is not null with:
select count(coalesce(a, b)) from ...
But the only way I've been able to figure out how to count cases where both columns are not null is the rather clunky:
select sum(iif(a is not null and b is not null, 1, 0)) from ...
Is there a more concise way to count if both are not null? If there's no general way, is there a way if both columns are int, or if both columns are nvarchar?
The reason I don't want to do it in a where clause, e.g.:
select count(*) from ... where a is not null and b is not null
Is that I'm selecting multiple counts from the same subquery at once:
select count(*)
,count(a)
,count(b)
,sum(iif(a is not null and b is not null, 1, 0))
from ...
And the other reason it needs to take this form is too long to explain here but basically boils down to this being part of a rather complicated query with a very specific structure related to performance.
This question is more out of curiosity, as sum(iif(...)) does work, I'm just wondering if there is something as concise as coalesce(a, b) for the and case.
This is SQL Server 2016, SP1.
In a special case, if both columns are nvarchar, you could try
COUNT(a + b)
If the data type is integer, then use
COUNT(a/2 + b/2)
to avoid overflow errors.
Note: a + b is non-null only when both a and b are non-null.
Following @JasonC's suggestion, here is another option, for types that support bitwise operators:
Count(a & b)
A CASE expression inside SUM should work well here:
SELECT SUM(CASE
               WHEN a IS NULL OR b IS NULL
               THEN 0
               ELSE 1
           END)
FROM ...
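For comparison, here is how the various counts line up over some made-up sample data (a sketch; the table name t and the values are hypothetical):
-- Two nullable columns, four rows covering every null/non-null combination.
CREATE TABLE t (a int, b int);
INSERT INTO t VALUES (1, 1), (1, NULL), (NULL, 2), (NULL, NULL);

SELECT COUNT(*)                                        AS total_rows,      -- 4
       COUNT(a)                                        AS a_not_null,      -- 2
       COUNT(b)                                        AS b_not_null,      -- 2
       COUNT(COALESCE(a, b))                           AS either_not_null, -- 3
       COUNT(a + b)                                    AS both_concise,    -- 1 (the a + b trick above)
       SUM(IIF(a IS NOT NULL AND b IS NOT NULL, 1, 0)) AS both_explicit    -- 1
FROM t;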

SQL Server inconsistent results over 2 columns using = and <>

I am trying to replace a manual process with a SQL Server (2012) based automated one. Prior to doing this, I need to analyse the data in question over time to produce some data quality measures/statistics.
Part of this entails comparing the values in two columns. I need to count where they match and where they do not so I can prove my varied stats tally. This should be simple but seems not to be.
Basically, I have a table containing two columns both of which are defined identically as type INT with null values permitted.
SELECT * FROM TABLE
WHERE COLUMN1 is NULL
returns zero rows
SELECT * FROM TABLE
WHERE COLUMN2 is NULL
also returns zero rows.
SELECT COUNT(*) FROM TABLE
returns 3780
and
SELECT * FROM TABLE
returns 3780 rows.
So I have established that there are 3780 rows in my table and that there are no NULL values in the columns I am interested in.
SELECT * FROM TABLE
WHERE COLUMN1=COLUMN2
returns zero rows as expected.
Conversely therefore in a table of 3780 rows, with no NULL values in the columns being compared, I expect the following SQL
SELECT * FROM TABLE
WHERE COLUMN1<>COLUMN2
or in desperation
SELECT * FROM TABLE
WHERE NOT (COLUMN1=COLUMN2)
to return 3780 rows but it doesn't. It returns 3709!
I have tried SELECT * instead of SELECT COUNT(*) in case NULL values in some other columns were impacting but this made no difference, I still got 3709 rows.
Also, there are some negative values in 73 rows for COLUMN1 - is this what causes the issue (but 73+3709=3782 not 3780 my number of rows)?
What is a better way of proving the values in these numeric columns never match?
Update 09/09/2016: At Lamak's suggestion below, I isolated the 71 missing rows and found that in each one COLUMN1 was NULL and COLUMN2 was -99. So the issue is NULL values, but why doesn't
SELECT * FROM TABLE WHERE COLUMN1 is NULL
pick them up? Here is the information in Information Schema Views and System Views:
ORDINAL_POSITION  COLUMN_NAME  DATA_TYPE  CHARACTER_MAXIMUM_LENGTH  IS_NULLABLE
1                 ID           int        NULL                      NO
..                ..           ..         ..                        ..
7                 COLUMN1      int        NULL                      YES
8                 COLUMN2      int        NULL                      YES

CONSTRAINT_NAME
PK__TABLE___...

name             type_desc  is_unique  is_primary_key
PK__TABLE___...  CLUSTERED  1          1
Suspect the CHARACTER_MAXIMUM_LENGTH of NULL must be the issue?
You can find the counts using the queries below, based on a LEFT JOIN.
--To find COLUMN1=COLUMN2 Count
--------------------------------
SELECT COUNT(T1.ID)
FROM TABLE T1
LEFT JOIN TABLE T2 ON T1.COLUMN1=T2.COLUMN2
WHERE t2.id is not null
--To find COLUMN1<>COLUMN2 Count
--------------------------------
SELECT COUNT(T1.ID)
FROM TABLE T1
LEFT JOIN TABLE T2 ON T1.COLUMN1=T2.COLUMN2
WHERE t2.id is null
Following the exhaustive comment chain above (all help gratefully received), I suspect this is a problem with the data types in the table creation script for the columns in question. I have no explanation, from an SQL code point of view, as to why "is NULL" failed to pick up the NULL values here.
I was able to identify the 71 rows that were not being picked up as expected by using an "except".
i.e. I flipped the SQL that was missing 71 rows, namely:
SELECT * FROM TABLE WHERE COLUMN1 <> COLUMN2
through an except:
SELECT * FROM TABLE
EXCEPT
SELECT * FROM TABLE WHERE COLUMN1 <> COLUMN2
Through that I could see that COLUMN1 was always NULL in the missing 71 rows - even though the "is NULL" was not picking them up for me when I ran
SELECT * FROM TABLE WHERE COLUMN1 IS NULL
which returned zero rows.
Regarding the comparison of values stored in the columns, as my data volumes are low (3780 records), I am just forcing the issue by using ISNULL to substitute 9999 (a numeric value I know my data will never contain) so that the comparison works.
SELECT * FROM TABLE
WHERE ISNULL(COLUMN1, 9999) <> COLUMN2
I then get the 3780 rows as expected. It's not ideal but it'll have to do and is more or less appropriate as there are null values in there so they have to be handled.
Also, using Bertrand's tip above, I could view the table creation script and the columns were definitely set up as INT.

Excluding a NULL value returns 0 rows in a subquery

I'm trying to clean up some data in SQL server and add a foreign key between the two tables.
I have a large quantity of orphaned rows in one of the tables that I would like to delete. I don't know why the following query would return 0 rows in MS SQL server.
--This Query returns no Rows
select * from tbl_A where ID not in ( select distinct ID from tbl_B
)
When I include IS NOT NULL in the subquery I get the results that I expect.
-- Rows are returned that contain all of the records in tbl_A but Not in tbl_B
select * from tbl_A where ID not in ( select distinct ID from tbl_B
where ID is not null )
The ID column is nullable and does contain NULL values. If I run just the subquery on its own, I get the exact same results, except that the first version returns one extra NULL row, as expected.
This is the expected behavior of a NOT IN subquery. When the subquery returns even a single NULL value, NOT IN will not match any rows.
If you don't want to add an explicit null check to the subquery, use NOT EXISTS instead:
select *
from tbl_A a
where not exists (select 1
                  from tbl_B b
                  where a.id = b.id)
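Since the goal is to delete the orphaned rows, the same pattern can drive the DELETE directly (a sketch using the question's table names):
delete a
from tbl_A a
where not exists (select 1
                  from tbl_B b
                  where b.ID = a.ID)
-- Note: unlike the NOT IN version, this also treats tbl_A rows whose ID is NULL as orphans.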
As to why the NOT IN is causing issues, here are some posts that discuss it:
NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL
NOT EXISTS vs NOT IN
What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL?
Matching on NULL with equals (=) yields UNKNOWN rather than true/false, from a logic standpoint. E.g. see http://msdn.microsoft.com/en-us/library/aa196339(v=sql.80).aspx for discussion.
If you also want to find NULL values in table A where there is no NULL in table B (if B is the "parent" and A is the "child" in the "foreign key" relationship you desire), then you need a second condition, something like the following. Also, I would recommend qualifying the ID field with a table prefix or alias, since the field names are the same in both tables. Finally, I would not recommend having NULL values as the key. But in any case:
select * from tbl_A as A where (A.ID not in ( select distinct B.ID from tbl_B as B ))
or (A.ID is NULL and not exists(select * from tbl_B as B where B.ID is null))
The problem is the non-comparability of nulls. If you are asking "not in" and there are nulls in the subquery, it cannot say that anything is definitely not in, because it treats those nulls as "unknown", and so the answer is always "unknown" in the three-valued logic that SQL uses.
Now of course that is all assuming you have ANSI_NULLS ON (which is the default). If you turn that off, then suddenly NULLs become comparable and it will give you results, probably the results you expect.
If the ids are never negative, you might consider something like:
select *
from tbl_A
where coalesce(ID, -1) not in ( select distinct coalesce(ID, -1) from tbl_B )
(Or if id is a string, use something like coalesce(id, '<null>').)
This may not work in all cases, but it has the virtue of simplicity on the coding level.
You probably have ANSI_NULLS switched off. With that setting, NULL values are comparable, so null = null returns true.
Prefix the first query with
SET ANSI_NULLS ON
GO

Mismatch not picked up when one value is NULL

I have a simple SQL query where a comparison is done between two tables for mismatching value.
Yesterday, we picked up an issue where one field was null and the other wasn't, but a mismatch was not detected.
As far as I can determine, the logic had been working all along until yesterday.
Here is the logic of the SQL:
CREATE TABLE Table1
(UserID INT,PlayDate DATETIME)
CREATE TABLE Table2
(UserID INT,PlayDate DATETIME)
INSERT INTO Table1 (UserID)
SELECT 5346
INSERT INTO Table2 (UserID,PlayDate)
SELECT 5346,'2012-11-01'
SELECT a.UserID
FROM Table1 a
INNER JOIN
Table2 b
ON a.UserID = b.UserID
WHERE a.PlayDate <> b.PlayDate
No values are returned even though the PlayDate values are different.
I have now updated the WHERE to read:
WHERE ISNULL(a.PlayDate,'') <> ISNULL(b.PlayDate,'')
Is there a setting in SQL which someone could have changed to cause the original code to no longer pick up the difference in fields?
Thanks
NULL <> anything
evaluates to unknown, not true. SQL uses three-valued logic (true/false/unknown), and a predicate needs to evaluate to true in a WHERE clause for the row to be returned.
In fact, in standard SQL any comparison with NULL, except for IS [NOT] NULL, yields unknown, including WHERE NULL = NULL.
You don't state your RDBMS, but if it supports IS DISTINCT FROM you could use that; if you are using MySQL, it has a null-safe equality operator <=> that you could negate.
You say you think it previously behaved differently. If you are on SQL Server, you might somehow be using a different setting for ANSI_NULLS, but that setting is deprecated and you should rewrite any code that depends on it anyway.
You can simulate IS DISTINCT FROM in SQL Server with WHERE EXISTS (SELECT a.PlayDate EXCEPT SELECT b.PlayDate)
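Applied to the tables from the question, that pattern becomes (a sketch; it returns UserID 5346 because a NULL PlayDate and '2012-11-01' are distinct under EXCEPT's set semantics):
SELECT a.UserID
FROM Table1 a
INNER JOIN Table2 b
    ON a.UserID = b.UserID
WHERE EXISTS (SELECT a.PlayDate
              EXCEPT
              SELECT b.PlayDate)
-- EXCEPT treats two NULLs as not distinct, so rows where both sides are NULL are not flagged.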
Not even a NULL can be equal to NULL.
Here are two common queries that just don’t work:
select * from table where column = null;
select * from table where column <> null;
There is no concept of equality or inequality, or of greater than or less than, with NULLs. Instead, one can only say “is” or “is not” (without the word “equal”) when discussing NULLs.
The correct way to write the queries:
select * from table where column IS NULL;
select * from table where column IS NOT NULL;