How to get data where the whole column is NOT NULL?


I am trying to pull data where a specific column is completely not null. It should only return the data if ALL of the rows in the column meet that requirement. Simply using IS NOT NULL will not work. In short, I am trying to find contracts where all of the products on that contract have been terminated, and to return only that data.
Here is what I have so far; it's bare-bones:
SELECT
T0.CustomerCode
, T0.CustomerName
, T1.ContractID
, T1.StartDate
, T1.TerminationDate
, T2.ProductRecordID
, T2.ProductSN
, T2.CompanySN
, T2.ProductRecordStatus
FROM
T0
INNER JOIN T1 ON T0.ContractID = T1.ContractID
INNER JOIN T2 ON T1.ProductRecordID = T2.ProductRecordID
WHERE T0.ProductRecordStatus = 'A'

One solution is to count the NULL values in the specified column and check whether that count is zero.
DECLARE @nullCount int = 1;
SELECT
@nullCount = COUNT(1)
FROM
[your_table]
WHERE
[your_column] IS NULL;
IF @nullCount = 0
SELECT
T0.CustomerCode , T0.CustomerName , T1.ContractID , T1.StartDate , T1.TerminationDate ,
T2.ProductRecordID , T2.ProductSN , T2.CompanySN , T2.ProductRecordStatus
FROM T0
INNER JOIN T1 ON T0.ContractID = T1.ContractID INNER JOIN T2 ON T1.ProductRecordID =
T2.ProductRecordID
WHERE T0.ProductRecordStatus = 'A';
This way, if the column contains one or more NULL values, the query is not executed at all.
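The check above looks at the table as a whole. If the requirement is per contract (return a contract only when every one of its products has a termination date), one possible approach is to compare COUNT(*) with COUNT(column) per group, since COUNT(column) skips NULLs. The sketch below runs that idea against an in-memory SQLite database from Python; the table contract_products and its columns are hypothetical stand-ins, not the asker's real schema.

```python
import sqlite3

# Hypothetical, simplified schema: one row per product on a contract.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract_products (
    contract_id      INTEGER,
    product_sn       TEXT,
    termination_date TEXT   -- NULL means the product is still active
);
INSERT INTO contract_products VALUES
    (1, 'A1', '2020-01-31'),
    (1, 'A2', '2020-02-15'),
    (2, 'B1', '2020-03-01'),
    (2, 'B2', NULL),
    (3, 'C1', NULL);
""")

# COUNT(*) counts every row; COUNT(termination_date) counts only non-NULL
# values. The two are equal only when no product on the contract is NULL.
fully_terminated = conn.execute("""
SELECT cp.contract_id, cp.product_sn, cp.termination_date
FROM contract_products AS cp
WHERE cp.contract_id IN (
    SELECT contract_id
    FROM contract_products
    GROUP BY contract_id
    HAVING COUNT(*) = COUNT(termination_date)
)
ORDER BY cp.contract_id, cp.product_sn
""").fetchall()

print(fully_terminated)
```

Only contract 1 is returned here, because contracts 2 and 3 each have at least one product without a termination date. The same GROUP BY ... HAVING pattern should carry over to SQL Server, but it has not been tested against the original tables.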

Related

Conditional update across multiple fields

I'm new to SQL, so I'm still learning all it has to offer.
I'm bringing in data from multiple sources and building a unique identifier table.
Several fields must be populated in order of precedence (i.e. the first data source is preferred, then the second, and so on).
Here is what I'm trying to do:
UPDATE
TABLE1 AS DIMTABLE
SET
FIRSTNAME = ifnull( FIRSTNAME, RAWTABLE.FIRSTNAME )
, DIMTABLE.MIDDLENAME = ifnull( DIMTABLE.MIDDLENAME, RAWTABLE.MIDDLENAME )
, DIMTABLE.LASTNAME = ifnull( DIMTABLE.LASTNAME, RAWTABLE.LASTNAME )
, DIMTABLE.GENDER = IFNULL( DIMTABLE.GENDER, RAWTABLE.GENDER )
, DIMTABLE.DOB = IFNULL( DIMTABLE.DOB, RAWTABLE.DOB )
, DIMTABLE.PHONE1 = IFNULL( DIMTABLE.PHONE1, RAWTABLE.PHONE1 )
, DIMTABLE.PHONE2 = IFNULL( DIMTABLE.PHONE2, RAWTABLE.PHONE2 )
, DIMTABLE.EMAIL = IFNULL( DIMTABLE.EMAIL, RAWTABLE.EMAIL )
, DIMTABLE.FAX = IFNULL( DIMTABLE.FAX, RAWTABLE.FAX )
FROM
TABLE2 AS RAWTABLE
WHERE
RAWTABLE.ID_SOURCE_ID = 9
AND DIMTABLE.UID = RAWTABLE.UID
I have a sequence of these statements, one for each RAWTABLE.ID_SOURCE_ID = 10, 11, 15...
The result is that the null fields I'm trying to update remain null.
I was hoping something like this was possible, to avoid multiple passes over the table.
I'm struggling with this approach, which usually means there must be a better way.

One option is to prepare the rows first, taking the first NOT NULL value in each column per UID:
CREATE OR REPLACE TEMPORARY TABLE RAWTABLE_SINGLE_UID
AS
SELECT DISTINCT
UID
,FIRST_VALUE(FIRSTNAME) IGNORE NULLS
OVER(PARTITION BY UID ORDER BY SOURCE_ID) AS FIRSTNAME
,FIRST_VALUE(LASTNAME) IGNORE NULLS
OVER(PARTITION BY UID ORDER BY SOURCE_ID) AS LASTNAME
,...
FROM TABLE2
WHERE SOURCE_ID IN (9,10,11,15);
Warning: QUALIFY with ROW_NUMBER() keeps the first entire row per UID; it does not guarantee the first non-null value per column:
CREATE OR REPLACE TEMPORARY TABLE RAWTABLE_SINGLE_UID
AS
SELECT *
FROM TABLE2
WHERE SOURCE_ID IN (9,10,11,15)
QUALIFY ROW_NUMBER() OVER(PARTITION BY UID ORDER BY SOURCE_ID) = 1
Then perform the update:
UPDATE TABLE1 AS DIMTABLE
SET FIRSTNAME = ifnull( FIRSTNAME, RAWTABLE.FIRSTNAME )
, DIMTABLE.MIDDLENAME = ifnull( DIMTABLE.MIDDLENAME, RAWTABLE.MIDDLENAME )
, DIMTABLE.LASTNAME = ifnull( DIMTABLE.LASTNAME, RAWTABLE.LASTNAME )
, DIMTABLE.GENDER = IFNULL( DIMTABLE.GENDER, RAWTABLE.GENDER )
, DIMTABLE.DOB = IFNULL( DIMTABLE.DOB, RAWTABLE.DOB )
, DIMTABLE.PHONE1 = IFNULL( DIMTABLE.PHONE1, RAWTABLE.PHONE1 )
, DIMTABLE.PHONE2 = IFNULL( DIMTABLE.PHONE2, RAWTABLE.PHONE2 )
, DIMTABLE.EMAIL = IFNULL( DIMTABLE.EMAIL, RAWTABLE.EMAIL )
, DIMTABLE.FAX = IFNULL( DIMTABLE.FAX, RAWTABLE.FAX )
FROM RAWTABLE_SINGLE_UID AS RAWTABLE
WHERE DIMTABLE.UID = RAWTABLE.UID;
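To make the precedence rule concrete, here is a small sketch of "keep the existing value, otherwise take the first non-null value by source order", run against SQLite from Python. It swaps in a correlated subquery for FIRST_VALUE ... IGNORE NULLS, which SQLite does not support, so it illustrates the logic rather than the Snowflake syntax. The tables dim and raw and their sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim (uid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE raw (uid INTEGER, source_id INTEGER, firstname TEXT, lastname TEXT);
INSERT INTO dim VALUES (1, NULL, NULL), (2, 'Kept', NULL);
INSERT INTO raw VALUES
    (1, 9,  NULL,  'Smith'),
    (1, 10, 'Ann', 'Smyth'),
    (2, 9,  'Bob', NULL),
    (2, 11, 'Rob', 'Jones');
""")

def first_non_null(column):
    # Scalar subquery: first non-NULL value of `column` for the current
    # dim row, with lower source_id taking precedence.
    # `column` only ever comes from the fixed tuple below, never user input.
    return (
        f"(SELECT r.{column} FROM raw AS r "
        f"WHERE r.uid = dim.uid AND r.{column} IS NOT NULL "
        f"ORDER BY r.source_id LIMIT 1)"
    )

columns = ("firstname", "lastname")
assignments = ", ".join(
    f"{c} = COALESCE({c}, {first_non_null(c)})" for c in columns
)

# A single pass over dim fills only the columns that are still NULL.
conn.execute(f"UPDATE dim SET {assignments}")

result = conn.execute(
    "SELECT uid, firstname, lastname FROM dim ORDER BY uid"
).fetchall()
print(result)
```

UID 1 picks up 'Ann' from source 10 (source 9 has no first name) but 'Smith' from source 9, which is the per-column behaviour that a single "first row per UID" would not give. UID 2 keeps its existing first name.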

If I understand correctly, you can use LEFT JOIN with COALESCE for this purpose:
UPDATE TABLE1 main
SET FIRSTNAME = COALESCE(T1.FIRSTNAME, t2.FIRSTNAME, t3.FIRSTNAME),
. . ..
FROM TABLE1 T1 LEFT JOIN
TABLE2 T2
ON T2.UID = T1.UID LEFT JOIN
TABLE3 t3
ON t3.UID = T1.UID
WHERE T1.ID_SOURCE_ID = 9 AND
T1.UID = MAIN.UID;
This is assuming the following:
You want to update ID_SOURCE_ID = 9 in the main table.
UID is a unique id in all the tables.
The secondary tables do not need to have all the UIDs.

SQL Server looping query

I made this view in SQL Server to combine the values of two records across multiple columns. The problem with this solution is that it needs a concat for every column in table2. Is it possible to do the concat part with a loop and a dynamic variable for the column numbers (the columns in table2 are named 1, 2, 3, 4, 5, ...)?
SELECT
dbo.table1.lot_id AS lot,
dbo.table1.hybird_id AS hybrid,
concat(
LEFT( (SELECT dbo.table2.[1] FROM dbo.table2 WHERE dbo.table2.parentals_id = dbo.table1.parental_male_id AND dbo.table2.lot_id = dbo.table1.lot_id) , 1),
LEFT( (SELECT dbo.table2.[1] FROM dbo.table2 WHERE dbo.table2.parentals_id = dbo.table1.parental_female_id AND dbo.table2.lot_id = dbo.table1.lot_id) , 1)
) AS '1',
--above concat x31 times more
FROM dbo.table2
INNER JOIN dbo.table1 ON dbo.table2.lot_id = dbo.table1.lot_id
GROUP BY dbo.table1.lot_id, dbo.table1.hybird_id,
dbo.table1.parental_male_id,
dbo.table1.parental_female_id
I tried a few things but nothing worked, any ideas?

Try simplifying it a bit, something like:
SELECT lot, hybrid, parental_male_id, parental_female_id
, concat(Left(m.[1],1), left(f.[1], 1)) AS [1]
--,..
FROM (
SELECT dbo.table1.lot_id AS lot
, dbo.table1.hybird_id AS hybrid
, dbo.table1.parental_male_id
, dbo.table1.parental_female_id
FROM dbo.table2
INNER JOIN dbo.table1 ON dbo.table2.lot_id = dbo.table1.lot_id
GROUP BY dbo.table1.lot_id, dbo.table1.hybird_id,
dbo.table1.parental_male_id,
dbo.table1.parental_female_id
) t
JOIN dbo.table2 m ON m.parentals_id = t.parental_male_id AND m.lot_id = t.lot
JOIN dbo.table2 f ON f.parentals_id = t.parental_female_id AND f.lot_id = t.lot
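On the "loop" part of the question: a view cannot run dynamic SQL, so one workaround is to generate the repetitive column list once with a small script and paste it into the view definition. The sketch below does that in Python. It assumes the m and f aliases from the query above and 32 numbered columns (inferred from "x31 times more"); the helper name build_concat_columns is made up for illustration.

```python
def build_concat_columns(n_columns):
    """Return the SELECT-list text for columns [1]..[n_columns].

    Each entry joins the first character of the male parent's value (alias m)
    with the first character of the female parent's value (alias f).
    """
    parts = [
        f"CONCAT(LEFT(m.[{i}], 1), LEFT(f.[{i}], 1)) AS [{i}]"
        for i in range(1, n_columns + 1)
    ]
    return "\n, ".join(parts)

# 32 is an assumption based on the question; adjust to the real column count.
column_list = build_concat_columns(32)
print(column_list)
```

The printed text replaces the hand-written concat lines in the SELECT list. If the number of columns changes often, the same generated string could be run through dynamic SQL in a stored procedure instead of a view.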

Correct way to join on 3 Tables

I am trying to pull data from 3 different tables and my result set is not what I'm expecting.
SELECT mdp.ReportDate
, mdp.PolicyNumber
, Company
, StateCode
, LOB
, mdp.AccountReference
, EffectiveDate
, EquityDate
, AccountBalance
, TermPremium
, DelinquentAmount
, PolicyStatus
, dcbpt.PolicyTermExtendedData
, TermsInDays
, dcba.AccountId
FROM Bil_MonthlyDelinquentPayments mdp
INNER JOIN DC_BIL_Account AS dcba
ON PolicyNumber = dcba.AccountReference
AND ReportDate = (
SELECT Max(ReportDate)
FROM Bil_MonthlyDelinquentPayments maxmdp
WHERE Year(maxmdp.ReportDate) = 2017
AND Month(maxmdp.ReportDate) = 01
)
LEFT JOIN DC_BIL_PolicyTerm AS dcbpt
ON dcba.AccountId = dcbpt.PrimaryAccountId
AND PolicyTermEffectiveDate = (
SELECT Max(PolicyTermEffectiveDate)
FROM DC_BIL_PolicyTerm
)
ORDER BY AccountId
In my result set the column dcbpt.PolicyTermExtendedData is being returned as a null value. This column contains data in the table and I would expect my result set to contain that data but it doesn't.

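That NULL comes from the second table in your LEFT JOIN. A LEFT JOIN returns every row from the first (left) table and, when it finds no match in the second table, pairs the left row with NULLs. Take a look at what you are matching on: the subquery SELECT Max(PolicyTermEffectiveDate) FROM DC_BIL_PolicyTerm is not correlated to the account, so only policy terms that carry the single latest effective date in the whole table can match.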
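A small reproduction may make the matching problem easier to see. The sketch below uses SQLite from Python with two hypothetical tables, account and policy_term, standing in for DC_BIL_Account and DC_BIL_PolicyTerm. It compares a LEFT JOIN against the table-wide maximum date with one against the maximum date per account; the correlated version is one possible fix, not necessarily the one the asker needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (account_id INTEGER PRIMARY KEY);
CREATE TABLE policy_term (
    primary_account_id INTEGER,
    effective_date     TEXT,
    extended_data      TEXT
);
INSERT INTO account VALUES (1), (2);
INSERT INTO policy_term VALUES
    (1, '2016-06-01', 'old-1'),
    (1, '2017-01-01', 'new-1'),
    (2, '2016-09-01', 'new-2');
""")

# Uncorrelated subquery: one global MAX, so account 2 never matches.
uncorrelated = conn.execute("""
SELECT a.account_id, pt.extended_data
FROM account AS a
LEFT JOIN policy_term AS pt
       ON a.account_id = pt.primary_account_id
      AND pt.effective_date = (SELECT MAX(effective_date) FROM policy_term)
ORDER BY a.account_id
""").fetchall()

# Correlated subquery: MAX is computed per account.
correlated = conn.execute("""
SELECT a.account_id, pt.extended_data
FROM account AS a
LEFT JOIN policy_term AS pt
       ON a.account_id = pt.primary_account_id
      AND pt.effective_date = (
            SELECT MAX(p2.effective_date)
            FROM policy_term AS p2
            WHERE p2.primary_account_id = a.account_id)
ORDER BY a.account_id
""").fetchall()

print(uncorrelated)
print(correlated)
```

With the global maximum, account 2 comes back with NULL even though it has a policy term; with the per-account maximum, both accounts get their latest term.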

Table alias name scope in sub-select query

Please have a look at the query below. I am getting "invalid identifier" for t1.oid in the inner query.
The column oid does exist in iclr_request (aliased as t1).
select t1.requestNo
, t2.routeDistance
, (
select WM_CONCAT(crc7) as "TravCirc7s"
from (
select (
select crc7
from dim_afi_dnld_stn_v1
where stn_sys_nbr = t3.stn_sys_nbr
and rownum=1
) as crc7
from iclr_trav_circ7 t3
where request_oid = t1.oid
and sub_route_index=0
and station_type_oid = 1
order by sequence
)
)
from iclr_request t1
, iclr_summary_results t2
where t1.oid = t2.request_oid

You can try this:
select t1.requestNo , t2.routeDistance,
WM_CONCAT((select crc7 from dim_afi_dnld_stn_v1 where stn_sys_nbr = t3.stn_sys_nbr and rownum=1)) as "TravCirc7s"
from iclr_request t1
join iclr_summary_results t2 on t1.oid = t2.request_oid
left join iclr_trav_circ7 t3 on t3.request_oid = t1.oid
and t3.sub_route_index=0
and t3.station_type_oid = 1
group by t1.requestNo , t2.routeDistance;
Correlated subqueries may refer to their parent queries only one level up (although some Oracle documentation says the depth is unlimited).
EDIT: This does not preserve the ORDER BY sequence in WM_CONCAT. You may need to wrap it in a parent query and then apply WM_CONCAT.

Finding Not null on multiple columns using COALESCE

I have a query that finds rows where two columns are not null. The table is actually a view, so the query takes a long time to execute.
The query (Query1) is:
SELECT [Table1].M, [Table1].B, [Table1].P
FROM [Table1]
WHERE ((([Table1].B) Is Not Null) AND (([Table1].P) Is Not Null));
Does the query below do the same thing as Query1, with a faster execution time?
SELECT [Table1].M, [Table1].B, [Table1].P
FROM [Table1]
WHERE COALESCE (([Table1].B),([Table1].P)) Is Not Null
Any help would be appreciated. Thanks in advance.
The view query
select dbo.TABLE1.[COL1]
, dbo.TABLE1.[COL2]
, RIGHT(dbo.TABLE1.M, 12) as M
, dbo.TABLE2.[MD]
, dbo.TABLE1.[COL3]
, dbo.TABLE1.[COL4]
, dbo.TABLE3.COL1
, dbo.TABLE3.[COL2]
, dbo.TABLE3.[COL3]
, dbo.TABLE4.[COL1]
, dbo.TABLE5.[COL1]
, dbo.TABLE6.[COL1]
, dbo.TABLE7.[COL1] as [BA]
, dbo.TABLE8.[COL1]
, dbo.TABLE3.[COL4]
, dbo.TABLE3.[COL5]
, dbo.TABLE3.[COL6]
from dbo.TABLE1
left outer join dbo.TABLE2
on dbo.TABLE1.M = dbo.TABLE2.M
left outer join dbo.TABLE3
on dbo.TABLE1.M = dbo.TABLE3.M
left outer join dbo.TABLE5
on dbo.TABLE3.[OBJ_NR] = dbo.TABLE5.OBJ
left outer join dbo.TABLE6
on dbo.TABLE3.[OBJ_NR] = dbo.TABLE6.OBJ
left outer join dbo.TABLE7
on dbo.TABLE3.[OBJ_NR] = dbo.TABLE7.OBJ
left outer join dbo.TABLE4
on dbo.TABLE3.[OBJ_NR] = dbo.TABLE4.OBJ
left outer join dbo.TABLE8
on dbo.TABLE3.[OBJ_NR] = dbo.TABLE8.OBJ
where (
(
dbo.TABLE1.[COL1] not in (
'XX'
, 'YY'
)
)
and (dbo.TABLE1.COL5 = 'x')
)

No, the two queries aren't equivalent.
The WHERE clause in the second one is equivalent to
WHERE [Table1].B Is Not Null OR [Table1].P Is Not Null
COALESCE evaluates the first parameter and returns it if it is not null. Otherwise, it returns the second one if that is not null, and so on, until it reaches the last parameter, which is returned whatever its value. So COALESCE(...) IS NOT NULL needs only one non-null value to be true, not all of them.
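The difference is easy to check on a four-row table that covers every NULL combination. The sketch below uses SQLite from Python; the table t and its lowercase columns are a hypothetical stand-in for the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (m TEXT, b TEXT, p TEXT);
INSERT INTO t VALUES
    ('both',    'x',  'y'),
    ('only_b',  'x',  NULL),
    ('only_p',  NULL, 'y'),
    ('neither', NULL, NULL);
""")

def select_m(where_clause):
    # where_clause is one of the fixed strings below, never user input.
    query = f"SELECT m FROM t WHERE {where_clause} ORDER BY m"
    return [row[0] for row in conn.execute(query)]

and_rows = select_m("b IS NOT NULL AND p IS NOT NULL")
or_rows = select_m("b IS NOT NULL OR p IS NOT NULL")
coalesce_rows = select_m("COALESCE(b, p) IS NOT NULL")

print(and_rows)
print(or_rows)
print(coalesce_rows)
```

The AND filter returns only the row where both columns are filled, while the COALESCE filter returns the same three rows as the OR filter.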

I've tried this out on a table in my development DB. Here are the results:
With only the PK index: 2 minutes to select 4 million records out of an 8 million row table.
With an index on the 3 selected columns (none of them the PK): 1.8 seconds.
You might need to do some testing to find the right indexes for your setup, but here is a sample of what I changed:
select [col1]
, [col2]
, [col3]
from [dbo].[tbl]
where col2 is not null
and col3 is not null
create nonclustered index [idx_test] on [dbo].[tbl] (
[col2] asc
, [col3] asc
) INCLUDE ([col1])
