fetch most recent non null record - sql

I'm trying to fetch the most recent record and find a non-NULL match. The problem is my subquery returns more than one result.
Data set
| ID | DD | SIG_ID | DCRP |
----------------------------------------
| 1 | 2010-06-01 | 1 | Expert |
| 2 | 2010-09-01 | 1 | Expert |
| 3 | 2010-12-01 | 1 | Expert |
| 4 | 2010-12-01 | 1 | Expert II |
| 5 | 2011-03-01 | 1 | Expert II |
| 6 | 2011-06-01 | 1 | (null) |
| 7 | 2010-06-01 | 2 | Senior |
| 8 | 2010-09-01 | 2 | Senior |
| 9 | 2010-09-01 | 2 | Senior |
| 10 | 2010-12-01 | 2 | Senior II |
| 11 | 2011-03-01 | 2 | (null) |
| 12 | 2011-03-01 | 2 | Senior |
| 13 | 2010-06-01 | 3 | (null) |
| 14 | 2010-09-01 | 3 | (null) |
| 15 | 2010-12-01 | 3 | (null) |
Query
SELECT a.sig_id, a.id,
CASE
WHEN b.dcrp IS NULL
THEN
(SELECT dcrp
FROM tbl
WHERE sig_id = a.sig_id
AND id < a.id
AND dcrp IS NOT NULL)
ELSE b.dcrp
END AS dcrp
FROM
(SELECT sig_id, MAX(id) id
FROM tbl
GROUP BY sig_id) a
LEFT JOIN
(SELECT id, dcrp
FROM tbl
WHERE dcrp IS NOT NULL) b ON b.id = a.id
Desired result
Fetch the most recent dcrp for each sig_id:
| ID | DD | SIG_ID | DCRP |
----------------------------------------
| 5 | 2011-03-01 | 1 | Expert II |
| 12 | 2011-03-01 | 2 | Senior |
| 15 | 2010-12-01 | 3 | (null) |
SQL Fiddle

You can use the following:
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY SIG_ID
ORDER BY CASE WHEN DCRP IS NOT NULL THEN 0 ELSE 1 END,
DD DESC) RN
FROM tbl
)
SELECT *
FROM CTE
WHERE RN = 1
And the fiddle.
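To try the ROW_NUMBER() idea without a SQL Server instance, here is a minimal sketch that runs the same ordering logic through Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer (for window functions) and adds `id DESC` as a tie-breaker, which is not in the answer above; the data is the question's sample.

```python
import sqlite3

# Sample rows from the question: (id, dd, sig_id, dcrp)
rows = [
    (1, "2010-06-01", 1, "Expert"),
    (2, "2010-09-01", 1, "Expert"),
    (3, "2010-12-01", 1, "Expert"),
    (4, "2010-12-01", 1, "Expert II"),
    (5, "2011-03-01", 1, "Expert II"),
    (6, "2011-06-01", 1, None),
    (7, "2010-06-01", 2, "Senior"),
    (8, "2010-09-01", 2, "Senior"),
    (9, "2010-09-01", 2, "Senior"),
    (10, "2010-12-01", 2, "Senior II"),
    (11, "2011-03-01", 2, None),
    (12, "2011-03-01", 2, "Senior"),
    (13, "2010-06-01", 3, None),
    (14, "2010-09-01", 3, None),
    (15, "2010-12-01", 3, None),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, dd TEXT, sig_id INTEGER, dcrp TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)

# Non-NULL dcrp rows sort first, then newest dd, then highest id (assumed tie-breaker).
query = """
WITH cte AS (
    SELECT id, dd, sig_id, dcrp,
           ROW_NUMBER() OVER (
               PARTITION BY sig_id
               ORDER BY CASE WHEN dcrp IS NOT NULL THEN 0 ELSE 1 END,
                        dd DESC,
                        id DESC
           ) AS rn
    FROM tbl
)
SELECT id, dd, sig_id, dcrp
FROM cte
WHERE rn = 1
ORDER BY sig_id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

For sig_id 3, where every dcrp is NULL, the CASE term is the same for all rows, so the newest row is still returned.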

;with si as (
select distinct sig_id from tbl
)
select *
from si
cross apply (select top 1 *
from tbl
where si.sig_id = tbl.sig_id
order by case when dcrp is null then 1 else 0 end asc, dd desc) sii
and with fiddler :
http://sqlfiddle.com/#!3/8e267/2/0

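The query in SQLFiddle fails because the subquery returns more than one row.
Adding TOP 1 fixes that; an ORDER BY id DESC makes sure the row picked is the most recent non-NULL one rather than an arbitrary one. Please check if it is OK.
THEN
(SELECT TOP 1 dcrp
FROM tbl
WHERE sig_id = a.sig_id
AND id < a.id
AND dcrp IS NOT NULL
ORDER BY id DESC)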

Related

Replace nulls of a column with column value from another table

I have data flowing from two tables, table A and table B. I'm doing an inner join on a common column from both the tables and creating two more new columns based on different conditions. Below is a sample dataset:
Table A
| Id | StartDate |
|-----|------------|
| 119 | 01-01-2018 |
| 120 | 01-02-2019 |
| 121 | 03-05-2018 |
| 123 | 05-08-2021 |
TABLE B
| Id | CodeId | Code | RedemptionDate |
|-----|--------|------|----------------|
| 119 | 1 | abc | null |
| 119 | 2 | abc | null |
| 119 | 3 | def | null |
| 119 | 4 | def | 2/3/2019 |
| 120 | 5 | ghi | 04/7/2018 |
| 120 | 6 | ghi | 4/5/2018 |
| 121 | 7 | jkl | null |
| 121 | 8 | jkl | 4/4/2019 |
| 121 | 9 | mno | 3/18/2020 |
| 123 | 10 | pqr | null |
What I'm basically doing is joining the tables on column 'Id' where the year of StartDate is 2018 or later, and creating two new columns: 'Unlock', by counting CodeId when RedemptionDate is null, and 'Redeem', by counting CodeId when RedemptionDate is not null. Below is the SQL query:
WITH cte1 AS (
SELECT a.id, COUNT(b.CodeId) AS 'Unlock'
FROM TableA AS a
JOIN TableB AS b ON a.Id=b.Id
WHERE YEAR(a.StartDate) >= 2018 AND b.RedemptionDate IS NULL
GROUP BY a.id
), cte2 AS (
SELECT a.id, COUNT(b.CodeId) AS 'Redeem'
FROM TableA AS a
JOIN TableB AS b ON a.Id=b.Id
WHERE YEAR(a.StartDate) >= 2018 AND b.RedemptionDate IS NOT NULL
GROUP BY a.id
)
SELECT cte1.Id, cte1.Unlock, cte2.Redeem
FROM cte1
FULL OUTER JOIN cte2 ON cte1.Id = cte2.Id
If I break down the output of this query, result from cte1 will look like below:
| Id | Unlock |
|-----|--------|
| 119 | 3 |
| 121 | 1 |
| 123 | 1 |
And from cte2 will look like below:
| Id | Redeem |
|-----|--------|
| 119 | 1 |
| 120 | 2 |
| 121 | 2 |
The last select query will produce the following result:
| Id | Unlock | Redeem |
|------|--------|--------|
| 119 | 3 | 1 |
| null | null | 2 |
| 121 | 1 | 2 |
| 123 | 1 | null |
How can I replace the null value from Id with values from 'b.Id'? If I try coalesce or a case statement, they create new columns. I don't want to create additional columns, rather replace the null values from the column values coming from another table.
My final output should look like:
| Id | Unlock | Redeem |
|-----|--------|--------|
| 119 | 3 | 1 |
| 120 | null | 2 |
| 121 | 1 | 2 |
| 123 | 1 | null |
If I'm following correctly, you can use apply with aggregation:
select a.*, b.*
from a cross apply
(select count(RedemptionDate) as num_redeemed,
count(*) - count(RedemptionDate) as num_unlock
from b
where b.id = a.id
) b;
However, the answer to your question is to use coalesce(cte1.id, cte2.id) as id.
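The aggregation idea from the answer (counting RedemptionDate, which skips NULLs, and subtracting it from the total) can be checked with a small sketch. This one uses Python's built-in sqlite3 module rather than SQL Server, so CROSS APPLY is swapped for a LEFT JOIN with GROUP BY. The ISO date strings are my assumption about how the sample dates should be read; they do not affect the counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (Id INTEGER, StartDate TEXT);
CREATE TABLE TableB (Id INTEGER, CodeId INTEGER, Code TEXT, RedemptionDate TEXT);

-- Dates rewritten as ISO strings (assumed month/day order).
INSERT INTO TableA VALUES
    (119, '2018-01-01'), (120, '2019-01-02'),
    (121, '2018-03-05'), (123, '2021-05-08');

INSERT INTO TableB VALUES
    (119, 1, 'abc', NULL),
    (119, 2, 'abc', NULL),
    (119, 3, 'def', NULL),
    (119, 4, 'def', '2019-02-03'),
    (120, 5, 'ghi', '2018-04-07'),
    (120, 6, 'ghi', '2018-04-05'),
    (121, 7, 'jkl', NULL),
    (121, 8, 'jkl', '2019-04-04'),
    (121, 9, 'mno', '2020-03-18'),
    (123, 10, 'pqr', NULL);
""")

# COUNT(col) ignores NULLs, so one pass over the join yields both counts,
# and Id always comes from TableA, so it is never NULL.
query = """
SELECT a.Id,
       COUNT(b.CodeId) - COUNT(b.RedemptionDate) AS Unlock,
       COUNT(b.RedemptionDate)                   AS Redeem
FROM TableA AS a
LEFT JOIN TableB AS b ON b.Id = a.Id
WHERE CAST(strftime('%Y', a.StartDate) AS INTEGER) >= 2018
GROUP BY a.Id
ORDER BY a.Id
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

Unlike the desired output in the question, this form reports 0 rather than null when there is nothing to count; wrapping a count in NULLIF(..., 0) would restore the nulls if they matter.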

Get the Id of the matched data from other table. No duplicates of ID from both tables

Here is my table A.
| Id | GroupId | StoreId | Amount |
| 1 | 20 | 7 | 15000 |
| 2 | 20 | 7 | 1230 |
| 3 | 20 | 7 | 14230 |
| 4 | 20 | 7 | 9540 |
| 5 | 20 | 7 | 24230 |
| 6 | 20 | 7 | 1230 |
| 7 | 20 | 7 | 1230 |
Here is my table B.
| Id | GroupId | StoreId | Credit |
| 12 | 20 | 7 | 1230 |
| 14 | 20 | 7 | 15000 |
| 15 | 20 | 7 | 14230 |
| 16 | 20 | 7 | 1230 |
| 17 | 20 | 7 | 7004 |
| 18 | 20 | 7 | 65523 |
I want to get this result without getting duplicate Id of both table.
I need to get the Id of table B and A where the Amount = Credit.
| A.ID | B.ID | Amount |
| 1 | 14 | 15000 |
| 2 | 12 | 1230 |
| 3 | 15 | 14230 |
| 4 | null | 9540 |
| 5 | null | 24230 |
| 6 | 16 | 1230 |
| 7 | null | 1230 |
My problem is that when table A has two or more rows with the same Amount, I get duplicate IDs from table B where the extra rows should show null instead. Please help me. Thank you.
I think you want a left join. But this is tricky because you have duplicate amounts, but you only want one to match. The solution is to use row_number():
select . . .
from (select a.*, row_number() over (partition by amount order by id) as seqnum
from a
) a left join
(select b.*, row_number() over (partition by credit order by id) as seqnum
from b
)b
on a.amount = b.credit and a.seqnum = b.seqnum;
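To see how the sequence numbers pair up duplicate amounts one-to-one, here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25+ for ROW_NUMBER()). The table names `table_a` and `table_b` and the output aliases are placeholders of mine; the rows are the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, group_id INTEGER, store_id INTEGER, amount INTEGER)")
conn.execute("CREATE TABLE table_b (id INTEGER, group_id INTEGER, store_id INTEGER, credit INTEGER)")
conn.executemany("INSERT INTO table_a VALUES (?, 20, 7, ?)", [
    (1, 15000), (2, 1230), (3, 14230), (4, 9540),
    (5, 24230), (6, 1230), (7, 1230),
])
conn.executemany("INSERT INTO table_b VALUES (?, 20, 7, ?)", [
    (12, 1230), (14, 15000), (15, 14230),
    (16, 1230), (17, 7004), (18, 65523),
])

# Number the rows within each amount on both sides, then match the n-th
# occurrence in A with the n-th occurrence in B. Extra A rows get NULL.
query = """
SELECT a.id AS a_id, b.id AS b_id, a.amount
FROM (SELECT id, amount,
             ROW_NUMBER() OVER (PARTITION BY amount ORDER BY id) AS seqnum
      FROM table_a) AS a
LEFT JOIN
     (SELECT id, credit,
             ROW_NUMBER() OVER (PARTITION BY credit ORDER BY id) AS seqnum
      FROM table_b) AS b
  ON a.amount = b.credit AND a.seqnum = b.seqnum
ORDER BY a.id
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

The three A rows with amount 1230 get sequence numbers 1, 2, 3, while B only has 1 and 2 for that credit, so the third A row is left unmatched.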
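Another approach, I think simpler and shorter :) Note that it can return the same B.ID for several A rows with an equal Amount, so by itself it does not avoid duplicates: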
select ID [A.ID],
(select top 1 ID from TABLE_B where Credit = A.Amount) [B.ID],
Amount
from TABLE_A [A]

Update using Self Join Sql Server

I have huge data and sample of the table looks like below
+-----------+------------+-----------+-----------+
| Unique_ID | Date | RowNumber | Flag_Date |
+-----------+------------+-----------+-----------+
| 1 | 6/3/2014 | 1 | 6/3/2014 |
| 1 | 5/22/2015 | 2 | NULL |
| 1 | 6/3/2015 | 3 | NULL |
| 1 | 11/20/2015 | 4 | NULL |
| 2 | 2/25/2014 | 1 | 2/25/2014 |
| 2 | 7/31/2014 | 2 | NULL |
| 2 | 8/26/2014 | 3 | NULL |
+-----------+------------+-----------+-----------+
Now I need to check the difference between Date in the 2nd row and Flag_Date in the 1st row. If the difference is more than 180 days, the 2nd row's Flag_Date should be set to the 2nd row's Date; otherwise it should be set to the 1st row's Flag_Date. The same rule applies to each following row with the same Unique_ID.
update a
set a.Flag_Date=case when DATEDIFF(dd,b.Flag_Date,a.[Date])>180 then a.[Date] else b.Flag_Date end
from Table1 a
inner join Table1 b
on a.RowNumber=b.RowNumber+1 and a.Unique_ID=b.Unique_ID
When the above update query is executed once, only the second row under each Unique_ID gets updated, and the result looks like below
+-----------+------------+-----------+------------+
| Unique_ID | Date | RowNumber | Flag_Date |
+-----------+------------+-----------+------------+
| 1 | 2014-06-03 | 1 | 2014-06-03 |
| 1 | 2015-05-22 | 2 | 2015-05-22 |
| 1 | 2015-06-03 | 3 | NULL |
| 1 | 2015-11-20 | 4 | NULL |
| 2 | 2014-02-25 | 1 | 2014-02-25 |
| 2 | 2014-07-31 | 2 | 2014-02-25 |
| 2 | 2014-08-26 | 3 | NULL |
+-----------+------------+-----------+------------+
And I need to run four times to achieve my desired result
+-----------+------------+-----------+------------+
| Unique_ID | Date | RowNumber | Flag_Date |
+-----------+------------+-----------+------------+
| 1 | 2014-06-03 | 1 | 2014-06-03 |
| 1 | 2015-05-22 | 2 | 2015-05-22 |
| 1 | 2015-06-03 | 3 | 2015-05-22 |
| 1 | 2015-11-20 | 4 | 2015-11-20 |
| 2 | 2014-02-25 | 1 | 2014-02-25 |
| 2 | 2014-07-31 | 2 | 2014-02-25 |
| 2 | 2014-08-26 | 3 | 2014-08-26 |
+-----------+------------+-----------+------------+
Is there a way to run the update only once and have all the rows updated?
Thank you!
If you are using SQL Server 2012+, then you can use lag():
with toupdate as (
select t1.*,
lag(flag_date) over (partition by unique_id order by rownumber) as prev_flag_date
from table1 t1
)
update toupdate
set Flag_Date = (case when DATEDIFF(day, prev_Flag_Date, toupdate.[Date]) > 180
then toupdate.[Date] else prev_Flag_Date
end);
Both this version and your version can take advantage of an index on table1(unique_id, rownumber) or, better yet, table1(unique_id, rownumber, flag_date).
EDIT:
In earlier versions, this might have better performance:
with toupdate as (
select t1.*, t2.flag_date as prev_flag_date
from table1 t1 outer apply
(select top 1 t2.flag_date
from table1 t2
where t2.unique_id = t1.unique_id and
t2.rownumber < t1.rownumber
order by t2.rownumber desc
) t2
)
update toupdate
set Flag_Date = (case when DATEDIFF(day, prev_Flag_Date, toupdate.[Date]) > 180
then toupdate.[Date] else prev_Flag_Date
end);
The CTE can make use of the same index, and it is important to have that index. The performance is better because your join on row_number() cannot use an index on that field.
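One caveat worth checking before relying on a single pass: within one UPDATE statement, the previous row's Flag_Date is read as it was before the statement ran, so rows whose predecessor still holds NULL may not cascade. A recursive CTE carries the computed flag forward row by row instead. The sketch below is not from the answers above; it shows the idea with Python's built-in sqlite3 module. The column name `dt` (standing in for `[Date]`) and `julianday()` (standing in for DATEDIFF) are SQLite-side substitutions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (unique_id INTEGER, dt TEXT, rownumber INTEGER, flag_date TEXT)"
)
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", [
    (1, "2014-06-03", 1, "2014-06-03"),
    (1, "2015-05-22", 2, None),
    (1, "2015-06-03", 3, None),
    (1, "2015-11-20", 4, None),
    (2, "2014-02-25", 1, "2014-02-25"),
    (2, "2014-07-31", 2, None),
    (2, "2014-08-26", 3, None),
])

# Anchor: the first row of each unique_id, whose flag_date is already set.
# Recursive step: compare the next row's date with the flag computed so far.
conn.execute("""
WITH RECURSIVE chain(unique_id, rownumber, flag_date) AS (
    SELECT unique_id, rownumber, flag_date
    FROM table1
    WHERE rownumber = 1
    UNION ALL
    SELECT t.unique_id, t.rownumber,
           CASE WHEN julianday(t.dt) - julianday(c.flag_date) > 180
                THEN t.dt ELSE c.flag_date END
    FROM table1 AS t
    JOIN chain AS c
      ON t.unique_id = c.unique_id AND t.rownumber = c.rownumber + 1
)
UPDATE table1
SET flag_date = (SELECT c.flag_date
                 FROM chain AS c
                 WHERE c.unique_id = table1.unique_id
                   AND c.rownumber = table1.rownumber)
""")

updated = conn.execute(
    "SELECT unique_id, rownumber, flag_date FROM table1 ORDER BY unique_id, rownumber"
).fetchall()
for row in updated:
    print(row)
```

SQL Server also supports recursive CTEs, so the same shape should translate with DATEDIFF(day, ...) in place of the julianday() subtraction, though I have only run the SQLite version.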

MySQL: Pivot + Counting

I need help with a SQL that will convert this table:
===================
| Id | FK | Status|
===================
| 1 | A | 100 |
| 2 | A | 101 |
| 3 | B | 100 |
| 4 | B | 101 |
| 5 | C | 100 |
| 6 | C | 101 |
| 7 | A | 102 |
| 8 | A | 102 |
| 9 | B | 102 |
| 10 | B | 102 |
===================
to this:
==========================================
| FK | Count 100 | Count 101 | Count 102 |
==========================================
| A | 1 | 1 | 2 |
| B | 1 | 1 | 2 |
| C | 1 | 1 | 0 |
==========================================
I can do simple counts, etc., but am struggling to pivot the table with the derived information. Any help is appreciated.
Use:
SELECT t.fk,
SUM(CASE WHEN t.status = 100 THEN 1 ELSE 0 END) AS count_100,
SUM(CASE WHEN t.status = 101 THEN 1 ELSE 0 END) AS count_101,
SUM(CASE WHEN t.status = 102 THEN 1 ELSE 0 END) AS count_102
FROM TABLE t
GROUP BY t.fk
In SQL Server (MySQL has no PIVOT operator), use:
select * from
(select fk, fk as fk1, status from #t
) as t
pivot
(COUNT(fk1) for status IN ([100],[101],[102])
) AS pt
Just adding a shortcut to @OMG's answer.
In MySQL a comparison evaluates to 1 or 0, so you can eliminate the CASE expression:
SELECT t.fk,
SUM(t.status = 100) AS count_100,
SUM(t.status = 101) AS count_101,
SUM(t.status = 102) AS count_102
FROM TABLE t
GROUP BY t.fk
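Both conditional-aggregation forms can be checked side by side with a small sketch. It uses Python's built-in sqlite3 module instead of MySQL; SQLite also evaluates a comparison to 1 or 0, which is why the shortcut form runs there too. The table name `status_log` is a placeholder of mine for the unnamed table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status_log (id INTEGER, fk TEXT, status INTEGER)")
conn.executemany("INSERT INTO status_log VALUES (?, ?, ?)", [
    (1, "A", 100), (2, "A", 101), (3, "B", 100), (4, "B", 101),
    (5, "C", 100), (6, "C", 101), (7, "A", 102), (8, "A", 102),
    (9, "B", 102), (10, "B", 102),
])

# Portable form: one SUM(CASE ...) per pivoted column.
with_case = conn.execute("""
SELECT fk,
       SUM(CASE WHEN status = 100 THEN 1 ELSE 0 END) AS count_100,
       SUM(CASE WHEN status = 101 THEN 1 ELSE 0 END) AS count_101,
       SUM(CASE WHEN status = 102 THEN 1 ELSE 0 END) AS count_102
FROM status_log
GROUP BY fk
ORDER BY fk
""").fetchall()

# Shortcut form: relies on the comparison itself yielding 1 or 0.
with_bool = conn.execute("""
SELECT fk,
       SUM(status = 100) AS count_100,
       SUM(status = 101) AS count_101,
       SUM(status = 102) AS count_102
FROM status_log
GROUP BY fk
ORDER BY fk
""").fetchall()

for row in with_case:
    print(row)
```

The CASE form is the safer choice if the query may ever need to run on a database that does not treat boolean results as integers.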

SQL query that throws away rows that are older & satisfy a condition?

Issuing the following query:
SELECT t.seq,
t.buddyId,
t.mode,
t.type,
t.dtCreated
FROM MIM t
WHERE t.userId = 'ali'
ORDER BY t.dtCreated DESC;
...returns me 6 rows.
+-------------+------------------------+------+------+---------------------+
| seq | buddyId | mode | type | dtCreated |
+-------------+------------------------+------+------+---------------------+
| 12 | abcdefghij25#gmail.com | 2 | 1 | 2009-09-14 12:39:05 |
| 11 | abcdefghij25#gmail.com | 4 | 1 | 2009-09-14 12:39:02 |
| 10 | op_eee_81#hotmail.com | 1 | -1 | 2009-09-14 12:39:00 |
| 9 | abcdefghij25#gmail.com | 1 | -1 | 2009-09-14 12:38:59 |
| 8 | op_eee_81#hotmail.com | 2 | 1 | 2009-09-14 12:37:53 |
| 7 | abcdefghij25#gmail.com | 2 | 1 | 2009-09-14 12:37:46 |
+-------------+------------------------+------+------+---------------------+
I want to return rows based on this condition:
If there are duplicate rows with the same buddyId, only return me the latest (as specified by dtCreated).
So, the query should return me:
+-------------+------------------------+------+------+---------------------+
| seq | buddyId | mode | type | dtCreated |
+-------------+------------------------+------+------+---------------------+
| 12 | abcdefghij25#gmail.com | 2 | 1 | 2009-09-14 12:39:05 |
| 10 | op_eee_81#hotmail.com | 1 | -1 | 2009-09-14 12:39:00 |
+-------------+------------------------+------+------+---------------------+
I've tried with no success to use a UNIQUE function but it's not working.
This should only return the most recent entry for each buddyId.
SELECT a.seq
, a.buddyId
, a.mode
, a.type
, a.dtCreated
FROM MIM AS a
JOIN (SELECT buddyId, MAX(dtCreated) AS dtCreated
FROM MIM
WHERE userId = 'ali'
GROUP BY buddyId) AS b
ON a.buddyId = b.buddyId
AND a.dtCreated = b.dtCreated
WHERE a.userId = 'ali'
ORDER BY a.dtCreated DESC;
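A quick way to sanity-check the "latest row per buddyId" join is to run it against the question's six rows. This sketch uses Python's built-in sqlite3 module; the table layout (including a `userId` column set to 'ali' for every row) is inferred from the question's query, not stated in full there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE MIM (
    seq INTEGER, userId TEXT, buddyId TEXT,
    mode INTEGER, type INTEGER, dtCreated TEXT
)
""")
conn.executemany("INSERT INTO MIM VALUES (?, 'ali', ?, ?, ?, ?)", [
    (12, "abcdefghij25#gmail.com", 2, 1, "2009-09-14 12:39:05"),
    (11, "abcdefghij25#gmail.com", 4, 1, "2009-09-14 12:39:02"),
    (10, "op_eee_81#hotmail.com", 1, -1, "2009-09-14 12:39:00"),
    (9, "abcdefghij25#gmail.com", 1, -1, "2009-09-14 12:38:59"),
    (8, "op_eee_81#hotmail.com", 2, 1, "2009-09-14 12:37:53"),
    (7, "abcdefghij25#gmail.com", 2, 1, "2009-09-14 12:37:46"),
])

# The derived table finds the newest dtCreated per buddyId; joining back on
# both columns recovers the full row that carries that timestamp.
query = """
SELECT a.seq, a.buddyId, a.mode, a.type, a.dtCreated
FROM MIM AS a
JOIN (SELECT buddyId, MAX(dtCreated) AS dtCreated
      FROM MIM
      WHERE userId = 'ali'
      GROUP BY buddyId) AS b
  ON a.buddyId = b.buddyId
 AND a.dtCreated = b.dtCreated
WHERE a.userId = 'ali'
ORDER BY a.dtCreated DESC
"""
newest = conn.execute(query).fetchall()
for row in newest:
    print(row)
```

If two rows for the same buddyId could share the exact same dtCreated, this join would return both; a tie-breaker such as the highest seq would then be needed.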