+----------+---------+--------+------+------------+-----------+
| PersonID | OrderID | PlanID | Plan | Sdate | Edate |
+----------+---------+--------+------+------------+-----------+
| 1 | 1 | 312 | M | 10/14/2016 | 1/30/2017 |
| 1 | 4 | 125 | A | 10/18/2016 | 2/3/2017 |
| 1 | 7 | 411 | B | 10/25/2016 | 4/7/2017 |
| 2 | 1 | 111 | E | 10/31/2016 | 4/21/2017 |
| 2 | 3 | 312 | M | 11/4/2016 | 4/28/2017 |
| 2 | 5 | 253 | L | 11/29/2016 | 5/3/2017 |
| 3 | 1 | 50 | Q | 12/2/2016 | 5/8/2017 |
| 3 | 2 | 12 | W | 12/8/2016 | 6/8/2017 |
| 3 | 4 | 312 | M | 12/22/2016 | 6/26/2017 |
| 3 | 6 | 53 | Z | 12/27/2016 | 7/10/2017 |
+----------+---------+--------+------+------------+-----------+
I need the following output:
+----------+------------+-----------+----------+----------+
| PersonID | SDateM | EDateM | MinPlan1 | MinPlan2 |
+----------+------------+-----------+----------+----------+
| 1 | 10/14/2016 | 1/30/2017 | M | A |
| 2 | 11/4/2016 | 4/28/2017 | E | M |
| 3 | 12/22/2016 | 6/26/2017 | Q | W |
+----------+------------+-----------+----------+----------+
where SDateM is SDate when Plan=M, EDateM is Edate when Plan=M, MinPlan1 is the Plan for the minimum OrderID for a person. And MinPlan2 is the Plan for the 2nd lowest OrderID for a person.
I apply PlanID=312 in the WHERE clause but that restricts my ability to select the MinPlans. I don't know how to write the code to extract MinPlan1 and MinPlan2.
Attempt:
SELECT
PersonID
,Sdate AS SDateM
,Edate AS EDateM
,MinPlan1
,MinPlan2
FROM
T1
WHERE
1 = 1
AND PlanID = 312
CREATE TABLE t_plan
(personID INT,
orderID INT,
planID INT,
plan_n VARCHAR(20),
sdate DATE,
edate DATE
)
INSERT INTO t_plan
VALUES(1,1,312,'M','10/14/2016','1/30/2017'),
(1,4,125,'A','10/18/2016','2/3/2017'),
(1,7,411,'B','10/25/2016','4/7/2017'),
(2,1,111,'E','10/31/2016','4/21/2017'),
(2,3,312,'M','11/4/2016','4/28/2017'),
(2,5,253,'L','11/29/2016','5/3/2017'),
(3,1,50 ,'Q','12/2/2016','5/8/2017'),
(3,2,12 ,'W','12/8/2016','6/8/2017'),
(3,4,312,'M','12/22/2016','6/26/2017'),
(3,6,53 ,'Z','12/27/2016','7/10/2017')
SELECT personid,
MIN(CASE WHEN plan_n = 'M' THEN sdate END) sdateM,
MIN(CASE WHEN plan_n = 'M' THEN edate END) edateM,
MAX(CASE WHEN rnum = 1 THEN plan_n END) minplan1,
MAX(CASE WHEN rnum = 2 THEN plan_n END) minplan2
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY personid ORDER BY orderid) rnum
FROM t_plan
) t
GROUP BY personid;
Result
personid sdateM edateM minplan1 minplan2
1 2016-10-14 2017-01-30 M A
2 2016-11-04 2017-04-28 E M
3 2016-12-22 2017-06-26 Q W
I think a combination of conditional aggregation with row_number() does the trick:
select personid,
min(case when plan = 'M' then sdate end) as sdateM,
min(case when plan = 'M' then edate end) as edateM,
max(case when seqnum = 1 then plan end) as plan_1,
max(case when seqnum = 2 then plan end) as plan_2
from (select t1.*,
row_number() over (partition by personid order by orderid) as seqnum
from t1
) t1
group by person_id;
You can do this using row_number and conditional aggregation.
SELECT PersonID
,MAX(CASE WHEN plan='M' then Sdate END) AS SDateM
,MAX(CASE WHEN plan='M' then Edate END) AS EDateM
,MAX(CASE WHEN rnum=1 then plan END) AS MinPlan1
,MAX(CASE WHEN rnum=2 then plan END) AS MinPlan2
FROM (SELECT t1.*,row_number() over(partition by personId order by orderId) as rnum
FROM T1
) t
GROUP BY PersonID
I'm trying to create a query that contains a running count of ISO weeks when the account has no entry in a [Volume] table. The query is to return only a sample of the accounts, so I've created a couple of CTEs to limit the number of records and join the volume table so that weeks with no volume show up in the results.
To illustrate the result I'm looking for:
ISOWk | SurrID | Weekly Volume | No vol Count
201601 | 001 | 0 | 1
201601 | 002 | 5 | 0
201602 | 001 | 0 | 2
201602 | 002 | 0 | 1
201603 | 001 | 125| 0
201603 | 002 | 75 | 0
201604 | 001 | 0 | 1
201604 | 002 | 75 | 0
As you can see, account with SurrID 001 has no volume for weeks 201601 and 201602 so has [No vol Count] of 2 in week 201602. In week 201603, there is volume so the counter resets to 0 and increases to 1 week 201604.
From the research I've done I've managed to get a consecutive count running using ROW_NUMBER with a window function, but it doesn't reset if there is volume (as in week 201603 in my example). What I can't figure out is how to count consecutive zero values and reset when needed.
I've included my full query below so that you get the full picture (please point out any particularly bad practices here - I'm still finding my way!). It all works as intended until I include the third CTE "NDs". It then takes 45 mins to return just over 2000 rows and returns an non-resetting count for rows showing no weekly volume.
WITH Surrs AS
(
SELECT SurrID, OracleStartDate AS OSD
FROM (
SELECT ca.SurrID, OracleStartDate, ROW_NUMBER() OVER(ORDER BY OracleStartDate) as rn
FROM tblCustomerAccounts ca
JOIN tblAccountUpdates au
ON ca.SurrID = au.SurrID
WHERE CustomerType_ID IN (1,2,3,4,5,6,7,12)
AND au.ISOWk = 201641
) a
WHERE rn % 1000 = 0
),
Updates AS
(
SELECT au.ISOWk, s.SurrID, (CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus, (CASE WHEN dbo.udf_ConvertDateToISOWeek(OSD) <= BBC THEN 'B' ELSE 'F' END) AS Book
FROM Surrs s
JOIN (
SELECT ISOWk,
(SELECT BBCutOff FROM dbo.udf_CutOffWeeks(ISOWk)) AS BBC,
(SELECT FYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYS,
(SELECT FYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYE,
(SELECT BYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYS,
(SELECT BYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYE,
SurrID,
AccStatus_ID
FROM tblAccountUpdates
) au
ON au.SurrID = s.SurrID
),
NDs AS
(
SELECT u.ISOWk, u.SurrID, ROW_NUMBER() OVER (PARTITION BY u.SurrID ORDER BY u.ISOWk) AS NDCount
FROM Updates u
LEFT JOIN tblTotalVolumes tv
ON u.SurrID = tv.SurrID
AND u.ISOWk = tv.ISOWk
WHERE tv.Volume IS NULL
AND u.ISOWk >= 201601
)
SELECT tw.ISOWk,
tw.SurrID,
(CASE WHEN Volume IS NULL THEN 0 ELSE Volume END) AS [Weekly Volume],
tw.Book,
tw.AccStatus,
(CASE WHEN tw.AccStatus = 'I' AND lw.AccStatus = 'A' THEN 'Y' ELSE '' END) AS [Stopped this week],
(CASE WHEN tw.AccStatus = 'A' AND lw.AccStatus = 'I' THEN 'Y' ELSE '' END) AS [Restarted this week],
(CASE WHEN NDCount IS NULL THEN 0 ELSE NDCount END) AS [Consecutive ND Weeks]
FROM Updates tw
JOIN Updates lw
ON lw.ISOWk = dbo.udf_ConvertDateToISOWeek(DATEADD("ww",-1,dbo.udf_ConvertISOWkToDate(tw.ISOWk)))
AND tw.SurrID = lw.SurrID
LEFT JOIN tblTotalVolumes tv
ON tw.SurrID = tv.SurrID
AND tw.ISOWk = tv.ISOWk
LEFT JOIN NDs
ON tw.SurrID = nds.SurrID
AND tw.ISOWk = nds.ISOWk
ORDER BY tw.ISOWk
To reiterate what I need: the column [Consecutive ND Weeks] should count consecutive weeks where [Weekly Volume] is 0. Help will be much appreciated.
Thanks
UPDATE:
I've tried to implement #Gordon Linoff's post but my counter does not reset to 0 when there is a value for [Weekly Volume]. Here's my amended query:
SELECT t.*, (CASE WHEN [Weekly Volume] = 0 THEN ROW_NUMBER() OVER (PARTITION BY t.SurrID, grp ORDER BY ISOWk) ELSE 0 END) AS [ND Count]
FROM (
SELECT tw.ISOWk,
s.SurrID,
tw.AccStatus,
(CASE WHEN tv.Volume IS NULL THEN 0 ELSE tv.Volume END) AS [Weekly Volume],
(CASE WHEN dbo.udf_ConvertDateToISOWeek(OSD) <= BBC THEN 'B' ELSE 'F' END) AS Book,
(CASE WHEN tw.AccStatus = 'I' AND lw.AccStatus = 'A' THEN 'Y' ELSE '' END) AS [Stopped this week],
(CASE WHEN tw.AccStatus = 'A' AND lw.AccStatus = 'I' THEN 'Y' ELSE '' END) AS [Restarted this week],
SUM(CASE WHEN tv.volume > 0 THEN 1 ELSE 0 END) OVER(PARTITION BY tv.SurrID ORDER BY tv.ISOWk) AS grp
FROM (
SELECT SurrID, OracleStartDate AS OSD
FROM (
SELECT ca.SurrID, OracleStartDate, ROW_NUMBER() OVER(ORDER BY OracleStartDate) as rn
FROM tblCustomerAccounts ca
JOIN tblAccountUpdates au
ON ca.SurrID = au.SurrID
WHERE CustomerType_ID IN (1,2,3,4,5,6,7,12)
AND au.ISOWk = 201641
) a
WHERE rn % 1000 = 0
) s
JOIN (
SELECT ISOWk,
(SELECT BBCutOff FROM dbo.udf_CutOffWeeks(ISOWk)) AS BBC,
(SELECT FYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYS,
(SELECT FYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYE,
(SELECT BYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYS,
(SELECT BYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYE,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) tw
ON tw.SurrID = s.SurrID
JOIN (
SELECT ISOWk,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) lw
ON tw.SurrID = lw.SurrID
AND dbo.udf_ConvertDateToISOWeek(DATEADD("ww",-1,dbo.udf_ConvertISOWkToDate(tw.ISOWk))) = lw.ISOWk
LEFT JOIN tblTotalVolumes tv
ON tw.ISOWk = tv.ISOWk
AND tw.SurrID = tv.SurrID
) t
ORDER BY ISOWk
UPDATE:
I've now modified my query to reflect Vladimir's solution (once again, this is the full query):
SELECT ISOWk,
SurrID,
AccStatus,
[Weekly Volume],
Book,
[Stopped this week],
[Restarted this week],
RN1,
RN2,
grp,
rn3,
(CASE WHEN [Weekly Volume] = 0 THEN rn3 ELSE 0 END) AS [ND Count]
FROM (
SELECT t.ISOWk,
t.SurrID,
t.AccStatus,
t.[Weekly Volume],
t.Book,
t.[Stopped this week],
t.[Restarted this week],
rn1,
rn2,
rn1 - rn2 AS grp,
ROW_NUMBER() OVER(PARTITION BY t.SurrID, rn1-rn2 ORDER BY ISOWk) AS rn3
FROM (
SELECT tw.ISOWk,
s.SurrID,
tw.AccStatus,
(CASE WHEN tv.Volume IS NULL THEN 0 ELSE tv.Volume END) AS [Weekly Volume],
(CASE WHEN dbo.udf_ConvertDateToISOWeek(OSD) <= BBC THEN 'B' ELSE 'F' END) AS Book,
(CASE WHEN tw.AccStatus = 'I' AND lw.AccStatus = 'A' THEN 'Y' ELSE '' END) AS [Stopped this week],
(CASE WHEN tw.AccStatus = 'A' AND lw.AccStatus = 'I' THEN 'Y' ELSE '' END) AS [Restarted this week],
ROW_NUMBER() OVER(PARTITION BY tw.SurrID ORDER BY tw.ISOWk) AS rn1,
ROW_NUMBER() OVER(PARTITION BY tw.SurrID, tv.Volume ORDER BY tw.ISOWk) AS rn2
FROM (
SELECT SurrID, OracleStartDate AS OSD
FROM (
SELECT ca.SurrID, OracleStartDate, ROW_NUMBER() OVER(ORDER BY OracleStartDate) as rn
FROM tblCustomerAccounts ca
JOIN tblAccountUpdates au
ON ca.SurrID = au.SurrID
WHERE CustomerType_ID IN (1,2,3,4,5,6,7,12)
AND au.ISOWk = 201641
) a
WHERE rn % 2000 = 0
) s
JOIN (
SELECT ISOWk,
(SELECT BBCutOff FROM dbo.udf_CutOffWeeks(ISOWk)) AS BBC,
(SELECT FYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYS,
(SELECT FYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYE,
(SELECT BYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYS,
(SELECT BYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYE,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) tw
ON tw.SurrID = s.SurrID
JOIN (
SELECT ISOWk,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) lw
ON tw.SurrID = lw.SurrID
AND dbo.udf_ConvertDateToISOWeek(DATEADD("ww",-1,dbo.udf_ConvertISOWkToDate(tw.ISOWk))) = lw.ISOWk
LEFT JOIN tblTotalVolumes tv
ON tw.ISOWk = tv.ISOWk
AND tw.SurrID = tv.SurrID
) t
) x
ORDER BY ISOWk
Here's a sample of the unexpected (in that the counter doesn't reset to 0 when [Weekly Volume] has a value greater than 0, identified with the asterisks) results. The following all have the same ID, so I've removed the ID column.
ISOWk | Weekly Volume | rn1 | rn2 | grp | rn3 | ND Count |
201620 | 0 | 1 | 1 | 0 | 1 | 1 |
201621 | 0 | 2 | 2 | 0 | 2 | 2 |
201622 | 0 | 3 | 3 | 0 | 3 | 3 |
201623 | 0 | 4 | 4 | 0 | 4 | 4 |
201624 | 0 | 5 | 5 | 0 | 5 | 5 |
201625 | 53 | 6 | 1 | 5 | 1 | 0 |
201626 | 49 | 7 | 1 | 6 | 1 | 0 |
201627 | 98 | 8 | 1 | 7 | 1 | 0 |
201628 | 54 | 9 | 1 | 8 | 1 | 0 |
201629 | 53 | 10 | 2 | 8 | 2 | 0 |
201630 | 103 | 11 | 1 | 10 | 1 | 0 |
201631 | 59 | 12 | 1 | 11 | 1 | 0 |
201632 | 35 | 13 | 1 | 12 | 1 | 0 |
201633 | 0 | 14 | 6 | 8 | 3 | 3 |**
201634 | 0 | 15 | 7 | 8 | 4 | 4 |**
201635 | 0 | 16 | 8 | 8 | 5 | 5 |**
201636 | 0 | 17 | 9 | 8 | 6 | 6 |**
201637 | 87 | 18 | 1 | 17 | 1 | 0 |
201638 | 136 | 19 | 1 | 18 | 1 | 0 |
201639 | 56 | 20 | 1 | 19 | 1 | 0 |
201640 | 70 | 21 | 1 | 20 | 0 | 0 |
201641 | 77 | 22 | 1 | 21 | 1 | 0 |
There are other instances of this issue in my dataset.
It looks like a gaps-and-islands problem.
Sample data
DECLARE #T TABLE(ISOWk int, SurrID char(3), WeeklyVolume int);
INSERT INTO #T(ISOWk, SurrID, WeeklyVolume) VALUES
(201601, '001', 0),
(201601, '002', 5),
(201602, '001', 0),
(201602, '002', 0),
(201603, '001',125),
(201603, '002', 75),
(201604, '001', 0),
(201604, '002', 75),
(201620, '003', 0),
(201621, '003', 0),
(201622, '003', 0),
(201623, '003', 0),
(201624, '003', 0),
(201625, '003', 53),
(201626, '003', 49),
(201627, '003', 98),
(201628, '003', 54),
(201629, '003', 53),
(201630, '003',103),
(201631, '003', 59),
(201632, '003', 35),
(201633, '003', 0),
(201634, '003', 0),
(201635, '003', 0),
(201636, '003', 0),
(201637, '003', 87),
(201638, '003',136),
(201639, '003', 56),
(201640, '003', 70),
(201641, '003', 77),
(201601, '004', 0),
(201602, '004', 6),
(201603, '004', 0),
(201604, '004', 0);
I added your extended sample with SurrID=003 and added mine sample with SurrID=004.
Query
WITH
CTE
AS
(
SELECT
ISOWk
,SurrID
,WeeklyVolume
,ROW_NUMBER() OVER (PARTITION BY SurrID ORDER BY ISOWk) AS rn1
,ROW_NUMBER() OVER (PARTITION BY SurrID,WeeklyVolume ORDER BY ISOWk) AS rn2
FROM #T
)
,CTE2
AS
(
SELECT
ISOWk
,SurrID
,WeeklyVolume
,rn1
,rn2
,rn1-rn2 AS grp
,ROW_NUMBER() OVER (PARTITION BY SurrID,WeeklyVolume,rn1-rn2 ORDER BY ISOWk) AS rn3
FROM CTE
)
SELECT
ISOWk
,SurrID
,WeeklyVolume
,rn1
,rn2
,grp
,rn3
,CASE WHEN WeeklyVolume = 0 THEN rn3 ELSE 0 END AS NoVolumeCount
FROM CTE2
ORDER BY SurrID, ISOWk;
Result
+--------+--------+--------------+-----+-----+-----+-----+---------------+
| ISOWk | SurrID | WeeklyVolume | rn1 | rn2 | grp | rn3 | NoVolumeCount |
+--------+--------+--------------+-----+-----+-----+-----+---------------+
| 201601 | 001 | 0 | 1 | 1 | 0 | 1 | 1 |
| 201602 | 001 | 0 | 2 | 2 | 0 | 2 | 2 |
| 201603 | 001 | 125 | 3 | 1 | 2 | 1 | 0 |
| 201604 | 001 | 0 | 4 | 3 | 1 | 1 | 1 |
| 201601 | 002 | 5 | 1 | 1 | 0 | 1 | 0 |
| 201602 | 002 | 0 | 2 | 1 | 1 | 1 | 1 |
| 201603 | 002 | 75 | 3 | 1 | 2 | 1 | 0 |
| 201604 | 002 | 75 | 4 | 2 | 2 | 2 | 0 |
| 201620 | 003 | 0 | 1 | 1 | 0 | 1 | 1 |
| 201621 | 003 | 0 | 2 | 2 | 0 | 2 | 2 |
| 201622 | 003 | 0 | 3 | 3 | 0 | 3 | 3 |
| 201623 | 003 | 0 | 4 | 4 | 0 | 4 | 4 |
| 201624 | 003 | 0 | 5 | 5 | 0 | 5 | 5 |
| 201625 | 003 | 53 | 6 | 1 | 5 | 1 | 0 |
| 201626 | 003 | 49 | 7 | 1 | 6 | 1 | 0 |
| 201627 | 003 | 98 | 8 | 1 | 7 | 1 | 0 |
| 201628 | 003 | 54 | 9 | 1 | 8 | 1 | 0 |
| 201629 | 003 | 53 | 10 | 2 | 8 | 1 | 0 |
| 201630 | 003 | 103 | 11 | 1 | 10 | 1 | 0 |
| 201631 | 003 | 59 | 12 | 1 | 11 | 1 | 0 |
| 201632 | 003 | 35 | 13 | 1 | 12 | 1 | 0 |
| 201633 | 003 | 0 | 14 | 6 | 8 | 1 | 1 |
| 201634 | 003 | 0 | 15 | 7 | 8 | 2 | 2 |
| 201635 | 003 | 0 | 16 | 8 | 8 | 3 | 3 |
| 201636 | 003 | 0 | 17 | 9 | 8 | 4 | 4 |
| 201637 | 003 | 87 | 18 | 1 | 17 | 1 | 0 |
| 201638 | 003 | 136 | 19 | 1 | 18 | 1 | 0 |
| 201639 | 003 | 56 | 20 | 1 | 19 | 1 | 0 |
| 201640 | 003 | 70 | 21 | 1 | 20 | 1 | 0 |
| 201641 | 003 | 77 | 22 | 1 | 21 | 1 | 0 |
| 201601 | 004 | 0 | 1 | 1 | 0 | 1 | 1 |
| 201602 | 004 | 6 | 2 | 1 | 1 | 1 | 0 |
| 201603 | 004 | 0 | 3 | 2 | 1 | 1 | 1 |
| 201604 | 004 | 0 | 4 | 3 | 1 | 2 | 2 |
+--------+--------+--------------+-----+-----+-----+-----+---------------+
I included intermediate columns in the result, so you can understand how it works.
Standard approach for gaps-and-islands is two sequences of ROW_NUMBER - one is a plain sequence (rn1), second is partitioned by WeeklyVolume (rn2).
The difference between rn1 and rn2 gives the ID of the group (island) (grp). Calculate another sequence of row numbers partitioned by WeeklyVolume and this group (rn3) and use it only when WeeklyVolume is zero.
Obviously, everything above is partitioned by SurrID at first.
In the first variant of the answer I forgot to include WeeklyVolume in the last partitioning for rn3.
Here is an approach:
Count the number of preceding non-0 values for SurrId. This number identifies groups.
Do a row_number() within each group.
Only consider the row_number() when the value is 0.
This results in:
select t.*,
(case when weeklyvolume = 0
then row_number() over (partition by SurrId, grp order by ISOwk)
else 0
end) as NoVolCount
from (select t.*,
sum(case when weeklyvolume > 0 then 1 else 0 end) over (partition by SurrId order by ISOwk) as grp
from t
) t;