Pivot: Create column to capture duplicate cycle dates - sql

I have the tables below. The join from Cycle to Program is date-based.
There are millions of PGMID entries, so I was thinking about using the pivot feature, but I can't hard-code the PGMID values. Any thoughts or help would be appreciated.
I do have the ability to edit tables in the database.
Table: Cycle
ID START_CYCLE END_CYCLE
4400 7/22/2018 8/3/2018
4400 8/4/2018 8/5/2018
4400 8/6/2018 8/6/2018
4400 8/7/2018 8/9/2018
4400 8/10/2018 9/6/2018
4400 9/7/2018 9/7/2018
4400 9/8/2018 9/9/2018
4400 9/10/2018 12/31/9999
Table: Program
PGMID START_DT END_DT
101 8/4/2018 9/10/2018
102 9/8/2018 9/8/2018
103 9/10/2018 NULL
Output:
ID START_CYCLE END_CYCLE PGMID
4400 7/22/2018 8/3/2018
4400 8/4/2018 8/5/2018 101
4400 8/6/2018 8/6/2018 101
4400 8/7/2018 8/9/2018 101
4400 8/10/2018 9/6/2018 101
4400 9/7/2018 9/7/2018 101
4400 9/8/2018 9/9/2018 101
4400 9/8/2018 9/9/2018 102
4400 9/10/2018 12/31/9999 103
There are duplicate cycle entries, and I do NOT want any repeated dates:
4400 9/8/2018 9/9/2018 101
4400 9/8/2018 9/9/2018 102
Expected output:
ID START_CYCLE END_CYCLE PROGRAM1 PROGRAM2
4400 7/22/2018 8/3/2018
4400 8/4/2018 8/5/2018 101
4400 8/6/2018 8/6/2018 101
4400 8/7/2018 8/9/2018 101
4400 8/10/2018 9/6/2018 101
4400 9/7/2018 9/7/2018 101
4400 9/8/2018 9/9/2018 101 102
4400 9/10/2018 12/31/9999 103

select *
from (
select id, start_cycle, end_cycle, pgmid, case when rn > 5 then 0 else rn end rn
from (
select id, start_cycle, end_cycle, pgmid,
row_number() over (partition by id, start_cycle, end_cycle order by pgmid) rn
from cycle c
left join program p on p.start_dt <= c.end_cycle and c.start_cycle <= p.end_dt ))
pivot (listagg(pgmid, ',') within group (order by pgmid)
for rn in (1 program1, 2 program2, 3 program3, 4 program4, 5 program5, 0 others))
order by id, start_cycle
Left join the data as you did. Note that a NULL end_dt never satisfies c.start_cycle <= p.end_dt, so an open-ended program such as 103 is not matched; use nvl(p.end_dt, date '9999-12-31') in the condition if NULL should mean "no end date".
Add row_number() partitioned by each cycle and ordered by pgmid.
If this number exceeds some limit (5 in my case), assign 0 instead.
Pivot on this column. The first five columns are built as usual; the last one, which may contain more than one program, is called others.
Instead of the min or max typically used in pivot, use listagg.
These steps are needed to show all programs when there are more than 5 of them per cycle; everything beyond the fifth goes into others. If you know that there can be no more than, let's say, 3 programs, then you can simplify this query.
If you want each program in a different column and the maximum number of columns is unknown, then it's a dynamic pivot problem. There are some solutions already described on Stack Overflow, but these are mostly workarounds.
Here is an example where we have up to 8 programs in one cycle:
with
cycle(id, start_cycle, end_cycle) as (
select 4400, date '2018-07-22', date '2018-08-03' from dual union all
select 4400, date '2018-08-04', date '2018-08-05' from dual union all
select 4400, date '2018-08-06', date '2018-08-06' from dual union all
select 4400, date '2018-08-07', date '2018-08-09' from dual union all
select 4400, date '2018-08-10', date '2018-09-06' from dual union all
select 4400, date '2018-09-07', date '2018-09-07' from dual union all
select 4400, date '2018-09-08', date '2018-09-09' from dual union all
select 4400, date '2018-09-10', date '9999-12-31' from dual ),
program(pgmid, start_dt, end_dt) as (
select 101, date '2018-08-04', date '2018-09-10' from dual union all
select 104, date '2018-08-06', date '2018-08-07' from dual union all
select 105, date '2018-08-06', date '2018-08-07' from dual union all
select 106, date '2018-08-06', date '2018-08-07' from dual union all
select 107, date '2018-08-06', date '2018-08-07' from dual union all
select 108, date '2018-08-06', date '2018-08-07' from dual union all
select 109, date '2018-08-06', date '2018-08-07' from dual union all
select 110, date '2018-08-07', date '2018-08-07' from dual union all
select 102, date '2018-09-08', date '2018-09-08' from dual union all
select 103, date '2018-09-10', null from dual )
select *
from (
select id, start_cycle, end_cycle, pgmid, case when rn > 5 then 0 else rn end rn
from (
select id, start_cycle, end_cycle, pgmid,
row_number() over (partition by id, start_cycle, end_cycle order by pgmid) rn
from cycle c
left join program p on p.start_dt <= c.end_cycle and c.start_cycle <= p.end_dt ))
pivot (listagg(pgmid, ',') within group (order by pgmid)
for rn in (1 program1, 2 program2, 3 program3, 4 program4, 5 program5, 0 others))
order by id, start_cycle
Result:
ID START_CYCLE END_CYCLE PROGRAM1 PROGRAM2 PROGRAM3 PROGRAM4 PROGRAM5 OTHERS
----- ----------- ----------- --------- --------- --------- --------- --------- ------------
4400 2018-07-22 2018-08-03
4400 2018-08-04 2018-08-05 101
4400 2018-08-06 2018-08-06 101 104 105 106 107 108,109
4400 2018-08-07 2018-08-09 101 104 105 106 107 108,109,110
4400 2018-08-10 2018-09-06 101
4400 2018-09-07 2018-09-07 101
4400 2018-09-08 2018-09-09 101 102
4400 2018-09-10 9999-12-31 101
dbfiddle demo
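When the maximum number of programs per cycle really is unknown, one option is to do the pivot on the client instead of in SQL. Below is a minimal Python sketch of that idea, not part of the answer above. The function name pivot_programs and the in-memory tuples are hypothetical stand-ins for rows fetched from the joined query (one row per cycle/program pair, with None where no program matched).

```python
from collections import defaultdict

def pivot_programs(rows):
    """Turn (id, start_cycle, end_cycle, pgmid) rows into one row per cycle.

    Returns (header, table). The number of PROGRAMn columns is derived
    from the data instead of being hard-coded.
    """
    grouped = defaultdict(list)
    for cycle_id, start, end, pgmid in rows:
        programs = grouped[(cycle_id, start, end)]  # also registers empty cycles
        if pgmid is not None:
            programs.append(pgmid)

    # Widest cycle decides how many program columns are needed.
    width = max((len(p) for p in grouped.values()), default=0)
    header = ["ID", "START_CYCLE", "END_CYCLE"] + [
        f"PROGRAM{i}" for i in range(1, width + 1)
    ]
    table = []
    for key in sorted(grouped):  # ISO date strings sort chronologically
        programs = sorted(grouped[key])
        padding = [None] * (width - len(programs))
        table.append(list(key) + programs + padding)
    return header, table

# Hypothetical sample: output of the left join, in arbitrary order.
rows = [
    (4400, "2018-07-22", "2018-08-03", None),
    (4400, "2018-09-07", "2018-09-07", 101),
    (4400, "2018-09-08", "2018-09-09", 102),
    (4400, "2018-09-08", "2018-09-09", 101),
]
header, table = pivot_programs(rows)
print(header)
for line in table:
    print(line)
```

The trade-off is that every cycle/program pair has to be transferred to the client first, which may matter with millions of PGMID values.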

1- You must add "group by START_CYCLE, END_CYCLE".
2- In the select list, you must add group_concat(PGMID separator ',').
I don't know the Oracle equivalent of the above, but in MySQL it is:
select ..., group_concat(PGMID separator ',') as PGMIDs, ...
from ...
join ...
where ...
group by START_CYCLE, END_CYCLE
I hope this helps.
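The aggregate approach can be tried without a MySQL or Oracle server. The sketch below uses Python's built-in sqlite3 module, whose group_concat(value, separator) aggregate plays a similar role to MySQL's group_concat. The table name cycle_program and its three rows are made up for illustration and stand for the already joined cycle/program data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cycle_program (id INT, start_cycle TEXT, end_cycle TEXT, pgmid INT);
    INSERT INTO cycle_program VALUES
        (4400, '2018-09-07', '2018-09-07', 101),
        (4400, '2018-09-08', '2018-09-09', 101),
        (4400, '2018-09-08', '2018-09-09', 102);
""")

# One output row per cycle; all program ids of that cycle end up in one string.
query = """
    SELECT id, start_cycle, end_cycle, group_concat(pgmid, ',') AS pgmids
    FROM cycle_program
    GROUP BY id, start_cycle, end_cycle
    ORDER BY start_cycle
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that SQLite does not promise a particular order of the ids inside the concatenated string, so sort them afterwards if the order matters.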

Related

SQL query to find the count of number of same absences for an employee

I have a table:
PERSON_NUMBER ABS_DATE ABS_TYPE_NAME ABS_DAYS
-----------------------------------------------------------------------------
1010 01-01-2022 PTO 1
1010 01-01-2022 PTO 1
1010 06-01-2022 PTO 0.52
1020 02-02-2022 VACATION 1
1020 02-02-2022 VACATION 0.2
1030 01-12-2021 PTO 1
1030 01-12-2021 PTO 1
1040 02-12-2021 sick 1
1040 30-12-2021 sick 1
1050 30-01-2022 SICK 1
I want to add another column to the output, COUNT, that tells me how many times one person has repeated rows with the same ABS_TYPE_NAME on the same date.
PERSON_NUMBER ABS_DATE ABS_TYPE_NAME ABS_DAYS COUNT
------------------------------------------------------------------------------
1010 01-01-2022 PTO 1 2
1010 01-01-2022 PTO 1
1010 06-01-2022 PTO 0.52 1
1020 02-02-2022 VACATION 1 2
1020 02-02-2022 VACATION 0.2
1030 01-12-2021 PTO 1 2
1030 01-12-2021 PTO 1
1040 02-12-2021 sick 1 1
1040 30-12-2021 sick 1 1
1050 30-01-2022 SICK 1 1
I am using -
COUNT(ABS_DATE) OVER (PARTITION BY ABS_DATE, PERSON_NUMBER, ABS_TYPE_NAME
ORDER BY PERSON_NUMBER, ABS_TYPE_NAME)
But this returns 4 for the first row, and it also returns a value for every row. I want the value to appear only once per group of records.
E.g., if 2 is shown on the 1st row, it should not be shown on the 2nd.
When you perform a COUNT with a PARTITION BY clause like this, the ORDER BY is not needed: without it, every row gets the full count of its partition, so the order in which the records are counted does not matter.
You can also use a second analytic function, ROW_NUMBER, to pick the first row of each partition and display the count only on that row.
Query
WITH
absences (PERSON_NUMBER,
ABS_DATE,
ABS_TYPE_NAME,
ABS_DAYS)
AS
(SELECT 1010, DATE '2022-01-01', 'PTO', 1 FROM DUAL
UNION ALL
SELECT 1010, DATE '2022-01-01', 'PTO', 1 FROM DUAL
UNION ALL
SELECT 1010, DATE '2022-01-06', 'PTO', 0.52 FROM DUAL
UNION ALL
SELECT 1020, DATE '2022-02-02', 'VACATION', 1 FROM DUAL
UNION ALL
SELECT 1020, DATE '2022-02-02', 'VACATION', 0.2 FROM DUAL
UNION ALL
SELECT 1030, DATE '2021-12-01', 'PTO', 1 FROM DUAL
UNION ALL
SELECT 1030, DATE '2021-12-01', 'PTO', 1 FROM DUAL
UNION ALL
SELECT 1040, DATE '2021-12-02', 'sick', 1 FROM DUAL
UNION ALL
SELECT 1040, DATE '2021-12-30', 'sick', 1 FROM DUAL
UNION ALL
SELECT 1050, DATE '2022-01-30', 'SICK', 1 FROM DUAL)
SELECT person_number,
abs_date,
abs_type_name,
abs_days,
CASE rn WHEN 1 THEN cnt END AS COUNT
FROM (SELECT a.*,
COUNT (ABS_DATE) OVER (PARTITION BY ABS_DATE, PERSON_NUMBER, ABS_TYPE_NAME)
AS cnt,
ROW_NUMBER () OVER (PARTITION BY ABS_DATE, PERSON_NUMBER, ABS_TYPE_NAME ORDER BY 1)
AS rn
FROM absences a)
ORDER BY person_number, abs_type_name, abs_date, rn;
Results
PERSON_NUMBER ABS_DATE ABS_TYPE_NAME ABS_DAYS COUNT
________________ ____________ ________________ ___________ ________
1010 01-JAN-22 PTO 1 2
1010 01-JAN-22 PTO 1
1010 06-JAN-22 PTO 0.52 1
1020 02-FEB-22 VACATION 1 2
1020 02-FEB-22 VACATION 0.2
1030 01-DEC-21 PTO 1 2
1030 01-DEC-21 PTO 1
1040 02-DEC-21 sick 1 1
1040 30-DEC-21 sick 1 1
1050 30-JAN-22 SICK 1 1
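For readers without an Oracle instance, the same count-plus-row-number idea can be sketched with Python's built-in sqlite3 module. This assumes the bundled SQLite is version 3.25 or newer (needed for window functions). It uses only the first five rows of the sample data, and ordering the row numbers by abs_days DESC is my own choice to make the "first" row deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE absences (person_number INT, abs_date TEXT,
                           abs_type_name TEXT, abs_days REAL);
    INSERT INTO absences VALUES
        (1010, '2022-01-01', 'PTO', 1),
        (1010, '2022-01-01', 'PTO', 1),
        (1010, '2022-01-06', 'PTO', 0.52),
        (1020, '2022-02-02', 'VACATION', 1),
        (1020, '2022-02-02', 'VACATION', 0.2);
""")

# cnt: size of each (person, date, type) group, identical on every row of the group.
# rn:  position inside the group; the count is shown only where rn = 1.
query = """
    SELECT person_number, abs_date, abs_days,
           CASE WHEN rn = 1 THEN cnt END AS cnt_on_first_row
    FROM (SELECT a.*,
                 COUNT(*) OVER (PARTITION BY person_number, abs_date, abs_type_name) AS cnt,
                 ROW_NUMBER() OVER (PARTITION BY person_number, abs_date, abs_type_name
                                    ORDER BY abs_days DESC) AS rn
          FROM absences a)
    ORDER BY person_number, abs_date, rn
"""
result = conn.execute(query).fetchall()
counts = [row[3] for row in result]
print(counts)
```

Rows after the first one in each group come back with None (NULL) in the last column, which is what leaves the COUNT cell blank in the output.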

How to filter my table based on this specific date criteria?

I am using SQL Server 2014. Below is an extract of Table t1:
rownum RoomID ArrivalDate DepartureDate Name GuestID
1 287 2020-01-01 2020-01-09 John 600
2 451 2020-01-09 2020-01-10 John 600
3 458 2020-01-09 2020-01-10 John 600
1 240 2020-03-19 2020-03-21 Alan 112
2 159 2020-03-21 2020-03-22 Alan 112
1 400 2020-05-01 2020-05-10 Joe 225
2 155 2020-06-13 2020-06-18 Joe 225
1 200 2020-07-01 2020-07-08 Smith 980
2 544 2020-07-08 2020-07-10 Smith 980
3 428 2020-09-01 2020-09-05 Smith 980
...
The problem: I need to filter this table so that the output gives me only those rows of a guest where the difference/s between his ArrivalDate (at rownum 2 or 3 or 4...) and his DepartureDate (at rownum =1) is greater than 0.
To simplify: if we take guest John, his ArrivalDate for rownum=2 and rownum=3 are both the same as his DepartureDate for rownum=1; therefore I want to exclude him completely from my output. The same goes for guest Alan. For guest Smith, however, only the row where rownum=2 needs to be excluded.
Note: all guests in this table will have at least a rownum=2 (that is, a minimum of 2 entries).
My expected output:
rownum RoomID ArrivalDate DepartureDate Name GuestID
1 400 2020-05-01 2020-05-10 Joe 225
2 155 2020-06-13 2020-06-18 Joe 225
1 200 2020-07-01 2020-07-08 Smith 980
3 428 2020-09-01 2020-09-05 Smith 980
I am stuck on how to write the logic behind this filter. Any help would be appreciated.
The trick here appears to be keeping the first row when there is a later row that qualifies, but not including any rows otherwise. You can use window functions:
select t.*
from (select t.*,
max(case when rownum = 1 then departuredate end) over (partition by guestid) as departuredate_1,
max(case when rownum <> 1 then arrivaldate end) over (partition by guestid) as arrivaldate_not_1
from t1 t
) t
where (arrivaldate_not_1 > departuredate_1) and
(rownum = 1 or arrivaldate > departuredate_1);
Here is a db<>fiddle.
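To make the filter logic easier to follow, here is the same rule written procedurally as a small Python sketch. It is only an illustration: the tuple layout and the function name filter_stays are my own, and the data is a subset of the question's extract.

```python
from collections import defaultdict

# (rownum, room_id, arrival, departure, name, guest_id)
# ISO date strings compare chronologically, so plain string comparison is enough.
stays = [
    (1, 287, "2020-01-01", "2020-01-09", "John", 600),
    (2, 451, "2020-01-09", "2020-01-10", "John", 600),
    (3, 458, "2020-01-09", "2020-01-10", "John", 600),
    (1, 400, "2020-05-01", "2020-05-10", "Joe", 225),
    (2, 155, "2020-06-13", "2020-06-18", "Joe", 225),
    (1, 200, "2020-07-01", "2020-07-08", "Smith", 980),
    (2, 544, "2020-07-08", "2020-07-10", "Smith", 980),
    (3, 428, "2020-09-01", "2020-09-05", "Smith", 980),
]

def filter_stays(stays):
    by_guest = defaultdict(list)
    for stay in stays:
        by_guest[stay[5]].append(stay)

    kept = []
    for guest_rows in by_guest.values():
        # Departure date of the guest's first stay (rownum = 1).
        first_departure = next(s[3] for s in guest_rows if s[0] == 1)
        # Later stays that start strictly after that departure.
        later = [s for s in guest_rows if s[0] != 1 and s[2] > first_departure]
        if later:
            # Keep the first stay only when at least one later stay qualifies.
            kept.extend(s for s in guest_rows if s[0] == 1)
            kept.extend(later)
    return sorted(kept, key=lambda s: (s[5], s[0]))

result = filter_stays(stays)
for stay in result:
    print(stay)
```

John disappears entirely because none of his later stays starts after his first departure, while Smith keeps rownum 1 and 3.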
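Please try the query below and confirm whether it gives what you are expecting (SQL Server does not accept a multi-column NOT IN, so the check is written with NOT EXISTS):
select * from t1 a
where not exists
(select 1 from t1 b where b.DepartureDate = a.ArrivalDate and b.Name = a.Name);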
create table #Aridept
(
rownum int,
RoomID int,
ArrivalDate date,
DepartureDate date,
Name varchar(20),
GuestID int
)
insert into #Aridept
select 1 , 287 , '2020-01-01', '2020-01-09', 'John', 600
union all select 2 , 451 , '2020-01-09', '2020-01-10','John' , 600
union all select 3 , 458 , '2020-01-09', '2020-01-10','John', 600
union all select 1 , 240 , '2020-03-19', '2020-03-21','Alan', 112
union all select 2 , 159 , '2020-03-21', '2020-03-22','Alan', 112
union all select 1 , 400 , '2020-05-01', '2020-05-10','Joe', 225
union all select 2 , 155 , '2020-06-13', '2020-06-18','Joe', 225
union all select 1 , 200 , '2020-07-01', '2020-07-08','Smith', 980
union all select 2 , 544 , '2020-07-08', '2020-07-10','Smith', 980
union all select 3 , 428 , '2020-09-01', '2020-09-05','Smith', 980
--insert into #temp the later rows whose arrival date differs from the first departure date
select * into #temp
from #Aridept a
where a.rownum>1 and ArrivalDate not in
(select DepartureDate from #Aridept b where a.GuestID=b.guestid
and rownum=1 )
--final result query
select * from (
select * from #Aridept Ari
where rownum=1 and GuestID in ( select GuestID from #temp)
union all
select * from #temp
)a order by GuestID, rownum

How to perform rolling sum in BigQuery

I have sample data in BigQuery as -
with temp as (
select DATE("2016-10-02") date_field , 200 as salary
union all
select DATE("2016-10-09"), 500
union all
select DATE("2016-10-16"), 350
union all
select DATE("2016-10-23"), 400
union all
select DATE("2016-10-30"), 190
union all
select DATE("2016-11-06"), 550
union all
select DATE("2016-11-13"), 610
union all
select DATE("2016-11-20"), 480
union all
select DATE("2016-11-27"), 660
union all
select DATE("2016-12-04"), 690
union all
select DATE("2016-12-11"), 810
union all
select DATE("2016-12-18"), 950
union all
select DATE("2016-12-25"), 1020
union all
select DATE("2017-01-01"), 680
) ,
temp2 as (
select * , DATE("2017-01-01") as current_date
from temp
)
select * from temp2
I want to perform a rolling sum on this table. As an example, I have set the current date to 2017-01-01. With this as the current date, I want to go back 30 days and take the sum of the salary field. Hence, with 2017-01-01 being the current date, the total that should be returned is for the month of December 2016, which is 690+810+950+1020. How can I do this using Standard SQL?
Below is a rolling last-30-days SUM for BigQuery Standard SQL:
#standardSQL
SELECT *,
SUM(salary) OVER(
ORDER BY UNIX_DATE(date_field)
RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING
) AS rolling_30_days_sum
FROM `project.dataset.your_table`
You can test and play with the above using the sample data from your question, as below:
#standardSQL
WITH temp AS (
SELECT DATE("2016-10-02") date_field , 200 AS salary UNION ALL
SELECT DATE("2016-10-09"), 500 UNION ALL
SELECT DATE("2016-10-16"), 350 UNION ALL
SELECT DATE("2016-10-23"), 400 UNION ALL
SELECT DATE("2016-10-30"), 190 UNION ALL
SELECT DATE("2016-11-06"), 550 UNION ALL
SELECT DATE("2016-11-13"), 610 UNION ALL
SELECT DATE("2016-11-20"), 480 UNION ALL
SELECT DATE("2016-11-27"), 660 UNION ALL
SELECT DATE("2016-12-04"), 690 UNION ALL
SELECT DATE("2016-12-11"), 810 UNION ALL
SELECT DATE("2016-12-18"), 950 UNION ALL
SELECT DATE("2016-12-25"), 1020 UNION ALL
SELECT DATE("2017-01-01"), 680
)
SELECT *,
SUM(salary) OVER(
ORDER BY UNIX_DATE(date_field)
RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING
) AS rolling_30_days_sum
FROM temp
-- ORDER BY date_field
with result
Row date_field salary rolling_30_days_sum
1 2016-10-02 200 null
2 2016-10-09 500 200
3 2016-10-16 350 700
4 2016-10-23 400 1050
5 2016-10-30 190 1450
6 2016-11-06 550 1440
7 2016-11-13 610 1490
8 2016-11-20 480 1750
9 2016-11-27 660 1830
10 2016-12-04 690 2300
11 2016-12-11 810 2440
12 2016-12-18 950 2640
13 2016-12-25 1020 3110
14 2017-01-01 680 3470
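To see what the RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING frame selects for each row, here is a rough Python equivalent. It is a sketch for checking the numbers by hand, not BigQuery code; the function name rolling_sum is hypothetical.

```python
from datetime import date, timedelta

salaries = [
    (date(2016, 10, 2), 200), (date(2016, 10, 9), 500),
    (date(2016, 10, 16), 350), (date(2016, 10, 23), 400),
    (date(2016, 10, 30), 190), (date(2016, 11, 6), 550),
    (date(2016, 11, 13), 610), (date(2016, 11, 20), 480),
    (date(2016, 11, 27), 660), (date(2016, 12, 4), 690),
    (date(2016, 12, 11), 810), (date(2016, 12, 18), 950),
    (date(2016, 12, 25), 1020), (date(2017, 1, 1), 680),
]

def rolling_sum(rows, window_days=30):
    """For each row, sum the salaries dated 1..window_days days before it."""
    out = []
    for current, _ in rows:
        lo = current - timedelta(days=window_days)  # 30 PRECEDING
        hi = current - timedelta(days=1)            # 1 PRECEDING (current row excluded)
        matches = [salary for day, salary in rows if lo <= day <= hi]
        # SUM over an empty frame is NULL in SQL; None mirrors that here.
        out.append((current, sum(matches) if matches else None))
    return out

result = rolling_sum(salaries)
for day, total in result:
    print(day, total)
```

For 2017-01-01 the frame covers 2016-12-02 through 2016-12-31, which is why only the four December rows (690+810+950+1020) are summed.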
This is not exactly a "rolling sum", but it's the exact answer to "I want to go back 30 days and take sum of salary field. Hence, with 2017-01-01 being the current date, the total that should be returned is for the month of December"
with temp as (
select DATE("2016-10-02") date_field , 200 as salary
union all
select DATE("2016-10-09"), 500
union all
select DATE("2016-10-16"), 350
union all
select DATE("2016-10-23"), 400
union all
select DATE("2016-10-30"), 190
union all
select DATE("2016-11-06"), 550
union all
select DATE("2016-11-13"), 610
union all
select DATE("2016-11-20"), 480
union all
select DATE("2016-11-27"), 660
union all
select DATE("2016-12-04"), 690
union all
select DATE("2016-12-11"), 810
union all
select DATE("2016-12-18"), 950
union all
select DATE("2016-12-25"), 1020
union all
select DATE("2017-01-01"), 680
) ,
temp2 as (
select * , DATE("2017-01-01") as current_date_x
from temp
)
select SUM(salary)
from temp2
WHERE date_field BETWEEN DATE_SUB(current_date_x, INTERVAL 30 DAY) AND DATE_SUB(current_date_x, INTERVAL 1 DAY)
3470
Note that I wasn't able to use current_date as a variable name, as it gets replaced by the actual current date.

SQL select lapsed customers with 30 day frequency by day

The goal is to select the count of distinct customer_id's who have not made a purchase in the rolling 30 day period prior to every day in the calendar year 2016. I have created a calendar table in my database to join to.
Here is an example table for reference, let's say you have customers orders normalized as follows:
+-------------+------------+----------+
| customer_id | date | order_id |
+-------------+------------+----------+
| 123 | 01/25/2016 | 1000 |
+-------------+------------+----------+
| 123 | 04/27/2016 | 1025 |
+-------------+------------+----------+
| 444 | 02/02/2016 | 1010 |
+-------------+------------+----------+
| 521 | 01/23/2016 | 998 |
+-------------+------------+----------+
| 521 | 01/24/2016 | 999 |
+-------------+------------+----------+
The goal output is effectively a calendar with 1 row for every single day of 2016 with a count on each day of how many customers "lapsed" on that day, meaning their last purchase was 30 days or more prior from that day of the year. The final output will look like this:
+------------+--------------+
| date | lapsed_count |
+------------+--------------+
| 01/01/2016 | 0 |
+------------+--------------+
| 01/02/2016 | 0 |
+------------+--------------+
| ... | ... |
+------------+--------------+
| 03/01/2016 | 12 |
+------------+--------------+
| 03/02/2016 | 9 |
+------------+--------------+
| 03/03/2016 | 7 |
+------------+--------------+
This data does not exist in 2015, therefore it's not possible for Jan-01-2016 to have a count of lapsed customers because that is the first possible day to ever make a purchase.
So for customer_id #123, they purchased on 01/25/2016 and 04/27/2016. They should have 2 lapse counts because their purchases are more than 30 days apart. One lapse occurring on 2/24/2016 and another lapse on 05/27/2016.
Customer_id#444 only purchased once, so they should have one lapse count 30 days after 02/02/2016, on 03/03/2016.
Customer_id#521 is tricky: since they purchased with a frequency of 1 day, we will not count the first purchase on 01/23/2016, so there is only one lapse, starting from their last purchase on 01/24/2016. The count for the lapse will occur on 02/23/2016 (+30 days).
If you have a table of dates, here is one expensive method:
select date,
sum(case when prev_date < date - 30 then 1 else 0 end) as lapsed
from (select c.date, cu.customer_id, max(o.date) as prev_date
from calendar c cross join
(select distinct customer_id from orders) cu left join
orders o
on o.date <= c.date and o.customer_id = cu.customer_id
group by c.date, cu.customer_id
) oc
group by date;
For each date/customer pair, it determines the latest purchase the customer made before the date. It then uses this information to count the lapsed.
To be honest, this will probably work well on a handful of dates, but not for a full year's worth.
Apologies, I didn't read your question properly the first time around. This query will give you all the lapses you have. It takes each order and uses an analytic function to work out the next order date; if the gap is greater than 30 days, then a lapse is recorded:
WITH
cust_orders (customer_id , order_date , order_id )
AS
(SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual
)
SELECT
customer_id
,order_date
,order_id
,next_order_date
,order_date + 30 lapse_date
FROM
(SELECT
customer_id
,order_date
,order_id
,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
FROM
cust_orders
)
WHERE NVL(next_order_date,sysdate) - order_date > 30
;
Now join that to a set of dates and run a COUNT function (enter the year parameter as YYYY) :
WITH
cust_orders (customer_id , order_date , order_id )
AS
(SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual
)
,calendar (date_value)
AS
(SELECT TO_DATE('01/01/'||:P_year,'DD/MM/YYYY') + (rownum -1)
FROM all_tables
WHERE rownum < (TO_DATE('31/12/'||:P_year,'DD/MM/YYYY') - TO_DATE('01/01/'||:P_year,'DD/MM/YYYY')) + 2
)
SELECT
calendar.date_value
,COUNT(*)
FROM
(
SELECT
customer_id
,order_date
,order_id
,next_order_date
,order_date + 30 lapse_date
FROM
(SELECT
customer_id
,order_date
,order_id
,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
FROM
cust_orders
)
WHERE NVL(next_order_date,sysdate) - order_date > 30
) lapses
,calendar
WHERE 1=1
AND calendar.date_value = TRUNC(lapses.lapse_date)
GROUP BY
calendar.date_value
;
Or if you really want every date printed out then use this :
WITH
cust_orders (customer_id , order_date , order_id )
AS
(SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual
)
,lapses
AS
(SELECT
customer_id
,order_date
,order_id
,next_order_date
,order_date + 30 lapse_date
FROM
(SELECT
customer_id
,order_date
,order_id
,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
FROM
cust_orders
)
WHERE NVL(next_order_date,sysdate) - order_date > 30
)
,calendar (date_value)
AS
(SELECT TO_DATE('01/01/'||:P_year,'DD/MM/YYYY') + (rownum -1)
FROM all_tables
WHERE rownum < (TO_DATE('31/12/'||:P_year,'DD/MM/YYYY') - TO_DATE('01/01/'||:P_year,'DD/MM/YYYY')) + 2
)
SELECT
calendar.date_value
,(SELECT COUNT(*)
FROM lapses
WHERE calendar.date_value = lapses.lapse_date
)
FROM
calendar
WHERE 1=1
ORDER BY
calendar.date_value
;
Here's how I'd do it:
WITH your_table AS (SELECT 123 customer_id, to_date('24/01/2016', 'dd/mm/yyyy') order_date, 12345 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('24/01/2016', 'dd/mm/yyyy') order_date, 12346 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('25/01/2016', 'dd/mm/yyyy') order_date, 12347 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('24/02/2016', 'dd/mm/yyyy') order_date, 12347 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('16/03/2016', 'dd/mm/yyyy') order_date, 12348 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('18/04/2016', 'dd/mm/yyyy') order_date, 12349 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('20/02/2016', 'dd/mm/yyyy') order_date, 12350 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('01/03/2016', 'dd/mm/yyyy') order_date, 12351 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('03/03/2016', 'dd/mm/yyyy') order_date, 12352 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('18/04/2016', 'dd/mm/yyyy') order_date, 12353 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('20/05/2016', 'dd/mm/yyyy') order_date, 12354 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('23/06/2016', 'dd/mm/yyyy') order_date, 12355 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('19/01/2017', 'dd/mm/yyyy') order_date, 12356 order_id FROM dual),
-- end of mimicking your_table with data in it
lapsed_info AS (SELECT customer_id,
order_date,
CASE WHEN TRUNC(SYSDATE) - order_date <= 30 THEN NULL
WHEN COUNT(*) OVER (PARTITION BY customer_id ORDER BY order_date RANGE BETWEEN 1 FOLLOWING AND 30 FOLLOWING) = 0 THEN order_date+30
ELSE NULL
END lapsed_date
FROM your_table),
dates AS (SELECT to_date('01/01/2016', 'dd/mm/yyyy') + LEVEL -1 dt
FROM dual
CONNECT BY to_date('01/01/2016', 'dd/mm/yyyy') + LEVEL -1 <= TRUNC(SYSDATE))
SELECT dates.dt,
COUNT(li.lapsed_date) lapsed_count
FROM dates
LEFT OUTER JOIN lapsed_info li ON dates.dt = li.lapsed_date
GROUP BY dates.dt
ORDER BY dates.dt;
Results:
DT LAPSED_COUNT
---------- ------------
01/01/2016 0
<snip>
23/01/2016 0
24/01/2016 0
25/01/2016 0
26/01/2016 0
<snip>
19/02/2016 0
20/02/2016 0
21/02/2016 0
22/02/2016 0
23/02/2016 0
24/02/2016 0
25/02/2016 0
<snip>
29/02/2016 0
01/03/2016 0
02/03/2016 0
03/03/2016 0
04/03/2016 0
<snip>
15/03/2016 0
16/03/2016 0
17/03/2016 0
<snip>
20/03/2016 0
21/03/2016 0
22/03/2016 0
<snip>
30/03/2016 0
31/03/2016 0
01/04/2016 0
02/04/2016 1
03/04/2016 0
<snip>
14/04/2016 0
15/04/2016 1
16/04/2016 0
17/04/2016 0
18/04/2016 0
19/04/2016 0
<snip>
17/05/2016 0
18/05/2016 2
19/05/2016 0
20/05/2016 0
21/05/2016 0
<snip>
18/06/2016 0
19/06/2016 1
20/06/2016 0
21/06/2016 0
22/06/2016 0
23/06/2016 0
24/06/2016 0
<snip>
22/07/2016 0
23/07/2016 1
24/07/2016 0
<snip>
18/01/2017 0
19/01/2017 0
20/01/2017 0
<snip>
08/02/2017 0
This takes your data and uses the analytic count function to work out the number of rows that have an order date in the 30 days after (but excluding) the current row's date.
Then we apply a case expression: if the row's date is within 30 days of today's date, we count it as not lapsed. Otherwise, if a count of 0 was returned, the row is considered lapsed and we output the lapsed date as the order_date plus 30 days. Any other count result means the row has not lapsed.
The above is all worked out in the lapsed_info subquery.
Then all we need to do is list the dates (see the dates subquery) and outer join the lapsed_info subquery to it based on the lapsed_date and then do a count of the lapsed dates for each day.
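The lapse rule described above can also be checked outside the database. Here is a small Python sketch of it, using customer 456 from the sample data. The function name lapse_dates is hypothetical, and today is fixed to 2017-02-08 as an assumption (the last date shown in the results) instead of calling the system clock.

```python
from collections import Counter
from datetime import date, timedelta

def lapse_dates(order_dates, today, window_days=30):
    """Return order_date + window_days for every order that is not followed
    by another order within window_days. Orders still inside the window as
    of `today` are skipped, because they cannot have lapsed yet."""
    ordered = sorted(order_dates)
    lapses = []
    for current in ordered:
        if (today - current).days <= window_days:
            continue
        deadline = current + timedelta(days=window_days)
        # Mirrors RANGE BETWEEN 1 FOLLOWING AND 30 FOLLOWING: same-day orders
        # do not count, an order exactly on the deadline does.
        followed = any(current < other <= deadline for other in ordered)
        if not followed:
            lapses.append(deadline)
    return lapses

orders_456 = [
    date(2016, 2, 20), date(2016, 3, 1), date(2016, 3, 3), date(2016, 4, 18),
    date(2016, 5, 20), date(2016, 6, 23), date(2017, 1, 19),
]
today = date(2017, 2, 8)  # assumed run date

lapsed_per_day = Counter(lapse_dates(orders_456, today))
for day in sorted(lapsed_per_day):
    print(day, lapsed_per_day[day])
```

Feeding the lapse dates of all customers into the same Counter gives the per-day lapsed_count; days that are missing from the Counter correspond to the 0 rows produced by the outer join.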

Selecting a single ID by the most recent date Order Date -SQL

I have a table that looks like this
Indvdl_Store_ID Indvdl_ID Order_ID Order_Date
101 123 A000 12/24/2011
101 241 B002 01/01/2013
101 201 Y180 01/01/2016
Since we have the same Indvdl_Store_ID associated with 3 different Indvdl_IDs, I want to keep the most recent Indvdl_ID for that Indvdl_Store_ID based on the order date, but still keep all of the orders associated with the Indvdl_Store_ID. So I would like my final results to look like this:
Indvdl_Store_ID Indvdl_ID Order_ID Order_Date
101 201 A000 12/24/2011
101 201 B002 01/01/2013
101 201 Y180 01/01/2016
I have tried using row_number to dedupe and then joining the final results back to the table on Indvdl_Store_ID, but I still seem to be having issues getting the correct results. I would appreciate any help or suggestions.
Thanks in advance!
Oracle Setup:
CREATE TABLE table_name (Indvdl_Store_ID, Indvdl_ID, Order_ID, Order_Date ) AS
SELECT 101, 123, 'A000', DATE '2011-12-24' FROM DUAL UNION ALL
SELECT 101, 241, 'B002', DATE '2013-01-01' FROM DUAL UNION ALL
SELECT 101, 201, 'Y180', DATE '2016-01-01' FROM DUAL;
Query:
SELECT Indvdl_Store_ID,
MAX( Indvdl_ID ) KEEP ( DENSE_RANK LAST ORDER BY ORDER_DATE )
OVER ( PARTITION BY INDVDL_STORE_ID )
AS Indvdl_ID,
Order_ID,
Order_Date
FROM table_name;
Output:
INDVDL_STORE_ID INDVDL_ID ORDER_ID ORDER_DATE
--------------- ---------- -------- -------------------
101 201 A000 2011-12-24 00:00:00
101 201 Y180 2016-01-01 00:00:00
101 201 B002 2013-01-01 00:00:00
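What MAX(...) KEEP (DENSE_RANK LAST ORDER BY order_date) OVER (PARTITION BY store) does can be mimicked in a few lines of Python, which may help when checking the result. This is only a sketch; the function name apply_latest_individual is hypothetical.

```python
from datetime import date

# (indvdl_store_id, indvdl_id, order_id, order_date)
orders = [
    (101, 123, "A000", date(2011, 12, 24)),
    (101, 241, "B002", date(2013, 1, 1)),
    (101, 201, "Y180", date(2016, 1, 1)),
]

def apply_latest_individual(orders):
    latest = {}  # store_id -> (order_date, indvdl_id) of the most recent order
    for store_id, indvdl_id, _, order_date in orders:
        candidate = (order_date, indvdl_id)
        # Tuple comparison: latest date wins, ties on date go to the larger id,
        # which matches taking MAX over the rows that rank last.
        if store_id not in latest or candidate > latest[store_id]:
            latest[store_id] = candidate
    # Every order keeps its own id and date but gets the store's latest individual.
    return [
        (store_id, latest[store_id][1], order_id, order_date)
        for store_id, _, order_id, order_date in orders
    ]

result = apply_latest_individual(orders)
for row in result:
    print(row)
```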
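It could also be done with an inner join and group by: find the latest order date per store, then join back to pick up the Indvdl_ID of that order.
select b.Indvdl_Store_ID, a.Indvdl_ID, b.Order_ID, b.Order_Date
from my_table b
inner join (select Indvdl_Store_ID, max(Order_Date) as last_order_date
from my_table
group by Indvdl_Store_ID) m on m.Indvdl_Store_ID = b.Indvdl_Store_ID
inner join my_table a on a.Indvdl_Store_ID = m.Indvdl_Store_ID
and a.Order_Date = m.last_order_date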