I have the following table showing when customers bought a certain product. The data I have is CustomerID, Amount, Dat. I am trying to create the column ProductsIn30Days, which represents how many products a customer bought in the range Dat-30 days inclusive the current day.
For example, ProductsIn30Days for CustomerID 1 on Dat 25.3.2020 is 7, since the customer bought 2 products on 25.3.2020 and 5 more products on 24.3.2020, which falls within 30 days before 25.3.2020.
CustomerID
Amount
Dat
ProductsIn30Days
1
1
23.3.2018
1
1
2
24.3.2020
2
1
3
24.3.2020
5
1
2
25.3.2020
7
1
2
24.5.2020
2
1
1
15.6.2020
3
2
7
24.3.2017
7
2
2
24.3.2020
2
I tried something like this with no success, since the partition only works on a single date rather than on a range like I would need:
select CustomerID, Amount, Dat,
sum(Amount) over (partition by CustomerID, Dat-30)
from table
Thank you for help.
You can use an analytic SUM function with a range window:
SELECT t.*,
SUM(Amount) OVER (
PARTITION BY CustomerID
ORDER BY Dat
RANGE BETWEEN INTERVAL '30' DAY PRECEDING AND CURRENT ROW
) AS ProductsIn30Days
FROM table_name t;
Which, for the sample data:
CREATE TABLE table_name (CustomerID, Amount, Dat) AS
SELECT 1, 1, DATE '2018-03-23' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-03-24' FROM DUAL UNION ALL
SELECT 1, 3, DATE '2020-03-24' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-03-25' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-05-24' FROM DUAL UNION ALL
SELECT 1, 1, DATE '2020-06-15' FROM DUAL UNION ALL
SELECT 2, 7, DATE '2017-03-24' FROM DUAL UNION ALL
SELECT 2, 2, DATE '2020-03-24' FROM DUAL;
Outputs:
CUSTOMERID
AMOUNT
DAT
PRODUCTSIN30DAYS
1
1
2018-03-23 00:00:00
1
1
2
2020-03-24 00:00:00
5
1
3
2020-03-24 00:00:00
5
1
2
2020-03-25 00:00:00
7
1
2
2020-05-24 00:00:00
2
1
1
2020-06-15 00:00:00
3
2
7
2017-03-24 00:00:00
7
2
2
2020-03-24 00:00:00
2
Note: If you have values on the same date then they will be tied in the order and always aggregated together (i.e. rows 2 & 3). If you want them to be aggregated separately then you need to order by something else to break the ties but that would not work with a RANGE window.
db<>fiddle here
The goal is to select the count of distinct customer_id's who have not made a purchase in the rolling 30 day period prior to every day in the calendar year 2016. I have created a calendar table in my database to join to.
Here is an example table for reference, let's say you have customers orders normalized as follows:
+-------------+------------+----------+
| customer_id | date | order_id |
+-------------+------------+----------+
| 123 | 01/25/2016 | 1000 |
+-------------+------------+----------+
| 123 | 04/27/2016 | 1025 |
+-------------+------------+----------+
| 444 | 02/02/2016 | 1010 |
+-------------+------------+----------+
| 521 | 01/23/2016 | 998 |
+-------------+------------+----------+
| 521 | 01/24/2016 | 999 |
+-------------+------------+----------+
The goal output is effectively a calendar with 1 row for every single day of 2016 with a count on each day of how many customers "lapsed" on that day, meaning their last purchase was 30 days or more prior from that day of the year. The final output will look like this:
+------------+--------------+
| date | lapsed_count |
+------------+--------------+
| 01/01/2016 | 0 |
+------------+--------------+
| 01/02/2016 | 0 |
+------------+--------------+
| ... | ... |
+------------+--------------+
| 03/01/2016 | 12 |
+------------+--------------+
| 03/02/2016 | 9 |
+------------+--------------+
| 03/03/2016 | 7 |
+------------+--------------+
This data does not exist in 2015, therefore it's not possible for Jan-01-2016 to have a count of lapsed customers because that is the first possible day to ever make a purchase.
So for customer_id #123, they purchased on 01/25/2016 and 04/27/2016. They should have 2 lapse counts because their purchases are more than 30 days apart. One lapse occurring on 2/24/2016 and another lapse on 05/27/2016.
Customer_id#444 only purchased once, so they should have one lapse count for 30 days after 02/02/2016 on 03/02/2016.
Customer_id#521 is tricky, since they purchased with a frequency of 1 day we will not count the first purchase on 03/02/2016, so there is only one lapse starting from their last purchase of 03/03/2016. The count for the lapse will occur on 04/02/2016 (+30 days).
If you have a table of dates, here is one expensive method:
select date,
sum(case when prev_date < date - 30 then 1 else 0 end) as lapsed
from (select c.date, o.customer_id, max(o.date) as prev_date
from calendar c cross join
(select distinct customer_id from orders) c left join
orders o
on o.date <= c.date and o.customer_id = c.customer_id
group by c.date, o.customer_id
) oc
group by date;
For each date/customer pair, it determines the latest purchase the customer made before the date. It then uses this information to count the lapsed.
To be honest, this will probably work well on a handful of dates, but not for a full year's worth.
Apologies, I didn't read your question properly the first time around. This query will give you all the lapses you have. It takes each order and uses an analytic function to work out the next order date - if the gap is greater than 30 days then a lapse is recorded
WITH
cust_orders (customer_id , order_date , order_id )
AS
(SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual
)
SELECT
customer_id
,order_date
,order_id
,next_order_date
,order_date + 30 lapse_date
FROM
(SELECT
customer_id
,order_date
,order_id
,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
FROM
cust_orders
)
WHERE NVL(next_order_date,sysdate) - order_date > 30
;
Now join that to a set of dates and run a COUNT function (enter the year parameter as YYYY) :
WITH
cust_orders (customer_id , order_date , order_id )
AS
(SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual
)
,calendar (date_value)
AS
(SELECT TO_DATE('01/01/'||:P_year,'DD/MM/YYYY') + (rownum -1)
FROM all_tables
WHERE rownum < (TO_DATE('31/12/'||:P_year,'DD/MM/YYYY') - TO_DATE('01/01/'||:P_year,'DD/MM/YYYY')) + 2
)
SELECT
calendar.date_value
,COUNT(*)
FROM
(
SELECT
customer_id
,order_date
,order_id
,next_order_date
,order_date + 30 lapse_date
FROM
(SELECT
customer_id
,order_date
,order_id
,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
FROM
cust_orders
)
WHERE NVL(next_order_date,sysdate) - order_date > 30
) lapses
,calendar
WHERE 1=1
AND calendar.date_value = TRUNC(lapses.lapse_date)
GROUP BY
calendar.date_value
;
Or if you really want every date printed out then use this :
WITH
cust_orders (customer_id , order_date , order_id )
AS
(SELECT 1, TO_DATE('01/01/2016','DD/MM/YYYY'), 1001 FROM dual UNION ALL
SELECT 1, TO_DATE('29/01/2016','DD/MM/YYYY'), 1002 FROM dual UNION ALL
SELECT 1, TO_DATE('01/03/2016','DD/MM/YYYY'), 1003 FROM dual UNION ALL
SELECT 2, TO_DATE('01/01/2016','DD/MM/YYYY'), 1004 FROM dual UNION ALL
SELECT 2, TO_DATE('29/01/2016','DD/MM/YYYY'), 1005 FROM dual UNION ALL
SELECT 2, TO_DATE('01/04/2016','DD/MM/YYYY'), 1006 FROM dual UNION ALL
SELECT 2, TO_DATE('01/06/2016','DD/MM/YYYY'), 1007 FROM dual UNION ALL
SELECT 2, TO_DATE('01/08/2016','DD/MM/YYYY'), 1008 FROM dual UNION ALL
SELECT 3, TO_DATE('01/09/2016','DD/MM/YYYY'), 1009 FROM dual UNION ALL
SELECT 3, TO_DATE('01/12/2016','DD/MM/YYYY'), 1010 FROM dual UNION ALL
SELECT 3, TO_DATE('02/12/2016','DD/MM/YYYY'), 1011 FROM dual UNION ALL
SELECT 3, TO_DATE('03/12/2016','DD/MM/YYYY'), 1012 FROM dual UNION ALL
SELECT 3, TO_DATE('04/12/2016','DD/MM/YYYY'), 1013 FROM dual UNION ALL
SELECT 3, TO_DATE('05/12/2016','DD/MM/YYYY'), 1014 FROM dual UNION ALL
SELECT 3, TO_DATE('06/12/2016','DD/MM/YYYY'), 1015 FROM dual UNION ALL
SELECT 3, TO_DATE('07/12/2016','DD/MM/YYYY'), 1016 FROM dual
)
,lapses
AS
(SELECT
customer_id
,order_date
,order_id
,next_order_date
,order_date + 30 lapse_date
FROM
(SELECT
customer_id
,order_date
,order_id
,LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) next_order_date
FROM
cust_orders
)
WHERE NVL(next_order_date,sysdate) - order_date > 30
)
,calendar (date_value)
AS
(SELECT TO_DATE('01/01/'||:P_year,'DD/MM/YYYY') + (rownum -1)
FROM all_tables
WHERE rownum < (TO_DATE('31/12/'||:P_year,'DD/MM/YYYY') - TO_DATE('01/01/'||:P_year,'DD/MM/YYYY')) + 2
)
SELECT
calendar.date_value
,(SELECT COUNT(*)
FROM lapses
WHERE calendar.date_value = lapses.lapse_date
)
FROM
calendar
WHERE 1=1
ORDER BY
calendar.date_value
;
Here's how I'd do it:
WITH your_table AS (SELECT 123 customer_id, to_date('24/01/2016', 'dd/mm/yyyy') order_date, 12345 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('24/01/2016', 'dd/mm/yyyy') order_date, 12346 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('25/01/2016', 'dd/mm/yyyy') order_date, 12347 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('24/02/2016', 'dd/mm/yyyy') order_date, 12347 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('16/03/2016', 'dd/mm/yyyy') order_date, 12348 order_id FROM dual UNION ALL
SELECT 123 customer_id, to_date('18/04/2016', 'dd/mm/yyyy') order_date, 12349 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('20/02/2016', 'dd/mm/yyyy') order_date, 12350 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('01/03/2016', 'dd/mm/yyyy') order_date, 12351 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('03/03/2016', 'dd/mm/yyyy') order_date, 12352 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('18/04/2016', 'dd/mm/yyyy') order_date, 12353 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('20/05/2016', 'dd/mm/yyyy') order_date, 12354 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('23/06/2016', 'dd/mm/yyyy') order_date, 12355 order_id FROM dual UNION ALL
SELECT 456 customer_id, to_date('19/01/2017', 'dd/mm/yyyy') order_date, 12356 order_id FROM dual),
-- end of mimicking your_table with data in it
lapsed_info AS (SELECT customer_id,
order_date,
CASE WHEN TRUNC(SYSDATE) - order_date <= 30 THEN NULL
WHEN COUNT(*) OVER (PARTITION BY customer_id ORDER BY order_date RANGE BETWEEN 1 FOLLOWING AND 30 FOLLOWING) = 0 THEN order_date+30
ELSE NULL
END lapsed_date
FROM your_table),
dates AS (SELECT to_date('01/01/2016', 'dd/mm/yyyy') + LEVEL -1 dt
FROM dual
CONNECT BY to_date('01/01/2016', 'dd/mm/yyyy') + LEVEL -1 <= TRUNC(SYSDATE))
SELECT dates.dt,
COUNT(li.lapsed_date) lapsed_count
FROM dates
LEFT OUTER JOIN lapsed_info li ON dates.dt = li.lapsed_date
GROUP BY dates.dt
ORDER BY dates.dt;
Results:
DT LAPSED_COUNT
---------- ------------
01/01/2016 0
<snip>
23/01/2016 0
24/01/2016 0
25/01/2016 0
26/01/2016 0
<snip>
19/02/2016 0
20/02/2016 0
21/02/2016 0
22/02/2016 0
23/02/2016 0
24/02/2016 1
25/02/2016 0
<snip>
29/02/2016 0
01/03/2016 0
02/03/2016 0
03/03/2016 0
04/03/2016 0
<snip>
15/03/2016 0
16/03/2016 0
17/03/2016 0
<snip>
20/03/2016 0
21/03/2016 0
22/03/2016 0
<snip>
30/03/2016 0
31/03/2016 0
01/04/2016 0
02/04/2016 1
03/04/2016 0
<snip>
14/04/2016 0
15/04/2016 1
16/04/2016 0
17/04/2016 0
18/04/2016 0
19/04/2016 0
<snip>
17/05/2016 0
18/05/2016 2
19/05/2016 0
20/05/2016 0
21/05/2016 0
<snip>
18/06/2016 0
19/06/2016 1
20/06/2016 0
21/06/2016 0
22/06/2016 0
23/06/2016 0
24/06/2016 0
<snip>
22/07/2016 0
23/07/2016 1
24/07/2016 0
<snip>
18/01/2017 0
19/01/2017 0
20/01/2017 0
<snip>
08/02/2017 0
This takes your data, and uses an the analytic count function to work out the number of rows that have a value within 30 days of (but excluding) the current row's date.
Then we apply a case expression to determine that if the row has a date within 30 days of today's date, we'll count those as not lapsed. If a count of 0 was returned, then the row is considered lapsed and we'll output the lapsed date as the order_date plus 30 days. Any other count result means the row has not lapsed.
The above is all worked out in the lapsed_info subquery.
Then all we need to do is list the dates (see the dates subquery) and outer join the lapsed_info subquery to it based on the lapsed_date and then do a count of the lapsed dates for each day.
I was going through this forum for my query to get data for previous 7 days,but most of them give it for current date.Below is my requirement:
I have a Table 1 as below:
These are start dates of week which is monday
from_date
2016-01-04
2016-01-11
2016-01-18
Table 2
I have all days of week here starting from monday.Ex: jan 04 - monday to jan 10 - sunday and so on for other weeks also.
get_date flag value
2016-01-04 N 4
2016-01-05 N 9
2016-01-06 Y 2
2016-01-07 Y 13
2016-01-08 Y 7
2016-01-09 Y 8
2016-01-10 Y 8
2016-01-11 Y 1
2016-01-12 Y 9
2016-01-13 N 8
2016-01-14 N 24
2016-01-15 N 8
2016-01-16 Y 4
2016-01-17 Y 5
2016-01-18 Y 9
2016-01-19 Y 2
2016-01-20 Y 8
2016-01-21 Y 4
2016-01-22 N 9
2016-01-23 N 87
2016-01-24 Y 3
Expected Result
here wk is the unique number for each start-end dates respectively
avg value is the avg of the values for the dates in that week.
last 2 days of the week are weekend days.
say 2016-01-09 and 2016-01-10 are weekends
from_date get_date Wk Total_days Total_weekdays_flag_Y Total_weekenddays_flag_Y Avg_value
2016-01-04 2016-01-10 1 7 3 2 6.714285714
2016-01-11 2016-01-17 2 7 2 2 8.428571429
2016-01-18 2016-01-24 3 7 4 1 17.42857143
Could anyone help me with this as I am not good at sql.
Thanks
select
from_date
, Wk
, count(case when day_of_week <=5 and flag = 'Y' then 1 end) as Total_weekdays_flag_Y
, count(case when day_of_week > 5 and flag = 'Y' then 1 end) as Total_weekenddays_flag_Y
, avg(value) as Avg_value
from (
select trunc(get_date,'IW') as from_date
, (trunc(get_date,'IW')- trunc(date'2016-01-04','IW'))/7 + 1 as Wk
, flag
, value
, get_date - trunc(get_date,'IW') as day_of_week
from Table_2)
group by from_date, Wk
order by from_date, Wk;
EDIT:
/*generate some test_data for table 2*/
with table_2 (get_date, flag, value) as (
select date'2016-01-03' + level,
DECODE(mod(level,3),0,'Y','N'),
round(dbms_random.value(0,10))
from dual connect by level < 101
),
/*generate some test_weeks for table 1*/
table_1 (FROM_date) as (select date'2016-01-04' + (level-1)*7 from dual connect by level < 101 )
/*main query */
select
from_date
, Wk
, count(day_of_week) as total
, count(case when day_of_week <=5 and flag = 'Y' then 1 end) as Total_weekdays_flag_Y
, count(case when day_of_week > 5 and flag = 'Y' then 1 end) as Total_weekenddays_flag_Y
, avg(value) as Avg_value
from (
select last_value(from_date ignore nulls) over (order by get_date) as from_date
,last_value(Wk ignore nulls) over (order by get_date) as Wk
, flag
, value
, get_date - trunc(get_date,'IW') as day_of_week
from Table_2 t2
full join (select row_number() over (order by from_date) as wk,from_date from table_1) t1 on t2.get_date = t1.from_date
)
group by from_date, Wk
having count(day_of_week) > 0
order by from_date, Wk
In the query below, I create the test data right within the query; in final form, you would delete the subqueries table_1 and table_2 and use the rest.
The syntax will work from Oracle 11.2 on. In Oracle 11.1, you need to move the column names in factored subqueries to the select... from dual part. Or, since you really only have one subquery (prep) and an outer query, you can write prep as an actual, in-line subquery.
Your arithmetic seems off on the average for the first week.
In your sample output you use get_date for the last day of the week. That is odd, since in table_2 that name has a different meaning. I used to_date in my output. I also do not show total_days - that is always 7, so why include it at all? (If it is not always 7, then there is something you didn't tell us; anyway, a count(...), if that is what it should be, is easy to add).
with
-- begin test data, can be removed in final solution
table_1 ( from_date ) as (
select date '2016-01-04' from dual union all
select date '2016-01-11' from dual union all
select date '2016-01-18' from dual
)
,
table_2 ( get_date, flag, value ) as (
select date '2016-01-04', 'N', 4 from dual union all
select date '2016-01-05', 'N', 9 from dual union all
select date '2016-01-06', 'Y', 2 from dual union all
select date '2016-01-07', 'Y', 13 from dual union all
select date '2016-01-08', 'Y', 7 from dual union all
select date '2016-01-09', 'Y', 8 from dual union all
select date '2016-01-10', 'Y', 8 from dual union all
select date '2016-01-11', 'Y', 1 from dual union all
select date '2016-01-12', 'Y', 9 from dual union all
select date '2016-01-13', 'N', 8 from dual union all
select date '2016-01-14', 'N', 24 from dual union all
select date '2016-01-15', 'N', 8 from dual union all
select date '2016-01-16', 'Y', 4 from dual union all
select date '2016-01-17', 'Y', 5 from dual union all
select date '2016-01-18', 'Y', 9 from dual union all
select date '2016-01-19', 'Y', 2 from dual union all
select date '2016-01-20', 'Y', 8 from dual union all
select date '2016-01-21', 'Y', 4 from dual union all
select date '2016-01-22', 'N', 9 from dual union all
select date '2016-01-23', 'N', 87 from dual union all
select date '2016-01-24', 'Y', 3 from dual
),
-- end test data, continue actual query
prep ( get_date, flag, value, from_date, wd_flag ) as (
select t2.get_date, t2.flag, t2.value, t1.from_date,
case when t2.get_date - t1.from_date <= 4 then 'wd' else 'we' end
from table_1 t1 inner join table_2 t2
on t2.get_date between t1.from_date and t1.from_date + 6
)
select from_date,
from_date + 6 as to_date,
row_number() over (order by from_date) as wk,
count(case when flag = 'Y' and wd_flag = 'wd' then 1 end)
as total_weekday_Y,
count(case when flag = 'Y' and wd_flag = 'we' then 1 end)
as total_weekend_Y,
round(avg(value), 6) as avg_value
from prep
group by from_date;
Output:
FROM_DATE TO_DATE WK TOTAL_WEEKDAY_Y TOTAL_WEEKEND_Y AVG_VALUE
---------- ---------- ---- --------------- --------------- ----------
2016-01-04 2016-01-10 1 3 2 7.285714
2016-01-11 2016-01-17 2 2 2 8.428571
2016-01-18 2016-01-24 3 4 1 17.428571
I have month and value columns in a table,like
Month Value Market
2010/01 100 1
2010/02 200 1
2010/03 300 1
2010/04 400 1
2010/05 500 1
2010/01 100 2
2010/02 200 2
2010/03 300 2
2010/04 400 2
2010/05 500 2
What I want to do is get new Month and Value combinations using (value in month(n-1)+value in month(n))/2=value in month n, also this calculation is based on market column, it group by market number. So, for the above example, the new month and value combination should be
Month Value Market
2010/01 null 1
2010/02 (100+200)/2 1
2010/03 (200+300)/2 1
2010/04 (300+400)/2 1
2010/05 (400+500)/2 1
2010/01 null 2
2010/02 (100+200)/2 2
2010/03 (200+300)/2 2
2010/04 (300+400)/2 2
2010/05 (400+500)/2 2
Do you know how to achieve it in Oracle? Thank you!
If there is no gap in your data, you can use LAG:
SQL> WITH DATA AS (
2 SELECT DATE '2010-01-01' mon, 100 val FROM dual UNION ALL
3 SELECT DATE '2010-02-01' mon, 200 val FROM dual UNION ALL
4 SELECT DATE '2010-03-01' mon, 300 val FROM dual UNION ALL
5 SELECT DATE '2010-04-01' mon, 400 val FROM dual UNION ALL
6 SELECT DATE '2010-05-01' mon, 500 val FROM dual
7 )
8 SELECT mon, (LAG(val) OVER (ORDER BY mon) + val) / 2 avg_val FROM DATA;
MON AVG_VAL
----------- ----------
01/01/2010
01/02/2010 150
01/03/2010 250
01/04/2010 350
01/05/2010 450
However, if there is a gap the result might not be what you expect. In that case, you can either use a self-join or narrow the windowing clause:
SQL> WITH DATA AS (
2 SELECT DATE '2010-01-01' mon, 100 val FROM dual UNION ALL
3 SELECT DATE '2010-02-01' mon, 200 val FROM dual UNION ALL
4 SELECT DATE '2010-03-01' mon, 300 val FROM dual UNION ALL
5 /* gap ! */
6 SELECT DATE '2010-05-01' mon, 400 val FROM dual UNION ALL
7 SELECT DATE '2010-06-01' mon, 500 val FROM dual
8 )
9 SELECT mon, (first_value(val)
10 OVER (ORDER BY mon
11 RANGE BETWEEN INTERVAL '1' MONTH PRECEDING
12 AND INTERVAL '1' MONTH PRECEDING)
13 + val) / 2 avg_val
14 FROM DATA;
MON AVG_VAL
----------- ----------
01/01/2010
01/02/2010 150
01/03/2010 250
01/05/2010
01/06/2010 450
This does it:
SQL> select month,
2 (value+lag(value) over (order by month))/2 as value
3* from t1
MONTH VALUE
---------- ----------
2010/01
2010/02 150
2010/03 250
2010/04 350
2010/05 450
5 rows selected.
I am attempting to write Oracle SQL.
I am looking for solution something similar. Please find below data I have
start_date end_date customer
01-01-2012 31-06-2012 a
01-01-2012 31-01-2012 b
01-02-2012 31-03-2012 c
I want the count of customer in that date period. My result should look like below
Month : Customer Count
JAN-12 : 2
FEB-12 : 2
MAR-12 : 2
APR-12 : 1
MAY-12 : 1
JUN-12 : 1
One option would be to generate the months separately in another query and join that to your data table (note that I'm assuming that you intended customer A to have an end-date of June 30, 2012 since there is no June 31).
SQL> ed
Wrote file afiedt.buf
1 with mnths as(
2 select add_months( date '2012-01-01', level - 1 ) mnth
3 from dual
4 connect by level <= 6 ),
5 data as (
6 select date '2012-01-01' start_date, date '2012-06-30' end_date, 'a' customer from dual union all
7 select date '2012-01-01', date '2012-01-31', 'b' from dual union all
8 select date '2012-02-01', date '2012-03-31', 'c' from dual
9 )
10 select mnths.mnth, count(*)
11 from data,
12 mnths
13 where mnths.mnth between data.start_date and data.end_date
14 group by mnths.mnth
15* order by mnths.mnth
SQL> /
MNTH COUNT(*)
--------- ----------
01-JAN-12 2
01-FEB-12 2
01-MAR-12 2
01-APR-12 1
01-MAY-12 1
01-JUN-12 1
6 rows selected.
WITH TMP(monthyear,start_date,end_date,customer) AS (
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from data
union all
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from TMP
where LAST_DAY(end_date) >= LAST_DAY(start_date)
)
SELECT TO_CHAR(MonthYear, 'MON-YY') TheMonth,
Count(Customer) Customers
FROM TMP
GROUP BY MonthYear
ORDER BY MonthYear;
SQLFiddle