I have written a query that produces the result set below.
Note: the months run from jan-2016 to jan-2018.
There are two types, 'hist' and 'future'.
Result set:
In this example, consider the combination id1+id2+id3 to be 1, 2, 3.
type month id1 id2 id3 value
hist jan-17 1 2 3 10
hist feb-17 1 2 3 20
future jan-17 1 2 3 15
future feb-17 1 2 3 1
hist mar-17 1 2 3 2
future apr-17 1 2 3 5
My calculation depends on the position of the month within its quarter.
For example, for January (the first month of a quarter) I want the value to be: future of Jan + future of Feb + future of March.
So for jan-17 the output should be 15 + 1 + 0 (there is no future value for March).
For February (the 2nd month of a quarter) the value should be: hist of Jan + future of Feb + future of March, i.e. 10 + 1 + 0 (the future value for March is not available).
Similarly, for March the value should be: hist of Jan + hist of Feb + future of March, i.e. 10 + 20 + 0 (there is no future value for March).
The same applies to April, May and June (depending on the month's position within the quarter).
I am aware of the LEAD and LAG functions, but I am not able to apply them here.
Can someone please help?
I would not mess with LAG; this can all be done with a GROUP BY if you convert your dates to quarters:
WITH
dset
AS
(SELECT DATE '2017-01-17' month, 5 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-02-17' month, 6 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-03-25' month, 7 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-05-25' month, 4 VALUE
FROM DUAL)
SELECT SUM (VALUE) value_sum, TO_CHAR (month, 'q') quarter, TO_CHAR (month, 'YYYY') year
FROM dset
GROUP BY TO_CHAR (month, 'q'), TO_CHAR (month, 'YYYY');
This results in:
VALUE_SUM QUARTER YEAR
18 1 2017
4 2 2017
We can use an analytic function if you need the result on each record:
SELECT SUM (VALUE) OVER (PARTITION BY TO_CHAR (month, 'q'), TO_CHAR (month, 'YYYY')) quarter_sum, month, VALUE
FROM dset
This results in:
QUARTER_SUM MONTH VALUE
18 1/17/2017 5
18 2/17/2017 6
18 3/25/2017 7
4 5/25/2017 4
Make certain you include the year; you don't want to combine quarters from different years.
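As a quick cross-check of the grouping outside the database, here is a minimal Python sketch that sums the same sample rows per (year, quarter). The function name quarter_sums is made up for illustration; only the sample values come from the dset CTE above.

```python
from collections import defaultdict
from datetime import date


def quarter_sums(rows):
    """Sum values per (year, quarter) for an iterable of (date, value) pairs."""
    totals = defaultdict(int)
    for day, value in rows:
        quarter = (day.month - 1) // 3 + 1  # months 1-3 -> 1, 4-6 -> 2, ...
        totals[(day.year, quarter)] += value
    return dict(totals)


# Same sample rows as the dset CTE above.
dset = [
    (date(2017, 1, 17), 5),
    (date(2017, 2, 17), 6),
    (date(2017, 3, 25), 7),
    (date(2017, 5, 25), 4),
]
sums = quarter_sums(dset)
print(sums)  # {(2017, 1): 18, (2017, 2): 4}
```

Keying on the (year, quarter) pair rather than the quarter alone is what keeps Q1 of one year from being merged with Q1 of another.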
As noted in one of the comments, the trick lies in another question of yours and its answer. It goes something like this:
with
x as
(select 'hist' type, To_Date('JAN-2017','MON-YYYY') ym , 10 value from dual union all
select 'future' type, To_Date('JAN-2017','MON-YYYY'), 15 value from dual union all
select 'future' type, To_Date('FEB-2017','MON-YYYY'), 1 value from dual),
y as
(select * from x Pivot(Sum(Value) For Type in ('hist' as h,'future' as f))),
/* Pivot for easy lag,lead query instead of working with rows..*/
z as
(
select ym,sum(h) H,sum(f) F from (
Select y.ym,y.H,y.F from y
union all
select add_months(to_Date('01-JAN-2017','DD-MON-YYYY'),rownum-1) ym, 0 H, 0 F
from dual connect by rownum <=3 /* depends on how many months you are querying...
so this dual adds the corresponding missing 0 records...*/
) group by ym
)
select
ym,
Case
When MOD(Extract(Month from YM),3) = 1
Then F + Lead(F,1) Over(Order by ym) + Lead(F,2) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 2
Then Lag(H,1) Over(Order by ym) + F + Lead(F,1) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 0 /* third month of a quarter: MOD(..., 3) is 0, never 3 */
Then Lag(H,2) Over(Order by ym) + Lag(H,1) Over(Order by ym) + F
End Required_Value
from z
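To sanity-check the CASE logic against the numbers in the question, the rule can be sketched in plain Python: within a quarter, months before the current one contribute their hist value, and the current and later months contribute their future value, with missing entries counted as 0. The function name quarter_value and the dictionary layout are assumptions for illustration only.

```python
def quarter_value(year, month, hist, future):
    """Value for (year, month): hist for earlier months of the quarter,
    future for the current and later months; missing entries count as 0."""
    first = month - (month - 1) % 3  # first month of the quarter
    total = 0
    for m in range(first, first + 3):
        source = hist if m < month else future
        total += source.get((year, m), 0)
    return total


# Sample data from the question (id1, id2, id3 = 1, 2, 3).
hist = {(2017, 1): 10, (2017, 2): 20, (2017, 3): 2}
future = {(2017, 1): 15, (2017, 2): 1, (2017, 4): 5}

for m in (1, 2, 3):
    print(m, quarter_value(2017, m, hist, future))
# 1 16
# 2 11
# 3 30
```

These match the values the question expects for jan-17, feb-17 and mar-17.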
I have a table of records like this:
Item From To
A 2018-01-03 2018-03-16
B 2021-05-25 2021-11-10
The output of select should look like:
Item Month Year
A 01 2018
A 02 2018
A 03 2018
B 05 2021
B 06 2021
B 07 2021
B 08 2021
Also, the range should not exceed the current month. In the example above we are assuming the current day is 2021-08-01.
I am trying to do something similar to THIS with CONNECT BY LEVEL, but as soon as I select from my table alongside dual and try to order the records, the query never completes. I also have to join a few other tables in the query, but I don't think that would make a difference.
I would very much appreciate your help.
Row generator it is, but not as you did it; most probably you're missing lines #11 - 16 in my query (or their alternative).
SQL> with test (item, date_from, date_to) as
2 -- sample data
3 (select 'A', date '2018-01-03', date '2018-03-16' from dual union all
4 select 'B', date '2021-05-25', date '2021-11-10' from dual
5 )
6 -- query that returns desired result
7 select item,
8 extract(month from (add_months(date_from, column_value - 1))) month,
9 extract(year from (add_months(date_from, column_value - 1))) year
10 from test cross join
11 table(cast(multiset
12 (select level
13 from dual
14 connect by level <=
15 months_between(trunc(least(sysdate, date_to), 'mm'), trunc(date_from, 'mm')) + 1
16 ) as sys.odcinumberlist))
17 order by item, year, month;
ITEM MONTH YEAR
----- ---------- ----------
A 1 2018
A 2 2018
A 3 2018
B 5 2021
B 6 2021
B 7 2021
B 8 2021
7 rows selected.
SQL>
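For comparison, here is a rough Python sketch of what the row generator computes per record: one (month, year) pair for each month from date_from through the earlier of date_to and today. The function name months_in_range is hypothetical, and today is passed in explicitly (rather than read from the clock) so the example from the question, where the current day is assumed to be 2021-08-01, is reproducible.

```python
from datetime import date


def months_in_range(date_from, date_to, today):
    """Return (month, year) pairs from date_from's month through the earlier of
    date_to's month and today's month, inclusive."""
    end = min(date_to, today)
    count = (end.year - date_from.year) * 12 + (end.month - date_from.month) + 1
    result = []
    for offset in range(max(count, 0)):
        # Months since year 0, so adding an offset rolls over years cleanly.
        index = date_from.year * 12 + (date_from.month - 1) + offset
        result.append((index % 12 + 1, index // 12))
    return result


today = date(2021, 8, 1)
print(months_in_range(date(2018, 1, 3), date(2018, 3, 16), today))
# [(1, 2018), (2, 2018), (3, 2018)]
print(months_in_range(date(2021, 5, 25), date(2021, 11, 10), today))
# [(5, 2021), (6, 2021), (7, 2021), (8, 2021)]
```

The count expression plays the same role as the months_between(...) + 1 bound in the CONNECT BY clause.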
Recursive CTEs are the standard SQL approach to this type of problem. In Oracle, this looks like:
with cte(item, fromd, tod) as (
select item, fromd, tod
from t
union all
select item, add_months(fromd, 1), tod
from cte
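where add_months(fromd, 1) <= last_day(least(tod, trunc(sysdate))) -- stop at the current month; <= keeps the final month when fromd falls on a month end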
)
select item, extract(year from fromd) as year, extract(month from fromd) as month
from cte
order by item, fromd;
Here is a db<>fiddle.
input:
item loc month year qty
A DEL 5 2020 12
A DEL 6 2020 14
A DEL 8 2020 16
A DEL 9 2020 17
output:
item loc month year qty
A DEL 5 2020 12
A DEL 6 2020 14
A DEL 7 2020 26
A DEL 8 2020 16
A DEL 9 2020 17
A DEL 10 2020 33
description:
Month 7 is missing from my input, so I calculate it as the sum of the previous two months' quantities.
For example, for month 7 the output will be 12 (from month 5) + 14 (from month 6) = 26.
So whenever a month is missing, I should fill it using this logic.
I have written a two-step script, but it only handles months missing between existing values, not boundary values, i.e. it won't treat month 10 as missing because it lies beyond the last month in the data.
1st step: insert the missing months with NULL for all other columns.
INSERT INTO TEST_MISSING(MONTH)
select min_a - 1 + level
from ( select min(MONTH) min_a
, max(MONTH) max_a
from TEST_MISSING
)
connect by level <= max_a - min_a + 1
minus
select MONTH
from TEST_MISSING;
2nd step: populate the other columns using LAG with values from the rows above,
and then use a window function to calculate the quantity.
SELECT NVL(ITEM, NEW_ITEM) ITEM,
NVL(LOC, NEW_LOC) LOC,
MONTH, NVL(YEAR, NEW_YEAR) YEAR,
CASE WHEN QTY IS NULL THEN SUM(NVL(QTY, 0)) OVER(PARTITION BY NEW_ITEM ORDER BY MONTH ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) ELSE QTY END AS QTY
FROM (
SELECT A.*,
nvl(item,CASE WHEN ITEM IS NULL THEN (LAG(ITEM) OVER(ORDER BY MONTH)) END) NEW_ITEM,
nvl(LOC,CASE WHEN LOC IS NULL THEN (LAG(LOC) OVER(ORDER BY MONTH)) END) NEW_LOC,
nvl(YEAR,CASE WHEN YEAR IS NULL THEN (LAG(YEAR) OVER(ORDER BY MONTH)) END) NEW_YEAR
FROM TEST_MISSING A)
X
ORDER BY MONTH;
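To make the intended fill rule concrete, including the boundary month, here is a small Python sketch. It assumes a single item/loc and a single year, and it assumes the caller states the last month to fill up to (the parameter last_month is hypothetical; the question does not say how the boundary is determined).

```python
def fill_missing_months(qty_by_month, last_month):
    """Fill every month missing between the first known month and last_month
    with the sum of the two preceding months (already-filled values included)."""
    filled = dict(qty_by_month)
    for month in range(min(qty_by_month), last_month + 1):
        if month not in filled:
            filled[month] = filled.get(month - 1, 0) + filled.get(month - 2, 0)
    return dict(sorted(filled.items()))


# Input from the question: months 5, 6, 8, 9 of 2020 for item A / loc DEL.
result = fill_missing_months({5: 12, 6: 14, 8: 16, 9: 17}, last_month=10)
print(result)  # {5: 12, 6: 14, 7: 26, 8: 16, 9: 17, 10: 33}
```

In SQL terms, the equivalent change would be to generate months up to a chosen upper bound rather than up to max(MONTH) in the first step.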
How can I get the sum of rows grouped in pairs? For instance, if I have 5 rows in total, I should get 3 rows as a result.
Below is my table:
2020-08-01 1
2020-08-02 3
2020-08-03 4
2020-08-04 2
2020-08-05 4
I want to achieve this:
4
6
4
August 1 and 2 = 4
August 3 and 4 = 6
August 5 = 4
You could use ROW_NUMBER here:
WITH cte AS (
SELECT dt, val, ROW_NUMBER() OVER (ORDER BY dt) rn
FROM yourTable
)
SELECT SUM(val)
FROM cte
GROUP BY FLOOR((rn - 1) / 2)
ORDER BY MIN(dt);
Here is a demo, shown in SQL Server, but the logic should also work in BigQuery:
Demo
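The grouping key is easy to check by hand. A minimal Python sketch of the same idea, with enumerate standing in for ROW_NUMBER() and integer division standing in for FLOOR((rn - 1) / 2); the function name pair_sums is made up for illustration:

```python
def pair_sums(values):
    """Sum consecutive values two at a time; a trailing odd value stands alone."""
    groups = {}
    for rn, val in enumerate(values, start=1):  # rn plays the role of ROW_NUMBER()
        key = (rn - 1) // 2                     # rows 1,2 -> 0; rows 3,4 -> 1; row 5 -> 2
        groups[key] = groups.get(key, 0) + val
    return [groups[k] for k in sorted(groups)]


print(pair_sums([1, 3, 4, 2, 4]))  # [4, 6, 4]
```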
Below is for Bigquery Standard SQL
#standardSQL
SELECT SUM(value) AS value,
STRING_AGG(FORMAT_DATE('%B %d', day), ' and ') || ' = ' || CAST(SUM(value) AS STRING) AS calc
FROM (
SELECT day, value, DIV(ROW_NUMBER() OVER(ORDER BY day) - 1, 2) grp
FROM `project.dataset.table` t
)
GROUP BY grp
ORDER BY grp
You can test and play with the above using the sample data from your question, as in the example below
#standardSQL
WITH `project.dataset.table` AS (
SELECT DATE '2020-08-01' day, 1 value UNION ALL
SELECT '2020-08-02', 3 UNION ALL
SELECT '2020-08-03', 4 UNION ALL
SELECT '2020-08-04', 2 UNION ALL
SELECT '2020-08-05', 4
)
SELECT SUM(value) AS value,
STRING_AGG(FORMAT_DATE('%B %d', day), ' and ') || ' = ' || CAST(SUM(value) AS STRING) AS calc
FROM (
SELECT day, value, DIV(ROW_NUMBER() OVER(ORDER BY day) - 1, 2) grp
FROM `project.dataset.table` t
)
GROUP BY grp
ORDER BY grp
with output
Row value calc
1 4 August 01 and August 02 = 4
2 6 August 03 and August 04 = 6
3 4 August 05 = 4
I need to compare the companies' values side by side: current year vs. last year, and current month vs. the same month of the previous year.
I use this query to get the values:
SELECT STORE, SUM(TOTAL) as VAL, DATE FROM MYTABLE
WHERE DATE=CURRENT_DATE GROUP BY STORE ORDER BY STORE
The results are below:
STORE | VAL | DATE
1 10 CURRENT_DATE (2018-27-03)
1 20 2018-26-03
1 30 2018-25-03
2 20 CURRENT_DATE (2018-27-03)
2 20 2018-26-02
and I need this:
STORE | VALUE CURRENT YEAR | VALUE LAST YEAR
1 60 30 (CALCULATED)
2 40 50 (CALCULATED)
STORE | VALUE CURRENT MONTH | VALUE SAME MONTH OF LAST YEAR
1 60 30 (CALCULATED)
2 20 50 (CALCULATED)
Thank you
You could just join two sub-selects together.
E.g with this DDL and Data
CREATE TABLE MYTABLE (STORE int, VAL int, D DATE);
INSERT INTO MYTABLE VALUES
( 1, 10, '2018-03-27')
,( 1, 20, '2018-03-26')
,( 1, 10, '2018-02-25')
,( 1, 35, '2017-03-25')
,( 2, 20, '2018-03-27')
,( 2, 15, '2017-03-26');
This will get you the values for the current month and for the same month last year
SELECT C.*, LY.VAL_CURR_MONTH_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_MONTH
FROM MYTABLE WHERE INT(D)/100=INT(CURRENT_DATE)/100
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE INT(D)/100 = INT(CURRENT_DATE)/100 -100
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
Then this for years
SELECT C.*, LY.VAL_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_YEAR
FROM MYTABLE WHERE INT(D)/10000=INT(CURRENT_DATE)/10000
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_LY
FROM MYTABLE
WHERE INT(D)/10000 = INT(CURRENT_DATE)/10000 -1
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
P.S. there are many other ways to manipulate dates, but casting to INT is maybe one of the easier ways
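In case the integer arithmetic is not obvious: casting a date to INT gives the number yyyymmdd, so integer division by 100 leaves yyyymm and by 10000 leaves yyyy. A small Python sketch of the same arithmetic (the helper as_int is hypothetical and only mimics the cast):

```python
from datetime import date


def as_int(d):
    """Encode a date as the integer yyyymmdd, mimicking the INT(D) cast."""
    return d.year * 10000 + d.month * 100 + d.day


d = date(2018, 3, 27)
year_month = as_int(d) // 100            # 201803
year = as_int(d) // 10000                # 2018
same_month_last_year = year_month - 100  # 201703
last_year = year - 1                     # 2017
print(year_month, year, same_month_last_year, last_year)
```

Subtracting 100 from a yyyymm key always lands on the same month of the previous year, which is why the month query uses -100 and the year query uses -1.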
Also, here is a more flexible way to get the "Same Month of Last Year" value. A similar method can get "last Year" values.
SELECT T.*
, AVG(VAL) OVER(
PARTITION BY STORE
ORDER BY YEAR_MONTH
RANGE BETWEEN 100 PRECEDING AND 100 PRECEDING
) AS SAME_MONTH_PREV_YEAR
FROM
( SELECT STORE
, INTEGER(D)/100 AS YEAR_MONTH
, SUM(VAL) AS VAL
FROM
MYTABLE T
GROUP BY
STORE
, INTEGER(D)/100
) AS T
;
Gives
STORE YEAR_MONTH VAL SAME_MONTH_PREV_YEAR
----- ---------- --- --------------------
1 201703 35 NULL
1 201802 10 NULL
1 201803 30 35
2 201703 15 NULL
2 201803 20 15
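The result table can be reproduced outside the database as a cross-check. This Python sketch aggregates the sample rows per (store, yyyymm) and then looks up the key exactly 100 lower, which is the same month one year earlier. The function name monthly_with_prev_year is made up; the rows are the sample data inserted above.

```python
from collections import defaultdict
from datetime import date


def monthly_with_prev_year(rows):
    """rows: (store, val, date). Returns (store, yyyymm, total, prev_year_total)
    tuples, with None where the same month of the previous year has no data."""
    totals = defaultdict(int)
    for store, val, d in rows:
        totals[(store, d.year * 100 + d.month)] += val
    return [(store, ym, val, totals.get((store, ym - 100)))
            for (store, ym), val in sorted(totals.items())]


mytable = [
    (1, 10, date(2018, 3, 27)),
    (1, 20, date(2018, 3, 26)),
    (1, 10, date(2018, 2, 25)),
    (1, 35, date(2017, 3, 25)),
    (2, 20, date(2018, 3, 27)),
    (2, 15, date(2017, 3, 26)),
]
for row in monthly_with_prev_year(mytable):
    print(row)
```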
It is better to avoid functions on table columns in WHERE clauses. The following queries are based on P. Vernon's sample table.
Note: These SQLs are for DB2 LUW 11.1
For month:
SELECT STORE,
SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CURR_MONTH,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE D between first_day(current date) and last_day(current date)
or D between first_day(current date - 1 year) and last_day(current date - 1 year)
GROUP BY STORE
ORDER BY STORE
For year:
SELECT STORE, SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CY,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_LY
FROM MYTABLE
WHERE D between first_day(current date - (month(current date) - 1) months)
and last_day(current date + (12 - month(current date)) months)
or D between first_day(current date - (month(current date) - 1) months - 1 year)
and last_day(current date + (12 - month(current date)) months - 1 year)
GROUP BY STORE
ORDER BY STORE
I have a dataset:
type month id1 id2 id3 value
history jan-17 1 2 3 10
future jan-17 1 2 3 15
history feb 1 2 3 12
history march 1 2 3 11
future march 1 2 3 14
I want to get value for each month based on some calculation and based on the value of type column.
For eg : the output should look like this:
month id1 id2 id3 value
JAN-17 1 2 3 15(future value of jan) + 0(as future value of feb is not present)+ 14(take the future value of march)
FEB-17 1 2 3 10(history value of jan)+14(take the future value of march)
MAR-17 1 2 3 10(history value of jan)+12(history value of feb)+11(history value of mar)
The calculation is based on the quarter number of each month in a year.
If it is the first month of a quarter, take the future value of first month + future of 2nd month + future value of 3rd month
If the month is 2nd month of a quarter, take the history value of 1st month + future value of 2nd month + future value of 3rd month
If the month is 3rd month of a quarter, take the history value of 1st month + history value of 2nd month + future value of 3rd month .
I have tried partitioning the dataset based on month id1, id2, id3, but it does not give me the expected result .
Your desired output contradicts the rules you wrote further down:
If it is the first month of a quarter, take the future value of first month + future of 2nd month + future value of 3rd month
If the month is 2nd month of a quarter, take the history value of 1st month + future value of 2nd month + future value of 3rd month
If the month is 3rd month of a quarter, take the history value of 1st month + history value of 2nd month + future value of 3rd month
I implemented these rules because their logic is clearer and easier to implement:
with t1 as (select 'history' type from dual union all select 'future' type from dual),
d as (select add_months(date '2017-01-01', rownum - 1) month, 1 id1, 2 id2, 3 id3 from dual connect by level < 4),
t2 as (select * from t1, d),
source (type, month, id1, id2, id3, value) as (
select 'history', 'jan-17', 1, 2, 3, 10 from dual union all
select 'future', 'jan-17', 1, 2, 3, 15 from dual union all
select 'history', 'feb-17', 1, 2, 3, 12 from dual union all
select 'history', 'mar-17', 1, 2, 3, 11 from dual union all
select 'future', 'mar-17', 1, 2, 3, 14 from dual)
select to_char(month, 'mon-yy') mon, id1, id2, id3, history, future,
sum(history) over (partition by trunc(month, 'Q') order by month) - history +
sum(future) over (partition by trunc(month, 'Q') order by month desc) value
from (select t2.type, t2.month, nvl(s.id1, t2.id1) id1, nvl(s.id2, t2.id2) id2, nvl(s.id3, t2.id3) id3, nvl(value, 0) value
from t2 left join source s on s.type = t2.type and s.month = to_char(t2.month, 'mon-yy'))
pivot (sum(value) for type in ('history' history, 'future' future))
order by month
How it works:
subqueries t1, d and t2 generate the full list of months and types
these are then joined with the source data and pivoted
running sums of "futures" and "histories" are calculated in opposite directions
the running sums are then combined, minus the current history value, which gives us exactly what we need:
MON ID1 ID2 ID3 HISTORY FUTURE VALUE
------------ ---------- ---------- ---------- ---------- ---------- ----------
jan-17 1 2 3 10 15 29
feb-17 1 2 3 12 0 24
mar-17 1 2 3 11 14 36
This solution can be applied to any period of time (extend the month generator in d to cover it); data will be calculated for each quarter separately.
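The two opposite running sums can be checked by hand with a short Python sketch. It takes the three monthly history and future values of one quarter, in month order, and combines the history sum of the earlier months with the future sum from the current month onward. The function name quarter_values is an assumption for illustration.

```python
def quarter_values(history, future):
    """history/future: the monthly values of one quarter, in month order."""
    values = []
    for i in range(len(history)):
        hist_before = sum(history[:i])  # running history sum minus the current month
        future_from = sum(future[i:])   # running future sum taken from the quarter's end
        values.append(hist_before + future_from)
    return values


# HISTORY and FUTURE columns from the result above (jan, feb, mar).
print(quarter_values([10, 12, 11], [15, 0, 14]))  # [29, 24, 36]
```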
I have no idea what to do with id1, id2, id3, so I chose to ignore them; it would help if you explained the logic behind them. Will they always have the same value for the history and future rows of a particular month?
with
x as
(select 'hist' type, To_Date('JAN-2017','MON-YYYY') ym , 10 value from dual union all
select 'future' type, To_Date('JAN-2017','MON-YYYY'), 15 value from dual union all
select 'future' type, To_Date('FEB-2017','MON-YYYY'), 1 value from dual),
y as
(select * from x Pivot(Sum(Value) For Type in ('hist' as h,'future' as f))),
/* Pivot for easy lag,lead query instead of working with rows..*/
z as
(
select ym,sum(h) H,sum(f) F from (
Select y.ym,y.H,y.F from y
union all
select add_months(to_Date('01-JAN-2017','DD-MON-YYYY'),rownum-1) ym, 0 H, 0 F
from dual connect by rownum <=3 /* depends on how many months you are querying...
so this dual adds the corresponding missing 0 records...*/
) group by ym
)
select
ym,
Case
When MOD(Extract(Month from YM),3) = 1
Then F + Lead(F,1) Over(Order by ym) + Lead(F,2) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 2
Then Lag(H,1) Over(Order by ym) + F + Lead(F,1) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 0 /* third month of a quarter: MOD(..., 3) is 0, never 3 */
Then Lag(H,2) Over(Order by ym) + Lag(H,1) Over(Order by ym) + F
End Required_Value
from z