SQL Group By weeks and months at the same time (Redshift)

In the code below I am selecting a 42-day period and grouping it by SNAPSHOT_WEEK (where SNAPSHOT_WEEK is the week number within the year, from 1 to 52 or 53).
SELECT
CASE
WHEN video_code = 'A' THEN 'Seller'
WHEN video_code = 'B' THEN 'Vendor'
WHEN video_code = 'C' THEN 'Others'
END AS CATEGORY,
TO_CHAR(snapshot_time - DATE_PART('dow', snapshot_time)::int + 4, 'IW') AS SNAPSHOT_WEEK,
SUM(VIOLATION_COUNT)
FROM my_table
WHERE 1=1
AND snapshot_time BETWEEN '20180505'::date - '41 days'::interval AND '20180505'::date -- to calculate WoW
GROUP BY
CATEGORY, SNAPSHOT_WEEK;
Output for this query looks like this:
CATEGORY WEEK OR MONTH SUM_VIOLATION_COUNT
A 14 954
B 14 454
C 14 299
A 15 954
B 16 454
Is it possible, in the same query, to group this data by month in addition to grouping it by week, where each month runs from the 28th of one month to the 28th of the next?
For example, I need a column in my output that shows the following values:
CATEGORY WEEK OR MONTH SUM_VIOLATION_COUNT
A 14 954
B 14 454
C 14 299
A 15 954
B 16 454
C 17 299
A 28 March 9354
B 28 March 2454
C 28 March 5354
A 28 April 1354
...... ..... .....
Where "28 March" means the number of violations between 28 Feb and 28 March, "28 April" means the number of violations between 28 March and 28 April, etc.
Is it possible to do this in the same query?

You can do that with a WITH subquery (a common table expression). This lets you write the base query once and then group it twice, according to your logic.
Your query has some inconsistencies between its column names, but it will look something like this.
P.S. UNION requires both SELECTs to return the same number of columns.
WITH ALLDATA AS (
SELECT
CASE
WHEN video_code = 'A' THEN 'Seller'
WHEN video_code = 'B' THEN 'Vendor'
WHEN video_code = 'C' THEN 'Others'
END AS CATEGORY,
TO_CHAR(snapshot_time - DATE_PART('dow', snapshot_time)::int + 4, 'IW') AS SNAPSHOT_WEEK,
SUM(VIOLATION_COUNT) SUM_VIOLATION_COUNT
FROM my_table
WHERE 1=1
AND snapshot_time BETWEEN '20180505'::date - '41 days'::interval AND '20180505'::date -- to calculate WoW
GROUP BY
CATEGORY, SNAPSHOT_WEEK)
SELECT CATEGORY, SNAPSHOT_WEEK, SUM_VIOLATION_COUNT FROM ALLDATA
UNION
SELECT CATEGORY, SNAPSHOT_WEEK, SUM_VIOLATION_COUNT FROM ALLDATA
GROUP BY <your month grouping logic>
To reiterate the logic in pseudo code
WITH ALLDATA AS (
SELECT <your base data without group by> )
SELECT columns FROM ALLDATA
GROUP BY <weekly group by logic>
UNION
SELECT columns FROM ALLDATA
GROUP BY <monthly group by logic>
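The pseudo code above can be made concrete. The following is only a minimal sketch, not the asker's Redshift query: it uses Python's built-in sqlite3 module so that it runs anywhere, which means the date functions are SQLite's (strftime) rather than Redshift's (TO_CHAR). The sample rows, the '%W' week label, the period labels, and the choice that days 29 to 31 roll into the bucket ending on the 28th of the following month are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data standing in for my_table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (snapshot_time TEXT, video_code TEXT, violation_count INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("2018-03-27", "A", 10),
        ("2018-03-28", "A", 5),   # last day of the bucket ending 28 March
        ("2018-03-29", "A", 7),   # first day of the bucket ending 28 April
        ("2018-04-02", "A", 1),
    ],
)

QUERY = """
WITH alldata AS (              -- base rows, deliberately NOT grouped yet
    SELECT
        CASE
            WHEN video_code = 'A' THEN 'Seller'
            WHEN video_code = 'B' THEN 'Vendor'
            WHEN video_code = 'C' THEN 'Others'
        END AS category,
        strftime('%W', snapshot_time) AS week_label,
        -- Assumption: days 1..28 belong to the bucket ending on the 28th of
        -- the same month, days 29..31 to the bucket ending the next month.
        CASE
            WHEN CAST(strftime('%d', snapshot_time) AS INTEGER) <= 28
                THEN strftime('%Y-%m', snapshot_time)
            ELSE strftime('%Y-%m', snapshot_time, 'start of month', '+1 month')
        END || '-28' AS month_label,
        violation_count
    FROM my_table
)
SELECT category, 'week ' || week_label AS period, SUM(violation_count) AS total
FROM alldata
GROUP BY category, week_label
UNION ALL
SELECT category, 'month ending ' || month_label, SUM(violation_count)
FROM alldata
GROUP BY category, month_label
ORDER BY 1, 2
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Both halves read from the ungrouped CTE, so a week that straddles the 28th is still split correctly between two month buckets. UNION ALL is used instead of UNION so that no rows are silently de-duplicated. A Redshift version would need the CASE rewritten with that engine's own date functions, which is not attempted here.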

You would need to UNION the output of two separate queries to generate those results.
The basic rule is that one input row will map to (at most) one output row.

Related

Generate a range of records depending on from-to dates

I have a table of records like this:
Item From To
A 2018-01-03 2018-03-16
B 2021-05-25 2021-11-10
The output of select should look like:
Item Month Year
A 01 2018
A 02 2018
A 03 2018
B 05 2021
B 06 2021
B 07 2021
B 08 2021
Also, the range should not exceed the current month. In the example above we are assuming the current day is 2021-08-01.
I am trying to do something similar with CONNECT BY LEVEL, but as soon as I select from my table alongside dual and try to order the records, the query never completes. I also have to join a few other tables to the selection, but I don't think that would make a difference.
I would very much appreciate your help.
Row generator it is, but not as you did it; most probably you're missing lines #11 - 16 in my query (or their alternative).
SQL> with test (item, date_from, date_to) as
2 -- sample data
3 (select 'A', date '2018-01-03', date '2018-03-16' from dual union all
4 select 'B', date '2021-05-25', date '2021-11-10' from dual
5 )
6 -- query that returns desired result
7 select item,
8 extract(month from (add_months(date_from, column_value - 1))) month,
9 extract(year from (add_months(date_from, column_value - 1))) year
10 from test cross join
11 table(cast(multiset
12 (select level
13 from dual
14 connect by level <=
15 months_between(trunc(least(sysdate, date_to), 'mm'), trunc(date_from, 'mm')) + 1
16 ) as sys.odcinumberlist))
17 order by item, year, month;
ITEM MONTH YEAR
----- ---------- ----------
A 1 2018
A 2 2018
A 3 2018
B 5 2021
B 6 2021
B 7 2021
B 8 2021
7 rows selected.
SQL>
Recursive CTEs are the standard SQL approach to this type of problem. In Oracle, this looks like:
with cte(item, fromd, tod) as (
select item, fromd, tod
from t
union all
select item, add_months(fromd, 1), tod
from cte
where add_months(fromd, 1) < last_day(tod)
)
select item, extract(year from fromd) as year, extract(month from fromd) as month
from cte
order by item, fromd;
Here is a db<>fiddle.
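As a rough, runnable cross-check of the recursive approach, here is a minimal sketch using Python's built-in sqlite3 module. It is SQLite syntax, not Oracle. The table name t, the column names date_from and date_to, and the explicit today parameter (fixed to 2021-08-01, as assumed in the question) are choices made for this illustration. Unlike the Oracle CTE above, it also caps each range at the current month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (item TEXT, date_from TEXT, date_to TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("A", "2018-01-03", "2018-03-16"), ("B", "2021-05-25", "2021-11-10")],
)

QUERY = """
WITH RECURSIVE months(item, month_start, last_month) AS (
    -- Anchor: first month of each range, plus the last month to emit,
    -- which is the earlier of date_to and :today (both truncated to month).
    SELECT item,
           date(date_from, 'start of month'),
           min(date(date_to, 'start of month'), date(:today, 'start of month'))
    FROM t
    UNION ALL
    -- Step: advance one month until last_month is reached.
    SELECT item, date(month_start, '+1 month'), last_month
    FROM months
    WHERE date(month_start, '+1 month') <= last_month
)
SELECT item,
       strftime('%m', month_start) AS month,
       strftime('%Y', month_start) AS year
FROM months
WHERE month_start <= last_month      -- drops ranges that start in the future
ORDER BY item, month_start
"""

rows = conn.execute(QUERY, {"today": "2021-08-01"}).fetchall()
for row in rows:
    print(row)
```

Passing the current date as a parameter rather than calling a "now" function keeps the result reproducible. In production one would presumably substitute the engine's current-date function.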

How to use lead lag function in oracle

I have written a query that produces the result set below.
Note: I have months from jan-2016 to jan-2018.
There are two types, 'hist' and 'future'.
Result set:
In this example, let's take the combination id1 + id2 + id3 to be 1, 2, 3.
type month id1 id2 id3 value
hist jan-17 1 2 3 10
hist feb-17 1 2 3 20
future jan-17 1 2 3 15
future feb-17 1 2 3 1
hist mar-17 1 2 3 2
future apr-17 1 2 3 5
My calculation logic depends on the position of the month within its quarter.
For example, for January (the first month of the quarter) I want the value to be: future value of Jan + future value of Feb + future value of March.
So for jan-17 the output should be 15 + 1 + 0 (there is no future value for March).
For February (the second month of the quarter) the value should be: hist of Jan + future of Feb + future of March, i.e. 10 + 1 + 0 (the future value for March is not available).
Similarly, for March the value should be: hist of Jan + hist of Feb + future of March, i.e. 10 + 20 + 0 (no future value for March is present).
The same applies to April, May and June (depending on the month's position within the quarter).
I am aware of the LEAD and LAG functions, but I am not able to apply them here.
Can someone please help?
I would not mess with lag, this can all be done with a group by if you convert your dates to quarters:
WITH
dset
AS
(SELECT DATE '2017-01-17' month, 5 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-02-17' month, 6 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-03-25' month, 7 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-05-25' month, 4 VALUE
FROM DUAL)
SELECT SUM (VALUE) value_sum, TO_CHAR (month, 'q') quarter, TO_CHAR (month, 'YYYY') year
FROM dset
GROUP BY TO_CHAR (month, 'q'), TO_CHAR (month, 'YYYY');
This results in:
VALUE_SUM QUARTER YEAR
18 1 2017
4 2 2017
We can use an analytic function if you need the result on each record:
SELECT SUM (VALUE) OVER (PARTITION BY TO_CHAR (month, 'q'), TO_CHAR (month, 'YYYY')) quarter_sum, month, VALUE
FROM dset
This results in:
QUARTER_SUM MONTH VALUE
18 1/17/2017 5
18 2/17/2017 6
18 3/25/2017 7
4 5/25/2017 4
Make certain you include year, you don't want to combine quarters from different years.
As said in one of the comments, the trick lies in another question of yours and its corresponding answer. It goes somewhat like this:
with
x as
(select 'hist' type, To_Date('JAN-2017','MON-YYYY') ym , 10 value from dual union all
select 'future' type, To_Date('JAN-2017','MON-YYYY'), 15 value from dual union all
select 'future' type, To_Date('FEB-2017','MON-YYYY'), 1 value from dual),
y as
(select * from x Pivot(Sum(Value) For Type in ('hist' as h,'future' as f))),
/* Pivot for easy lag,lead query instead of working with rows..*/
z as
(
select ym,sum(h) H,sum(f) F from (
Select y.ym,y.H,y.F from y
union all
select add_months(to_Date('01-JAN-2017','DD-MON-YYYY'),rownum-1) ym, 0 H, 0 F
from dual connect by rownum <=3 /* depends on how many months you are querying...
so this dual adds the corresponding missing 0 records...*/
) group by ym
)
select
ym,
Case
When MOD(Extract(Month from YM),3) = 1
Then F + Lead(F,1) Over(Order by ym) + Lead(F,2) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 2
Then Lag(H,1) Over(Order by ym) + F + Lead(F,1) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 0 /* third month of the quarter; MOD(..., 3) never returns 3 */
Then Lag(H,2) Over(Order by ym) + Lag(H,1) Over(Order by ym) + F
End Required_Value
from z
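To check a query against the rule stated in the question, it helps to have the expected numbers. The following is a minimal reference sketch in plain Python, not a translation of either SQL answer. The function name, the (type, year, month, value) tuple layout, and ignoring the id1/id2/id3 columns are assumptions made for illustration.

```python
def quarter_position_values(rows):
    """rows: iterable of (type, year, month, value); type is 'hist' or 'future'.

    Returns {(year, month): value} following the rule in the question:
    hist for the earlier months of the quarter, future for this month and
    the remaining months of the quarter. Missing values count as 0.
    """
    hist, future = {}, {}
    for kind, year, month, value in rows:
        bucket = hist if kind == "hist" else future
        bucket[(year, month)] = bucket.get((year, month), 0) + value

    result = {}
    for year, month in sorted(set(hist) | set(future)):
        first = month - (month - 1) % 3          # first month of the quarter
        past = sum(hist.get((year, m), 0) for m in range(first, month))
        ahead = sum(future.get((year, m), 0) for m in range(month, first + 3))
        result[(year, month)] = past + ahead
    return result


# Sample rows copied from the question (id columns omitted).
sample = [
    ("hist", 2017, 1, 10), ("hist", 2017, 2, 20),
    ("future", 2017, 1, 15), ("future", 2017, 2, 1),
    ("hist", 2017, 3, 2), ("future", 2017, 4, 5),
]
values = quarter_position_values(sample)
print(values)
```

For the sample rows this reproduces the figures worked out in the question for January, February and March (16, 11 and 30).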

Summing together values from the same table in different databases

I have a table on each database for a region of a company with the number of sales per month like so:
Region1.dbo.SalesPerMonth Region2.dbo.SalesPerMonth
ID Month Sales ID Month Sales
1 Jan 23 1 Jan 21
2 Feb 19 2 Feb 15
3 Jan 31 3 Jan 25
... ... ... ... ... ...
I am looking to write a query to join these tables into one table that shows the sales for the entire company per month, so it has the total sales from all regions added together:
AllRegions
ID Month Sales
1 Jan 44
2 Feb 34
3 Jan 56
... ... ...
I am however new to SQL and am not sure how to go about doing so. Any help or advice on how to write the query would be greatly appreciated.
Union together the two tables, and then aggregate by ID and Month to generate the sum of sales.
SELECT
ID, Month, SUM(Sales) AS Sales
FROM
(
SELECT ID, Month, Sales
FROM Region1.dbo.SalesPerMonth
UNION ALL
SELECT ID, Month, Sales
FROM Region2.dbo.SalesPerMonth
) t
GROUP BY
ID, Month
ORDER BY
ID;
Demo here:
Rextester
Try this:
WITH DataSource AS
(
SELECT *
FROM Region1.dbo.SalesPerMonth
UNION ALL
SELECT *
FROM Region2.dbo.SalesPerMonth
)
SELECT [id]
,[Month]
,SUM(Sales) AS Sales
FROM DataSource
GROUP BY [id]
,[Month]
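Both answers use UNION ALL rather than UNION, and the difference matters here. The following minimal sketch uses Python's built-in sqlite3 module with two ordinary tables standing in for the two regional databases. The table names region1_sales and region2_sales and the extra row with id 4 (identical in both regions) are assumptions added to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("region1_sales", "region2_sales"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER, month TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO region1_sales VALUES (?, ?, ?)",
    [(1, "Jan", 23), (2, "Feb", 19), (3, "Jan", 31), (4, "Mar", 7)],
)
conn.executemany(
    "INSERT INTO region2_sales VALUES (?, ?, ?)",
    [(1, "Jan", 21), (2, "Feb", 15), (3, "Jan", 25), (4, "Mar", 7)],
)

# Same query shape as the first answer; only the set operator varies.
TEMPLATE = """
SELECT id, month, SUM(sales) AS sales
FROM (
    SELECT id, month, sales FROM region1_sales
    {set_op}
    SELECT id, month, sales FROM region2_sales
) t
GROUP BY id, month
ORDER BY id
"""

totals_union_all = conn.execute(TEMPLATE.format(set_op="UNION ALL")).fetchall()
totals_union = conn.execute(TEMPLATE.format(set_op="UNION")).fetchall()
print(totals_union_all)
print(totals_union)
```

With plain UNION, the two identical rows for id 4 collapse into one before the SUM, so that month is undercounted. UNION ALL keeps every row.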

How do I write a query that imputes values for records that are not present in a table?

I have a table that looks like this:
MONTH | WIDGET | VALUE
------+--------+------
Dec | A | 3
Jan | B | 5
Feb | B | 6
Mar | B | 7
and I want to write a query that produces, for each MONTH and WIDGET the difference in VALUE between the current month and the previous month. So I want an output table like this:
MONTH | WIDGET | VALUE
------+--------+------
Dec | A | 3
Jan | A | -3
Feb | A | 0
Mar | A | 0
Dec | B | 0
Jan | B | 5
Feb | B | 1
Mar | B | 1
If there is no recorded value for the previous month for a given widget, I want to assume the previous month's value is zero. Conversely, if there is no recorded value for the current month, I want to assume the current month's value is zero.
I believe a cross join over all combinations of month and widget might work, by giving me a "spine" to which I can left join my data and then use coalesce - but is there a better way?
Edit: We can assume the MONTH column actually has a numeric representation to make it easier to identify the previous.
I would use the lag function (see the IBM reference). I just defaulted to 0 for values whose prior value doesn't exist, but you can handle that in a number of different ways.
create temp table test (
mth date
,widget char(1)
,value integer
)
distribute on random;
insert into test values('2013-12-01','A',3);
insert into test values('2014-01-01','A',-3);
insert into test values('2014-02-01','A',0);
insert into test values('2014-03-01','A',0);
insert into test values('2013-12-01','B',0);
insert into test values('2014-01-01','B',5);
insert into test values('2014-02-01','B',1);
insert into test values('2014-03-01','B',1);
select *
,lag(value,1) over(partition by widget order by mth) as prior_row
,value - nvl(lag(value,1) over(partition by widget order by mth),0) as diff
from test
OK. I have solved this in MS SQL, but it should be transferable to PostgreSQL. I have put the answer on SQL Fiddle:
CREATE TABLE WidgetMonths (Month tinyint, Widget varchar(1), Value int)
CREATE TABLE Months (Month tinyint, MonthOrder tinyint)
insert into WidgetMonths Values
(12, 'A', 3),
(1,'B', 5),
(2,'B', 6),
(3,'B', 7);
insert into Months Values
(12, 1), (1, 2), (2, 3), (3, 4)
Select
AllWidgetMonths.Widget,
AllWidgetMonths.Month,
IsNull(wm.Value,0) - IsNull(wmn.Value,0) as Value
from (
select Distinct Widget, Months.Month, Months.MonthOrder
from WidgetMonths
Cross Join months
) AllWidgetMonths
left join WidgetMonths wm on wm.Widget = AllWidgetMonths.Widget
AND wm.Month = AllWidgetMonths.Month
left join WidgetMonths wmn on wmn.Widget = AllWidgetMonths.Widget
AND Case When wmn.Month = 12 Then 1 Else wmn.Month + 1 End = AllWidgetMonths.Month
Order by AllWidgetMonths.Widget, AllWidgetMonths.MonthOrder
I started with a WidgetMonths table based on your example; the only difference is that I have converted the months into representative integers.
I then created a Months table holding all the months we are interested in from your example. If you want months for the whole year, you can simply add to this table or find another way of generating a 1-12 row result set. MonthOrder is optional and just helped me reproduce the ordering of your expected output.
As you mentioned, AllWidgetMonths uses the cross join, which gives us all combinations of widgets and months. This may be better achieved by a cross join between a 'Widgets' table and the Months table, but I wasn't sure whether one existed, so I left this out.
We left join WidgetMonths onto our master table of all widget months to show which months we have a value for.
The trick up the sleeve is then left joining the same table again, but this time adding 1 to the month number inside the join. This shifts the rows down by one. Notice I have a CASE expression (not sure about this in PostgreSQL) to deal with the rollover from month 12 to month 1. This effectively gives me the value for each month and for its previous month on each row of AllWidgetMonths.
The final bit is to subtract one value from the other.
Hey presto. I can try to update this to PostgreSQL, but you may have more knowledge and be able to solve it quicker than I can.
Here is another alternative to get the required data. Two CTE's are used, including one to contain the month numbers.
The SQL Fiddle can be accessed here.
WITH month_order as
(
SELECT 'Jan' as month, 1 as month_no, 12 as prev_month_no
UNION ALL
SELECT 'Feb' as month, 2 as month_no, 1 as prev_month_no
UNION ALL
SELECT 'Mar' as month, 3 as month_no, 2 as prev_month_no
UNION ALL
SELECT 'Apr' as month, 4 as month_no, 3 as prev_month_no
UNION ALL
SELECT 'May' as month, 5 as month_no, 4 as prev_month_no
UNION ALL
SELECT 'Jun' as month, 6 as month_no, 5 as prev_month_no
UNION ALL
SELECT 'Jul' as month, 7 as month_no, 6 as prev_month_no
UNION ALL
SELECT 'Aug' as month, 8 as month_no, 7 as prev_month_no
UNION ALL
SELECT 'Sep' as month, 9 as month_no, 8 as prev_month_no
UNION ALL
SELECT 'Oct' as month, 10 as month_no, 9 as prev_month_no
UNION ALL
SELECT 'Nov' as month, 11 as month_no, 10 as prev_month_no
UNION ALL
SELECT 'Dec' as month, 12 as month_no, 11 as prev_month_no
)
, values_all_months as
(
SELECT
month_order.prev_month_no as prev_month_no
, month_order.month_no as month_no
, w4.month as month
, w4.widget as widget
, COALESCE(w3.value, 0) as value
FROM widgets w3
RIGHT OUTER JOIN
(
SELECT
w1.widget as widget
,w2.month as month
FROM
(SELECT
DISTINCT
widget
FROM widgets) w1,
(SELECT
DISTINCT
month
FROM widgets) w2
) w4
ON w3.month = w4.month and w3.widget = w4.widget
INNER JOIN month_order
ON w4.month = month_order.month
)
SELECT mo.month, vam1.widget, vam1.value - COALESCE(vam2.value, 0) VALUE
FROM values_all_months vam1
LEFT OUTER JOIN values_all_months vam2
ON vam1.widget = vam2.widget AND vam1.prev_month_no = vam2.month_no
INNER JOIN month_order mo
ON vam1.month_no = mo.month_no
ORDER BY vam1.widget, (SELECT CASE vam1.month_no WHEN 12 THEN 0 ELSE vam1.month_no END);
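The "spine" idea from the question and the LAG function from the first answer can also be combined. The following is a minimal sketch using Python's built-in sqlite3 module, so the syntax is SQLite's, and window functions need SQLite 3.25 or later. The widgets table layout and the month_no column (a chronological sequence number, in line with the question's edit) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE widgets (month_no INTEGER, month TEXT, widget TEXT, value INTEGER)"
)
# The four recorded rows from the question; month_no gives chronological order.
conn.executemany(
    "INSERT INTO widgets VALUES (?, ?, ?, ?)",
    [(1, "Dec", "A", 3), (2, "Jan", "B", 5), (3, "Feb", "B", 6), (4, "Mar", "B", 7)],
)

QUERY = """
WITH spine AS (                       -- every (month, widget) combination
    SELECT m.month_no, m.month, w.widget
    FROM (SELECT DISTINCT month_no, month FROM widgets) m
    CROSS JOIN (SELECT DISTINCT widget FROM widgets) w
),
filled AS (                           -- missing records become 0
    SELECT s.month_no, s.month, s.widget, COALESCE(d.value, 0) AS value
    FROM spine s
    LEFT JOIN widgets d
      ON d.month_no = s.month_no AND d.widget = s.widget
)
SELECT month, widget,
       -- LAG's third argument is the default when there is no previous row.
       value - LAG(value, 1, 0) OVER (PARTITION BY widget ORDER BY month_no) AS diff
FROM filled
ORDER BY widget, month_no
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Filling the gaps before applying LAG is what makes "previous row" mean "previous month". Without the spine, LAG would compare against the previous recorded month, whatever it happened to be.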

How to make a time dependent distribution in SQL?

I have a SQL table in which I keep project information coming from Primavera.
Suppose that I have columns for Start Date, End Date, Duration, and Total Qty, as shown below.
How can I distribute Total Qty over months using this information? What kind of additional columns or SQL queries do I need in order to get a correct monthly distribution?
Thanks in advance.
Columns in order:
itemname,quantity,startdate,duration,enddate
item1 -- 108 -- 2013-03-25 -- 720 -- 2013-07-26
item2 -- 640 -- 2013-03-25 -- 720 -- 2013-07-26
.
.
I think the key is to break the records apart by month. Here is an example of how to do it:
with months as (
select 1 as mon union all select 2 union all select 3 union all
select 4 as mon union all select 5 union all select 6 union all
select 7 as mon union all select 8 union all select 9 union all
select 10 as mon union all select 11 union all select 12
)
select itemname, m.mon, quantity / nummonths
from (select t.*, (month(enddate) - month(startdate) + 1) as nummonths
from t
) t join
months m
on month(t.startDate) <= m.mon and
month(t.endDate) >= m.mon;
This works because all the months are within the same year -- as in your example. You are quite vague on how the split should be calculated. So, I assumed that every month from the start to the end gets an equal amount.
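The same equal-split assumption can be sketched outside SQL in a way that also handles ranges crossing a year boundary. This is a minimal Python illustration, not part of the answer's query. The function name distribute_evenly and the second sample range are hypothetical, and the split is by calendar month, ignoring how many days of each month the item actually covers.

```python
from datetime import date


def distribute_evenly(quantity, start, end):
    """Split quantity equally over every calendar month from start to end.

    Returns a list of (year, month, share). Months are counted on a
    year * 12 + month scale, so ranges that cross a year boundary work too.
    """
    first = start.year * 12 + (start.month - 1)
    last = end.year * 12 + (end.month - 1)
    if last < first:
        raise ValueError("end must not be before start")
    share = quantity / (last - first + 1)
    return [(index // 12, index % 12 + 1, share) for index in range(first, last + 1)]


# item1 from the question: March to July 2013, five months.
item1 = distribute_evenly(108, date(2013, 3, 25), date(2013, 7, 26))
# Hypothetical range crossing a year boundary.
crossing = distribute_evenly(30, date(2013, 11, 10), date(2014, 1, 5))
print(item1)
print(crossing)
```

In SQL, the equivalent of the year * 12 + month scale would replace the plain month() comparisons in the join condition. A split weighted by days per month would need a different share calculation.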