Running cumulative count group by month - sql

I have a table with values like this:
count month-year
6 12-2020
5 12-2020
4 11-2020
3 11-2020
3 10-2020
2 10-2020
2 09-2020
1 09-2020
I want to group the data by the month and show the sum of the count for the current month and the months before it. I am expecting the following output:
count month-year
26 12-2020 <- month 12 count = sum of the counts from month 9 through month 12
15 11-2020 <- month 11 count = sum of the counts from month 9 through month 11
8 10-2020 <- month 10 count = month 9 sum + month 10 sum
3 09-2020 <- month 9 is the launch month, so count = sum of the month 9 counts

You want to use SUM here twice, both as an aggregate and as an analytic function:
SELECT
[month-year],
SUM(SUM(count)) OVER (ORDER BY [month-year]) AS count
FROM yourTable
GROUP BY
[month-year]
ORDER BY
[month-year] DESC;
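A minimal sketch to check the query above against the sample data from the question. It assumes an in-memory SQLite database driven from Python's sqlite3 module, which is a stand-in and not the asker's engine; the alias running_count is also just illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourTable ("count" INTEGER, "month-year" TEXT)')
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(6, "12-2020"), (5, "12-2020"), (4, "11-2020"), (3, "11-2020"),
     (3, "10-2020"), (2, "10-2020"), (2, "09-2020"), (1, "09-2020")],
)

# Inner SUM aggregates each month; outer SUM ... OVER accumulates the monthly totals.
query = """
SELECT [month-year],
       SUM(SUM("count")) OVER (ORDER BY [month-year]) AS running_count
FROM yourTable
GROUP BY [month-year]
ORDER BY [month-year] DESC
"""
rows = conn.execute(query).fetchall()
for month_year, running_count in rows:
    print(running_count, month_year)
```

Note that ordering by a text value in MM-YYYY form is only chronological while all rows fall in the same year; data spanning several years would need a real date or a YYYY-MM key to sort on.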

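There is another way to calculate the desired result, without GROUP BY. With ORDER BY and no explicit frame, the window runs up to and including all rows of the current month, so every row of a month gets the same running total and DISTINCT collapses them: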
select Distinct [month-year] ,
SUM(count) OVER (ORDER BY [month-year]) AS count
from yourTable
order by [month-year] desc

Related

SQL (Redshift) - Rolling average with all preceding values

I have a table like so:
Day Value
1 3
1 5
1 1
2 4
2 7
3 1
3 1
3 2
3 5
How do I create a rolling average that takes into account all previous days to produce a table like so:
Day Rolling_avg
1 3
2 4
3 3.22
Day1 = avg(day 1 values)
Day2 = avg(day1 + day2 values)
Day3 = avg(day1 + day2 + day3 values)
and so on. Thank you!
First aggregate by day to get the sum of values and counts for each day. Then use analytic functions to find the rolling averages.
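WITH cte AS (
SELECT Day, SUM(Value) AS ValueSum, COUNT(*) AS Count
FROM yourTable
GROUP BY Day
)
-- Redshift requires an explicit frame when an aggregate window function uses ORDER BY.
-- Multiplying by 1.0 avoids integer division when Value is an integer column.
SELECT Day,
1.0 * SUM(ValueSum) OVER (ORDER BY Day ROWS UNBOUNDED PRECEDING) /
SUM(Count) OVER (ORDER BY Day ROWS UNBOUNDED PRECEDING) AS Rolling_avg
FROM cte
ORDER BY Day;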
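To see the two-step idea (aggregate per day, then accumulate) on the sample data, here is a small sketch. It assumes SQLite via Python's sqlite3 rather than Redshift, renames the count column to Cnt, and adds ROUND only to match the two-decimal output in the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Redshift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Day INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, 3), (1, 5), (1, 1), (2, 4), (2, 7), (3, 1), (3, 1), (3, 2), (3, 5)],
)

# Running sum of values divided by running row count gives the cumulative average.
query = """
WITH cte AS (
    SELECT Day, SUM(Value) AS ValueSum, COUNT(*) AS Cnt
    FROM yourTable
    GROUP BY Day
)
SELECT Day,
       ROUND(1.0 * SUM(ValueSum) OVER (ORDER BY Day
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
             / SUM(Cnt) OVER (ORDER BY Day
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), 2) AS Rolling_avg
FROM cte
ORDER BY Day
"""
rows = conn.execute(query).fetchall()
for day, rolling_avg in rows:
    print(day, rolling_avg)
```

Averaging the per-day averages would give a different answer, because the days have different numbers of rows; carrying the sum and the count separately keeps every row equally weighted.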

SQL query for incoming and outgoing stocks, first and last

I need to write a query that shows sales and stocks (incoming and outgoing) for each model in October 2021.
To obtain the incoming and outgoing stocks, I need vt_stocks_cube_sz.qty for the first day of the month and for the last day of the month, respectively.
For now I just sum the stocks (SUM(vt_stocks_cube_sz.qty) as stocks), but that isn't correct.
Could you help me split the stocks according to the rule above? I cannot understand how to write the query correctly.
%%time
SELECT vt_sales_cube_sz.modc_barc2 model,
SUM(vt_sales_cube_sz.qnt) sales,
SUM(vt_stocks_cube_sz.qty) as stocks
FROM vt_sales_cube_sz
LEFT JOIN vt_date_cube2
ON vt_sales_cube_sz.id_calendar_int = vt_date_cube2.id_calendar_int
LEFT JOIN vt_stocks_cube_sz ON
vt_stocks_cube_sz.parent_modc_barc = vt_sales_cube_sz.modc_barc AND
vt_stocks_cube_sz.id_stock = vt_sales_cube_sz.id_stock AND
vt_stocks_cube_sz.id_calendar_int = vt_sales_cube_sz.id_calendar_int AND
vt_stocks_cube_sz.vipusk_type = vt_sales_cube_sz.price_type
WHERE vt_date_cube2.wk_year_id = 2021
AND vt_date_cube2.wk_MoY_id = 10
AND vt_sales_cube_sz.id_stock IN
(SELECT id_stock
FROM vt_warehouse_cube
WHERE channel = \'OffLine\')
GROUP BY vt_sales_cube_sz.modc_barc2
If you're looking for a robust and generalizable approach I'd suggest using analytic functions such as FIRST_VALUE, LAST_VALUE or something slightly different with RANK or ROW_NUMBER.
A simple example follows, so you can rerun it on your side and adjust it to the specific tables/fields you're using.
N.B.: You might need a tiebreaker if there are multiple entries for the same first or last day.
with dummy_table as (
SELECT 1 as month, 1 as day, 10 as value UNION ALL
SELECT 1 as month, 2 as day, 20 as value UNION ALL
SELECT 1 as month, 3 as day, 30 as value UNION ALL
SELECT 2 as month, 1 as day, 5 as value UNION ALL
SELECT 2 as month, 3 as day, 15 as value UNION ALL
SELECT 2 as month, 5 as day, 25 as value
)
SELECT
month,
day,
case when day = first_day then 'first' else 'last' end as type,
value
FROM (
SELECT *
, FIRST_VALUE(day) over (partition by month order by day ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as first_day
, LAST_VALUE(day) over (partition by month order by day ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_day
FROM dummy_table
) tmp
WHERE day = first_day OR day=last_day
Dummy table:
Row month day value
1 1 1 10
2 1 2 20
3 1 3 30
4 2 1 5
5 2 3 15
6 2 5 25
Result:
Row month day type value
1 1 1 first 10
2 1 3 last 30
3 2 1 first 5
4 2 5 last 25
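The ROW_NUMBER variant with a tiebreaker, mentioned above, could look roughly like this. It is a sketch under assumptions: SQLite via Python's sqlite3 replaces the original engine, and entry_id is a hypothetical column that breaks ties when a day has more than one entry (here month 2, day 5 appears twice).

```python
import sqlite3

# Assumption: in-memory SQLite; entry_id is a hypothetical tiebreaker column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dummy_table (month INTEGER, day INTEGER, entry_id INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO dummy_table VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 10), (1, 2, 2, 20), (1, 3, 3, 30),
     (2, 1, 4, 5), (2, 3, 5, 15), (2, 5, 6, 25), (2, 5, 7, 35)],
)

# rn_first = 1 marks the earliest row of each month, rn_last = 1 the latest;
# entry_id decides which row wins when several share the same day.
query = """
SELECT month, day,
       CASE WHEN rn_first = 1 THEN 'first' ELSE 'last' END AS type,
       value
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY day, entry_id) AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY day DESC, entry_id DESC) AS rn_last
    FROM dummy_table
) AS tmp
WHERE rn_first = 1 OR rn_last = 1
ORDER BY month, day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike the FIRST_VALUE/LAST_VALUE filter on day, this returns exactly one "first" and one "last" row per month even when the boundary day is duplicated.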

filling missing combination in a table

input:
item loc month year qty
A DEL 5 2020 12
A DEL 6 2020 14
A DEL 8 2020 16
A DEL 9 2020 17
output:
item loc month year qty
A DEL 5 2020 12
A DEL 6 2020 14
A DEL 7 2020 26
A DEL 8 2020 16
A DEL 9 2020 17
A DEL 10 2020 33
description:
Month 7 is missing from my input, so I calculate it as the sum of the previous two months' quantities.
For example, the output for month 7 is 12 (from month 5) + 14 (from month 6) = 26.
So whenever a month is missing, I should fill it in using this logic.
I have written a two-step script, but it only handles months missing between existing values and not boundary values, i.e. it won't treat month 10 as missing because it is a boundary value.
1st step: insert the missing months, with NULL for all other columns.
INSERT INTO TEST_MISSING(MONTH)
select min_a - 1 + level
from ( select min(MONTH) min_a
, max(MONTH) max_a
from TEST_MISSING
)
connect by level <= max_a - min_a + 1
minus
select MONTH
from TEST_MISSING;
2nd step: populate the other columns using LAG with values from the rows above,
and then calculate the quantity value using a window function.
SELECT NVL(ITEM, NEW_ITEM) ITEM,
NVL(LOC, NEW_LOC) LOC,
MONTH, NVL(YEAR, NEW_YEAR) YEAR,
CASE WHEN QTY IS NULL THEN SUM(NVL(QTY, 0)) OVER(PARTITION BY NEW_ITEM ORDER BY MONTH ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) ELSE QTY END AS QTY
FROM (
SELECT A.*,
nvl(item,CASE WHEN ITEM IS NULL THEN (LAG(ITEM) OVER(ORDER BY MONTH)) END) NEW_ITEM,
nvl(LOC,CASE WHEN LOC IS NULL THEN (LAG(LOC) OVER(ORDER BY MONTH)) END) NEW_LOC,
nvl(YEAR,CASE WHEN YEAR IS NULL THEN (LAG(YEAR) OVER(ORDER BY MONTH)) END) NEW_YEAR
FROM TEST_MISSING A)
X
ORDER BY MONTH;
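One possible direction for the boundary case, sketched under several assumptions that are not in the post: SQLite via Python's sqlite3 stands in for Oracle, only one item/loc/year is present, and the last month to fill (LAST_MONTH) is supplied by the caller rather than derived from the data. Generating the full month range first means month 10 appears as a missing row like any other.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle; single item/loc/year only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_missing (item TEXT, loc TEXT, month INTEGER, year INTEGER, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO test_missing VALUES (?, ?, ?, ?, ?)",
    [("A", "DEL", 5, 2020, 12), ("A", "DEL", 6, 2020, 14),
     ("A", "DEL", 8, 2020, 16), ("A", "DEL", 9, 2020, 17)],
)

LAST_MONTH = 10  # hypothetical upper bound chosen by the caller

# Build every month from the first one present up to LAST_MONTH, then fill gaps
# with the sum of the two preceding months' original quantities.
query = """
WITH RECURSIVE months(month) AS (
    SELECT MIN(month) FROM test_missing
    UNION ALL
    SELECT month + 1 FROM months WHERE month < :last_month
)
SELECT m.month,
       COALESCE(t.qty,
                SUM(COALESCE(t.qty, 0)) OVER (
                    ORDER BY m.month
                    ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING)) AS qty
FROM months AS m
LEFT JOIN test_missing AS t ON t.month = m.month
ORDER BY m.month
"""
rows = conn.execute(query, {"last_month": LAST_MONTH}).fetchall()
for month, qty in rows:
    print(month, qty)
```

Like the original second step, this sums the stored quantities only, so two consecutive missing months would not build on each other's filled values.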

sql command to find average count of user visits to a website from past 6 months

I have a table with 2 columns, Date and number of visits.
I need to calculate the average difference in visit counts by month over the past 6 months.
Date Number_of_Visits
2018-04-06 5
2018-02-06 6
2017-04-10 3
2017-02-10 9
SQL should output
Avg_count difference visits past 6 months
5 - 3 = 2
6 - 9 = -3
(-3 + 2) / 2 = -0.5
The SQL query output should be -0.5.
I have started writing the SQL as below:
With cte as (
SELECT Year(v1.date) as Year, Month(v1.date) as Month, sum(v1.visits) as SumCount
FROM visits_table v1
group by Year(v1.date), Month(v1.date)
)
Do you want the average of the differences for the same month across years, i.e. a year-on-year comparison?
This will give you the result that you want, -0.5:
; With
cte as
(
SELECT Year(v1.date) as Year, Month(v1.date) as Month, sum(v1.visits) as SumCount
FROM visits_table v1
WHERE v1.date >= DATEADD(MONTH, -6, GETDATE()) -- optional date filter; it must still include the prior-year rows that LAG compares against
group by Year(v1.date), Month(v1.date)
)
SELECT AVG (diff * 1.0)
FROM
(
SELECT *, diff = SumCount
- LAG (SumCount) OVER (PARTITION BY Month
ORDER BY Year)
FROM cte
) d
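A quick check of the LAG-based comparison on the four sample rows, as a sketch only: it assumes SQLite via Python's sqlite3 instead of SQL Server, uses strftime in place of Year() and Month(), and leaves out the date filter so that both years are present.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; dates stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits_table (date TEXT, visits INTEGER)")
conn.executemany(
    "INSERT INTO visits_table VALUES (?, ?)",
    [("2018-04-06", 5), ("2018-02-06", 6), ("2017-04-10", 3), ("2017-02-10", 9)],
)

# LAG looks up the same month in the previous year; the first year of each
# month has no predecessor, so its diff is NULL and AVG skips it.
query = """
WITH cte AS (
    SELECT CAST(strftime('%Y', date) AS INTEGER) AS yr,
           CAST(strftime('%m', date) AS INTEGER) AS mon,
           SUM(visits) AS SumCount
    FROM visits_table
    GROUP BY yr, mon
)
SELECT AVG(diff * 1.0)
FROM (
    SELECT SumCount - LAG(SumCount) OVER (PARTITION BY mon ORDER BY yr) AS diff
    FROM cte
) AS d
"""
avg_diff = conn.execute(query).fetchone()[0]
print(avg_diff)
```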

Grouping data on SQL Server

I have this table in SQL Server:
Year Month Quantity
----------------------------
2015 January 10
2015 February 20
2015 March 30
2014 November 40
2014 August 50
How can I add two more columns, one that numbers each distinct year as a group and one that numbers the months within each year sequentially, as in this example?
Year Month Quantity Group Subgroup
------------------------------------------------
2015 January 10 1 1
2015 February 20 1 2
2015 March 30 1 3
2014 November 40 2 1
2014 August 50 2 2
You can use DENSE_RANK to calculate the groups for you:
SELECT t1.*, DENSE_RANK() OVER (ORDER BY Year DESC) AS [Group],
DENSE_RANK() OVER (PARTITION BY Year ORDER BY DATEPART(month, Month + ' 01 2010')) AS [SubGroup]
FROM t1
ORDER BY 4, 5
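Here is a small sketch of the DENSE_RANK approach on the sample rows. It assumes SQLite via Python's sqlite3 instead of SQL Server; since SQLite has no DATEPART, a hypothetical month_no lookup maps month names to numbers, and the aliases grp and subgrp are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (Year INTEGER, Month TEXT, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(2015, "January", 10), (2015, "February", 20), (2015, "March", 30),
     (2014, "November", 40), (2014, "August", 50)],
)

# month_no is a hypothetical lookup replacing DATEPART(month, ...).
query = """
WITH month_no(name, num) AS (
    VALUES ('January', 1), ('February', 2), ('March', 3), ('April', 4),
           ('May', 5), ('June', 6), ('July', 7), ('August', 8),
           ('September', 9), ('October', 10), ('November', 11), ('December', 12)
)
SELECT t.Year, t.Month, t.Quantity,
       DENSE_RANK() OVER (ORDER BY t.Year DESC) AS grp,
       DENSE_RANK() OVER (PARTITION BY t.Year ORDER BY m.num) AS subgrp
FROM t1 AS t
JOIN month_no AS m ON m.name = t.Month
ORDER BY grp, subgrp
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the subgroup follows calendar order, August comes before November within 2014, which differs from the row order shown in the question's expected output.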
To associate group and subgroup with a number you can do this:
WITH RankedTable AS (
SELECT year, month, quantity,
ROW_NUMBER() OVER (PARTITION BY year ORDER BY month) AS rn
FROM yourtable)
SELECT year, month, quantity,
SUM(CASE WHEN rn = 1 THEN 1 ELSE 0 END) OVER (ORDER BY year DESC) AS year_group,
rn AS subgroup
FROM RankedTable
Here the ROW_NUMBER() OVER clause numbers the months within each year. Because month is text, the numbering follows the alphabetical order of the month names.
The SUM() ... OVER clause keeps a running count of the rows with rn = 1, one per year. Ordering by year DESC makes the most recent year group 1, as in the expected output.