How to calculate moving average in SQL?

I have a table with two columns in SQL:
+------+--------+
| WEEK | OUTPUT |
+------+--------+
| 1 | 10 |
| 2 | 20 |
| 3 | 30 |
| 4 | 40 |
| 5 | 50 |
| 6 | 50 |
+------+--------+
How do I sum the output for the current week and the two weeks before it? For example, for week 3 the result should be the sum of the output for weeks 3, 2 and 1. I've seen many moving-average tutorials, but they all use a date column; in my case the week is an integer. Is that possible?
Thanks!

I think you want something like this:
SELECT *,
(SELECT Sum(output)
FROM table1 b
WHERE b.week IN( a.week, a.week - 1, a.week - 2 )) AS SUM
FROM table1 a
Alternatively, the IN clause can be replaced with b.week BETWEEN a.week - 2 AND a.week.
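To make the BETWEEN variant concrete, here is a minimal sketch that runs it against the sample data from the question. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the asker's server (an assumption, since the question does not name the engine); the alias sum_3_weeks is made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for the asker's database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (week INTEGER, output INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 50)],
)

# Correlated subquery: for each row a, sum the rows whose week lies
# in the inclusive range [a.week - 2, a.week].
between_rows = conn.execute(
    """
    SELECT a.week,
           a.output,
           (SELECT SUM(b.output)
            FROM table1 b
            WHERE b.week BETWEEN a.week - 2 AND a.week) AS sum_3_weeks
    FROM table1 a
    ORDER BY a.week
    """
).fetchall()

for row in between_rows:
    print(row)
```

For week 3 this gives 10 + 20 + 30 = 60, which matches the example in the question.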

You can use a self-join. The idea is to put your table beside itself with a join condition that pairs each week with the weeks in its window:
SELECT * FROM [output] o1
INNER JOIN [output] o2 ON o1.Week between o2.Week and o2.Week + 2
This SELECT produces the following output:
o1.Week o1.Output o2.Week o2.Output
--------------------------------------------
1 10 1 10
2 20 1 10
2 20 2 20
3 30 1 10
3 30 2 20
3 30 3 30
4 40 2 20
4 40 3 30
4 40 4 40
and so on. Note that weeks 1 and 2 don't have two previous weeks available, so they match fewer rows.
Now you just group the data by o1.Week and take the SUM:
SELECT o1.Week, SUM(o2.Output)
FROM [output] o1
INNER JOIN [output] o2 ON o1.Week between o2.Week and o2.Week + 2
GROUP BY o1.Week
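As a quick check of the grouped self-join, the sketch below runs it on the question's data using Python's sqlite3 module. The table name weekly_output and the extra COUNT(*) column are assumptions added for illustration; the count shows how many weeks actually fell into each window.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weekly_output (week INTEGER, output INTEGER)")
conn.executemany(
    "INSERT INTO weekly_output VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 50)],
)

# Each o1 row is paired with the o2 rows for the same week and the
# two weeks before it, then the pairs are collapsed per o1.week.
self_join_rows = conn.execute(
    """
    SELECT o1.week,
           SUM(o2.output) AS window_sum,
           COUNT(*)       AS weeks_in_window
    FROM weekly_output o1
    INNER JOIN weekly_output o2
            ON o1.week BETWEEN o2.week AND o2.week + 2
    GROUP BY o1.week
    ORDER BY o1.week
    """
).fetchall()

for row in self_join_rows:
    print(row)
```

Weeks 1 and 2 report window sizes of 1 and 2, which is worth knowing if you later divide the sum by 3 to get an average.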

If the week numbers are contiguous (no gaps), you can simply use a window function:
SELECT [Week], [Output],
SUM([Output]) OVER (ORDER BY [Week] ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM dbo.SomeTable
RANGE is more accurate for your calculation because it works on the week values rather than on row positions, but RANGE with a numeric offset is not implemented in SQL Server. Other database engines may support it:
SELECT [Week], [Output],
SUM([Output]) OVER (ORDER BY [Week] RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM dbo.SomeTable
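The difference between ROWS and RANGE only shows up when a week is missing. The sketch below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module (RANGE with a numeric offset needs SQLite 3.28 or newer), a made-up table some_table, and sample data in which week 3 is deliberately absent.

```python
import sqlite3

# In-memory SQLite; assumes the bundled SQLite is 3.28+ so that
# RANGE with a numeric offset is available.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (week INTEGER, output INTEGER)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, 10), (2, 20), (4, 40), (5, 50)],  # week 3 is missing on purpose
)

# ROWS counts physical rows; RANGE compares the ORDER BY values.
gap_rows = conn.execute(
    """
    SELECT week,
           SUM(output) OVER (ORDER BY week
                             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rows_sum,
           SUM(output) OVER (ORDER BY week
                             RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) AS range_sum
    FROM some_table
    ORDER BY week
    """
).fetchall()

for row in gap_rows:
    print(row)
```

At week 4, ROWS reaches back two rows and includes week 1, while RANGE only includes weeks 2 to 4, so the two sums differ.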

Try this:
SELECT t1.week,
(SELECT SUM(t2.output) / 3.0
FROM yourtable t2
WHERE t1.week - t2.week BETWEEN 0 AND 2) AS moving_avg
FROM yourtable t1

You haven't said which SQL Server version you are using. If it is SQL Server 2012 or later, a simple example is:
declare @table table(wk int,outpt int )
insert into @table values (1,10)
,(2,20)
,(3,30)
,(4,40)
,(5,50)
,(6,60)
select *,SUM(outpt) over(partition by id order by wk rows between unbounded preceding and current row ) dd
from (
select * , 1 id
from @table
where wk < 5
) a

Related

Redshift: Add Row for each hour in a day

I have a table containing item-wise quantities at different hours of a date. I am trying to add a row for each hour (24 entries per day), carrying forward the most recent available quantity. For example, for the hours from 2 up to (but not including) 10, the quantity will be 5.
I created a table with hour entries (1-24) and did a full join with the table shown below.
How can I fill in the previous available entry? Any suggestions are welcome.
item_id| date | hour| quantity
101 | 2022-04-25 | 2 | 5
101 | 2022-04-25 | 10 | 13
101 | 2022-04-25 | 18 | 67
101 | 2022-04-25 | 23 | 27

You can use generate_series to generate the hour numbers and use that as the base of the join.
Then use a correlated subquery to get the expected quantity column:
SELECT t1.*,
(SELECT quantity
FROM T tt
WHERE t1.item_id = tt.item_id
AND t1.date = tt.date
AND t1.hour >= tt.hour
ORDER BY tt.hour desc
LIMIT 1) quantity
FROM (
SELECT DISTINCT item_id,date,v.hour
FROM generate_series(1,24) v(hour)
CROSS JOIN T
) t1
ORDER BY t1.hour
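Here is a runnable sketch of the same carry-forward idea. It is not Redshift: it uses SQLite through Python's sqlite3 module, and because SQLite has no generate_series by default, a recursive CTE named hours (an assumption for this example) produces the numbers 1 to 24 instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (item_id INTEGER, date TEXT, hour INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (101, "2022-04-25", 2, 5),
        (101, "2022-04-25", 10, 13),
        (101, "2022-04-25", 18, 67),
        (101, "2022-04-25", 23, 27),
    ],
)

# hours: 1..24 (replaces generate_series); grid: one row per item/date/hour.
# The correlated subquery picks the latest reading at or before each hour.
filled_rows = conn.execute(
    """
    WITH RECURSIVE hours(hour) AS (
        SELECT 1
        UNION ALL
        SELECT hour + 1 FROM hours WHERE hour < 24
    ),
    grid AS (
        SELECT DISTINCT t.item_id, t.date, hours.hour
        FROM hours CROSS JOIN t
    )
    SELECT g.item_id, g.date, g.hour,
           (SELECT tt.quantity
            FROM t tt
            WHERE tt.item_id = g.item_id
              AND tt.date = g.date
              AND tt.hour <= g.hour
            ORDER BY tt.hour DESC
            LIMIT 1) AS quantity
    FROM grid g
    ORDER BY g.item_id, g.date, g.hour
    """
).fetchall()

# Map hour -> quantity for easy inspection.
filled = {hour: quantity for _, _, hour, quantity in filled_rows}
print(filled)
```

Hour 1 comes back as NULL (None in Python) because there is no earlier reading to carry forward.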

Provided the table of integers 1..24 is all24(hour), you can use lead and a join:
select t.item_id, t.date, all24.hour, t.quantity
from all24
join (
select *,
lead(hour, 1, 25) over(partition by item_id, date order by hour) - 1 nxt_h
from tbl
) t on all24.hour between t.hour and t.nxt_h
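The lead-based variant can be tried out the same way. This is a sketch under assumptions: SQLite (3.25 or newer for window functions) via Python's sqlite3 module, with all24 built inline by a recursive CTE rather than stored as a real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (item_id INTEGER, date TEXT, hour INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (101, "2022-04-25", 2, 5),
        (101, "2022-04-25", 10, 13),
        (101, "2022-04-25", 18, 67),
        (101, "2022-04-25", 23, 27),
    ],
)

# spans: each reading is valid from its own hour up to the hour before
# the next reading; the last reading runs to hour 24 (default 25 - 1).
lead_rows = conn.execute(
    """
    WITH RECURSIVE all24(hour) AS (
        SELECT 1
        UNION ALL
        SELECT hour + 1 FROM all24 WHERE hour < 24
    ),
    spans AS (
        SELECT item_id, date, hour, quantity,
               LEAD(hour, 1, 25) OVER (PARTITION BY item_id, date
                                       ORDER BY hour) - 1 AS nxt_h
        FROM tbl
    )
    SELECT s.item_id, s.date, all24.hour, s.quantity
    FROM all24
    JOIN spans s ON all24.hour BETWEEN s.hour AND s.nxt_h
    ORDER BY s.item_id, s.date, all24.hour
    """
).fetchall()

lead_filled = {hour: quantity for _, _, hour, quantity in lead_rows}
print(lead_filled)
```

Because this is an inner join, hour 1 produces no row at all; a LEFT JOIN from all24 would keep it, with NULLs in the item columns.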

List value by group by sql server

I have a table as below
Id | PriceCardId | Days
1 | 1 | 2
2 | 1 | 4
3 | 1 | 5
4 | 1 | 6
5 | 1 | 3
6 | 2 | 5
7 | 2 | 3
8 | 3 | 6
How do I write a SQL query that returns every PriceCardId whose Days values contain all the values in a given list (List<int> days)?
Example: with days = [2,4,5,6] and the data above, the result is 1.
Thanks!

I think you want:
select pricecardid
from t
where day in (2, 4, 5, 6)
group by pricecardid
having count(*) = 4; -- the number of days you are looking for
This assumes no duplicates in your table. If there are duplicates, use having count(distinct day) instead of count(*).
Note: You can phrase this as:
with d as (
select v.dy
from (values (2), (4), (5), (6)) v(dy)
)
select pricecardid
from t
where day in (select d.dy from d)
group by pricecardid
having count(*) = (select count(*) from d);
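Since the asker starts from a list in application code, here is a sketch of how the filter-then-count query might be parameterised. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption); the table name price_days and the variable names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_days (id INTEGER, pricecardid INTEGER, day INTEGER)")
conn.executemany(
    "INSERT INTO price_days VALUES (?, ?, ?)",
    [
        (1, 1, 2), (2, 1, 4), (3, 1, 5), (4, 1, 6), (5, 1, 3),
        (6, 2, 5), (7, 2, 3),
        (8, 3, 6),
    ],
)

days = [2, 4, 5, 6]

# One "?" placeholder per requested day, so the values are bound safely.
placeholders = ", ".join("?" for _ in days)
query = f"""
    SELECT pricecardid
    FROM price_days
    WHERE day IN ({placeholders})
    GROUP BY pricecardid
    HAVING COUNT(DISTINCT day) = ?
    ORDER BY pricecardid
"""

# A card qualifies only if it matched every distinct requested day.
matching = [row[0] for row in conn.execute(query, (*days, len(set(days))))]
print(matching)
```

Only price card 1 has all four days, so the list contains the single value 1. COUNT(DISTINCT day) keeps the result correct even if the table holds duplicate rows.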

Try this:
SELECT PriceCardId
FROM [My_Table]
WHERE [Day] IN(2,4,5,6)
GROUP BY PriceCardId
HAVING COUNT(DISTINCT [Day])=4

Without using DISTINCT, how to group data without altering value?

I have a feeling this is a dumb question with a simple answer, but here goes.
How can I group the following data without using DISTINCT? @table has 5 rows, which show data for hours 5-9. I just don't like DISTINCT.
Since I need to display all hours of the day up to hour 9 (including 0-4), I'm joining it with the table DimTime. DimTime has all hours, but at 15-minute intervals. So, DimTime looks like this:
Hour Minute
0 0
0 15
0 30
0 45
1 0
1 15
1 30
1 45
So here's my script:
declare @table table
(
Hour int,
Value int
)
insert into @table select 5, 25
insert into @table select 6, 34
insert into @table select 7, 54
insert into @table select 8, 65
insert into @table select 9, 11
select d.hour, t.hour, sum(value)
from @table t
left join dimtime d on d.hour = t.hour
group by d.hour, t.hour
If I use GROUP BY, then I need to have an aggregate function. So if I use SUM, it'll multiply all values by 4. If I remove the aggregate function, I'll get a syntax error.
Also, I cannot use a CTE since the contents of @table come from a CTE (I just didn't include it here).
Here's the result that I need to display:
Hour Value
0 null
1 null
2 null
3 null
4 null
5 25
6 34
7 54
8 65
9 11

Simply add a condition WHERE minute = 0 to return only one row per hour.

If you really wish to skip the sort that DISTINCT would run on dimtime, see the explanation below.
Display all hours (0-9) from dimtime and sum the values given in @table for each hour:
SELECT
d.hour, SUM(t.value)
FROM
dimtime d
LEFT JOIN @table t
ON d.hour = t.hour
WHERE d.minute = 0 -- retrieves one row for every hour from dimtime
GROUP BY d.hour
ORDER BY d.hour -- not needed for correctness, but gives you a result set sorted by hour
Assuming that dimtime has a row with minute = 0 for every hour, this limits the rows that take part in the join to one per hour. Any single value from the list 0, 15, 30, 45 works equally well.
SUM() sums all the values for a given hour in @table. If there are no rows for a particular hour, it returns NULL rather than 0.
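The effect of the minute = 0 filter is easy to see side by side. The sketch below is an illustration using Python's sqlite3 module instead of SQL Server (an assumption); the table variable from the question becomes an ordinary table called readings, and dimtime only covers hours 0 to 9 to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dimtime (hour INTEGER, minute INTEGER)")
conn.execute("CREATE TABLE readings (hour INTEGER, value INTEGER)")

# Four 15-minute slots per hour, hours 0..9.
conn.executemany(
    "INSERT INTO dimtime VALUES (?, ?)",
    [(h, m) for h in range(10) for m in (0, 15, 30, 45)],
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(5, 25), (6, 34), (7, 54), (8, 65), (9, 11)],
)

base_query = """
    SELECT d.hour, SUM(t.value)
    FROM dimtime d
    LEFT JOIN readings t ON d.hour = t.hour
    {where}
    GROUP BY d.hour
    ORDER BY d.hour
"""

# Without the filter every reading is joined to four dimtime rows.
unfiltered = conn.execute(base_query.format(where="")).fetchall()
# With the filter only one dimtime row per hour takes part in the join.
filtered = conn.execute(base_query.format(where="WHERE d.minute = 0")).fetchall()

print(unfiltered)
print(filtered)
```

The unfiltered query reports 100 for hour 5 (25 counted four times), the filtered one reports 25, and hours 0 to 4 come back as NULL (None in Python) in both.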

You should have a better reason for not using a language feature than "I just don't like it".
You can have a CTE that uses another CTE.
@dnoeth provided an excellent answer, but here's another option:
SELECT
d.hour,
t.value
FROM
@table t
INNER JOIN (SELECT DISTINCT hour FROM dimTime) d ON d.hour = t.hour

Try
SELECT *
FROM DimTime D
LEFT JOIN myTable T
ON D.Hour = T.Hour
WHERE D.Minute = 0
Output
| Hour | Minute | Hour | Value |
|------|--------|--------|--------|
| 0 | 0 | (null) | (null) |
| 1 | 0 | (null) | (null) |
| 2 | 0 | (null) | (null) |
| 3 | 0 | (null) | (null) |
| 4 | 0 | (null) | (null) |
| 5 | 0 | 5 | 25 |
| 6 | 0 | 6 | 34 |
| 7 | 0 | 7 | 54 |
| 8 | 0 | 8 | 65 |
| 9 | 0 | 9 | 11 |

"If I use GROUP BY, then I need to have an aggregate function"
Only if you include expressions in your SELECT that are not part of your group key. You could certainly do
select d.hour, t.hour, value
from @table t
inner join dimtime d
on d.hour = t.hour
group by d.hour, t.hour, value
or
select d.hour, t.hour, MIN(value)
from @table t
inner join dimtime d
on d.hour = t.hour
group by d.hour, t.hour
Note that the first query gives you exactly the same results as DISTINCT (and may even be compiled to the same query plan), so I'm not sure what your aversion to DISTINCT is.
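The claim that grouping by every selected column returns the same rows as DISTINCT can be checked directly. This is a small sketch using Python's sqlite3 module (an assumption; the original is SQL Server), with the table variable replaced by an ordinary table named readings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dimtime (hour INTEGER, minute INTEGER)")
conn.execute("CREATE TABLE readings (hour INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO dimtime VALUES (?, ?)",
    [(h, m) for h in range(10) for m in (0, 15, 30, 45)],
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(5, 25), (6, 34), (7, 54), (8, 65), (9, 11)],
)

# GROUP BY on every selected column, no aggregate function needed.
grouped = conn.execute(
    """
    SELECT d.hour, t.hour, t.value
    FROM readings t
    INNER JOIN dimtime d ON d.hour = t.hour
    GROUP BY d.hour, t.hour, t.value
    ORDER BY d.hour
    """
).fetchall()

# The same query written with DISTINCT.
distinct = conn.execute(
    """
    SELECT DISTINCT d.hour, t.hour, t.value
    FROM readings t
    INNER JOIN dimtime d ON d.hour = t.hour
    ORDER BY d.hour
    """
).fetchall()

print(grouped == distinct, grouped)
```

Both queries collapse the four joined rows per hour into one, so the values are not multiplied.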

calculate sum based on value of other row in another column

I am trying to figure out how to calculate the number of days on which the customer did not eat any candy.
Assume that the customer eats 1 candy per day.
If the customer purchases more candy, it is added to the previous stock.
E.g.
Day Candy Puchased
0 30
40 30
65 30
110 30
125 40
170 30
The answer here is 20.
On day 0 the customer bought 30 candies and the next purchase was on day 40, so he had no candy to eat from day 30 to day 39. In the same way, he had no candy from day 100 to day 109.
Can anyone help me write the query? I think the logic in my query is wrong.
select sum(curr.candy_purchased-(nxt.day-curr.day)) as diff
from candies as curr
left join candies as nxt
on nxt.day=(select min(day) from candies where day > curr.day)

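You need a recursive CTE.
First I need a row id, so I use ROW_NUMBER.
Then I need the base case for the recursion:
Day: how many days have passed since the previous purchase (0 from the db)
PrevD: the day of the current purchase, which the next step uses as the previous day to calculate Day (starts at 0)
Candy Puchased: how many candies were bought (30 from the db)
Remaining: how many candies are in stock right after the purchase (starts at the first purchase, 30)
NotEat: how many days the customer could not eat candy (starts at 0)
Level: the recursion level (starts at 1)
Recursion case:
Day, PrevD and Candy Puchased are easy.
Remaining: the new purchase plus whatever is left of the old stock; if more days passed than there were candies, nothing is left, so the leftover is 0.
NotEat: keep adding the difference whenever the stock ran out before the next purchase.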
WITH Candy as (
SELECT
ROW_NUMBER() over (order by [Day]) as rn,
*
FROM Table1
), EatCandy ([Day], [PrevD], [Candy Puchased], [Remaining], [NotEat], [Level]) as (
SELECT [Day], 0 as [PrevD], [Candy Puchased], [Candy Puchased] as [Remaining], 0 as [NotEat], 1 as [Level]
FROM Candy
WHERE rn = 1
UNION ALL
SELECT c.[Day] - ec.[PrevD],
c.[Day],
c.[Candy Puchased],
c.[Candy Puchased] +
IIF((c.[Day] - ec.[PrevD]) > ec.[Remaining], 0, ec.[Remaining] - (c.[Day] - ec.[PrevD])),
ec.[NotEat] +
IIF((c.[Day] - ec.[PrevD]) > ec.[Remaining], (c.[Day] - ec.[PrevD]) - ec.[Remaining], 0),
ec.[Level] + 1
FROM Candy c
JOIN EatCandy ec
ON c.rn = ec.[level] + 1
)
select * from EatCandy
OUTPUT
| Day | PrevD | Candy Puchased | Remaining | NotEat | Level |
|-----|-------|----------------|-----------|--------|-------|
| 0 | 0 | 30 | 30 | 0 | 1 |
| 40 | 40 | 30 | 30 | 10 | 2 |
| 25 | 65 | 30 | 35 | 10 | 3 |
| 45 | 110 | 30 | 30 | 20 | 4 |
| 15 | 125 | 40 | 55 | 20 | 5 |
| 45 | 170 | 30 | 40 | 20 | 6 |
To get the final answer, select MAX(NotEat) from EatCandy instead of selecting all columns.
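For anyone who wants to run the recursion outside SQL Server, here is a simplified sketch in SQLite through Python's sqlite3 module. Several things are assumptions made for the example: the table and column names (candies, purchased, remaining, not_eat), the numbered temp table candy_numbered, and the use of SQLite's two-argument scalar MAX in place of IIF. It follows the same idea as the answer above, but it is not the answer's exact query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candies (day INTEGER, purchased INTEGER)")
conn.executemany(
    "INSERT INTO candies VALUES (?, ?)",
    [(0, 30), (40, 30), (65, 30), (110, 30), (125, 40), (170, 30)],
)

# Number the purchases in day order so each step can find "the next one".
conn.execute(
    """
    CREATE TEMP TABLE candy_numbered AS
    SELECT ROW_NUMBER() OVER (ORDER BY day) AS rn, day, purchased
    FROM candies
    """
)

# remaining: stock right after the purchase on that day.
# not_eat:   running count of days without candy.
# MAX(x, 0) with two arguments is SQLite's scalar max, used to clamp at zero.
steps = conn.execute(
    """
    WITH RECURSIVE eat(rn, day, remaining, not_eat) AS (
        SELECT rn, day, purchased, 0
        FROM candy_numbered
        WHERE rn = 1
        UNION ALL
        SELECT c.rn,
               c.day,
               c.purchased + MAX(e.remaining - (c.day - e.day), 0),
               e.not_eat + MAX((c.day - e.day) - e.remaining, 0)
        FROM candy_numbered c
        JOIN eat e ON c.rn = e.rn + 1
    )
    SELECT day, remaining, not_eat FROM eat ORDER BY rn
    """
).fetchall()

candyless_days = max(not_eat for _, _, not_eat in steps)
print(steps)
print(candyless_days)
```

The remaining and not_eat columns line up with the Remaining and NotEat columns of the output table above, and the final figure is 20.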

Nice question.
Check my answer and try it with different sample data as well; if it does not work for some sample data, please let me know.
declare @t table([Day] int, CandyPuchased int)
insert into @t
values (0, 30),(40,30),(65, 30)
,(110, 30),(125,40),(170,30)
select * from @t
;With CTE as
(
select *,ROW_NUMBER()over(order by [day])rn from @t
)
)
,CTE1 as
(
select [day],[CandyPuchased],rn from CTE c where rn=1
union all
select a.[Day],case when a.Day-b.Day<b.CandyPuchased
then a.CandyPuchased+(b.CandyPuchased-(a.Day-b.Day))
else a.CandyPuchased end CandyPuchased
,a.rn from cte A
inner join CTE B on a.rn=b.rn+1
)
--select * from CTE1
select sum(case when a.Day-b.Day>b.CandyPuchased
then (a.Day-b.Day)-b.CandyPuchased else 0 end)[CandylessDays]
from CTE1 A
inner join CTE1 b on a.rn=b.rn+1

If you just need the result at the end of the series, you don't really need that join.
select max(days) --The highest day in the table (convert these to int first)
- (sum(candies) --Total candies purchased
- (select top 1 candies from MyTable order by days desc)) --Minus the candies purchased on the last day
from MyTable
If you need this as a sort of running total, try over:
select *, sum(candies) over (order by days) as TotalCandies
from MyTable
order by days desc

Get Monthly Totals from Running Totals

I have a table in a SQL Server 2008 database with two columns that hold running totals called Hours and Starts. Another column, Date, holds the date of a record. The dates are sporadic throughout any given month, but there's always a record for the last hour of the month.
For example:
ContainerID | Date | Hours | Starts
1 | 2010-12-31 23:59 | 20 | 6
1 | 2011-01-15 00:59 | 23 | 6
1 | 2011-01-31 23:59 | 30 | 8
2 | 2010-12-31 23:59 | 14 | 2
2 | 2011-01-18 12:59 | 14 | 2
2 | 2011-01-31 23:59 | 19 | 3
How can I query the table to get the total number of hours and starts for each month between two specified years? (In this case 2011 and 2013.) I know that I need to take the values from the last record of one month and subtract the values from the last record of the previous month. I'm having a hard time coming up with a good way to do this in SQL, however.
As requested, here are the expected results:
ContainerID | Date | MonthlyHours | MonthlyStarts
1 | 2011-01-31 23:59 | 10 | 2
2 | 2011-01-31 23:59 | 5 | 1

Try this:
SELECT c1.ContainerID,
c1.Date,
c1.Hours-c3.Hours AS "MonthlyHours",
c1.Starts - c3.Starts AS "MonthlyStarts"
FROM Containers c1
LEFT OUTER JOIN Containers c2 ON
c1.ContainerID = c2.ContainerID
AND datediff(MONTH, c1.Date, c2.Date)=0
AND c2.Date > c1.Date
LEFT OUTER JOIN Containers c3 ON
c1.ContainerID = c3.ContainerID
AND datediff(MONTH, c1.Date, c3.Date)=-1
LEFT OUTER JOIN Containers c4 ON
c3.ContainerID = c4.ContainerID
AND datediff(MONTH, c3.Date, c4.Date)=0
AND c4.Date > c3.Date
WHERE
c2.ContainerID is null
AND c4.ContainerID is null
AND c3.ContainerID is not null
ORDER BY c1.ContainerID, c1.Date

Using a recursive CTE and a somewhat 'creative' JOIN condition, you can fetch the next month's value for each ContainerID:
WITH CTE_PREP AS
(
--RN will be 1 for last row in each month for each container
--MonthRank will be sequential number for each subsequent month (to increment easier)
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY ContainerID, YEAR(Date), MONTH(DATE) ORDER BY Date DESC) RN
,DENSE_RANK() OVER (ORDER BY YEAR(Date),MONTH(Date)) MonthRank
FROM Table1
)
, RCTE AS
(
--"Zero row", last row in December 2010 for each container
SELECT *, Hours AS MonthlyHours, Starts AS MonthlyStarts
FROM CTE_Prep
WHERE YEAR(date) = 2010 AND MONTH(date) = 12 AND RN = 1
UNION ALL
--for each next row just join on MonthRank + 1
SELECT t.*, t.Hours - r.Hours, t.Starts - r.Starts
FROM RCTE r
INNER JOIN CTE_Prep t ON r.ContainerID = t.ContainerID AND r.MonthRank + 1 = t.MonthRank AND t.Rn = 1
)
SELECT ContainerID, Date, MonthlyHours, MonthlyStarts
FROM RCTE
WHERE Date >= '2011-01-01' --to eliminate "zero row"
ORDER BY ContainerID
For the demo, I added some data for February and March in order to test months of different lengths.
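The underlying idea (take the last record of each month and subtract the last record of the previous month) can also be sketched without recursion. The example below is an illustration under assumptions, not either answer's query: it runs on SQLite through Python's sqlite3 module, uses a made-up table readings with dates stored as ISO text, and derives a sequential month_no from strftime so consecutive months can be joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (container_id INTEGER, date TEXT, hours INTEGER, starts INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (1, "2010-12-31 23:59", 20, 6),
        (1, "2011-01-15 00:59", 23, 6),
        (1, "2011-01-31 23:59", 30, 8),
        (2, "2010-12-31 23:59", 14, 2),
        (2, "2011-01-18 12:59", 14, 2),
        (2, "2011-01-31 23:59", 19, 3),
    ],
)

# month_end: the latest row per container and calendar month, tagged with
# month_no = year * 12 + month so that "previous month" is month_no - 1.
monthly = conn.execute(
    """
    WITH month_end AS (
        SELECT r.container_id, r.date, r.hours, r.starts,
               CAST(strftime('%Y', r.date) AS INTEGER) * 12
                 + CAST(strftime('%m', r.date) AS INTEGER) AS month_no
        FROM readings r
        WHERE r.date = (SELECT MAX(r2.date)
                        FROM readings r2
                        WHERE r2.container_id = r.container_id
                          AND strftime('%Y-%m', r2.date) = strftime('%Y-%m', r.date))
    )
    SELECT cur.container_id,
           cur.date,
           cur.hours - prev.hours   AS monthly_hours,
           cur.starts - prev.starts AS monthly_starts
    FROM month_end cur
    JOIN month_end prev
      ON prev.container_id = cur.container_id
     AND prev.month_no = cur.month_no - 1
    ORDER BY cur.container_id, cur.date
    """
).fetchall()

for row in monthly:
    print(row)
```

On the sample data this reproduces the expected results from the question: 10 hours and 2 starts for container 1, 5 hours and 1 start for container 2. The first month in the data has no predecessor, so the inner join drops it.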
