Moving sum over date range - sql

I have a table with a wide range of dates and a corresponding value for each date; an example is shown below.
Date Value
6/01/2013 8
6/02/2013 4
6/03/2013 1
6/04/2013 7
6/05/2013 1
6/06/2013 1
6/07/2013 3
6/08/2013 8
6/09/2013 4
6/10/2013 2
6/11/2013 10
6/12/2013 4
6/13/2013 7
6/14/2013 3
6/15/2013 2
6/16/2013 1
6/17/2013 7
6/18/2013 5
6/19/2013 1
6/20/2013 4
What I am trying to do is write a query that adds a new column showing the sum of the Value column over a specified date range. In the example below, the Sum column contains the sum of the values for its date and the preceding six days (one full week). So the Sum for 6/09/2013 is the sum of the values from 6/03/2013 to 6/09/2013.
Date Sum
6/01/2013 8
6/02/2013 12
6/03/2013 13
6/04/2013 20
6/05/2013 21
6/06/2013 22
6/07/2013 25
6/08/2013 25
6/09/2013 25
6/10/2013 26
6/11/2013 29
6/12/2013 32
6/13/2013 38
6/14/2013 38
6/15/2013 32
6/16/2013 29
6/17/2013 34
6/18/2013 29
6/19/2013 26
6/20/2013 23
I’ve tried using the LIMIT clause but could not get it to work. Any help would be greatly appreciated.

zoo has a function rollapply which can do what you need:
library(zoo)
# x is assumed to be the question's data frame, with x$Date already of class Date
z <- zoo(x$Value, order.by = x$Date)
rollapply(z, width = 7, FUN = sum, partial = TRUE, align = "right")
## 2013-06-01 8
## 2013-06-02 12
## 2013-06-03 13
## 2013-06-04 20
## 2013-06-05 21
## 2013-06-06 22
## 2013-06-07 25
## 2013-06-08 25
## 2013-06-09 25
## 2013-06-10 26
## 2013-06-11 29
## 2013-06-12 32
## 2013-06-13 38
## 2013-06-14 38
## 2013-06-15 32
## 2013-06-16 29
## 2013-06-17 34
## 2013-06-18 29
## 2013-06-19 26
## 2013-06-20 23
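For readers who want to see the arithmetic behind the right-aligned, partial window, here is a minimal sketch in Python (used only for illustration, it is not part of the zoo answer). The function name trailing_sums is made up for this example, and it assumes one row per consecutive day, as in the question's sample.

```python
from datetime import date, timedelta


def trailing_sums(values, width=7):
    """Sum each value with up to width-1 preceding values (partial windows at the start)."""
    sums = []
    running = 0
    for i, v in enumerate(values):
        running += v
        if i >= width:
            # Drop the value that just fell out of the window.
            running -= values[i - width]
        sums.append(running)
    return sums


# Value column from the question, 6/01/2013 through 6/20/2013.
values = [8, 4, 1, 7, 1, 1, 3, 8, 4, 2, 10, 4, 7, 3, 2, 1, 7, 5, 1, 4]
start = date(2013, 6, 1)
for offset, total in enumerate(trailing_sums(values)):
    print(start + timedelta(days=offset), total)
```

The running total is updated in place, so each row costs one addition and at most one subtraction regardless of the window width.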

Using data.table
require(data.table)
#Build some sample data
data <- data.table(Date=1:20,Value=rpois(20,10))
#Build reference table
Ref <- data[,list(Compare_Value=list(I(Value)),Compare_Date=list(I(Date)))]
#Use sapply to sum the values from each date and the six days before it (seven days in total)
data[, Roll.Val := sapply(Date, function(x) {
  d <- as.numeric(Ref$Compare_Date[[1]] - x)
  sum((d <= 0 & d >= -6) * Ref$Compare_Value[[1]])})]
head(data,10)
Date Value Roll.Val
1: 1 14 14
2: 2 7 21
3: 3 9 30
4: 4 5 35
5: 5 10 45
6: 6 10 55
7: 7 15 70
8: 8 14 70
9: 9 8 71
10: 10 12 74
Here is another solution if anyone is interested:
library("devtools")
install_github("mgahan/boRingTrees")
require(boRingTrees)
rollingByCalcs(data,dates="Date",target="Value",stat=sum,lower=0,upper=7)

Here is one way of doing it
> input <- read.table(text = "Date Value
+ 6/01/2013 8
+ 6/02/2013 4
+ 6/03/2013 1
+ 6/04/2013 7
+ 6/05/2013 1
+ 6/06/2013 1
+ 6/07/2013 3
+ 6/08/2013 8
+ 6/09/2013 4
+ 6/10/2013 2
+ 6/11/2013 10
+ 6/12/2013 4
+ 6/13/2013 7
+ 6/14/2013 3
+ 6/15/2013 2
+ 6/16/2013 1
+ 6/17/2013 7
+ 6/18/2013 5
+ 6/19/2013 1
+ 6/20/2013 4 ", as.is = TRUE, header = TRUE)
> input$Date <- as.Date(input$Date, format = "%m/%d/%Y") # convert Date
>
> # create a sequence that goes a week back from the current data
> x <- data.frame(Date = seq(min(input$Date) - 6, max(input$Date), by = '1 day'))
>
> # merge
> merged <- merge(input, x, all = TRUE)
>
> # replace NAs with zero
> merged$Value[is.na(merged$Value)] <- 0L
>
> # use 'filter' for the running sum and delete first 6
> input$Sum <- filter(merged$Value, rep(1, 7), sides = 1)[-(1:6)]
> input
Date Value Sum
1 2013-06-01 8 8
2 2013-06-02 4 12
3 2013-06-03 1 13
4 2013-06-04 7 20
5 2013-06-05 1 21
6 2013-06-06 1 22
7 2013-06-07 3 25
8 2013-06-08 8 25
9 2013-06-09 4 25
10 2013-06-10 2 26
11 2013-06-11 10 29
12 2013-06-12 4 32
13 2013-06-13 7 38
14 2013-06-14 3 38
15 2013-06-15 2 32
16 2013-06-16 1 29
17 2013-06-17 7 34
18 2013-06-18 5 29
19 2013-06-19 1 26
20 2013-06-20 4 23
>
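The merge-and-zero-fill step above matters when some dates are missing from the table. A minimal sketch of the same idea in Python (for illustration only): dates with no row count as zero, and each window is anchored on calendar days rather than on row positions. The function name weekly_sums and the gappy sample are hypothetical.

```python
from datetime import date, timedelta


def weekly_sums(by_date, days=7):
    """For each date present, sum the values dated within the `days`-day window ending on it."""
    result = {}
    for d in sorted(by_date):
        window = (d - timedelta(days=k) for k in range(days))
        # Dates without a row contribute 0, like the NA -> 0 replacement above.
        result[d] = sum(by_date.get(day, 0) for day in window)
    return result


# Hypothetical sample with gaps: 6/03, 6/04 and 6/06-6/08 have no rows.
sample = {
    date(2013, 6, 1): 8,
    date(2013, 6, 2): 4,
    date(2013, 6, 5): 1,
    date(2013, 6, 9): 4,
}
sums = weekly_sums(sample)
for d, total in sums.items():
    print(d, total)
```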

Related

Grouping data by columns

I have a data set like this:
id_tecnico dia hora total
<chr> <dbl> <int> <int>
1 0011ab4f-6871-40f4-91f2-818e309baa41 8 13 1
2 0011ab4f-6871-40f4-91f2-818e309baa41 45 10 1
3 0011ab4f-6871-40f4-91f2-818e309baa41 46 9 1
4 0011ab4f-6871-40f4-91f2-818e309baa41 50 14 1
5 0011ab4f-6871-40f4-91f2-818e309baa41 58 12 1
6 0011ab4f-6871-40f4-91f2-818e309baa41 70 12 1
7 0011ab4f-6871-40f4-91f2-818e309baa41 81 11 1
8 0011ab4f-6871-40f4-91f2-818e309baa41 86 11 1
9 0011ab4f-6871-40f4-91f2-818e309baa41 89 9 1
10 0011ab4f-6871-40f4-91f2-818e309baa41 92 11 1
I would like to sum the total column by hour, but I want the result in columns rather than rows, with a new column for each hour's sum: hour1, hour2, hour3, ...
Can someone help me?
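One possible reading of the request, sketched in Python for illustration: sum total per id_tecnico and hour, then spread the hours into columns named hour<N>. Grouping by id_tecnico is an assumption, and the function name totals_by_hour is made up for this example.

```python
from collections import defaultdict


def totals_by_hour(rows):
    """Return {id_tecnico: {"hour<N>": summed total}} from (id, dia, hora, total) rows."""
    wide = defaultdict(lambda: defaultdict(int))
    for id_tecnico, _dia, hora, total in rows:
        wide[id_tecnico][f"hour{hora}"] += total
    return {key: dict(cols) for key, cols in wide.items()}


tecnico = "0011ab4f-6871-40f4-91f2-818e309baa41"
# (dia, hora, total) for the ten rows shown in the question.
shown = [(8, 13, 1), (45, 10, 1), (46, 9, 1), (50, 14, 1), (58, 12, 1),
         (70, 12, 1), (81, 11, 1), (86, 11, 1), (89, 9, 1), (92, 11, 1)]
rows = [(tecnico, dia, hora, total) for dia, hora, total in shown]

wide = totals_by_hour(rows)
print(wide[tecnico])
```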

sum every 7 rows from column sales while ints representing n days away from installation of promotion-material (before and after the installation)

There are 2 stores, each with its sales data per day. Both get equipped with promotion material, but not on the same day. After the pr_day the promotion material stays in place, meaning there should be a sales boost from the day the promotion material is installed.
Installation Date:
Store A - 05/15/2019
Store B - 05/17/2019
To see if the promotion was a success, we measure the sales before and after the pr-date by returning the number of sales (not revenue, but pieces sold) next to an int indicating how many days away from the pr-day it was (sum of sales from both stores):
pr_date| sales
-28 | 35
-27 | 40
-26 | 21
-25 | 36
-24 | 29
-23 | 36
-22 | 43
-21 | 31
-20 | 32
-19 | 21
-18 | 17
-17 | 34
-16 | 34
-15 | 37
-14 | 32
-13 | 29
-12 | 25
-11 | 45
-10 | 43
-9 | 26
-8 | 27
-7 | 33
-6 | 36
-5 | 17
-4 | 34
-3 | 33
-2 | 21
-1 | 28
1 | 16
2 | 6
3 | 16
4 | 29
5 | 32
6 | 30
7 | 30
8 | 30
9 | 17
10 | 12
11 | 35
12 | 30
13 | 15
14 | 28
15 | 14
16 | 16
17 | 13
18 | 27
19 | 22
20 | 34
21 | 33
22 | 22
23 | 13
24 | 35
25 | 28
26 | 19
27 | 17
28 | 29
You may have noticed that I already removed the day of the installation of the promotion material.
The issue starts with the different installation dates of the pr-material. If I group by calendar week, it combines sales from days at different distances from the installation. It just starts at whatever weekday I define:
Select DATEDIFF(wk, change_date, sales_date), sum(sales)
from tbl_sales
group by DATEDIFF(wk, change_date, sales_date)
result:
week | sales
-4 | 75
-3 | 228
-2 | 204
-1 | 235
0 | 149
1 | 173
2 | 151
3 | 167
4 | 141
The numbers are not from the right days and there is one week too many. I guess this comes from SQL grouping the sales by weeks starting on Sunday, and because the pr_dates are different it generates more than just the 8 weeks (4 before, 4 after).
I looked for a sustainable solution but couldn't find the right fit, so I decided to post it here. I'd be very thankful for any thoughts from the community on this topic. I'm quite sure there is a smart solution, because this doesn't look like a rare request to me.
I tried it with OVER as well, but I don't see how to sum the 7 days together, as they are no longer calendar dates but deltas to the pr-date.
Desired Result:
week | sales
-4 | 240
-3 | 206
-2 | 227
-1 | 202
1 | 159
2 | 167
3 | 159
4 | 163
Attachment from my analysis by hand of what the results should be:
Why do I need the weekly summary? The stores perform differently depending on the weekday. By summing 7 days together I make sure we don't compare Mondays to Sundays and so on. Furthermore, the result will be shown in a line or bar chart, where the weekday variation would look ugly and make it hard to see the trend/development of the sales numbers, whereas the weekly comparison absorbs these variations.
If anything is unclear, please let me know so I can provide further details.
Thank you very much
Additionally, an overview for the different installation dates:
Shop A:
store A
delta date sales
-28 17.04.2019 20
-27 18.04.2019 20
-26 19.04.2019 13
-25 20.04.2019 25
-24 21.04.2019 16
-23 22.04.2019 20
-22 23.04.2019 26
-21 24.04.2019 15
-20 25.04.2019 20
-19 26.04.2019 13
-18 27.04.2019 13
-17 28.04.2019 20
-16 29.04.2019 21
-15 30.04.2019 20
-14 01.05.2019 17
-13 02.05.2019 13
-12 03.05.2019 9
-11 04.05.2019 34
-10 05.05.2019 28
-9 06.05.2019 19
-8 07.05.2019 14
-7 08.05.2019 23
-6 09.05.2019 18
-5 10.05.2019 9
-4 11.05.2019 22
-3 12.05.2019 17
-2 13.05.2019 14
-1 14.05.2019 19
0 15.05.2019 11
1 16.05.2019 0
2 17.05.2019 0
3 18.05.2019 1
4 19.05.2019 19
5 20.05.2019 18
6 21.05.2019 14
7 22.05.2019 11
8 23.05.2019 12
9 24.05.2019 8
10 25.05.2019 7
11 26.05.2019 19
12 27.05.2019 15
13 28.05.2019 15
14 29.05.2019 11
15 30.05.2019 5
16 31.05.2019 8
17 01.06.2019 10
18 02.06.2019 19
19 03.06.2019 14
20 04.06.2019 21
21 05.06.2019 22
22 06.06.2019 7
23 07.06.2019 6
24 08.06.2019 23
25 09.06.2019 17
26 10.06.2019 9
27 11.06.2019 8
28 12.06.2019 23
Shop B:
store B
delta date sales
-28 19.04.2019 15
-27 20.04.2019 20
-26 21.04.2019 8
-25 22.04.2019 11
-24 23.04.2019 13
-23 24.04.2019 16
-22 25.04.2019 17
-21 26.04.2019 16
-20 27.04.2019 12
-19 28.04.2019 8
-18 29.04.2019 4
-17 30.04.2019 14
-16 01.05.2019 13
-15 02.05.2019 17
-14 03.05.2019 15
-13 04.05.2019 16
-12 05.05.2019 16
-11 06.05.2019 11
-10 07.05.2019 15
-9 08.05.2019 7
-8 09.05.2019 13
-7 10.05.2019 10
-6 11.05.2019 18
-5 12.05.2019 8
-4 13.05.2019 12
-3 14.05.2019 16
-2 15.05.2019 7
-1 16.05.2019 9
0 17.05.2019 9
1 18.05.2019 16
2 19.05.2019 6
3 20.05.2019 15
4 21.05.2019 10
5 22.05.2019 14
6 23.05.2019 16
7 24.05.2019 19
8 25.05.2019 18
9 26.05.2019 9
10 27.05.2019 5
11 28.05.2019 16
12 29.05.2019 15
13 30.05.2019 17
14 31.05.2019 9
15 01.06.2019 8
16 02.06.2019 3
17 03.06.2019 8
18 04.06.2019 8
19 05.06.2019 13
20 06.06.2019 11
21 07.06.2019 15
22 08.06.2019 7
23 09.06.2019 12
24 10.06.2019 11
25 11.06.2019 10
26 12.06.2019 9
27 13.06.2019 6
28 14.06.2019 9
Try
select wk, sum(sales)
from (
select
isnull(sa.sales,0) + isnull(sb.sales,0) sales
, isnull(sa.delta , sb.delta) delta
, case when isnull(sa.delta , sb.delta) = 0 then 0
else case when isnull(sa.delta , sb.delta) > 0 then (isnull(sa.delta , sb.delta) -1) /7 +1
else (isnull(sa.delta , sb.delta) +1) /7 -1
end
end wk
from shopA sa
full join shopB sb on sa.delta=sb.delta
) t
group by wk;
sql fiddle
A more readable version. It doesn't run faster; using CROSS APPLY this way just allows you to introduce a sort of intermediate variable for cleaner code.
select wk, sum(sales)
from (
select
isnull(sa.sales,0) + isnull(sb.sales,0) sales
, dlt delta
, case when dlt = 0 then 0
else case when dlt > 0 then (dlt - 1) / 7 + 1
else (dlt + 1) / 7 - 1
end
end wk
from shopA sa
full join shopB sb on sa.delta=sb.delta
cross apply (
select dlt = isnull(sa.delta, sb.delta)
) tmp
) t
group by wk;
Finally, if you already have a query that produces a dataset with the (pr_date, sales) columns:
select wk, sum(sales)
from (
select sales
, case when pr_date = 0 then 0
else case when pr_date > 0 then (pr_date - 1) / 7 + 1
else (pr_date + 1) / 7 - 1
end
end wk
from (
-- ... your query here ...
)pr_date_sales
) t
group by wk;
I think you just need to take the day difference and use arithmetic. Using datediff() with week counts week-boundaries -- which is not what you want. That is, it normalizes the weeks to calendar weeks.
You want to leave out the day of the promotion, which makes this a wee bit more complicated.
I think this is the logic:
select v.week_diff, sum(sales)
from tbl_sales s cross apply
     (values (case when change_date < sales_date
                   then (datediff(day, change_date, sales_date) - 1) / 7 + 1
                   else (datediff(day, change_date, sales_date) + 1) / 7 - 1
              end)
     ) v(week_diff)
where change_date <> sales_date
group by v.week_diff;
There might be an off-by-one problem, depending on what you really want to do when the dates are the same.
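The week arithmetic from the first answer, (d - 1) / 7 + 1 for positive deltas and (d + 1) / 7 - 1 for negative ones, can be checked against the combined pr_date/sales table from the question. Below is a minimal sketch in Python, for illustration only. T-SQL integer division truncates toward zero while Python's // floors, so the negative branch is written on the absolute value. The name week_bucket is made up for this example.

```python
def week_bucket(delta):
    """Map a day delta (day 0 excluded) to a week number: 1..7 -> 1, -1..-7 -> -1, and so on."""
    if delta == 0:
        return 0
    if delta > 0:
        return (delta - 1) // 7 + 1
    # Work on the absolute value to mimic SQL's truncating integer division.
    return -((-delta - 1) // 7 + 1)


# Combined sales of both stores, copied from the question's pr_date/sales table.
before = [35, 40, 21, 36, 29, 36, 43, 31, 32, 21, 17, 34, 34, 37,
          32, 29, 25, 45, 43, 26, 27, 33, 36, 17, 34, 33, 21, 28]  # deltas -28..-1
after = [16, 6, 16, 29, 32, 30, 30, 30, 17, 12, 35, 30, 15, 28,
         14, 16, 13, 27, 22, 34, 33, 22, 13, 35, 28, 19, 17, 29]   # deltas 1..28
deltas = list(range(-28, 0)) + list(range(1, 29))

weekly = {}
for delta, sales in zip(deltas, before + after):
    wk = week_bucket(delta)
    weekly[wk] = weekly.get(wk, 0) + sales

for wk in sorted(weekly):
    print(wk, weekly[wk])
```

With this bucketing the totals come out as the desired result in the question (240, 206, 227, 202 before and 159, 167, 159, 163 after).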

Oracle - Group By Creating Duplicate Rows

I have a query that looks like this:
select nvl(trim(a.code), 'Blanks') as Ward, count(b.apcasekey) as UNSP, count(c.apcasekey) as GRAPH,
count(d.apcasekey) as "ANI/PIG",
(count(b.apcasekey) + count(c.apcasekey) + count(d.apcasekey)) as "TOTAL ACTIVE",
count(a.apcasekey) as "TOTAL OPEN" from (etc...)
group by a.code
order by Ward
The reason I have nvl(trim(a.code), 'Blanks') as Ward is that sometimes a.code is a blank string, sometimes it's a null.
The problem is that when I use the Group By statement, I can't use Ward or I get the error
Ward: Invalid Identifier
I can only use a.code so I get 2 rows for 'Blanks', as per below
1 Blanks 7 0 0 7 7
2 Blanks 23 1 1 25 30
3 W01 75 4 0 79 91
4 W02 62 1 0 63 72
5 W03 140 2 0 142 162
6 W04 6 1 0 7 7
7 W05 46 0 1 47 48
8 W06 322 46 1 369 425
9 W07 91 0 1 92 108
10 W08 93 2 0 95 104
11 W09 28 1 0 29 30
12 W10 25 0 0 25 28
What I need is for the rows with 'Blanks' to be combined into 1 row. Little help?
Thanks.
You cannot use the alias in the GROUP BY, but you can use the expression that builds the value:
GROUP BY nvl(trim(a.code), 'Blanks')
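The point is that the grouping key has to be the normalized value, not the raw column. A small sketch of that idea in Python, for illustration only: the helper ward_label is a made-up stand-in for nvl(trim(a.code), 'Blanks'), and the sample codes are hypothetical.

```python
from collections import Counter


def ward_label(code):
    """Stand-in for nvl(trim(code), 'Blanks'): None, empty and blank strings all map to 'Blanks'."""
    trimmed = (code or "").strip()
    return trimmed if trimmed else "Blanks"


# Hypothetical raw codes: a null, a blank string and an empty string all occur.
codes = ["W01", None, "  ", "W01 ", "", "W02"]

# Grouping on the raw value keeps the null and blank variants apart...
raw_groups = Counter(codes)
# ...while grouping on the normalized expression merges them into one key.
normalized_groups = Counter(ward_label(code) for code in codes)
print(normalized_groups)
```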

Spotfire - Sum() over

I'm trying to sum the values in column INDICATOR for the last 30 days from DATE, by account.
My expression is: Sum([INDICATOR]) over (Intersect([id],LastPeriods(30,[DATE]))) but the results are not accurate.
Any help is appreciated.
Sample data below:
DATE 30DAYSBACK ID INDICATOR RUNNING30 EXPECTED
3/2/16 2/1/16 ABC 1 3 3
3/2/16 2/1/16 ABC 1 3 3
3/2/16 2/1/16 ABC 1 3 3
3/7/16 2/6/16 ABC 1 7 7
3/7/16 2/6/16 ABC 1 7 7
3/7/16 2/6/16 ABC 1 7 7
3/7/16 2/6/16 ABC 1 7 7
3/8/16 2/7/16 ABC 1 10 10
3/8/16 2/7/16 ABC 1 10 10
3/8/16 2/7/16 ABC 1 10 10
3/10/16 2/9/16 ABC 1 12 12
3/10/16 2/9/16 ABC 1 12 12
3/14/16 2/13/16 ABC 1 13 13
3/15/16 2/14/16 ABC 1 14 14
3/16/16 2/15/16 ABC 1 15 15
3/21/16 2/20/16 ABC 1 16 16
3/22/16 2/21/16 ABC 1 17 17
3/23/16 2/22/16 ABC 1 19 19
3/23/16 2/22/16 ABC 1 19 19
3/25/16 2/24/16 ABC 1 20 20
3/29/16 2/28/16 ABC 1 22 22
3/29/16 2/28/16 ABC 1 22 22
3/30/16 2/29/16 ABC 1 27 27
3/30/16 2/29/16 ABC 1 27 27
3/30/16 2/29/16 ABC 1 27 27
3/30/16 2/29/16 ABC 1 27 27
3/30/16 2/29/16 ABC 1 27 27
3/31/16 3/1/16 ABC 1 29 29
3/31/16 3/1/16 ABC 1 29 29
4/1/16 3/2/16 ABC 1 31 31
4/1/16 3/2/16 ABC 1 31 31
4/4/16 3/5/16 ABC 1 32 29
4/5/16 3/6/16 ABC 1 33 30
4/13/16 3/14/16 ABC 1 34 27
4/13/16 3/14/16 ABC 1 34 27
4/13/16 3/14/16 ABC 1 34 27
4/13/16 3/14/16 ABC 1 34 27
4/15/16 3/16/16 ABC 1 35 24
4/20/16 3/21/16 ABC 1 31 26
4/20/16 3/21/16 ABC 1 31 26
4/20/16 3/21/16 ABC 1 31 26
4/25/16 3/26/16 ABC 1 31 25
4/25/16 3/26/16 ABC 1 31 25
4/25/16 3/26/16 ABC 1 31 25
4/26/16 3/27/16 ABC 1 31 26
4/27/16 3/28/16 ABC 1 34 29
4/27/16 3/28/16 ABC 1 34 29
4/27/16 3/28/16 ABC 1 34 29
4/27/16 3/28/16 ABC 1 34 29
4/28/16 3/29/16 ABC 1 35 30
I wasn't able to determine a suitable solution within Spotfire. However, I was able to write some R code that takes the Date and Indicator columns, sums the indicator for the past 30 days, and returns the result as a column.
Here is the code:
record.date <- as.Date(date.column, "%m/%d/%Y")
indicator.vector <- indicator.column
num.entry <- length(record.date)
running.thirty <- numeric(num.entry)
for (i in 1:num.entry) {
  date.test <- record.date[i]
  count <- 0
  # add up every indicator dated within the 30 days up to and including date.test
  for (j in 1:num.entry) {
    if (record.date[j] >= date.test - 30 && record.date[j] <= date.test) {
      count <- count + indicator.vector[j]
    }
  }
  running.thirty[i] <- count
}
output <- running.thirty
Use Tools >> Register Data Function
(1) Insert the script
(2) Create the two input parameters:
input parameters
(3) Create the output parameter:
output parameter
NOTE: I think there are some errors in your expected values near the end of your data set.
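The same windowing rule as the R loop (every row dated from 30 days before the current date up to and including it) can be sketched in Python for illustration. The function name running_window_sum is made up, and the sample is just the first ten rows of the question's data.

```python
from datetime import date, timedelta


def running_window_sum(dates, indicators, days=30):
    """For each row, sum the indicators of all rows dated within [date - days, date]."""
    totals = []
    for current in dates:
        start = current - timedelta(days=days)
        totals.append(sum(v for d, v in zip(dates, indicators) if start <= d <= current))
    return totals


# First ten rows of the sample data: three on 3/2/16, four on 3/7/16, three on 3/8/16.
dates = [date(2016, 3, 2)] * 3 + [date(2016, 3, 7)] * 4 + [date(2016, 3, 8)] * 3
indicators = [1] * len(dates)

totals = running_window_sum(dates, indicators)
print(totals)
```

Like the R version this compares every row with every other row, which is fine for small tables but grows quadratically with the number of rows.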

How to check if all days of the month are in the database

This is my table wys_attendence:
id studid adate amonth ayear acls_id attendence
1 28 02 07 2015 10 1
2 31 02 07 2015 10 0
4 32 02 07 2015 10 1
5 28 13 07 2015 10 0
6 31 13 07 2015 10 1
7 32 13 07 2015 10 1
9 28 14 07 2015 10 1
10 31 14 07 2015 10 1
11 32 14 07 2015 10 1
13 28 15 07 2015 10 1
14 31 15 07 2015 10 0
15 32 15 07 2015 10 1
17 28 16 07 2015 10 0
18 31 16 07 2015 10 1
19 32 16 07 2015 10 1
21 28 17 07 2015 10 1
22 31 17 07 2015 10 1
23 32 17 07 2015 10 0
24 28 20 08 2015 10 1
25 31 20 08 2015 10 1
26 32 20 08 2015 10 0
I want to check if every day of a specific year and month is in the table, and display the results in a pivot table.
I am using this code:
$dayCount = date('t', strtotime('01-'. $amonth . '-' . $ayear));
$days = [];
for($i = 1; $i <= $dayCount; $i++) {
$days[] = $i; }
to display all the dates of a selected month and year.
I am using this code for controller:
for($i = 1; $i <= $dayCount; $i++) {
$days[] = sprintf('%02d/%02d/%04d', $i, $amonth, $ayear);
$attendance = DB::table('wys_teacherattendances')
->where('t_amonth', $amonth)
->where('t_ayear', $ayear)
->get();
}
The output I get is incorrect. This is what I get:
studid 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17.......... 31
28 0 1 1 1 0 1
31 1 1 1 1 1 1
32 0 1 1 1 1 0
But I want it like this:
studid 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17.......... 31
28 x 0 x x x x x x x x x x x 1 1 1 0 1.....x x x..x
31 x 1 x x x x x x x x x x x 1 1 1 1 1.....x x x..x
32 x 0 x x x x x x x x x x x 1 1 1 1 0.....x x x..x
How can I modify my query to achieve the above result as well as check if all days of my selected month and year are in the database or not? If studid doesn't have a specific day, I want to display x in the corresponding column, otherwise I want to display the value of attendance.
And also how to change my view.blade.php page to fetch my database table?
This is my view.blade.php code:
@for($i = 1; $i <= $dayCount; $i++)
<td>{{$i}}</td>
@endfor
</tr>
@foreach($stud as $studs)
<tr>
<td>{{$studs->sname}}</td>
@foreach($attendance as $attendances)
@if($studs->id == $attendances->t_auserid)
@if($attendances->t_attendance == 1)
<td><font color="green">p</font></td>
@elseif($attendances->t_attendance == 0)
<td><font color="red">a</font></td>
@endif
@endif
@endforeach
</tr>
@endforeach
</tr>
@endforeach
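The grid described in the question (one cell per day of the month, x where a student has no record for that day) could be built along these lines. This is a minimal sketch in Python for illustration only, not Laravel code. The function name attendance_grid is made up, and the records are a subset of the July 2015 rows from the wys_attendence table above.

```python
import calendar


def attendance_grid(records, year, month):
    """Return {studid: one cell per day of the month}, with 'x' for days without a record."""
    day_count = calendar.monthrange(year, month)[1]
    grid = {}
    for studid, day, attendance in records:
        row = grid.setdefault(studid, ["x"] * day_count)
        row[day - 1] = str(attendance)
    return grid


# (studid, adate, attendence) for July 2015, students 28 and 31, taken from the table.
records = [
    (28, 2, 1), (31, 2, 0), (28, 13, 0), (31, 13, 1), (28, 14, 1), (31, 14, 1),
    (28, 15, 1), (31, 15, 0), (28, 16, 0), (31, 16, 1), (28, 17, 1), (31, 17, 1),
]
grid = attendance_grid(records, 2015, 7)

# True only if every student has a record for every day of the month.
all_days_present = all("x" not in row for row in grid.values())
for studid, row in grid.items():
    print(studid, " ".join(row))
```

In a Blade view the equivalent would be to loop over the days for each student and look up the record for that day, rather than looping over the records themselves.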