Getting average of product sales each day and calculate number of days that have positive sales - sql

I have a table TARGETSALE with the following columns:
SELECT DATE, WEEK, BRANCH, PROD, TARGETREACH
FROM TARGETSALE
WHERE BRANCH = 1
AND WEEK BETWEEN 52 AND 53;
DATE WEEK BRANCH PROD TARGETREACH
-------------------------------------------------------------------
01/09/2014 52 1 1 50
02/09/2014 52 1 1 -10
03/09/2014 52 1 1 50
04/09/2014 52 1 1 50
05/09/2014 52 1 1 40
06/09/2014 52 1 1 -10
07/09/2014 53 1 1 -5
08/09/2014 53 1 1 0
09/09/2014 53 1 1 10
10/09/2014 53 1 1 20
11/09/2014 53 1 1 30
12/09/2014 53 1 1 40
13/09/2014 53 1 1 0
01/09/2014 52 1 2 20
02/09/2014 52 1 2 0
03/09/2014 52 1 2 0
04/09/2014 52 1 2 10
05/09/2014 52 1 2 20
06/09/2014 52 1 2 10
07/09/2014 53 1 2 -10
08/09/2014 53 1 2 10
09/09/2014 53 1 2 -10
10/09/2014 53 1 2 20
11/09/2014 53 1 2 20
12/09/2014 53 1 2 40
13/09/2014 53 1 2 0
01/09/2014 52 1 3 30
02/09/2014 52 1 3 30
03/09/2014 52 1 3 5
04/09/2014 52 1 3 0
05/09/2014 52 1 3 10
06/09/2014 52 1 3 -10
07/09/2014 53 1 3 -10
08/09/2014 53 1 3 -10
09/09/2014 53 1 3 20
10/09/2014 53 1 3 10
11/09/2014 53 1 3 40
12/09/2014 53 1 3 10
13/09/2014 53 1 3 10
TARGETREACH shows how far above the target the sales were; a negative value shows how far below the target they were. How can I do the following?
1. I need the average across all products for each day, something like this:
DATE BRANCH AVERAGE_SALES_OF_ALL_PRODUCT
01/09/2014 1 33.33
02/09/2014 1 -1.67
...and so on
2. Then I need another query that shows how many days within those two weeks have positive average sales, something like this:
BRANCH 2WEEKS_SINCE DAYS_WITH_POSITIVE_AVERAGE_SALES
1 53 9
The above is just an example, not a real result.
Sorry, I hope this is not too confusing. Thank you so much.

In Oracle, the date type might still have a time component. If you do not know whether one is present, use trunc() to remove it:
select trunc(date), branch, avg(targetreach)
from targetsale
group by trunc(date), branch
order by 1, 2;
For the second query, you want to use case:
select branch, count(distinct case when targetreach > 0 then date end) as DaysWithPositiveSales
from targetsale
group by branch;
If you know there is one row per date per branch -- and the time component of the date is empty -- then the distinct is not necessary.
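As a rough, runnable check of both queries, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. It loads only three days of the question's data, names the date column `day` (an assumption, to sidestep the reserved word), and reads "positive average sales" as "the daily average across products is above zero", which is stricter than counting days with at least one positive row.

```python
import sqlite3

# A subset of the question's rows: (day, week, branch, prod, targetreach).
# The column name "day" is an assumption standing in for DATE.
rows = [
    ("2014-09-01", 52, 1, 1, 50), ("2014-09-01", 52, 1, 2, 20), ("2014-09-01", 52, 1, 3, 30),
    ("2014-09-02", 52, 1, 1, -10), ("2014-09-02", 52, 1, 2, 0), ("2014-09-02", 52, 1, 3, 30),
    ("2014-09-07", 53, 1, 1, -5), ("2014-09-07", 53, 1, 2, -10), ("2014-09-07", 53, 1, 3, -10),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE targetsale (day TEXT, week INTEGER, branch INTEGER, "
    "prod INTEGER, targetreach INTEGER)"
)
conn.executemany("INSERT INTO targetsale VALUES (?, ?, ?, ?, ?)", rows)

# Query 1: average over all products per day and branch.
daily_avg_sql = """
    SELECT day, branch, AVG(targetreach) AS avg_reach
    FROM targetsale
    WHERE branch = 1 AND week BETWEEN 52 AND 53
    GROUP BY day, branch
"""
daily = conn.execute(
    f"SELECT * FROM ({daily_avg_sql}) ORDER BY day, branch"
).fetchall()

# Query 2: count the days whose average is positive, per branch.
positive_days = conn.execute(
    f"SELECT branch, COUNT(*) FROM ({daily_avg_sql}) "
    "WHERE avg_reach > 0 GROUP BY branch"
).fetchall()

for day, branch, avg_reach in daily:
    print(day, branch, round(avg_reach, 2))
print(positive_days)
```

The second query simply wraps the first one as a subquery, so the per-day averaging logic is written once and then filtered.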

1)
SELECT TRUNC(DATE, 'DD'), BRANCH, AVG(TARGETREACH)
FROM TARGETSALE WHERE BRANCH = 1 AND WEEK BETWEEN 52 AND 53
GROUP BY TRUNC(DATE, 'DD'), BRANCH;
2)
SELECT BRANCH, SUM(DECODE(SIGN(TARGETREACH), 1, 1, 0))
FROM TARGETSALE WHERE BRANCH = 1 AND WEEK BETWEEN 52 AND 53
GROUP BY BRANCH;

Related

running total starting from a date column

I'm trying to get a running total as of a date. This is the data I have
Date   transaction Amount  End of Week Balance
jan 1  5                   100
jan 2  3                   100
jan 3  4                   100
jan 4  3                   100
jan 5  1                   100
jan 6  3                   100
I would like to find out what the daily end balance is. My thought is to get a running total from each day to the end of the week and subtract it from the end of week balance, like below
Date   transaction Amount  Running total  End of Week Balance  Balance - Running total
jan 1  5                   19             100                  86
jan 2  3                   14             100                  89
jan 3  4                   11             100                  93
jan 4  3                   7              100                  96
jan 5  1                   4              100                  97
jan 6  3                   3              100                  100
I can use
SUM(transactionAmount) OVER (Order by Date)
to get a running total, but is there a way to specify that I only want the total of the transactions that took place after the date?
You can use sum() as a window function, but accumulate in reverse:
select t.*,
(end_of_week_balance -
sum(transactionAmount) over (order by date desc)
)
from t;
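To make the arithmetic concrete, here is a small sketch in plain Python (not SQL) using the question's numbers. It computes the reverse running total including each day's own transaction, which reproduces the question's "Running total" column, and then the total of transactions strictly after each day, which reproduces the question's last column. The variable names are made up for illustration. In SQL, excluding the current row usually means adding an explicit window frame, for example one ending at `1 preceding`, and treating the resulting NULL for the last day as zero.

```python
from itertools import accumulate

# Transaction amounts for jan 1 .. jan 6, taken from the question.
amounts = [5, 3, 4, 3, 1, 3]
end_of_week_balance = 100

# Reverse running total, including each day's own transaction:
# accumulate from the last day backwards, then restore the original order.
inclusive = list(accumulate(reversed(amounts)))[::-1]

# Total of the transactions strictly after each day.
after_day = [total - amount for total, amount in zip(inclusive, amounts)]

# Daily end balance: end-of-week balance minus everything that came later.
daily_balance = [end_of_week_balance - later for later in after_day]

print(inclusive)      # [19, 14, 11, 7, 4, 3]
print(after_day)      # [14, 11, 7, 4, 3, 0]
print(daily_balance)  # [86, 89, 93, 96, 97, 100]
```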
If you have this example:
1> select i, sum(i) over (order by i) S from integers where i<10;
2> go
i S
----------- -----------
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
you can also do:
1> select i, sum(case when i>3 then i else 0 end) over (order by i) S from integers where i<10;
2> go
i S
----------- -----------
1 0
2 0
3 0
4 4
5 9
6 15
7 22
8 30
9 39

Hive Summing up data in the table based on the date range

I have a table with the following schema, and the data in it looks like this:
ID HITS MISS DDATE
1 10 3 20180101
1 33 21 20180122
1 84 11 20180901
1 11 2 20180405
1 54 23 20190203
1 33 43 20190102
4 54 22 20170305
4 56 88 20180115
5 87 22 20180809
5 66 48 20180617
5 91 53 20170606
DataTypes:
ID INT
HITS INT
MISS INT
DDATE STRING
The requirement is to calculate the totals of HITS and MISS on a yearly basis, i.e. 2017, 2018, 2019, ...
I have written the following query:
SELECT ID,
SUM(HITS) AS HITS,SUM(MISS) AS MISS,
CASE
WHEN DDATE BETWEEN '201701' AND '201712' THEN '2017' ELSE
'NOTHING' END AS TTL_YR17_DATA
CASE
WHEN DDATE BETWEEN '201801' AND '201812' THEN '2018' ELSE
'NOTHING' END AS TTL_YR18_DATA
CASE
WHEN DDATE BETWEEN '201901' AND '201912' THEN '2019' ELSE
'NOTHING' END AS TTL_YR19_DATA
FROM
HST_TABLE
WHERE
DDATE BETWEEN '201801' AND '201812'
GROUP BY
ID,DDATE;
But the query is not returning the expected result.
Actual O/P:
1 10 3 2018
1 33 21 2018
1 84 11 2018
1 11 2 2018
1 54 23 2019
1 33 43 2019
4 54 22 2017
4 56 88 2018
5 87 22 2018
5 66 48 2018
5 91 53 2017
Expected O/P:
1 138 37 2018
4 56 88 2018
5 153 70 2018
1 87 66 2019
5 91 53 2017
Another related question:
Is there a way that I can avoid passing the DDATE range in the query? As this should be given by the user and shouldn't be hardcoded.
Any help/advice to achieve the above two requirements will be really helpful.
OK, it's easy to implement this with the substring function in Hive, as below:
select
substring(dddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(dddate,0,4),
id
order by
the_year,
id
The answer above by @Shawn.X takes the right approach but misspells the column name (dddate instead of ddate). Below is the corrected version:
select
substring(ddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(ddate,0,4),
id
order by
the_year,
id;
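The grouping idea can be checked against the question's sample rows with a short Python sketch (plain Python, not Hive). The year is the first four characters of DDATE, and an optional `years` argument, a hypothetical parameter added here, shows one way a user-supplied filter could replace a hard-coded date range.

```python
from collections import defaultdict

# (ID, HITS, MISS, DDATE) rows copied from the question.
rows = [
    (1, 10, 3, "20180101"), (1, 33, 21, "20180122"), (1, 84, 11, "20180901"),
    (1, 11, 2, "20180405"), (1, 54, 23, "20190203"), (1, 33, 43, "20190102"),
    (4, 54, 22, "20170305"), (4, 56, 88, "20180115"),
    (5, 87, 22, "20180809"), (5, 66, 48, "20180617"), (5, 91, 53, "20170606"),
]

def yearly_totals(rows, years=None):
    """Sum HITS and MISS per (year, ID); `years` optionally restricts the result."""
    totals = defaultdict(lambda: [0, 0])
    for id_, hits, miss, ddate in rows:
        year = ddate[:4]  # same idea as taking the first 4 characters in SQL
        if years is not None and year not in years:
            continue
        totals[(year, id_)][0] += hits
        totals[(year, id_)][1] += miss
    return {key: tuple(value) for key, value in sorted(totals.items())}

all_years = yearly_totals(rows)
only_2018 = yearly_totals(rows, years={"2018"})

for (year, id_), (hits, miss) in all_years.items():
    print(year, id_, hits, miss)
```

Note that grouping over all years also yields a row for ID 4 in 2017 (54 hits, 22 misses), which the question's expected output does not list.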

sum every 7 rows from column sales while ints representing n days away from installation of promotion-material (before and after the installation)

There are 2 stores, each with its sales data per day. Both get equipped with promotion material, but not on the same day. After the pr_day the promotion material stays there, meaning there should be a sales boost from the day the promotion material is installed.
Installation Date:
Store A - 05/15/2019
Store B - 05/17/2019
To see whether the promotion was a success, we measure sales before and after the pr-date: the number of pieces sold (not revenue), summed over both stores, next to an integer indicating how many days away from the pr-day each row is:
pr_date| sales
-28 | 35
-27 | 40
-26 | 21
-25 | 36
-24 | 29
-23 | 36
-22 | 43
-21 | 31
-20 | 32
-19 | 21
-18 | 17
-17 | 34
-16 | 34
-15 | 37
-14 | 32
-13 | 29
-12 | 25
-11 | 45
-10 | 43
-9 | 26
-8 | 27
-7 | 33
-6 | 36
-5 | 17
-4 | 34
-3 | 33
-2 | 21
-1 | 28
1 | 16
2 | 6
3 | 16
4 | 29
5 | 32
6 | 30
7 | 30
8 | 30
9 | 17
10 | 12
11 | 35
12 | 30
13 | 15
14 | 28
15 | 14
16 | 16
17 | 13
18 | 27
19 | 22
20 | 34
21 | 33
22 | 22
23 | 13
24 | 35
25 | 28
26 | 19
27 | 17
28 | 29
You may have noticed that I already removed the day of the installation of the promotion material.
The issue starts with the different installation dates of the pr-material. If I group by week, it combines sales from days at different distances from the installation. It just starts at whatever weekday I define:
Select DATEDIFF(wk, change_date, sales_date), sum(sales)
from tbl_sales
group by DATEDIFF(wk, change_date, sales_date)
result:
week | sales
-4 | 75
-3 | 228
-2 | 204
-1 | 235
0 | 149
1 | 173
2 | 151
3 | 167
4 | 141
The numbers are not from the right days and there is one week too many. I guess this comes from SQL grouping the sales into weeks starting on Sunday; because the pr_dates are different, it generates more than just the 8 weeks (4 before, 4 after).
Trying to find a sustainable solution, I couldn't find the right fit and decided to post it here. I am very thankful for any thoughts from the community on this topic. I am quite sure there is a smart solution for this problem, because it doesn't look like a rare request to me.
I tried it with OVER as well, but I don't see how to sum the 7 days together, as they are no longer calendar days but deltas to the pr-date.
Desired Result:
week | sales
-4 | 240
-3 | 206
-2 | 227
-1 | 202
1 | 159
2 | 167
3 | 159
4 | 163
Attachment from my manual analysis of what the results should be:
Why do I need the weekly summary? The stores perform differently depending on the weekday. By summing 7 days together, I make sure we don't compare Mondays to Sundays and so on. Furthermore, the result will be shown in a line or bar chart, where the weekday variation would look ugly and make it hard to see the trend in the sales numbers, whereas the weekly comparison absorbs these variations.
If anything is unclear, please let me know so I can provide further details.
Thank you very much.
Additionally, here is an overview for the different installation dates:
Shop A:
store A
delta date sales
-28 17.04.2019 20
-27 18.04.2019 20
-26 19.04.2019 13
-25 20.04.2019 25
-24 21.04.2019 16
-23 22.04.2019 20
-22 23.04.2019 26
-21 24.04.2019 15
-20 25.04.2019 20
-19 26.04.2019 13
-18 27.04.2019 13
-17 28.04.2019 20
-16 29.04.2019 21
-15 30.04.2019 20
-14 01.05.2019 17
-13 02.05.2019 13
-12 03.05.2019 9
-11 04.05.2019 34
-10 05.05.2019 28
-9 06.05.2019 19
-8 07.05.2019 14
-7 08.05.2019 23
-6 09.05.2019 18
-5 10.05.2019 9
-4 11.05.2019 22
-3 12.05.2019 17
-2 13.05.2019 14
-1 14.05.2019 19
0 15.05.2019 11
1 16.05.2019 0
2 17.05.2019 0
3 18.05.2019 1
4 19.05.2019 19
5 20.05.2019 18
6 21.05.2019 14
7 22.05.2019 11
8 23.05.2019 12
9 24.05.2019 8
10 25.05.2019 7
11 26.05.2019 19
12 27.05.2019 15
13 28.05.2019 15
14 29.05.2019 11
15 30.05.2019 5
16 31.05.2019 8
17 01.06.2019 10
18 02.06.2019 19
19 03.06.2019 14
20 04.06.2019 21
21 05.06.2019 22
22 06.06.2019 7
23 07.06.2019 6
24 08.06.2019 23
25 09.06.2019 17
26 10.06.2019 9
27 11.06.2019 8
28 12.06.2019 23
Shop B:
store B
delta date sales
-28 19.04.2019 15
-27 20.04.2019 20
-26 21.04.2019 8
-25 22.04.2019 11
-24 23.04.2019 13
-23 24.04.2019 16
-22 25.04.2019 17
-21 26.04.2019 16
-20 27.04.2019 12
-19 28.04.2019 8
-18 29.04.2019 4
-17 30.04.2019 14
-16 01.05.2019 13
-15 02.05.2019 17
-14 03.05.2019 15
-13 04.05.2019 16
-12 05.05.2019 16
-11 06.05.2019 11
-10 07.05.2019 15
-9 08.05.2019 7
-8 09.05.2019 13
-7 10.05.2019 10
-6 11.05.2019 18
-5 12.05.2019 8
-4 13.05.2019 12
-3 14.05.2019 16
-2 15.05.2019 7
-1 16.05.2019 9
0 17.05.2019 9
1 18.05.2019 16
2 19.05.2019 6
3 20.05.2019 15
4 21.05.2019 10
5 22.05.2019 14
6 23.05.2019 16
7 24.05.2019 19
8 25.05.2019 18
9 26.05.2019 9
10 27.05.2019 5
11 28.05.2019 16
12 29.05.2019 15
13 30.05.2019 17
14 31.05.2019 9
15 01.06.2019 8
16 02.06.2019 3
17 03.06.2019 8
18 04.06.2019 8
19 05.06.2019 13
20 06.06.2019 11
21 07.06.2019 15
22 08.06.2019 7
23 09.06.2019 12
24 10.06.2019 11
25 11.06.2019 10
26 12.06.2019 9
27 13.06.2019 6
28 14.06.2019 9
Try
select wk, sum(sales)
from (
select
isnull(sa.sales,0) + isnull(sb.sales,0) sales
, isnull(sa.delta , sb.delta) delta
, case when isnull(sa.delta , sb.delta) = 0 then 0
else case when isnull(sa.delta , sb.delta) > 0 then (isnull(sa.delta , sb.delta) -1) /7 +1
else (isnull(sa.delta , sb.delta) +1) /7 -1
end
end wk
from shopA sa
full join shopB sb on sa.delta=sb.delta
) t
group by wk;
A more readable version. It doesn't run faster; using CROSS APPLY this way just allows you to introduce a sort of intermediate variable for cleaner code.
select wk, sum(sales)
from (
select
isnull(sa.sales,0) + isnull(sb.sales,0) sales
, dlt delta
, case when dlt = 0 then 0
else case when dlt > 0 then (dlt - 1) / 7 + 1
else (dlt + 1) / 7 - 1
end
end wk
from shopA sa
full join shopB sb on sa.delta=sb.delta
cross apply (
select dlt = isnull(sa.delta, sb.delta)
) tmp
) t
group by wk;
Finally, if you already have a query which produces a dataset with the (pr_date, sales) columns:
select wk, sum(sales)
from (
select sales
, case when pr_date = 0 then 0
else case when pr_date > 0 then (pr_date - 1) / 7 + 1
else (pr_date + 1) / 7 - 1
end
end wk
from (
-- ... your query here ...
)pr_date_sales
) t
group by wk;
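The week arithmetic in these queries can be checked against the question's combined (pr_date, sales) table with a small Python sketch (plain Python, not T-SQL). One caveat: Python's `//` rounds toward negative infinity, whereas integer division in the SQL above truncates toward zero, so the sketch mirrors the negative side explicitly instead of copying the expression. The function name `week_bucket` is made up for illustration.

```python
from collections import defaultdict

def week_bucket(delta):
    """Map a day offset from the pr-day to a week number (day 0 gets week 0)."""
    if delta == 0:
        return 0
    if delta > 0:
        return (delta - 1) // 7 + 1   # 1..7 -> 1, 8..14 -> 2, ...
    return -((-delta - 1) // 7 + 1)   # -1..-7 -> -1, -8..-14 -> -2, ...

# Combined sales of both stores from the question, for deltas -28..-1 and 1..28.
before = [35, 40, 21, 36, 29, 36, 43, 31, 32, 21, 17, 34, 34, 37,
          32, 29, 25, 45, 43, 26, 27, 33, 36, 17, 34, 33, 21, 28]
after = [16, 6, 16, 29, 32, 30, 30, 30, 17, 12, 35, 30, 15, 28,
         14, 16, 13, 27, 22, 34, 33, 22, 13, 35, 28, 19, 17, 29]

sales_by_delta = dict(zip(range(-28, 0), before))
sales_by_delta.update(zip(range(1, 29), after))

# Sum the daily sales per week bucket.
weekly = defaultdict(int)
for delta, sales in sales_by_delta.items():
    weekly[week_bucket(delta)] += sales

for week in sorted(weekly):
    print(week, weekly[week])
```

The weekly sums match the "Desired Result" table in the question: 240, 206, 227, 202 before the installation and 159, 167, 159, 163 after it.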
I think you just need to take the day difference and use arithmetic. Using datediff() with week counts week-boundaries -- which is not what you want. That is, it normalizes the weeks to calendar weeks.
You want to leave out the day of the promotion, which makes this a wee bit more complicated.
I think this is the logic:
Select v.week_diff, sum(sales)
from tbl_sales s cross apply
(values (case when change_date < sales_date
then (datediff(day, change_date, sales_date) - 1) / 7 + 1
else (datediff(day, change_date, sales_date) + 1) / 7 - 1
end)
) v(week_diff)
where change_date <> sales_date
group by v.week_diff;
There might be an off-by-one problem, depending on what you really want to do when the dates are the same.

How to verify whether records exist for the last x days (calendar days) in SQL not using the between key word

I want to verify whether my table has records for each of the last 6 consecutive days in SQL.
SNO FLIGHT_DATE LANDINGS
45 9/1/2013 1
31 10/1/2013 1
32 11/1/2013 1
30 11/24/2013 1
27 11/25/2013 1
28 11/26/2013 1
29 11/26/2013 1
33 11/26/2013 1
26 11/30/2013 1
25 12/1/2013 1
34 12/1/2013 1
24 12/2/2013 1
35 12/3/2013 1
36 12/3/2013 1
44 12/4/2013 1
46 12/6/2013 1
47 12/6/2013 1
Is this what you want?
SELECT
*
FROM
Table1
WHERE
FLIGHT_DATE > dateadd(day,-6,datediff(day,0,getdate()))
AND
FLIGHT_DATE < GETDATE();
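The query above returns the rows that fall in the window; deciding whether every day in the window is covered is a separate step. A minimal sketch of that check in plain Python (not SQL) follows. It assumes "the last 6 days" means the 6 calendar days ending today, uses a hypothetical reference date of 12/6/2013, and takes only the most recent dates from the question's sample. The function name `missing_days` is made up for illustration.

```python
from datetime import date, timedelta

def missing_days(record_dates, today, n_days=6):
    """Return the calendar days in the n-day window ending today that have no record."""
    present = set(record_dates)
    window = [today - timedelta(days=offset) for offset in range(n_days - 1, -1, -1)]
    return [day for day in window if day not in present]

# Most recent FLIGHT_DATE values from the question (dates are m/d/yyyy there).
flight_dates = [
    date(2013, 11, 30), date(2013, 12, 1), date(2013, 12, 1), date(2013, 12, 2),
    date(2013, 12, 3), date(2013, 12, 3), date(2013, 12, 4),
    date(2013, 12, 6), date(2013, 12, 6),
]

today = date(2013, 12, 6)  # hypothetical "current date" for the example
missing = missing_days(flight_dates, today)
has_all_six = not missing

print(missing)      # [datetime.date(2013, 12, 5)]
print(has_all_six)  # False
```

In SQL the same idea is often expressed as counting the distinct dates inside the window and comparing that count with 6.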

SQL terminology to combine a NOT EXIST query with latest value

I am a beginner with basic knowledge.
I have a single table from which I am trying to pull all UIDs that have not had a particular code within the past year.
My table looks like this: (but much larger of course)
FACID DPID EID DID UID DT Code Units Charge ET Ord
1 1 6 2 1002 15-Mar-07 99204 1 180 09:36.7 1
1 1 7 5 10004 15-Mar-07 99213 1 68 02:36.9 1
1 1 24 55 25887 15-Mar-07 99213 1 68 43:55.3 1
1 1 25 2 355688 15-Mar-07 99213 1 68 53:20.2 1
1 1 26 5 555654 15-Mar-07 99213 1 68 42:22.6 1
1 1 27 44 135514 15-Mar-07 99213 1 68 00:36.8 1
1 1 28 2 3244522 15-Mar-07 99214 1 98 34:59.4 1
1 1 29 5 235445 15-Mar-07 99213 1 68 56:42.1 1
1 1 30 3 3214444 15-Mar-07 99213 1 68 54:56.5 1
1 1 33 1 221444 15-Mar-07 99204 1 180 37:44.5 1
I am attempting to use the following, but this is not working for my time frame limits.
select distinct UID from PtProcTbl
where DT<'20120101'
and NOT EXISTS (Select Distinct UID
where Code in ('99203','99204','99205','99213',
'99214','99215','99244','99245'))
I need to know how to make sure the UIDs I am pulling are the ones that don't have a DT after the 1/1/2012 cutoff date with one of the NOT EXISTS codes.
The above query returned UIDs that actually have dates after 1/1/2012 with one of the above codes...
I'm not sure what I am doing wrong, or whether I am totally off base on this.
Thanks in advance.
Are you sure you need the NOT EXISTS? How about instead:
AND Code NOT IN ('99203','99204','99205','99213','99214','99215','99244','99245')
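For comparison, a row-level NOT IN filters individual rows, while the question asks about UIDs as a whole. One possible reading of the requirement is "UIDs that have a row before the cutoff and no row on or after the cutoff with one of the listed codes", which can be written as a correlated NOT EXISTS. The sketch below runs that query with Python's built-in sqlite3 module; the rows are hypothetical (only the UIDs and listed codes come from the question), dates are stored as ISO text, and the code 12345 is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PtProcTbl (UID INTEGER, DT TEXT, Code TEXT)")

# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO PtProcTbl VALUES (?, ?, ?)",
    [
        (1002, "2007-03-15", "99204"),
        (1002, "2012-06-01", "99213"),   # listed code after the cutoff: exclude 1002
        (10004, "2007-03-15", "99213"),  # nothing after the cutoff: keep
        (25887, "2007-03-15", "99213"),
        (25887, "2012-02-01", "12345"),  # after the cutoff, but not a listed code: keep
    ],
)

codes = ("99203", "99204", "99205", "99213", "99214", "99215", "99244", "99245")
placeholders = ", ".join("?" for _ in codes)
cutoff = "2012-01-01"

# The subquery is correlated on UID, so it tests each UID as a whole.
query = f"""
    SELECT DISTINCT p.UID
    FROM PtProcTbl AS p
    WHERE p.DT < ?
      AND NOT EXISTS (
          SELECT 1
          FROM PtProcTbl AS q
          WHERE q.UID = p.UID
            AND q.DT >= ?
            AND q.Code IN ({placeholders})
      )
    ORDER BY p.UID
"""
uids = [row[0] for row in conn.execute(query, (cutoff, cutoff, *codes))]
print(uids)  # [10004, 25887]
```

The key difference from the query in the question is that the subquery names its own table and is tied back to the outer row through `q.UID = p.UID`.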