DB2/SQL aggregates with preceding weekdays - sql

I have a query that currently gets daily records against a weekly number from a prepopulated table:
SELECT Employee,
sum(case when category = 'Shirts' then daily_total else 0 end) as Shirts_DAILY,
sum(case when category = 'Shirts' then weekly_quota else 0 end) as Shirts_QUOTA, -- this is a static column, this number is the same for every record
sum(case when category = 'Shoes' then daily_total else 0 end) as Shoes_DAILY,
sum(case when category = 'Shoes' then weekly_quota else 0 end) as Shoes_QUOTA, -- this is a static column, this number is the same for every record
CURRENT_DATE as DATE_OF_REPORT
from SalesNumbers
where date_of_report >= current_date
group by Employee;
This runs in a script nightly and returns records like this:
Employee | shirts_DAILY | shirts_QUOTA | Shoes_DAILY | Shoes_QUOTA | DATE_OF_REPORT
--------------------------------------------------------------------------------------------------------
123 15 75 14 85 2019-08-30
That's the record from last Friday night's report. I'm trying to figure out a way to add a column for each category that takes the sum of the daily totals (shirts_DAILY, shoes_DAILY) for that category over the current week so far (a week runs Sunday through Saturday) and divides it by that category's quota (shirts_QUOTA, shoes_QUOTA).
For example, here are the records from Sunday through Thursday:
Employee | shirts_DAILY | shirts_QUOTA | Shoes_DAILY | Shoes_QUOTA | DATE_OF_REPORT
--------------------------------------------------------------------------------------------------------
123 15 75 16 85 2019-08-25
123 4 75 2 85 2019-08-26
123 8 75 6 85 2019-08-27
123 2 75 8 85 2019-08-28
123 15 75 14 85 2019-08-29
With my new change, I would want Friday night's record to take the sum of the daily records from Sunday through Friday (Friday's own daily total included) and divide it by the quota.
Friday night's record with the new columns:
Employee | shirts_DAILY | shirts_QUOTA | shirtsPercent | Shoes_DAILY | Shoes_QUOTA | shoesPercent | DATE_OF_REPORT
-----------------------------------------------------------------------------------------------------------------------------------------------
123 2 75 61.3 7 85 62.4 2019-08-30
So Friday's run added 15, 4, 8, 2, 15 and 2 for shirts, giving 46/75, and 7, 14, 8, 6, 2 and 16 for shoes, giving 53/85. In other words, each percent column is the sum of that category's daily totals for the week so far, including the present day, divided by the quota.
What is the best way for me to achieve this?

SELECT Employee,
       sum(case when category = 'Shirts' and date_of_report >= current date
                then daily_total else 0 end) as Shirts_DAILY,
       sum(case when category = 'Shirts' and date_of_report >= current date
                then weekly_quota else 0 end) as Shirts_QUOTA,
       -- week-to-date total over today's quota; 100.0 avoids integer division
       ( sum(case when category = 'Shirts'
                  then daily_total else 0 end) * 100.0 ) /
       ( sum(case when category = 'Shirts' and date_of_report >= current date
                  then weekly_quota else 0 end) ) as Shirts_PERCENT,
       CURRENT_DATE as DATE_OF_REPORT
from SalesNumbers
-- dayofweek() returns 1 for Sunday, so this is the Sunday that starts the current week
where date_of_report >= ( current date - ( dayofweek(current date) - 1 ) days )
group by Employee
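For anyone who wants to check the arithmetic outside DB2, here is a minimal sketch in Python with an in-memory SQLite table standing in for SalesNumbers. It is not the DB2 query itself: the Sunday week start is computed in Python instead of with dayofweek(), the quota is read with MAX() on the assumption that it is the same on every row, and the helper names week_start_sunday and weekly_percent are made up for the illustration. Only Shirts is shown; Shoes would follow the same pattern.

```python
import sqlite3
from datetime import date, timedelta


def week_start_sunday(d: date) -> date:
    """Return the Sunday on or before d (Python's weekday() has Monday = 0)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)


QUERY = """
SELECT Employee,
       SUM(CASE WHEN category = 'Shirts' AND date_of_report = :today
                THEN daily_total ELSE 0 END)                    AS Shirts_DAILY,
       MAX(CASE WHEN category = 'Shirts' THEN weekly_quota END) AS Shirts_QUOTA,
       -- 100.0 forces decimal division; integer division would truncate
       ROUND(SUM(CASE WHEN category = 'Shirts' THEN daily_total ELSE 0 END) * 100.0
             / MAX(CASE WHEN category = 'Shirts' THEN weekly_quota END), 1)
                                                                AS Shirts_PERCENT
FROM SalesNumbers
WHERE date_of_report BETWEEN :week_start AND :today
GROUP BY Employee
"""


def weekly_percent(today: date):
    """Load the question's sample week and return the row for `today`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SalesNumbers (Employee INTEGER, category TEXT, "
        "daily_total INTEGER, weekly_quota INTEGER, date_of_report TEXT)"
    )
    # Shirts totals for 2019-08-25 (Sunday) through 2019-08-30 (Friday)
    shirts = [15, 4, 8, 2, 15, 2]
    rows = [
        (123, "Shirts", total, 75, f"2019-08-{day}")
        for day, total in zip(range(25, 31), shirts)
    ]
    conn.executemany("INSERT INTO SalesNumbers VALUES (?, ?, ?, ?, ?)", rows)
    params = {
        "today": today.isoformat(),
        "week_start": week_start_sunday(today).isoformat(),
    }
    return conn.execute(QUERY, params).fetchone()
```

Called as weekly_percent(date(2019, 8, 30)), this should give Shirts_DAILY 2, Shirts_QUOTA 75 and Shirts_PERCENT 61.3, which is the 46/75 from the question.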

Related

Calculating weekly Hires, Rehires, and Terminations from monthly snapshot

We have a table that contains a snapshot of every employee's data at the end of each month until the month they leave the company. This table also has a snapshot of each employee for the current day, which is replaced each day until the end of the month.
What we're trying to do is select weekly statistics for Hires, Rehires, and Terms for each department. However since we only capture data by month and not by week, I'm having trouble breaking this down by week without getting duplicates.
I'm able to pull monthly statistics similar to this. Is there a method to group by each week in a month if there is only an entry for a month?
select
    Max(AsOfDate) as AsOfDate,
    Sector,
    Department,
    sum(case
            when DatePart(Year, TermDate) = DatePart(Year, AsOfDate) and DatePart(Month, TermDate) = DatePart(Month, AsOfDate) then 1
            else 0
        end) as Terms,
    sum(case
            when DatePart(Year, HireDate) = DatePart(Year, AsOfDate) and DatePart(Month, HireDate) = DatePart(Month, AsOfDate) then 1
            else 0
        end) as Hires,
    sum(case
            when DatePart(Year, RehireDate) = DatePart(Year, AsOfDate) and DatePart(Month, RehireDate) = DatePart(Month, AsOfDate) then 1
            else 0
        end) as Rehires
from Employee_History
group by Year(AsOfDate), datepart(Month, AsOfDate), Sector, Department
Example data if today was 2022-03-17
AsOfDate   | EmployeeID | Department | Title      | HireDate   | RehireDate | TermDate
-----------------------------------------------------------------------------------------
2022-01-31 | EMP22      | HR         | Admin      | 2021-01-12 | null       | 2022-01-17
2022-01-31 | EMP45      | IT         | Programmer | 2022-01-10 | null       | null
2022-02-28 | EMP45      | IT         | Programmer | 2022-01-10 | null       | null
2022-03-17 | EMP45      | IT         | Programmer | 2022-01-10 | null       | null
2022-01-31 | EMP03      | IT         | Manager    | 2018-08-17 | 2022-01-24 | null
2022-02-28 | EMP03      | IT         | Manager    | 2018-08-17 | 2022-01-24 | null
2022-03-17 | EMP03      | IT         | Manager    | 2018-08-17 | 2022-01-24 | null
Desired output for January 2022 for example
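AsOfDate   | Department | Hires | Rehires | Terms
-------------------------------------------------
2022-01-01 | HR         | 0     | 0       | 0
2022-01-08 | HR         | 0     | 0       | 0
2022-01-15 | HR         | 0     | 0       | 0
2022-01-22 | HR         | 0     | 0       | 1
2022-01-29 | HR         | 0     | 0       | 0
2022-01-01 | IT         | 0     | 0       | 0
2022-01-08 | IT         | 0     | 0       | 0
2022-01-15 | IT         | 1     | 0       | 0
2022-01-22 | IT         | 0     | 0       | 0
2022-01-29 | IT         | 0     | 1       | 0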
What you need is a mapping table from each end-of-month snapshot date to the weeks it covers, with one row per week:
create table weekmap(asOfDate DATE, weekDayStart DATE, weekDayEnd DATE, PRIMARY KEY (asOfDate, weekDayStart))
One problem is that your snapshot table contains the current date when the month isn't finished. I would advise changing that so it always holds the end-of-month date, to simplify things. Alternatively, create a new column for that.
Populate the table with whatever logic your weeks should follow; some use the ISO week, some count days from the start of the year, and so on.
Then join your snapshot against this table (you need to handle the case where asOfDate isn't the end of the month):
select w.asOfDate, w.weekDayStart, t.Department
     , SUM(case when HireDate between weekdaystart and weekdayend then 1 else 0 end) AS hires
     , SUM(case when ReHireDate between weekdaystart and weekdayend then 1 else 0 end) AS rehires
     , SUM(case when TermDate between weekdaystart and weekdayend then 1 else 0 end) AS term
from snapshottable t
inner join weekmap w
    ON w.asOfDate = t.asOfDateFixedEndOfMonth
group by w.asOfDate, w.weekDayStart, t.Department
There will be some loss of data if someone is hired and terminated twice in one month, but then you probably have a bigger problem.
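As a rough check of the mapping-table idea against the question's January data, here is a sketch in Python with SQLite standing in for the real database. Several things are assumptions for the illustration: weeks run Sunday through Saturday and are labelled by their Saturday (which is what the desired output appears to use), the weekmap rows are only generated for the 2022-01-31 snapshot, the Title column is left out, and the function name weekly_counts is invented.

```python
import sqlite3
from datetime import date, timedelta


def weekly_counts():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE weekmap (asOfDate TEXT, weekDayStart TEXT, weekDayEnd TEXT,
                              PRIMARY KEY (asOfDate, weekDayStart));
        CREATE TABLE Employee_History (AsOfDate TEXT, EmployeeID TEXT, Department TEXT,
                                       HireDate TEXT, RehireDate TEXT, TermDate TEXT);
    """)

    # One weekmap row per week of the January snapshot (assumed Sunday-Saturday weeks).
    week_end = date(2022, 1, 1)
    while week_end <= date(2022, 1, 29):
        week_begin = week_end - timedelta(days=6)
        conn.execute("INSERT INTO weekmap VALUES (?, ?, ?)",
                     ("2022-01-31", week_begin.isoformat(), week_end.isoformat()))
        week_end += timedelta(days=7)

    conn.executemany("INSERT INTO Employee_History VALUES (?, ?, ?, ?, ?, ?)", [
        ("2022-01-31", "EMP22", "HR", "2021-01-12", None, "2022-01-17"),
        ("2022-01-31", "EMP45", "IT", "2022-01-10", None, None),
        ("2022-02-28", "EMP45", "IT", "2022-01-10", None, None),
        ("2022-01-31", "EMP03", "IT", "2018-08-17", "2022-01-24", None),
        ("2022-02-28", "EMP03", "IT", "2018-08-17", "2022-01-24", None),
    ])

    # Snapshots without weekmap rows (February here) drop out of the inner join,
    # so each employee is counted once per week rather than once per snapshot.
    return conn.execute("""
        SELECT w.weekDayEnd, t.Department,
               SUM(CASE WHEN t.HireDate   BETWEEN w.weekDayStart AND w.weekDayEnd THEN 1 ELSE 0 END) AS hires,
               SUM(CASE WHEN t.RehireDate BETWEEN w.weekDayStart AND w.weekDayEnd THEN 1 ELSE 0 END) AS rehires,
               SUM(CASE WHEN t.TermDate   BETWEEN w.weekDayStart AND w.weekDayEnd THEN 1 ELSE 0 END) AS terms
        FROM Employee_History t
        INNER JOIN weekmap w ON w.asOfDate = t.AsOfDate
        GROUP BY w.weekDayEnd, t.Department
        ORDER BY t.Department, w.weekDayEnd
    """).fetchall()
```

Under those assumptions the ten rows returned line up with the desired January output: one term for HR in the week ending 2022-01-22, one hire for IT in the week ending 2022-01-15, and one rehire for IT in the week ending 2022-01-29.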

SQL query to get top 24 records, then average the first 12 and bottom 12

I'm attempting to analyze each account's performance (A_Count & B_Count) during their first year versus their second year. This should only return clients who have at least 24 months of totals (records).
Volume Table
Account | ReportDate | A_Count | B_Count
----------------------------------------
1001A   | 2019-01-01 | 47      | 100
1001A   | 2019-02-01 | 50      | 105
1002A   | 2019-02-01 | 50      | 105
I think I'm on the right track by wanting to grab the top 24 records for each account (only if 24 exist) and then grabbing the top 12 and bottom 12, but not sure how to get there.
I guess ideal output would be:
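Account | YR1_A_Avg | YR1_B_Avg | YR2_A_Avg | YR2_B_Avg | FirstDate  | LastDate
---------------------------------------------------------------------------------
1001A   | 47        | 100       | 53        | 115       | 2019-01-01 | 2021-12-31
1002A   | 50        | 105       | 65        | 130       | 2019-02-01 | 2022-01-01
1003A   | 15        | 180       | 38        | 200       | 2017-05-01 | 2019-04-01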
I'm not too worried about performance.
Assuming there are no gaps in ReportDate (per Account).
select Account
      ,avg(case when year_index = 1 then A_Count end) as YR1_A_Avg
      ,avg(case when year_index = 1 then B_Count end) as YR1_B_Avg
      ,avg(case when year_index = 2 then A_Count end) as YR2_A_Avg
      ,avg(case when year_index = 2 then B_Count end) as YR2_B_Avg
      ,min(ReportDate) as FirstDate
      ,max(ReportDate) as LastDate
from
(
    select *
          ,count(*) over(partition by Account) as cnt
          -- integer division: rows 1-12 get year_index 1, rows 13-24 get 2
          ,(row_number() over(partition by Account order by ReportDate)-1)/12 +1 as year_index
    from Volume
) t
where cnt >= 24 and year_index <= 2
group by Account
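To see how the row_number() bucketing behaves, here is a small sketch in Python using SQLite (window functions need SQLite 3.25 or later). The data is made up: account 1001A gets 30 monthly rows and account 1002A only 5, so only the first should qualify. The helper names month_rows and first_two_years are invented, and only the A_Count averages are shown to keep it short.

```python
import sqlite3

QUERY = """
SELECT Account,
       AVG(CASE WHEN year_index = 1 THEN A_Count END) AS YR1_A_Avg,
       AVG(CASE WHEN year_index = 2 THEN A_Count END) AS YR2_A_Avg,
       MIN(ReportDate) AS FirstDate,
       MAX(ReportDate) AS LastDate
FROM (
    SELECT v.*,
           COUNT(*) OVER (PARTITION BY Account) AS cnt,
           -- integer division: rows 1-12 -> 1, rows 13-24 -> 2, rows 25+ -> 3 and up
           (ROW_NUMBER() OVER (PARTITION BY Account ORDER BY ReportDate) - 1) / 12 + 1
               AS year_index
    FROM Volume v
) t
WHERE cnt >= 24 AND year_index <= 2
GROUP BY Account
"""


def month_rows(account, n):
    """Build n consecutive monthly rows starting 2019-01-01 (made-up counts)."""
    rows = []
    for i in range(n):
        a_count = 10 if i < 12 else 20 if i < 24 else 99
        report_date = f"{2019 + i // 12}-{i % 12 + 1:02d}-01"
        rows.append((account, report_date, a_count, a_count * 10))
    return rows


def first_two_years():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Volume (Account TEXT, ReportDate TEXT, "
        "A_Count INTEGER, B_Count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Volume VALUES (?, ?, ?, ?)",
        month_rows("1001A", 30) + month_rows("1002A", 5),
    )
    return conn.execute(QUERY).fetchall()
```

With this data only 1001A comes back, with averages of 10 and 20, a FirstDate of 2019-01-01 and a LastDate of 2020-12-01. Note that LastDate is the 24th month, not the account's latest record, because rows past year_index 2 are filtered out before grouping.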

SQL Conditional Counting

I am working with a dataset that contains information about train delays. The dataset contains an arrival delay column and departing delay column. Each delay column is measured in minutes. I need to calculate the number of total delays for each day of the week to determine which day has the most train delays. If the delay is equal to or more than 1 minute, it needs to be counted as a delay. How can I complete this in SQL? I have tried the following code.
select dayofweek
count(case when arrivaldelay>=1 then 1 end)+
count(case when departuredelay>=1 then 1 end)
group by dayofweek;
dayofweek arrivaldelay departuredelay
2 12 5
4 7 10
4 6 -3
6 5 4
dayofweek delays
2 1
4 1
6 1
Assuming dayofweek is a stored column and not a function, you can use either count or sum:
select
dayofweek
, count(case when arrivaldelay >= 1 then 1 end)
+ count(case when departuredelay >= 1 then 1 end)
as delays
from mytable as t
group by dayofweek;
select
dayofweek
, sum(case when arrivaldelay >= 1 then 1 else 0 end)
+ sum(case when departuredelay >= 1 then 1 else 0 end)
as delays
from mytable as t
group by dayofweek;
Both give the following result from the sample data in the question:
+-----------+--------+
| dayofweek | delays |
+-----------+--------+
| 2 | 2 |
| 4 | 3 |
| 6 | 2 |
+-----------+--------+
If dayofweek is not a stored column, you can extract the day of the week from a date or timestamp, but how this is done differs between databases.
This was demonstrated in a db<>fiddle.
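As one concrete case of deriving the day of the week from a date, here is a sketch in Python with SQLite, where strftime('%w', ...) returns 0 for Sunday through 6 for Saturday. Other databases use different functions and numbering, so treat this as SQLite-specific. The table name trips, the column trip_date and the dates themselves are invented; the delay values are the ones from the question.

```python
import sqlite3


def delays_by_weekday():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trips (trip_date TEXT, arrivaldelay INTEGER, departuredelay INTEGER)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", [
        ("2023-01-03", 12, 5),   # Tuesday  -> 2
        ("2023-01-05", 7, 10),   # Thursday -> 4
        ("2023-01-12", 6, -3),   # Thursday -> 4, early departure is not a delay
        ("2023-01-07", 5, 4),    # Saturday -> 6
    ])
    return conn.execute("""
        SELECT CAST(strftime('%w', trip_date) AS INTEGER) AS dayofweek,
               SUM(CASE WHEN arrivaldelay   >= 1 THEN 1 ELSE 0 END)
             + SUM(CASE WHEN departuredelay >= 1 THEN 1 ELSE 0 END) AS delays
        FROM trips
        GROUP BY dayofweek
        ORDER BY dayofweek
    """).fetchall()
```

This returns [(2, 2), (4, 3), (6, 2)], the same counts as the stored-column version.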
You can use sum() like this:
select dayofweek,
       ( sum(case when arrivaldelay >= 1 then 1 else 0 end) +
         sum(case when departuredelay >= 1 then 1 else 0 end)
       ) as delays
from t
group by dayofweek;

day of week function for week day aggregates

I currently have a query that reads from a table and aggregates based on category. It gives me what I need, but I'm trying to add another column that looks at all records for that category/employee combo for the days of this past week. So if the job with this query runs on Wednesday night, it needs to get a total of all category/employee records for Monday and Tuesday night as well.
The query:
SELECT employee,
sum(case when category = 'Shoes' and date_of_report >= current_date - 1 days then daily_total else 0 end) as Shoes_DAILY,
sum(case when category = 'Shoes' and date_of_report >= ( current date - ( dayofweek(current date) - 1 ) days ) then sum(daily_total) else 0 end) as dailyTotalWeek
from shoeTotals
where date_of_report >= current_date
group by employee;
The third column there is what's tripping me up; it fails with an error saying the function use is not valid. Here's what I want:
The source table has these records for this past week:
employee | daily_total | date_of_report
--------------------------------------------------
123 14 2019-08-26
123 1 2019-08-27
123 56 2019-08-28
123 6 2019-08-29
123 8 2019-08-30 * today
My desired output would get (based on employee and category) the total for today (8) and then the sum of all the employee's records for that category on each preceding weekday. Running on Monday night would only count that day's records; Friday night would count Monday through Friday's, as shown above.
employee | shoes_daily | dailyTotalWeek
--------------------------------------------------
123 8 85
What am I doing wrong with the dayofweek function?
You cannot nest aggregation functions. I think you simply want:
select employee,
       -- today only, now that the where clause admits the whole week
       sum(case when category = 'Shoes' and date_of_report >= current date
                then daily_total else 0
           end) as Shoes_DAILY,
       sum(case when category = 'Shoes' and date_of_report >= ( current date - ( dayofweek(current date) - 1 ) days )
                then daily_total else 0
           end) as dailyTotalWeek
from shoeTotals
where date_of_report >= current date - ( dayofweek(current date) - 1 ) days
group by employee;
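Here is a quick way to sanity-check that shape of query against the sample rows, sketched in Python with SQLite in place of DB2. The week start is computed in Python because SQLite has no dayofweek(); DB2's dayofweek() is 1 for Sunday, so the DB2 expression lands on the preceding Sunday, and the Python line mirrors that. The function name shoe_totals and the extra row from the previous week are made up for the illustration.

```python
import sqlite3
from datetime import date, timedelta


def shoe_totals(today: date):
    # Sunday on or before today (Python's weekday() has Monday = 0)
    week_start = today - timedelta(days=(today.weekday() + 1) % 7)

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shoeTotals (employee INTEGER, category TEXT, "
        "daily_total INTEGER, date_of_report TEXT)"
    )
    conn.executemany("INSERT INTO shoeTotals VALUES (?, ?, ?, ?)", [
        (123, "Shoes", 99, "2019-08-23"),  # previous week, should be ignored
        (123, "Shoes", 14, "2019-08-26"),
        (123, "Shoes", 1, "2019-08-27"),
        (123, "Shoes", 56, "2019-08-28"),
        (123, "Shoes", 6, "2019-08-29"),
        (123, "Shoes", 8, "2019-08-30"),
    ])
    # One plain SUM per window: the WHERE clause admits the whole week,
    # and the CASE narrows the first column down to today.
    return conn.execute("""
        SELECT employee,
               SUM(CASE WHEN date_of_report = :today THEN daily_total ELSE 0 END) AS Shoes_DAILY,
               SUM(daily_total) AS dailyTotalWeek
        FROM shoeTotals
        WHERE category = 'Shoes'
          AND date_of_report BETWEEN :week_start AND :today
        GROUP BY employee
    """, {"today": today.isoformat(), "week_start": week_start.isoformat()}).fetchone()
```

shoe_totals(date(2019, 8, 30)) returns (123, 8, 85), matching the desired output in the question.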

SQL Calculations for budgeting

I have a database that contains the following columns:
Vendor, Amount, StartDate, Months
I would like to be able to calculate the average monthly amount based on the Months that are entered. I would also like to see it calculate out from the start date to the end date based on the StartDate + Months calculation. The resulting table would look something like this:
Vendor1 has 2 months of 1112 starting Jan 1, while Vendor2 has 3 months of 2040 starting Feb 1:
| | ANNUAL | JAN | FEB | MAR | APR |
Vendor1 | 2,224 | 1,112 | 1,112 | | |
Vendor2 | 6,120 | | 2,040 | 2,040 | 2,040 |
Any assistance or direction would be greatly appreciated.
That's a strange DB design. However, here's something you could try:
SELECT (Amount * Months) AS Annual, (CASE WHEN StartDate < DATE('01.02.year') THEN Amount ELSE NULL END) AS Jan FROM Table --etc for all months
I'll think about modifications, though, because this approach is a little too straightforward.
You would use conditional (case) expressions, one per month. Assuming the start dates are all in the same year, the code might look like this:
select vendor, (amount * months) as total,
       -- a month is covered when start month <= month < start month + months
       (case when month(startdate) <= 1 and month(startdate) + months > 1
             then amount
        end) as jan,
       (case when month(startdate) <= 2 and month(startdate) + months > 2
             then amount
        end) as feb,
       (case when month(startdate) <= 3 and month(startdate) + months > 3
             then amount
        end) as mar,
       (case when month(startdate) <= 4 and month(startdate) + months > 4
             then amount
        end) as apr
from t;
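The month-coverage test can be tried out on the two vendors from the question with a short Python and SQLite sketch. Assumptions to note: the year 2019 is invented (the question only gives Jan 1 and Feb 1), the table is called budget here, SQLite's strftime('%m', ...) replaces month(), and the per-month columns are generated in a loop purely to avoid repeating the expression by hand.

```python
import sqlite3

MONTHS = [("jan", 1), ("feb", 2), ("mar", 3), ("apr", 4)]


def monthly_spread():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE budget (Vendor TEXT, Amount INTEGER, StartDate TEXT, Months INTEGER)"
    )
    conn.executemany("INSERT INTO budget VALUES (?, ?, ?, ?)", [
        ("Vendor1", 1112, "2019-01-01", 2),
        ("Vendor2", 2040, "2019-02-01", 3),
    ])

    start_month = "CAST(strftime('%m', StartDate) AS INTEGER)"
    # A month m is covered when start_month <= m < start_month + Months.
    columns = ",\n       ".join(
        f"CASE WHEN {start_month} <= {m} AND {start_month} + Months > {m} "
        f"THEN Amount END AS {name}"
        for name, m in MONTHS
    )
    sql = (
        "SELECT Vendor, Amount * Months AS total,\n"
        f"       {columns}\n"
        "FROM budget ORDER BY Vendor"
    )
    return conn.execute(sql).fetchall()
```

For the sample vendors this gives 2224 with amounts in jan and feb for Vendor1, and 6120 with amounts in feb, mar and apr for Vendor2; uncovered months come back as NULL (None in Python). Like the query above, it assumes a schedule does not run past December.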