Calculating a running count of weeks - SQL

I am looking to calculate a running count of the weeks that have occurred since a starting point. The biggest problem here is that the calendar I am working on is not a traditional Gregorian calendar.
The easiest dimension to reference would be something like 'TWEEK' which actually tells you the week of the year that the record falls into.
Example data:
CREATE TABLE #foobar
( DateKey INT
,TWEEK INT
,CumWEEK INT
);
INSERT INTO #foobar (DateKey, TWEEK, CumWEEK)
VALUES(20150630, 1,1),
(20150701,1,1),
(20150702,1,1),
(20150703,1,1),
(20150704,1,1),
(20150705,1,1),
(20150706,1,1),
(20150707,2,2),
(20150708,2,2),
(20150709,2,2),
(20150710,2,2),
(20150711,2,2),
(20150712,2,2),
(20150713,2,2),
(20150714,1,3),
(20150715,1,3),
(20150716,1,3),
(20150717,1,3),
(20150718,1,3),
(20150719,1,3),
(20150720,1,3),
(20150721,2,4),
(20150722,2,4),
(20150723,2,4),
(20150724,2,4),
(20150725,2,4),
(20150726,2,4),
(20150727,2,4)
For the sake of ease, I did not go all the way to 52, but you get the point. I am trying to recreate the 'CumWEEK' column. I already have a column that tells me the correct week of the year according to the unusual calendar convention ('TWEEK').
I know this will involve some kind of OVER() windowing, but I cannot seem to figure it out.

The window function LAG(), combined with a running SUM() over the "changed" flag (using ORDER BY ... ROWS BETWEEN), should get you close enough to work with. The caveat is that the ROWS BETWEEN frame only accepts an integer literal (or UNBOUNDED), hence the large literal below.
Year rollover: you could create another ranking level based on mod 52 to start the count fresh, so that cumulative week 53 becomes year 2, week 1, not 53 (a sketch follows the query below).
SELECT
    *,
    SUM(ChangedRow) OVER (ORDER BY DateKey ROWS BETWEEN 99999 PRECEDING AND CURRENT ROW) + 1 AS CumWEEK -- +1 because the first week never produces a change row
FROM
(
    SELECT
        DateKey,
        TWEEK,
        ChangedRow = CASE WHEN LAG(TWEEK) OVER (ORDER BY DateKey) <> TWEEK THEN 1 ELSE 0 END
    FROM #foobar AS F2
) AS DETAIL;
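For the year rollover mentioned above, a minimal sketch (assuming a fixed 52-week year and working from the cumulative count, here taken from the CumWEEK column in the sample data) could be:
SELECT
    DateKey,
    TWEEK,
    CumWEEK,
    ((CumWEEK - 1) / 52) + 1 AS YearNumber,  -- integer division: cumulative week 53 lands in year 2
    ((CumWEEK - 1) % 52) + 1 AS WeekOfYear   -- modulo: cumulative week 53 becomes week 1 again
FROM #foobar;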

A few minutes ago I answered a question that is similar in a way to this one:
https://stackoverflow.com/a/31303395/5089204
The idea is roughly to create a table of running numbers and derive the week by integer division by 7. You could then use this as the grouping in an OVER clause...
EDIT: Example
CREATE FUNCTION dbo.RunningNumber(@Counter AS INT)
RETURNS TABLE
AS
RETURN
    SELECT TOP (@Counter) ROW_NUMBER() OVER (ORDER BY o.object_id) AS RunningNumber
    FROM sys.objects AS o; -- take any sufficiently large table here...
GO
SELECT 'test', CAST(numbers.RunningNumber / 7 AS INT) AS WeekGroup
FROM dbo.RunningNumber(100) AS numbers;
Dividing by 7 "as INT" gives quite a nice grouping criterion.
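As a rough sketch of how this could be applied to the sample table above (assuming the DateKey values are consecutive days starting at 20150630, as in the example data), join the running numbers to the dates by their day offset and divide:
SELECT
    f.DateKey,
    f.TWEEK,
    CAST((n.RunningNumber - 1) / 7 AS INT) + 1 AS CumWEEK_calc  -- shift to 1-based weeks
FROM #foobar AS f
INNER JOIN dbo.RunningNumber(100) AS n
    ON n.RunningNumber = DATEDIFF(DAY, '20150630', CONVERT(date, CAST(f.DateKey AS char(8)), 112)) + 1;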
Hope this helps...

Related

Adding x work days onto a date in SQL Server?

I'm a bit confused if there is a simple way to do this.
I have a field called receipt_date in my data table and I wish to add 10 working days to it (taking bank holidays into account).
I'm not sure if there is any sort of query I could use to join onto this table from my original one to calculate 10 working days from that date. I've tried a few subqueries but I couldn't get it right, or perhaps it's not possible to do this. I didn't know if there was a way to pick out the 10th row after the receipt date to get the calendar date, if I only include 'Y' (working days) in the WHERE clause?
Any help appreciated.
This makes several assumptions about your data, because we have none. One method, however, would be to create a function (I use an inline table-valued function here) that returns the relevant row from your calendar table. Note that this assumes the number of days is always positive, and that if you provide a date that isn't a working day, day 0 is the next working day. I.e. adding zero working days to 2021-09-05 would return 2021-09-06, and adding 3 would return 2021-09-09. If that isn't what you want, this should be more than enough for you to get there yourself.
CREATE FUNCTION dbo.AddWorkingDays (@Days int, @Date date)
RETURNS TABLE AS
RETURN
    WITH Dates AS
    (
        SELECT CalendarDate,
               WorkingDay
        FROM dbo.CalendarTable
        WHERE CalendarDate >= @Date
    )
    SELECT CalendarDate
    FROM Dates
    WHERE WorkingDay = 1
    ORDER BY CalendarDate
    OFFSET @Days ROWS FETCH NEXT 1 ROW ONLY;
GO
--Using the function
SELECT YT.DateColumn,
AWD.CalendarDate AS AddedWorkingDays
FROM dbo.YourTable YT
CROSS APPLY dbo.AddWorkingDays(10,YT.DateColumn) AWD;
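For completeness, the function above assumes a calendar table roughly shaped like this (dbo.CalendarTable, CalendarDate and WorkingDay are the names used in the answer; the exact definition is an assumption):
CREATE TABLE dbo.CalendarTable
(
    CalendarDate date NOT NULL PRIMARY KEY,
    WorkingDay bit NOT NULL  -- assumed convention: 1 = working day, 0 = weekend or bank holiday
);
If your table instead flags working days with 'Y'/'N', as hinted at in the question, adjust the WHERE WorkingDay = 1 filter accordingly.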

SQL: Dynamic Join Based on Row Value

Context:
I am working with a complicated schema and have many CTEs and joins to get to this point. This is a watered-down version with completely different source data, used as an example to illustrate my point (for data anonymity). Hopefully it provides enough of a snapshot.
Data Overview:
I have a service which generates a production forecast looking ahead 30 days. The forecast is generated for each facility, for each shift (morning/afternoon). Each forecast produced covers all shifts (morning/afternoon/evening) so they share a common generation_id but different forecast_profile_key.
What I am trying to do: I want to find the SUM of the forecast error for a given forecast generation constrained by a dynamic date range based on whether the date is a weekday or weekend. The SUM must be grouped only on similar IDs.
Basically, the temp table provides one record per facility per date per shift with the forecast error. I want to SUM the historical error dynamically for a facility/shift/date based on whether the date is a weekday or weekend, and only SUM the error where the IDs match up (hope that makes sense!).
Specifics: I want to find the SUM grouped by 'week_part_grouping', 'forecast_profile_key', 'forecast_profile_id' and 'forecast_generation_id'. The part I am struggling with is that I only want to SUM the error dynamically based on date: (a) if the date is a weekday, I want to SUM the error from up to the five most recent days in a 7-day lookback period, or (b) if the date is a weekend, I want to SUM the error from up to the three most recent days in a 16-day lookback period.
Ideally, having an extra column for 'total_forecast_error_in_lookback_range'.
Specific examples:
For 'facility_a', '2020-11-22' is a weekend. The lookback range is 16 days, so any date between '2020-11-21' and '2020-11-05' is eligible. The three most recent dates would be '2020-11-21', '2020-11-15' and '2020-11-14'. Therefore, the sum of error would be 2000+3250+1050.
For 'facility_a', '2020-11-20' is a weekday. The lookback range is 7 days, so any date between '2020-11-19' and '2020-11-13' is eligible. That works out to be '2020-11-19' through '2020-11-16', plus '2020-11-13'.
For 'facility_b', notice there is a change in the 'forecast_generation_id'. So, the error for '2020-11-20' would only be 4565.
What I have tried: I'll confess to not being quite sure how to break this portion down. I did consider a CASE statement on the week_part but then got into a nested mess. I considered using a RANK window function but didn't make much progress, as I was unsure how to implement the dynamic lookback component. I also thought about using LISTAGG to get all the dates and doing a REGEXP wildcard lookup, but that would be very slow.
I am seeking pointers on how to go about achieving this in SQL. I don't know if I am missing something in my toolkit to break this down into something I can implement.
DROP TABLE IF EXISTS seventh__error_calc;
create temporary table seventh__error_calc
(
facility_name varchar,
shift varchar,
date_actuals date,
week_part_grouping varchar,
forecast_profile_key varchar,
forecast_profile_id varchar,
forecast_generation_id varchar,
count_dates_in_forecast bigint,
forecast_error bigint
);
Insert into seventh__error_calc
VALUES
('facility_a','morning','2020-11-22','weekend','facility_a_morning_Sat_Sun','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','1000'),
('facility_a','morning','2020-11-21','weekend','facility_a_morning_Sat_Sun','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2000'),
('facility_a','morning','2020-11-20','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','3000'),
('facility_a','morning','2020-11-19','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2500'),
('facility_a','morning','2020-11-18','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','1200'),
('facility_a','morning','2020-11-17','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','5000'),
('facility_a','morning','2020-11-16','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','4400'),
('facility_a','morning','2020-11-15','weekend','facility_a_morning_Sat_Sun','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','3250'),
('facility_a','morning','2020-11-14','weekend','facility_a_morning_Sat_Sun','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','1050'),
('facility_a','morning','2020-11-13','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2450'),
('facility_a','morning','2020-11-12','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2450'),
('facility_a','morning','2020-11-11','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2450'),
('facility_a','morning','2020-11-10','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2450'),
('facility_a','morning','2020-11-09','weekday','facility_a_morning_Mon_Fri','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2450'),
('facility_a','morning','2020-11-08','weekend','facility_a_morning_Sat_Sun','Profile#facility_a#dfc3989b#b6e5386a','6809dea6','8','2450'),
('facility_b','morning','2020-11-22','weekend','facility_b_morning_Sat_Sun','Profile#facility_b#dfc3989b#b6e5386a','6809dea6','8','3400'),
('facility_b','morning','2020-11-21','weekend','facility_b_morning_Sat_Sun','Profile#facility_b#dfc3989b#b6e5386a','6809dea6','8','2800'),
('facility_b','morning','2020-11-20','weekday','facility_b_morning_Mon_Fri','Profile#facility_b#dfc3989b#b6e5386a','6809dea6','8','3687'),
('facility_b','morning','2020-11-19','weekday','facility_b_morning_Mon_Fri','Profile#facility_b#dfc3989b#b6e5386a','6809dea6','8','4565'),
('facility_b','morning','2020-11-18','weekday','facility_b_morning_Mon_Fri','Profile#facility_b#dfc3989b#b6e5386a','7252fzw5','8','1262'),
('facility_b','morning','2020-11-17','weekday','facility_b_morning_Mon_Fri','Profile#facility_b#dfc3989b#b6e5386a','7252fzw5','8','8765'),
('facility_b','morning','2020-11-16','weekday','facility_b_morning_Mon_Fri','Profile#facility_b#dfc3989b#b6e5386a','7252fzw5','8','5678'),
('facility_b','morning','2020-11-15','weekend','facility_b_morning_Mon_Fri','Profile#facility_b#dfc3989b#b6e5386a','7252fzw5','8','2893'),
('facility_b','morning','2020-11-14','weekend','facility_b_morning_Sat_Sun','Profile#facility_b#dfc3989b#b6e5386a','7252fzw5','8','1928'),
('facility_b','morning','2020-11-13','weekday','facility_b_morning_Sat_Sun','Profile#facility_b#dfc3989b#b6e5386a','7252fzw5','8','4736')
;
SELECT *
FROM seventh__error_calc
This achieved what I was trying to do. There were two learning points here.
Self Joins. I've never used one before but can now see why they are powerful!
Using a CASE statement in the WHERE clause.
Hope this might help someone else some day!
select facility_name,
       forecast_profile_key,
       forecast_profile_id,
       shift,
       date_actuals,
       week_part_grouping,
       forecast_generation_id,
       sum(forecast_error) forecast_err_calc
from (
    select rank() over (partition by forecast_profile_id, forecast_profile_key, facility_name, a.date_actuals
                        order by b.date_actuals desc) rnk,
           a.facility_name, a.forecast_profile_key, a.forecast_profile_id, a.shift, a.date_actuals,
           a.week_part_grouping, a.forecast_generation_id, b.forecast_error
    from seventh__error_calc a
    join seventh__error_calc b
        using (facility_name, forecast_profile_key, forecast_profile_id, week_part_grouping, forecast_generation_id)
    -- look back from the day before each date so a row's own error is not counted
    where case when a.week_part_grouping = 'weekend' then b.date_actuals between a.date_actuals - 16 and a.date_actuals - 1
               when a.week_part_grouping = 'weekday' then b.date_actuals between a.date_actuals - 7 and a.date_actuals - 1
          end
) src
where case when week_part_grouping = 'weekend' then rnk < 4
           when week_part_grouping = 'weekday' then rnk < 6
      end
group by facility_name, forecast_profile_key, forecast_profile_id, shift, date_actuals, week_part_grouping, forecast_generation_id

SQL Server: Create sequence column based on a non-distinct column

I'm not sure if I'm asking this question right, but hopefully I can explain it well enough. I have a table that has a Date, Value, and WeekEndDate column. I want to create a sequence column that counts the distinct weeks from 1-13 and cycles every 13 weeks.
I attached a small sample of the output I'm trying to create. Is this even possible?
Use dense_rank() and some arithmetic:
select t.*,
       ((dense_rank() over (order by WeekEndDate) - 1) % 13) + 1 as WeekSequence
from t;
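As a quick sketch of why dense_rank() (rather than row_number()) does the job: rows sharing a WeekEndDate get the same rank, so every row in the same week gets the same 1-13 value, and the % 13 makes the count cycle after the 13th distinct week. Table and column names here are assumed from the question:
WITH t AS
(
    SELECT *
    FROM (VALUES
        ('2021-01-02', 10),  -- two rows in the same week...
        ('2021-01-02', 25),  -- ...both get sequence 1
        ('2021-01-09', 30),
        ('2021-01-16', 40)
    ) AS v (WeekEndDate, [Value])
)
SELECT t.*,
       ((DENSE_RANK() OVER (ORDER BY WeekEndDate) - 1) % 13) + 1 AS WeekSequence
FROM t;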

Impala get the difference between 2 dates excluding weekends

I'm trying to get the day difference between 2 dates in Impala but I need to exclude weekends.
I know it should be something like this but I'm not sure how the weekend piece would go...
DATEDIFF(resolution_date,created_date)
Thanks!
One approach to such a task is to enumerate each and every day in the range, and then filter out the weekends before counting.
Some databases have specific features to generate date series, while others offer recursive common table expressions. Impala does not support recursive queries, so we need to look at alternative solutions.
If you have a table with at least as many rows as the maximum number of days in a range, you can use row_number() to offset the starting date, and then conditional aggregation to count working days.
Assuming that your table is called mytable, with column id as the primary key, and that the big table is called bigtable, you would do:
select
    t.id,
    sum(
        case when dayofweek(date_add(t.created_date, n.rn)) between 2 and 6  -- 2..6 = Monday..Friday in Impala
             then 1 else 0 end
    ) as no_days
from mytable t
inner join (select row_number() over (order by 1) - 1 as rn from bigtable) n
    on t.resolution_date > date_add(t.created_date, n.rn)
group by t.id
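If no sufficiently large table is at hand, a rough alternative (sketch only, mirroring the query above) is to build the numbers source by cross-joining a small inline digit list, which Impala can do without recursion:
-- digits 0-9 cross-joined three times gives rn = 0..999; add another cross join for longer ranges
with digits as (
    select 0 as d union all select 1 union all select 2 union all select 3 union all select 4
    union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
),
numbers as (
    select d1.d + d2.d * 10 + d3.d * 100 as rn
    from digits d1
    cross join digits d2
    cross join digits d3
)
select
    t.id,
    sum(case when dayofweek(date_add(t.created_date, n.rn)) between 2 and 6
             then 1 else 0 end) as no_days
from mytable t
inner join numbers n
    on t.resolution_date > date_add(t.created_date, n.rn)
group by t.id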

Display a rolling 12 weeks chart in SSRS report

I am calling the data query in SSRS like this:
SELECT * FROM [DATABASE].[dbo].[mytable]
So the current week is the last week from the query (e.g. 3/31 - 4/4), and each number represents one week earlier, until we have reached the 12 weeks prior to the current week; this is displayed in a point chart.
How can I accomplish grouping all the visits for all locations by weeks and adding it to the chart?
I suggest updating your SQL query to Group by a descending Dense_Rank of DatePart(Week,ARRIVED_DATE). In this example, I have one column for Visits because I couldn't tell which columns you were using to get your Visit count:
-- load some test data
if object_id('tempdb..#MyTable') is not null
drop table #MyTable
create table #MyTable(ARRIVED_DATE datetime,Visits int)
while (select count(*) from #MyTable) < 1000
begin
insert into #MyTable values
(dateadd(day,round(rand()*100,0),'2014-01-01'),round(rand()*1000,0))
end
-- Sum Visits by WeekNumber relative to today's WeekNumber
select
dense_rank() over(order by datepart(week,ARRIVED_DATE) desc) [Week],
sum(Visits) Visits
from #MyTable
where datepart(week,ARRIVED_DATE) >= datepart(week,getdate()) - 11
group by datepart(week,ARRIVED_DATE)
order by datepart(week,ARRIVED_DATE)
Let me know if I can provide any more detail to help you out.
You are going to want to do the grouping of the visits within SQL. You should be able to add a calculated column to your table, something like WorkWeek, calculated from the difference in days from a fixed day such as Sunday. This column will then be your X value rather than the date field you were using.
Here is a good article that goes into first day of week: First Day of Week
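A minimal sketch of such a calculated column, assuming SQL Server and the ARRIVED_DATE column from the answer above, and anchoring weeks to a known Sunday:
select ARRIVED_DATE,
       datediff(day, '18991231', ARRIVED_DATE) / 7 as WorkWeek  -- 1899-12-31 was a Sunday, so each group runs Sunday to Saturday
from #MyTable;
Unlike datepart(week, ...), this WorkWeek number keeps increasing across year boundaries, which makes it a convenient X value for a rolling 12-week chart.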