Split multi-month records into individual months - sql

I have data in a table in this format - where date range is multi-month:
SourceSink Class ShadowPrice Round Period StartDate EndDate
AEC Peak 447.038 3 WIN2020 2020-12-01 2021-02-28
I want to create a view (or insert into a new table) with the above record broken out by month, as shown below:
SourceSink Class ShadowPrice Round Period StartDate EndDate
AEC Peak 447.038 3 WIN2020 2020-12-01 2020-12-31
AEC Peak 447.038 3 WIN2020 2021-01-01 2021-01-31
AEC Peak 447.038 3 WIN2020 2021-02-01 2021-02-28
Please advise.

One option is a recursive query. Assuming that periods always start on the first day of a month and end on the last day of a month, as shown in your sample data, that would be:
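with cte as (
    select t.*, startDate newStartDate, eomonth(startDate) newEndDate
    from mytable t
    union all
    select
        sourceSink,
        class,
        shadowPrice,
        [round],  -- must be listed so both branches return the same columns as t.*
        period,
        startDate,
        endDate,
        dateadd(month, 1, newStartDate),
        eomonth(dateadd(month, 1, newStartDate))
    from cte
    -- only recurse while the next month still starts inside the period
    where dateadd(month, 1, newStartDate) <= endDate
)
select * from cte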
If periods start and end on varying days of the month, then we need a little more logic:
with cte as (
    select
        t.*,
        startDate newStartDate,
        case when eomonth(startDate) <= endDate then eomonth(startDate) else endDate end newEndDate
    from mytable t
    union all
    select
        sourceSink,
        class,
        shadowPrice,
        [round],  -- must be listed so both branches return the same columns as t.*
        period,
        startDate,
        endDate,
        dateadd(month, 1, datefromparts(year(newStartDate), month(newStartDate), 1)),
        case when eomonth(dateadd(month, 1, datefromparts(year(newStartDate), month(newStartDate), 1))) <= endDate
            then eomonth(dateadd(month, 1, datefromparts(year(newStartDate), month(newStartDate), 1)))
            else endDate
        end
    from cte
    -- only recurse while the first day of the next month is still inside the period
    where dateadd(month, 1, datefromparts(year(newStartDate), month(newStartDate), 1)) <= endDate
)
select * from cte
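If you want to sanity-check the rows either query should return, here is a rough Python sketch of the same month-splitting logic. The function name split_by_month is my own, and it only models the expected output for one period; it does not touch the database.

```python
from datetime import date, timedelta


def split_by_month(start, end):
    """Split the inclusive range [start, end] into one piece per calendar month."""
    pieces = []
    cur = start
    while cur <= end:
        # First day of the month that follows `cur`.
        if cur.month == 12:
            next_first = date(cur.year + 1, 1, 1)
        else:
            next_first = date(cur.year, cur.month + 1, 1)
        month_end = next_first - timedelta(days=1)
        # The last piece is cut off at `end` rather than at the month end.
        pieces.append((cur, min(month_end, end)))
        cur = next_first
    return pieces


rows = split_by_month(date(2020, 12, 1), date(2021, 2, 28))
```

For the sample record this gives three pieces, one each for December, January and February. A period such as 2021-01-15 to 2021-02-10 is clipped at both ends, which is the behaviour the second query aims for.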

Just another option using a CROSS APPLY and an ad-hoc tally table
Example
Select A.[SourceSink]
,A.[Class]
,A.[ShadowPrice]
,A.[Round]
,A.[Period]
,B.[StartDate]
,B.[EndDate]
From YourTable A
Cross Apply (
Select StartDate=min(D)
,EndDate =max(D)
From (
Select Top (DateDiff(DAY,[StartDate],[EndDate])+1)
D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),[StartDate])
From master..spt_values n1,master..spt_values n2
) B1
Group By Year(D),Month(D)
) B

Related

SQL how to write a query that return missing date ranges?

I am trying to figure out how to write a query that looks at certain records and finds missing date ranges between today and 9999-12-31.
My data looks like below:
ID |start_dt |end_dt |prc_or_disc_1
10412 |2018-07-17 00:00:00.000 |2018-07-20 00:00:00.000 |1050.000000
10413 |2018-07-23 00:00:00.000 |2018-07-26 00:00:00.000 |1040.000000
So for this data I would want my query to return:
2018-07-10 | 2018-07-16
2018-07-21 | 2018-07-22
2018-07-27 | 9999-12-31
I'm not really sure where to start. Is this possible?
You can do that using the lag() function in MS SQL (available from SQL Server 2012 onward).
with myData as
(
select *,
lag(end_dt,1) over (order by start_dt) as lagEnd
from myTable),
myMax as
(
select Max(end_dt) as maxDate from myTable
)
select dateadd(d,1,lagEnd) as StartDate, dateadd(d, -1, start_dt) as EndDate
from myData
where lagEnd is not null and dateadd(d,1,lagEnd) < start_dt
union all
select dateAdd(d,1,maxDate) as StartDate, cast('99991231' as Datetime) as EndDate
from myMax
where maxDate < '99991231';
If lag() is not available (for example on SQL Server 2008), you can mimic it with row_number() and a self-join.
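To make the expected result concrete, here is a small Python sketch of the gap logic. It is only a model for checking query output, not a replacement for the SQL. The names find_gaps and FAR_FUTURE are mine, and "today" is taken to be 2018-07-10 because that is what the expected output in the question implies.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)
FAR_FUTURE = date(9999, 12, 31)


def find_gaps(ranges, today):
    """Return inclusive (start, end) gaps between `today` and FAR_FUTURE.

    `ranges` holds inclusive (start_dt, end_dt) pairs.
    """
    gaps = []
    cursor = today  # first day not yet known to be covered
    for start, end in sorted(ranges):
        if start > cursor:
            gaps.append((cursor, start - ONE_DAY))
        if end >= FAR_FUTURE:
            return gaps  # covered all the way to the end, nothing left to report
        cursor = max(cursor, end + ONE_DAY)
    gaps.append((cursor, FAR_FUTURE))
    return gaps


sample = [
    (date(2018, 7, 17), date(2018, 7, 20)),
    (date(2018, 7, 23), date(2018, 7, 26)),
]
gaps = find_gaps(sample, today=date(2018, 7, 10))
```

With the two sample rows this produces the three ranges listed in the question. Note that the SQL answers here do not filter on today's date, so they would need an extra predicate to match this exactly.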
select
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then end_dt +1 END as F1,
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then ISNULL(LEAD(start_dt) over (order by ID) - 1, '99991231') END as F2
from t
FOR 2008 VERSION
SELECT
X.end_dt + 1 as F1,
ISNULL(Y.start_dt-1, '99991231') as F2
FROM t X
LEFT JOIN (
SELECT
*
, (SELECT MAX(ID) FROM t WHERE ID < A.ID) as ID2
FROM t A) Y ON X.ID = Y.ID2
WHERE DATEDIFF(day, X.end_dt, ISNULL(Y.start_dt, '99991231')) > 1
This should work in 2008, it assumes that ranges in your table do not overlap. It will also eliminate rows where the end_date of the current row is a day before the start date of the next row.
with dtRanges as (
select start_dt, end_dt, row_number() over (order by start_dt) as rownum
from table1
)
select t2.end_dt + 1, coalesce(start_dt_next -1,'99991231')
FROM
( select dr1.start_dt, dr1.end_dt,dr2.start_dt as start_dt_next
from dtRanges dr1
left join dtRanges dr2 on dr2.rownum = dr1.rownum + 1
) t2
where
t2.end_dt + 1 <> coalesce(start_dt_next,'99991231')
http://sqlfiddle.com/#!18/65238/1
SELECT
*
FROM
(
SELECT
end_dt+1 AS start_dt,
LEAD(start_dt-1, 1, '9999-12-31')
OVER (ORDER BY start_dt)
AS end_dt
FROM
yourTable
)
gaps
WHERE
gaps.end_dt >= gaps.start_dt
I would, however, strongly urge you to use end dates that are "exclusive". That is, the range is everything up to but excluding the end_dt.
That way, a range of one day becomes '2018-07-09', '2018-07-10'.
It's then really clear that the range is one day long: if you subtract one from the other you get one day.
Also, if you ever change to needing hour granularity or minute granularity you don't need to change your data. It just works. Always. Reliably. Intuitively.
If you search the web you'll find plenty of documentation on why inclusive-start and exclusive-end is a very good idea from a software perspective. (Then, in the query above, you can remove the wonky +1 and -1.)
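A quick illustration of the point, as a Python sketch with made-up sample ranges (the dates and function names are mine, not from the question). With exclusive end dates, a range's length is a plain subtraction and a gap is simply "previous end, next start", with no +1 or -1 anywhere.

```python
from datetime import date

# Half-open ranges: each one covers start <= d < end.
ranges = [
    (date(2018, 7, 9), date(2018, 7, 10)),   # a one-day range
    (date(2018, 7, 17), date(2018, 7, 21)),
    (date(2018, 7, 23), date(2018, 7, 27)),
]


def length_in_days(start, end):
    """Length of a half-open range is just end minus start."""
    return (end - start).days


def gaps_between(sorted_ranges):
    """Gaps between consecutive half-open ranges; the gaps are half-open too."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(sorted_ranges, sorted_ranges[1:]):
        if prev_end < next_start:
            gaps.append((prev_end, next_start))
    return gaps


one_day = length_in_days(*ranges[0])
gaps = gaps_between(ranges)
```

Two ranges that touch (one ends where the next starts) produce no gap, because the comparison is a strict less-than.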
This solves your case, but provide some sample data if there will ever be overlaps, fringe cases, etc.
Take one day after your end date and 1 day before the next line's start date.
DECLARE @t TABLE (ID int, start_dt DATETIME, end_dt DATETIME, prc VARCHAR(100))
INSERT INTO @t (id, start_dt, end_dt, prc)
VALUES
(10410, '2018-07-09 00:00:00.00','2018-07-12 00:00:00.000','1025.000000'),
(10412, '2018-07-17 00:00:00.00','2018-07-20 00:00:00.000','1050.000000'),
(10413, '2018-07-23 00:00:00.00','2018-07-26 00:00:00.000','1040.000000')
SELECT DATEADD(DAY, 1, end_dt)
-- subtract the day inside LEAD so the default for the last row stays 9999-12-31
, LEAD(DATEADD(DAY, -1, start_dt), 1, '9999-12-31') OVER(ORDER BY id)
FROM @t
You may want to take a look at this:
http://sqlfiddle.com/#!18/3a224/1
You just have to edit the begin range to today and the end range to 9999-12-31.

Creating multiple rows per row based on date calculation

I have never done anything like this and can't find any guidance when I google. How can I create multiple rows based on one row of data?
For example here's my data:
Employee | EmploymentDate
1 1/1/2017
Every three months I need to calculate level.
Employee | EmploymentDate | Level | LevelDateRange
1 1/1/2017 1 1/1/2017 - 3/31/2017
1 1/1/2017 2 4/1/2017 - 6/30/2017
1 1/1/2017 3 7/1/2017 - 9/30/2017
LevelDateRange Calculation:
Start Date: If 1st level then EmploymentDate
else previous LevelDateRange end date + 1
End Date: If 1st level then EmploymentDate + 3 months minus a day
else Start date + 3 months
Any suggestion?
In your case, you might consider cross apply:
select d.employee, v.*
from mydata d cross apply
(values (1, employmentdate, dateadd(day, -1, dateadd(month, 3, d.employmentdate))),
(2, dateadd(month, 3, employmentdate), dateadd(day, -1, dateadd(month, 6, d.employmentdate))),
(3, dateadd(month, 6, employmentdate), dateadd(day, -1, dateadd(month, 9, d.employmentdate)))
) v(lev, startdate, enddate);
I would advise you to keep the start date and end date in separate columns. Combine them into a string at the application level or when you query the database.
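For comparison, here is a rough Python sketch of the same calculation, handy for checking the level boundaries. The helper add_months is my own; it clamps to the end of the month the way DATEADD(month, ...) does. The defaults of three levels and three months per level come from the example in the question.

```python
import calendar
from datetime import date, timedelta


def add_months(d, months):
    """Add calendar months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def level_ranges(employment_date, levels=3, months_per_level=3):
    """Return (level, start, end) tuples with inclusive start and end dates."""
    out = []
    for lev in range(1, levels + 1):
        start = add_months(employment_date, (lev - 1) * months_per_level)
        end = add_months(employment_date, lev * months_per_level) - timedelta(days=1)
        out.append((lev, start, end))
    return out


result = level_ranges(date(2017, 1, 1))
```

Each level is computed from the employment date directly instead of from the previous level's end date. That matches the cross apply query above and avoids drift when the employment date falls late in a month.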
A tally table and a CTE:
DECLARE @StartDate DATE
SELECT @StartDate = EmploymentDate FROM emp
DECLARE @MonthsSinceStart INT = DATEDIFF(mm,@StartDate,GETDATE())
DECLARE @NumLevels INT = @MonthsSinceStart / 3
--100 row tally table
IF OBJECT_ID('tempdb..#Tally') IS NOT NULL DROP TABLE #Tally
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Level
INTO #Tally
FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(n)
CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n);
WITH cte AS
(
    SELECT Employee,
           EmploymentDate,
           Level,
           EmploymentDate AS StartDate,
           DATEADD(dd,-1,DATEADD(mm,3,EmploymentDate)) AS EndDate
    FROM emp
    CROSS JOIN #Tally
    WHERE #Tally.Level =1
    UNION ALL
    SELECT Employee,
           EmploymentDate,
           Level + 1,
           DATEADD(dd,1,EndDate) AS StartDate,
           DATEADD(dd,-1,DATEADD(mm,3,DATEADD(dd,1,EndDate))) AS EndDate
    FROM cte
    WHERE cte.Level < @NumLevels
)
SELECT Employee,
       EmploymentDate,
       Level,
       CONVERT(NVARCHAR(10),StartDate) + ' - ' + CONVERT(NVARCHAR(10),EndDate)
FROM cte;
Suggest you give it a test with more than one row in your table

SQL calculate date segments within calendar year

What I need is to calculate the missing time periods within the calendar year given a table such as this in SQL:
DatesTable
|ID|DateStart |DateEnd |
1 NULL NULL
2 2015-1-1 2015-12-31
3 2015-3-1 2015-12-31
4 2015-1-1 2015-9-30
5 2015-1-1 2015-3-31
5 2015-6-1 2015-12-31
6 2015-3-1 2015-6-30
6 2015-7-1 2015-10-31
Expected return would be:
1 2015-1-1 2015-12-31
3 2015-1-1 2015-2-28
4 2015-10-1 2015-12-31
5 2015-4-1 2015-5-31
6 2015-1-1 2015-2-28
6 2015-11-1 2015-12-31
It's essentially work blocks. What I need to show is the part of the calendar year which was NOT worked. So for ID = 3, he worked from 3/1 through the rest of the year. But he did not work from 1/1 till 2/28. That's what I'm looking for.
You can do it using LEAD, LAG window functions available from SQL Server 2012+:
;WITH CTE AS (
SELECT ID,
LAG(DateEnd) OVER (PARTITION BY ID ORDER BY DateEnd) AS PrevEnd,
DateStart,
DateEnd,
LEAD(DateStart) OVER (PARTITION BY ID ORDER BY DateEnd) AS NextStart
FROM DatesTable
)
SELECT ID, DateStart, DateEnd
FROM (
-- Get interval right before current [DateStart, DateEnd] interval
SELECT ID,
CASE
WHEN DateStart IS NULL THEN '20150101'
WHEN DateStart > start THEN start
ELSE NULL
END AS DateStart,
CASE
WHEN DateStart IS NULL THEN '20151231'
WHEN DateStart > start THEN DATEADD(d, -1, DateStart)
ELSE NULL
END AS DateEnd
FROM CTE
CROSS APPLY (SELECT COALESCE(DATEADD(d, 1, PrevEnd), '20150101')) x(start)
-- If there is no next interval then get interval right after current
-- [DateStart, DateEnd] interval (up-to end of year)
UNION ALL
SELECT ID, DATEADD(d, 1, DateEnd) AS DateStart, '20151231' AS DateEnd
FROM CTE
WHERE DateStart IS NOT NULL -- Do not re-examine [Null, Null] interval
AND NextStart IS NULL -- There is no next [DateStart, DateEnd] interval
AND DateEnd < '20151231' -- Current [DateStart, DateEnd] interval
-- does not terminate on 31/12/2015
) AS t
WHERE t.DateStart IS NOT NULL
ORDER BY ID, DateStart
The idea behind the above query is simple: for every [DateStart, DateEnd] interval get 'not worked' interval right before it. If there is no interval following the current interval, then also get successive 'not worked' interval (if any).
Also note that I assume that if DateStart is NULL then DateEnd is also NULL for the same ID.
If your data is not too big, this approach will work. It expands all the days and ids and then re-groups them:
with d as (
    select cast('2015-01-01' as date) as d  -- the CTE column needs a name
    union all
    select dateadd(day, 1, d)
    from d
    where d < cast('2015-12-31' as date)
),
td as (
    select *
    from d cross join
         (select distinct id from t) t
    where not exists (select 1
                      from t t2
                      where t2.id = t.id and  -- only this id's own work blocks
                            d.d between t2.DateStart and t2.DateEnd
                     )
)
select id, min(d) as startdate, max(d) as enddate
from (select td.*,
             dateadd(day, - row_number() over (partition by id order by d), d) as grp
      from td
     ) td
group by id, grp
order by id, grp
option (maxrecursion 0);  -- a full year needs more than the default 100 recursion levels
An alternative method relies on cumulative sums and similar functionality that is much easier to express in SQL Server 2012+.
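The regrouping step is the part that tends to look like magic, so here is a small Python sketch of the same idea for a single ID. The function name missing_ranges is mine. The key observation is that for a run of consecutive days, "day minus its position in the list" is constant, which is what the dateadd/row_number expression computes as grp.

```python
from datetime import date, timedelta
from itertools import groupby


def missing_ranges(worked, year_start, year_end):
    """Inclusive ranges inside [year_start, year_end] not covered by `worked`."""
    # Expand the year into days and keep the ones no work block covers.
    days = []
    d = year_start
    while d <= year_end:
        if not any(s <= d <= e for s, e in worked):
            days.append(d)
        d += timedelta(days=1)
    # Consecutive days share the same (day - position) value.
    keyed = [(day - timedelta(days=i), day) for i, day in enumerate(days)]
    result = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        run = [day for _, day in group]
        result.append((run[0], run[-1]))
    return result


# ID 5 from the question: worked Jan-Mar and Jun-Dec.
id5 = missing_ranges(
    [(date(2015, 1, 1), date(2015, 3, 31)), (date(2015, 6, 1), date(2015, 12, 31))],
    date(2015, 1, 1),
    date(2015, 12, 31),
)
```

For ID 5 this leaves a single missing block, April through May. An ID with no work blocks at all gets the whole year back as one range.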
Somewhat simpler approach I think.
Basically create a list of dates for all work block ranges (A). Then create a list of dates for the whole year for each ID (B). Then remove the A from B. Compile the remaining list of dates into date ranges for each ID.
DECLARE @startdate DATETIME, @enddate DATETIME
SET @startdate = '2015-01-01'
SET @enddate = '2015-12-31'
--Build date ranges from remaining date list
;WITH dateRange(ID, dates, Grouping)
AS
(
SELECT dt1.id, dt1.Dates, dt1.Dates + row_number() over (order by dt1.id asc, dt1.Dates desc) AS Grouping
FROM
(
--Remove (A) from (B)
SELECT distinct dt.ID, tmp.Dates FROM DatesTable dt
CROSS APPLY
(
--GET (B) here
SELECT DATEADD(DAY, number, @startdate) [Dates]
FROM master..spt_values
WHERE type = 'P' AND DATEADD(DAY, number, @startdate) <= @enddate
) tmp
left join
(
--GET (A) here
SELECT DISTINCT T.Id,
D.Dates
FROM DatesTable AS T
INNER JOIN master..spt_values as N on N.number between 0 and datediff(day, T.DateStart, T.DateEnd)
CROSS APPLY (select dateadd(day, N.number, T.DateStart)) as D(Dates)
WHERE N.type ='P'
) dr
ON dr.Id = dt.Id and dr.Dates = tmp.Dates
WHERE dr.id is null
) dt1
)
SELECT ID, CAST(MIN(Dates) AS DATE) DateStart, CAST(MAX(Dates) AS DATE) DateEnd
FROM dateRange
GROUP BY ID, Grouping
ORDER BY ID
Here's the code:
http://sqlfiddle.com/#!3/f3615/1
I hope this helps!

How can I sum values per day and then plot them on calendar from start date to last date

I have a table, part of which is given below. It contains multiple values (durations) per day. I need two things: 1) the sum of durations per day, and 2) those sums plotted on a calendar whose start date is first_date from the table and whose end date is Last_update from the table. Dates with no duration should show 0. I think it will be something like the query below, but I need help.
;WITH AllDates AS(
SELECT @Fromdate As TheDate
UNION ALL
SELECT TheDate + 1
FROM AllDates
WHERE TheDate + 1 <= @ToDate
)SELECT UserId,
TheDate,
COALESCE(
SUM(
-- When the game starts and ends in the same date
CASE WHEN DATEDIFF(DAY, GameStartTime, GameEndTime) = 0
Here is what I am looking for
Another way to generate the date range you are after would be something like .....
;WITH DateLimits AS
(
SELECT MIN(First_Date) FirstDate
,MAX(Last_Update) LastDate
FROM TableName
),
DateRange AS
(
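SELECT TOP (SELECT DATEDIFF(DAY,FirstDate,LastDate ) + 1 FROM DateLimits) -- +1 so both end dates are included
DATEADD(DAY
,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 -- start the offset at 0 so FirstDate itself is returned
, (SELECT FirstDate FROM DateLimits)
) AS Dates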
FROM master..spt_values a cross join master..spt_values b
)
SELECT * FROM DateRange --<-- you have the desired date range here
-- other query whatever you need.
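Once you have the date range, the remaining step is a left join from the dates to the per-day sums, with 0 for dates that have no rows. As a rough Python sketch of that last step (the function name and the sample rows are made up, since the original table was not posted as text):

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_totals(rows, first_date, last_date):
    """Sum durations per day and emit 0 for days that have no rows."""
    totals = defaultdict(int)
    for day, duration in rows:
        totals[day] += duration
    out = []
    d = first_date
    while d <= last_date:
        out.append((d, totals.get(d, 0)))
        d += timedelta(days=1)
    return out


# Hypothetical (day, duration) rows; two entries fall on the same day.
rows = [
    (date(2015, 1, 1), 10),
    (date(2015, 1, 1), 5),
    (date(2015, 1, 3), 7),
]
calendar_rows = daily_totals(rows, date(2015, 1, 1), date(2015, 1, 4))
```

In SQL the equivalent would be DateRange LEFT JOIN the grouped totals, wrapped in COALESCE(..., 0).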

Create a weekCount column in SQL Server 2012

I have this data:
id worked_date
-----------------
1 2013-09-25
2 2013-09-26
3 2013-10-01
4 2013-10-04
5 2013-10-07
I want to add a column called weekCount. The base date is 2013-09-25. So all rows with worked_date from 2013-09-25 to 2013-10-01 will have weekCount 1, those from 2013-10-02 to 2013-10-08 will have weekCount 2, and so on. How can that be done?
Thanks.
Here's one way using DATEDIFF:
select id,
worked_date,
1 + (datediff(day, '2013-09-25', worked_date) / 7) weekCount
from yourtable
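The arithmetic is easy to verify outside the database. A minimal Python sketch of the same formula, run against the sample dates from the question (the function name week_count is mine):

```python
from datetime import date

BASE_DATE = date(2013, 9, 25)


def week_count(worked_date, base=BASE_DATE):
    """1 + whole weeks elapsed since the base date.

    Note: Python's // floors, while T-SQL integer division truncates toward
    zero, so the two only agree for dates on or after the base date.
    """
    return 1 + (worked_date - base).days // 7


sample = [
    date(2013, 9, 25),
    date(2013, 9, 26),
    date(2013, 10, 1),
    date(2013, 10, 4),
    date(2013, 10, 7),
]
counts = [week_count(d) for d in sample]
```

The first three sample dates land in week 1 and the last two in week 2, which matches the boundaries described in the question.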
Perhaps an approach like this will solve your problem.
I compute an in-memory table that contains the week's boundaries along with a monotonically increasing number (BuildWeeks). I then compare my worked_date values to my date boundaries. Based on your comment to @sgeddes, you need the reverse week number so I then use a DENSE_RANK function to calculate the ReverseWeekNumber.
WITH BOT(StartDate) AS
(
SELECT CAST('2013-09-25' AS date)
)
, BuildWeeks (WeekNumber, StartOfWeek, EndOfWeek) AS
(
SELECT
N.number AS WeekNumber
, DateAdd(week, N.number -1, B.StartDate) AS StartOfWeek
, DateAdd(d, -1, DateAdd(week, N.number, B.StartDate)) AS EndOfWeek
FROM
dbo.Numbers AS N
CROSS APPLY
BOT AS B
)
SELECT
M.*
, BW.*
, DENSE_RANK() OVER (ORDER BY BW.WeekNumber DESC) AS ReverseWeekNumber
FROM
dbo.MyTable M
INNER JOIN
BuildWeeks AS BW
ON M.worked_date BETWEEN BW.StartOfWeek AND BW.EndOfWeek
;
If you are looking for a Fiscal Week number, I would use a function that would calculate the week:
CREATE FUNCTION FiscalWeek(@FiscalStartDate datetime, @EvalDate datetime)
RETURNS INT
AS
BEGIN
DECLARE @weekNumber INT = (DATEDIFF(DAY, @FiscalStartDate, @EvalDate) / 7) + 1
RETURN (@weekNumber % 52)
END
END
GO
If you used a fiscal starting date of '2013-09-25' and an evaluation date of '2014-09-25' you would get a week number of 1.
Using a function gives you a little more flexibility to do whatever you need.
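One thing worth checking before relying on the modulo: a quick Python model of the same arithmetic (my own translation of the function above, valid for evaluation dates on or after the fiscal start) shows that the 52nd week comes back as 0 rather than 52.

```python
from datetime import date, timedelta


def fiscal_week(fiscal_start, eval_date):
    """Mirror of the T-SQL function: (days / 7 + 1) modulo 52."""
    week_number = (eval_date - fiscal_start).days // 7 + 1
    return week_number % 52


start = date(2013, 9, 25)
first_week = fiscal_week(start, start)
one_year_later = fiscal_week(start, date(2014, 9, 25))
week_52 = fiscal_week(start, start + timedelta(days=51 * 7))
```

If weeks should run from 1 to 52, something like ((weekNumber - 1) % 52) + 1 would avoid the 0, but whether that is the right rule depends on how your fiscal calendar treats the extra days in a year.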
Perhaps not the most elegant way but this works for me to get the top rank number:
WITH CTE AS (
SELECT employee_id, DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, '20130925', worked_date )/7 DESC) AS weekRank
FROM Timesheet
)
SELECT TOP (1) weekRank
FROM CTE
WHERE employee_id=@employee_id
ORDER BY weekRank DESC
This is how I can create weekRank column and pass a parameter dynamically:
WITH rank_cte AS (
SELECT timesheet_id,employee_id, worked_date,
dateadd(week, datediff(day,'20000105',worked_date) / 7, '20000105') AS WeekStart,
dateadd(week, datediff(day,'20000105',worked_date) / 7, '20000105')+6 AS WeekEnd,
DENSE_RANK() OVER (ORDER BY 1 + DATEDIFF(DAY, '20130925', worked_date )/7 DESC) AS weekRank
FROM Timesheet
)
SELECT timesheet_id, worked_date, WeekStart, WeekEnd, weekRank
FROM rank_cte rc
WHERE employee_id=@employee_id
AND weekRank=@weekRank
ORDER BY worked_date DESC
Thanks