Is there a way to automatically add the date difference we get to a date? - sql

I have two dates and use DATEDIFF to get the difference between them. For example, I have a planned start date and an actual start date, and the difference between them is 5 days. Now I want to add those days to the finish date.
If the finish date is later than I assumed because we are behind schedule, I want to add that difference to find the next finish date, and the same for the upcoming dates.
Sum (DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate)) OVER (Partition By ts.Id) as TotalVariance,
Case when (Sum (DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate)) OVER (Partition By ts.Id) > 30) then 'Positive' end as Violation,
DATEADD (day, DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate)) as Summar violations,
If the planned start date of activity 1 is 8/21/2019 but the actual start date is 9/21/2019, we are behind 30 days.
The next activity will then be delayed too, so I want to add this difference to the next activity.
If the planned start date of the second activity was 08/25/2019, the delay of activity 1 changes its start date, and I want to find that new date.
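+------------+------------------+-----------------+----------+---------------------+
| Activity   | PlannedStartdate | ActualStartDate | Variance | NewPlannedstartdate |
+------------+------------------+-----------------+----------+---------------------+
| Activity 1 | 8/21/2019        | 9/21/2019       | 30       |                     |
| Activity 2 | 8/26/2019        | null            |          | 9/26/2019           |
+------------+------------------+-----------------+----------+---------------------+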

Here's an example you can run in SSMS:
-- CREATE ACTIVITY TABLE AND ADD SOME DATA --
DECLARE @Activity TABLE ( ActivityId INT, PlannedStart DATE, ActualStart DATE );
INSERT INTO @Activity (
ActivityId, PlannedStart, ActualStart
)
VALUES
( 1, '08/21/2019', '08/27/2019' ), ( 1, '08/26/2019', NULL ), ( 1, '09/14/2019', NULL );
Query @Activity to see what's in it:
SELECT * FROM @Activity ORDER BY ActivityId, PlannedStart;
@Activity content:
+------------+--------------+-------------+
| ActivityId | PlannedStart | ActualStart |
+------------+--------------+-------------+
| 1 | 2019-08-21 | 2019-08-27 |
| 1 | 2019-08-26 | NULL |
| 1 | 2019-09-14 | NULL |
+------------+--------------+-------------+
Query @Activity to factor the new starting dates:
-- Assumed: the activity to report on (the snippet uses @ActivityId without declaring it) --
DECLARE @ActivityId INT = 1;
;WITH Activity_CTE AS (
SELECT
ROW_NUMBER() OVER ( ORDER BY PlannedStart ) AS Id,
ActivityId, PlannedStart, ActualStart, DATEDIFF( dd, PlannedStart, ActualStart ) Delayed
FROM @Activity
WHERE
ActivityId = @ActivityId
)
SELECT
ActivityId,
PlannedStart,
ActualStart,
DATEADD( dd, Delays.DaysDelayed, PlannedStart ) AS NewStart
FROM Activity_CTE AS Activity
OUTER APPLY (
SELECT CASE
WHEN ( Delayed IS NOT NULL ) THEN Delayed
ELSE ISNULL( ( SELECT TOP 1 Delayed FROM Activity_CTE WHERE Id < Activity.Id AND Delayed IS NOT NULL ORDER BY Id DESC ), 0 )
END AS DaysDelayed
) AS Delays
ORDER BY
PlannedStart;
Returns
+------------+--------------+-------------+------------+
| ActivityId | PlannedStart | ActualStart | NewStart |
+------------+--------------+-------------+------------+
| 1 | 2019-08-21 | 2019-08-27 | 2019-08-27 |
| 1 | 2019-08-26 | NULL | 2019-09-01 |
| 1 | 2019-09-14 | NULL | 2019-09-20 |
+------------+--------------+-------------+------------+
The real "magic" here is this line:
ELSE ISNULL( ( SELECT TOP 1 Delayed FROM Activity_CTE WHERE Id < Activity.Id AND Delayed IS NOT NULL ORDER BY Id DESC ), 0 )
It looks for the closest earlier record that has a delay. If none is found, it returns 0. This value is then added, in days, to the PlannedStart date to determine the NewStart date. The ORDER BY is worth noting too: sorting in DESC order ensures we get the "closest" delay prior to the current row.
Using a CTE in this way also accounts for the possibility that the delay does not happen on the very first record (e.g., say the 08/26 planned start was delayed instead of 08/21). It conveniently gives us a subtable to query against in our OUTER APPLY.
This is what you would see if you included all columns from the CTE and the OUTER APPLY in the final SELECT:
+----+------------+--------------+-------------+---------+-------------+
| Id | ActivityId | PlannedStart | ActualStart | Delayed | DaysDelayed |
+----+------------+--------------+-------------+---------+-------------+
| 1 | 1 | 2019-08-21 | 2019-08-27 | 6 | 6 |
| 2 | 1 | 2019-08-26 | NULL | NULL | 6 |
| 3 | 1 | 2019-09-14 | NULL | NULL | 6 |
+----+------------+--------------+-------------+---------+-------------+
Because the very first record is the only record with a delay, its delay of 6 days persists through each of the following records.
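To check the carry-forward logic outside SQL Server, here is a minimal Python sketch of the same idea. It is only an illustration, not the answer's method: the function name `new_starts` and the tuple layout are assumptions, and the rows are the sample data from the table above.

```python
from datetime import date, timedelta

# Sample rows from the answer: (planned_start, actual_start or None).
activities = [
    (date(2019, 8, 21), date(2019, 8, 27)),
    (date(2019, 8, 26), None),
    (date(2019, 9, 14), None),
]

def new_starts(rows):
    """Return (planned, new_start) pairs, carrying the latest known delay forward."""
    result = []
    delay = 0  # days; stays 0 until a row with an actual start is seen
    for planned, actual in sorted(rows, key=lambda r: r[0]):
        if actual is not None:
            # A row with an actual start defines the delay for itself and later rows.
            delay = (actual - planned).days
        result.append((planned, planned + timedelta(days=delay)))
    return result

for planned, new_start in new_starts(activities):
    print(planned, "->", new_start)
```

Like the `TOP 1 ... ORDER BY Id DESC` subquery, the loop always uses the most recent delay seen so far, so a later delayed activity would replace the earlier value.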

Related

Calculating working minutes for Normal and Night Shift

I am writing a query to fetch the working minutes for employees. The problem I have is the night shift. I know that I need to subtract the "ShiftStartMinutesFromMidnight", but I can't find the right logic.
NOTE: I can't change the database; I can only use the data in it.
Let's say I have these records.
+----+--------------------------+----------+
| ID | EventTime | ReaderNo |
+----+--------------------------+----------+
| 1 | 2019-12-04 11:28:46.000 | In |
| 1 | 2019-12-04 12:36:17.000 | Out |
| 1 | 2019-12-04 12:39:23.000 | In |
| 1 | 2019-12-04 12:51:21.000 | Out |
| 1 | 2019-12-05 07:37:49.000 | In |
| 1 | 2019-12-05 08:01:22.000 | Out |
| 2 | 2019-12-04 22:11:46.000 | In |
| 2 | 2019-12-04 23:06:17.000 | Out |
| 2 | 2019-12-04 23:34:23.000 | In |
| 2 | 2019-12-05 01:32:21.000 | Out |
| 2 | 2019-12-05 01:38:49.000 | In |
| 2 | 2019-12-05 06:32:22.000 | Out |
+----+--------------------------+----------+
WITH CT AS (SELECT
EIn.PSNID, EIn.PSNNAME
,CAST(DATEADD(minute, -0, EIn.EventTime) AS date) AS dt
,EIn.EventTime AS LogIn
,CA_Out.EventTime AS LogOut
,DATEDIFF(minute, EIn.EventTime, CA_Out.EventTime) AS WorkingMinutes
FROM
VIEW_EVENT_EMPLOYEE AS EIn
CROSS APPLY
(
SELECT TOP(1) EOut.EventTime
FROM VIEW_EVENT_EMPLOYEE AS EOut
WHERE
EOut.PSNID = EIn.PSNID
AND EOut.ReaderNo = 'Out'
AND EOut.EventTime >= EIn.EventTime
ORDER BY EOut.EventTime
) AS CA_Out
WHERE
EIn.ReaderNo = 'In'
)
SELECT
PSNID
,PSNNAME
,dt
,LogIn
,LogOut
,WorkingMinutes
FROM CT
WHERE dt BETWEEN '2019-11-29' AND '2019-12-05'
ORDER BY LogIn
;
OUTPUT FROM QUERY
+----+------------+-------------------------+-------------------------+----------------+
| ID | date | In | Out | WorkingMinutes |
+----+------------+-------------------------+-------------------------+----------------+
| 1 | 2019-12-04 | 2019-12-04 11:28:46.000 | 2019-12-04 12:36:17.000 | 68 |
| 1 | 2019-12-04 | 2019-12-04 12:39:23.000 | 2019-12-04 12:51:21.000 | 12 |
| 1 | 2019-12-05 | 2019-12-05 07:37:49.000 | 2019-12-05 08:01:22.000 | 24 |
+----+------------+-------------------------+-------------------------+----------------+
I was thinking of something like this: treat the row as night shift when Out is between 06:25 and 06:40. But I also need to check whether the employee's In on the previous day was between 21:50 and 22:30. I need that second condition because an employee from the first shift might also clock Out at, for example, 6:30.
*(1310 is the ShiftStartMinutesFromMidnight)
Line 3 of the query:
CAST(DATEADD(minute, -0, EIn.EventTime) AS date) AS dt
I would replace line 3 with this code:
CASE
WHEN CAST(CA_Out.LogDate AS time) BETWEEN '06:25:00' AND '06:40:00'
AND CAST(EIn.LogDate AS time) BETWEEN '21:50:00' AND '22:30:00' THEN CAST(DATEADD(minute, -1310, EIn.LogDate) AS date)
ELSE CAST(DATEADD(minute, -0, EIn.LogDate) AS date)
END as dt
Expected Output
+----+------------+-------------------------+-------------------------+----------------+
| ID | date | In | Out | WorkingMinutes |
+----+------------+-------------------------+-------------------------+----------------+
| 2 | 2019-12-04 | 2019-12-04 22:11:46.000 | 2019-12-04 23:06:17.000 | 55 |
| 2 | 2019-12-04 | 2019-12-04 23:34:23.000 | 2019-12-05 01:32:21.000 | 118 |
| 2 | 2019-12-04 | 2019-12-05 01:38:49.000 | 2019-12-05 06:32:22.000 | 294 |
+----+------------+-------------------------+-------------------------+----------------+
Assuming that total minutes per separate date is enough:
WITH
/* enumerate pairs */
cte1 AS ( SELECT *,
COUNT(CASE WHEN ReaderNo = 'In' THEN 1 END)
OVER (PARTITION BY ID
ORDER BY EventTime) pair
FROM test ),
/* divide by pairs */
cte2 AS ( SELECT ID, MIN(EventTime) starttime, MAX(EventTime) endtime
FROM cte1
GROUP BY ID, pair ),
/* get dates range */
cte3 AS ( SELECT CAST(MIN(EventTime) AS DATE) minDate,
CAST(MAX(EventTime) AS DATE) maxDate
FROM test),
/* generate dates list */
cte4 AS ( SELECT minDate theDate
FROM cte3
UNION ALL
SELECT DATEADD(dd, 1, theDate)
FROM cte3, cte4
WHERE theDate < maxDate ),
/* add overlapped dates to pairs */
cte5 AS ( SELECT ID, starttime, endtime, theDate
FROM cte2, cte4
WHERE theDate BETWEEN CAST(starttime AS DATE) AND CAST(endtime AS DATE) ),
/* adjust borders */
cte6 AS ( SELECT ID,
CASE WHEN starttime < theDate
THEN theDate
ELSE starttime
END starttime,
CASE WHEN CAST(endtime AS DATE) > theDate
THEN DATEADD(dd, 1, theDate)
ELSE endtime
END endtime,
theDate
FROM cte5 )
/* calculate total minutes per date */
SELECT ID,
theDate,
SUM(DATEDIFF(mi, starttime, endtime)) workingminutes
FROM cte6
GROUP BY ID,
theDate
ORDER BY 1,2
The solution is deliberately detailed, step by step, so that the logic is easy to follow.
You may freely combine some of the CTEs into one. You may also use the second-to-last CTE (cte5) combined with cte2 if you need the output exactly as shown.
The solution assumes that no records are missing from the source data (each 'In' matches exactly one 'Out' and vice versa, and there are no adjacent or overlapping pairs).
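The core of the CTE chain is the step that clips each In/Out pair to calendar-day borders. As a rough illustration only, that step could look like this in Python. The function name `seconds_per_date` is an assumption, and it reports seconds rather than minutes because DATEDIFF(mi, ...) counts minute boundaries crossed, not elapsed whole minutes, so the two can differ slightly.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta

def seconds_per_date(start, end):
    """Split the interval [start, end) at each midnight; return {date: seconds}."""
    totals = defaultdict(int)
    cursor = start
    while cursor < end:
        # The piece ends at the next midnight or at the interval end, whichever is first.
        next_midnight = datetime.combine(cursor.date() + timedelta(days=1), time.min)
        piece_end = min(end, next_midnight)
        totals[cursor.date()] += int((piece_end - cursor).total_seconds())
        cursor = piece_end
    return dict(totals)

# Night-shift pair for ID 2 from the question: In 23:34:23, Out 01:32:21 next day.
shift = seconds_per_date(datetime(2019, 12, 4, 23, 34, 23),
                         datetime(2019, 12, 5, 1, 32, 21))
print(shift)
```

Summing these per-date values over all pairs of an employee gives the per-date totals that the final SELECT produces.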
I don't know where you got stuck, but here is how I do it.
Night shift is 22:00 - 05:00, so within one calendar day it covers 00:00 - 05:00 and 22:00 - 24:00.
Day shift is 05:00 - 22:00.
To make overlap checks easier, convert all dates to Unix timestamps, so you don't have to split the time intervals as shown above.
Then generate a map of shift periods covering the range you fetch (date_from to date_till), and make sure to add exceptions for holidays and pre-holidays, where the periods differ.
Something like this:
The Unix values are placeholders, for illustration only.
unix_from_tim, unix_till_tim, shift_type
1580680800, 1580680800, 1 => example 02-02-2020:22:00:00, 03-02-2020:05:00:00, 1
1580680800, 1580680800, 0 => example 03-02-2020:05:00:00, 03-02-2020:22:00:00, 0
1580680800, 1580680800, 1 => example 03-02-2020:22:00:00, 04-02-2020:05:00:00, 1
...
Make sure you don't double-count minutes at the period start/end boundaries.
Each worker period is one row
with unix_from_tim, unix_till_tim
1580680800, 1580680800 => something like 02-02-2020:16:30:00, 03-02-2020:07:10:00
When you check for overlap, you can get its length (in seconds, for Unix timestamps) like this:
MIN(work_period:till, worker_period:till) - MAX(work_period:from, worker_period:from);
example in simple numbers:
work_period 3 - 7
worker_period 5 - 12
MIN(7,12) - MAX(3,5) = 7 - 5 = 2 //overlap
work_period 3 - 7
worker_period 8 - 12
MIN(7,12) - MAX(3,8) = 7 - 8 = -1 //if negative not overlap!
work_period 3 - 13
worker_period 8 - 12
MIN(13,12) - MAX(3,8) = 12 - 8 = 4 //full overlap!
You have to check each worker period against all of the generated shift intervals it overlaps.
Maybe someone can write a SELECT that doesn't need the generated shift intervals, but that is not an easy task once you add holidays, transferred days, reduced-hours days, etc.
Hope it helps
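The MIN/MAX overlap formula from this answer can be written as a small helper. This is a sketch with an assumed function name (`overlap`); clamping negative results to 0 is an added convenience so that non-overlapping periods contribute nothing when the results are summed.

```python
def overlap(period_from, period_till, worker_from, worker_till):
    """Length of the overlap between two intervals, or 0 if they do not overlap."""
    return max(0, min(period_till, worker_till) - max(period_from, worker_from))

print(overlap(3, 7, 5, 12))   # partial overlap
print(overlap(3, 7, 8, 12))   # no overlap (the raw formula would give -1)
print(overlap(3, 13, 8, 12))  # worker period fully inside the work period
```

With Unix timestamps as inputs the result is in seconds, so divide by 60 to get minutes per shift type.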

Average over intersect

I've got this data set where I'm creating a report per shift displaying a certain activity per hour. Eg.: A shift runs from 7-15, so the report will be:
7:00 - 15 moves
8:00 - 18 moves
This report contains filters for weeks and days, as the goal is for users to see their shift performance.
However, I'd like to compare the activities in a specific hour with the same hour, same day in the last 10 weeks. So for example:
I've got 15 moves in hour 7:00 of a Tuesday and I'd like to compare that with the average move count at 7:00 on Tuesdays in the last 10 weeks.
How would I get this in SQL? :-)
Here is something that might get you moving in the right direction. Given that you have not stated your data structure, I have taken the liberty to assume what that may be.
You can run the following in SSMS to review/modify the results yourself.
First, I created a table variable to simulate the assumed data.
-- Create a dummy "shift" table --
DECLARE @shifts TABLE ( shift_id INT IDENTITY ( 1, 1 ) PRIMARY KEY, shift_date DATETIME, shift_moves INT );
Then I inserted some dummy data. For the sake of time and simplicity, I stuck with Tuesdays over the last ten weeks.
-- Insert ten weeks of Tuesdays --
INSERT INTO @shifts ( shift_date, shift_moves ) VALUES
( '11/06/2018 07:10:00', 5 )
, ( '11/13/2018 07:08:00', 12 )
, ( '11/20/2018 07:00:00', 14 )
, ( '11/27/2018 07:20:00', 15 )
, ( '12/04/2018 07:35:00', 12 )
, ( '12/11/2018 07:18:00', 11 )
, ( '12/18/2018 07:16:00', 10 )
, ( '12/25/2018 07:00:00', 12 )
, ( '01/01/2019 07:00:00', 13 )
, ( '01/08/2019 07:22:00', 15 );
The table variable @shifts now contains the following data:
+----------+-------------------------+-------------+
| shift_id | shift_date | shift_moves |
+----------+-------------------------+-------------+
| 1 | 2018-11-06 07:10:00.000 | 5 |
| 2 | 2018-11-13 07:08:00.000 | 12 |
| 3 | 2018-11-20 07:00:00.000 | 14 |
| 4 | 2018-11-27 07:20:00.000 | 15 |
| 5 | 2018-12-04 07:35:00.000 | 12 |
| 6 | 2018-12-11 07:18:00.000 | 11 |
| 7 | 2018-12-18 07:16:00.000 | 10 |
| 8 | 2018-12-25 07:00:00.000 | 12 |
| 9 | 2019-01-01 07:00:00.000 | 13 |
| 10 | 2019-01-08 07:22:00.000 | 15 |
+----------+-------------------------+-------------+
I created a few parameters that you might pass to a stored procedure.
-- What date are we looking at? --
DECLARE @date DATETIME = '01/08/2019';
-- How many weeks back to compare? --
DECLARE @weeks_back INT = -10;
*Remember: To look backward, @weeks_back MUST be a negative number. In production, you would have a check/handle for this.
Next, I created two local variables to help keep the date/time simple for use when querying.
-- Create variables for the start and end times for simplicity --
DECLARE
@sDT DATETIME = CAST( CONVERT( VARCHAR(10), @date, 101 ) + ' 00:00:00' AS DATETIME ),
@eDT DATETIME = CAST( CONVERT( VARCHAR(10), @date, 101 ) + ' 23:59:59' AS DATETIME );
Then, it was a matter of querying the data for the desired results using CROSS APPLY.
Using CROSS APPLY allows me to query a second subset of shift data that is tied to the primary record queried ( in this case a shift_date of 01/08/2019 ) for the desired time frame ( @weeks_back ).
-- Get resultset --
SELECT
DATEPART( hh, s.shift_date ) AS [shift_hour]
, DATENAME( dw, s.shift_date ) AS [shift_day]
, CONVERT( VARCHAR(10), s.shift_date, 101 ) AS [shift_date]
, s.shift_moves
, shift_avg.shift_average
FROM @shifts AS s
CROSS APPLY (
SELECT
AVG( a.shift_moves ) AS [shift_average]
FROM @shifts a
WHERE
-- restrict to the current hour.
DATEPART( HH, a.shift_date ) = DATEPART( HH, s.shift_date )
-- restrict to the current day of the week.
AND DATEPART( DW, a.shift_date ) = DATEPART( DW, s.shift_date )
-- compare against the desired time period / weeks back, up to and including the current shift.
AND a.shift_date BETWEEN DATEADD( WW, @weeks_back, CAST( CONVERT( VARCHAR(10), s.shift_date, 101 ) AS DATETIME ) ) AND s.shift_date
) AS shift_avg
WHERE
s.shift_date BETWEEN @sDT AND @eDT
ORDER BY
s.shift_date;
Which returns the following resultset:
+------------+-----------+------------+-------------+---------------+
| shift_hour | shift_day | shift_date | shift_moves | shift_average |
+------------+-----------+------------+-------------+---------------+
| 7 | Tuesday | 01/08/2019 | 15 | 11 |
+------------+-----------+------------+-------------+---------------+
I hope this helps get you moving in the desired direction, @Aron.
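For comparison, here is a minimal Python sketch of the same filter (same hour, same weekday, within the look-back window) over the dummy data above. The function name `same_slot_average` and the positive `weeks_back` parameter are assumptions made for this illustration.

```python
from datetime import datetime, timedelta

# (shift_date, shift_moves) rows copied from the dummy data above.
shifts = [
    (datetime(2018, 11, 6, 7, 10), 5),
    (datetime(2018, 11, 13, 7, 8), 12),
    (datetime(2018, 11, 20, 7, 0), 14),
    (datetime(2018, 11, 27, 7, 20), 15),
    (datetime(2018, 12, 4, 7, 35), 12),
    (datetime(2018, 12, 11, 7, 18), 11),
    (datetime(2018, 12, 18, 7, 16), 10),
    (datetime(2018, 12, 25, 7, 0), 12),
    (datetime(2019, 1, 1, 7, 0), 13),
    (datetime(2019, 1, 8, 7, 22), 15),
]

def same_slot_average(rows, current, weeks_back=10):
    """Average moves for rows sharing current's weekday and hour in the look-back window."""
    midnight = datetime.combine(current.date(), datetime.min.time())
    window_start = midnight - timedelta(weeks=weeks_back)
    matches = [
        moves
        for stamp, moves in rows
        if window_start <= stamp <= current
        and stamp.weekday() == current.weekday()
        and stamp.hour == current.hour
    ]
    return sum(matches) / len(matches) if matches else None

average = same_slot_average(shifts, datetime(2019, 1, 8, 7, 22))
print(average)
```

This prints 11.9, whereas the T-SQL result shows 11 because AVG over an INT column returns an integer. Casting shift_moves to a decimal type inside AVG would keep the fraction.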

Discard existing dates that are included in the result, SQL Server

In my database I have a Reservation table and it has three columns Initial Day, Last Day and the House Id.
I want to count the total days and omit those that are repeated, for example:
+-------------+------------+------------+
| | Results | |
+-------------+------------+------------+
| House Id | InitialDay | LastDay |
+-------------+------------+------------+
| 1 | 2017-09-18 | 2017-09-20 |
| 1 | 2017-09-18 | 2017-09-22 |
| 19 | 2017-09-18 | 2017-09-22 |
| 20 | 2017-09-18 | 2017-09-22 |
+-------------+------------+------------+
As you can see, House Id 1 has two rows, and the date range of the first row falls within the range of the second. The total number of days should be 5, because the days in the first row are already covered by the second.
This happens because each house has two rooms, and different people can stay in the same house on the same dates.
My question is: how can I omit those cases, and only count the real days the house was occupied?
If you are using SQL Server 2012 or higher, you can use LAG() to get the previous final date and adjust the initial date:
with ReservationAdjusted as (
select *,
lag(LastDay) over(partition by HouseID order by InitialDay, LastDay) as PreviousLast
from Reservation
)
select HouseId,
sum(case when PreviousLast>LastDay then 0 -- fully contained in the previous reservation
when PreviousLast>=InitialDay then datediff(day,PreviousLast,LastDay) -- overlap
else datediff(day,InitialDay,LastDay)+1 -- no overlap
end) as Days
from ReservationAdjusted
group by HouseId
The cases are:
The reservation is fully included in the previous reservation: we only need to compare end dates, because the previous row is obtained by ordering by InitialDay, LastDay, so the previous start date is always less than or equal to the current start date.
The current reservation overlaps the previous one: in this case we adjust the start and don't add 1 (the initial day is already counted). This case includes the previous end being equal to the current start (a one-day overlap).
There is no overlap: we just calculate the difference and add 1 so that the initial day is counted too.
Note that we don't need an extra condition for the first reservation of a HouseID, because by default the LAG() function returns NULL when there is no previous row, and comparisons with NULL never evaluate to true.
Sample input and output:
| HouseId | InitialDay | LastDay |
|---------|------------|------------|
| 1 | 2017-09-18 | 2017-09-20 |
| 1 | 2017-09-18 | 2017-09-22 |
| 1 | 2017-09-21 | 2017-09-22 |
| 19 | 2017-09-18 | 2017-09-27 |
| 19 | 2017-09-24 | 2017-09-26 |
| 19 | 2017-09-29 | 2017-09-30 |
| 20 | 2017-09-19 | 2017-09-22 |
| 20 | 2017-09-22 | 2017-09-26 |
| 20 | 2017-09-24 | 2017-09-27 |
| HouseId | Days |
|---------|------|
| 1 | 5 |
| 19 | 12 |
| 20 | 9 |
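As a cross-check of the sample output, here is a small Python sketch that counts distinct occupied days per house. Note that it is a variation, not a translation of the LAG() query: it tracks the latest day already covered (a running maximum) instead of only the previous row's LastDay, which also handles a reservation that follows a shorter one nested inside a longer one. The name `occupied_days` is an assumption.

```python
from collections import defaultdict
from datetime import date

# (house_id, initial_day, last_day) rows from the sample input above.
reservations = [
    (1, date(2017, 9, 18), date(2017, 9, 20)),
    (1, date(2017, 9, 18), date(2017, 9, 22)),
    (1, date(2017, 9, 21), date(2017, 9, 22)),
    (19, date(2017, 9, 18), date(2017, 9, 27)),
    (19, date(2017, 9, 24), date(2017, 9, 26)),
    (19, date(2017, 9, 29), date(2017, 9, 30)),
    (20, date(2017, 9, 19), date(2017, 9, 22)),
    (20, date(2017, 9, 22), date(2017, 9, 26)),
    (20, date(2017, 9, 24), date(2017, 9, 27)),
]

def occupied_days(rows):
    """Count distinct occupied days per house; both end dates are inclusive."""
    by_house = defaultdict(list)
    for house, first, last in rows:
        by_house[house].append((first, last))
    totals = {}
    for house, spans in by_house.items():
        days = 0
        covered_until = None  # latest day already counted for this house
        for first, last in sorted(spans):
            if covered_until is None or first > covered_until:
                days += (last - first).days + 1      # no overlap: count both ends
                covered_until = last
            elif last > covered_until:
                days += (last - covered_until).days  # overlap: count only the new tail
                covered_until = last
            # otherwise the reservation is fully covered and adds nothing
        totals[house] = days
    return totals

print(occupied_days(reservations))
```

For the sample rows this gives 5, 12 and 9 days for houses 1, 19 and 20, matching the table above.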
select HouseId, min(InitialDay), max(LastDay)
from Reservation
group by HouseId
If I understood correctly!
Try it out and let me know how it works for you.
Ted.
While thinking through your question I came across the wonder that is the idea of a Calendar table. You'd use this code to create one, with whatever range of dates you want for your calendar. The code is from http://blog.jontav.com/post/9380766884/calendar-tables-are-incredibly-useful-in-sql
declare @start_dt as date = '1/1/2010';
declare @end_dt as date = '1/1/2020';
declare @dates as table (
date_id date primary key,
date_year smallint,
date_month tinyint,
date_day tinyint,
weekday_id tinyint,
weekday_nm varchar(10),
month_nm varchar(10),
day_of_year smallint,
quarter_id tinyint,
first_day_of_month date,
last_day_of_month date,
start_dts datetime,
end_dts datetime
)
while @start_dt < @end_dt
begin
insert into @dates(
date_id, date_year, date_month, date_day,
weekday_id, weekday_nm, month_nm, day_of_year, quarter_id,
first_day_of_month, last_day_of_month,
start_dts, end_dts
)
values(
@start_dt, year(@start_dt), month(@start_dt), day(@start_dt),
datepart(weekday, @start_dt), datename(weekday, @start_dt), datename(month, @start_dt), datepart(dayofyear, @start_dt), datepart(quarter, @start_dt),
dateadd(day,-(day(@start_dt)-1),@start_dt), dateadd(day,-(day(dateadd(month,1,@start_dt))),dateadd(month,1,@start_dt)),
cast(@start_dt as datetime), dateadd(second,-1,cast(dateadd(day, 1, @start_dt) as datetime))
)
set @start_dt = dateadd(day, 1, @start_dt)
end
select *
into Calendar
from @dates
Once you have a calendar table your query is as simple as:
select distinct r.House_id, c.date_id
from Reservation as r
inner join Calendar as c
on
c.date_id >= r.InitialDay
and c.date_id <= r.LastDay
Which gives you a row for each unique day each house was occupied. If you need a sum of how many days each house was occupied it becomes:
select a.House_id, count(a.House_id) as Days_occupied
from
(select distinct t.House_id, c.date_id
from so_test as t
inner join Calendar as c
on
c.date_id >= t.InitialDay
and c.date_id <= t.LastDay) as a
group by a.House_id
Create a table of all the possible dates and then join it to the Reservations table so that you have a list of all days between InitialDay and LastDay. Like this:
DECLARE @i date
DECLARE @last date
CREATE TABLE #temp (Date date)
SELECT @i = MIN(InitialDay) FROM Reservation
SELECT @last = MAX(LastDay) FROM Reservation
WHILE @i <= @last
BEGIN
INSERT INTO #temp VALUES(@i)
SET @i = DATEADD(day, 1, @i)
END
SELECT HouseID, COUNT(*) FROM
(
SELECT DISTINCT HouseID, Date FROM Reservation
LEFT JOIN #temp
ON Reservation.InitialDay <= #temp.Date
AND Reservation.LastDay >= #temp.Date
) AS a
GROUP BY HouseID
DROP TABLE #temp

Correlate Sequences of Independent Events - Calculate Time Intersection

We are building a PowerBI reporting solution. I (well, Stack) solved one problem, and then the business came up with a new reporting idea. I'm not sure of the best way to approach it, as I know very little about PowerBI and the business seems to want quite complex reports.
We have two sequences of events from separate data sources. Both contain independent events occurring to vehicles. One describes which location a vehicle is in; the other describes incident events, each with a reason code for the incident. The business wants to report on time spent in each location for each reason. Vehicles can change location totally independently of the incident events, and the events are actually datetimes that occur at random points throughout the day. Each type of event has a start time, an end time and a VehicleID.
Vehicle Location Events
+------------------+-----------+------------+-----------------+----------------+
| LocationDetailID | VehicleID | LocationID | StartDateTime | EndDateTime |
+------------------+-----------+------------+-----------------+----------------+
| 1 | 1 | 1 | 2012-1-1 | 2016-1-1 |
| 2 | 1 | 2 | 2016-1-1 | 2016-4-1 |
| 3 | 1 | 1 | 2016-4-1 | 2016-11-1 |
| 4 | 2 | 1 | 2011-1-1 | 2016-11-1 |
+------------------+-----------+------------+-----------------+----------------+
Vehicle Status Events
+---------+---------------+-------------+-----------+--------------+
| EventID | StartDateTime | EndDateTime | VehicleID | ReasonCodeID |
+---------+---------------+-------------+-----------+--------------+
| 1 | 2012-1-1 | 2013-1-1 | 1 | 1 |
| 2 | 2013-1-1 | 2015-1-1 | 1 | 3 |
| 3 | 2015-1-1 | 2016-5-1 | 1 | 4 |
| 4 | 2016-5-1 | 2016-11-1 | 1 | 2 |
| 5 | 2015-9-1 | 2016-2-1 | 2 | 1 |
+---------+---------------+-------------+-----------+--------------+
Is there any way I can correlate the two streams and calculate total time per Vehicle per ReasonCode per location? This would seem to require me to be able to relate the two events, since a change of location may occur part way through a given ReasonCode.
Calculation example for ReasonCodeID 4:
VehicleID 1 is in LocationID 1 from 2012-1-1 to 2016-1-1 and from 2016-4-1 to 2016-11-1.
VehicleID 1 is in LocationID 2 from 2016-1-1 to 2016-4-1.
VehicleID 1 has ReasonCodeID 4 from 2015-1-1 to 2016-5-1.
Therefore the first period in location 1 intersects with 365 days of ReasonCodeID 4 (2015-1-1 to 2016-1-1). The second period in location 1 intersects with 30 days (2016-4-1 to 2016-5-1).
The period in location 2 intersects with 91 days of ReasonCodeID 4 (2016-1-1 to 2016-4-1).
Desired output would be the below.
+-----------+--------------+------------+------------+
| VehicleID | ReasonCodeID | LocationID | Total Days |
+-----------+--------------+------------+------------+
| 1 | 1 | 1 | 366 |
| 1 | 3 | 1 | 730 |
| 1 | 4 | 1 | 395 |
| 1 | 4 | 2 | 91 |
| 1 | 2 | 1 | 184 |
| 2 | 1 | 1 | 154 |
+-----------+--------------+------------+------------+
I have created a SQL fiddle that shows the structure here
Vehicles have related tables and I'm sure the business will want them grouped by vehicle class etc but if I can understand how to calculate the intersection points in this case that would give me the basis for rest of reporting.
I think this solution requires a CROSS JOIN. The relationship between the two tables is many-to-many, which would normally imply a third table bridging the LocationEvents and VehicleStatusEvents tables, so I think specifying the relationship in the expression is easier.
I use a CROSS JOIN between both tables, then filter the results to keep only the rows where the VehicleID columns match in both tables. I also keep only the rows where the VehicleStatusEvents date range intersects the LocationEvents date range.
Once the filtering is done, I add a column that calculates the number of days in each intersection. Finally, the measure sums up the days for each VehicleID, ReasonCodeID and LocationID.
To implement the CROSS JOIN you will have to rename the VehicleID, StartDateTime and EndDateTime columns in one or both tables. This is necessary to avoid ambiguous column name errors.
I renamed the columns as follows:
VehicleID : LocationVehicleID and StatusVehicleID
StartDateTime : LocationStartDateTime and StatusStartDateTime
EndDateTime : LocationEndDateTime and StatusEndDateTime
After this you can use CROSSJOIN in the Total Days measure:
Total Days =
SUMX (
FILTER (
ADDCOLUMNS (
FILTER (
CROSSJOIN ( LocationEvents, VehicleStatusEvents ),
LocationEvents[LocationVehicleID] = VehicleStatusEvents[StatusVehicleID]
&& LocationEvents[LocationStartDateTime] <= VehicleStatusEvents[StatusEndDateTime]
&& LocationEvents[LocationEndDateTime] >= VehicleStatusEvents[StatusStartDateTime]
),
"CountOfDays", IF (
[LocationStartDateTime] <= [StatusStartDateTime]
&& [LocationEndDateTime] >= [StatusEndDateTime],
DATEDIFF ( [StatusStartDateTime], [StatusEndDateTime], DAY ),
IF (
[LocationStartDateTime] > [StatusStartDateTime]
&& [LocationEndDateTime] >= [StatusEndDateTime],
DATEDIFF ( [LocationStartDateTime], [StatusEndDateTime], DAY ),
IF (
[LocationStartDateTime] <= [StatusStartDateTime]
&& [LocationEndDateTime] <= [StatusEndDateTime],
DATEDIFF ( [StatusStartDateTime], [LocationEndDateTime], DAY ),
IF (
[LocationStartDateTime] >= [StatusStartDateTime]
&& [LocationEndDateTime] <= [StatusEndDateTime],
DATEDIFF ( [LocationStartDateTime], [LocationEndDateTime], DAY ),
BLANK ()
)
)
)
)
),
LocationEvents[LocationID] = [LocationID]
&& VehicleStatusEvents[ReasonCodeID] = [ReasonCodeID]
),
[CountOfDays]
)
Then in Power BI you can build a matrix (or any other visualization) using this measure.
If you don't completely understand the measure expression, here is the T-SQL translation:
SELECT
dt.VehicleID,
dt.ReasonCodeID,
dt.LocationID,
SUM(dt.Diff) [Total Days]
FROM
(
SELECT
CASE
WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- Inside range
THEN DATEDIFF(DAY, b.StartDateTime, b.EndDateTime)
WHEN a.StartDateTime > b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- |-----|*****|....|
THEN DATEDIFF(DAY, a.StartDateTime, b.EndDateTime)
WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |...|****|-----|
THEN DATEDIFF(DAY, b.StartDateTime, a.EndDateTime)
WHEN a.StartDateTime >= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |---|****|-----
THEN DATEDIFF(DAY, a.StartDateTime, a.EndDateTime)
END Diff,
a.VehicleID,
b.ReasonCodeID,
a.LocationID --a.StartDateTime, a.EndDateTime, b.StartDateTime, b.EndDateTime
FROM LocationEvents a
CROSS JOIN VehicleStatusEvents b
WHERE a.VehicleID = b.VehicleID
AND
(
(a.StartDateTime <= b.EndDateTime)
AND (a.EndDateTime >= b.StartDateTime)
)
) dt
GROUP BY dt.VehicleID,
dt.ReasonCodeID,
dt.LocationID
Note in T-SQL you could use an INNER JOIN operator too.
Let me know if this helps.
select coalesce(l.VehicleID,s.VehicleID) as VehicleID
,s.ReasonCodeID
,l.LocationID
,sum
(
datediff
(
day
,case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end
,case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end
)
) as TotalDays
from VehicleLocationEvents as l
full join VehicleStatusEvents as s
on s.VehicleID = l.VehicleID
and case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end <=
case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end
group by coalesce(l.VehicleID,s.VehicleID)
,s.ReasonCodeID
,l.LocationID
or
select VehicleID
,ReasonCodeID
,LocationID
,sum (datediff (day,max_StartDateTime,min_EndDateTime)) as TotalDays
from (select coalesce(l.VehicleID,s.VehicleID) as VehicleID
,s.ReasonCodeID
,l.LocationID
,case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end as max_StartDateTime
,case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end as min_EndDateTime
from VehicleLocationEvents as l
full join VehicleStatusEvents as s
on s.VehicleID = l.VehicleID
) ls
where max_StartDateTime <= min_EndDateTime
group by VehicleID
,ReasonCodeID
,LocationID
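Both queries above reduce each location/status pair to max(start) and min(end). As a minimal sketch of that idea, assuming the sample rows from the question and a hypothetical function name `total_days`, the same calculation in Python looks like this:

```python
from collections import defaultdict
from datetime import date

locations = [  # (vehicle_id, location_id, start, end)
    (1, 1, date(2012, 1, 1), date(2016, 1, 1)),
    (1, 2, date(2016, 1, 1), date(2016, 4, 1)),
    (1, 1, date(2016, 4, 1), date(2016, 11, 1)),
    (2, 1, date(2011, 1, 1), date(2016, 11, 1)),
]
statuses = [  # (vehicle_id, reason_code_id, start, end)
    (1, 1, date(2012, 1, 1), date(2013, 1, 1)),
    (1, 3, date(2013, 1, 1), date(2015, 1, 1)),
    (1, 4, date(2015, 1, 1), date(2016, 5, 1)),
    (1, 2, date(2016, 5, 1), date(2016, 11, 1)),
    (2, 1, date(2015, 9, 1), date(2016, 2, 1)),
]

def total_days(locations, statuses):
    """Sum intersection days per (vehicle_id, reason_code_id, location_id)."""
    totals = defaultdict(int)
    for l_vehicle, location, l_start, l_end in locations:
        for s_vehicle, reason, s_start, s_end in statuses:
            if l_vehicle != s_vehicle:
                continue
            start = max(l_start, s_start)  # later of the two starts
            end = min(l_end, s_end)        # earlier of the two ends
            if start < end:                # skip disjoint and merely touching ranges
                totals[(l_vehicle, reason, location)] += (end - start).days
    return dict(totals)

for key, days in sorted(total_days(locations, statuses).items()):
    print(key, days)
```

On the sample data this reproduces the desired totals for vehicle 1 (366, 730, 395, 91 and 184 days). For vehicle 2 it yields 153 days rather than the 154 listed in the question, since 2015-9-1 to 2016-2-1 spans 153 days.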

SQL Server get number of days from date range excluding certain days from specific date range

I am using SQL Server 2008 R2.
I have a table in database with records as shown below :
Id | Status   | UserId | StatusDate              | ProgramStartDate
1  | Active   | 1      | 2014-04-02 00:00:00.000 | 2014-03-23
2  | Inactive | 1      | 2014-04-05 00:00:00.000 | NULL
3  | Pause    | 1      | 2014-04-07 00:00:00.000 | NULL
4  | Inactive | 1      | 2014-04-10 00:00:00.000 | NULL
5  | Active   | 1      | 2014-04-14 00:00:00.000 | NULL
ProgramStartDate is a date entered by the user, while StatusDate is the actual date-time at which the user inserted/updated their status.
Now, I want to count the number of days from ProgramStartDate (2014-03-23) to Today's date (GETDATE()) excluding the number of days in which user was in Inactive status.
Here, user is Active from ProgramStartDate 2014-03-23 to 2014-04-05 (13 Days), 2014-04-07 to 2014-04-10 (3 days), and 2014-04-14 to GETDATE() (9 days)
So total number of active days = 13 + 3 + 9 = 25 days.
The formula works like the example below:
'2014/03/23' '2014/04/05' 13
'2014/04/05' '2014/04/07' -2
'2014/04/07' '2014/04/10' 3
'2014/04/10' '2014/04/14' -4
'2014/04/14' GetDate() 9
and total = 25 days.
Is there any way to achieve this Total Number of Days by SQL query?
Here is a solution for your query. Try it now.
Select SUM(TDays) SumDays
From (
Select Id, Status, UserId,
Case When (Status = 'Inactive') Then 0 Else
(DATEDIFF(DAY,StatusDate,(Case When (NextDate IS NULL) Then GetDate() Else NextDate End)))
End TDays
From (
Select Id, Status, UserId, Case When (ProgramStartDate IS NOT NULL) Then ProgramStartDate Else StatusDate End StatusDate,
(Select Min(StatusDate) From StatusMast M Where M.StatusDate > S.StatusDate) NextDate
From StatusMast S
) As Stat
)As TotDay
The output is:
SumDays
25
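The arithmetic behind that result can be checked with a short Python sketch. It assumes the sample rows from the question and, since the question counts 9 days from 2014-04-14, a "today" of 2014-04-23 standing in for GETDATE(). The name `active_days` is hypothetical.

```python
from datetime import date

program_start = date(2014, 3, 23)
status_rows = [  # (status, status_date) from the question's table
    ("Active", date(2014, 4, 2)),
    ("Inactive", date(2014, 4, 5)),
    ("Pause", date(2014, 4, 7)),
    ("Inactive", date(2014, 4, 10)),
    ("Active", date(2014, 4, 14)),
]

def active_days(rows, program_start, today):
    """Sum the days of every period whose status is not 'Inactive'."""
    rows = sorted(rows, key=lambda r: r[1])
    total = 0
    for index, (status, started) in enumerate(rows):
        if index == 0:
            started = program_start  # the first period counts from the program start
        # A period ends where the next status begins, or today for the last row.
        ended = rows[index + 1][1] if index + 1 < len(rows) else today
        if status != "Inactive":
            total += (ended - started).days
    return total

# Assumed "today": 2014-04-23, inferred from the 9 days in the question.
print(active_days(status_rows, program_start, date(2014, 4, 23)))
```

This prints 25 (13 + 3 + 9), the same total as the SQL. As in the query, each period runs until the next status row begins.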