How do I relate a table of events to a table of intervals? - sql

I need to merge two tables based on overlapping time data, and I don't know how to do this in SQL. I have a table of events and times, such as this:
Event_Table
+----------------+-------+
| Event | Time |
+----------------+-------+
| Fire Alarm | 10:00 |
| Smoke Alarm | 13:00 |
| Security Alarm | 16:00 |
+----------------+-------+
I also have a table of time intervals, such as this:
Interval_Table
+--------+-------------+-----------+
| Warden | Shift_Start | Shift_End |
+--------+-------------+-----------+
| Jack | 09:00 | 10:30 |
| John | 14:00 | 20:00 |
+--------+-------------+-----------+
I need to make a table of events which includes which warden was on duty at the time:
Output_Table
+----------------+-------+----------------+
| Event | Time | Warden_On_Duty |
+----------------+-------+----------------+
| Fire Alarm | 10:00 | Jack |
| Smoke Alarm | 13:00 | [null] |
| Security Alarm | 16:00 | John |
+----------------+-------+----------------+
Some warden shifts might overlap, but that should be ignored; at most one warden name should be displayed for each event. The tables are very large (~500,000 rows). Any ideas on how this can be achieved with SQL?

Here is one way to do this. Notice how I posted consumable DDL and sample data? You should do this in the future. It makes it a LOT easier to help. Most of the time spent on this went into setting up the problem. The query itself was a trivial effort.
if OBJECT_ID('tempdb..#Event') is not null
drop table #Event
create table #Event
(
EventName varchar(20)
, EventTime time
)
insert #Event
select 'Fire Alarm', '10:00' union all
select 'Smoke Alarm', '13:00' union all
select 'Security Alarm', '16:00'
if OBJECT_ID('tempdb..#Shifts') is not null
drop table #Shifts
create table #Shifts
(
Warden varchar(10)
, StartTime time
, EndTime time
)
insert #Shifts
select 'Jack', '09:00', '10:30' union all
select 'John', '14:00', '20:00' union all
select 'overlap', '15:00', '22:00';
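with SortedResults as
(
select *
-- number the matching shifts per event so overlapping shifts collapse to one row
, ROW_NUMBER() over (partition by e.EventName order by s.StartTime) as RowNum
from #Event e
-- left join keeps events with no warden on duty (Warden comes back as NULL)
left join #Shifts s on s.StartTime <= e.EventTime and s.EndTime >= e.EventTime
)
select *
from SortedResults
where RowNum = 1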

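Try this:
select
e.Event,
e.Time,
-- order by makes the choice deterministic when shifts overlap
(select top 1 i.Warden from Interval_Table i where e.Time between i.Shift_Start and i.Shift_End order by i.Shift_Start) as Warden_On_Duty
from Event_Table e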

Pivot all Time Data on Date Column

I need to create a report in SQL Server Reporting Service. The source table/query data is structured as follows:
CNum | EmpNo | TDate | TimeIn | TimeOut
100 | 2 | 12/4/2019 | 7:00 AM | 12:00 PM
100 | 2 | 12/4/2019 | 12:30 PM | 3:30 PM
100 | 2 | 12/5/2019 | 7:00 AM | 12:00 PM
100 | 2 | 12/5/2019 | 12:30 PM | 3:30 PM
I need the report output to be displayed as follows (or something similar, just need to show the TDate as columns and any related time entries based on the CNum as rows).
CNum | 12/4/2019 | 12/5/2019 |
100 | 7:00 AM | 7:00 AM |
| 12:00 PM | 12:00 PM |
100 | 12:30 PM | 12:30 PM |
| 3:30 PM | 3:30 PM |
I have tried using the Matrix Tablix, but this forces the group to return only one record per day when there may be several. My goal is to write a SQL query (CTE or PIVOT) that gives me the report data in the correct format so I will not have to get crazy in the report designer.
I am familiar with SQL but for some reason I cannot get any query to output (Pivot) and include both records for the day.
Any help/guidance will be much appreciated.
You can do this easily in SSRS with a small change to your dataset query.
I reproduced your sample data with the following
DECLARE @t TABLE(CNum int, EmpNo int, TDate Date, TimeIn Time, [TimeOut] Time)
INSERT INTO @t VALUES
(100, 2, '2019/12/04', '07:00', '12:00'),
(100, 2, '2019/12/04', '12:30', '15:30'),
(100, 2, '2019/12/05', '07:00', '12:00'),
(100, 2, '2019/12/05', '12:30', '15:30')
SELECT *, ROW_NUMBER() OVER(PARTITION BY TDate, CNum ORDER BY TimeIn) as RowN FROM @t
Note: I added the RowN column which gives each row a unique number within each TDate and CNum. We add this to the CNum group in the matrix (so it groups by CNum then RowN)
Here's the final design including the row and column groups (Column group is just by TDate)
To get the 2nd row I right-clicked the [TimeIn] 'cell' and chose "Insert Row => Inside Group - Below"
The final output looks like this
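If you would rather shape the data in the query itself, the same RowN idea can be combined with PIVOT. The following is only a sketch under assumptions: the two dates are hard-coded as in the sample data, only TimeIn is pivoted, and the sample table variable is re-declared so the snippet runs on its own.

```sql
-- Sample data, re-declared so the snippet is self-contained
DECLARE @t TABLE (CNum int, EmpNo int, TDate date, TimeIn time, [TimeOut] time);
INSERT INTO @t VALUES
(100, 2, '2019-12-04', '07:00', '12:00'),
(100, 2, '2019-12-04', '12:30', '15:30'),
(100, 2, '2019-12-05', '07:00', '12:00'),
(100, 2, '2019-12-05', '12:30', '15:30');

-- RowN numbers the entries within each CNum/TDate, so PIVOT groups by
-- (CNum, RowN) and keeps one output row per entry instead of one per day
SELECT CNum, RowN, [2019-12-04], [2019-12-05]
FROM (
    SELECT CNum, TDate, TimeIn,
           ROW_NUMBER() OVER (PARTITION BY TDate, CNum ORDER BY TimeIn) AS RowN
    FROM @t
) AS src
PIVOT (MIN(TimeIn) FOR TDate IN ([2019-12-04], [2019-12-05])) AS p
ORDER BY CNum, RowN;
```

With the sample data this should give two rows for CNum 100, one per RowN. A second PIVOT over [TimeOut] could be joined on CNum and RowN if the out-times are needed as well.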
I think you can take it forward from the question "Efficiently convert rows to columns in sql server".
Here is the answer:
With CTE as (
Select
CNum,
TDate,
TimeIn as [Time],
'In' as [Action]
From TimeTable
Union All
Select
CNum,
TDate,
[TimeOut] as [Time],
'Out' as [Action]
From TimeTable
)
Select
*
From CTE
Pivot(min([Time]) for TDate in ([2019-12-04],[2019-12-05])) as pivot_table
union all
Select
*
From CTE
Pivot(max([Time]) for TDate in ([2019-12-04],[2019-12-05])) as pivot_table

Is there way to add date difference values we get to the date automatically?

I have two dates and use DATEDIFF to get the difference between them. For example, I have a planned start date and an actual start date, and the difference between them is 5 days; now I want to add those days to the finish date.
If the finish date turns out to be later than I assumed, I want to add that difference and find the next finish date, because being behind pushes out the upcoming dates.
Sum (DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate)) OVER
(Partition
By ts.Id)as TotalVariance,
Case when (Sum (DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate))
OVER
(Partition By ts.Id) >30) then 'Positive' end as Violation,
DATEADD (day, DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate))as
Summar violations,
If activity 1 has a planned start date of 8/21/2019 but an actual start date of 9/21/2019, we are 31 days behind.
The next activity will then be delayed, so I want to add this difference to the next activity.
If the second activity's planned start date was 08/26/2019, the delay of activity 1 changes its start date, and I want to find that new date.
Activity | PlannedStartDate | ActualStartDate | Variance | NewPlannedStartDate
Activity 1 | 8/21/2019 | 9/21/2019 | 31 |
Activity 2 | 8/26/2019 | null | | 9/26/2019
Here's an example you can run in SSMS:
-- CREATE ACTIVITY TABLE AND ADD SOME DATA --
DECLARE @Activity TABLE ( ActivityId INT, PlannedStart DATE, ActualStart DATE );
INSERT INTO @Activity (
ActivityId, PlannedStart, ActualStart
)
VALUES
( 1, '08/21/2019', '08/27/2019' ), ( 1, '08/26/2019', NULL ), ( 1, '09/14/2019', NULL );
Query @Activity to see what's in it:
SELECT * FROM @Activity ORDER BY ActivityId, PlannedStart;
@Activity content:
+------------+--------------+-------------+
| ActivityId | PlannedStart | ActualStart |
+------------+--------------+-------------+
| 1 | 2019-08-21 | 2019-08-27 |
| 1 | 2019-08-26 | NULL |
| 1 | 2019-09-14 | NULL |
+------------+--------------+-------------+
Query @Activity to factor the new starting dates:
DECLARE @ActivityId INT = 1 -- activity to report on (assumed value; the snippet uses this variable without showing its declaration)
;WITH Activity_CTE AS (
SELECT
ROW_NUMBER() OVER ( ORDER BY PlannedStart ) AS Id,
ActivityId, PlannedStart, ActualStart, DATEDIFF( dd, PlannedStart, ActualStart ) AS Delayed
FROM @Activity
WHERE
ActivityId = @ActivityId
)
SELECT
ActivityId,
PlannedStart,
ActualStart,
DATEADD( dd, Delays.DaysDelayed, PlannedStart ) AS NewStart
FROM Activity_CTE AS Activity
OUTER APPLY (
SELECT CASE
WHEN ( Delayed IS NOT NULL ) THEN Delayed
ELSE ISNULL( ( SELECT TOP 1 Delayed FROM Activity_CTE WHERE Id < Activity.Id AND Delayed IS NOT NULL ORDER BY Id DESC ), 0 )
END AS DaysDelayed
) AS Delays
ORDER BY
PlannedStart;
Returns
+------------+--------------+-------------+------------+
| ActivityId | PlannedStart | ActualStart | NewStart |
+------------+--------------+-------------+------------+
| 1 | 2019-08-21 | 2019-08-27 | 2019-08-27 |
| 1 | 2019-08-26 | NULL | 2019-09-01 |
| 1 | 2019-09-14 | NULL | 2019-09-20 |
+------------+--------------+-------------+------------+
The real "magic" here is this line:
ELSE ISNULL( ( SELECT TOP 1 Delayed FROM Activity_CTE WHERE Id < Activity.Id AND Delayed IS NOT NULL ORDER BY Id DESC ), 0 )
It checks for any record before the current one that has a delay. If none is found, it returns 0. This value is then added, in days, to the PlannedStart date to determine the NewStart date. The ORDER BY is of particular note too: sorting in DESC order ensures we get the "closest" delay prior to the current row.
Using a CTE in this way also takes into account the idea that the delay may not happen on the very first record (e.g., say the 08/26 planned was delayed instead of 08/21). It conveniently gives us a subtable to query against in our OUTER APPLY.
This is what you would see if you included all columns on the CTE's SELECT:
+----+------------+--------------+-------------+---------+-------------+
| Id | ActivityId | PlannedStart | ActualStart | Delayed | DaysDelayed |
+----+------------+--------------+-------------+---------+-------------+
| 1 | 1 | 2019-08-21 | 2019-08-27 | 6 | 6 |
| 2 | 1 | 2019-08-26 | NULL | NULL | 6 |
| 3 | 1 | 2019-09-14 | NULL | NULL | 6 |
+----+------------+--------------+-------------+---------+-------------+
Because the very first record is the only record with a delay, its delay of 6 days persists through each of the following records.
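To illustrate the point about the delay not falling on the very first record, here is a small variation of the query above. The data is hypothetical (the 08/26 activity is the delayed one and the first activity has no actual start yet), the ActivityId filter is left out for brevity, and the alias Prev is introduced only for readability.

```sql
-- Hypothetical data: only the second record has an actual start date
DECLARE @Activity TABLE ( ActivityId INT, PlannedStart DATE, ActualStart DATE );
INSERT INTO @Activity ( ActivityId, PlannedStart, ActualStart )
VALUES ( 1, '2019-08-21', NULL ), ( 1, '2019-08-26', '2019-08-29' ), ( 1, '2019-09-14', NULL );

WITH Activity_CTE AS (
    SELECT
        ROW_NUMBER() OVER ( ORDER BY PlannedStart ) AS Id,
        ActivityId, PlannedStart, ActualStart,
        DATEDIFF( dd, PlannedStart, ActualStart ) AS Delayed
    FROM @Activity
)
SELECT
    Activity.PlannedStart,
    Delays.DaysDelayed,
    DATEADD( dd, Delays.DaysDelayed, Activity.PlannedStart ) AS NewStart
FROM Activity_CTE AS Activity
OUTER APPLY (
    SELECT CASE
        WHEN ( Activity.Delayed IS NOT NULL ) THEN Activity.Delayed
        -- otherwise take the closest earlier delay, or 0 when there is none
        ELSE ISNULL( ( SELECT TOP 1 Prev.Delayed FROM Activity_CTE AS Prev
                       WHERE Prev.Id < Activity.Id AND Prev.Delayed IS NOT NULL
                       ORDER BY Prev.Id DESC ), 0 )
    END AS DaysDelayed
) AS Delays
ORDER BY Activity.PlannedStart;
-- 2019-08-21 | 0 | 2019-08-21   (no earlier delay, so ISNULL falls back to 0)
-- 2019-08-26 | 3 | 2019-08-29   (its own delay)
-- 2019-09-14 | 3 | 2019-09-17   (closest earlier delay carried forward)
```

The first row keeps its planned date because nothing before it is delayed, while the third row inherits the 3-day delay from the second.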

Dynamically compare week to week by cohorts

Objective:
Get Id logins in week 1. Then how many of those Ids logged in in Week 2.
Restart the same logic for Week 2 to Week 3.
Then week 3 and week 4 and so on... This exercise needs to be done every week.
The Ids need to be segmented by cohorts which are the month and year they subscribed.
Story:
First table (member) has the email and its creation date. The 2nd table (login table) is the login activity. First, I need to group emails by creation date(month-year) to create the cohorts.
Then, the login activity comparing week to week for each cohort. Is it possible
for this query to be dynamic each week?
Output:
The result should look like this:
+--------+--------+--------+--------+---------+
| Cohort | 2019-1 | 2019-2 | 2019-3 | 2019-4 |...
+--------+--------+--------+--------+---------+
| 2018-3 | 7000 | 6800 | 7400| 7100 |...
| 2018-4 | 6800 | 6500 | 8400| 8000 |...
| 2018-5 | 9500 | 8000 | 6400| 6200 |...
| 2018-6 | 9100 | 8500 | 8000| 7800 |...
| 2018-7 | 10000 | 8000 | 7000| 6800 |...
+--------+--------+--------+--------+---------+
What I tried:
SELECT CONCAT(DATEPART(YEAR,m.date_created),'-',DATEPART(MONTH,m.date_created)) AS Cohort
,CONCAT(subquery.[YYYY],'-',subquery.[ISO]) AS YYYY_ISO
,m.email
FROM member as m
INNER JOIN (SELECT DATEPART(YEAR,log.login_time) AS [YYYY]
,DATEPART(ISO_WEEK,log.login_time) AS [ISO]
,log.email
,ROW_NUMBER()
OVER(PARTITION BY
DATEPART(YEAR,log.login_time),
DATEPART(ISO_WEEK,log.login_time),
log.email
ORDER BY log.login_time ASC) AS Log_Rank
FROM login AS log
WHERE CAST(log.login_time AS DATE) >= '2019-01-01'
) AS subquery ON m.email=subquery.email AND Log_Rank = 1
ORDER BY cohort
Sample Data:
CREATE TABLE member
([email] varchar(50), [date_created] Datetime)
CREATE TABLE login
([email] varchar(50), [login_time] Datetime)
INSERT INTO member
VALUES
('player123@google.com', '2018-03-01 05:00:00'),
('player999@google.com', '2018-04-12 12:00:00'),
('player555@google.com', '2018-04-25 20:15:00')
INSERT INTO login
VALUES
('player123@google.com', '2019-01-07 05:30:00'),
('player123@google.com', '2019-01-08 08:30:00'),
('player123@google.com', '2019-01-15 06:30:00'),
('player999@google.com', '2019-01-08 11:30:00'),
('player999@google.com', '2019-01-10 07:30:00'),
('player555@google.com', '2019-01-08 04:30:00')

Discard existing dates that are included in the result, SQL Server

In my database I have a Reservation table and it has three columns Initial Day, Last Day and the House Id.
I want to count the total days and omit those who are repeated, for example:
+-------------+------------+------------+
| | Results | |
+-------------+------------+------------+
| House Id | InitialDay | LastDay |
+-------------+------------+------------+
| 1 | 2017-09-18 | 2017-09-20 |
| 1 | 2017-09-18 | 2017-09-22 |
| 19 | 2017-09-18 | 2017-09-22 |
| 20 | 2017-09-18 | 2017-09-22 |
+-------------+------------+------------+
If you noticed the House Id with the number 1 has two rows, and each row has dates but the first row is in the interval of dates of the second row. In total the number of days should be 5 because the first shouldn't be counted as those days already exist in the second.
The reason why this is happening is that each house has two rooms, and different persons can stay in that house on the same dates.
My question is: how can I omit those cases, and only count the real days the house was occupied?
If you are using SQL Server 2012 or higher, you can use LAG() to get the previous end date and adjust the initial date:
with ReservationAdjusted as (
select *,
lag(LastDay) over(partition by HouseID order by InitialDay, LastDay) as PreviousLast
from Reservation
)
select HouseId,
sum(case when PreviousLast>LastDay then 0 -- fully contained in the previous reservation
when PreviousLast>=InitialDay then datediff(day,PreviousLast,LastDay) -- overlap
else datediff(day,InitialDay,LastDay)+1 -- no overlap
end) as Days
from ReservationAdjusted
group by HouseId
The cases are:
The reservation is fully included in the previous reservation: we only need to compare end dates because the previous row is obtained by ordering on InitialDay, LastDay, so the previous start date is always less than or equal to the current start date.
The current reservation overlaps with the previous one: in this case we adjust the start and don't add 1 (the initial day is already counted). This case includes the previous end being equal to the current start (a one-day overlap).
There is no overlap: we just calculate the difference and add 1 so that the initial day is counted as well.
Note that we don't need an extra condition for the first reservation of a HouseId: by default LAG() returns NULL when there is no previous row, and a comparison with NULL never evaluates to true, so the row falls through to the last case.
Sample input and output:
| HouseId | InitialDay | LastDay |
|---------|------------|------------|
| 1 | 2017-09-18 | 2017-09-20 |
| 1 | 2017-09-18 | 2017-09-22 |
| 1 | 2017-09-21 | 2017-09-22 |
| 19 | 2017-09-18 | 2017-09-27 |
| 19 | 2017-09-24 | 2017-09-26 |
| 19 | 2017-09-29 | 2017-09-30 |
| 20 | 2017-09-19 | 2017-09-22 |
| 20 | 2017-09-22 | 2017-09-26 |
| 20 | 2017-09-24 | 2017-09-27 |
| HouseId | Days |
|---------|------|
| 1 | 5 |
| 19 | 12 |
| 20 | 9 |
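One caveat worth checking against your data: LAG() only sees the immediately preceding row, so a long reservation followed by two shorter ones nested inside it can be counted slightly wrong. A possible variation, sketched here with hypothetical data in a table variable, replaces LAG() with a running MAX(LastDay) over all earlier rows; the rest of the query follows the LAG() answer above.

```sql
-- Hypothetical data: both later stays lie inside the first one
DECLARE @Reservation TABLE (HouseId int, InitialDay date, LastDay date);
INSERT INTO @Reservation VALUES
(19, '2017-09-18', '2017-09-27'),
(19, '2017-09-24', '2017-09-26'),
(19, '2017-09-25', '2017-09-27');

WITH ReservationAdjusted AS (
    SELECT *,
           -- latest end date among ALL earlier rows, not just the previous one
           MAX(LastDay) OVER (PARTITION BY HouseId
                              ORDER BY InitialDay, LastDay
                              ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PreviousLast
    FROM @Reservation
)
SELECT HouseId,
       SUM(CASE WHEN PreviousLast >= LastDay THEN 0                                     -- fully covered already
                WHEN PreviousLast >= InitialDay THEN DATEDIFF(day, PreviousLast, LastDay) -- partial overlap
                ELSE DATEDIFF(day, InitialDay, LastDay) + 1                               -- no overlap
           END) AS Days
FROM ReservationAdjusted
GROUP BY HouseId;
-- HouseId 19 -> 10 (2017-09-18 through 2017-09-27)
```

With plain LAG() the third row would be compared against 2017-09-26 only and would add one extra day, even though 2017-09-27 is already covered by the first reservation.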
select HouseId, min(InitialDay), max(LastDay)
from Reservation
group by HouseId
If I understood correctly!
Try out and let me know how it works out for you.
Ted.
While thinking through your question I came across the wonder that is the idea of a Calendar table. You'd use this code to create one, with whatever range of dates you want for your calendar. Code is from http://blog.jontav.com/post/9380766884/calendar-tables-are-incredibly-useful-in-sql
declare @start_dt as date = '1/1/2010';
declare @end_dt as date = '1/1/2020';
declare @dates as table (
    date_id date primary key,
    date_year smallint,
    date_month tinyint,
    date_day tinyint,
    weekday_id tinyint,
    weekday_nm varchar(10),
    month_nm varchar(10),
    day_of_year smallint,
    quarter_id tinyint,
    first_day_of_month date,
    last_day_of_month date,
    start_dts datetime,
    end_dts datetime
)
-- one row per day from @start_dt up to (but not including) @end_dt
while @start_dt < @end_dt
begin
    insert into @dates(
        date_id, date_year, date_month, date_day,
        weekday_id, weekday_nm, month_nm, day_of_year, quarter_id,
        first_day_of_month, last_day_of_month,
        start_dts, end_dts
    )
    values(
        @start_dt, year(@start_dt), month(@start_dt), day(@start_dt),
        datepart(weekday, @start_dt), datename(weekday, @start_dt), datename(month, @start_dt), datepart(dayofyear, @start_dt), datepart(quarter, @start_dt),
        dateadd(day,-(day(@start_dt)-1),@start_dt), dateadd(day,-(day(dateadd(month,1,@start_dt))),dateadd(month,1,@start_dt)),
        cast(@start_dt as datetime), dateadd(second,-1,cast(dateadd(day, 1, @start_dt) as datetime))
    )
    set @start_dt = dateadd(day, 1, @start_dt)
end
-- persist the generated rows as a permanent Calendar table
select *
into Calendar
from @dates
Once you have a calendar table your query is as simple as:
select distinct r.House_id, c.date_id
from Reservation as r
inner join Calendar as c
on
c.date_id >= r.InitialDay
and c.date_id <= r.LastDay
Which gives you a row for each unique day each house was occupied. If you need the number of days each house was occupied, it becomes:
select a.House_id, count(a.House_id) as Days_occupied
from
(select distinct r.House_id, c.date_id
from Reservation as r
inner join Calendar as c
on
c.date_id >= r.InitialDay
and c.date_id <= r.LastDay) as a
group by a.House_id
Create a table of all the possible dates and then join it to the Reservation table so that you have a list of all days between InitialDay and LastDay. Like this:
DECLARE @i date
DECLARE @last date
CREATE TABLE #temp (Date date)
-- the date list must span the earliest InitialDay to the latest LastDay
SELECT @i = MIN(InitialDay) FROM Reservation
SELECT @last = MAX(LastDay) FROM Reservation
WHILE @i <= @last
BEGIN
INSERT INTO #temp VALUES(@i)
SET @i = DATEADD(day, 1, @i)
END
SELECT HouseID, COUNT(*) FROM
(
SELECT DISTINCT HouseID, Date FROM Reservation
LEFT JOIN #temp
ON Reservation.InitialDay <= #temp.Date
AND Reservation.LastDay >= #temp.Date
) AS a
GROUP BY HouseID
DROP TABLE #temp
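If neither a permanent calendar table nor a WHILE loop is convenient, the list of dates can also be generated inline. This is a minimal sketch, assuming the Reservation columns from the question (HouseId, InitialDay, LastDay); the table variable and its rows are only there so the snippet runs on its own, and the recursive CTE is a swapped-in technique, not part of the answers above.

```sql
-- Illustrative data in a table variable standing in for Reservation
DECLARE @Reservation TABLE (HouseId int, InitialDay date, LastDay date);
INSERT INTO @Reservation VALUES
(1,  '2017-09-18', '2017-09-20'),
(1,  '2017-09-18', '2017-09-22'),
(19, '2017-09-18', '2017-09-22');

DECLARE @first date, @last date;
SELECT @first = MIN(InitialDay), @last = MAX(LastDay) FROM @Reservation;

-- Recursive CTE: one row per day from @first to @last
WITH Dates AS (
    SELECT @first AS [Date]
    UNION ALL
    SELECT DATEADD(day, 1, [Date]) FROM Dates WHERE [Date] < @last
)
SELECT r.HouseId, COUNT(DISTINCT d.[Date]) AS Days_occupied
FROM @Reservation AS r
INNER JOIN Dates AS d
    ON d.[Date] BETWEEN r.InitialDay AND r.LastDay
GROUP BY r.HouseId
OPTION (MAXRECURSION 0);   -- lift the default limit of 100 recursion levels
-- HouseId 1 -> 5, HouseId 19 -> 5
```

COUNT(DISTINCT ...) does the de-duplication that the DISTINCT subquery does in the versions above. For large date ranges a persisted calendar table is likely to perform better than generating the dates on every run.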

SQL Total time based on Timestamp and States

I have a table like this:
Timestamp | State
01-jan-2016 00:01:00 | ON
01-jan-2016 00:02:00 | OFF
01-jan-2016 00:02:01 | ON
01-jan-2016 00:03:00 | OFF
A Sample result would look like, considering NOW is 01-Jan-2016 00:03:10.
State | TotalTime
ON | 00:01:59
OFF | 00:00:11
I'd like to have a query that returns the total time [in hours, mins and secs] for each of the states. Is that possible using SQL Server Express 2012? Any ideas/directions I should take?
A small change would be required if you want to see over 24 hours
Declare @YourTable table (Timestamp datetime,State varchar(25))
Insert Into @YourTable values
('01-jan-2016 00:01:00','ON'),
('01-jan-2016 00:02:00','OFF'),
('01-jan-2016 00:02:01','ON'),
('01-jan-2016 00:03:00','OFF')
Declare @Default DateTime ='01-Jan-2016 00:03:10'
;with cteBase as (
Select *,NextTime = Lead(TimeStamp,1,@Default) over (Order By TimeStamp)
From @YourTable
)
Select State
,Duration=cast(DateAdd(SECOND,sum(DateDiff(SECOND,TimeStamp,NextTime)),cast('1900-01-01 00:00:00' as datetime)) as time)
From cteBase
Group By State
Returns
State Duration
OFF 00:00:11
ON 00:01:59
Just a quick note.
Lead(TimeStamp,1,@Default) could be Lead(TimeStamp,1,GetDate()) if you want the final state to run up to the current time.
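As for the "small change" needed for totals over 24 hours: casting to time wraps around at 24:00:00, so one option is to build the HH:MM:SS string from the total seconds yourself. The sketch below assumes hypothetical sample rows (ON for 27 hours) and introduces the names cteTotals and TotalSeconds for illustration.

```sql
-- Hypothetical data: the ON state lasts 27 hours
DECLARE @YourTable TABLE ([Timestamp] datetime, State varchar(25));
INSERT INTO @YourTable VALUES
('2016-01-01T00:01:00', 'ON'),
('2016-01-02T03:01:00', 'OFF');
DECLARE @Default datetime = '2016-01-02T03:01:10';

WITH cteBase AS (
    SELECT *, NextTime = LEAD([Timestamp], 1, @Default) OVER (ORDER BY [Timestamp])
    FROM @YourTable
), cteTotals AS (
    -- total seconds spent in each state
    SELECT State, TotalSeconds = SUM(DATEDIFF(SECOND, [Timestamp], NextTime))
    FROM cteBase
    GROUP BY State
)
SELECT State,
       -- hours are not capped at 23; minutes and seconds are zero-padded
       Duration = CAST(TotalSeconds / 3600 AS varchar(10)) + ':'
                + RIGHT('0' + CAST((TotalSeconds % 3600) / 60 AS varchar(2)), 2) + ':'
                + RIGHT('0' + CAST(TotalSeconds % 60 AS varchar(2)), 2)
FROM cteTotals;
-- ON -> 27:00:00, OFF -> 0:00:10
```

The result is a string rather than a time value, so any sorting or further arithmetic should be done on TotalSeconds.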