SQL Query to show gaps between multiple date ranges - sql

I'm working on an SSRS / SQL project and trying to write a query to get the gaps between dates, and I am completely lost as to how to write it. Basically, we have a number of devices which can be scheduled for use, and I need a report to show when they are not in use.
I have a table with Device ID, EventStart and EventEnd times. I need to run a query to get the times between these events for each device, but I am not really sure how to do this.
For example:
Device 1 Event A runs from `01/01/2012 08:00 - 01/01/2012 10:00`
Device 1 Event B runs from `01/01/2012 18:00 - 01/01/2012 20:00`
Device 1 Event C runs from `02/01/2012 18:00 - 02/01/2012 20:00`
Device 2 Event A runs from `01/01/2012 08:00 - 01/01/2012 10:00`
Device 2 Event B runs from `01/01/2012 18:00 - 01/01/2012 20:00`
My query should have as its result
`Device 1 01/01/2012 10:00 - 01/01/2012 18:00`
`Device 1 01/01/2012 20:00 - 02/01/2012 18:00`
`Device 2 01/01/2012 10:00 - 01/01/2012 18:00`
There will be around 4 - 5 devices on average in this table, and maybe 200 - 300 + events.
Updates:
OK, I'll update this to try to give a bit more info, since I don't seem to have explained this too well (sorry!)
What I am dealing with is a table which has details for events. Each event is a booking of a flight simulator. We have a number of flight sims (referred to as devices in the table) and we are trying to generate an SSRS report which we can give to a customer to show the days / times each sim is available.
So I am going to pass in a start / end date parameter and select all availabilities between those dates. The results should then display as something like:
Device  Available_From    Available_To
1       01/01/2012 10:00  01/01/2012 18:00
1       01/01/2012 20:00  02/01/2012 18:00
2       01/01/2012 10:00  01/01/2012 18:00
Also, events can sometimes overlap, though this is very rare and due to bad data. It doesn't matter if an event on one device overlaps an event on a different device, as I need to know availability for each device separately.

The Query:
Assuming the fields containing the interval are named Start and Finish, and the table is named YOUR_TABLE, the query...
SELECT Finish, Start
FROM
    (
        -- DENSE_RANK (rather than ROW_NUMBER) gives identical starts the same
        -- number, so that DISTINCT can actually collapse them into one row.
        SELECT DISTINCT Start, DENSE_RANK() OVER (ORDER BY Start) RN
        FROM YOUR_TABLE T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM YOUR_TABLE T2
                WHERE T1.Start > T2.Start AND T1.Start < T2.Finish
            )
    ) T1
    JOIN (
        SELECT DISTINCT Finish, DENSE_RANK() OVER (ORDER BY Finish) RN
        FROM YOUR_TABLE T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM YOUR_TABLE T2
                WHERE T1.Finish > T2.Start AND T1.Finish < T2.Finish
            )
    ) T2
    ON T1.RN - 1 = T2.RN
WHERE
    Finish < Start
...gives the following result on your test data:
Finish Start
2012-01-01 10:00:00.000 2012-01-01 18:00:00.000
The important property of this query is that it would work on overlapping intervals as well.
The Algorithm:
1. Merge Overlapping Intervals
The subquery T1 accepts only those interval starts that are outside other intervals. The subquery T2 does the same for interval ends. This is what removes overlaps.
The DISTINCT is important in case there are two identical interval starts (or ends) that are both outside other intervals. The WHERE Finish < Start simply eliminates any empty intervals (i.e. duration 0).
We also attach a row number relative to temporal ordering, which will be needed in the next step.
The T1 yields:
Start RN
2012-01-01 08:00:00.000 1
2012-01-01 18:00:00.000 2
The T2 yields:
Finish RN
2012-01-01 10:00:00.000 1
2012-01-01 20:00:00.000 2
2. Reconstruct the Result
We can now reconstruct either the "active" or the "inactive" intervals.
The inactive intervals are reconstructed by putting together end of the previous interval with the beginning of the next one, hence - 1 in the ON clause. Effectively, we put...
Finish RN
2012-01-01 10:00:00.000 1
...and...
Start RN
2012-01-01 18:00:00.000 2
...together, resulting in:
Finish Start
2012-01-01 10:00:00.000 2012-01-01 18:00:00.000
(The active intervals could be reconstructed by putting rows from T1 alongside rows from T2, by using JOIN ... ON T1.RN = T2.RN and reversing the WHERE condition.)
The Example:
Here is a slightly more realistic example. The following test data:
Device Event Start Finish
Device 1 Event A 2012-01-01 08:00:00.000 2012-01-01 10:00:00.000
Device 2 Event B 2012-01-01 18:00:00.000 2012-01-01 20:00:00.000
Device 3 Event C 2012-01-02 11:00:00.000 2012-01-02 15:00:00.000
Device 4 Event D 2012-01-02 10:00:00.000 2012-01-02 12:00:00.000
Device 5 Event E 2012-01-02 10:00:00.000 2012-01-02 15:00:00.000
Device 6 Event F 2012-01-03 09:00:00.000 2012-01-03 10:00:00.000
Gives the following result:
Finish Start
2012-01-01 10:00:00.000 2012-01-01 18:00:00.000
2012-01-01 20:00:00.000 2012-01-02 10:00:00.000
2012-01-02 15:00:00.000 2012-01-03 09:00:00.000
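For readers who want to check the merge-then-reconstruct idea outside the database, here is a minimal plain-Python sketch of the same logic. It is not the SQL above: the names find_gaps, ts and events are made up for illustration, and it sorts the intervals and tracks a running "busy until" time instead of numbering starts and ends.

```python
from datetime import datetime

def find_gaps(intervals):
    """Return the idle (from, to) gaps between possibly overlapping busy intervals."""
    gaps = []
    busy_until = None  # latest finish seen so far, i.e. end of the merged busy block
    for start, finish in sorted(intervals):
        if busy_until is not None and start > busy_until:
            gaps.append((busy_until, start))  # nothing was running in between
        if busy_until is None or finish > busy_until:
            busy_until = finish
    return gaps

def ts(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM' into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")

# The six test rows from the example above (device and event names dropped).
events = [
    (ts("2012-01-01 08:00"), ts("2012-01-01 10:00")),
    (ts("2012-01-01 18:00"), ts("2012-01-01 20:00")),
    (ts("2012-01-02 11:00"), ts("2012-01-02 15:00")),
    (ts("2012-01-02 10:00"), ts("2012-01-02 12:00")),
    (ts("2012-01-02 10:00"), ts("2012-01-02 15:00")),
    (ts("2012-01-03 09:00"), ts("2012-01-03 10:00")),
]

gaps = find_gaps(events)
for idle_from, idle_to in gaps:
    print(idle_from, idle_to)
```

To get availability per device, as the question asks, one would group the rows by device first and call find_gaps once per group.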

First Answer -- but see below for final one with additional constraints added by OP.
--
If you want to get the next startTime after the most recent endTime and avoid overlaps, you want something like:
select
distinct
e1.deviceId,
e1.EventEnd,
e3.EventStart
from Events e1
join Events e3 on e1.eventEnd < e3.eventStart /* Finds the next start Time */
and e3.eventStart = (select min(eventStart) from Events e5
where e5.eventStart > e1.eventEnd)
and not exists (select * /* Eliminates an e1 row if it is overlapped */
from Events e5
where e5.eventStart < e1.eventEnd
and e5.eventEnd > e1.eventEnd)
For the case of your three rows:
INSERT INTO Events VALUES (1, '01/01/2012 08:00', '01/01/2012 10:00')
INSERT INTO Events VALUES (2, '01/01/2012 18:00', '01/01/2012 20:00')
insert into Events values (2, '01/01/2012 09:00', '01/01/2012 11:00')
This gives 1 result:
January, 01 2012 11:00:00-0800 January, 01 2012 18:00:00-0800
However, I assume you probably want to match on DeviceId also. In which case, on the joins, you'd add e1.DeviceId = e3.DeviceId and e1.deviceId = e5.deviceId
SQL Fiddle here: http://sqlfiddle.com/#!3/3899c/8
--
OK, final edit. Here's a query adding in deviceIds and adding in a distinct to account for simultaneously ending events:
SELECT distinct
e1.DeviceID,
e1.EventEnd as LastEndTime,
e3.EventStart as NextStartTime
FROM Events e1
join Events e3 on e1.eventEnd < e3.eventStart
and e3.deviceId = e1.deviceId
and e3.eventStart = (select min(eventStart) from Events e5
where e5.eventStart > e1.eventEnd
and e5.deviceId = e3.deviceId)
where not exists (select * from Events e7
where e7.eventStart < e1.eventEnd
and e7.eventEnd > e1.eventEnd
and e7.deviceId = e1.deviceId)
order by e1.deviceId, e1.eventEnd
The join to e3 finds the next start. The subquery on e5 guarantees that this is the earliest start time after the current end time. The NOT EXISTS on e7 eliminates a row if the end time of the considered row is overlapped by a different row.
For this data:
INSERT INTO Events VALUES (1, '01/01/2012 08:00', '01/01/2012 10:00')
INSERT INTO Events VALUES (2, '01/01/2012 18:00', '01/01/2012 20:00')
insert into Events values (2, '01/01/2012 09:00', '01/01/2012 11:00')
insert into Events values (2, '01/02/2012 11:00', '01/02/2012 15:00')
insert into Events values (1, '01/02/2012 10:00', '01/02/2012 12:00')
insert into Events values (2, '01/02/2012 10:00', '01/02/2012 15:00')
insert into Events values (2, '01/03/2012 09:00', '01/03/2012 10:00')
You get this result:
1 January, 01 2012 10:00:00-0800 January, 02 2012 10:00:00-0800
2 January, 01 2012 11:00:00-0800 January, 01 2012 18:00:00-0800
2 January, 01 2012 20:00:00-0800 January, 02 2012 10:00:00-0800
2 January, 02 2012 15:00:00-0800 January, 03 2012 09:00:00-0800
SQL Fiddle here: http://sqlfiddle.com/#!3/db0fa/3
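The per-device logic can also be traced in plain Python. This is only a sketch of what the final query does, assuming the dates are in US month/day order as in the result above. The names next_start_gaps, us and events are invented for the example.

```python
from datetime import datetime

def next_start_gaps(events):
    """events: list of (device_id, start, end).
    Returns sorted, distinct (device_id, last_end, next_start) rows."""
    rows = set()  # a set plays the role of SELECT DISTINCT
    for device, _, end in events:
        same_device = [e for e in events if e[0] == device]
        # Like the NOT EXISTS on e7: skip this end time if another booking
        # on the same device is still running when it ends.
        if any(s < end < f for _, s, f in same_device):
            continue
        # Like the e3/e5 pair: earliest start on this device after the end.
        later_starts = [s for _, s, _ in same_device if s > end]
        if later_starts:
            rows.add((device, end, min(later_starts)))
    return sorted(rows)

def us(text):
    """Hypothetical helper: parse 'MM/DD/YYYY HH:MM' into a datetime."""
    return datetime.strptime(text, "%m/%d/%Y %H:%M")

# Same rows as the INSERT statements above.
events = [
    (1, us("01/01/2012 08:00"), us("01/01/2012 10:00")),
    (2, us("01/01/2012 18:00"), us("01/01/2012 20:00")),
    (2, us("01/01/2012 09:00"), us("01/01/2012 11:00")),
    (2, us("01/02/2012 11:00"), us("01/02/2012 15:00")),
    (1, us("01/02/2012 10:00"), us("01/02/2012 12:00")),
    (2, us("01/02/2012 10:00"), us("01/02/2012 15:00")),
    (2, us("01/03/2012 09:00"), us("01/03/2012 10:00")),
]

result = next_start_gaps(events)
for row in result:
    print(*row)
```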

I'm going to assume that it's not really this simple... but here's a query based on my current understanding of your scenario:
DECLARE @Events TABLE (
    DeviceID INT,
    EventStart DATETIME,
    EventEnd DATETIME
)
INSERT INTO @Events VALUES (1, '01/01/2012 08:00', '01/01/2012 10:00')
INSERT INTO @Events VALUES (2, '01/01/2012 18:00', '01/01/2012 20:00')
SELECT
    e1.DeviceID,
    e1.EventEnd,
    e2.EventStart
FROM
    @Events e1
    JOIN @Events e2
        ON e2.EventStart = (
            SELECT MIN(EventStart)
            FROM @Events
            WHERE EventStart > e1.EventEnd
        )

Does this solve your issue:
http://www.simple-talk.com/sql/t-sql-programming/find-missing-date-ranges-in-sql/
http://www.simple-talk.com/sql/t-sql-programming/missing-date-ranges--the-sequel/
The second one seems more relevant
'There is a table, where two of the columns are DateFrom and DateTo.
Both columns contain date and time values. How does one find the
missing date ranges or, in other words, all the date ranges that are
not covered by any of the entries in the table'.

Here is a Postgres solution that I just did, that does not involve stored procedures:
SELECT minute, sum(case when dp.id is null then 0 else 1 end) as s
FROM generate_series(
'2017-12-28'::timestamp,
'2017-12-30'::timestamp,
'1 minute'::interval
) minute
left outer join device_periods as dp
on minute >= dp.start_date and minute < dp.end_date
group by minute order by minute
The generate_series function generates a table that has one row for each minute in the date range. You can change the interval to 1 second to be more precise. It is a Postgres-specific function, but something similar probably exists in other engines.
This query will give you all the minutes that are filled, and all that are blank. You can wrap this query in an outer query, that can group by hours, days or do some window function operations to get the exact output as you need it. For my purposes, I only needed to count if there are blanks or not.
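As an illustration of the "wrap it in an outer query" step, here is a rough plain-Python sketch that builds the same kind of minute grid and then collapses consecutive blank minutes into ranges. The function name free_ranges and the sample bookings are assumptions made for the example. Unlike generate_series, which includes the stop value, this grid stops just before the end time.

```python
from datetime import datetime, timedelta
from itertools import groupby

def free_ranges(periods, range_start, range_end, step=timedelta(minutes=1)):
    """Collapse the unoccupied grid points in [range_start, range_end) into (from, to) ranges."""
    ticks = []
    t = range_start
    while t < range_end:
        # Same test as the join condition: minute >= start_date and minute < end_date
        busy = any(start <= t < end for start, end in periods)
        ticks.append((t, busy))
        t += step
    ranges = []
    # groupby yields one run per stretch of consecutive busy (or free) ticks
    for busy, run in groupby(ticks, key=lambda item: item[1]):
        run = list(run)
        if not busy:
            ranges.append((run[0][0], run[-1][0] + step))
    return ranges

# Hypothetical bookings for one device on one day.
periods = [
    (datetime(2017, 12, 28, 8, 0), datetime(2017, 12, 28, 10, 0)),
    (datetime(2017, 12, 28, 18, 0), datetime(2017, 12, 28, 20, 0)),
]
free = free_ranges(periods, datetime(2017, 12, 28), datetime(2017, 12, 29))
for free_from, free_to in free:
    print(free_from, free_to)
```

The grid approach is simple but does work proportional to the number of ticks, so a one-second step over a long date range gets expensive quickly.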

Related

Select the time from the last register before another register in another column - Amazon redshift

I am trying to select the time of the last record in one column that occurs before a record in another column.
Here's the case:
I have some ids and two datetime columns that register two different events.
Event A can happen multiple times and event B happens only once. Event A can happen before or after event B.
I want to select, in another column, the last time event A happened before event B. I am using AWS Redshift. I've had some success using the last_value window function to get the last value from column A, but as it may occur after the column B record, I am missing some entries. Here is an example:
ID  event A  Event B  Last event A before event B
1   11:20    11:40    11:20
1   10:40    11:40    11:20
2   09:40    09:50    9:42
2   09:42    09:50    9:42
2   10:50    09:50    9:42
2   11:00    09:50    9:42
create table amish.window_test
(id int, time1 time, time2 time);
insert into amish.window_test values
(1, '11:20' ,'11:40' ),
(1, '10:40' ,'11:40' ),
(2, '09:40' ,'09:50' ),
(2, '09:42' ,'09:50' ),
(2, '10:50' ,'09:50' ),
(2, '11:00' ,'09:50' );
select id, time1, time2, max(case when time1 < time2 then time1 end)
over (partition by id) as last_a_before_b
from amish.window_test;
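The conditional MAX over a partition can be sanity-checked with a small plain-Python sketch. The function name last_a_before_b is made up for illustration, and the rows mirror the insert above.

```python
from datetime import time

rows = [
    (1, time(11, 20), time(11, 40)),
    (1, time(10, 40), time(11, 40)),
    (2, time(9, 40), time(9, 50)),
    (2, time(9, 42), time(9, 50)),
    (2, time(10, 50), time(9, 50)),
    (2, time(11, 0), time(9, 50)),
]

def last_a_before_b(rows):
    """For each id, find the latest time1 that is earlier than time2,
    then attach it to every row of that id (like MAX(...) OVER (PARTITION BY id))."""
    best = {}
    for row_id, time1, time2 in rows:
        # Same filter as CASE WHEN time1 < time2 THEN time1 END
        if time1 < time2 and (row_id not in best or time1 > best[row_id]):
            best[row_id] = time1
    # Ids with no qualifying time1 get None, like a NULL from MAX.
    return [(row_id, t1, t2, best.get(row_id)) for row_id, t1, t2 in rows]

result = last_a_before_b(rows)
for row in result:
    print(*row)
```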

SQL query to find available slots with multiple providers and users

I want to be able to find the number of available slots for a particular time duration for all locations and all days
For example: I have to know the number of available appointments before 10 AM in all locations from the below sample tables
I have looked at other answers in stack overflow, mine is peculiar in the sense it also involves data on multiple doctors/patients.
Doctor's time table
Location  RESOURCE  Day  StartTime  EndTime
ABC       D1        Mon  8:00 AM    12:00 PM
ABC       D1        Tue  8:00 AM    12:00 PM
ABC       D2        Mon  9:00 AM    01:00 PM
ABC       D2        Tue  8:00 AM    12:00 PM
XYZ       D1        Mon  8:00 AM    12:00 PM
XYZ       D1        Tue  8:00 AM    12:00 PM
XYZ       D4        Mon  9:00 AM    01:00 PM
XYZ       D4        Tue  8:00 AM    12:00 PM
Patient's appointment time table
Location  Patient  Duration  StartTime  ApptDt
ABC       P1       15        8:00 AM    10/4/2021
ABC       P2       15        8:15 AM    10/4/2021
ABC       P3       15        9:00 AM    10/4/2021
ABC       P4       15        9:00 AM    10/5/2021
XYZ       P5       15        10:00 AM   10/5/2021
XYZ       P6       15        10:00 AM   10/5/2021
XYZ       P7       15        10:15 AM   10/5/2021
XYZ       P8       15        10:15 AM   10/5/2021
Doctor's time table does not have dates as it is the same throughout the year.
On Mondays in ABC location, since there are 2 doctors overlapping the time between 9:00 AM to 12:00 noon, they can accept multiple appointments at the same time. ie, 2 patients from 9:00 am to 9:15 am can be served in location ABC.
A typical duration(Duration) for an appointment is 15 minutes as indicated in the patient's table.
Expected result set
Location  Date       Available appts
ABC       10/4/2021  8
XYZ       10/4/2021  12
On 10/4/2021 there were 8 slots available for booking before 10 AM because there were no appointments between
8:30-8:45 for D1
8:45-9:00 for D1
9:00-9:15(2) for D1,D2
9:15-9:30(2) for D1,D2
9:30-9:45(2) for D1,D2
9:45-10:00(2) for D1,D2
I want to also know for a specific time slot how many appointments were booked vs available.
I'd re-imagine this data as transactional using CTEs, compute balances and then find the points where the balance is non-zero.
Conceptually, that means there's a +1 doctor transaction on each doctor's start time, and a -1 doctor transaction on each doctor's end time. Patients are just the reverse, there is a -1 doctor transaction at their start time and a +1 doctor transaction at their start time plus duration.
So something like:
WITH DrStarts AS (
SELECT
1 [Drs],
[Dates].[Date] + [DrSched].StartTime [Timestamp]
FROM [DrSched]
INNER JOIN [Dates]
ON WEEKDAY([Dates]) = [DrSched].[Day]
), DrEnds AS (
SELECT
-1 [Drs],
[Dates].[Date] + [DrSched].EndTime [Timestamp]
FROM [DrSched]
INNER JOIN [Dates]
ON WEEKDAY([Dates]) = [DrSched].[Day]
), ApptStarts AS (
SELECT -1 [Drs], [Date] + [Time] FROM [Appts]
), ApptEnds AS (
SELECT 1 [Drs], DATEADD(MINUTE,[Duration],[Date] + [Time]) FROM [Appts]
), Txns AS (
SELECT *, 1 Priority FROM DrStarts
UNION ALL SELECT *, 1 Priority FROM DrEnds
UNION ALL SELECT *, 0 Priority FROM ApptStarts
UNION ALL SELECT *, 0 Priority FROM ApptEnds
)
I added priorities at the end so we can make sure the patient leaves an instant before the doctor leaves. Then you can get the balance using a windowed function like so:
, AvailDrs AS (
SELECT
*,
SUM([Drs]) OVER( ORDER BY [Timestamp], [Priority] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) [AvailDrs]
FROM Txns
)
Then to get the available slots, you just do:
SELECT
[AvailDrs].[Timestamp] [From],
LEAD([AvailDrs].[Timestamp]) OVER(ORDER BY [AvailDrs].[Timestamp]) [To],
[AvailDrs].[AvailDrs]
FROM AvailDrs
WHERE [AvailDrs] > 0
Though you may want to filter that to get rid of zero-length windows because those will occur.
This is not very performant, but if you have a high volume scenario, you probably want to reconsider your database design to make this function require less transformation.
You also need to make a date table. I presume you actually have a work calendar somewhere, but if not there are myriad ways to create a date table within a dynamic start/end date, so I just assume it exists here. This approach also lets you easily slot in holidays, and perhaps incorporate a dr-specific leave calendar too.
In general, a wide range of difficult SQL problems become much easier if you reimagine the data as account/amount/timestamp transactions. Here you don't even subdivide into accounts, but you often need that concept for other puzzles.
Also, I haven't tested this exact code, so you may end up with duplicates. If that's the case, you may need a global key as an ORDER BY tie-breaker to keep everything running smoothly in the windowed functions. You can add this as an identity column to both tables, or just define a CTE with a DENSE_RANK() key column and use that instead of selecting from the tables directly.
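The transaction idea can be tried out in plain Python before committing to the SQL. The sketch below is an assumption-laden illustration, not a translation of the CTEs: the names availability and at are invented, and the data is just the Monday 10/4/2021 rows for location ABC. It nets all +1/-1 changes per timestamp first, which avoids both the zero-length windows and the need for a priority column.

```python
from datetime import datetime, timedelta

def availability(doctor_shifts, appointments):
    """doctor_shifts: list of (start, end); appointments: list of (start, minutes).
    Returns (from, to, free_doctors) windows where free_doctors > 0."""
    txns = []
    for start, end in doctor_shifts:
        txns.append((start, +1))  # doctor becomes available
        txns.append((end, -1))    # doctor leaves
    for start, minutes in appointments:
        txns.append((start, -1))                               # patient occupies a doctor
        txns.append((start + timedelta(minutes=minutes), +1))  # patient releases the doctor
    # Net change per timestamp, so simultaneous events cancel out cleanly.
    deltas = {}
    for when, change in txns:
        deltas[when] = deltas.get(when, 0) + change
    windows = []
    balance = 0
    times = sorted(deltas)
    # Running balance, paired with the next timestamp (like SUM() OVER plus LEAD()).
    for current, following in zip(times, times[1:]):
        balance += deltas[current]
        if balance > 0:
            windows.append((current, following, balance))
    return windows

def at(hour, minute=0):
    """Hypothetical helper: a time on Monday 10/4/2021."""
    return datetime(2021, 10, 4, hour, minute)

shifts = [(at(8), at(12)), (at(9), at(13))]        # D1 and D2 at ABC on Monday
appts = [(at(8), 15), (at(8, 15), 15), (at(9), 15)]  # P1, P2, P3

windows = availability(shifts, appts)
for window in windows:
    print(*window)
```

Adjacent windows with the same balance are not merged here, so two back-to-back rows with one free doctor can appear. Counting 15-minute slots would be a further step of dividing each window's length by the slot size and multiplying by the balance.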

SQL getting datediff from same field

I have a problem. I need to get the date difference in hours from my table, but the problem is that the times are saved in the same field. This is what my table looks like.
RecNo. Employeeno recorddate recordtime recordval
1 001 8/22/2014 8:15 AM 1
2 001 8/22/2014 5:00 PM 2
3 001 8/24/2014 8:01 AM 1
4 001 8/24/2014 5:01 PM 2
1 indicates time in and 2 indicates time out. Now, how will I get the number of hours worked for each day? What I want to get is something like this.
Date hoursworked
8/22/2014 8
8/24/2014 8
I am using VS 2010 and SQL server 2005
You could self-join each "in" record with its corresponding "out" record and use datediff to subtract them:
SELECT time_in.employeeno AS "Employee No",
time_in.recorddate AS "Date",
DATEDIFF (hour, time_in.recordtime, time_out.recordtime)
AS "Hours Worked"
FROM (SELECT *
FROM my_table
WHERE recordval = 1) time_in
INNER JOIN (SELECT *
FROM my_table
WHERE recordval = 2) time_out
ON time_in.employeeno = time_out.employeeno AND
time_in.recorddate = time_out.recorddate
If you always record time in and time out for every employee, and just one per day, using a self-join should work:
SELECT
t1.Employeeno,
t1.recorddate,
t1.recordtime AS [TimeIn],
t2.recordtime AS [TimeOut],
DATEDIFF(HOUR,t1.recordtime, t2.recordtime) AS [HoursWorked]
FROM Table1 t1
INNER JOIN Table1 t2 ON
t1.Employeeno = t2.Employeeno
AND t1.recorddate = t2.recorddate
WHERE t1.recordval = 1 AND t2.recordval = 2
I included the recordtime fields as time in, time out, if you don't want them just remove them.
Note that this datediff calculation gives 9 hours, and not 8 as you suggested.
Using this sample data:
with table1 as (
select * from ( values
(1,'001', cast('20140822' as datetime),cast('08:15:00 am' as time),1)
,(2,'001', cast('20140822' as datetime),cast('05:00:00 pm' as time),2)
,(3,'001', cast('20140824' as datetime),cast('08:01:00 am' as time),1)
,(4,'001', cast('20140824' as datetime),cast('04:59:00 pm' as time),2)
,(5,'001', cast('20140825' as datetime),cast('10:01:00 pm' as time),1)
,(6,'001', cast('20140826' as datetime),cast('05:59:00 am' as time),2)
)data(RecNo,EmployeeNo,recordDate,recordTime,recordVal)
)
this query
SELECT
Employeeno
,convert(char(10),recorddate,120) as DateStart
,convert(char(5),cast(TimeIn as time)) as TimeIn
,convert(char(5),cast(TimeOut as time)) as TimeOut
,DATEDIFF(minute,timeIn, timeOut) / 60 AS [HoursWorked]
,DATEDIFF(minute,timeIn, timeOut) % 60 AS [MinutesWorked]
FROM (
SELECT
tIn.Employeeno,
tIn.recorddate,
dateadd(minute, datediff(minute,0,tIn.recordTime), tIn.recordDate)
as TimeIn,
( SELECT TOP 1
dateadd(minute, datediff(minute,0,tOut.recordTime), tOut.recordDate)
as TimeOut
FROM Table1 tOut
WHERE tOut.RecordVal = 2
AND tOut.EmployeeNo = tIn.EmployeeNo
AND tOut.RecNo > tIn.RecNo
ORDER BY tOut.EmployeeNo, tOut.RecNo
) as TimeOut
FROM Table1 tIn
WHERE tIn.recordval = 1
) T
yields (as desired)
Employeeno DateStart TimeIn TimeOut HoursWorked MinutesWorked
---------- ---------- ------ ------- ----------- -------------
001 2014-08-22 08:15 17:00 8 45
001 2014-08-24 08:01 16:59 8 58
001 2014-08-25 22:01 05:59 7 58
No assumptions are made about shifts not running across midnight (see case 3).
This particular implementation may not be the most performant way to construct this correlated subquery, so if there is a performance problem come back and we can look at it again. However running those tests requires a large dataset which I don't feel like constructing just now.
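The pairing rule used by the correlated subquery (each "in" record is matched with the next "out" record of the same employee by RecNo) can be sketched in plain Python as follows. The names hours_worked, dt and records are illustrative only, and the date and time columns are assumed to be already combined into one timestamp.

```python
from datetime import datetime, date

def hours_worked(records):
    """records: list of (rec_no, employee_no, timestamp, record_val); 1 = in, 2 = out.
    Returns (employee_no, start_date, hours, minutes) per clock-in."""
    results = []
    for rec_no, emp, clock_in, val in records:
        if val != 1:
            continue
        # Candidate clock-outs: same employee, later RecNo (like the TOP 1 subquery).
        outs = [r for r in records if r[3] == 2 and r[1] == emp and r[0] > rec_no]
        if not outs:
            continue  # the SQL would return a NULL TimeOut here; this sketch skips the row
        clock_out = min(outs, key=lambda r: r[0])[2]
        minutes = int((clock_out - clock_in).total_seconds() // 60)
        results.append((emp, clock_in.date(), minutes // 60, minutes % 60))
    return results

def dt(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM' into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")

# Same six rows as the sample data above.
records = [
    (1, "001", dt("2014-08-22 08:15"), 1),
    (2, "001", dt("2014-08-22 17:00"), 2),
    (3, "001", dt("2014-08-24 08:01"), 1),
    (4, "001", dt("2014-08-24 16:59"), 2),
    (5, "001", dt("2014-08-25 22:01"), 1),
    (6, "001", dt("2014-08-26 05:59"), 2),
]

worked = hours_worked(records)
for row in worked:
    print(*row)
```

Because the subtraction is done on full timestamps, the shift that crosses midnight needs no special handling.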

SQL Server find time slot between start time and end time

In SQL Server, how can I find the free time slots in a schedule table? I need to output each row's end time together with the next row's start time.
select
s, e,
Max(cid)as c_id,
ROW_NUMBER()OVER(order by CAST(s as datetime)) as row_id
from classroom
where Room like '3310' and Days like '%T%'
group by s,e
order by CAST(s as datetime)
For example:
s e c_id row_id
------- ------- ------- ------
9:30 10:45 235 1
11:00 12:15 236 2
12:30 13:45 238 3
14:00 15:15 1415 4
15:30 16:45 273 5
17:00 18:15 270 6
I need to output
10:45-11:00
12:15-12:30
13:45-14:00
Thanks
You can insert your data in a temp table and then query that temp table
select s,e,Max(cid)as c_id,
ROW_NUMBER()OVER(order by CAST(s as datetime))as row_id
into #t
from classroom
where Room like '3310' and Days like '%T%'
group by s,e
order by CAST(s as datetime)
select t1.e, t2.s
from #t t1
INNER JOIN #t t2 on t1.row_id + 1 = t2.row_id
If you only want to know when there is a time gap between one class finishing and the next starting, add where t2.s > t1.e to Abhi's answer. If you need a minimum slot size, say 15 minutes, use where DATEDIFF(mi, t1.e, t2.s) >= 15.
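The row_id + 1 self-join amounts to pairing each class with the one that follows it. A minimal plain-Python sketch of that pairing, with an optional minimum slot length, might look like this. The names slots_between and at are invented for the example, and the times are parsed into real time values rather than compared as strings.

```python
from datetime import datetime, timedelta

def slots_between(classes, minimum=timedelta(0)):
    """classes: list of (start, end). Returns (end, next_start) gaps of at least `minimum`."""
    ordered = sorted(classes)
    gaps = []
    # zip(ordered, ordered[1:]) pairs row N with row N + 1, like the row_id + 1 join
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end and next_start - prev_end >= minimum:
            gaps.append((prev_end, next_start))
    return gaps

def at(text):
    """Hypothetical helper: parse 'H:MM' into a datetime on a dummy date."""
    return datetime.strptime(text, "%H:%M")

# The six rows from the question.
classes = [
    (at("9:30"), at("10:45")),
    (at("11:00"), at("12:15")),
    (at("12:30"), at("13:45")),
    (at("14:00"), at("15:15")),
    (at("15:30"), at("16:45")),
    (at("17:00"), at("18:15")),
]

free = slots_between(classes, minimum=timedelta(minutes=15))
for slot_from, slot_to in free:
    print(slot_from.strftime("%H:%M") + "-" + slot_to.strftime("%H:%M"))
```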

Multiple rows of dates between using custom calendar

So banging my head against the wall and can't see the wood for the trees...
I've got two tables;
1. ID field, start date and end date columns.
2. Date and Workday columns.
I just need to be able to count the working days between the two dates for each row, using the dates in the second (calendar) table. Googling has found plenty of examples without the dates table, and plenty of examples where it's based on just one start and end date.
Table_1 - Contains an entry for every id
id start_date end_date
123 01/01/2013 03/01/2013
456 02/01/2013 08/01/2013
789 06/01/2013 07/01/2013
Table_2 - Contains an entry for everyday
e_day workday
01/01/2013 1
02/01/2013 0
03/01/2013 1
04/01/2013 1
05/01/2013 0
06/01/2013 1
07/01/2013 0
08/01/2013 0
Results
id start_date end_date days_between
123 01/01/2013 03/01/2013 2
456 02/01/2013 08/01/2013 3
789 06/01/2013 07/01/2013 1
I can find out the value for 1 id;
SELECT COUNT(workday) FROM table_2
WHERE workday = 1 AND e_day >= '01/01/2013'
AND e_day <= '03/01/2013';
Just not sure how to put this logic in to table_1.
IE (Clearly not correct)
SELECT
table_1.id,
table_1.start_date,
table_1.end_date,
(COUNT(table_2.workday) FROM table_2 WHERE table_2.workday = 1
AND table_2.e_day >= table_1.start_date
AND table_2.e_day <= table_2.end_date) AS days_between
FROM table_1
Code to generate bodged example tables;
CREATE TABLE #table_1(id INT, start_date SMALLDATETIME, end_date SMALLDATETIME);
CREATE TABLE #table_2(e_day SMALLDATETIME, workday BIT);
INSERT #table_1 VALUES (123,'01/01/2013','03/01/2013')
INSERT #table_1 VALUES (456,'02/01/2013','08/01/2013')
INSERT #table_1 VALUES (789,'06/01/2013','07/01/2013')
INSERT #table_2 VALUES ('01/01/2013',1)
INSERT #table_2 VALUES ('02/01/2013',0)
INSERT #table_2 VALUES ('03/01/2013',1)
INSERT #table_2 VALUES ('04/01/2013',1)
INSERT #table_2 VALUES ('05/01/2013',0)
INSERT #table_2 VALUES ('06/01/2013',1)
INSERT #table_2 VALUES ('07/01/2013',0)
INSERT #table_2 VALUES ('08/01/2013',0)
SELECT * FROM #table_1
SELECT * FROM #table_2
Code to remove tables;
DROP TABLE #table_1 DROP TABLE #table_2;
Thanks all for your help in advance :)
Try this:
select a.id,a.start_date,a.end_date,sum(cast(workday as tinyint)) as NumWorkDays,
count(*) as Total_days
from idTable a
join workdaytable b on b.e_day between a.start_date and a.end_date
group by a.id,a.start_date,a.end_date
To visualize what is happening
select a.id,a.start_date,a.end_date
from idTable a
where a.id=123
id start_date end_date
123 1/1/2013 3/1/2013
returns one row for id=123
Now, when we do the join, we add e_day and the workday flag columns AND we add one row for each e_day in the second table
id start_date end_date e_day work_day
123 1/1/2013 3/1/2013 1/1/2013 0
123 1/1/2013 3/1/2013 1/2/2013 1
123 1/1/2013 3/1/2013 1/3/2013 1
etc.
Now we have a big "table" with 5 columns and one row for each day in the second table that falls between 1/1/2013 and 3/1/2013. The SUM operation simply adds up all of the workday flags in the "table" we created by the join. If you run the query without the GROUP BY (and remove the SUM and COUNT), you can see the "table" that gets created...
Hope this helps a bit...
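To see the join-then-aggregate step in isolation, here is a minimal plain-Python sketch using the question's sample data, read as day/month/year. The names count_workdays and day are made up for illustration.

```python
from datetime import datetime

def day(text):
    """Hypothetical helper: parse 'DD/MM/YYYY' into a date."""
    return datetime.strptime(text, "%d/%m/%Y").date()

def count_workdays(ranges, calendar):
    """ranges: list of (id, start, end); calendar: list of (e_day, workday flag).
    Returns (id, workdays, total_days) per range."""
    results = []
    for row_id, start, end in ranges:
        # The BETWEEN join: keep every calendar row that falls inside the range.
        in_range = [flag for e_day, flag in calendar if start <= e_day <= end]
        # SUM(workday) and COUNT(*) over the joined rows.
        results.append((row_id, sum(in_range), len(in_range)))
    return results

table_1 = [
    (123, day("01/01/2013"), day("03/01/2013")),
    (456, day("02/01/2013"), day("08/01/2013")),
    (789, day("06/01/2013"), day("07/01/2013")),
]
table_2 = [
    (day("01/01/2013"), 1),
    (day("02/01/2013"), 0),
    (day("03/01/2013"), 1),
    (day("04/01/2013"), 1),
    (day("05/01/2013"), 0),
    (day("06/01/2013"), 1),
    (day("07/01/2013"), 0),
    (day("08/01/2013"), 0),
]

counts = count_workdays(table_1, table_2)
for row in counts:
    print(*row)
```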