How do I calculate coverage dates during a period of time in SQL against a transactional table? - sql

I'm attempting to compile a list of date ranges like so:
Coverage Range: 10/1/2016 - 10/5/2016
Coverage Range: 10/9/2016 - 10/31/2016
for each policy in a database table. The table is transactional, and there is one cancellation transaction code, but three codes that can indicate coverage has begun. Also, there can be instances where the codes that indicate start of coverage can occur in sequence (start on 10/1, then another start on 10/5, then cancel on 10/14). Below is an example of a series of transactions that I would like to generate the above results from:
TransID PolicyID EffDate
NewBus 1 9/15/2016
Confirm 1 9/17/2016
Cancel 1 10/5/2016
Reinst 1 10/9/2016
Cancel 1 10/15/2016
Reinst 1 10/15/2016
PolExp 1 3/15/2017
SO in this dataset, I want the following results for the date range 10/1 - 10/31
Coverage Range: 10/1/2016 - 10/5/2016
Coverage Range: 10/9/2016 - 10/31/2016
Note that since the cancel and reinstatement happen on the same day, I'm excluding them from the results set. I tried pairing the transactions with subqueries:
CONVERT(varchar(10),
CASE WHEN overall.sPTRN_ID in (SELECT code FROM #cancelTransCodes)
-- This is a coverage cancellationentry
THEN -- Set coverage start date using previous paired record
CASE WHEN((SELECT MAX(inn.PD_EffectiveDate) FROM PolicyData inn WHERE inn.sPTRN_ID in (SELECT code FROM #startCoverageTransCodes)
and inn.PD_EffectiveDate <= overall.PD_EffectiveDate
and inn.PD_PolicyCode = overall.PD_PolicyCode) < #sDate) THEN #sDate
ELSE
(SELECT MAX(inn.PD_EffectiveDate) FROM PolicyData inn WHERE inn.sPTRN_ID in (SELECT code FROM #startCoverageTransCodes)
and inn.PD_EffectiveDate <= overall.PD_EffectiveDate
and inn.PD_PolicyCode = overall.PD_PolicyCode)
END
ELSE -- Set coverage start date using current record
CASE WHEN (overall.PD_EffectiveDate < #sDate) THEN #sDate ELSE overall.PD_EffectiveDate END END, 101)
as [Effective_Date]
This mostly works except for the situation I listed above. I'd rather not rewrite this query if I can help it. I have a similar line for expiration date:
ISNULL(CONVERT(varchar(10),
CASE WHEN overall.sPTRN_ID in (SELECT code FROM #cancelTransCodes) -- This is a coverage cancellation entry
THEN -- Set coverage end date with current record
overall.PD_EffectiveDate
ELSE -- check for future coverage end dates
CASE WHEN
(SELECT COUNT(*) FROM PolicyData pd WHERE pd.PD_EffectiveDate > overall.PD_EffectiveDate and pd.sPTRN_ID in (SELECT code FROM #cancelTransCodes)) > 1
THEN -- There are future end dates
CASE WHEN((SELECT TOP 1 pd.PD_ExpirationDate FROM PolicyData pd
WHERE pd.PD_PolicyCode = overall.PD_PolicyCode
and pd.PD_EntryDate between #sDate and #eDate
and pd.sPTRN_ID in (SELECT code FROM #cancelTransCodes))) > #eDate
THEN #eDate
ELSE
(SELECT TOP 1 pd.PD_ExpirationDate FROM PolicyData pd
WHERE pd.PD_PolicyCode = overall.PD_PolicyCode
and pd.PD_EntryDate between #sDate and #eDate
and pd.sPTRN_ID in (SELECT code FROM #cancelTransCodes))
END
ELSE -- No future coverage end dates
CASE WHEN(overall.PD_ExpirationDate > #eDate) THEN #eDate ELSE overall.PD_ExpirationDate END
END
END, 101), CONVERT(varchar(10), CASE WHEN(overall.PD_ExpirationDate > #eDate) THEN #eDate ELSE overall.PD_ExpirationDate END, 101))
as [Expiration_Date]
I can't help but feel like there's a simpler solution I'm missing here. So my question is: how can I modify the above portion of my query to accomodate the above scenario? OR What is the better answer? If I cam simplify this, I would love to hear how.
Here's the solution I ended up implementing
I took a simplified table where I boiled all the START transaction codes to START and all the cancel transaction codes to CANCEL. When I viewed the table based on that, it was MUCH easier to watch how my logic affected the results. I ended up using a simplified system where I used CASE WHEN clauses to identify specific scenarios and built my date ranges based on that. I also changed my starting point away from looking at cancellations and finding the related starts, and reversing it (find starts and then related calcellations). So here's the code I implemented:
/* Get Coverage Dates */
,cast((CASE WHEN sPTRN_ID in (SELECT code FROM #startCoverageTransCodes) THEN
CASE WHEN (cast(overall.PD_EntryDate as date) <= #sDate) THEN #sDate
WHEN (cast(overall.PD_EntryDate as date) > #sDate AND cast(overall.PD_EntryDate as date) <= #eDate) THEN overall.PD_EntryDate
WHEN (cast(overall.PD_EntryDate as date) > #eDate) THEN #eDate
ELSE cast(overall.PD_EntryDate as date) END
ELSE
null
END) as date) as Effective_Date
,cast((CASE WHEN sPTRN_ID in (SELECT code FROM #startCoverageTransCodes) THEN
CASE WHEN (SELECT MIN(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM #cancelTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode) > #eDate THEN #eDate
ELSE ISNULL((SELECT MIN(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM #cancelTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode), #eDate) END
ELSE
CASE WHEN (SELECT MAX(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM #startCoverageTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode) > #eDate THEN #eDate
ELSE (SELECT MAX(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM #startCoverageTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode)
END END) as date) as Expiration_Date
As you can see, I relied on subqueries in this case. I had a lot of this logic as joins, which caused extra rows where I didn't need them. So by making the date range logic based on sub-queries, I ended up speeding the stored procedure up by several seconds, bringing my execution time to under 1 second where before it was between 2-5 seconds.

There might be a simpler solution, but I just do not see it right now.
The outline for each step is:
Generate dates for date range, which you do not need to do if you have a calendar table.
Transform the incoming data set as you described in your question (skipping start/cancel on the same day); and add the next EffDate for each row.
Explode the data set with a row for each day between the generated ranges of step 2.
Reduce the data set back down based on consecutive days of converage.
test setup: http://rextester.com/GUNSO45644
/* set date range */
declare #fromdate date = '20161001'
declare #thrudate date = '20161031'
/* generate dates in range -- you can skip this if you have a calendar table */
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (datediff(day, #fromdate, #thrudate)+1)
[Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1, #fromdate))
from n as deka
cross join n as hecto /* 100 days */
cross join n as kilo /* 2.73 years */
cross join n as [tenK] /* 27.3 years */
order by [Date]
)
/* reduce test table to desired input*/
, pol as (
select
Coverage = case when max(TransId) in ('Cancel','PolExp')
then 0 else 1 end
, PolicyId
, EffDate = case when max(TransId) in ('Cancel','PolExp')
then dateadd(day,1,EffDate) else EffDate end
, NextEffDate = oa.NextEffDate
from t
outer apply (
select top 1
NextEffDate = case
when i.TransId in ('Cancel','PolExp')
then dateadd(day,1,i.EffDate)
else i.EffDate
end
from t as i
where i.PolicyId = t.PolicyId
and i.EffDate > t.EffDate
order by
i.EffDate asc
, case when i.TransId in ('Cancel','PolExp') then 1 else 0 end desc
) as oa
group by t.PolicyId, t.EffDate, oa.NextEffDate
)
/* explode desired input by day, add row_numbers() */
, e as (
select pol.PolicyId, pol.Coverage, d.Date
, rn_x = row_number() over (
partition by pol.PolicyId
order by d.Date
)
, rn_y = row_number() over (
partition by pol.PolicyId, pol.Coverage
order by d.date)
from pol
inner join dates as d
on d.date >= pol.EffDate
and d.date < pol.NextEffDate
)
/* reduce to date ranges where Coverage = 1 */
select
PolicyId
, FromDate = convert(varchar(10),min(Date),120)
, ThruDate = convert(varchar(10),max(Date),120)
from e
where Coverage = 1
group by PolicyId, (rn_x-rn_y);
returns:
+----------+------------+------------+
| PolicyId | FromDate | ThruDate |
+----------+------------+------------+
| 1 | 2016-10-01 | 2016-10-05 |
| 1 | 2016-10-09 | 2016-10-31 |
+----------+------------+------------+

Related

Move Functions from Where Clause to Select Statement

I have a union query that runs abysmally slow I believe mostly because there are two functions in the where clause of each union. I am pretty sure that there is no getting around the unions, but there may be a way to move the functions from the where of each. I won't post ALL of the union sections because I don't think it is necessary as they are all almost identical with the exception of one table in each. The first function was created by someone else but it takes a date, and uses the "frequency" value like "years, months, days, etc." and the "interval" value like 3, 4, 90 to calculate the new "Due Date". For instance, a date of today with a frequency of years, and an interval of 3, would produce the date 4/21/2025. Here is the actual function:
ALTER FUNCTION [dbo].[ReturnExpiration_IntervalxFreq](#Date datetime2,#int int, #freq int)
RETURNS datetime2
AS
BEGIN
declare #d datetime2;
SELECT #d = case when #int = 1 then null-- '12-31-9999'
when #int = 2 then dateadd(day,#freq,#date)
when #int = 3 then dateadd(week,#freq,#date)
when #int = 4 then dateadd(month,#freq,#date)
when #int = 5 then dateadd(quarter,#freq,#date)
when #int = 6 then dateadd(year,#freq,#date)
end
RETURN #d;
The query itself is supposed to find and identify records whose Due Date has past or is within 90 days of the current date. Here is what each section of the union looks like
SELECT
R.RequirementId
, EC.EmployeeCompanyId
, EC.CompanyId
, DaysOverdue =
CASE WHEN
R.DueDate IS NULL
THEN
CASE WHEN
EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId AND RC.RequirementId = R.RequirementId AND RC.Active = 1 AND ((DATEDIFF(DAY, R.DueDate, GETDATE()) > -91 OR R.DueDate Is Null ) OR (DATEDIFF(DAY, dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), GETDATE()) > -91)) OR R.IntervalId IS NULL)
THEN
DateDiff(day,ISNULL(dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), '12/31/9999'),getdate())
ELSE
0
END
ELSE
DATEDIFF(day,R.DueDate,getdate())
END
,CASE WHEN
EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId AND RC.RequirementId = R.RequirementId AND RC.Active=1 AND (GETDATE() > dbo.ReturnExpiration_IntervalxFreq(RC.EffectiveDate, R.IntervalId, R.Frequency) OR R.IntervalId IS NULL))
THEN
CONVERT(VARCHAR(12),dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), 101)
ELSE
CONVERT(VARCHAR(12),R.DueDate,101)
END As DateDue
FROM
#Employees AS EC
INNER JOIN dbo.tbl_Training_Requirement_To_Position TRP ON TRP.PositionId = EC.PositionId
INNER JOIN #CompanyReqs R ON R.RequirementId = TRP.RequirementId
LEFT OUTER JOIN tbl_Training_Requirement_Compliance TRC ON TRC.EmployeeCompanyId = EC.EmployeeCompanyId AND TRC.RequirementId = R.RequirementId AND TRC.Active = 1
WHERE
NOT EXISTS(SELECT 1
FROM tbl_Training_Requirement_Compliance RC
WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId
AND RC.RequirementId = R.RequirementId
AND RC.Active = 1
)
OR (
(DATEDIFF(DAY, R.DueDate, GETDATE()) > -91
OR R.DueDate Is Null )
OR (DATEDIFF(DAY, dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), GETDATE()) > -91))
UNION...
It is supposed to exclude records that either don't exist at all on the tbl_Training_Requirement_Compliance table, or if they do exist, once the frequency an intervals have been calculated, would have a new due date that is within 90 days of the current date. I am hoping that someone with much more experience and expertise in SQL Server can show me a way, if possible, to remove the functions from the WHERE clause and help the performance of this stored procedure.

How To create an expiration date based off the next effective date

So I can not use a temp table to accomplish this. Basically I need to create an experation date based off the next effective date available.
So for one proc code i have 10 fee schedules with only effective dates. I tried writing this but its not working because it is giving for some records the same expiration date. I know an order by is needed but i placed it in various places and it still gives me the same issue. Any help is greatly appreciated
SELECT a.*,
( CASE
WHEN (SELECT TOP 1 b.effectivedate - 1
FROM ietl_profileprocedure b
WHERE b.effectivedate > a.effectivedate
AND a.profilesid = b.profilesid) IS NULL THEN (SELECT
Dateadd(mm, Datediff(mm, 0, Getdate()) + 1, -1))
ELSE (SELECT TOP 1 b.effectivedate - 1
FROM ietl_profileprocedure b
WHERE b.effectivedate > a.effectivedate
AND a.profilesid = b.profilesid)
END ) AS 'ExpDate'
FROM ietl_profileprocedure a
WHERE profilesid = '4197'
AND procedurecode = '90685'
Thank you Juan I looked up lead and lag and was able to come up with the code below to suit my needs! now i can have a date something was posted and find the amount that should have pertained to the ProcedureCode at that time more easily
SELECT s.ProfileSID,s.ProcedureCode,s.Amount,cast(s.EffectiveDate as date) as EffectiveDate ,
cast(isnull(dateadd(day,-1,LEAD(EffectiveDate) OVER (ORDER BY ProfileSID,ProcedureCode,EffectiveDate
)),getdate())as date) ExpDate
FROM iETL_ProfileProcedure s
WHERE ProfileSID IN ('4197')and ProcedureCode = '90685'

SQL Merging 4 Queries to one

Im having a slight issue merging the following statements
declare #From DATE
SET #From = '01/01/2014'
declare #To DATE
SET #To = '31/01/2014'
--ISSUED SB
SELECT
COUNT(pm.DateAppIssued) AS Issued,
pm.Lender,
pm.AmountRequested,
p.CaseTypeID
FROM BPS.dbo.tbl_Profile_Mortgage AS pm
INNER JOIN BPS.dbo.tbl_Profile AS p
ON pm.FK_ProfileId = p.Id
WHERE CaseTypeID = 2
AND (CONVERT(DATE,DateAppIssued, 103)
Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103))
And Lender > ''
GROUP BY pm.Lender,p.CaseTypeID,pm.AmountRequested;
--Paased
SELECT
COUNT(pm.DatePassed) AS Passed,
pm.Lender,
pm.AmountRequested,
p.CaseTypeID
FROM BPS.dbo.tbl_Profile_Mortgage AS pm
INNER JOIN BPS.dbo.tbl_Profile AS p
ON pm.FK_ProfileId = p.Id
WHERE CaseTypeID = 2
AND (CONVERT(DATE,DatePassed, 103)
Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103))
And Lender > ''
GROUP BY pm.Lender,p.CaseTypeID,pm.AmountRequested;
--Received
SELECT
COUNT(pm.DateAppRcvd) AS Received,
pm.Lender,
pm.AmountRequested,
p.CaseTypeID
FROM BPS.dbo.tbl_Profile_Mortgage AS pm
INNER JOIN BPS.dbo.tbl_Profile AS p
ON pm.FK_ProfileId = p.Id
WHERE CaseTypeID = 2
AND (CONVERT(DATE,DateAppRcvd, 103)
Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103))
And Lender > ''
GROUP BY pm.Lender,p.CaseTypeID,pm.AmountRequested;
--Offered
SELECT
COUNT(pm.DateOffered) AS Offered,
pm.Lender,
pm.AmountRequested,
p.CaseTypeID
FROM BPS.dbo.tbl_Profile_Mortgage AS pm
INNER JOIN BPS.dbo.tbl_Profile AS p
ON pm.FK_ProfileId = p.Id
WHERE CaseTypeID = 2
AND (CONVERT(DATE,DateOffered, 103)
Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103))
And Lender > ''
GROUP BY pm.Lender,p.CaseTypeID,pm.AmountRequested;
Ideally I would like the result of theses query's to show as follows
Issued, Passed , Offered, Received,
All in one table
Any Help on this would be greatly appreciated
Thanks
Rusty
I'm fairly certain in this case the query can be written without the use of any CASE statements, actually:
DECLARE #From DATE = '20140101'
declare #To DATE = '20140201'
SELECT Mortgage.lender, Mortgage.amountRequested, Profile.caseTypeId,
COUNT(Issue.issued) as issued,
COUNT(Pass.passed) as passed,
COUNT(Receive.received) as received,
COUNT(Offer.offered) as offered
FROM BPS.dbo.tbl_Profile_Mortgage as Mortgage
JOIN BPS.dbo.tbl_Profile as Profile
ON Mortgage.fk_profileId = Profile.id
AND Profile.caseTypeId = 2
LEFT JOIN (VALUES (1, #From, #To)) Issue(issued, rangeFrom, rangeTo)
ON Mortgage.DateAppIssued >= Issue.rangeFrom
AND Mortgage.DateAppIssued < Issue.rangeTo
LEFT JOIN (VALUES (2, #From, #To)) Pass(passed, rangeFrom, rangeTo)
ON Mortgage.DatePassed >= Pass.rangeFrom
AND Mortgage.DatePassed < Pass.rangeTo
LEFT JOIN (VALUES (3, #From, #To)) Receive(received, rangeFrom, rangeTo)
ON Mortgage.DateAppRcvd >= Receive.rangeFrom
AND Mortgage.DateAppRcvd < Receive.rangeTo
LEFT JOIN (VALUES (4, #From, #To)) Offer(offered, rangeFrom, rangeTo)
ON Mortgage.DateOffered >= Offer.rangeFrom
AND Mortgage.DateOffered < Offer.rangeTo
WHERE Mortgage.lender > ''
AND (Issue.issued IS NOT NULL
OR Pass.passed IS NOT NULL
OR Receive.received IS NOT NULL
OR Offer.offered IS NOT NULL)
GROUP BY Mortgage.lender, Mortgage.amountRequested, Profile.caseTypeId
(not tested, as I lack a provided data set).
... Okay, some explanations are in order, because some of this is slightly non-intuitive.
First off, read this blog entry for tips about dealing with date/time/timestamp ranges (interestingly, this also applies to all other non-integral types). This is why I modified the #To date - so the range could be safely queried without needing to convert types (and thus ignore indices). I've also made sure to choose a safe format - depending on how you're calling this query, this is a non issue (ie, parameterized queries taking an actual Date type are essentially format-less).
......
COUNT(Issue.issued) as issued,
......
LEFT JOIN (VALUES (1, #From, #To)) Issue(issued, rangeFrom, rangeTo)
ON Mortgage.DateAppIssued >= Issue.rangeFrom
AND Mortgage.DateAppIssued < Issue.rangeTo
.......
What's the difference between COUNT(*) and COUNT(<expression>)? If <expression> evaluates to null, it's ignored. Hence the LEFT JOINs; if the entry for the mortgage isn't in the given date range for the column, the dummy table doesn't attach, and there's no column to count. Unfortunately, I'm not sure how the interplay between the dummy table, LEFT JOIN, and COUNT() here will appear to the optimizer - the joins should be able to use indices, but I don't know if it's smart enough to be able to use that for the COUNT() here too....
(Issue.issued IS NOT NULL
OR Pass.passed IS NOT NULL
OR Receive.received IS NOT NULL
OR Offer.offered IS NOT NULL)
This is essentially telling it to ignore rows that don't have at least one of the columns. They wouldn't be "counted" in any case (well, they'd likely have 0) - there's no data for the function to consider - but they would show up in the results, which probably isn't what you want. I'm not sure if the optimizer is smart enough to use this to restrict which rows it operates over - that is, turn the JOIN conditions into a way to restrict the various date columns, as if they were in the WHERE clause too. If the query runs slow, try adding the date restrictions to the WHERE clause and see if it helps.
You could either as Dan Bracuk states use a union, or you could use a case-statement.
declare #From DATE = '01/01/2014'
declare #To DATE = '31/01/2014'
select
sum(case when (CONVERT(DATE,DateAppIssued, 103) Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103)) then 1 else 0 end) as Issued
, sum(case when (CONVERT(DATE,DatePassed, 103) Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103)) then 1 else 0 end) as Passed
, sum(case when (CONVERT(DATE,DateAppRcvd, 103) Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103)) then 1 else 0 end) as Received
, sum(case when (CONVERT(DATE,DateOffered, 103) Between CONVERT(DATE,#From,103) and CONVERT(DATE,#To,103)) then 1 else 0 end) as Offered
, pm.Lender
, pm.AmountRequested
, p.CaseTypeID
FROM BPS.dbo.tbl_Profile_Mortgage AS pm
INNER JOIN BPS.dbo.tbl_Profile AS p
ON pm.FK_ProfileId = p.Id
WHERE CaseTypeID = 2
And Lender > ''
GROUP BY pm.Lender,p.CaseTypeID,pm.AmountRequested;
Edit:
What I've done is looked at your queries.
All four queries have identical Where Clause, with the exception of the date comparison. Therefore I've created a new query, which selects all your data which might be used in one of the four counts.
The last clause; the data-comparison, is moved into a case statement, returning 1 if the row is between the selected date-range, and 0 otherwise. This basically indicates whether the row would be returned in your previous queries.
Therefore a sum of this column would return the equivalent of a count(*), with this date-comparison in the where-clause.
Edit 2 (After comments by Clockwork-muse):
Some notes on performance, (tested on MS-SQL 2012):
Changing BETWEEN to ">=" and "<" inside a case-statement does not affect the cost of the query.
Depending on the size of the table, the query might be optimized quite a lot, by adding the dates in the where clause.
In my sample data (~20.000.000 rows, spanning from 2001 to today), i got a 48% increase in speed by adding.
or (DateAppIssued BETWEEN #From and #to )
or (DatePassed BETWEEN #From and #to )
or (DateAppRcvd BETWEEN #From and #to )
or (DateOffered BETWEEN #From and #to )
(There were no difference using BETWEEN and ">=" and "<".)
It is also worth nothing that i got a 6% increase when changing the #From = '01/01/2014' to #From '2014-01-01' and thus omitting the convert().
Eg. an optimized query could be:
declare #From DATE = '2014-01-01'
declare #To DATE = '2014-01-31'
select
sum(case when (DateAppIssued >= #From and DateAppIssued < #To) then 1 else 0 end) as Issued
, sum(case when (DatePassed >= #From and DatePassed < #To) then 1 else 0 end) as Passed
, sum(case when (DateAppRcvd >= #From and DateAppRcvd < #To) then 1 else 0 end) as Received
, sum(case when (DateOffered >= #From and DateOffered < #To) then 1 else 0 end) as Offered
, pm.Lender
, pm.AmountRequested
, p.CaseTypeID
FROM BPS.dbo.tbl_Profile_Mortgage AS pm
INNER JOIN BPS.dbo.tbl_Profile AS p
ON pm.FK_ProfileId = p.Id
WHERE 1=1
and CaseTypeID = 2
and Lender > ''
and (
(DateAppIssued >= #From and DateAppIssued < #To)
or (DatePassed >= #From and DatePassed < #To)
or (DateAppRcvd >= #From and DateAppRcvd < #To)
or (DateOffered >= #From and DateOffered < #To)
)
GROUP BY pm.Lender,p.CaseTypeID,pm.AmountRequested;
I do however really like Clockwork-muse's answer, as I prefer joins to case-statements, where posible :)
The all-in-one queries here in other answers are certainly elegant, but if you are in a rush to get something working as a one-off, or if you agree the following approach is easy to read and maintain when you have to revisit it some time down the road (or someone else less skilled has to work out what's going on) - here's a skeleton of a Common Table Expression alternative which I believe is quite clear to read :
WITH Unioned_Four AS
( SELECT .. -- first select : Issued
UNION ALL
SELECT .. -- second : Passed
UNION ALL
SELECT .. -- Received
UNION ALL
SELECT .. -- Offered
)
SELECT
-- group fields
-- SUMs of the count fields
FROM Unioned_Four
GROUP BY .. -- etc
Obviously the fields have to match in the 4 parts of the UNION, requiring dummy fields returning zero in each one.
So you could have kept the simple approach that you started with, but wrapped it up as a derived table using the CTE syntax to allow you to have the four counts all on one row per GROUPing. Also if you have to add extra filtering to specific queries of the four, then it's easier to meddle with the individual SELECTs - the flipside being (of course) that further requirements for all four would need to be duplicated!

SQL to find overlapping time periods and sub-faults

Long time stalker, first time poster (and SQL beginner). My question is similar to this one SQL to find time elapsed from multiple overlapping intervals, except I'm able to use CTE, UDFs etc and am looking for more detail.
On a piece of large scale equipment I have a record of all faults that arise. Faults can arise on different sub-components of the system, some may take it offline completely (complete outage = yes), while others do not (complete outage = no). Faults can overlap in time, and may not have end times if the fault has not yet been repaired.
Outage_ID StartDateTime EndDateTime CompleteOutage
1 07:00 3-Jul-13 08:55 3-Jul13 Yes
2 08:30 3-Jul-13 10:00 4-Jul13 No
3 12:00 4-Jul-13 No
4 12:30 4-Jul13 12:35 4-Jul-13 No
1 |---------|
2 |---------|
3 |--------------------------------------------------------------
4 |---|
I need to be able to work out for a user defined time period, how long the total system is fully functional (no faults), how long its degraded (one or more non-complete outages) and how long inoperable (one or more complete outages). I also need to be able to work out for any given time period which faults were on the system. I was thinking of creating a "Stage Change" table anytime a fault is opened or closed, but I am stuck on the best way to do this - any help on this or better solutions would be appreciated!
This isn't a complete solution (I leave that as an exercise :)) but should illustrate the basic technique. The trick is to create a state table (as you say). If you record a 1 for a "start" event and a -1 for an "end" event then a running total in event date/time order gives you the current state at that particular event date/time. The SQL below is T-SQL but should be easily adaptable to whatever database server you're using.
Using your data for partial outage as an example:
DECLARE #Faults TABLE (
StartDateTime DATETIME NOT NULL,
EndDateTime DATETIME NULL
)
INSERT INTO #Faults (StartDateTime, EndDateTime)
SELECT '2013-07-03 08:30', '2013-07-04 10:00'
UNION ALL SELECT '2013-07-04 12:00', NULL
UNION ALL SELECT '2013-07-04 12:30', '2013-07-04 12:35'
-- "Unpivot" the events and assign 1 to a start and -1 to an end
;WITH FaultEvents AS (
SELECT *, Ord = ROW_NUMBER() OVER(ORDER BY EventDateTime)
FROM (
SELECT EventDateTime = StartDateTime, Evt = 1
FROM #Faults
UNION ALL SELECT EndDateTime, Evt = -1
FROM #Faults
WHERE EndDateTime IS NOT NULL
) X
)
-- Running total of Evt gives the current state at each date/time point
, FaultEventStates AS (
SELECT A.Ord, A.EventDateTime, A.Evt, [State] = (SELECT SUM(B.Evt) FROM FaultEvents B WHERE B.Ord <= A.Ord)
FROM FaultEvents A
)
SELECT StartDateTime = S.EventDateTime, EndDateTime = F.EventDateTime
FROM FaultEventStates S
OUTER APPLY (
-- Find the nearest transition to the no-fault state
SELECT TOP 1 *
FROM FaultEventStates B
WHERE B.[State] = 0
AND B.Ord > S.Ord
ORDER BY B.Ord
) F
-- Restrict to start events transitioning from the no-fault state
WHERE S.Evt = 1 AND S.[State] = 1
If you are using SQL Server 2012 then you have the option to calculate the running total using a windowing function.
The below is a rough guide to getting this working. It will compare against an interval table of dates and an interval table of 15 mins. It will then sum the outage events (1 event per interval), but not sum a partial outage if there is a full outage.
You could use a more granular time interval if you needed, I choose 15 mins for speed of coding.
I already had a date interval table set up "CAL.t_Calendar" so you would need to create one of your own to run this code.
Please note, this does not represent actual code you should use. It is only intended as a demonstration and to point you in a possible direction...
EDIT I've just realised I have't accounted for the null end dates. The code will need amending to check for NULL endDates and use #EndDate or GETDATE() if #EndDate is in the future
--drop table ##Events
CREATE TABLE #Events (OUTAGE_ID INT IDENTITY(1,1) PRIMARY KEY
,StartDateTime datetime
,EndDateTime datetime
, completeOutage bit)
INSERT INTO #Events VALUES ('2013-07-03 07:00','2013-07-03 08:55',1),('2013-07-03 08:30','2013-07-04 10:00',0)
,('2013-07-04 12:00',NULL,0),('2013-07-04 12:30','2013-07-04 12:35',0)
--drop table #FiveMins
CREATE TABLE #FiveMins (ID int IDENTITY(1,1) PRIMARY KEY, TimeInterval Time)
DECLARE #Time INT = 0
WHILE #Time <= 1410 --number of 15 min intervals in day * 15
BEGIN
INSERT INTO #FiveMins SELECT DATEADD(MINUTE , #Time, '00:00')
SET #Time = #Time + 15
END
SELECT * from #FiveMins
DECLARE #StartDate DATETIME = '2013-07-03'
DECLARE #EndDate DATETIME = '2013-07-04 23:59:59.999'
SELECT SUM(FullOutage) * 15 as MinutesFullOutage
,SUM(PartialOutage) * 15 as MinutesPartialOutage
,SUM(NoOutage) * 15 as MinutesNoOutage
FROM
(
SELECT DateAnc.EventDateTime
, CASE WHEN COUNT(OU.OUTAGE_ID) > 0 THEN 1 ELSE 0 END AS FullOutage
, CASE WHEN COUNT(OU.OUTAGE_ID) = 0 AND COUNT(pOU.OUTAGE_ID) > 0 THEN 1 ELSE 0 END AS PartialOutage
, CASE WHEN COUNT(OU.OUTAGE_ID) > 0 OR COUNT(pOU.OUTAGE_ID) > 0 THEN 0 ELSE 1 END AS NoOutage
FROM
(
SELECT CAL.calDate + MI.TimeInterval AS EventDateTime
FROM CAL.t_Calendar CAL
CROSS JOIN #FiveMins MI
WHERE CAL.calDate BETWEEN #StartDate AND #EndDate
) DateAnc
LEFT JOIN #Events OU
ON DateAnc.EventDateTime BETWEEN OU.StartDateTime AND OU.EndDateTime
AND OU.completeOutage = 1
LEFT JOIN #Events pOU
ON DateAnc.EventDateTime BETWEEN pOU.StartDateTime AND pOU.EndDateTime
AND pOU.completeOutage = 0
GROUP BY DateAnc.EventDateTime
) AllOutages

SQL grouping and running total of open items for a date range

I have a table of items that, for sake of simplicity, contains the ItemID, the StartDate, and the EndDate for a list of items.
ItemID StartDate EndDate
1 1/1/2011 1/15/2011
2 1/2/2011 1/14/2011
3 1/5/2011 1/17/2011
...
My goal is to be able to join this table to a table with a sequential list of dates,
and say both how many items are open on a particular date, and also how many items are cumulatively open.
Date ItemsOpened CumulativeItemsOpen
1/1/2011 1 1
1/2/2011 1 2
...
I can see how this would be done with a WHILE loop,
but that has performance implications. I'm wondering how
this could be done with a set-based approach?
SELECT COUNT(CASE WHEN d.CheckDate = i.StartDate THEN 1 ELSE NULL END)
AS ItemsOpened
, COUNT(i.StartDate)
AS ItemsOpenedCumulative
FROM Dates AS d
LEFT JOIN Items AS i
ON d.CheckDate BETWEEN i.StartDate AND i.EndDate
GROUP BY d.CheckDate
This may give you what you want
SELECT DATE,
SUM(ItemOpened) AS ItemsOpened,
COUNT(StartDate) AS ItemsOpenedCumulative
FROM
(
SELECT d.Date, i.startdate, i.enddate,
CASE WHEN i.StartDate = d.Date THEN 1 ELSE 0 END AS ItemOpened
FROM Dates d
LEFT OUTER JOIN Items i ON d.Date BETWEEN i.StartDate AND i.EndDate
) AS x
GROUP BY DATE
ORDER BY DATE
This assumes that your date values are DATE data type. Or, the dates are DATETIME with no time values.
You may find this useful. The recusive part can be replaced with a table. To demonstrate it works I had to populate some sort of date table. As you can see, the actual sql is short and simple.
DECLARE #i table (itemid INT, startdate DATE, enddate DATE)
INSERT #i VALUES (1,'1/1/2011', '1/15/2011')
INSERT #i VALUES (2,'1/2/2011', '1/14/2011')
INSERT #i VALUES (3,'1/5/2011', '1/17/2011')
DECLARE #from DATE
DECLARE #to DATE
SET #from = '1/1/2011'
SET #to = '1/18/2011'
-- the recusive sql is strictly to make a datelist between #from and #to
;WITH cte(Date)
AS (
SELECT #from DATE
UNION ALL
SELECT DATEADD(day, 1, DATE)
FROM cte ch
WHERE DATE < #to
)
SELECT cte.Date, sum(case when cte.Date=i.startdate then 1 else 0 end) ItemsOpened, count(i.itemid) ItemsOpenedCumulative
FROM cte
left join #i i on cte.Date between i.startdate and i.enddate
GROUP BY cte.Date
OPTION( MAXRECURSION 0)
If you are on SQL Server 2005+, you could use a recursive CTE to obtain running totals, with the additional help of the ranking function ROW_NUMBER(), like this:
WITH grouped AS (
SELECT
d.Date,
ItemsOpened = COUNT(i.ItemID),
rn = ROW_NUMBER() OVER (ORDER BY d.Date)
FROM Dates d
LEFT JOIN Items i ON d.Date BETWEEN i.StartDate AND i.EndDate
GROUP BY d.Date
WHERE d.Date BETWEEN #FilterStartDate AND #FilterEndDate
),
cumulative AS (
SELECT
Date,
ItemsOpened,
ItemsOpenedCumulative = ItemsOpened
FROM grouped
WHERE rn = 1
UNION ALL
SELECT
g.Date,
g.ItemsOpened,
ItemsOpenedCumulative = g.ItemsOpenedCumulative + c.ItemsOpened
FROM grouped g
INNER JOIN cumulative c ON g.Date = DATEADD(day, 1, c.Date)
)
SELECT *
FROM cumulative