Weekly aggregate with CTE not behaving as expected

I have this USERS table with users that can be of two different types (A and B). I need to show a report with the aggregate per type for each week. The query I have so far works well except some weeks are not grouping properly. In the example below, the week starting Jan 28th should have one line, not two.
Week Starts |Week| Type A | Type B
------------+----+--------+------
2013-02-04 | 14 | 2 | 26
2013-01-28 | 13 | 5 | 191
2013-01-28 | 13 | 0 | 24
2013-01-21 | 12 | 1 | 134
2013-01-21 | 12 | 0 | 20
2013-01-14 | 11 | 1 | 143
2013-01-14 | 11 | 0 | 2
2013-01-07 | 10 | 0 | 233
2013-01-07 | 10 | 0 | 23
2012-12-31 | 9 | 0 | 12
2012-12-31 | 9 | 4 | 164
2012-12-31 | 9 | 0 | 20
SQL
;with cte as
(
select DATEADD(m,-3,GETDATE()) firstday, DATEADD(m,-3,GETDATE()) + 6 - DATEDIFF(day, 0, DATEADD(m,-3,GETDATE())) %7 lastday, 1 week
union all
select lastday + 1, case when GETDATE() < lastday + 7 then GETDATE() else lastday + 7 end, week + 1
from cte
where lastday < GETDATE()
)
SELECT
cast(firstday as date) 'Week Starts',
cte.week as 'Week',
Sum(CASE WHEN USR_TYPE = 'A' THEN 1 ELSE 0 END) As 'Type A',
Sum(CASE WHEN USR_TYPE = 'B' THEN 1 ELSE 0 END) As 'Type B'
FROM cte left join USERS
ON cte.firstday <= USERS.CREATED
AND cte.lastday > USERS.CREATED
GROUP BY cte.week, cte.firstday, cte.lastday, DATEPART(YEAR,USERS.CREATED), DATEPART(wk,USERS.CREATED)
ORDER BY week desc
What am I doing wrong?

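Without seeing any data from your USERS table, I am going to take a guess.
GETDATE() returns a datetime, so the firstday and lastday values generated in the CTE carry the current time of day, and each week boundary falls in the middle of a day rather than at midnight.
You might need to cast() your firstday and lastday values to date, or generate the list without a time component.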
See a SQL Fiddle Demo
Sample from your CTE and the new dates cast:
| CASTFIRSTDAY | CASTLASTDAY | WEEK | FIRSTDAY | LASTDAY |
---------------------------------------------------------------------------------------------------------
| 2012-11-05 | 2012-11-11 | 1 | November, 05 2012 20:08:10+0000 | November, 11 2012 20:08:10+0000 |
| 2012-11-12 | 2012-11-18 | 2 | November, 12 2012 20:08:10+0000 | November, 18 2012 20:08:10+0000 |
| 2012-11-19 | 2012-11-25 | 3 | November, 19 2012 20:08:10+0000 | November, 25 2012 20:08:10+0000 |
| 2012-11-26 | 2012-12-02 | 4 | November, 26 2012 20:08:10+0000 | December, 02 2012 20:08:10+0000 |
| 2012-12-03 | 2012-12-09 | 5 | December, 03 2012 20:08:10+0000 | December, 09 2012 20:08:10+0000 |
| 2012-12-10 | 2012-12-16 | 6 | December, 10 2012 20:08:10+0000 | December, 16 2012 20:08:10+0000 |
You might want to edit your CTE to return date-only values:
;with cte as
(
select
cast(DATEADD(m,-3,GETDATE()) as date) firstday,
cast(DATEADD(m,-3,GETDATE()) + 6 - DATEDIFF(day, 0, DATEADD(m,-3,GETDATE())) %7 as DATE) lastday,
1 week
union all
select
cast(DATEADD(DAY, 1, lastday) as date),
case
when cast(GETDATE() as date) < cast(DATEADD(DAY, 7, lastday) as date)
then cast(GETDATE() as date)
else cast(DATEADD(DAY, 7, lastday) as date)
end,
week + 1
from cte
where cast(lastday as date) < cast(GETDATE() as date)
)
select *
from cte
See SQL Fiddle with Demo
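As a rough illustration of why date-only boundaries help, here is a minimal sketch in Python using the standard sqlite3 module (SQLite syntax, not T-SQL). The table contents, the `weeks` CTE, and the `nextweek` column are made up for the example. Each week is a half-open range [firstday, nextweek) that starts at midnight, and the GROUP BY uses only the week columns, so every week produces exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (usr_type TEXT, created TEXT);
-- hypothetical sample rows, including one late on the last day of a week
INSERT INTO users VALUES
  ('A', '2013-01-28 09:15:00'),
  ('B', '2013-01-30 23:59:00'),
  ('B', '2013-02-03 18:30:00'),
  ('B', '2013-02-04 00:00:00');
""")

rows = conn.execute("""
WITH RECURSIVE weeks(firstday, nextweek, week) AS (
    -- anchor: date-only boundaries, no time of day
    SELECT date('2013-01-28'), date('2013-01-28', '+7 days'), 1
    UNION ALL
    SELECT nextweek, date(nextweek, '+7 days'), week + 1
    FROM weeks
    WHERE week < 2
)
SELECT w.firstday, w.week,
       SUM(CASE WHEN u.usr_type = 'A' THEN 1 ELSE 0 END) AS type_a,
       SUM(CASE WHEN u.usr_type = 'B' THEN 1 ELSE 0 END) AS type_b
FROM weeks w
LEFT JOIN users u
       ON u.created >= w.firstday   -- inclusive lower bound
      AND u.created <  w.nextweek   -- exclusive upper bound
GROUP BY w.firstday, w.week         -- group by the week only
ORDER BY w.week DESC
""").fetchall()

for row in rows:
    print(row)
```

The user created at 18:30 on the last day of the first week still lands in that week, because the upper bound is the next week's midnight rather than a mid-day timestamp.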

Related

How to insert a Calender Items into Oracle DB records? [duplicate]

I'm trying to insert a full-year calendar into an Oracle DB table.
My columns are:
----------------------------------------------------------------
| [FULL_DATE] | [DAY] | [MONTH_NAME] | [MONTH_NUMBER] | [YEAR] |
----------------------------------------------------------------
Function
(
@DATEFROM AS DATE
@DATETO AS DATE
) RETURNS DATE
AS
BEGIN
set @datefrom = '01/01/1995'
set @dateto = '31/12/1996'
while(@datefrom < @dateto)
BEGIN set @datefrom = DATEADD(day , 1 , @datefrom)
insert into SHEMA.DIM_TIME_TABLE ( FULL_DATE , DAY , MONTH , YEAR ) select DAY(GETDATE(@datefrom)) , DATENAME(MONTH , @datefrom), MONTH(GETDATE(@datefrom)) , YEAR(GETDATE(@datefrom))
END
RETURN
END
EXPECTED :
---------------------------------------------------------------
01 / 01 /1995 | 01 | JAN | 01 | 1995
---------------------------------------------------------------
02 / 01 /1995 | 02 | JAN | 01 | 1995
---------------------------------------------------------------
03 / 01 /1995 | 03 | JAN | 01 | 1995
In Oracle, you can use a hierarchical query (connect by with level) to generate the date series, and then generate the expected columns in the outer query:
create table dim_time_table as
select
dt full_date,
extract(day from dt) day,
to_char(dt, 'month') month_name,
extract(month from dt) month_number,
extract(year from dt) year
from (
select to_date('1995-01-01', 'yyyy-mm-dd') + level - 1 as dt
from dual
connect by
to_date('1995-01-01', 'yyyy-mm-dd') + level
<= to_date('1997-01-01', 'yyyy-mm-dd')
)
Demo on DB Fiddle:
FULL_DATE | DAY | MONTH_NAME | MONTH_NUMBER | YEAR
:-------- | --: | :--------- | -----------: | ---:
01-JAN-95 | 1 | january | 1 | 1995
02-JAN-95 | 2 | january | 1 | 1995
03-JAN-95 | 3 | january | 1 | 1995
04-JAN-95 | 4 | january | 1 | 1995
05-JAN-95 | 5 | january | 1 | 1995
06-JAN-95 | 6 | january | 1 | 1995
07-JAN-95 | 7 | january | 1 | 1995
...
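To sanity-check the row count and column values outside the database, the same calendar can be sketched in plain Python. This is only an illustration, not a replacement for the Oracle query above; the function name `calendar_rows` and the `MONTHS` tuple are made up for the example.

```python
from datetime import date, timedelta

# Fixed abbreviations so the output does not depend on the system locale.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

def calendar_rows(start, end):
    """Yield (full_date, day, month_name, month_number, year) for start..end inclusive."""
    current = start
    while current <= end:
        yield (current, current.day, MONTHS[current.month - 1],
               current.month, current.year)
        current += timedelta(days=1)

rows = list(calendar_rows(date(1995, 1, 1), date(1996, 12, 31)))
print(len(rows))   # 365 days in 1995 plus 366 in leap year 1996
print(rows[0])
```

The two years span 731 days, which is also the number of rows the connect by query should produce for the same range.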

Showing date even zero value SQL

I have SQL Query:
SELECT Date, Hours, Counts FROM TRANSACTION_DATE
Example Output:
Date | Hours | Counts
----------------------------------
01-Feb-2018 | 20 | 5
03-Feb-2018 | 25 | 3
04-Feb-2018 | 22 | 3
05-Feb-2018 | 21 | 2
07-Feb-2018 | 28 | 1
10-Feb-2018 | 23 | 1
As you can see, some days are missing because there is no data for them, but I want the missing days to be shown with a value of zero:
Date | Hours | Counts
----------------------------------
01-Feb-2018 | 20 | 5
02-Feb-2018 | 0 | 0
03-Feb-2018 | 25 | 3
04-Feb-2018 | 22 | 3
05-Feb-2018 | 21 | 2
06-Feb-2018 | 0 | 0
07-Feb-2018 | 28 | 1
08-Feb-2018 | 0 | 0
09-Feb-2018 | 0 | 0
10-Feb-2018 | 23 | 1
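Thank you in advance.
You need to generate a sequence of dates. If there are not too many (SQL Server stops a recursive CTE after 100 levels by default, which OPTION (MAXRECURSION n) can raise), a recursive CTE is an easy method: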
with dates as (
select min(date) as dte, max(date) as last_date
from transaction_date td
union all
select dateadd(day, 1, dte), last_date
from dates
where dte < last_date
)
select d.dte as date, coalesce(td.hours, 0) as hours, coalesce(td.counts, 0) as counts
from dates d left join
transaction_date td
on d.dte = td.date;
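Here is a small runnable sketch of the same idea, using Python's sqlite3 module with the sample data from the question. It uses SQLite syntax (`WITH RECURSIVE`, `date(x, '+1 day')`) rather than T-SQL, and the dates are stored as ISO strings, which is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transaction_date (date TEXT, hours INTEGER, counts INTEGER);
INSERT INTO transaction_date VALUES
  ('2018-02-01', 20, 5), ('2018-02-03', 25, 3), ('2018-02-04', 22, 3),
  ('2018-02-05', 21, 2), ('2018-02-07', 28, 1), ('2018-02-10', 23, 1);
""")

rows = conn.execute("""
WITH RECURSIVE dates(dte, last_date) AS (
    -- anchor: first and last date present in the data
    SELECT MIN(date), MAX(date) FROM transaction_date
    UNION ALL
    -- step: advance one day until the last date is reached
    SELECT date(dte, '+1 day'), last_date
    FROM dates
    WHERE dte < last_date
)
SELECT d.dte, COALESCE(td.hours, 0), COALESCE(td.counts, 0)
FROM dates d
LEFT JOIN transaction_date td ON td.date = d.dte
ORDER BY d.dte
""").fetchall()

for row in rows:
    print(row)
```

The left join keeps every generated day, and COALESCE turns the NULLs of the missing days into zeros.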

Return two values with CASE in SQL Server

Here is the problem: in my table I have some clients in this format:
id | year | Month | Amount
---+------+-------+--------
1 | 2016 | 02 | 250.00
2 | 2013 | 08 | 350.00
3 | 2015 | 12 | 450.00
4 | 2016 | 02 | 750.00
In my other table, I have a column ClientStartDate in this format: 2015-12-15 00:00:00.000. If the client start date is the 15th day of the month, then the table above needs two records for that client: the first for the month in which the client started, and the second for the same client but for the next month. For example, if the client with id=3 started on 2015-12-15, the table should look like this:
id | year | Month | Amount
---+------+-------+--------
1 | 2016 | 02 | 250.00
2 | 2013 | 08 | 350.00
3 | 2015 | 12 | 450.00
3 | 2016 | 01 | 150.00
4 | 2016 | 02 | 750.00
This is table where ClientStartDate comes from:
id | Name | ClientStartDate
---+--------+-------------------------
1 | John | 2016-02-01 00:00:00.000
2 | Anna | 2013-08-01 00:00:00.000
3 | Mike | 2015-12-15 00:00:00.000
4 | Nicolas| 2016-02-04 00:00:00.000
5 | Monika | 2013-11-15 00:00:00.000
This is FactTrans table where DateKey comes from:
id | amount | DateKey
----+--------+----------
1 | 208.67 | 20160201
1 | 19.12 | 20160205
2 | 55.42 | 20130820
2 | 5.42 | 20130811
4 | 23.98 | 20151121
5 | 17.99 | 20140820
Here is full code that I tried:
select
t1.ID,
left(cast(t1.datekey as varchar), 4) as Year,
left(right(cast(t1.datekey as varchar), 4), 2) as Month,
sum(t1.amount) as SumAmount
from
.dbo.FactTrans t1
inner join
dbo.Client t2 on t2.clientid = t1.clientid
where
(left(cast(t1.datekey as varchar), 4) = year(t2.clientstartdate)
and left(right(cast(t1.datekey as varchar), 4), 2) = month(t2.clientstartdate))
or
(case
when left(right(cast(t1.datekey as varchar), 4), 2)= 1
then 12
else left(right(cast(t1.datekey as varchar), 4), 2) - 1
end = month(t2.ClientStartDate)
and
case
when
month(t2.ClientStartDate) = 12
then
LEFT(cast(t1.datekey as varchar), 4) - 1
else
LEFT(cast(t1.datekey as varchar), 4)
end
= year(t2.ClientStartDate)
and
case
when
day(t2.ClientStartDate) = 15
then
month(t2.ClientStartDate) + 1
else
month(t2.ClientStartDate)
end
= left(cast(t1.datekey as varchar), 4)
)
group by t1.clientid, LEFT(cast(t1.datekey as varchar), 4) ,left(right(cast(t1.datekey as varchar), 4), 2)
order by t1.clientID
This CASE covers the situation where day = 15 but month != 12, so here I only need to return the next month and not the next year. My question is: can someone help me write the CASE for day = 15 and month = 12, where I need to return both the next month and the next year for that client?
This CASE is in the WHERE clause.

Get count for each time row appears in the range between 2 dates

I am trying to calculate how many times a row "appears" in the range between 2 dates, grouped by month.
So, let's say I have rows that look like this:
Name | StartDate | EndDate
-----------|-----------------|------------
Mathias | 2017-01-01 | 2017-04-01
Lucas | 2017-01-01 | 2017-04-01
I would like output that shows how many records exist in each month between the 2 dates, something like the following:
Count | Year | Month
-----------|-----------------|------------
2 | 2017 | 1
2 | 2017 | 2
2 | 2017 | 3
2 | 2017 | 4
0 | 2017 | 5
0 | 2017 | 6
What I've tried is:
SELECT COUNT(*) as COUNT, YEAR(StartDate) YEAR, MONTH(StartDate) MONTH
FROM NamesTable
WHERE StartDate >= '2017-01-01 00:00:00'
AND EndDate <= '2017-06-01 00:00:00'
group by YEAR(StartDate), MONTH(StartDate)
but this gives me the following output:
Count | Year | Month
-----------|-----------------|------------
2 | 2017 | 1
0 | 2017 | 2
0 | 2017 | 3
0 | 2017 | 4
0 | 2017 | 5
0 | 2017 | 6
Because the query groups by the start date, how can I count a row in every month that it spans?
You need a table with the month ranges:
Table allMonths
+---------+------------+------------+
| monthId | StartDate | EndDate |
+---------+------------+------------+
| 1 | 2017-01-01 | 2017-02-01 |
| 2 | 2017-02-01 | 2017-03-01 |
| 3 | 2017-03-01 | 2017-04-01 |
| 4 | 2017-04-01 | 2017-05-01 |
| 5 | 2017-05-01 | 2017-06-01 |
| 6 | 2017-06-01 | 2017-07-01 |
| 7 | 2017-07-01 | 2017-08-01 |
| 8 | 2017-08-01 | 2017-09-01 |
| 9 | 2017-09-01 | 2017-10-01 |
| 10 | 2017-10-01 | 2017-11-01 |
| 11 | 2017-11-01 | 2017-12-01 |
| 12 | 2017-12-01 | 2018-01-01 |
+---------+------------+------------+
Then your query is:
SELECT am.startDate, COUNT(y.Name)
FROM allMonths am
LEFT JOIN yourTable y
ON am.StartDate <= y.EndDate
AND am.EndDate >= y.StartDate
GROUP BY am.startDate
NOTE: You need to check border cases. You may need to change >= to >, or change EndDate to the last day of the month.
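To make the border cases concrete, here is a minimal sketch using Python's sqlite3 module with the two rows from the question. It is SQLite syntax rather than SQL Server, the month list is generated with a recursive CTE instead of a physical table, and the `names`, `months`, and `next_month_start` identifiers are made up for the example. Each month is treated as a half-open range [month_start, next_month_start), so the comparison against the next month's first day is strict.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE names (name TEXT, start_date TEXT, end_date TEXT);
INSERT INTO names VALUES
  ('Mathias', '2017-01-01', '2017-04-01'),
  ('Lucas',   '2017-01-01', '2017-04-01');
""")

rows = conn.execute("""
WITH RECURSIVE months(month_start, next_month_start) AS (
    SELECT date('2017-01-01'), date('2017-01-01', '+1 month')
    UNION ALL
    SELECT next_month_start, date(next_month_start, '+1 month')
    FROM months
    WHERE next_month_start < '2017-07-01'   -- stop after June
)
SELECT COUNT(n.name),
       CAST(strftime('%Y', m.month_start) AS INTEGER),
       CAST(strftime('%m', m.month_start) AS INTEGER)
FROM months m
LEFT JOIN names n
       ON n.start_date <  m.next_month_start  -- strict: the next month's first day is excluded
      AND n.end_date   >= m.month_start       -- inclusive: ending on the 1st still counts
GROUP BY m.month_start
ORDER BY m.month_start
""").fetchall()

for row in rows:
    print(row)
```

With `<=` instead of `<` on the first condition, a row starting on 2017-02-01 would also be counted in January, which is the kind of border case the note warns about.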
So, what I ended up doing was something like what Juan Carlos proposed, but instead of creating a table I built the month list with a CTE for a cleaner approach:
Declare @todate datetime, @fromdate datetime, @firstOfMonth datetime, @lastOfMonth datetime
Select
@fromdate='2017-01-11',
@todate='2017-12-21',
@firstOfMonth = DATEADD(month, DATEDIFF(month, 0, @fromdate), 0), ----YEAR(@fromdate) + MONTH(@fromdate) + DAY(1),
@lastOfMonth = DATEADD(month, ((YEAR(@fromdate) - 1900) * 12) + MONTH(@fromdate), -1)
;with MonthTable (MonthId, StartOfMonth, EndOfMonth) as
(
SELECT MONTH(@firstOfMonth) as MonthId, @firstOfMonth as StartOfMonth, @lastOfMonth as EndOfMonth
UNION ALL
SELECT MONTH(DATEADD(MONTH, 1, StartOfMonth)), DATEADD(MONTH, 1, StartOfMonth), DATEADD(MONTH, 1, EndOfMonth)
FROM MonthTable
WHERE StartOfMonth <= @todate
)
SELECT am.StartOfMonth, COUNT(y.Start) as count
FROM MonthTable am
left JOIN clientList y
ON y.Start <= am.StartOfMonth
AND y.End >= am.EndOfMonth
GROUP BY am.StartOfMonth

Select rows with no date range overlap

Imagine the following Loans table:
BorrowerID StartDate DueDate
=============================================
1 2012-09-02 2012-10-01
2 2012-10-05 2012-10-21
3 2012-11-07 2012-11-09
4 2012-12-01 2013-01-01
4 2012-12-01 2013-01-14
1 2012-12-20 2013-01-06
3 2013-01-07 2013-01-22
3 2013-01-15 2013-01-18
1 2013-02-20 2013-02-24
How would I go about selecting the distinct BorrowerIDs of those who have only ever taken out a single loan at a time? This includes borrowers who have only ever taken out a single loan, as well as those who have taken out more than one, provided if you were to draw a time line of their loans, none of them would overlap. For example, in the table above, it should find borrowers 1 and 2 only.
I've tried experimenting with joining the table to itself, but haven't really managed to get anywhere. Any pointers much appreciated!
Solution for dbo.Loan with PRIMARY KEY
To solve this you need a two step approach as detailed in the following SQL Fiddle. I did add a LoanId column to your example data and the query requires that such a unique id exists. If you don't have that, you need to adjust the join clause to make sure that a loan does not get matched to itself.
MS SQL Server 2008 Schema Setup:
CREATE TABLE dbo.Loans
(LoanID INT, [BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO
INSERT INTO dbo.Loans
(LoanID, [BorrowerID], [StartDate], [DueDate])
VALUES
(1, 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
(2, 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
(3, 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
(4, 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
(5, 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
(6, 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
(7, 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
(8, 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
(9, 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO
First you need to find out which loans overlap with another loan. The query uses <= to compare the start and due dates. That counts loans where the second one starts the same day the first one ends as overlapping. If you need those to not be overlapping use < instead in both places.
Query 1:
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1;
Results:
| LOANID | BORROWERID | STARTDATE | DUEDATE | HASOVERLAPPINGLOAN |
|--------|------------|----------------------------------|---------------------------------|--------------------|
| 1 | 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 0 |
| 2 | 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 0 |
| 3 | 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 0 |
| 4 | 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 |
| 5 | 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 |
| 6 | 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 0 |
| 7 | 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 |
| 8 | 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 |
| 9 | 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 0 |
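The overlap test in Query 1 boils down to one predicate on two date ranges. Here is a minimal Python sketch of it; the function name `overlaps` and the `inclusive` flag are made up for the example. The flag switches between the `<=` comparison used in the query and the strict `<` variant.

```python
from datetime import date

def overlaps(start1, due1, start2, due2, inclusive=True):
    """Return True if the ranges [start1, due1] and [start2, due2] overlap.

    inclusive=True mirrors the <= comparison in Query 1, so a loan that
    starts on the day another one is due counts as overlapping.
    inclusive=False mirrors the < variant, where touching loans do not overlap.
    """
    if inclusive:
        return start1 <= due2 and start2 <= due1
    return start1 < due2 and start2 < due1

# Loans 7 and 8 from the sample data: the second lies inside the first.
print(overlaps(date(2013, 1, 7), date(2013, 1, 22),
               date(2013, 1, 15), date(2013, 1, 18)))

# Hypothetical touching loans: one is due on the day the other starts.
print(overlaps(date(2013, 1, 1), date(2013, 1, 10),
               date(2013, 1, 10), date(2013, 1, 20)))
print(overlaps(date(2013, 1, 1), date(2013, 1, 10),
               date(2013, 1, 10), date(2013, 1, 20), inclusive=False))
```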
Now, with that information you can determine the borrowers that have no overlapping loans with this query:
Query 2:
WITH OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1
),
OverlappingBorrower AS (
SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
FROM OverlappingLoans
GROUP BY BorrowerID
)
SELECT *
FROM OverlappingBorrower
WHERE hasOverlappingLoan = 0;
Results:
| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
| 1 | 0 |
| 2 | 0 |
Or you could even get more information by counting the loans, as well as the number of loans that overlap other loans, for each borrower in the database. (Note: if loan A and loan B overlap, both are counted as overlapping loans by this query.)
Query 3:
WITH OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1
)
SELECT BorrowerID,COUNT(1) LoanCount, SUM(hasOverlappingLoan) OverlappingCount
FROM OverlappingLoans
GROUP BY BorrowerID;
Results:
| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
| 1 | 3 | 0 |
| 2 | 1 | 0 |
| 3 | 3 | 2 |
| 4 | 2 | 2 |
Solution for dbo.Loan without PRIMARY KEY
UPDATE: As the requirement actually calls for a solution that does not rely on a unique identifier for each loan, I made the following changes:
1) I added a borrower that has two loans with the same start and due dates
MS SQL Server 2008 Schema Setup:
CREATE TABLE dbo.Loans
([BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO
INSERT INTO dbo.Loans
([BorrowerID], [StartDate], [DueDate])
VALUES
( 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
( 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
( 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
( 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
( 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
( 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
( 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
( 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
( 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO
2) Those "equal date" loans require an additional step:
Query 1:
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate;
Results:
| BORROWERID | STARTDATE | DUEDATE | LOANCOUNT |
|------------|----------------------------------|---------------------------------|-----------|
| 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 1 |
| 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 1 |
| 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 1 |
| 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 1 |
| 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 1 |
| 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 |
| 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 |
| 5 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 2 |
3) Now, with each loan range unique, we can use the old technique again. However, we also need to account for those "equal date" loans. (L1.StartDate <> L2.StartDate OR L1.DueDate <> L2.DueDate) prevents a loan getting matched with itself. OR LoanCount > 1 accounts for "equal date" loans.
Query 2:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
)
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1;
Results:
| BORROWERID | STARTDATE | DUEDATE | LOANCOUNT | HASOVERLAPPINGLOAN |
|------------|----------------------------------|---------------------------------|-----------|--------------------|
| 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 1 | 0 |
| 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 1 | 0 |
| 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 1 | 0 |
| 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 1 | 0 |
| 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 1 | 0 |
| 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 | 1 |
| 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 | 1 |
| 5 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 2 | 1 |
This query logic did not change (other than switching out the beginning).
Query 3:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
),
OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1
),
OverlappingBorrower AS (
SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
FROM OverlappingLoans
GROUP BY BorrowerID
)
SELECT *
FROM OverlappingBorrower
WHERE hasOverlappingLoan = 0;
Results:
| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
| 1 | 0 |
| 2 | 0 |
4) In this counting query we need to incorporate the "equal date" loan counts again. For that we use SUM(LoanCount) instead of a plain COUNT. We also have to multiply hasOverlappingLoan with the LoanCount to get the correct overlapping count again.
Query 4:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
),
OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1
)
SELECT BorrowerID,SUM(LoanCount) LoanCount, SUM(hasOverlappingLoan*LoanCount) OverlappingCount
FROM OverlappingLoans
GROUP BY BorrowerID;
Results:
| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
| 1 | 3 | 0 |
| 2 | 1 | 0 |
| 3 | 3 | 2 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
I strongly suggest finding a way to use my first solution, as a loan table without a primary key is a, let's say "odd" design. However, if you really can't get there, use the second solution.
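For readers who want to check the logic of the second solution outside the database, here is a rough Python sketch of the same steps: collapse identical rows into a count (the NormalizedLoans step), flag any range that either has a count above one or overlaps a different range of the same borrower, and return the borrowers that were never flagged. The function name is made up for the example, and dates are ISO strings so that plain string comparison orders them correctly.

```python
from collections import Counter

# Sample data from the second solution, including borrower 5's duplicate loan.
loans = [
    (1, "2012-09-02", "2012-10-01"), (2, "2012-10-05", "2012-10-21"),
    (3, "2012-11-07", "2012-11-09"), (4, "2012-12-01", "2013-01-01"),
    (4, "2012-12-01", "2013-01-14"), (1, "2012-12-20", "2013-01-06"),
    (3, "2013-01-07", "2013-01-22"), (3, "2013-01-15", "2013-01-18"),
    (1, "2013-02-20", "2013-02-24"), (5, "2013-02-20", "2013-02-24"),
    (5, "2013-02-20", "2013-02-24"),
]

def non_overlapping_borrowers(loans):
    """Return sorted borrower ids whose loans never overlap (inclusive bounds)."""
    counts = Counter(loans)          # identical rows collapse into one key
    flagged = set()
    for (b1, s1, d1), n in counts.items():
        if n > 1:                    # "equal date" loans overlap each other
            flagged.add(b1)
            continue
        for (b2, s2, d2) in counts:
            same_range = (s1, d1) == (s2, d2)
            if b2 == b1 and not same_range and s1 <= d2 and s2 <= d1:
                flagged.add(b1)
                break
    return sorted({b for b, _, _ in loans} - flagged)

print(non_overlapping_borrowers(loans))
```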
I got it working, but in a slightly convoluted way. The inner query first finds the borrowers that don't meet the criteria, and the outer query returns the rest. The inner query has 2 parts:
1) Get all overlapping borrowings not starting on the same day.
2) Get all borrowings starting on the same date.
select distinct BorrowerID from borrowings
where BorrowerID NOT IN
(
select b1.BorrowerID from borrowings b1
inner join borrowings b2
on b1.BorrowerID = b2.BorrowerID
and b1.StartDate < b2.StartDate
and b1.DueDate > b2.StartDate
union
select BorrowerID from borrowings
group by BorrowerID, StartDate
having count(*) > 1
)
I had to use 2 separate inner queries because your table doesn't have a unique identifier for each record, and using b1.StartDate <= b2.StartDate, as I should have, would make a record join to itself. It would be good to have a separate identifier for each record.
Try this (b, s, and e presumably stand for the BorrowerID, StartDate, and DueDate columns):
with cte as
(
  select *,
row_number() over (partition by b order by s) r
from loans
)
select l1.b
from loans l1
except
select c1.b
from cte c1
where exists (
 select 1
 from cte c2
where c2.b = c1.b
 and c2.r <> c1.r
and (c2.s between c1.s and c1.e
      or c1.s between c2.s and c2.e)
)
If you're on SQL Server 2012 or later, where LAG is available, you can do it like this:
with cte as (
select
BorrowerID,
StartDate,
DueDate,
lag(DueDate) over (partition by borrowerid order by StartDate, DueDate) as PrevDueDate
from test
)
select
distinct BorrowerID
from cte
where BorrowerID not in
(select BorrowerID
from cte
where StartDate <= PrevDueDate)
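The LAG query works because, once a borrower's loans are sorted by start date, any overlap at all implies that some loan overlaps the one immediately before it. A minimal Python sketch of that check, using the sample data from the question, could look like this; the function name is made up for the example, and dates are ISO strings so that string comparison matches date order.

```python
from itertools import groupby

loans = [
    (1, "2012-09-02", "2012-10-01"), (2, "2012-10-05", "2012-10-21"),
    (3, "2012-11-07", "2012-11-09"), (4, "2012-12-01", "2013-01-01"),
    (4, "2012-12-01", "2013-01-14"), (1, "2012-12-20", "2013-01-06"),
    (3, "2013-01-07", "2013-01-22"), (3, "2013-01-15", "2013-01-18"),
    (1, "2013-02-20", "2013-02-24"),
]

def borrowers_without_overlap(loans):
    """Return borrower ids whose loans, ordered by start date, never overlap."""
    result = []
    # Tuples sort by borrower, then start date, then due date,
    # which matches "partition by borrowerid order by StartDate, DueDate".
    for borrower, group in groupby(sorted(loans), key=lambda loan: loan[0]):
        prev_due = None              # plays the role of lag(DueDate)
        clean = True
        for _, start, due in group:
            if prev_due is not None and start <= prev_due:
                clean = False
                break
            prev_due = due
        if clean:
            result.append(borrower)
    return result

print(borrowers_without_overlap(loans))
```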