Imagine the following Loans table:
BorrowerID StartDate DueDate
=============================================
1 2012-09-02 2012-10-01
2 2012-10-05 2012-10-21
3 2012-11-07 2012-11-09
4 2012-12-01 2013-01-01
4 2012-12-01 2013-01-14
1 2012-12-20 2013-01-06
3 2013-01-07 2013-01-22
3 2013-01-15 2013-01-18
1 2013-02-20 2013-02-24
How would I go about selecting the distinct BorrowerIDs of those who have only ever had a single loan out at a time? This includes borrowers who have only ever taken out one loan, as well as those who have taken out more than one, provided that none of their loans would overlap if you drew them on a timeline. For example, in the table above, it should find borrowers 1 and 2 only.
I've tried experimenting with joining the table to itself, but haven't really managed to get anywhere. Any pointers much appreciated!
Solution for dbo.Loans with PRIMARY KEY
To solve this you need a two-step approach, as detailed in the following SQL Fiddle. I added a LoanID column to your example data, and the query requires that such a unique id exists. If you don't have one, you need to adjust the join clause to make sure that a loan does not get matched to itself.
MS SQL Server 2008 Schema Setup:
CREATE TABLE dbo.Loans
(LoanID INT, [BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO
INSERT INTO dbo.Loans
(LoanID, [BorrowerID], [StartDate], [DueDate])
VALUES
(1, 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
(2, 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
(3, 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
(4, 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
(5, 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
(6, 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
(7, 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
(8, 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
(9, 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO
First you need to find out which loans overlap with another loan. The query uses <= to compare the start and due dates. That counts loans where the second one starts the same day the first one ends as overlapping. If you need those to not be overlapping use < instead in both places.
Query 1:
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= L1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1;
Results:
| LOANID | BORROWERID | STARTDATE | DUEDATE | HASOVERLAPPINGLOAN |
|--------|------------|----------------------------------|---------------------------------|--------------------|
| 1 | 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 0 |
| 2 | 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 0 |
| 3 | 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 0 |
| 4 | 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 |
| 5 | 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 |
| 6 | 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 0 |
| 7 | 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 |
| 8 | 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 |
| 9 | 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 0 |
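The overlap test inside the EXISTS clause can be checked outside SQL as well. Below is a minimal Python sketch; the function name `loans_overlap` and the `inclusive` flag are made up for illustration, and the touching pair of loans in the last two calls is hypothetical data, not from the table.

```python
from datetime import date

def loans_overlap(start_a, due_a, start_b, due_b, inclusive=True):
    """Return True when the two date ranges overlap.

    inclusive=True mirrors the <= comparison in the query, so a loan that
    starts on the day another one is due counts as overlapping.
    inclusive=False mirrors the < variant, where touching loans do not count.
    """
    if inclusive:
        return start_a <= due_b and start_b <= due_a
    return start_a < due_b and start_b < due_a

# Borrower 3's two January loans overlap under either rule.
print(loans_overlap(date(2013, 1, 7), date(2013, 1, 22),
                    date(2013, 1, 15), date(2013, 1, 18)))

# Hypothetical pair that only touches on 2013-01-10.
print(loans_overlap(date(2013, 1, 1), date(2013, 1, 10),
                    date(2013, 1, 10), date(2013, 1, 20)))
print(loans_overlap(date(2013, 1, 1), date(2013, 1, 10),
                    date(2013, 1, 10), date(2013, 1, 20), inclusive=False))
```

The first two calls print True and the last one prints False, which is the difference between using <= and < in both places.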
Now, with that information you can determine the borrowers that have no overlapping loans with this query:
Query 2:
WITH OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= L1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1
),
OverlappingBorrower AS (
SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
FROM OverlappingLoans
GROUP BY BorrowerID
)
SELECT *
FROM OverlappingBorrower
WHERE HasOverlappingLoan = 0;
Results:
| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
| 1 | 0 |
| 2 | 0 |
You could get even more information by counting each borrower's loans as well as the number of those loans that overlap another loan. (Note: if loan A and loan B overlap, this query counts both of them as overlapping.)
Query 3:
WITH OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= L1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1
)
SELECT BorrowerID, COUNT(1) LoanCount, SUM(HasOverlappingLoan) OverlappingCount
FROM OverlappingLoans
GROUP BY BorrowerID;
Results:
| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
| 1 | 3 | 0 |
| 2 | 1 | 0 |
| 3 | 3 | 2 |
| 4 | 2 | 2 |
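If you want to try the borrower-level check without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. It assumes SQLite as a stand-in for SQL Server, stores the dates as ISO text so that string comparison orders them correctly, and folds the second CTE into a HAVING clause, so treat it as a variation on the query above rather than a copy of it.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; dates are ISO text,
# which compares in the same order as the dates themselves.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Loans (LoanID INTEGER PRIMARY KEY, BorrowerID INTEGER, "
    "StartDate TEXT, DueDate TEXT)"
)
conn.executemany(
    "INSERT INTO Loans VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-09-02", "2012-10-01"),
        (2, 2, "2012-10-05", "2012-10-21"),
        (3, 3, "2012-11-07", "2012-11-09"),
        (4, 4, "2012-12-01", "2013-01-01"),
        (5, 4, "2012-12-01", "2013-01-14"),
        (6, 1, "2012-12-20", "2013-01-06"),
        (7, 3, "2013-01-07", "2013-01-22"),
        (8, 3, "2013-01-15", "2013-01-18"),
        (9, 1, "2013-02-20", "2013-02-24"),
    ],
)

query = """
WITH OverlappingLoans AS (
    SELECT L1.BorrowerID,
           CASE WHEN EXISTS (SELECT 1 FROM Loans L2
                             WHERE L2.BorrowerID = L1.BorrowerID
                               AND L2.LoanID <> L1.LoanID
                               AND L1.StartDate <= L2.DueDate
                               AND L2.StartDate <= L1.DueDate)
                THEN 1 ELSE 0 END AS HasOverlappingLoan
    FROM Loans L1
)
SELECT BorrowerID
FROM OverlappingLoans
GROUP BY BorrowerID
HAVING MAX(HasOverlappingLoan) = 0
ORDER BY BorrowerID
"""

# Borrowers whose loans never overlap.
result = [row[0] for row in conn.execute(query)]
print(result)  # [1, 2]
```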
Solution for dbo.Loans without PRIMARY KEY
UPDATE: As the requirement actually calls for a solution that does not rely on a unique identifier for each loan, I made the following changes:
1) I added a borrower that has two loans with the same start and due dates
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE dbo.Loans
([BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO
INSERT INTO dbo.Loans
([BorrowerID], [StartDate], [DueDate])
VALUES
( 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
( 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
( 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
( 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
( 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
( 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
( 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
( 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
( 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO
2) Those "equal date" loans require an additional step:
Query 1:
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate;
Results:
| BORROWERID | STARTDATE | DUEDATE | LOANCOUNT |
|------------|----------------------------------|---------------------------------|-----------|
| 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 1 |
| 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 1 |
| 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 1 |
| 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 1 |
| 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 1 |
| 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 |
| 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 |
| 5 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 2 |
3) Now, with each loan range unique, we can use the old technique again. However, we also need to account for those "equal date" loans. (L1.StartDate <> L2.StartDate OR L1.DueDate <> L2.DueDate) prevents a loan getting matched with itself. OR LoanCount > 1 accounts for "equal date" loans.
Query 2:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
)
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= L1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1;
Results:
| BORROWERID | STARTDATE | DUEDATE | LOANCOUNT | HASOVERLAPPINGLOAN |
|------------|----------------------------------|---------------------------------|-----------|--------------------|
| 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 1 | 0 |
| 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 1 | 0 |
| 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 1 | 0 |
| 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 1 | 0 |
| 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 1 | 0 |
| 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 | 1 |
| 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 | 1 |
| 5 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 2 | 1 |
The borrower-level query keeps the same logic as before; only the CTEs at the beginning are swapped for the normalized versions.
Query 3:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
),
OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= L1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1
),
OverlappingBorrower AS (
SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
FROM OverlappingLoans
GROUP BY BorrowerID
)
SELECT *
FROM OverlappingBorrower
WHERE HasOverlappingLoan = 0;
Results:
| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
| 1 | 0 |
| 2 | 0 |
4) In this counting query we need to incorporate the "equal date" loan counts again. For that we use SUM(LoanCount) instead of a plain COUNT. We also have to multiply hasOverlappingLoan with the LoanCount to get the correct overlapping count again.
Query 4:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
),
OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= L1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1
)
SELECT BorrowerID, SUM(LoanCount) LoanCount, SUM(HasOverlappingLoan * LoanCount) OverlappingCount
FROM OverlappingLoans
GROUP BY BorrowerID;
Results:
| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
| 1 | 3 | 0 |
| 2 | 1 | 0 |
| 3 | 3 | 2 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
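To make the counting logic easier to follow, here is a small Python sketch of the same idea: collapse identical loans first, then weight each distinct range by how many loans it represents. The variable names are made up for illustration, and the dates are kept as ISO strings so that plain string comparison orders them correctly.

```python
from collections import Counter, defaultdict

loans = [
    (1, "2012-09-02", "2012-10-01"), (2, "2012-10-05", "2012-10-21"),
    (3, "2012-11-07", "2012-11-09"), (4, "2012-12-01", "2013-01-01"),
    (4, "2012-12-01", "2013-01-14"), (1, "2012-12-20", "2013-01-06"),
    (3, "2013-01-07", "2013-01-22"), (3, "2013-01-15", "2013-01-18"),
    (1, "2013-02-20", "2013-02-24"), (5, "2013-02-20", "2013-02-24"),
    (5, "2013-02-20", "2013-02-24"),
]

# Step 1: one entry per distinct (borrower, start, due), with its loan count.
normalized = Counter(loans)

# Step 2: per borrower, add up all loans and the loans that overlap something.
summary = defaultdict(lambda: [0, 0])  # borrower -> [loan_count, overlapping_count]
for (borrower, start, due), loan_count in normalized.items():
    overlaps_other_range = any(
        b == borrower and (s, d) != (start, due) and start <= d and s <= due
        for (b, s, d) in normalized
    )
    # Identical ranges (loan_count > 1) overlap each other by definition.
    has_overlap = overlaps_other_range or loan_count > 1
    summary[borrower][0] += loan_count
    summary[borrower][1] += loan_count if has_overlap else 0

for borrower in sorted(summary):
    print(borrower, *summary[borrower])
```

For the sample data this gives the same counts as the result table above, including 2 and 2 for borrower 5.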
I strongly suggest finding a way to use my first solution, as a loan table without a primary key is, let's say, an "odd" design. However, if you really can't get there, use the second solution.
I got it working, but in a somewhat convoluted way. The inner query first finds the borrowers that don't meet the criteria, and the outer query returns the rest. The inner query has 2 parts:
1) Get all overlapping borrowings not starting on the same day.
2) Get all borrowings starting on the same date.
select distinct BorrowerID from borrowings
where BorrowerID NOT IN
(
select b1.BorrowerID from borrowings b1
inner join borrowings b2
on b1.BorrowerID = b2.BorrowerID
and b1.StartDate < b2.StartDate
and b1.DueDate > b2.StartDate
union
select BorrowerID from borrowings
group by BorrowerID, StartDate
having count(*) > 1
)
I had to use 2 separate inner queries as your table doesn't have a unique identifier for each record and using b1.StartDate <= b2.StartDate as I should have makes a record join to itself. It would be good to have a separate identifier for each record.
try
with cte as
(
select *,
row_number() over (partition by b order by s) r
from loans
)
select l1.b
from loans l1
except
select c1.b
from cte c1
where exists (
select 1
from cte c2
where c2.b = c1.b
and c2.r <> c1.r
and (c2.s between c1.s and c1.e
or c1.s between c2.s and c2.e)
)
If you're on SQL Server 2012 or later, you can do it with LAG like this:
with cte as (
select
BorrowerID,
StartDate,
DueDate,
lag(DueDate) over (partition by borrowerid order by StartDate, DueDate) as PrevDueDate
from test
)
select
distinct BorrowerID
from cte
where BorrowerID not in
(select BorrowerID
from cte
where StartDate <= PrevDueDate)
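The idea behind the LAG version is that, once a borrower's loans are sorted by start date, it is enough to compare each loan with the one just before it: if any two loans overlap, some adjacent pair overlaps too. Here is a minimal Python sketch of that reasoning; the function name is made up, and the dates are ISO strings so that plain comparison works.

```python
from itertools import groupby

loans = [
    (1, "2012-09-02", "2012-10-01"), (2, "2012-10-05", "2012-10-21"),
    (3, "2012-11-07", "2012-11-09"), (4, "2012-12-01", "2013-01-01"),
    (4, "2012-12-01", "2013-01-14"), (1, "2012-12-20", "2013-01-06"),
    (3, "2013-01-07", "2013-01-22"), (3, "2013-01-15", "2013-01-18"),
    (1, "2013-02-20", "2013-02-24"),
]

def non_overlapping_borrowers(loans):
    """Return borrowers for whom no loan starts on or before the previous due date."""
    result = []
    ordered = sorted(loans)  # by borrower, then start date, then due date
    for borrower, group in groupby(ordered, key=lambda loan: loan[0]):
        prev_due = None  # plays the role of LAG(DueDate); None for the first loan
        has_overlap = False
        for _, start, due in group:
            if prev_due is not None and start <= prev_due:
                has_overlap = True
                break
            prev_due = due
        if not has_overlap:
            result.append(borrower)
    return result

print(non_overlapping_borrowers(loans))  # [1, 2]
```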
Say I have a base number 10 and a table that has a value of 20 associated to November 2013, and a value of 10 associated to March 2014. I want to populate a list of all months, and their compounded value. So from May-November 2013, the value should be 10, then between Nov and Mar, the value should be 10+20 and afterwards it should be 10+20+10.
So in a table I have the following
MONTH VALUE
Nov-2013 20
Mar-2014 10
There's an initial value of 10, hard-coded as the base. I'd like to have a select statement that somehow returns the following:
MONTH VALUE
May-2013 10
Jun-2013 10
Jul-2013 10
Aug-2013 10
Sep-2013 10
Oct-2013 10
Nov-2013 30
Dec-2013 30
Jan-2014 30
Feb-2014 30
Mar-2014 40
Is this doable?
If I understand your requirements correctly, something like this should work:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE months
("MON" date, "VALUE" int)
;
INSERT ALL
INTO months ("MON", "VALUE")
VALUES (date '2013-11-01', 20)
INTO months ("MON", "VALUE")
VALUES (date '2014-03-01', 10)
SELECT * FROM dual
;
Query 1:
with months_interval as (
select date '2013-05-01' interval_start,
max(mon) interval_end
from months
)
, all_months as (
select add_months(m.interval_start,level-1) mon
from months_interval m
connect by level <= months_between(interval_end, interval_start) + 1
), data_to_sum as (
select am.mon,
decode(am.mon, first_value(am.mon) over(order by am.mon), 10, m.value) value
from months m, all_months am
where am.mon = m.mon(+)
)
select mon, value, sum(value) over(order by mon) cumulative
from data_to_sum
order by 1
Results:
| MON | VALUE | CUMULATIVE |
----------------------------------------------------------
| May, 01 2013 00:00:00+0000 | 10 | 10 |
| June, 01 2013 00:00:00+0000 | (null) | 10 |
| July, 01 2013 00:00:00+0000 | (null) | 10 |
| August, 01 2013 00:00:00+0000 | (null) | 10 |
| September, 01 2013 00:00:00+0000 | (null) | 10 |
| October, 01 2013 00:00:00+0000 | (null) | 10 |
| November, 01 2013 00:00:00+0000 | 20 | 30 |
| December, 01 2013 00:00:00+0000 | (null) | 30 |
| January, 01 2014 00:00:00+0000 | (null) | 30 |
| February, 01 2014 00:00:00+0000 | (null) | 30 |
| March, 01 2014 00:00:00+0000 | 10 | 40 |
This one is probably slightly suboptimal performance-wise (it queries the months table twice, etc.) and should be optimized, but the idea is this: pregenerate a list of months (I assumed your interval start is somehow fixed), left join it to your data, and use the analytic sum function.
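The same three steps (generate the months, attach the values, take a running sum) can be sketched in Python to check the expected numbers. This is only an illustration: `month_range` is a made-up helper, and the base value and start month are hard-coded as in the question.

```python
from datetime import date
from itertools import accumulate

def month_range(start, end):
    """Yield the first day of every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1

increments = {date(2013, 11, 1): 20, date(2014, 3, 1): 10}  # the months table
base = 10                                                   # hard-coded base value
start = date(2013, 5, 1)                                    # assumed fixed start

months = list(month_range(start, max(increments)))
# The base goes on the first month; months without a row contribute 0.
values = [base if m == start else increments.get(m, 0) for m in months]
cumulative = list(accumulate(values))

for m, total in zip(months, cumulative):
    print(m.strftime("%Y-%m"), total)
```

The running total stays at 10 through October 2013, is 30 from November 2013 through February 2014, and reaches 40 in March 2014.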
Hi folks, I've been noodling on how to approach this one for a while now and I'm just stuck. Hoping this question is useful to the community.
I have a trend table with data like the first table below. I have another table with categories like the second table below. The goal is to display the data in a stacked column chart. Each column in the chart would be the last sample for that day, and the series group for each column would be the circuit categories.
The data is sampled every 10 minutes, but for the sake of the example I entered just 2 samples for each day:
time_stamp | circuit1 | circuit2 | circuit3
1/5/13 08:00 | 50 | 60 | 30
1/5/13 04:00 | 48 | 55 | 26
1/4/13 08:00 | 42 | 52 | 22
1/4/13 04:00 | 40 | 51 | 20
etc.
I have a category table similar to this:
Circuit_name | circuit_category
circuit1 | category4
circuit2 | category2
circuit3 | category12
etc.
Maybe I'm not thinking of a simpler way to do this from a reporting standpoint, but in order to get a stacked bar chart day by day like the requirements, I think I need a query which results in the following:
time_stamp | Circuit_name | Circuit_category | Value
1/5/13 08:00 | Circuit1 | category4 | 50
1/5/13 08:00 | Circuit2 | category2 | 60
1/5/13 08:00 | Circuit3 | category12 | 30
1/4/13 08:00 | Circuit1 | category4 | 42
1/4/13 08:00 | Circuit2 | category2 | 52
1/4/13 08:00 | Circuit3 | category12 | 22
I'm thinking I need to write a query to grab the max(time_stamp) grouped by day, but pivot the results so I can join the data to the category table. I've played around with using pivot on the first table since I have to join the circuit_name in table2 to the actual column names in table1, but I keep running into dead ends because I don't understand pivot well enough.
Anyway I'm willing to abandon table 2 if hard coding the circuit categories into the query is necessary, but again this is where I'm stuck. Any guidance would be appreciated.
The data is on a SQL Server 2008 R2 server.
Thanks!
This looks like unpivoting columns to rows, and SQL Server has an UNPIVOT operator for that :) I believe the following query can be improved and optimized. Please comment after you have tried it.
SQLFIDDLE DEMO
Query:
select m.*, t.cat
from
(SELECT ts, name, value
FROM
(
SELECT ts,
CONVERT(varchar(20), C1) AS c1,
CONVERT(varchar(20), C2) AS c2,
CONVERT(varchar(20), C3) AS c3
FROM t2
) MyTable
UNPIVOT
(Value FOR name IN
(c1,c2,c3))AS MyUnPivot) m
left join t1 t
on t.name = m.name
;
Results:
TS NAME VALUE CAT
January, 05 2013 08:00:00+0000 c1 50 category4
January, 05 2013 08:00:00+0000 c2 60 category2
January, 05 2013 08:00:00+0000 c3 30 category12
January, 05 2013 04:00:00+0000 c1 48 category4
January, 05 2013 04:00:00+0000 c2 55 category2
January, 05 2013 04:00:00+0000 c3 26 category12
January, 04 2013 08:00:00+0000 c1 42 category4
January, 04 2013 08:00:00+0000 c2 52 category2
January, 04 2013 08:00:00+0000 c3 22 category12
January, 04 2013 04:00:00+0000 c1 40 category4
January, 04 2013 04:00:00+0000 c2 51 category2
January, 04 2013 04:00:00+0000 c3 20 category12
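Note that the result above still contains every sample, while the question asks for only the last sample of each day. As a rough illustration of the full shape (keep the latest row per day, unpivot the circuit columns, then look up the category), here is a Python sketch. The dictionaries stand in for the two tables, and it is not a replacement for doing this in SQL.

```python
from datetime import datetime

trend = [
    {"time_stamp": datetime(2013, 1, 5, 8, 0), "circuit1": 50, "circuit2": 60, "circuit3": 30},
    {"time_stamp": datetime(2013, 1, 5, 4, 0), "circuit1": 48, "circuit2": 55, "circuit3": 26},
    {"time_stamp": datetime(2013, 1, 4, 8, 0), "circuit1": 42, "circuit2": 52, "circuit3": 22},
    {"time_stamp": datetime(2013, 1, 4, 4, 0), "circuit1": 40, "circuit2": 51, "circuit3": 20},
]
categories = {"circuit1": "category4", "circuit2": "category2", "circuit3": "category12"}

# Step 1: keep only the latest sample of each calendar day.
last_per_day = {}
for row in trend:
    day = row["time_stamp"].date()
    if day not in last_per_day or row["time_stamp"] > last_per_day[day]["time_stamp"]:
        last_per_day[day] = row

# Step 2: unpivot each kept row into one (time_stamp, circuit, category, value)
# tuple per circuit column; a missing category becomes None, like a left join.
long_rows = [
    (row["time_stamp"], name, categories.get(name), row[name])
    for _, row in sorted(last_per_day.items(), key=lambda item: item[0], reverse=True)
    for name in row
    if name != "time_stamp"
]

for long_row in long_rows:
    print(long_row)
```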
I have a date column in a table and I want to get week number for that particular date based on the month from that date irrespective of the day
For example:
01-dec-2012 to 07-dec-2012 should give week number as 1
08-dec-2012 to 14-dec-2012 should give week number as 2
15-dec-2012 to 21-dec-2012 should give week number as 3
22-dec-2012 to 28-dec-2012 should give week number as 4
29-dec-2012 to 31-dec-2012 should give week number as 5
This week number does not depend on the starting day of the week, i.e., the week can start on any day.
How can I write a select statement to get this output in SQL Server 2008?
You can use DAY (Transact-SQL)
select ((day(DateColumn)-1) / 7) + 1
from YourTable
SQL Fiddle
MS SQL Server 2012 Schema Setup:
create table YourTable
(
D datetime
)
insert into YourTable
select getdate()+Number
from master..spt_values
where type = 'P' and
Number between 1 and 15
Query 1:
select D,
((day(D)-1) / 7) + 1 as W
from YourTable
Results:
| D | W |
--------------------------------------
| January, 03 2013 07:48:54+0000 | 1 |
| January, 04 2013 07:48:54+0000 | 1 |
| January, 05 2013 07:48:54+0000 | 1 |
| January, 06 2013 07:48:54+0000 | 1 |
| January, 07 2013 07:48:54+0000 | 1 |
| January, 08 2013 07:48:54+0000 | 2 |
| January, 09 2013 07:48:54+0000 | 2 |
| January, 10 2013 07:48:54+0000 | 2 |
| January, 11 2013 07:48:54+0000 | 2 |
| January, 12 2013 07:48:54+0000 | 2 |
| January, 13 2013 07:48:54+0000 | 2 |
| January, 14 2013 07:48:54+0000 | 2 |
| January, 15 2013 07:48:54+0000 | 3 |
| January, 16 2013 07:48:54+0000 | 3 |
| January, 17 2013 07:48:54+0000 | 3 |
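The formula relies on integer division: days 1 to 7 give 0, days 8 to 14 give 1, and so on, and then 1 is added. A quick Python check of the same arithmetic against the ranges from the question (the function name is made up for illustration):

```python
import math
from datetime import date

def week_of_month(d):
    """Week number within the month, counting days 1-7 as week 1."""
    return (d.day - 1) // 7 + 1

for day in (1, 7, 8, 14, 15, 21, 22, 28, 29, 31):
    d = date(2012, 12, day)
    # Rounding day / 7 up gives the same number as the integer-division form.
    assert week_of_month(d) == math.ceil(d.day / 7)
    print(d.isoformat(), week_of_month(d))
```

This prints week 1 for 1 to 7 December, week 2 for 8 to 14, week 3 for 15 to 21, week 4 for 22 to 28, and week 5 for 29 to 31.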
try this
declare @dates datetime
select @dates='2012-12-22'
SELECT datepart(dd,@dates), ceiling (cast(datepart(dd,@dates)as numeric(38,8))/7)
Several options here to do what you wish. Most promising of which seems to be use of the DATEPART function. But do beware that results may differ depending on your local settings.
Hope one of them works out for you.