Joining values with DateDim, where null'd dates' value will take the last non-null value in the table - sql

I need to do a join but I'm not sure which type. I have a table like this:
Date Amount | FOO
------------------
2012-01-12 x
2012-03-14 y
2012-05-06 z
2012-05-14 aa
2012-09-02 bb
I am joining this with DateDim (Google here: DATE DIM, which is a table of dates (historical and future).
I need a query that would display data like this:
datedim.Date foo.Amount | FOO x DATEDIM
------------------------------------------
2012-01-12 x
2012-01-13 x
2012-01-14 x
... etc...
2012-03-14 y
2012-03-15 y
2012-03-16 y
2012-03-17 y
... etc...
2012-05-06 z
... etc...
Basically, I need the values to persist (were it a left join, it would be NULLs) until the next non-null value. That will persist too... etc..
What I have so far...
SELECT datedim.Date
,CASE
WHEN Amount IS NULL
THEN (SELECT TOP 1 Amount
FROM FOO WHERE foo.Date <= datedim.Date
ORDER BY Date DESC)
ELSE Amount END AS Amount
FROM DATEDIM datedim
LEFT JOIN FOO foo
ON foo.Date = datedim.Date
I need to create a view out of this. I'm getting an error saying ORDER BY is invalid for views, unless specified by TOP??? I do have a TOP in the subquery...

In SQLServer2005+ use recursive CTE
;WITH cte (id, [Date], Amount) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY [Date] ASC) AS id,
[Date], Amount
FROM dbo.your_table t1
), cte2 (id, [Date], [LevelDate], Amount) AS
(
SELECT c1.id, c1.[Date], DATEDIFF(day, c1.[Date], c2.[Date]) AS [LevelDate], c1.Amount
FROM cte c1 LEFT JOIN cte c2 ON c1.id = c2.id - 1
), cte3 (id, [Date], Amount, [Level]) AS
(
SELECT id, [Date], Amount, 1 AS [Level]
FROM cte2 c
UNION ALL
SELECT c.id, DATEADD(day, 1, ct.[Date]) AS [Date], c.Amount, ct.[Level] + 1
FROM cte2 c JOIN cte3 ct ON c.id = ct.id
WHERE c.[LevelDate] > ct.[Level]
)
SELECT [Date], Amount
FROM cte3
ORDER BY Date
OPTION (maxrecursion 0)
Demo on SQLFiddle

I'm sure there are more efficient ways using a CTE, or window functions, but something along these lines should work:
Select
d.Date,
d.FooDate,
f.Amount
From
Foo f
Inner Join (
Select
d.[Date],
Max(f.[Date]) as FooDate
From
Foo f
Inner Join
DateDim d
On f.[Date] < d.[Date]
Group By
d.[Date]
) d
On d.[FooDate] = f.[Date]
http://sqlfiddle.com/#!3/3c7d5/10

Related

Collapse consecutive similar records into a single record

I have records of people from an old system that I'm trying to convert over to the new system. In the old system, a person might end up with several records for the same location. They could also go from location, to another, and then return to the previous location. Here's some example data:
PersonID | LocationID | StartDate | EndDate
1 | 1 | 1980-07-30 | 2007-07-16
1 | 1 | 2007-07-16 | 2008-01-30
1 | 2 | 2008-01-30 | 2009-03-02
1 | 2 | 2009-03-02 | 2009-11-06
1 | 3 | 2014-07-16 | 2015-01-16
1 | 1 | 2016-01-26 | 2999-12-31
I would like to collapse this data so that I get a date range for any consecutive LocationIDs. For the data above, this is what I would expect:
PersonID | LocationID | StartDate | EndDate
1 | 1 | 1980-07-30 | 2008-01-30
1 | 2 | 2008-01-30 | 2009-11-06
1 | 3 | 2014-07-16 | 2015-01-16
1 | 1 | 2016-01-26 | 2999-12-31
I'm unsure as to how to do this. I previously tried joining to the previous record, but that only works when there's two consecutive locations, not with 3 or more (there could be an undefined number of consecutive records).
select
a.PersonID,
a.LocationID,
a.StartDate,
a.EndDate,
case when a.LocationID = b.LocationID then a.PK_ID else b.PK_ID end as NewID
from employees a
left outer join employees b
on a.PersonID = b.PersonID
and a.PK_ID = b.PK_ID - 1
So, how can I write a query to get the results I need?
Note: we're treating '2999-12-31' are our 'NULL' date field
This is a classic Gaps-and-Islands (Edit- corrected for larger span 2999)
Select [PersonID]
,[LocationID]
,[StartDate] = min(D)
,[EndDate] = max(D)
From (
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
from YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
) G
Group By [PersonID],[LocationID],Grp
Order By [PersonID],min(D)
Returns
PersonID LocationID StartDate EndDate
1 1 1980-07-30 2008-01-30
1 2 2008-01-30 2009-11-06
1 3 2014-07-16 2015-01-16
1 1 2016-01-26 2999-12-31
Using your original query
Select [PersonID]
,[LocationID]
,[StartDate] = min(D)
,[EndDate] = max(D)
From (
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
From (
-- Your Original Query
select
a.PersonID,
a.LocationID,
a.StartDate,
a.EndDate,
case when a.LocationID = b.LocationID then a.PK_ID else b.PK_ID end as NewID
from employees a
left outer join employees b
on a.PersonID = b.PersonID
and a.PK_ID = b.PK_ID - 1
) A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
) G
Group By [PersonID],[LocationID],Grp
Order By [PersonID],min(D)
Requested Comments
Let's break it down to its components.
1) The CROSS APPLY Portion: This will expand a single record into N records. For example:
Declare #YourTable Table ([PersonID] int,[LocationID] int,[StartDate] date,[EndDate] date)
Insert Into #YourTable Values
(1,1,'1980-07-01','1980-07-03' )
,(1,1,'1980-07-02','1980-07-04' ) -- Notice the Overlap
,(1,2,'2008-01-30','2008-02-05')
Select *
from #YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
The above query will generate
2) The Grp Portion: Perhaps easier if I provide a simple example:
Declare #YourTable Table ([PersonID] int,[LocationID] int,[StartDate] date,[EndDate] date)
Insert Into #YourTable Values
(1,1,'1980-07-01','1980-07-03' )
,(1,1,'1980-07-02','1980-07-04' ) -- Notice the Overlap
,(1,2,'2008-01-30','2008-02-05')
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
,RN1 = Row_Number() over (Order By D)
,RN2 = Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
from #YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
The above query Generates:
RN1 and RN2 are breakouts of the GRP, just to illustrate the mechanic. Notice RN1 minus RN2 equals the GRP. Once we have the GRP, it becomes a simple matter of aggregation via a group by
3) Pulling it all Together:
Declare #YourTable Table ([PersonID] int,[LocationID] int,[StartDate] date,[EndDate] date)
Insert Into #YourTable Values
(1,1,'1980-07-01','1980-07-03' )
,(1,1,'1980-07-02','1980-07-04' ) -- Notice the Overlap
,(1,2,'2008-01-30','2008-02-05')
Select [PersonID]
,[LocationID]
,[StartDate] = min(D)
,[EndDate] = max(D)
From (
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
from #YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
) G
Group By [PersonID],[LocationID],Grp
Order By [PersonID],min(D)
Returns
For your sample data, you can use the difference of row numbers approach:
select personid, locationid, min(startdate), max(enddate)
from (select e.*,
row_number() over (partition by personid order by startdate) as seqnum_p,
row_number() over (partition by personid, locationid order by startdate) as seqnum_pl
from employees e
) e
group by (seqnum_p - seqnum_pl), personid, locationid;
This assumes that the start and end dates are contiguous. That is, there is no gap for a given employee at the same location.

Concrete Rows of Id 2 with Id 1 IF Date is Same and All Row Names Should be different in SQL Server 2008R2 and

I have following Data in myRecords Table
Id Date Name Cash
1 11/25/2016 4:23.123 Ramesh 10000
2 11/25/2016 4:23.173 Suresh 15000
1 11/27/2016 5:23.320 Ramesh 30000
2 11/27/2016 5:23.670 Suresh 40000
and I want to create view So I can get data in following Format
Id1 Date1 Name1 Cash1 Id2 Date2 Name2 Cash2
1 11/25/2016 4:23.123 Ramesh 10000 2 11/25/2016 4:23.173 Suresh 15000
1 11/27/2016 5:23.320 Ramesh 30000 2 11/27/2016 5:23.670 Suresh 40000
How can I do it.
If you are doing date and there will always only be 2 records per day you could convert to drop off the time and do a self join:
DECLARE #myRecords AS TABLE (Id INT, DATE DATETIME, Name VARCHAR(20), CASH INT)
INSERT INTO #myRecords VALUES (1,'11/25/2016 4:23','Ramesh',10000),(2,'11/25/2016 4:23','Suresh',15000)
,(1,'11/27/2016 5:23','Ramesh',30000),(2,'11/27/2016 4:23','Suresh',40000)
SELECT
m1.Id as Id1
,m1.Date as Date1
,m1.Name as Name1
,m1.Cash as Cash1
,m2.Id as Id2
,m2.Date as Date2
,m2.Name as Name2
,m2.Cash as Cash2
FROM
#myRecords m1
LEFT JOIN #myRecords m2
ON CAST(m1.DATE AS DATE) = CAST(m2.DATE AS DATE)
AND m1.Id <> m2.Id
WHERE
m1.Id = 1
Then you can also introduce ROW_NUMBER() to figure out whatever order you want then take all of the ODD RowNumbers and SELF JOIN to the Even RowNumbers:
;WITH cte AS (
SELECT
*
,RowNum = ROW_NUMBER() OVER (ORDER BY Date)
FROM
#myRecords
)
SELECT *
FROM
cte c1
LEFT JOIN cte c2
ON c1.RowNum + 1 = c2.RowNum
WHERE
c1.RowNum % 2 <> 0
As long as your Id joining logic is unclear, this will help In this case but you will need to add Id Filter or additional Identity column and row_number() in future I guess.
SELECT
T.*, TT.*
FROM
[Table] AS T
INNER JOIN
[Table] AS TT
ON T.Date = TT.Date
You can use Cross Apply for the required result set.
SELECT [ID],
[DATE],
[NAME],
[CASH],
B.*
FROM #TABLE1 A
CROSS APPLY (SELECT ID AS ID2,
[DATE] AS DATE2,
[NAME] AS NAME2,
[CASH] AS CASH2
FROM #TABLE1 B
WHERE A.ID < B.ID
AND CONVERT(DATE, A.DATE) = CONVERT(DATE, B.DATE))B
This will also return the same result:
select a.id, a.date, a.name, a.cash, b.id as id2, b.date as date2,
b.name as name2, b.cash as cash2
from myTable a
inner join myTable b on a.id+1 = b.id
and cast(a.date as date) <> cast(b.date as date)

Query to return first date of missing date ranges

Looking for help with a query using SQL 2008 R2... I have a table with client and date fields. Most clients have a record for most dates, however some don't.
For example I have this data:
CLIENTID DT
1 5/1/14
1 5/2/14
2 5/3/14
3 5/1/14
3 5/2/14
I can find the missing dates for each CLIENTID by creating a temp table with all possible dates for the period and then joining that to each CLIENTID and DT and only selecting where there is a NULL.
This is what I can get easily for the date range 5/1/14 to 5/4/14:
CLIENTID DTMISSED
1 5/3/14
1 5/4/14
2 5/1/14
2 5/2/14
2 5/4/14
3 5/3/14
3 5/4/14
However I want to group each consecutive missed period together and get the beginning of each period and the length.
For example, if I use the date range 5/1/14 to 5/4/14 I'd like to get:
CLIENTID DTSTART MISSED
1 5/3/14 2
2 5/1/14 2
2 5/4/14 1
3 5/3/14 2
Thanks for helping!
It's fascinating how more elegantly and also mere efficiently this kind of problems can be solved in 2012.
First, the tables:
create table #t (CLIENTID int, DT date)
go
insert #t values
(1, '5/1/14'),
(1, '5/2/14'),
(2, '5/3/14'),
(3, '5/1/14'),
(3, '5/2/14')
go
create table #calendar (dt date)
go
insert #calendar values ('5/1/14'),('5/2/14'),('5/3/14'),('5/4/14')
go
Here's the 2008 solution:
;with x as (
select *, row_number() over(order by clientid, dt) as rn
from #calendar c
cross join (select distinct clientid from #t) x
where not exists (select * from #t where c.dt=#t.dt and x.clientid=#t.clientid)
),
y as (
select x1.*, x2.dt as x2_dt, x2.clientid as x2_clientid
from x x1
left join x x2 on x1.clientid=x2.clientid and x1.dt=dateadd(day,1,x2.dt)
),
z as (
select *, (select sum(case when x2_dt is null then 1 else 0 end) from y y2 where y2.rn<=y.rn) as grp
from y
)
select clientid, min(dt), count(*)
from z
group by clientid, grp
order by clientid
Compare it to 2012:
;with x as (
select *, row_number() over(order by dt) as rn
from #calendar c
cross join (select distinct clientid from #t) x
where not exists (select * from #t where c.dt=#t.dt and x.clientid=#t.clientid)
),
y as (
select x1.*, sum(case when x2.dt is null then 1 else 0 end) over(order by x1.clientid,x1.dt) as grp
from x x1
left join x x2 on x1.clientid=x2.clientid and x1.dt=dateadd(day,1,x2.dt)
)
select clientid, min(dt), count(*)
from y
group by clientid, grp
order by clientid

T-SQL using SUM for a running total

I have a simple table with some dummy data setup like:
|id|user|value|
---------------
1 John 2
2 Ted 1
3 John 4
4 Ted 2
I can select a running total by executing the following sql(MSSQL 2008) statement:
SELECT a.id, a.user, a.value, SUM(b.value) AS total
FROM table a INNER JOIN table b
ON a.id >= b.id
AND a.user = b.user
GROUP BY a.id, a.user, a.value
ORDER BY a.id
This will give me results like:
|id|user|value|total|
---------------------
1 John 2 2
3 John 4 6
2 Ted 1 1
4 Ted 2 3
Now is it possible to only retrieve the most recent rows for each user? So the result would be:
|id|user|value|total|
---------------------
3 John 4 6
4 Ted 2 3
Am I going about this the right way? any suggestions or a new path to follow would be great!
No join is needed, you can speed up the query this way:
select id, [user], value, total
from
(
select id, [user], value,
row_number() over (partition by [user] order by id desc) rn,
sum(value) over (partition by [user]) total
from users
) a
where rn = 1
try this:
;with cte as
(SELECT a.id, a.[user], a.value, SUM(b.value) AS total
FROM users a INNER JOIN users b
ON a.id >= b.id
AND a.[user] = b.[user]
GROUP BY a.id, a.[user], a.value
),
cte1 as (select *,ROW_NUMBER() over (partition by [user]
order by total desc) as row_num
from cte)
select id,[user],value,total from cte1 where row_num=1
SQL Fiddle Demo
add where statement:
select * from
(
your select statement
) t
where t.id in (select max(id) from table group by user)
also you can use this query:
SELECT a.id, a.user, a.value,
(select max(b.value) from table b where b.user=a.user) AS total
FROM table a
where a.id in (select max(id) from table group by user)
ORDER BY a.id
Adding a right join would perform better than nested select.
Or even simpler:
SELECT MAX(id), [user], MAX(value), SUM(value)
FROM table
GROUP BY [user]
Compatible with SQL Server 2008 or later
DECLARE #AnotherTbl TABLE
(
id INT
, somedate DATE
, somevalue DECIMAL(18, 4)
, runningtotal DECIMAL(18, 4)
)
INSERT INTO #AnotherTbl
(
id
, somedate
, somevalue
, runningtotal
)
SELECT LEDGER_ID
, LL.LEDGER_DocDate
, LL.LEDGER_Amount
, NULL
FROM ACC_Ledger LL
ORDER BY LL.LEDGER_DocDate
DECLARE #RunningTotal DECIMAL(18, 4)
SET #RunningTotal = 0
UPDATE #AnotherTbl
SET #RunningTotal=runningtotal = #RunningTotal + somevalue
FROM #AnotherTbl
SELECT *
FROM #AnotherTbl

Creating groups of consecutive days meeting a given criteria

I have table the following data structure in SQL Server:
ID Date Allocation
1, 2012-01-01, 0
2, 2012-01-02, 2
3, 2012-01-03, 0
4, 2012-01-04, 0
5, 2012-01-05, 0
6, 2012-01-06, 5
etc.
What I need to do is get all consecutive day periods where Allocation = 0, and in the following form:
Start Date End Date DayCount
2012-01-01 2012-01-01 1
2012-01-03 2012-01-05 3
etc.
Is it possible to do this in SQL, and if so how?
In this answer, I'll assume that the "id" field numbers the rows consecutively when sorted by increasing date, like it does in the example data. (Such a column can be created if it does not exist).
This is an example of a technique described here and here.
1) Join the table to itself on adjacent "id" values. This pairs adjacent rows. Select rows where the "allocation" field has changed. Store the result in a temporary table, also keeping a running index.
SET #idx = 0;
CREATE TEMPORARY TABLE boundaries
SELECT
(#idx := #idx + 1) AS idx,
a1.date AS prev_end,
a2.date AS next_start,
a1.allocation as allocation
FROM allocations a1
JOIN allocations a2
ON (a2.id = a1.id + 1)
WHERE a1.allocation != a2.allocation;
This gives you a table having "the end of the previous period", "the start of the next period", and "the value of 'allocation' in the previous period" in each row:
+------+------------+------------+------------+
| idx | prev_end | next_start | allocation |
+------+------------+------------+------------+
| 1 | 2012-01-01 | 2012-01-02 | 0 |
| 2 | 2012-01-02 | 2012-01-03 | 2 |
| 3 | 2012-01-05 | 2012-01-06 | 0 |
+------+------------+------------+------------+
2) We need the start and end of each period in the same row, so we need to combine adjacent rows again. Do this by creating a second temporary table like boundaries but having an idx field 1 greater:
+------+------------+------------+
| idx | prev_end | next_start |
+------+------------+------------+
| 2 | 2012-01-01 | 2012-01-02 |
| 3 | 2012-01-02 | 2012-01-03 |
| 4 | 2012-01-05 | 2012-01-06 |
+------+------------+------------+
Now join on the idx field and we get the answer:
SELECT
boundaries2.next_start AS start,
boundaries.prev_end AS end,
allocation
FROM boundaries
JOIN boundaries2
USING(idx);
+------------+------------+------------+
| start | end | allocation |
+------------+------------+------------+
| 2012-01-02 | 2012-01-02 | 2 |
| 2012-01-03 | 2012-01-05 | 0 |
+------------+------------+------------+
** Note that this answer gets the "internal" periods correctly but misses the two "edge" periods where allocation = 0 at the beginning and allocation = 5 at the end. Those can be pulled in using UNION clauses but I wanted to present the core idea without that complication.
Following would be one way to do it. The gist of this solution is
Use a CTE to get a list of all consecutive start and enddates with Allocation = 0
Use the ROW_NUMBER window function to assign rownumbers depending on both start- and enddates.
Select only those records where both ROW_NUMBERS equal 1.
Use DATEDIFFto calculate the DayCount
SQL Statement
;WITH r AS (
SELECT StartDate = Date, EndDate = Date
FROM YourTable
WHERE Allocation = 0
UNION ALL
SELECT r.StartDate, q.Date
FROM r
INNER JOIN YourTable q ON DATEDIFF(dd, r.EndDate, q.Date) = 1
WHERE q.Allocation = 0
)
SELECT [Start Date] = s.StartDate
, [End Date ] = s.EndDate
, [DayCount] = DATEDIFF(dd, s.StartDate, s.EndDate) + 1
FROM (
SELECT *
, rn1 = ROW_NUMBER() OVER (PARTITION BY StartDate ORDER BY EndDate DESC)
, rn2 = ROW_NUMBER() OVER (PARTITION BY EndDate ORDER BY StartDate ASC)
FROM r
) s
WHERE s.rn1 = 1
AND s.rn2 = 1
OPTION (MAXRECURSION 0)
Test script
;WITH q (ID, Date, Allocation) AS (
SELECT * FROM (VALUES
(1, '2012-01-01', 0)
, (2, '2012-01-02', 2)
, (3, '2012-01-03', 0)
, (4, '2012-01-04', 0)
, (5, '2012-01-05', 0)
, (6, '2012-01-06', 5)
) a (a, b, c)
)
, r AS (
SELECT StartDate = Date, EndDate = Date
FROM q
WHERE Allocation = 0
UNION ALL
SELECT r.StartDate, q.Date
FROM r
INNER JOIN q ON DATEDIFF(dd, r.EndDate, q.Date) = 1
WHERE q.Allocation = 0
)
SELECT s.StartDate, s.EndDate, DATEDIFF(dd, s.StartDate, s.EndDate) + 1
FROM (
SELECT *
, rn1 = ROW_NUMBER() OVER (PARTITION BY StartDate ORDER BY EndDate DESC)
, rn2 = ROW_NUMBER() OVER (PARTITION BY EndDate ORDER BY StartDate ASC)
FROM r
) s
WHERE s.rn1 = 1
AND s.rn2 = 1
OPTION (MAXRECURSION 0)
Alternative way with CTE but without ROW_NUMBER(),
Sample data:
if object_id('tempdb..#tab') is not null
drop table #tab
create table #tab (id int, date datetime, allocation int)
insert into #tab
select 1, '2012-01-01', 0 union
select 2, '2012-01-02', 2 union
select 3, '2012-01-03', 0 union
select 4, '2012-01-04', 0 union
select 5, '2012-01-05', 0 union
select 6, '2012-01-06', 5 union
select 7, '2012-01-07', 0 union
select 8, '2012-01-08', 5 union
select 9, '2012-01-09', 0 union
select 10, '2012-01-10', 0
Query:
;with cte(s_id, e_id, b_id) as (
select s.id, e.id, b.id
from #tab s
left join #tab e on dateadd(dd, 1, s.date) = e.date and e.allocation = 0
left join #tab b on dateadd(dd, -1, s.date) = b.date and b.allocation = 0
where s.allocation = 0
)
select ts.date as [start date], te.date as [end date], count(*) as [day count] from (
select c1.s_id as s, (
select min(s_id) from cte c2
where c2.e_id is null and c2.s_id >= c1.s_id
) as e
from cte c1
where b_id is null
) t
join #tab t1 on t1.id between t.s and t.e and t1.allocation = 0
join #tab ts on ts.id = t.s
join #tab te on te.id = t.e
group by t.s, t.e, ts.date, te.date
Live example at data.SE.
Using this sample data:
CREATE TABLE MyTable (ID INT, Date DATETIME, Allocation INT);
INSERT INTO MyTable VALUES (1, {d '2012-01-01'}, 0);
INSERT INTO MyTable VALUES (2, {d '2012-01-02'}, 2);
INSERT INTO MyTable VALUES (3, {d '2012-01-03'}, 0);
INSERT INTO MyTable VALUES (4, {d '2012-01-04'}, 0);
INSERT INTO MyTable VALUES (5, {d '2012-01-05'}, 0);
INSERT INTO MyTable VALUES (6, {d '2012-01-06'}, 5);
GO
Try this:
WITH DateGroups (ID, Date, Allocation, SeedID) AS (
SELECT MyTable.ID, MyTable.Date, MyTable.Allocation, MyTable.ID
FROM MyTable
LEFT JOIN MyTable Prev ON Prev.Date = DATEADD(d, -1, MyTable.Date)
AND Prev.Allocation = 0
WHERE Prev.ID IS NULL
AND MyTable.Allocation = 0
UNION ALL
SELECT MyTable.ID, MyTable.Date, MyTable.Allocation, DateGroups.SeedID
FROM MyTable
JOIN DateGroups ON MyTable.Date = DATEADD(d, 1, DateGroups.Date)
WHERE MyTable.Allocation = 0
), StartDates (ID, StartDate, DayCount) AS (
SELECT SeedID, MIN(Date), COUNT(ID)
FROM DateGroups
GROUP BY SeedID
), EndDates (ID, EndDate) AS (
SELECT SeedID, MAX(Date)
FROM DateGroups
GROUP BY SeedID
)
SELECT StartDates.StartDate, EndDates.EndDate, StartDates.DayCount
FROM StartDates
JOIN EndDates ON StartDates.ID = EndDates.ID;
The first section of the query is a recursive SELECT, which is anchored by all rows that are allocation = 0, and whose previous day either doesn't exist or has allocation != 0. This effectively returns IDs: 1 and 3 which are the starting dates of the periods of time you want to return.
The recursive part of this same query starts from the anchor rows, and finds all subsequent dates that also have allocation = 0. The SeedID keeps track of the anchored ID through all the iterations.
The result so far is this:
ID Date Allocation SeedID
----------- ----------------------- ----------- -----------
1 2012-01-01 00:00:00.000 0 1
3 2012-01-03 00:00:00.000 0 3
4 2012-01-04 00:00:00.000 0 3
5 2012-01-05 00:00:00.000 0 3
The next sub query uses a simple GROUP BY to filter out all the start dates for each SeedID, and also counts the days.
The last sub query does the same thing with the end dates, but this time the day count isn't needed as we already have this.
The final SELECT query joins these two together to combine the start and end dates, and returns them along with the day count.
Give it a try if it works for you
Here SDATE for your DATE remains same as your table.
SELECT SDATE,
CASE WHEN (SELECT COUNT(*)-1 FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0)) >0 THEN(
CASE WHEN (SELECT SDATE FROM TABLE1 WHERE ID =(SELECT MAX(ID) FROM TABLE1 WHERE ID >TBL1.ID AND ID<(SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0))) IS NULL THEN SDATE
ELSE (SELECT SDATE FROM TABLE1 WHERE ID =(SELECT MAX(ID) FROM TABLE1 WHERE ID >TBL1.ID AND ID<(SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0))) END
)ELSE (SELECT SDATE FROM TABLE1 WHERE ID = (SELECT MAX(ID) FROM TABLE1 WHERE ID > TBL1.ID ))END AS EDATE
,CASE WHEN (SELECT COUNT(*)-1 FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0)) <0 THEN
(SELECT COUNT(*) FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MAX(ID) FROM TABLE1 WHERE ID > TBL1.ID )) ELSE
(SELECT COUNT(*)-1 FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0)) END AS DAYCOUNT
FROM TABLE1 TBL1 WHERE ALLOCATION = 0
AND (((SELECT ALLOCATION FROM TABLE1 WHERE ID=(SELECT MAX(ID) FROM TABLE1 WHERE ID < TBL1.ID))<> 0 ) OR (SELECT MAX(ID) FROM TABLE1 WHERE ID < TBL1.ID)IS NULL);
A solution without CTE:
SELECT a.aDate AS StartDate
, MIN(c.aDate) AS EndDate
, (datediff(day, a.aDate, MIN(c.aDate)) + 1) AS DayCount
FROM (
SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x
JOIN table1 y ON y.aDate <= x.aDate
GROUP BY x.id, x.aDate, x.allocation
) AS a
LEFT JOIN (
SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x
JOIN table1 y ON y.aDate <= x.aDate
GROUP BY x.id, x.aDate, x.allocation
) AS b ON a.idn = b.idn + 1 AND b.allocation = a.allocation
LEFT JOIN (
SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x
JOIN table1 y ON y.aDate <= x.aDate
GROUP BY x.id, x.aDate, x.allocation
) AS c ON a.idn <= c.idn AND c.allocation = a.allocation
LEFT JOIN (
SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x
JOIN table1 y ON y.aDate <= x.aDate
GROUP BY x.id, x.aDate, x.allocation
) AS d ON c.idn = d.idn - 1 AND d.allocation = c.allocation
WHERE b.idn IS NULL AND c.idn IS NOT NULL AND d.idn IS NULL AND a.allocation = 0
GROUP BY a.aDate
Example