Compare next record in the same table (SQL)

I have a table with 2 columns, trans_date and amount.
I want a query that gives me the amount if the trans_date difference between a record and the next record is 1 day or the same day.
Example:
AMOUNT TRANS_DATE
2645 2011-05-11 20:57:27.000
2640 2011-05-12 00:00:00.000
2645 2011-05-15 18:01:11.000
2645 2011-06-15 18:27:45.000
2645 2011-06-16 17:06:33.000
2645 2011-06-18 15:19:19.000
2645 2011-06-23 15:42:18.000
The query should show me only:
AMOUNT TRANS_DATE
2645 2011-05-11 20:57:27.000
2640 2011-05-12 00:00:00.000
2645 2011-05-15 18:01:11.000
2645 2011-06-15 18:27:45.000
2645 2011-06-16 17:06:33.000
All I have tried is:
select DATEDIFF(DAY,a.TRANS_DATE,b.TRANS_DATE) from FIN_AP_PAYMENTS a inner join ( select * from (select a.*,rank() over (order by id) as ra from FIN_AP_PAYMENTS a, FIN_AP_PAYMENTS b )tbl )
select a.TRANS_DATE,b.TRANS_DATE,rank() over (order by a.id) as ra1,rank() over (order by b.id) as ra2 from FIN_AP_PAYMENTS a,FIN_AP_PAYMENTS b
select DATEDIFF(day,tbl.TRANS_DATE,tbl2.TRANS_DATE) from (select a.*,rank() over (order by id) as ra from FIN_AP_PAYMENTS a) tbl inner join (select a.*,rank() over (order by a.id) as ra1 from FIN_AP_PAYMENTS a ) tbl2 on tbl.id=tbl2.id

Use lead() and lag() to get the next and previous values. Then check the timing between them for filtering:
select t.amount, t.trans_date
from (select t.*, lead(trans_date) over (order by trans_date) as next_td,
lag(trans_date) over (order by trans_date) as prev_td
from FIN_AP_PAYMENTS t
) t
where datediff(second, prev_td, trans_date) < 24*60*60 or
datediff(second, trans_date, next_td) < 24*60*60;
EDIT:
In SQL Server 2008, you can do this using outer apply:
select t.amount, t.trans_date
from (select t.*, tlead.trans_date as next_td,
tlag.trans_date as prev_td
from FIN_AP_PAYMENTS t outer apply
(select top 1 t2.*
from FIN_AP_PAYMENTS t2
where t2.trans_date < t.trans_date
order by trans_date desc
) tlag outer apply
(select top 1 t2.*
from FIN_AP_PAYMENTS t2
where t2.trans_date > t.trans_date
order by trans_date asc
) tlead
) t
where datediff(second, prev_td, trans_date) < 24*60*60 or
datediff(second, trans_date, next_td) < 24*60*60;
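To check the filtering logic outside the database, here is a minimal Python sketch of the same previous/next comparison. The function name rows_with_close_neighbour and the 24-hour window are assumptions that mirror the datediff(second, ...) < 24*60*60 test above; it illustrates the idea and is not the answer's own code.

```python
from datetime import datetime, timedelta

def rows_with_close_neighbour(rows, window=timedelta(hours=24)):
    """Keep (amount, trans_date) rows whose previous or next row is less than `window` away."""
    ordered = sorted(rows, key=lambda r: r[1])  # same role as ORDER BY trans_date
    kept = []
    for i, (amount, ts) in enumerate(ordered):
        prev_ts = ordered[i - 1][1] if i > 0 else None                 # lag()
        next_ts = ordered[i + 1][1] if i + 1 < len(ordered) else None  # lead()
        near_prev = prev_ts is not None and ts - prev_ts < window
        near_next = next_ts is not None and next_ts - ts < window
        if near_prev or near_next:
            kept.append((amount, ts))
    return kept

FMT = "%Y-%m-%d %H:%M:%S"
sample = [
    (2645, datetime.strptime("2011-05-11 20:57:27", FMT)),
    (2640, datetime.strptime("2011-05-12 00:00:00", FMT)),
    (2645, datetime.strptime("2011-05-15 18:01:11", FMT)),
    (2645, datetime.strptime("2011-06-15 18:27:45", FMT)),
    (2645, datetime.strptime("2011-06-16 17:06:33", FMT)),
    (2645, datetime.strptime("2011-06-18 15:19:19", FMT)),
    (2645, datetime.strptime("2011-06-23 15:42:18", FMT)),
]
result = rows_with_close_neighbour(sample)
for amount, ts in result:
    print(amount, ts)
```

On the sample data this keeps four rows (11 May, 12 May, 15 June and 16 June). The 15 May row is not kept, because neither of its neighbours is within 24 hours.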

Pre SQL Server 2012 you can use a combination of ROW_NUMBER and self joins instead of LEAD and LAG.
Example
WITH Example AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY Trans_Date) AS rn,
r.*
FROM
(
VALUES
(2645, '2011-05-11 20:57:27.000'),
(2640, '2011-05-12 00:00:00.000'),
(2645, '2011-05-15 18:01:11.000'),
(2645, '2011-06-15 18:27:45.000'),
(2645, '2011-06-16 17:06:33.000'),
(2645, '2011-06-18 15:19:19.000'),
(2645, '2011-06-23 15:42:18.000')
) AS r(Amount, Trans_Date)
)
SELECT
curr.*
FROM
Example AS curr
LEFT OUTER JOIN Example AS prv ON prv.rn = curr.rn - 1
LEFT OUTER JOIN Example AS nxt ON nxt.rn = curr.rn + 1
WHERE
DATEDIFF(DAY, curr.Trans_Date, nxt.Trans_Date) IN (0, 1)
OR DATEDIFF(DAY, prv.Trans_Date, curr.Trans_Date) IN (0, 1)
;
The CTE allows you to reuse the row number multiple times. The row number provides a sequence for the self joins. The joins allow you to see the previous and next values on the same row, for comparison.
The output of this query doesn't match your example, see my question in the comments for more on this.
I'm not sure that telling people, who are giving up their time to help you, what you are / are not here to discuss is a good idea. It certainly made me think twice before posting.

Related

SQL - Combine two rows if difference is below threshold

I have a table like this in SQL Server:
id start_time end_time
1 10:00:00 10:34:00
2 10:38:00 10:52:00
3 10:53:00 11:23:00
4 11:24:00 11:56:00
5 14:20:00 14:40:00
6 14:41:00 14:59:00
7 15:30:00 15:40:00
What I would like to have is a query that outputs consolidated records based on the time difference between two consecutive records (end_time of row n and start_time row n+1) . All records where the time difference is less than 2 minutes should be combined into one time entry and the ID of the first record should be kept. This should also combine more than two records if multiple consecutive records have a time difference less than 2 minutes.
This would be the expected output:
id start_time end_time
1 10:00:00 10:34:00
2 10:38:00 11:56:00
5 14:20:00 14:59:00
7 15:30:00 15:40:00
Thanks in advance for any tips how to build the query.
Edit:
I started with following code to calculate the lead_time and the time difference but do not know how to group and consolidate.
WITH rows AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY Id) AS rn
FROM #temp
)
SELECT mc.id, mc.start_time, mc.end_time, mp.start_time lead_time, DATEDIFF(MINUTE, mc.[end_time], mp.[start_time]) as DiffToNewSession
FROM rows mc
LEFT JOIN rows mp
ON mc.rn = mp.rn - 1
Window functions in T-SQL can handle this kind of calculation, for example:
create table #temp(id int identity(1,1), start_time time, end_time time)
insert into #temp(start_time, end_time)
values ('10:00:00', '10:34:00')
, ('10:38:00', '10:52:00')
, ('10:53:00', '11:23:00')
, ('11:24:00', '11:56:00')
, ('14:20:00', '14:40:00')
, ('14:41:00', '14:59:00')
, ('15:30:00', '15:40:00')
;with c0 as(
select *, LAG(end_time,1,'00:00:00') over (order by id) as lag_time
from #temp
), c1 as(
select *, case when DATEDIFF(MI, lag_time, start_time) <= 2 then 1 else 0 end as gflag
from c0
), c2 as(
select *, SUM(case when gflag=0 then 1 else 0 end) over(order by id) as gid
from c1
)
select MIN(id) as id, MIN(start_time) as start_time, MAX(end_time) as end_time
from c2
group by gid
In order to better describe the process of data construction, I simply use c0, c1, c2... to represent levels, you can merge some levels and optimize.
If you can’t use id as a sorting condition, then you need to change the sorting part in the above statement.
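The grouping steps above (compare each start with the previous end, start a new group when the gap is too large, then take the first id, first start and last end per group) can be sketched in Python. The function name merge_sessions and the helper t are hypothetical. The sketch uses "at most 2 minutes" to match the <= 2 test in the query, although the question says "less than 2 minutes".

```python
from datetime import datetime, timedelta

def merge_sessions(rows, threshold=timedelta(minutes=2)):
    """Merge consecutive (id, start, end) rows whose gap is at most `threshold`."""
    merged = []
    for row_id, start, end in sorted(rows, key=lambda r: r[0]):  # ORDER BY id
        if merged and start - merged[-1][2] <= threshold:
            # Small gap: extend the current group, keeping its first id and start.
            first_id, first_start, last_end = merged[-1]
            merged[-1] = (first_id, first_start, max(last_end, end))
        else:
            # Large gap (or first row): open a new group.
            merged.append((row_id, start, end))
    return merged

def t(text):
    """Parse an HH:MM:SS string (hypothetical helper for the sample data)."""
    return datetime.strptime(text, "%H:%M:%S")

sessions = [
    (1, t("10:00:00"), t("10:34:00")),
    (2, t("10:38:00"), t("10:52:00")),
    (3, t("10:53:00"), t("11:23:00")),
    (4, t("11:24:00"), t("11:56:00")),
    (5, t("14:20:00"), t("14:40:00")),
    (6, t("14:41:00"), t("14:59:00")),
    (7, t("15:30:00"), t("15:40:00")),
]
summary = [
    (i, s.strftime("%H:%M:%S"), e.strftime("%H:%M:%S"))
    for i, s, e in merge_sessions(sessions)
]
for line in summary:
    print(*line)
```

On the sample rows this yields the four consolidated entries from the question (ids 1, 2, 5 and 7).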
You can use a recursive CTE to get the result you want. This method simply compares the current end_time with the next start_time. If the gap is within the 2-minute threshold, the row keeps the same grp_start as the previous row; otherwise it starts a new group. At the end, simply do a GROUP BY on grp_start.
with rcte as
(
-- anchor member
select *, grp_start = start_time
from tbl
where id = 1
union all
-- recursive member
select t.id, t.start_time, t.end_time,
grp_start = case when datediff(second, r.end_time, t.start_time) <= 120
then r.grp_start
else t.start_time
end
from tbl t
inner join rcte r on t.id = r.id + 1
)
select id = min(id), grp_start as start_time, max(end_time) as end_time
from rcte
group by grp_start
I think this should do the trick without recursion. Again, I used several CTEs to make the solution a bit easier to read; it can probably be reduced a little.
INSERT INTO T1 VALUES
(1,'10:00:00','10:34:00')
,(2,'10:38:00','10:52:00')
,(3,'10:53:00','11:23:00')
,(4,'11:24:00','11:56:00')
,(5,'14:20:00','14:40:00')
,(6,'14:41:00','14:59:00')
,(7,'15:30:00','15:40:00')
GO
WITH cte AS(
SELECT *
,ROW_NUMBER() OVER (ORDER BY id) AS rn
,DATEDIFF(MINUTE, ISNULL(LAG(endtime) OVER (ORDER BY id), starttime), starttime) AS diffMin
,COUNT(*) OVER (PARTITION BY (SELECT 1)) as maxRn
FROM T1
),
cteFirst AS(
SELECT *
FROM cte
WHERE rn = 1 OR diffMin > 2
),
cteGrp AS(
SELECT *
,ISNULL(LEAD(rn) OVER (ORDER BY id), maxRn+1) AS nextRn
FROM cteFirst
)
SELECT f.id, f.starttime, MAX(ISNULL(n.endtime, f.endtime)) AS endtime
FROM cteGrp f
LEFT JOIN cte n ON n.rn >= f.rn AND n.rn < f.nextRn
GROUP BY f.id, f.starttime

oracle sql get transactions between the period

I have 3 tables in oracle sql namely investor, share and transaction.
I am trying to get new investors invested in any shares for a certain period. As they are the new investor, there should not be a transaction in the transaction table for that investor against that share prior to the search period.
For the transaction table with the following records:
Id TranDt InvCode ShareCode
1 2020-01-01 00:00:00.000 inv1 S1
2 2019-04-01 00:00:00.000 inv1 S1
3 2020-04-01 00:00:00.000 inv1 S1
4 2021-03-06 11:50:20.560 inv2 S2
5 2020-04-01 00:00:00.000 inv3 S1
For the search period between 2020-01-01 and 2020-05-01, I should get the output as
5 2020-04-01 00:00:00.000 inv3 S1
Though there are transactions for inv1 in the table for that period, there is also a transaction prior to the search period, so that shouldn't be included as it's not considered as new investor within the search period.
Below query is working but it's really taking ages to return the results calling from c# code leading to timeout issues. Is there anything we can do to refine to get the results quicker?
WITH
INVESTORS AS
(
SELECT I.INVCODE FROM INVESTOR I WHERE I.CLOSED IS NULL
),
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
),
SHARES_IN_PERIOD AS
(
SELECT DISTINCT
T.INVCODE,
T.SHARECODE,
T.TYPE
FROM TRANSACTION T
JOIN INVESTORS I ON T.INVCODE = I.INVCODE
JOIN SHARES S ON T.SHARECODE = S.SHARECODE
WHERE T.TRANDT >= :startDate AND T.TRANDT <= :endDate
),
PREVIOUS_SHARES AS
(
SELECT DISTINCT
T.INVCODE,
T.SHARECODE,
T.TYPE
FROM TRANSACTION T
JOIN INVESTORS I ON T.INVCODE = I.INVCODE
JOIN SHARES S ON T.SHARECODE = S.SHARECODE
WHERE T.TRANDT < :startDate
)
SELECT
DISTINCT
SP.INVCODE AS InvestorCode,
SP.SHARECODE AS ShareCode,
SP.TYPE AS ShareType
FROM SHARES_IN_PERIOD SP
WHERE (SP.INVCODE, SP.SHARECODE, SP.TYPE) NOT IN
(
SELECT
PS.INVCODE,
PS.SHARECODE,
PS.TYPE
FROM PREVIOUS_SHARES PS
)
Following the suggestion from @Gordon Linoff, I tried the following options (for all the shares I need), but they take a long time too. The transaction table has over 32 million rows.
1.
WITH
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
)
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
join shares s on s.sharecode = t.sharecode
where seqnum = 1 and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
2.
WITH
INVESTORS AS
(
SELECT I.INVCODE FROM INVESTOR I WHERE I.CLOSED IS NULL
),
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
)
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
join investors i on i.invcode = t.invcode
join shares s on s.sharecode = t.sharecode
where seqnum = 1 and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
3.
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
where seqnum = 1 and
t.sharecode IN (SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL)
and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
If you want to know if the first record in transactions for a share is during a period, you can use window functions:
select t.*
from (select t.*,
row_number() over (partition by invcode, sharecode order by trandt) as seqnum
from transactions t
) t
where seqnum = 1 and
t.sharecode = :sharecode and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
For performance, this query wants an index on transactions(invcode, sharecode, trandt).
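The "first transaction per investor and share" rule can be checked on the sample rows with a small Python sketch. The function name new_investments and the dictionary keys are hypothetical, and the period is treated as half-open (start inclusive, end exclusive), as in the query above.

```python
from datetime import datetime

def new_investments(transactions, start, end):
    """Return the earliest transaction per (investor, share) if it falls in [start, end)."""
    first = {}
    for tx in transactions:
        key = (tx["inv"], tx["share"])  # PARTITION BY invcode, sharecode
        if key not in first or tx["dt"] < first[key]["dt"]:
            first[key] = tx             # the row that would get seqnum = 1
    in_period = [tx for tx in first.values() if start <= tx["dt"] < end]
    return sorted(in_period, key=lambda tx: tx["id"])

transactions = [
    {"id": 1, "dt": datetime(2020, 1, 1), "inv": "inv1", "share": "S1"},
    {"id": 2, "dt": datetime(2019, 4, 1), "inv": "inv1", "share": "S1"},
    {"id": 3, "dt": datetime(2020, 4, 1), "inv": "inv1", "share": "S1"},
    {"id": 4, "dt": datetime(2021, 3, 6, 11, 50, 20), "inv": "inv2", "share": "S2"},
    {"id": 5, "dt": datetime(2020, 4, 1), "inv": "inv3", "share": "S1"},
]
found = new_investments(transactions, datetime(2020, 1, 1), datetime(2020, 5, 1))
print([tx["id"] for tx in found])
```

Only the row with Id 5 is returned: inv1 already has a transaction in 2019, and inv2's first transaction is after the period.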

calculate time difference of consecutive row dates in SQL

Hello, I am trying to calculate the time difference between 2 consecutive rows for Date (either in hours or days), as shown in the attached image.
Highlighted in yellow is the result I want, which is basically the difference between the date in that row and the one above.
How can we achieve this in SQL? Below is my complex code, which has the rest of the fields in it:
with cte
as
(
select m.voucher_no, CONVERT(VARCHAR(30),CONVERT(datetime, f.action_Date, 109),100) as action_date,f.col1_Value,f.col3_value,f.col4_value,f.comments,f.distr_user,f.wf_status,f.action_code,f.wf_user_id
from attdetailmap m
LEFT JOIN awftaskfin f ON f.oid = m.oid and f.client ='PC'
where f.action_Date !='' and action_date between '$?datef' and '$?datet'
),
/* select *, ROW_NUMBER() OVER(PARTITION BY action_Date,distr_user,wf_Status,wf_user_id order by action_Date,distr_user,wf_Status,wf_user_id ) as row_no_1 from cte */
cte2 as
(
select *, ROW_NUMBER() OVER(PARTITION BY voucher_no,action_Date,distr_user,wf_Status,wf_user_id order by voucher_no ) as row_no_1 from cte
)
select distinct(v.dim_value) as resid,c.voucher_no,CONVERT(datetime, c.action_Date, 109) as action_Date,c.col4_value,c.comments,c.distr_user,v.description,c.wf_status,c.action_code, c.wf_user_id,v1.description as name,r.rel_value as pay_office,r1.rel_value as site
from cte2 c
LEFT OUTER JOIN aagviuserdetail v ON v.user_id = c.distr_user
LEFT OUTER JOIN aagviuserdetail v1 ON v1.user_id = c.wf_user_id
LEFT OUTER JOIN ahsrelvalue r ON r.resource_id = v.dim_Value and r.rel_Attr_id = 'P1' and r.period_to = '209912'
LEFT OUTER JOIN ahsrelvalue r1 ON r1.resource_id = v.dim_Value and r1.rel_Attr_id = 'Z1' and r1.period_to = '209912'
where c.row_no_1 = '1' and r.rel_value like '$?site1' and voucher_no like '$?trans'
order by voucher_no,action_Date
The key idea is lag(). However, date/time functions vary among databases. So, the idea is:
select t.*,
(date - lag(date) over (partition by transaction_no order by date)) as diff
from t;
I should note that this exact syntax might not work in your database, because the - operator may not even be defined on date/time values. However, lag() is a standard function and should be available.
For instance, in SQL Server, this would look like:
select t.*,
datediff(second, lag(date) over (partition by transaction_no order by date), date) / (24.0 * 60 * 60) as diff_days
from t;
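The same "current row minus previous row, per group" calculation can be sketched in Python. The sample rows below are hypothetical (the question's data was only available as an image), and the function name diffs_in_days is an assumption.

```python
from datetime import datetime
from itertools import groupby

def diffs_in_days(rows):
    """rows: (transaction_no, date). Return (transaction_no, date, diff_days);
    diff_days is None for the first row of each transaction_no."""
    out = []
    ordered = sorted(rows)  # PARTITION BY transaction_no ORDER BY date
    for key, group in groupby(ordered, key=lambda r: r[0]):
        prev = None
        for _, when in group:
            # Equivalent of datediff(second, lag(date), date) / (24.0 * 60 * 60)
            diff = None if prev is None else (when - prev).total_seconds() / 86400
            out.append((key, when, diff))
            prev = when
    return out

# Hypothetical sample data.
rows = [
    ("A", datetime(2021, 1, 1, 0, 0)),
    ("A", datetime(2021, 1, 2, 12, 0)),
    ("B", datetime(2021, 1, 5, 0, 0)),
    ("A", datetime(2021, 1, 4, 12, 0)),
]
diffs = diffs_in_days(rows)
for row in diffs:
    print(row)
```

The first row of each group gets no difference, just as lag() returns NULL when there is no previous row.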

Oracle SQL Hierarchy Summation

I have a table TRANS that contains the following records:
TRANS_ID TRANS_DT QTY
1 01-Aug-2020 5
1 01-Aug-2020 1
1 03-Aug-2020 2
2 02-Aug-2020 1
The expected output:
TRANS_ID TRANS_DT BEGBAL TOTAL END_BAL
1 01-Aug-2020 0 6 6
1 02-Aug-2020 6 0 6
1 03-Aug-2020 6 2 8
2 01-Aug-2020 0 0 0
2 02-Aug-2020 0 1 1
2 03-Aug-2020 1 0 1
Each trans_id starts with a beginning balance of 0 (01-Aug-2020). For succeeding days, the beginning balance is the ending balance of the previous day and so on.
I can create PL/SQL block to create the output. Is it possible to get the output in 1 SQL statement?
Thanks.
Try the following script using CTEs:
WITH CTE
AS
(
SELECT DISTINCT A.TRANS_ID,B.TRANS_DT
FROM your_table A
CROSS JOIN (SELECT DISTINCT TRANS_DT FROM your_table) B
),
CTE2
AS
(
SELECT C.TRANS_ID,C.TRANS_DT,SUM(D.QTY) QTY
FROM CTE C
LEFT JOIN your_table D
ON C.TRANS_ID = D.TRANS_ID
AND C.TRANS_DT = D.TRANS_DT
GROUP BY C.TRANS_ID,C.TRANS_DT
ORDER BY C.TRANS_ID,C.TRANS_DT
)
SELECT F.TRANS_ID,F.TRANS_DT,
(
SELECT COALESCE (SUM(QTY), 0) FROM CTE2 E
WHERE E.TRANS_ID = F.TRANS_ID AND E.TRANS_DT < F.TRANS_DT
) BEGBAL,
(
SELECT COALESCE (SUM(QTY), 0) FROM CTE2 E
WHERE E.TRANS_ID = F.TRANS_ID AND E.TRANS_DT = F.TRANS_DT
) TOTAL ,
(
SELECT COALESCE (SUM(QTY), 0) FROM CTE2 E
WHERE E.TRANS_ID = F.TRANS_ID AND E.TRANS_DT <= F.TRANS_DT
) END_BAL
FROM CTE2 F
You can also do it like this (I would assume it's a bit faster):
with
dt_between as (
select mindt + level - 1 as trans_dt
from (select min(trans_dt) as mindt, max(trans_dt) as maxdt from t)
connect by level <= maxdt - mindt + 1
),
dt_for_trans_id as (
select *
from dt_between, (select distinct trans_id from t)
),
qty_change as (
select distinct trans_id, trans_dt,
sum(qty) over (partition by trans_id, trans_dt) as total,
sum(qty) over (partition by trans_id order by trans_dt) as end_bal
from t
right outer join dt_for_trans_id using (trans_id, trans_dt)
)
select
trans_id,
to_char(trans_dt, 'DD-Mon-YYYY') as trans_dt,
nvl(lag(end_bal) over (partition by trans_id order by trans_dt), 0) as beg_bal,
nvl(total, 0) as total,
nvl(end_bal, 0) as end_bal
from qty_change q
order by trans_id, trans_dt
dt_between returns all the days between min(trans_dt) and max(trans_dt) in your data.
dt_for_trans_id returns all these days for each trans_id in your data.
qty_change finds difference for each day (which is TOTAL in your example) and cumulative sum over all the days (which is END_BAL in your example).
The main select takes END_BAL from previous day and calls it BEG_BAL, it also does some formatting of final output.
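The four steps described above can be traced in a short Python sketch: build the full day range, repeat it for each trans_id, total the quantity per day, and carry the running balance forward. The function name daily_balances is hypothetical, and the sketch assumes the day range is the overall minimum to maximum TRANS_DT, as in the query.

```python
from collections import defaultdict
from datetime import date, timedelta

def daily_balances(rows):
    """rows: (trans_id, trans_dt, qty).
    Return (trans_id, trans_dt, beg_bal, total, end_bal) for every id and every day."""
    totals = defaultdict(int)
    for trans_id, trans_dt, qty in rows:
        totals[(trans_id, trans_dt)] += qty          # TOTAL per id and day
    first_day = min(r[1] for r in rows)
    last_day = max(r[1] for r in rows)
    span = (last_day - first_day).days + 1
    days = [first_day + timedelta(days=n) for n in range(span)]  # dt_between
    result = []
    for trans_id in sorted({r[0] for r in rows}):    # dt_for_trans_id
        balance = 0                                  # each id starts at 0
        for day in days:
            total = totals.get((trans_id, day), 0)
            result.append((trans_id, day, balance, total, balance + total))
            balance += total                         # END_BAL becomes next BEG_BAL
    return result

trans = [
    (1, date(2020, 8, 1), 5),
    (1, date(2020, 8, 1), 1),
    (1, date(2020, 8, 3), 2),
    (2, date(2020, 8, 2), 1),
]
balances = daily_balances(trans)
for trans_id, day, beg, total, end in balances:
    print(trans_id, day.strftime("%d-%b-%Y"), beg, total, end)
```

On the sample rows this reproduces the six lines of the expected output in the question.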
First of all, you need to generate dates, then you need to aggregate your values by TRANS_DT, and then left join your aggregated data to dates. The easiest way to get the required sums is to use analytic window functions:
with dates(dt) as ( -- generating dates between min(TRANS_DT) and max(TRANS_DT) from TRANS
select min(trans_dt) from trans
union all
select dt+1 from dates
where dt+1<=(select max(trans_dt) from trans)
)
,trans_agg as ( -- aggregating QTY in TRANS
select TRANS_ID,TRANS_DT,sum(QTY) as QTY
from trans
group by TRANS_ID,TRANS_DT
)
select -- using left join partition by to get data on daily basis for each trans_id:
dt,
trans_id,
nvl(sum(qty) over(partition by trans_id order by dates.dt range between unbounded preceding and 1 preceding),0) as BEGBAL,
nvl(qty,0) as TOTAL,
nvl(sum(qty) over(partition by trans_id order by dates.dt),0) as END_BAL
from dates
left join trans_agg tr
partition by (trans_id)
on tr.trans_dt=dates.dt;
Full example with sample data:
alter session set nls_date_format='dd-mon-yyyy';
with trans(TRANS_ID,TRANS_DT,QTY) as (
select 1,to_date('01-Aug-2020'), 5 from dual union all
select 1,to_date('01-Aug-2020'), 1 from dual union all
select 1,to_date('03-Aug-2020'), 2 from dual union all
select 2,to_date('02-Aug-2020'), 1 from dual
)
,dates(dt) as ( -- generating dates between min(TRANS_DT) and max(TRANS_DT) from TRANS
select min(trans_dt) from trans
union all
select dt+1 from dates
where dt+1<=(select max(trans_dt) from trans)
)
,trans_agg as ( -- aggregating QTY in TRANS
select TRANS_ID,TRANS_DT,sum(QTY) as QTY
from trans
group by TRANS_ID,TRANS_DT
)
select
dt,
trans_id,
nvl(sum(qty) over(partition by trans_id order by dates.dt range between unbounded preceding and 1 preceding),0) as BEGBAL,
nvl(qty,0) as TOTAL,
nvl(sum(qty) over(partition by trans_id order by dates.dt),0) as END_BAL
from dates
left join trans_agg tr
partition by (trans_id)
on tr.trans_dt=dates.dt;
You can use a recursive query to generate the overall date range, cross join it with the list of distinct trans_id, then bring in the table with a left join. The last step is aggregation and window functions:
with all_dates (trans_dt, max_dt) as (
select min(trans_dt), max(trans_dt) from trans
union all
select trans_dt + interval '1' day, max_dt from all_dates where trans_dt < max_dt
)
select
i.trans_id,
d.trans_dt,
coalesce(sum(sum(t.qty)) over(partition by i.trans_id order by d.trans_dt), 0) - coalesce(sum(t.qty), 0) begbal,
coalesce(sum(t.qty), 0) total,
coalesce(sum(sum(t.qty)) over(partition by i.trans_id order by d.trans_dt), 0) endbal
from all_dates d
cross join (select distinct trans_id from trans) i
left join trans t on t.trans_id = i.trans_id and t.trans_dt = d.trans_dt
group by i.trans_id, d.trans_dt
order by i.trans_id, d.trans_dt

Using T-SQL, how do I generate a result that shows a range of dates

Using SQL Server, how do I generate a result set that shows a range of dates, like so:
StartDate EndDate
01/01/2014 01/04/2014
01/08/2014 01/11/2014
01/14/2014 01/15/2014
The original data had the dates in this format:
ColumnA DateColumn
blah 01/01/2014
blah 01/02/2014
blah 01/03/2014
blah 01/04/2014
blah 01/08/2014
blah 01/09/2014
blah 01/10/2014
blah 01/11/2014
blah 01/14/2014
blah 01/15/2014
Currently, I have a bunch of queries that do this, but I'm wondering if I can do it in less code:
SELECT ROW_NUMBER() OVER(ORDER BY DateColumn) AS rownum,
DateColumn
INTO #main
FROM MyTable
SELECT m1.DateColumn AS TBegin,
m2.DateColumn AS TEnd,
COALESCE(DATEDIFF(day, m2.TimePk, m1.TimePk), 0) AS Gap
INTO #Gap
FROM #main m1
LEFT OUTER JOIN #main m2
ON m1.rownum = m2.rownum + 1
ORDER BY m1.DateColumn
SELECT ROW_NUMBER() OVER(ORDER BY i_id, TBegin) AS rownum,
TBegin
INTO #Begin
FROM #Gap
WHERE Gap <> 1
ORDER BY TBegin
SELECT ROW_NUMBER() OVER(ORDER BY i_id, TEnd) AS rownum,
TEnd
INTO #End
FROM (
SELECT TEnd
FROM #Gap
WHERE Gap > 1
UNION
SELECT MAX(TBegin)
FROM #Gap
) as t
ORDER BY TEnd
SELECT b.TBegin,
e.TEnd
FROM #Begin b
INNER JOIN #End e
ON b.i_id = e.i_id
AND b.rownum = e.rownum
ORDER BY b.TBegin
Any ideas on how to simplify or approach this in an entirely different way?
My approach to these is to identify the first date that has no date preceding it. This is the beginning of a group. Then I take the cumulative sum of that as a group identifier, and do the aggregation.
SQL Server 2008 doesn't have lag or cumulative sums, so I use correlated subqueries for this:
with mt as (
select t.*,
(case when (select top 1 t2.dateColumn
from MyTable t2
where t2.ColumnA = t.ColumnA and
t2.dateColumn < t.dateColumn
order by t2.dateColumn desc
) = dateadd(day, -1, t.datecolumn)
then 0
else 1
end) as IsStart
from MyTable t
),
mtcum as (
select mt.*,
(select sum(mt2.IsStart)
from mt mt2
where mt2.ColumnA = mt.ColumnA and
mt2.dateColumn <= mt.DateColumn
) as grpId
from mt
)
select ColumnA, min(dateColumn) as StartDate, max(dateColumn) as EndDate
from mtcum
group by ColumnA, grpId;
EDIT:
An easier way to approach this is to observe that, within a run of consecutive dates, the difference between each date and its row number is constant.
select columnA, min(dateColumn) as StartDate, max(dateColumn) as EndDate
from (select mt.*, row_number() over (partition by ColumnA order by datecolumn) as seqnum
from mytable mt
) t
group by columnA, dateadd(day, - seqnum, datecolumn);
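The "date minus row number" trick can be tried in Python on the question's sample dates. The function name date_ranges is hypothetical, and the sketch assumes one row per date (duplicates are removed first, since they would break the constant-difference property).

```python
from datetime import date, timedelta
from itertools import groupby

def date_ranges(dates):
    """Collapse dates into (start, end) ranges of consecutive days."""
    ordered = sorted(set(dates))
    # Subtract the 1-based row number from each date; consecutive dates share the result.
    keyed = [(d - timedelta(days=n), d) for n, d in enumerate(ordered, start=1)]
    ranges = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        members = [d for _, d in group]
        ranges.append((members[0], members[-1]))  # min and max of the group
    return ranges

days = [date(2014, 1, n) for n in (1, 2, 3, 4, 8, 9, 10, 11, 14, 15)]
ranges = date_ranges(days)
for start, end in ranges:
    print(start.strftime("%m/%d/%Y"), end.strftime("%m/%d/%Y"))
```

On the sample dates this gives the three ranges from the question: 01/01 to 01/04, 01/08 to 01/11, and 01/14 to 01/15.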
This will work for you. It's still fairly complex though. It uses inner queries to find the first date that is after a gap for each date. This way all days belonging to the same group of dates can be grouped together.
select MIN(DateColumn) StartDate, MAX(DateColumn) EndDate from
(select X.DateColumn, MIN(Y.DateColumn) MinOverGap from
(select DateColumn, ROW_NUMBER() OVER (ORDER BY DateColumn) RowNumber
from MyTable) X
left join
(select DateColumn, ROW_NUMBER() OVER (ORDER BY DateColumn) RowNumber
from MyTable) Y
on DATEADD(d, Y.RowNumber - 1, X.DateColumn) <> DATEADD(d, X.RowNumber -1, Y.DateColumn) AND X.DateColumn < Y.DateColumn
group by x.DateColumn) grouped
group by MinOverGap
order by 1