datediff for row that meets my condition only once per row - sql

I want to do a DATEDIFF between two dates on different rows, but only for rows that meet a condition.
My table looks like the following, with additional columns (like guid):
Id | CreateDateAndTime | condition
---------------------------------------------------------------
1 | 2018-12-11 12:07:55.273 | with this
2 | 2018-12-11 12:07:53.550 | I need to compare this state
3 | 2018-12-11 12:07:53.550 | with this
4 | 2018-12-11 12:06:40.780 | state 3
5 | 2018-12-11 12:06:39.317 | I need to compare this state
With this example I would like to have 2 rows in my result, representing the difference between the dates of ids 5 and 3 and of ids 2 and 1.
So far I have come up with a query that gives me the difference between the dates of ids 5-3, 5-1 and 2-1:
with t as (
SELECT TOP (100000)
*
FROM mydatatable
order by CreateDateAndTime desc)
select
DATEDIFF(SECOND, f.CreateDateAndTime, s.CreateDateAndTime) time
from t f
join t s on (f.[guid] = s.[guid] )
where f.condition like '%I need to compare this state%'
and s.condition like '%with this%'
and (f.id - s.id) < 0
My problem is I cannot set f.id - s.id to a value since other rows can be between the ones I want to make the diff on.
How can I make the datediff only on the first rows that meet my conditions?
EDIT: To make it clearer:
My condition is an event name, and I want to calculate the time between the occurrence of my event 1 and my event 2 and fill a column named time, for example.
@Salman A's answer is really close to what I want, except that it does not work when my event 2 does not happen (which was not in my initial example).
I.e. in a table like the following, it will compute the datediff between row id 5 and row id 2:
Id | CreateDateAndTime | condition
---------------------------------------------------------------
1 | 2018-12-11 12:07:55.273 | with this
2 | 2018-12-11 12:07:53.550 | I need to compare this state
3 | 2018-12-11 12:07:53.550 | state 3
4 | 2018-12-11 12:06:40.780 | state 3
5 | 2018-12-11 12:06:39.317 | I need to compare this state
The code I modified:
WITH cte AS (
SELECT id
, CreateDateAndTime AS currdate
, LAG(CreateDateAndTime) OVER (PARTITION BY guid ORDER BY id desc ) AS prevdate
, condition
FROM t
WHERE condition IN ('I need to compare this state', 'with this ')
)
SELECT *
,DATEDIFF(second, currdate, prevdate) time
FROM cte
WHERE condition = 'I need to compare this state '
and DATEDIFF(second, currdate, prevdate) != 0
order by id desc

Perhaps you want to match ids with the nearest smaller id. You can use window functions for this:
WITH cte AS (
SELECT id
, CreateDateAndTime AS currdate
, CASE WHEN LAG(condition) OVER (PARTITION BY guid ORDER BY id) = 'with this'
THEN LAG(CreateDateAndTime) OVER (PARTITION BY guid ORDER BY id) END AS prevdate
, condition
FROM t
WHERE condition IN ('I need to compare this state', 'with this')
)
SELECT *
, DATEDIFF(second, currdate, prevdate)
FROM cte
WHERE condition = 'I need to compare this state'
The CASE expression matches an 'I need to compare this state' row with the preceding 'with this' row. If you have mismatching pairs, it returns NULL.
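To make the NULL behaviour concrete, here is a minimal self-contained sketch that runs the same LAG/CASE pattern against the rows from the question's EDIT. It assumes SQL Server 2012 or later (for LAG); the guid value 'g1' and the inline VALUES table are made up for illustration and are not part of the original schema.

```sql
-- Sample rows from the question's EDIT; the guid value 'g1' is hypothetical.
WITH t (id, guid, CreateDateAndTime, condition) AS (
    SELECT id, guid, CAST(created AS datetime2(3)), condition
    FROM (VALUES
        (1, 'g1', '2018-12-11 12:07:55.273', 'with this'),
        (2, 'g1', '2018-12-11 12:07:53.550', 'I need to compare this state'),
        (3, 'g1', '2018-12-11 12:07:53.550', 'state 3'),
        (4, 'g1', '2018-12-11 12:06:40.780', 'state 3'),
        (5, 'g1', '2018-12-11 12:06:39.317', 'I need to compare this state')
    ) AS v (id, guid, created, condition)
),
cte AS (
    SELECT id
         , CreateDateAndTime AS currdate
           -- Keep the neighbour's date only when the neighbour is a 'with this' row.
         , CASE WHEN LAG(condition) OVER (PARTITION BY guid ORDER BY id) = 'with this'
                THEN LAG(CreateDateAndTime) OVER (PARTITION BY guid ORDER BY id)
           END AS prevdate
         , condition
    FROM t
    -- The filter runs before LAG, so LAG only ever sees the two event types.
    WHERE condition IN ('I need to compare this state', 'with this')
)
SELECT id, DATEDIFF(second, currdate, prevdate) AS seconds
FROM cte
WHERE condition = 'I need to compare this state'
ORDER BY id;
-- id 2 -> 2 seconds (paired with id 1)
-- id 5 -> NULL (its neighbour is id 2, which is not a 'with this' row)
```

Adding `AND prevdate IS NOT NULL` to the final WHERE clause would drop the unmatched row instead of reporting it as NULL.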

Try using the analytic function LEAD():
with cte as
(
select 1 as id, '2018-12-11 12:07:55.273' as CreateDateAndTime,'with this' as condition union all
select 2,'2018-12-11 12:07:53.550','I need to compare this state' union all
select 3,'2018-12-11 12:07:53.550','with this' union all
select 4,'2018-12-11 12:06:40.780','state 3' union all
select 5,'2018-12-11 12:06:39.317','I need to compare this state'
) select *,
DATEDIFF(SECOND,CreateDateAndTime,lead(CreateDateAndTime) over(order by Id))
from cte
where condition in ('with this','I need to compare this state')

You ideally want LEADIF/LAGIF functions, because you are looking for the previous row where condition = 'with this'. Since there are no LEADIF/LAGIF functions, I think the best option is to use OUTER/CROSS APPLY with TOP 1, e.g.
CREATE TABLE #T (Id INT, CreateDateAndTime DATETIME, condition VARCHAR(28));
INSERT INTO #T (Id, CreateDateAndTime, condition)
VALUES
(1, '2018-12-11 12:07:55', 'with this'),
(2, '2018-12-11 12:07:53', 'I need to compare this state'),
(3, '2018-12-11 12:07:53', 'with this'),
(4, '2018-12-11 12:06:40', 'state 3'),
(5, '2018-12-11 12:06:39', 'I need to compare this state');
SELECT ID1 = t1.ID,
Date1 = t1.CreateDateAndTime,
ID2 = t2.ID,
Date2 = t2.CreateDateAndTime,
Difference = DATEDIFF(SECOND, t1.CreateDateAndTime, t2.CreateDateAndTime)
FROM #T AS t1
CROSS APPLY
( SELECT TOP 1 t2.CreateDateAndTime, t2.ID
FROM #T AS t2
WHERE t2.Condition = 'with this'
AND t2.CreateDateAndTime > t1.CreateDateAndTime
--AND t2.GUID = t1.GUID
ORDER BY CreateDateAndTime
) AS t2
WHERE t1.Condition = 'I need to compare this state';
Which gives:
ID1 Date1 ID2 Date2 Difference
-------------------------------------------------------------------------------
2 2018-12-11 12:07:53.000 1 2018-12-11 12:07:55.000 2
5 2018-12-11 12:06:39.000 3 2018-12-11 12:07:53.000 74

I would enumerate the values and then use window functions for the difference.
select min(id), max(id),
datediff(second, min(CreateDateAndTime), max(CreateDateAndTime)) as seconds
from (select t.*,
row_number() over (partition by condition order by CreateDateAndTime) as seqnum
from t
where condition in ('I need to compare this state', 'with this')
) t
group by seqnum;
I cannot tell what you want the results to look like. This version only outputs the differences, with the ids of the rows you care about. The difference can also be applied to the original rows, rather than put into summary rows.
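As a rough illustration of how the enumeration pairs rows, the sketch below runs the same query against the five rows from the original question. It assumes SQL Server 2008 or later (for the VALUES table constructor); the inline table stands in for the real one.

```sql
-- The five sample rows from the question, inlined for illustration.
WITH t (id, CreateDateAndTime, condition) AS (
    SELECT id, CAST(created AS datetime2(3)), condition
    FROM (VALUES
        (1, '2018-12-11 12:07:55.273', 'with this'),
        (2, '2018-12-11 12:07:53.550', 'I need to compare this state'),
        (3, '2018-12-11 12:07:53.550', 'with this'),
        (4, '2018-12-11 12:06:40.780', 'state 3'),
        (5, '2018-12-11 12:06:39.317', 'I need to compare this state')
    ) AS v (id, created, condition)
)
SELECT MIN(id) AS min_id, MAX(id) AS max_id,
       DATEDIFF(second, MIN(CreateDateAndTime), MAX(CreateDateAndTime)) AS seconds
FROM (
    -- Number each event type separately: the n-th 'I need to compare this state'
    -- row gets the same seqnum as the n-th 'with this' row.
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY condition ORDER BY CreateDateAndTime) AS seqnum
    FROM t
    WHERE condition IN ('I need to compare this state', 'with this')
) AS s
GROUP BY seqnum;
-- seqnum 1 pairs ids 5 and 3 -> 74 seconds
-- seqnum 2 pairs ids 2 and 1 -> 2 seconds
```

Because the pairing is purely positional, it only holds when every first event has a matching second event. With the data from the question's EDIT (no 'with this' row for id 5), id 5 would be paired with id 1.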

Related

JOIN when no data exists

Disclaimer: I don't have much of a tech background and am just learning SQL, so apologies.
I have 2 tables for account information: one for reference data (ACCT_RD) and one for transaction data (ACCT_TD).
ACCT_RD is like this
|ACCT_ID|ACCT_NAME|
|-------|---------|
|1      |abc      |
|2      |xyz      |
ACCT_TD is like this
|ACCT_ID|DATE      |VALUE  |
|-------|----------|-------|
|1      |01-31-2020|4000.33|
|1      |01-31-2021|2000.11|
|2      |01-31-2020|5666.23|
I want a query where I will pass the account id and date and it will return me data in format
|ACCT_ID|NAME|DATE      |VALUE  |
|-------|----|----------|-------|
|1      |abc |01-31-2020|4000.33|
|1      |abc |null      |null   |
it could be that the ACCT_TD may not contain data (no rows) for all dates but ACCT_RD will always have the info.
I am trying a LEFT Join like
SELECT R.ACCT_ID, R.NAME, T.VALUE, T.DATE
FROM ACCT_RD R
LEFT JOIN ACCT_TD T ON R.ACCT_ID = T.ACCT_ID
WHERE R.ACCT_ID = 1
AND T.DATE IN ('01-31-2000','01-31-2020')
I am getting a row where I have data in both and no row where I don't have data in ACCT_TD.
Is it because in ACCT_TD no row exists for date '01-31-2000' and it is not a column for ACCT_RD?
How can I achieve what I am looking for?
It looks like you have the following requirements for the result set:
For any given ACCT_ID, it must contain all the corresponding rows (if any) from ACCT_TD.
The VALUE and DATE result set columns are equal to ACCT_TD.VALUE and ACCT_TD.DATE if ACCT_TD.DATE is in the list of date parameters passed, and are both NULL otherwise.
If this understanding of the requirements is correct, then try this:
WITH
ACCT_RD (ACCT_ID, ACCT_NAME) AS
(
VALUES
(1, 'abc')
, (2, 'xyz')
)
, ACCT_TD (ACCT_ID, DATE, VALUE) AS
(
VALUES
(1, '01-31-2020', 4000.33)
, (1, '01-31-2021', 2000.11)
, (2, '01-31-2020', 5666.23)
)
SELECT
R.ACCT_ID, R.ACCT_NAME
, CASE WHEN T.DATE IN ('01-31-2000','01-31-2020') THEN T.DATE END AS DATE
, CASE WHEN T.DATE IN ('01-31-2000','01-31-2020') THEN T.VALUE END AS VALUE
FROM ACCT_RD R
LEFT JOIN ACCT_TD T ON R.ACCT_ID = T.ACCT_ID
WHERE R.ACCT_ID = 1;
The result is:
|ACCT_ID|ACCT_NAME|DATE |VALUE |
|-------|---------|----------|-------|
|1 |abc |01-31-2020|4000.33|
|1 |abc | | |
If you want to guarantee that one row is returned even when there is no match in the second table, just move the date filters to the on clause:
SELECT R.ACCT_ID, R.ACCT_NAME, T.VALUE, T.DATE
FROM ACCT_RD R LEFT JOIN
ACCT_TD T
ON R.ACCT_ID = T.ACCT_ID AND
T.DATE IN ('01-31-2000', '01-31-2020')
WHERE R.ACCT_ID = 1;
The values will be NULL if there is no match.
Also, if your columns are really dates, then you should use proper date formats:
T.DATE IN ('2000-01-31', '2020-01-31')
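To see the difference between filtering in WHERE and filtering in ON side by side, here is a small sketch that asks only for the date that has no transaction row. It reuses the inline VALUES tables from the earlier answer (so it assumes a dialect that accepts that form, such as Db2), and the FILTER_PLACE label column is made up for illustration.

```sql
WITH
ACCT_RD (ACCT_ID, ACCT_NAME) AS
(
VALUES (1, 'abc'), (2, 'xyz')
)
, ACCT_TD (ACCT_ID, DATE, VALUE) AS
(
VALUES (1, '01-31-2020', 4000.33), (1, '01-31-2021', 2000.11), (2, '01-31-2020', 5666.23)
)
-- Filter in WHERE: it runs after the join, and no joined row for account 1
-- has this date, so every row is removed.
SELECT 'WHERE' AS FILTER_PLACE, R.ACCT_ID, R.ACCT_NAME, T.DATE, T.VALUE
FROM ACCT_RD R
LEFT JOIN ACCT_TD T ON R.ACCT_ID = T.ACCT_ID
WHERE R.ACCT_ID = 1
AND T.DATE IN ('01-31-2000')
UNION ALL
-- Filter in ON: it is part of the match test, so the ACCT_RD row survives
-- with NULLs when nothing in ACCT_TD matches.
SELECT 'ON', R.ACCT_ID, R.ACCT_NAME, T.DATE, T.VALUE
FROM ACCT_RD R
LEFT JOIN ACCT_TD T ON R.ACCT_ID = T.ACCT_ID
AND T.DATE IN ('01-31-2000')
WHERE R.ACCT_ID = 1;
-- Only one row comes back: ('ON', 1, 'abc', NULL, NULL)
```

The condition on R.ACCT_ID can stay in WHERE because it refers to the preserved (left) table.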

Joining next Sequential Row

I am planning an SQL statement right now and need someone to look over my thoughts.
This is my Table:
id stat period
--- ------- --------
1 10 1/1/2008
2 25 2/1/2008
3 5 3/1/2008
4 15 4/1/2008
5 30 5/1/2008
6 9 6/1/2008
7 22 7/1/2008
8 29 8/1/2008
Create Table
CREATE TABLE tbstats
(
id INT IDENTITY(1, 1) PRIMARY KEY,
stat INT NOT NULL,
period DATETIME NOT NULL
)
go
INSERT INTO tbstats
(stat,period)
SELECT 10,CONVERT(DATETIME, '20080101')
UNION ALL
SELECT 25,CONVERT(DATETIME, '20080102')
UNION ALL
SELECT 5,CONVERT(DATETIME, '20080103')
UNION ALL
SELECT 15,CONVERT(DATETIME, '20080104')
UNION ALL
SELECT 30,CONVERT(DATETIME, '20080105')
UNION ALL
SELECT 9,CONVERT(DATETIME, '20080106')
UNION ALL
SELECT 22,CONVERT(DATETIME, '20080107')
UNION ALL
SELECT 29,CONVERT(DATETIME, '20080108')
go
I want to calculate the difference between each statistic and the next, and then calculate the mean value of the 'gaps.'
Thoughts:
I need to join each record with its subsequent row. I can do that using the ever flexible joining syntax, thanks to the fact that I know the id field is an integer sequence with no gaps.
By aliasing the table I could incorporate it into the SQL query twice, then join them together in a staggered fashion by adding 1 to the id of the first aliased table. The first record in the table has an id of 1. 1 + 1 = 2 so it should join on the row with id of 2 in the second aliased table. And so on.
Now I would simply subtract one from the other.
Then I would use the ABS function to ensure that I always get positive integers as a result of the subtraction regardless of which side of the expression is the higher figure.
Is there an easier way to achieve what I want?
The lead analytic function should do the trick:
SELECT period, stat, stat - LEAD(stat) OVER (ORDER BY period) AS gap
FROM tbstats
The average of the (signed) gaps can be computed as the difference between the last value and the first value, divided by one less than the number of elements, because the intermediate values cancel out:
select 1.0 * sum(case when seqnum = num then stat else - stat end) / (max(num) - 1) -- 1.0 * avoids integer division
from (select period, stat, row_number() over (order by period) as seqnum,
count(*) over () as num
from tbstats
) t
where seqnum = num or seqnum = 1;
Of course, you can also do the calculation using lead(), but this will also work in SQL Server 2005 and 2008.
You can also achieve this with a join:
SELECT t1.period,
t1.stat,
t1.stat - t2.stat gap
FROM #tbstats t1
LEFT JOIN #tbstats t2
ON t1.id + 1 = t2.id
To calculate the difference between each statistic and the next, LEAD() and LAG() may be the simplest option. You provide an ORDER BY, and LEAD(something) returns the next something and LAG(something) returns the previous something in the given order.
select
x.id thisStatId,
LAG(x.id) OVER (ORDER BY x.id) lastStatId,
x.stat thisStatValue,
LAG(x.stat) OVER (ORDER BY x.id) lastStatValue,
x.stat - LAG(x.stat) OVER (ORDER BY x.id) diff
from tbStats x
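Putting the pieces together for the original goal (the mean of the gaps, made positive with ABS), a minimal sketch could look like the following. It assumes SQL Server 2012 or later for LEAD; the inline sample_stats table is a stand-in for tbstats with the values from the question.

```sql
-- Values from the question's INSERT statements, inlined for illustration.
WITH sample_stats (stat, period) AS (
    SELECT stat, CAST(period AS datetime)
    FROM (VALUES (10, '20080101'), (25, '20080102'), (5, '20080103'), (15, '20080104'),
                 (30, '20080105'), (9, '20080106'), (22, '20080107'), (29, '20080108')
         ) AS v (stat, period)
),
gaps AS (
    -- Next value minus current value; the last row has no successor and yields NULL.
    SELECT LEAD(stat) OVER (ORDER BY period) - stat AS gap
    FROM sample_stats
)
SELECT AVG(ABS(gap) * 1.0) AS mean_abs_gap,     -- * 1.0 avoids integer averaging
       AVG(gap * 1.0)      AS mean_signed_gap
FROM gaps
WHERE gap IS NOT NULL;
-- mean_abs_gap is 101/7 (about 14.43); mean_signed_gap is 19/7 (about 2.71)
```

The signed mean is what the first-and-last-value shortcut computes; the two figures differ whenever the series goes both up and down.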

How to write Oracle query to find a total length of possible overlapping from-to dates

I'm struggling to find the query for the following task
I have the following data and want to find the total network day for each unique ID
ID From To NetworkDay
1 03-Sep-12 07-Sep-12 5
1 03-Sep-12 04-Sep-12 2
1 05-Sep-12 06-Sep-12 2
1 06-Sep-12 12-Sep-12 5
1 31-Aug-12 04-Sep-12 3
2 04-Sep-12 06-Sep-12 3
2 11-Sep-12 13-Sep-12 3
2 05-Sep-12 08-Sep-12 3
Problem is the date range can be overlapping and I can't come up with SQL that will give me the following results
ID From To NetworkDay
1 31-Aug-12 12-Sep-12 9
2 04-Sep-12 08-Sep-12 4
2 11-Sep-12 13-Sep-12 3
and then
ID Total Network Day
1 9
2 7
If the network day calculation is not possible, just getting to the second table would be sufficient.
I hope my question is clear.
We can use Oracle Analytics, namely the "OVER ... PARTITION BY" clause, to do this. The PARTITION BY clause is kind of like a GROUP BY but without the aggregation part. That means we can group rows together (i.e. partition them) and then perform an operation on them as separate groups. As we operate on each row we can then access the columns of the previous rows. This is the feature PARTITION BY gives us. (PARTITION BY is not related to partitioning of a table for performance.)
So then how do we output the non-overlapping dates? We first order the query based on the (ID, DFROM) fields, then we use the ID field to make our partitions (row groups). We then test the previous row's TO value and the current row's FROM value for overlap using an expression like (in pseudo code):
max(previous.DTO, current.DFROM) as DFROM
This basic expression will return the original DFROM value if it doesn't overlap, but will return the previous TO value if there is overlap. Since our rows are ordered, we only need to remember the latest TO value seen so far. In cases where a previous row completely overlaps the current row, we want the row to have a 'zero' date range. So we do the same thing for the DTO field to get:
max(previous.DTO, current.DFROM) as DFROM, max(previous.DTO, current.DTO) as DTO
Once we have generated the new results set with the adjusted DFROM and DTO values, we can aggregate them up and count the range intervals of DFROM and DTO.
Be aware that most date calculations in databases are not inclusive, whereas your data is. So something like DATEDIFF(dto, dfrom) will not include the day dto actually refers to, so we will want to adjust dto up a day first.
I don't have access to an Oracle server anymore, but I know this is possible with Oracle Analytics. The query should go something like this:
(Please update my post if you get this to work.)
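SELECT id,
GREATEST(dfrom, NVL(prev_dto, dfrom)) AS dfrom,
GREATEST(dto, NVL(prev_dto, dto)) AS dto
FROM (
SELECT id, dfrom, dto + 1 AS dto, -- adjust dto so that it becomes non-inclusive
MAX(dto + 1) OVER (PARTITION BY id ORDER BY dfrom, dto
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_dto -- latest end date among the earlier rows
FROM my_sample
) sample;
The secret here is the MAX(dto + 1) OVER (PARTITION BY id ORDER BY dfrom, dto ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) expression, which returns the latest end date among the rows before the current row. Oracle's MAX aggregate takes a single argument, so GREATEST does the row-level comparison, and NVL covers the first row of each id, which has no previous rows.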
So this query should output new dfrom/dto values which don't overlap. It's then a simple matter of wrapping this in a sub-query, computing (dto - dfrom) per row and summing the totals.
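As a sketch of that last step, the statement below inlines the question's sample rows, clips each range against the latest end date seen so far, and sums the remaining days per id. It is written for Oracle 11gR2 or later and has not been run against a live server here; the names my_sample, prepared, adjusted, dto_excl and prev_dto are assumptions made for the example. It counts calendar days only and does not attempt the weekend/holiday (network day) logic.

```sql
-- Sample rows from the question, inlined so the statement runs on its own.
WITH my_sample (id, dfrom, dto) AS (
    SELECT 1, DATE '2012-09-03', DATE '2012-09-07' FROM dual UNION ALL
    SELECT 1, DATE '2012-09-03', DATE '2012-09-04' FROM dual UNION ALL
    SELECT 1, DATE '2012-09-05', DATE '2012-09-06' FROM dual UNION ALL
    SELECT 1, DATE '2012-09-06', DATE '2012-09-12' FROM dual UNION ALL
    SELECT 1, DATE '2012-08-31', DATE '2012-09-04' FROM dual UNION ALL
    SELECT 2, DATE '2012-09-04', DATE '2012-09-06' FROM dual UNION ALL
    SELECT 2, DATE '2012-09-11', DATE '2012-09-13' FROM dual UNION ALL
    SELECT 2, DATE '2012-09-05', DATE '2012-09-08' FROM dual
),
prepared AS (
    -- dto_excl is the exclusive end date; prev_dto is the latest exclusive end
    -- date among the earlier rows of the same id (NULL for the first row).
    SELECT id, dfrom, dto + 1 AS dto_excl,
           MAX(dto + 1) OVER (PARTITION BY id ORDER BY dfrom, dto
                              ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_dto
    FROM my_sample
),
adjusted AS (
    -- Clip each range so it starts no earlier than what is already covered.
    SELECT id,
           GREATEST(dfrom, NVL(prev_dto, dfrom))       AS no_dfrom,
           GREATEST(dto_excl, NVL(prev_dto, dto_excl)) AS no_dto
    FROM prepared
)
SELECT id, SUM(no_dto - no_dfrom) AS total_days
FROM adjusted
GROUP BY id
ORDER BY id;
-- id 1 -> 13, id 2 -> 8 (calendar days)
```

Fully covered ranges collapse to zero-length intervals, so they add nothing to the sum.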
Using MySQL
I did have access to a MySQL server, so I did get it working there. MySQL doesn't have result partitioning (analytics) like Oracle, so we have to use user-defined variables. This means we use @var:=xxx type expressions to remember the last date value and adjust dfrom/dto accordingly. It is the same algorithm, just with longer and more complex syntax. We also have to forget the last date value any time the ID field changes!
So here is the sample table (same values you have):
create table sample(id int, dfrom date, dto date, networkDay int);
insert into sample values
(1,'2012-09-03','2012-09-07',5),
(1,'2012-09-03','2012-09-04',2),
(1,'2012-09-05','2012-09-06',2),
(1,'2012-09-06','2012-09-12',5),
(1,'2012-08-31','2012-09-04',3),
(2,'2012-09-04','2012-09-06',3),
(2,'2012-09-11','2012-09-13',3),
(2,'2012-09-05','2012-09-08',3);
On to the query, we output the un-grouped result set like above:
The variable @ldt is the "last date", and the variable @lid is the "last id". Any time @lid changes, we reset @ldt to null. FYI: in MySQL the := operator is where the assignment happens; an = operator is just an equality test.
This is a 3-level query, but it could be reduced to 2. I went with an extra outer query to keep things more readable. The innermost query is simple: it adjusts the dto column to be non-inclusive and does the proper row ordering. The middle query adjusts the dfrom/dto values to make them non-overlapping. The outer query simply drops the unused fields and calculates the interval range.
set @ldt=null, @lid=null;
select id, no_dfrom as dfrom, no_dto as dto, datediff(no_dto, no_dfrom) as days from (
select if(@lid=id,@ldt,@ldt:=null) as last, dfrom, dto, if(@ldt>=dfrom,@ldt,dfrom) as no_dfrom, if(@ldt>=dto,@ldt,dto) as no_dto, @ldt:=if(@ldt>=dto,@ldt,dto), @lid:=id as id,
datediff(dto, dfrom) as overlapped_days
from (select id, dfrom, dto + INTERVAL 1 DAY as dto from sample order by id, dfrom) as sample
) as nonoverlapped
order by id, dfrom;
The above query gives the results (notice dfrom/dto are non-overlapping here):
+------+------------+------------+------+
| id | dfrom | dto | days |
+------+------------+------------+------+
| 1 | 2012-08-31 | 2012-09-05 | 5 |
| 1 | 2012-09-05 | 2012-09-08 | 3 |
| 1 | 2012-09-08 | 2012-09-08 | 0 |
| 1 | 2012-09-08 | 2012-09-08 | 0 |
| 1 | 2012-09-08 | 2012-09-13 | 5 |
| 2 | 2012-09-04 | 2012-09-07 | 3 |
| 2 | 2012-09-07 | 2012-09-09 | 2 |
| 2 | 2012-09-11 | 2012-09-14 | 3 |
+------+------------+------------+------+
How about constructing an SQL query which merges intervals by removing holes and considering only maximal intervals? It goes like this (not tested):
SELECT DISTINCT F.ID, F.From, L.To
FROM Temp AS F, Temp AS L
WHERE F.From < L.To AND F.ID = L.ID
AND NOT EXISTS (SELECT *
FROM Temp AS T
WHERE T.ID = F.ID
AND F.From < T.From AND T.From < L.To
AND NOT EXISTS ( SELECT *
FROM Temp AS T1
WHERE T1.ID = F.ID
AND T1.From < T.From
AND T.From <= T1.To)
)
AND NOT EXISTS (SELECT *
FROM Temp AS T2
WHERE T2.ID = F.ID
AND (
(T2.From < F.From AND F.From <= T2.To)
OR (T2.From < L.To AND L.To < T2.To)
)
)
with t_data as (
select 1 as id,
to_date('03-sep-12','dd-mon-yy') as start_date,
to_date('07-sep-12','dd-mon-yy') as end_date from dual
union all
select 1,
to_date('03-sep-12','dd-mon-yy'),
to_date('04-sep-12','dd-mon-yy') from dual
union all
select 1,
to_date('05-sep-12','dd-mon-yy'),
to_date('06-sep-12','dd-mon-yy') from dual
union all
select 1,
to_date('06-sep-12','dd-mon-yy'),
to_date('12-sep-12','dd-mon-yy') from dual
union all
select 1,
to_date('31-aug-12','dd-mon-yy'),
to_date('04-sep-12','dd-mon-yy') from dual
union all
select 2,
to_date('04-sep-12','dd-mon-yy'),
to_date('06-sep-12','dd-mon-yy') from dual
union all
select 2,
to_date('11-sep-12','dd-mon-yy'),
to_date('13-sep-12','dd-mon-yy') from dual
union all
select 2,
to_date('05-sep-12','dd-mon-yy'),
to_date('08-sep-12','dd-mon-yy') from dual
),
t_holidays as (
select to_date('01-jan-12','dd-mon-yy') as holiday
from dual
),
t_data_rn as (
select rownum as rn, t_data.* from t_data
),
t_model as (
select distinct id,
start_date
from t_data_rn
model
partition by (rn, id)
dimension by (0 as i)
measures(start_date, end_date)
rules
( start_date[for i
from 1
to end_date[0]-start_date[0]
increment 1] = start_date[0] + cv(i),
end_date[any] = start_date[cv()] + 1
)
order by 1,2
),
t_network_days as (
select t_model.*,
case when
mod(to_char(start_date, 'j'), 7) + 1 in (6, 7)
or t_holidays.holiday is not null
then 0 else 1
end as working_day
from t_model
left outer join t_holidays
on t_holidays.holiday = t_model.start_date
)
select id,
sum(working_day) as network_days
from t_network_days
group by id;
t_data - your initial data
t_holidays - contains list of holidays
t_data_rn - just adds unique key (rownum) to each row of t_data
t_model - expands t_data date ranges into a flat list of dates
t_network_days - marks each date from t_model as working day or weekend based on day of week (Sat and Sun) and holidays list
final query - calculates number of network day per each group.

SQL gaps in dates

I am trying to find gaps in a table based on a state code. The tables look like this:
StateTable:
StateID (PK) | Code
--------------------
1 | AK
2 | AL
3 | AR
StateModel Table:
StateModelID | StateID | EffectiveDate | ExpirationDate
-------------------------------------------------------------------------
1 | 1 | 2012-06-28 00:00:00.000| 2012-08-02 23:59:59.000
2 | 1 | 2012-08-03 00:00:00.000| 2050-12-31 23:59:59.000
3 | 1 | 2055-01-01 00:00:00.000| 2075-12-31 23:59:59.000
The query I am using is the following:
Declare @gapMessage varchar(250)
SET @gapMessage = ''
select
@gapMessage = @gapMessage +
(Select StateTable.Code FROM StateTable where t1.StateID = StateTable.StateID)
+ ' Row ' +CAST(t1.StateModelID as varchar(6))+' has a gap with '+
CAST(t2.StateModelID as varchar(6))+ CHAR(10)
from StateModel t1
inner join StateModel t2
on
t1.StateID = t2.StateID
and DATEADD(ss, 1,t1.ExpirationDate) < t2.EffectiveDate
and t1.EffectiveDate < t2.EffectiveDate
if(@gapMessage != '')
begin
Print 'States with a gap problem'
PRINT @gapMessage
end
else
begin
PRINT 'No States with a gap problem'
end
But with the above table example I get the following output:
States with a gap problem
AK Row 1 has a gap with 3
AK Row 2 has a gap with 3
Is there any way to restructure my query so that the gap between 1 and 3 does not display, because there is no gap between 1 and 2?
I am using MS sql server 2008
Thanks
WITH
sequenced AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY StateID ORDER BY EffectiveDate) AS SequenceID,
*
FROM
StateModel
)
SELECT
*
FROM
sequenced AS a
INNER JOIN
sequenced AS b
ON a.StateID = b.StateID
AND a.SequenceID = b.SequenceID - 1
WHERE
a.ExpirationDate < DATEADD(second, -1, b.EffectiveDate)
To make this as efficient as possible, also add an index on (StateID, EffectiveDate)
I wanted to just give credit to MatBailie, but don't have the points to do it yet, so I thought I would help out anyone else looking for a similar solution that may want to take it a step further like I needed to. I have changed my application of his code (which involves member enrollment) to the same language as the example here.
In my case, I needed these things:
I have two similar tables that I need to develop into one total table. In this example, let's make the tables like this: SomeStates + OtherStates = UpdatedTable. These are UNIONED in the AS clause.
I didn't want to remove any rows due to gaps, but I wanted to flag them on the StateID level. This is added as an additional column 'StateID_GapFlag'.
I also wanted to add a column to hold the oldest or MIN(EffectiveDate). This would be used in later calculations of SUM(period) to get a total duration, excluding gaps. This is the column 'MIN_EffectiveDate'.
;WITH sequenced
( SequenceID
,StateID
,EffectiveDate
,ExpirationDate)
AS
(select
ROW_NUMBER() OVER (PARTITION BY StateID ORDER by EffectiveDate) as SequenceID,
* from (select StateID, EffectiveDate, ExpirationDate from SomeStates
UNION ALL
(select StateID, EffectiveDate, ExpirationDate from OtherStates)
) StateModel
where
EffectiveDate > 'filter' -- placeholder: replace with your own cutoff date
)
Select DISTINCT
IJ1.[MIN_EffectiveDate]
,coalesce(LJ2.StateID_GapFlag,'') as [StateID_GapFlag]
,EffectiveDate
,ExpirationDate
into UpdatedTable
from sequenced seq
inner join
(select StateID, min(EffectiveDate) as 'MIN_EffectiveDate'
from sequenced
group by StateID
) IJ1
on seq.StateID = IJ1.StateID
left join
(select a.StateID, 'GAP' as 'StateID_GapFlag'
from sequenced a
inner join
sequenced b
on a.StateID = b.StateID
and a.SequenceID = (b.sequenceID - 1)
where a.ExpirationDate < DATEADD(day, -1, b.EffectiveDate)
) LJ2
on seq.StateID = LJ2.StateID
You could use ROW_NUMBER to provide an ordering of the StateModel rows for each state, then check that the difference in seconds between consecutive rows doesn't exceed 1. Something like:
;WITH Models (StateModelID, StateID, Effective, Expiration, RowOrder) AS (
SELECT StateModelID, StateID, EffectiveDate, ExpirationDate,
ROW_NUMBER() OVER (PARTITION BY StateID ORDER BY EffectiveDate)
FROM StateModel
)
SELECT F.StateModelId, S.StateModelId
FROM Models F
CROSS APPLY (
SELECT M.StateModelId
FROM Models M
WHERE M.RowOrder = F.RowOrder + 1
AND M.StateId = F.StateId
AND DATEDIFF(SECOND, F.Expiration, M.Effective) > 1
) S
This will get you the state model IDs of the rows with gaps, which you can format how you wish.
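For a quick check against the three sample rows from the question, the following sketch applies the same consecutive-row idea. It assumes SQL Server 2008 or later (for the VALUES table constructor); the inline StateModel table and the FirstID/NextID aliases are made up for the example. It compares with DATEADD rather than DATEDIFF so that very long gaps cannot overflow the integer result.

```sql
-- The three StateModel rows from the question, inlined for illustration.
WITH StateModel (StateModelID, StateID, EffectiveDate, ExpirationDate) AS (
    SELECT StateModelID, StateID, CAST(Eff AS datetime), CAST(Expi AS datetime)
    FROM (VALUES
        (1, 1, '2012-06-28T00:00:00', '2012-08-02T23:59:59'),
        (2, 1, '2012-08-03T00:00:00', '2050-12-31T23:59:59'),
        (3, 1, '2055-01-01T00:00:00', '2075-12-31T23:59:59')
    ) AS v (StateModelID, StateID, Eff, Expi)
),
Models AS (
    -- Number the rows of each state in date order.
    SELECT StateModelID, StateID, EffectiveDate, ExpirationDate,
           ROW_NUMBER() OVER (PARTITION BY StateID ORDER BY EffectiveDate) AS RowOrder
    FROM StateModel
)
SELECT F.StateModelID AS FirstID, S.StateModelID AS NextID
FROM Models AS F
JOIN Models AS S
  ON S.StateID = F.StateID
 AND S.RowOrder = F.RowOrder + 1          -- only compare each row with its successor
WHERE DATEADD(second, 1, F.ExpirationDate) < S.EffectiveDate;
-- returns a single row: FirstID = 2, NextID = 3
```

Rows 1 and 3 are never compared with each other, which is what removes the spurious "Row 1 has a gap with 3" message.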

How can I SELECT distinct data based on a date field?

I have a table that stores a log of changes to objects in another table. Here are my table contents:
ObjID Color Date User
------- ------- ------------------------ --------
1 Red 2010-01-01 12:22:00.000 Joe
1 Blue 2010-01-02 15:22:00.000 Jill
1 Green 2010-01-03 16:22:00.000 Joe
1 White 2010-01-10 09:22:00.000 Mike
2 Red 2010-01-09 10:22:00.000 Mike
2 Blue 2010-01-12 09:22:00.000 Jill
2 Orange 2010-01-12 15:22:00.000 Joe
I want to select the most recent date for each Object, as well as the Color and User on the date of that record.
Basically, I want this result set:
ObjID Color Date User
------- ------- ------------------------ --------
1 White 2010-01-10 09:22:00.000 Mike
2 Orange 2010-01-12 15:22:00.000 Joe
I'm having trouble wrapping my head around the SQL query I need to write to get this data...
I am retrieving data via ODBC from an iSeries DB2 database (AS/400).
Hey there, I think you want the following (where ColorTable is your table name):
SELECT Color.*
FROM ColorTable as Color
INNER JOIN
(
SELECT ObjID, MAX(Date) as Date
FROM ColorTable
GROUP BY ObjID
) as MaxDateByColor
ON Color.ObjID = MaxDateByColor.ObjID
AND Color.Date = MaxDateByColor.Date
Assuming at least SQL Server 2005
DECLARE @T TABLE (ObjID INT,Color VARCHAR(10),[Date] DATETIME,[User] VARCHAR(50))
INSERT INTO @T
SELECT 1,'Red','2010-01-01 12:22:00.000','Joe' UNION ALL
SELECT 1,'Blue','2010-01-02 15:22:00.000','Jill' UNION ALL
SELECT 1,'Green','2010-01-03 16:22:00.000','Joe' UNION ALL
SELECT 1,'White','2010-01-10 09:22:00.000','Mike' UNION ALL
SELECT 2,'Red','2010-01-09 10:22:00.000','Mike' UNION ALL
SELECT 2,'Blue','2010-01-12 09:22:00.000','Jill' UNION ALL
SELECT 2,'Orange','2010-01-12 15:22:00.000','Joe'
;WITH T AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ObjID ORDER BY [Date] DESC) AS RN
FROM @T
)
SELECT ObjID,
Color,
[Date],
[User]
FROM T
WHERE RN=1
Or a SQL Server 2000 method from the article linked to in the comments:
SELECT ObjID,
CAST(SUBSTRING(string, 24, 10) AS VARCHAR(10)) AS Color, -- characters 24-33
CAST(SUBSTRING(string, 1, 23) AS DATETIME ) AS [Date], -- characters 1-23
CAST(SUBSTRING(string, 34, 50) AS VARCHAR(50)) AS [User] -- characters 34-83
FROM
(
SELECT ObjID,
MAX((CONVERT(CHAR(23), [Date], 126)
+ CAST(Color AS CHAR(10))
+ CAST([User] AS CHAR(50))) COLLATE Latin1_General_BIN) AS string
FROM @T
GROUP BY ObjID) T;
If you have an Objects table and your ObjectHistory table has an index on ObjID and date, then this could perform better than other queries given so far:
SELECT
X.*
FROM
Objects O
CROSS APPLY (
SELECT TOP 1 *
FROM ObjectHistory H
WHERE H.ObjID = O.ObjID
ORDER BY H.[Date] DESC
) X
The performance improvement may only come if you're pulling columns from the Objects table, too, but it's worth a shot.
If you want all Objects regardless of whether they have a history entry, switch to OUTER APPLY (and of course use O.ObjID instead of H.ObjID).
The neat thing about this query is that
It solves for situations where the Date value can have duplicates
It can support an arbitrary number of items per group (say, the top 5 instead of the top 1)
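To illustrate the "arbitrary number of items per group" point, here is a small sketch that returns the two most recent log rows per object from the question's sample data. It assumes SQL Server 2008 or later; the inline ObjectHistory table, the Objects list derived from it, and the DateText/UserName column names are stand-ins invented for the example.

```sql
-- The log rows from the question, inlined for illustration.
WITH ObjectHistory (ObjID, Color, [Date], [User]) AS (
    SELECT ObjID, Color, CAST(DateText AS datetime), UserName
    FROM (VALUES
        (1, 'Red',    '2010-01-01T12:22:00', 'Joe'),
        (1, 'Blue',   '2010-01-02T15:22:00', 'Jill'),
        (1, 'Green',  '2010-01-03T16:22:00', 'Joe'),
        (1, 'White',  '2010-01-10T09:22:00', 'Mike'),
        (2, 'Red',    '2010-01-09T10:22:00', 'Mike'),
        (2, 'Blue',   '2010-01-12T09:22:00', 'Jill'),
        (2, 'Orange', '2010-01-12T15:22:00', 'Joe')
    ) AS v (ObjID, Color, DateText, UserName)
),
Objects AS (
    -- Stand-in for a real Objects table: one row per object.
    SELECT DISTINCT ObjID FROM ObjectHistory
)
SELECT X.*
FROM Objects AS O
CROSS APPLY (
    -- Runs once per object; change TOP (2) to TOP (1) for only the latest row.
    SELECT TOP (2) H.*
    FROM ObjectHistory AS H
    WHERE H.ObjID = O.ObjID
    ORDER BY H.[Date] DESC
) AS X
ORDER BY X.ObjID, X.[Date] DESC;
-- ObjID 1: White (2010-01-10), Green (2010-01-03)
-- ObjID 2: Orange (2010-01-12 15:22), Blue (2010-01-12 09:22)
```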
See these two related questions:
SQL/mysql - Select distinct/UNIQUE but return all columns?
And:
How to efficiently determine changes between rows using SQL
SELECT t1.* FROM Table_name as t1
INNER JOIN (
SELECT MAX(Date) as MaxDate, ObjID FROM Table_name
GROUP BY ObjID
) as t2
ON t1.ObjID = t2.ObjID AND t1.Date = t2.MaxDate
You can find out, per object, its most recent change like this:
select objectid, max(changedate) as LatestChange
from LOG
group by objectid
You can then get the color and user columns by linking the set returned above, instantiated as an inline view that has been given an alias, to the same table again:
select color, user, FOO.objectid, FOO.LatestChange
from LOG
inner join
(
select objectid, max(changedate) as LatestChange
from LOG
group by objectid
) as FOO
on LOG.objectid = FOO.objectid and LOG.changedate = FOO.LatestChange
Like Martin Smith's answer above, simply use ROW_NUMBER() over a partition and pick the most recent row for each object, like:
SELECT ObjID, Color, [Date], [User]
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ObjID ORDER BY [Date] DESC) AS RN
FROM [tablename]
) AS ranked
WHERE
RN = 1