Mathematical comparison between rows - sql

I have a table
+----+-----------+---------+
| ID | StartTime | EndTime |
+----+-----------+---------+
| 1 | 2:00pm | 3:00pm |
| 2 | 4:00pm | 5:00pm |
| 3 | 7:00pm | 9:00pm |
+----+-----------+---------+
I need to get the difference between the end time of one row and the start time of the NEXT row, i.e. the end time of row 1 compared to the start time of row 2, or the end time of row 2 compared to the start time of row 3.
Ideally I'd like a result that looks similar to
+----+----------------+
| ID | TimeDifference |
+----+----------------+
| 2 | 1.0 hours |
| 3 | 2.0 hours |
+----+----------------+
I have no clue whatsoever on how to do something like this. I'm thinking that I may need 2 temp tables, one to hold start times another for end times so that I can more easily do the comparisons, but honestly that's just a shot in the dark at the moment.
FYI, I'm on SQL Server 2008, in case that makes a difference for some of the commands.

NOTE: The question was not tagged SQL Server 2008 when this answer was written.
You can use lag():
select t.*,
datediff(minute, lag(endtime) over (order by id), starttime) / 60.0 as hours_diff
from t;
This does not filter out any rows. The description of the problem ("next row") and the sample data (which is based on "previous row") are inconsistent.

Since you're on the 2008 version, you can't use the LEAD() or LAG() window functions (they were added in SQL Server 2012), but you can use subqueries to mimic them:
SELECT Id,
       DATEDIFF(minute,
           (
               -- End time of the closest preceding row
               SELECT TOP 1 EndTime
               FROM [Table] t1
               WHERE t1.Id < t0.Id
               ORDER BY t1.Id DESC
           ), StartTime) / 60.0 As TimeDifference
FROM [Table] t0
WHERE EXISTS
(
    -- Skip the first row, which has no predecessor
    SELECT 1
    FROM [Table] t2
    WHERE t2.Id < t0.Id
)
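As a rough, runnable illustration of the same "previous row via correlated subquery" idea, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table name `slots`, the column names, and the 24-hour time strings are assumptions made for the example; SQLite has no TOP or DATEDIFF, so `ORDER BY ... LIMIT 1` and `strftime('%s', ...)` stand in for them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the question's sample data (times in 24-hour form).
conn.execute("CREATE TABLE slots (id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT)")
conn.executemany(
    "INSERT INTO slots VALUES (?, ?, ?)",
    [(1, "14:00", "15:00"), (2, "16:00", "17:00"), (3, "19:00", "21:00")],
)

query = """
SELECT cur.id,
       (strftime('%s', cur.start_time) - strftime('%s', (
            -- end time of the closest preceding row
            SELECT prev.end_time
            FROM slots AS prev
            WHERE prev.id < cur.id
            ORDER BY prev.id DESC
            LIMIT 1))) / 3600.0 AS hours_diff
FROM slots AS cur
WHERE EXISTS (SELECT 1 FROM slots AS prev WHERE prev.id < cur.id)
ORDER BY cur.id
"""
gaps = conn.execute(query).fetchall()
print(gaps)  # [(2, 1.0), (3, 2.0)]
```

The EXISTS filter drops the first row, which has no predecessor, so the output matches the two-row result the question asks for.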

You can try this:
declare @t as table (ID int, StartTime time, EndTime time)
INSERT @t SELECT 1, '2:00pm', '3:00pm'
INSERT @t SELECT 2, '4:00pm', '5:00pm'
INSERT @t SELECT 3, '7:00pm', '9:00pm'
---- For sequential IDs
select
    a.ID
    ,a.StartTime
    ,a.EndTime
    ,datediff(minute, (SELECT EndTime FROM @t b where b.ID = a.ID - 1), a.StartTime) / 60.0 as hours_diff
from @t a
---- For non-sequential IDs
;WITH cte_times as (
    SELECT
        ROW_NUMBER() OVER (ORDER BY Id) as new_ID
        ,ID
        ,StartTime
        ,EndTime
    FROM
        @t
)
select
    a.ID
    ,a.StartTime
    ,a.EndTime
    ,datediff(minute, (SELECT EndTime FROM cte_times b where b.new_ID = a.new_ID - 1), a.StartTime) / 60.0 as hours_diff
from cte_times a

Related

SQL select a row X times and insert into new

I am trying to migrate a bunch of data from an old database to a new one, the old one used to just have the number of alarms that occurred on a single row. The new database inserts a new record for each alarm that occurs. Here is a basic version of how it might look. I want to select each row from Table 1 and insert the number of alarm values as new rows into Table 2.
Table 1:
| Alarm ID | Alarm Value |
|--------------|----------------|
| 1 | 3 |
| 2 | 2 |
Should go into the alarm table as the below values.
Table 2:
| Alarm New ID | Value |
|--------------|----------|
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
I want to create a select insert script that will do this, so the select statement will bring back the number of rows that appear in the "Value" column.
A recursive CTE can be convenient for this:
with cte as (
select id, alarm, 1 as n
from t
union all
select id, alarm, n + 1
from cte
where n < alarm
)
select row_number() over (order by id) as alarm_id, id as value
from cte
order by 1
option (maxrecursion 0);
Note: If your values do not exceed 100, then you can remove OPTION (MAXRECURSION 0).
Replicate values out with a CTE.
DECLARE @T TABLE(AlarmID INT, Value INT)
INSERT @T VALUES
(1,3),
(2,2)
;WITH ReplicateAmount AS
(
    SELECT AlarmID, Value FROM @T
    UNION ALL
    SELECT R.AlarmID, Value=(R.Value - 1)
    FROM ReplicateAmount R
    INNER JOIN @T T ON R.AlarmID = T.AlarmID
    WHERE R.Value > 1
)
SELECT
AlarmID = ROW_NUMBER() OVER( ORDER BY AlarmID),
Value = AlarmID --??
FROM
ReplicateAmount
ORDER BY
AlarmID
This answers your question as asked. I think the query below would be more useful, but you did not include any usage context.
SELECT
AlarmID,
Value
FROM
ReplicateAmount
ORDER BY
AlarmID
Rather than using an rCTE, which is recursive (as the name suggests) and by default fails once it exceeds 100 levels of recursion, you can use a Tally table, which tends to be far faster as well:
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3)
SELECT ROW_NUMBER() OVER (ORDER BY V.AlarmID,T.I) AS AlarmNewID,
V.AlarmID
FROM (VALUES(1,3),(2,2))V(AlarmID,AlarmValue)
JOIN Tally T ON V.AlarmValue >= T.I;
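To see the tally-table join in action outside SQL Server, here is a minimal sketch using Python's built-in sqlite3 module. The table names `old_alarms` and `tally`, the column names, and the 1000-row cap are assumptions for the example; the new IDs are assigned in Python with `enumerate` instead of ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table: one row per alarm with a repeat count.
conn.execute("CREATE TABLE old_alarms (alarm_id INTEGER, alarm_value INTEGER)")
conn.executemany("INSERT INTO old_alarms VALUES (?, ?)", [(1, 3), (2, 2)])

# Numbers (tally) table holding 1..1000; assumes no alarm_value exceeds 1000.
conn.execute("CREATE TABLE tally (i INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tally VALUES (?)", [(i,) for i in range(1, 1001)])

# Each source row joins to as many tally rows as its alarm_value.
query = """
SELECT a.alarm_id
FROM old_alarms AS a
JOIN tally AS t ON t.i <= a.alarm_value
ORDER BY a.alarm_id, t.i
"""
source_ids = [alarm_id for (alarm_id,) in conn.execute(query)]

# Pair every replicated row with a fresh sequential ID.
new_rows = list(enumerate(source_ids, start=1))
print(new_rows)  # [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2)]
```

No recursion is involved, so there is no recursion limit to worry about; the only limit is the size of the numbers table.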

Fill in missing timestamp values in SQL

SQL newbie here, looking for a bit of help in writing a query.
Some sample data
Time Value
9:00 1.2
9:01 2.3
9:05 2.4
9:06 2.5
I need to fill in those missing times with zero - so the query would return
Time Value
9:00 1.2
9:01 2.3
9:02 0
9:03 0
9:04 0
9:05 2.4
9:06 2.5
Is this possible in T-SQL?
Thanks for any help / advice ...
One method uses a recursive CTE to generate the list of times and then a left join to bring in the values:
with cte as (
select min(s.time) as time, max(s.time) as maxt
from sample s
union all
select dateadd(minute, 1, cte.time), cte.maxt
from cte
where cte.time < cte.maxt
)
select cte.time, coalesce(s.value, 0)
from cte left join
sample s
on cte.time = s.time
order by cte.time;
Note that if you have more than 100 minutes, you will need option (maxrecursion 0) at the end of the query.
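The same generate-then-left-join pattern can be tried quickly with Python's built-in sqlite3 module. This is only a sketch under assumptions: the table name `sample`, the column `t`, and the zero-padded 'HH:MM:SS' text format are choices made for the example, and SQLite's `time(t, '+1 minute')` stands in for DATEADD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (t TEXT, value REAL)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [("09:00:00", 1.2), ("09:01:00", 2.3), ("09:05:00", 2.4), ("09:06:00", 2.5)],
)

query = """
WITH RECURSIVE minutes(t, max_t) AS (
    -- anchor: first and last timestamp present in the data
    SELECT MIN(t), MAX(t) FROM sample
    UNION ALL
    -- step: add one minute until the last timestamp is reached
    SELECT time(t, '+1 minute'), max_t FROM minutes WHERE t < max_t
)
SELECT m.t, COALESCE(s.value, 0) AS value
FROM minutes AS m
LEFT JOIN sample AS s ON s.t = m.t
ORDER BY m.t
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

Minutes with no matching sample row come back with a value of 0, which covers 09:02 through 09:04 for this data.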
You can use a recursive CTE to build a calendar table and then OUTER JOIN against it.
CREATE TABLE T(
[Time] Time,
Value float
);
insert into T values ('9:00',1.2);
insert into T values ('9:01',2.3);
insert into T values ('9:05',2.4);
insert into T values ('9:06',2.5);
Query 1:
with cte as (
SELECT MIN([Time]) minDt,MAX([Time] ) maxDt
FROM T
UNION ALL
SELECT dateadd(minute, 1, minDt) ,maxDt
FROM CTE
WHERE dateadd(minute, 1, minDt) <= maxDt
)
SELECT t1.minDt 'Time',
ISNULL(t2.[Value],0) 'Value'
FROM CTE t1
LEFT JOIN T t2 on t2.[Time] = t1.minDt
Results:
| Time | Value |
|------------------|-------|
| 09:00:00.0000000 | 1.2 |
| 09:01:00.0000000 | 2.3 |
| 09:02:00.0000000 | 0 |
| 09:03:00.0000000 | 0 |
| 09:04:00.0000000 | 0 |
| 09:05:00.0000000 | 2.4 |
| 09:06:00.0000000 | 2.5 |

SQL Find Nearest Effective Date in Related Table

I'll preface this by stating that this problem is similar to SQL Join on Nearest Less Than Date but that the solution there doesn't work for my problem. Instead of selecting a single column, I need the results of a table that is 'filtered' based on the nearest date.
I have three tables. The main table contains time ticket data in the form:
ticketId
ticketNumber
ticketDate
projectId
A secondary table tracks rate schedules for resources on each daily ticket for the project. It looks something like this:
scheduleId
projectId
effectiveDate
There is also a third table that is related to the second that actually contains the applicable rates. Something like this:
scheduleId
straightTime
overTime
Joining the first two tables on projectId (obviously) replicates data for every record in the rate schedule for the project. If I have 3 rate schedules for project 1, then ticket records result in something like:
ticketNumber | ticketDate | projectId | effectiveDate | scheduleId
------------- | ------------ | ----------- | -------------- | ----------
1234 | 2016-06-18 | 25 | 2016-06-01 | 1
1234 | 2016-06-18 | 25 | 2016-06-15 | 2
1234 | 2016-06-18 | 25 | 2016-06-30 | 3
Selecting the effectiveDate into my results is straightforward with the example:
SELECT *
, (SELECT TOP 1 t1.effectiveFrom
FROM dbo.labourRateSchedule t1
WHERE t1.effectiveFrom <= t2.[date] and t1.projectId = t2.projectId
ORDER BY t1.effectiveFrom desc) as effectiveDate
FROM dbo.timeTicket t2
ORDER BY t2.[date]
However, I need to be able to join the ID of dbo.labourRateSchedule onto the third table to get the actual rates that apply. Adding the t1.ID to the SELECT statement does not make it accessible to JOIN into another related table.
I've been trying to JOIN the SELECT statement in the FROM clause, but the results only ever contain the last effectiveDate value instead of the one closest to the applicable ticketDate.
I would hugely appreciate any help on this!
You can move your subquery to the FROM clause by using CROSS APPLY:
SELECT *
FROM dbo.timeTicket tt
CROSS APPLY
(
SELECT TOP(1) *
FROM dbo.labourRateSchedule lrs
WHERE lrs.projectId = tt.projectId
AND lrs.effectiveFrom <= tt.date
ORDER BY lrs.effectiveFrom desc
) best_lrs
JOIN dbo.schedule s on s.schedule_id = best_lrs.schedule_id
ORDER BY tt.date
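For readers who want to experiment without SQL Server, the "latest schedule on or before the ticket date, then join to the rates" logic can be sketched with Python's built-in sqlite3 module. SQLite has no CROSS APPLY, so this sketch swaps in a correlated subquery inside the join condition, which is a different technique with the same result when schedule IDs are unique. All table names, column names, and rate figures here are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_ticket (ticket_number INTEGER, ticket_date TEXT, project_id INTEGER);
CREATE TABLE rate_schedule (schedule_id INTEGER, project_id INTEGER, effective_date TEXT);
CREATE TABLE rates (schedule_id INTEGER, straight_time REAL, over_time REAL);
INSERT INTO time_ticket VALUES (1234, '2016-06-18', 25);
INSERT INTO rate_schedule VALUES (1, 25, '2016-06-01'), (2, 25, '2016-06-15'), (3, 25, '2016-06-30');
-- rate amounts are invented purely for illustration
INSERT INTO rates VALUES (1, 50.0, 75.0), (2, 55.0, 82.5), (3, 60.0, 90.0);
""")

query = """
SELECT tt.ticket_number, rs.effective_date, r.straight_time, r.over_time
FROM time_ticket AS tt
JOIN rate_schedule AS rs
  ON rs.schedule_id = (
        -- latest schedule for the project that is already in effect on the ticket date
        SELECT s.schedule_id
        FROM rate_schedule AS s
        WHERE s.project_id = tt.project_id
          AND s.effective_date <= tt.ticket_date
        ORDER BY s.effective_date DESC
        LIMIT 1)
JOIN rates AS r ON r.schedule_id = rs.schedule_id
ORDER BY tt.ticket_date
"""
matched = conn.execute(query).fetchall()
print(matched)  # [(1234, '2016-06-15', 55.0, 82.5)]
```

The ticket dated 2016-06-18 picks up the schedule effective 2016-06-15, not the later one, and the rates come from the third table through the schedule ID.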
Can you try something like this? You will need to adjust it, as you didn't post all the information.
SELECT A.*, C.*
FROM timeTicket A
INNER JOIN (SELECT * , ROW_NUMBER() OVER (PARTITION BY projectId ORDER BY effectiveFrom DESC) AS RN
FROM labourRateSchedule) B ON A.projectId=B.projectId AND B.RN=1
INNER JOIN YOUR3TABLE C ON B.SCHEDULEID=C.SCHEDULEID
You can do this with a CTE and the RANK function:
create table timeTicket (ticketId int,
ticketNumber int ,
ticketDate smalldatetime ,
projectId int )
go
create table labourRateSchedule
(scheduleId int,
projectId int,
effectiveDate smalldatetime )
go
create table ApplicableRates
(scheduleId int,
straightTime smalldatetime ,
overTime smalldatetime)
go
insert into timeTicket
select 1 , 1234 ,'2016-06-18' ,25
go
insert into labourRateSchedule
select 1 , 25 ,'2016-06-01'
union all select 2 , 25 ,'2016-06-15'
union all select 3 , 25 ,'2016-06-30'
go
insert into ApplicableRates
select 1 , '2016-06-07' ,'2016-06-07'
union all select 2 , '2016-06-17' ,'2016-06-17'
union all select 3 , '2016-06-22' ,'2016-06-25'
go
with cte
as (
    select t1.ticketNumber, t1.ticketDate, t1.projectId, t2.effectiveDate, t3.scheduleId, t3.straightTime
        ,t3.overTime
        -- rank schedules by how close their effective date is to the ticket date
        ,rank() over (partition by t1.ticketNumber order by abs(DATEDIFF(day, t1.ticketDate, t2.effectiveDate))) DayDiff
    from timeTicket t1 join labourRateSchedule t2
        on t1.projectId = t2.projectId
    join ApplicableRates t3
        on t2.scheduleId = t3.scheduleId)
select * from cte where DayDiff = 1

SQL Select First column and for each row select unique ID and the last date

I have a problem this morning. I have tried many solutions and nothing gave me the expected result.
I have a table that looks like this :
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 1 | 2001 |
| 1 | 2 | 2002 |
| 1 | 3 | 2003 |
| 1 | 4 | 2004 |
| 2 | 1 | 2001 |
| 2 | 2 | 2002 |
| 2 | 3 | 2003 |
| 2 | 4 | 2004 |
+----+----------+-------+
I want a query that returns, for each unique ID, the row with the latest date for that ID, like this:
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 4 | 2004 |
| 2 | 4 | 2004 |
+----+----------+-------+
But I don't have any idea how to do that.
I tried JOIN, CROSS APPLY...
If you have any ideas, thank you.
Clement FAYARD
declare @t table (ID INT,Col2 INT,Date INT)
insert into @t(ID,Col2,Date)values (1,1,2001)
insert into @t(ID,Col2,Date)values (1,2,2001)
insert into @t(ID,Col2,Date)values (1,3,2001)
insert into @t(ID,Col2,Date)values (1,4,2001)
insert into @t(ID,Col2,Date)values (2,1,2002)
insert into @t(ID,Col2,Date)values (2,2,2002)
insert into @t(ID,Col2,Date)values (2,3,2002)
insert into @t(ID,Col2,Date)values (2,4,2002)
;with cte as(
    select
        *,
        rn = row_number() over(partition by ID order by Col2 desc)
    from @t
)
select
ID,
Col2,
Date
from cte
where
rn = 1
SELECT ID,MAX(Col2),MAX(Date) FROM tableName GROUP BY ID
If COL2 and DATE always reach their highest values in the same row, then you can try
SELECT ID, MAX(COL2), MAX(DATE)
FROM Table1
GROUP BY ID
But it is not really a good solution.
The alternative is a correlated subquery with CROSS APPLY:
SELECT ids.ID, sub1.COL2, sub1.DATE
FROM (SELECT DISTINCT ID FROM yourtable) ids
CROSS APPLY -- APPLY lets the subquery reference the outer row; a plain INNER JOIN cannot
    (SELECT TOP 1 COL2, DATE
     FROM yourtable sub2
     WHERE sub2.ID = ids.ID
     ORDER BY DATE DESC, COL2 DESC) sub1
You didn't say what your table is called, so I'll assume below that it is tbl:
SELECT m.ID, m.COL2, m.DATE
FROM tbl m
LEFT JOIN tbl o ON m.ID = o.ID AND m.DATE < o.DATE
WHERE o.DATE is NULL
ORDER BY m.ID ASC
Explanation:
The query left joins the table tbl aliased as m (for "max") against itself (alias o, for "others") using the column ID; the condition m.DATE < o.DATE will combine all the rows from m with rows from o having a greater value in DATE. The row having the maximum value of DATE for a given value of ID from m has no pair in o (there is no value greater than the maximum value). Because of the LEFT JOIN this row will be combined with a row of NULLs. The WHERE clause selects only these rows that have NULL for o.DATE (i.e. they have the maximum value of m.DATE).
Check the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
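The self-join described in the explanation above is plain standard SQL, so it can be checked with Python's built-in sqlite3 module. This is a small sketch using the question's sample rows; the table name `tbl` follows the answer's assumption, and the lowercase column names are a choice made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, col2 INTEGER, date INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(i, c, 2000 + c) for i in (1, 2) for c in (1, 2, 3, 4)],
)

query = """
SELECT m.id, m.col2, m.date
FROM tbl AS m
LEFT JOIN tbl AS o
       ON m.id = o.id AND m.date < o.date   -- pair m with every later row of the same id
WHERE o.date IS NULL                        -- keep only rows that have no later row
ORDER BY m.id
"""
latest = conn.execute(query).fetchall()
print(latest)  # [(1, 4, 2004), (2, 4, 2004)]
```

If two rows of the same ID share the maximum date, both are returned, since neither has a strictly later partner.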
In order to do this you MUST exclude COL2. Your query should look like this:
SELECT ID, MAX(DATE)
FROM table_name
GROUP BY ID
The above query produces the Maximum Date for each ID.
Having COL2 in that query does not make sense, unless you want the maximum date for each combination of ID and COL2.
In that case you can run:
SELECT ID, COL2, MAX(DATE)
FROM table_name
GROUP BY ID, COL2;
When you use aggregation functions(like max()), you must always group by all the other columns you have in the select statement.
I think you are facing this problem because there are some fundamental flaws in the design of the table. Usually ID should be a primary key (which is unique). In this table you have repeated IDs. I do not understand the business logic behind the table, but it seems flawed to me.

Group rows into sequences using a sliding window on a DateTime column

I have a table that stores timestamped events. I want to group the events into 'sequences' by using 5-min sliding window on the timestamp column, and write the 'sequence ID' (any ID that can distinguish sequences) and 'order in sequence' into another table.
Input - event table:
+----+-------+-----------+
| Id | Name | Timestamp |
+----+-------+-----------+
| 1 | test | 00:00:00 |
| 2 | test | 00:06:00 |
| 3 | test | 00:10:00 |
| 4 | test | 00:14:00 |
+----+-------+-----------+
Desired output - sequence table. Here SeqId is the ID of the starting event, but it doesn't have to be, just something to uniquely identify a sequence.
+---------+-------+----------+
| EventId | SeqId | SeqOrder |
+---------+-------+----------+
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 2 | 2 |
| 4 | 2 | 3 |
+---------+-------+----------+
What would be the best way to do it? This is MSSQL 2008, I can use SSAS and SSIS if they make things easier.
CREATE TABLE #Input (Id INT, Name VARCHAR(20), Time_stamp TIME)
INSERT INTO #Input
VALUES
( 1 ,'test','00:00:00' ),
( 2 ,'test','00:06:00' ),
( 3 ,'test','00:10:00' ),
( 4 ,'test','00:14:00' )
SELECT * FROM #Input;
WITH cte AS -- add a sequential number
(
SELECT *,
ROW_NUMBER() OVER(ORDER BY Id) AS sort
FROM #Input
), cte2 as -- find the Id's with a difference of more than 5min
(
SELECT cte.*,
CASE WHEN DATEDIFF(MI, cte_1.Time_stamp,cte.Time_stamp) < 5 THEN 0 ELSE 1 END as GrpType
FROM cte
LEFT OUTER JOIN
cte as cte_1 on cte.sort =cte_1.sort +1
), cte3 as -- assign a SeqId
(
SELECT GrpType, Time_Stamp,ROW_NUMBER() OVER(ORDER BY Time_stamp) SeqId
FROM cte2
WHERE GrpType = 1
), cte4 as -- find the Time_Stamp range per SeqId
(
SELECT cte3.*,cte_2.Time_stamp as TS_to
FROM cte3
LEFT OUTER JOIN
cte3 as cte_2 on cte3.SeqId =cte_2.SeqId -1
)
-- final query
SELECT
t.Id,
cte4.SeqId,
ROW_NUMBER() OVER(PARTITION BY cte4.SeqId ORDER BY t.Time_stamp) AS SeqOrder
FROM cte4 INNER JOIN #Input t ON t.Time_stamp>=cte4.Time_stamp AND (t.Time_stamp <cte4.TS_to OR cte4.TS_to IS NULL);
This code is slightly more complex, but it returns the expected output (which Gordon Linoff's solution doesn't...) and it's even slightly faster.
You seem to want things grouped together when they are less than five minutes apart. You can assign the groups by getting the previous time stamp and marking the beginning of a group. You then need to do a cumulative sum to get the group id:
with e as (
    select e.*,
           -- flag = 1 marks the start of a new sequence: no previous event, or a gap of 5+ minutes
           (case when datediff(minute, prev_timestamp, timestamp) < 5 then 0 else 1 end) as flag
    from (select e.*,
                 (select top 1 e2.timestamp
                  from events e2
                  where e2.timestamp < e.timestamp
                  order by e2.timestamp desc
                 ) as prev_timestamp
          from events e
         ) e
)
select e.Id as eventId, e.seqId,
       row_number() over (partition by seqId order by timestamp) as seqOrder
from (select e.*, (select sum(flag) from e e2 where e2.timestamp <= e.timestamp) as seqId
      from e
     ) e;
By the way, this logic is easier to express in SQL Server 2012+ because the window functions are more powerful.
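To give a feel for how the window-function version might look, here is a minimal sketch run through Python's built-in sqlite3 module (it assumes SQLite 3.25 or later, which added window functions). It uses LAG to find the previous timestamp and a running SUM to turn "start of sequence" flags into sequence IDs. The table name `events`, the column `ts`, and the use of `strftime('%s', ...)` in place of DATEDIFF are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, name TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "test", "00:00:00"), (2, "test", "00:06:00"),
     (3, "test", "00:10:00"), (4, "test", "00:14:00")],
)

query = """
WITH flagged AS (
    SELECT id, ts,
           -- 1 when there is no previous event or the gap is 5 minutes (300 s) or more
           CASE WHEN (strftime('%s', ts)
                      - strftime('%s', LAG(ts) OVER (ORDER BY ts))) < 300
                THEN 0 ELSE 1 END AS is_start
    FROM events
),
grouped AS (
    -- running total of start flags gives a sequence id
    SELECT id, ts, SUM(is_start) OVER (ORDER BY ts) AS seq_id
    FROM flagged
)
SELECT id,
       seq_id,
       ROW_NUMBER() OVER (PARTITION BY seq_id ORDER BY ts) AS seq_order
FROM grouped
ORDER BY id
"""
sequences = conn.execute(query).fetchall()
print(sequences)  # [(1, 1, 1), (2, 2, 1), (3, 2, 2), (4, 2, 3)]
```

Here the sequence ID is a running counter rather than the ID of the starting event, which the question says is acceptable as long as it distinguishes sequences.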