SQL: Selecting rows at varying intervals

I've got a situation where I have a huge table, containing a huge number of rows, which looks like (for example):
id Timestamp Value
14574499 2011-09-28 08:33:32.020 99713.3000
14574521 2011-09-28 08:33:42.203 99713.3000
14574540 2011-09-28 08:33:47.017 99713.3000
14574559 2011-09-28 08:38:53.177 99720.3100
14574578 2011-09-28 08:38:58.713 99720.3100
14574597 2011-09-28 08:39:03.590 99720.3100
14574616 2011-09-28 08:39:08.950 99720.3100
14574635 2011-09-28 08:39:13.793 99720.3100
14574654 2011-09-28 08:39:19.063 99720.3100
14574673 2011-09-28 08:39:23.780 99720.3100
14574692 2011-09-28 08:39:29.167 99758.6400
14574711 2011-09-28 08:39:33.967 99758.6400
14574730 2011-09-28 08:39:40.803 99758.6400
14574749 2011-09-28 08:39:49.297 99758.6400
Ok, so the rules are:
The timestamps can be any number of seconds apart (5s, 30s, 60s, etc.); it varies depending on how old the record is (archiving takes place).
I want to be able to query this table to select each nth row based on the timestamp.
So for example:
Select * from mytable where intervalBetweenTheRows = 30s
(for the purposes of this question, based on the presumption that the interval requested is always at a higher precision than is available in the database)
So, every nth row based on the time between each row
Any ideas?!
Karl
For those of you who are interested, the recursive CTE was actually quite slow, so I came up with a slightly different method:
SELECT TOP 500
MIN(pvh.[TimeStamp]) as [TimeStamp],
AVG(pvh.[Value]) as [Value]
FROM
PortfolioValueHistory pvh
WHERE
pvh.PortfolioID = @PortfolioID
AND pvh.[TimeStamp] >= @StartDate
AND pvh.[TimeStamp] <= @EndDate
GROUP BY
FLOOR(DateDiff(Second, '01/01/2011 00:00:00', pvh.[TimeStamp]) / @ResolutionInSeconds)
ORDER BY
[TimeStamp] ASC
I take the number of seconds between an arbitrary base date and the timestamp to get an integer to work with, divide that by my desired resolution and floor it, then group by the result, taking the minimum timestamp (the first of that 'bucket' of stamps) and the average value for that period.
This is used to plot a graph of historical data, so the average value does me fine.
This was the fastest approach I could come up with given the size of the table.
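To make the bucketing concrete, here is a small illustration of how the grouping key is derived (the 30-second resolution is just an assumed value for the example; since DATEDIFF returns an integer, the division already truncates, so the FLOOR is mostly belt-and-braces):
-- Each timestamp maps to an integer bucket number; rows whose timestamps
-- fall inside the same @ResolutionInSeconds window share a bucket and so
-- collapse into one group.
DECLARE @ResolutionInSeconds INT = 30
SELECT [TimeStamp],
       DATEDIFF(SECOND, '2011-01-01', [TimeStamp]) / @ResolutionInSeconds AS BucketNo
FROM PortfolioValueHistory
ORDER BY [TimeStamp]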
Thanks for your help all.

Assuming that whether a row is returned depends on the time elapsed since the previously returned row, this needs a procedural approach. Recursive CTEs might be a bit more efficient than a cursor though.
WITH RecursiveCTE
AS (SELECT TOP 1 *
    FROM #T
    ORDER BY [Timestamp]
    UNION ALL
    SELECT id,
           [Timestamp],
           Value
    FROM (
          --Can't use TOP directly
          SELECT T.*,
                 rn = ROW_NUMBER() OVER (ORDER BY T.[Timestamp])
          FROM #T T
          JOIN RecursiveCTE R
            ON T.[Timestamp] >=
               DATEADD(SECOND, 30, R.[Timestamp])) R
    WHERE R.rn = 1)
SELECT *
FROM RecursiveCTE
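One practical note: SQL Server limits recursive CTEs to 100 recursion levels by default, so if the requested range yields more than 100 rows this will fail unless the limit is raised on the final statement, for example:
SELECT *
FROM RecursiveCTE
OPTION (MAXRECURSION 0) -- 0 removes the default limit of 100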

This isn't as elegant as Martin S's CTE, but instead uses interpolation on predefined sample points to get the first sample in between each pair of sampling times.
If there is no sample in a period then no record is returned.
DECLARE @SampleTime DATETIME
DECLARE @NumberSamples INT
DECLARE @SampleInterval INT
SET @SampleTime = '2011-09-28 08:33:32.020' -- Start time
SET @NumberSamples = 20 -- Or however many sample intervals you need to evaluate
SET @SampleInterval = 30 -- Seconds
CREATE TABLE #tmpTimesToSample
(
SampleID INT,
SampleTime DATETIME NULL
)
-- Works out the time intervals, 0 to 19
INSERT INTO #tmpTimesToSample(SampleID, SampleTime)
SELECT TOP (@NumberSamples)
sv.number,
DATEADD(ss, sv.number * @SampleInterval, @SampleTime)
FROM
master..spt_values sv
WHERE
type = 'p'
ORDER BY
sv.number ASC
-- Now interpolate these sample intervals back into the data table
SELECT ID, [TimeStamp], Value
FROM
(
SELECT mt.Id, mt.[TimeStamp], mt.Value, row_number() over (partition by tmp.SampleID order by mt.[TimeStamp]) as RowNum
FROM #tmpTimesToSample tmp RIGHT OUTER JOIN MyTable mt
on mt.[TimeStamp] BETWEEN tmp.SampleTime and DATEADD(ss, @SampleInterval, tmp.SampleTime)
) x
WHERE x.RowNum = 1 -- Only want the first sample in each bin
DROP TABLE #tmpTimesToSample
Test data:
CREATE TABLE MyTable
(
ID BIGINT,
[TimeStamp] DATETIME,
[Value] DECIMAL(18,4)
)
GO
insert into MyTable values(14574499, '2011-09-28 08:33:32.020', 99713.3000)
insert into MyTable values(14574521 ,'2011-09-28 08:33:42.203', 99713.3000)
insert into MyTable values(14574540 ,'2011-09-28 08:33:47.017', 99713.3000)
insert into MyTable values(14574559 ,'2011-09-28 08:38:53.177', 99720.3100)
insert into MyTable values(14574578 ,'2011-09-28 08:38:58.713', 99720.3100)
insert into MyTable values(14574597 ,'2011-09-28 08:39:03.590', 99720.3100)
insert into MyTable values(14574616 ,'2011-09-28 08:39:08.950', 99720.3100)
insert into MyTable values(14574635 ,'2011-09-28 08:39:13.793', 99720.3100)
insert into MyTable values(14574654 ,'2011-09-28 08:39:19.063', 99720.3100)
insert into MyTable values(14574673 ,'2011-09-28 08:39:23.780', 99720.3100)
insert into MyTable values(14574692 ,'2011-09-28 08:39:29.167', 99758.6400)
insert into MyTable values(14574711 ,'2011-09-28 08:39:33.967', 99758.6400)
insert into MyTable values(14574730 ,'2011-09-28 08:39:40.803', 99758.6400)
insert into MyTable values(14574749 ,'2011-09-28 08:39:49.297', 99758.6400)
go

This will give you all rows that have a 30 second interval to the next row. Both rows will be side by side.
Select T1.*, T2.*
From MyTable T1
Inner Join MyTable T2
On DateDiff(second, T1.[TimeStamp], T2.[TimeStamp]) = 30

Related

Query to return the lowest SUM of values over X consecutive days

I'm not even sure how to word this one!...
I have a table with two columns, Price (double) and StartDate (Date). I need to be able to query the table and return X number of rows, let's say 3 for this example. I need to pull back the 3 rows with consecutive dates (e.g. 7th, 8th, 9th of May 2019) that have the lowest summed price values within a date range.
I'm thinking of a function which takes startDateRange, endDateRange, duration.
It would return a number of rows (duration) between startDateRange and endDateRange, and those rows, when summed, would be the cheapest (lowest) sum of any run of consecutive dates of that length within the range.
So as an example, if I wanted the cheapest 3 consecutive dates between 1st May 2019 and 14th May 2019, those 3 rows would be returned.
I think possibly LEAD() and LAG() might be a starting point, but I'm not really a SQL person, so I'm not sure if there's a better way around this.
I've developed some C# in my business layer to do this currently, but over large datasets it's a bit sluggish; it would be nice to get a list of records straight from my data layer.
Any ideas would be greatly appreciated!
Thanks in advance.
You can calculate averages over 3 days with a window function. Then use top 1 to pick the set of 3 rows with the lowest average:
select top 1 StartDt
, AvgPrice
from (
select StartDt
, avg(Price) over (order by StartDt rows between 2 preceding
and current row) AvgPrice
, count(*) over (order by StartDt rows between 2 preceding
and current row) RowCnt
from prices
) sets_of_3_days
where RowCnt = 3 -- ignore first two rows
order by
AvgPrice asc
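Note that this only returns the last day of the winning window and its average. If you also need the three underlying rows, one way (a sketch, assuming the same prices table with StartDt and Price columns and one row per date) is to pull the winning window's end date and join back on the date range:
with sets_of_3_days as (
    select StartDt
         , avg(Price) over (order by StartDt rows between 2 preceding and current row) AvgPrice
         , count(*) over (order by StartDt rows between 2 preceding and current row) RowCnt
    from prices
), winner as (
    select top 1 StartDt
    from sets_of_3_days
    where RowCnt = 3
    order by AvgPrice
)
select p.StartDt, p.Price
from prices p
join winner w
  on p.StartDt between dateadd(day, -2, w.StartDt) and w.StartDt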
Here is a solution; the logic starts where the dates are declared. All the best.
--table example
declare @laVieja table (price float, fecha date)
insert into @laVieja values (632,'20150101')
insert into @laVieja values (649,'20150102')
insert into @laVieja values (632,'20150103')
insert into @laVieja values (607,'20150104')
insert into @laVieja values (598,'20150105')
insert into @laVieja values (624,'20150106')
insert into @laVieja values (641,'20150107')
insert into @laVieja values (598,'20150108')
insert into @laVieja values (556,'20150109')
insert into @laVieja values (480,'20150110')
insert into @laVieja values (510,'20150111')
insert into @laVieja values (541,'20150112')
insert into @laVieja values (634,'20150113')
insert into @laVieja values (634,'20150114')
-- end of setting up table example
--declaring dates
declare @fechaIni date, @fechaEnds date
set @fechaIni = '20150101'
set @fechaEnds = '20150114'
--assigning order based on price
select *, ROW_NUMBER() over (order by price) as unOrden
into #laVieja
from @laVieja
where fecha between @fechaIni and @fechaEnds
-- declaring variables for cycle
declare @iteracion float = 1, @iteracionMaxima float, @fechaPrimera date, @fechaSegunda date, @fechaTercera date
select @iteracionMaxima = max(unOrden) from #laVieja
--starting cycle
while (@iteracion <= @iteracionMaxima)
begin
    --assigning dates to variables
    select @fechaPrimera = fecha from #laVieja where unOrden = @iteracion
    select @fechaSegunda = fecha from #laVieja where unOrden = @iteracion + 1
    select @fechaTercera = fecha from #laVieja where unOrden = @iteracion + 2
    --comparing variables
    if (@fechaTercera = DATEADD(day,1,@fechaSegunda) and @fechaSegunda = DATEADD(day,1,@fechaPrimera))
    begin
        select * from #laVieja
        where unOrden in (@iteracion, @iteracion+1, @iteracion+2)
        set @iteracion = @iteracionMaxima
    end
    set @iteracion += 1
end
You can use window functions with an OVER (... ROWS BETWEEN) clause to calculate the sum/average over a specific number of rows. You can then use ROW_NUMBER to find the other two rows.
WITH cte1 AS (
SELECT *
, SUM(Price) OVER (ORDER BY Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS wsum
, ROW_NUMBER() OVER (ORDER BY Date) AS rn
FROM #t
), cte2 AS (
SELECT TOP 1 rn
FROM cte1
WHERE rn > 2
ORDER BY wsum, Date
)
SELECT *
FROM cte1
WHERE rn BETWEEN (SELECT rn FROM cte2) - 2 AND (SELECT rn FROM cte2)
In the above query, replace 2 with the size of window - 1.
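For reference, a minimal setup sketch for the #t table used above (the Date and Price column names are taken from the query; the values themselves are made up):
CREATE TABLE #t ([Date] DATE, Price DECIMAL(10, 2))
INSERT INTO #t ([Date], Price) VALUES
('2019-05-01', 10.00),
('2019-05-02',  8.00),
('2019-05-03',  7.50),
('2019-05-04', 12.00),
('2019-05-05',  6.00),
('2019-05-06',  5.50),
('2019-05-07',  9.00)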

Cannot get the values in time period using SQL Server Management Studio

Let me explain the process:
We've got scanned questionnaires.
The OCR system processes these questionnaires to extract the data.
Then all recognized data (form_id, question_number, answer, etc.) goes into the database.
For each form there are about 120-150 rows in the database:
53453, 1, A, 2016-10-30 23:54:18.590
53453, 2, B, 2016-10-30 23:54:18.690
53453, 3, C, 2016-10-30 23:54:18.790 and so on
As you can see, it is quite difficult to find a duplicate of a questionnaire form in the database. SQL is not my strong point, so I need your help. I need to select IDs according to this condition: an insertionTime difference within 1 minute is not a duplicate, but if the ID exists somewhere else at another time, it is a duplicate.
P.S. I did my best trying to explain my issue. Excuse my English.
Make sure your last column's data type is DATETIME, then do:
SELECT tA.*
FROM MyTable tA INNER JOIN MyTable tB ON (tA.ID = tB.ID AND tA.question_number = tB.question_number AND tA.answer = tB.answer)
WHERE DATEDIFF(minute,tA.DateColumn,tB.DateColumn) < 2 -- DATEDIFF returns INT
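One thing to watch: as written, every row also joins to itself (the DATEDIFF of a row with itself is 0, which is less than 2), so you may want to rule out self-matches, for example by requiring the other row to be strictly later (a sketch, reusing the column names from the query above):
SELECT tA.*
FROM MyTable tA
INNER JOIN MyTable tB
    ON tA.ID = tB.ID
   AND tA.question_number = tB.question_number
   AND tA.answer = tB.answer
   AND tA.DateColumn < tB.DateColumn -- exclude a row matching itself
WHERE DATEDIFF(minute, tA.DateColumn, tB.DateColumn) < 2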
Do you check only the ID, or also the question and answer? I wrote my query for only ID and date, because you said that if the ID exists in another row with a different time (a difference of more than a minute), it is a duplicate; you didn't say anything about checking the question/answer. In the last sample row I modified the time.
DECLARE #TMP TABLE (
ID INT,
VALUE INT,
VALUE2 VARCHAR(5),
DATES DATETIME
)
INSERT INTO #TMP
SELECT 53453, 1, 'A', '2016-10-30 23:54:18.590'
INSERT INTO #TMP
SELECT 53453, 2, 'B', '2016-10-30 23:54:18.690'
INSERT INTO #TMP
SELECT 53453, 3, 'C', '2016-10-30 23:56:20.590'
SELECT ID, MIN(DATES) DATES
INTO #TMP_ID
FROM #TMP
GROUP BY ID
-- MORE THAN MINUTE
SELECT *
FROM #TMP T
WHERE EXISTS (
SELECT NULL
FROM #TMP_ID X
WHERE DATEDIFF(second, x.dates, t.DATES) > 60
and x.id = t.id
)
-- LESS THAN MINUTE
SELECT *
FROM #TMP T
WHERE NOT EXISTS (
SELECT NULL
FROM #TMP_ID X
WHERE DATEDIFF(second, x.dates, t.DATES) > 60
and x.id = t.id
)
DROP TABLE #TMP_ID
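For reference, against the three sample rows above the first query (MORE THAN MINUTE) should return only the third row, whose timestamp is more than a minute after that ID's first insertion time, and the second query (LESS THAN MINUTE) should return the first two rows.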

TSQL - Run date comparison for "duplicates"/false positives on initial query?

I'm pretty new to SQL and am working on pulling some data from several very large tables for analysis. The data is basically triggered events for assets on a system. The events all have a created_date (datetime) field that I care about.
I was able to put together the query below to get the data I need (YAY):
SELECT
event.efkey
,event.e_id
,event.e_key
,l.l_name
,event.created_date
,asset.a_id
,asset.asset_name
FROM event
LEFT JOIN asset
ON event.a_key = asset.a_key
LEFT JOIN l
ON event.l_key = l.l_key
WHERE event.e_key IN (350, 352, 378)
ORDER BY asset.a_id, event.created_date
However, while this gives me the data for the specific events I want, I still have another problem. Assets can trigger these events repeatedly, which can result in large numbers of "false positives" for what I'm looking at.
What I need to do is go through the result set of the query above and remove any events for an asset that occur closer than N minutes together (say 30 minutes for this example). So IF the asset_ID is the same AND the event.created_date is within 30 minutes of another event for that asset in the set THEN I want that removed. For example:
For the following records
a_id 1124 created 2016-02-01 12:30:30
a_id 1124 created 2016-02-01 12:35:31
a_id 1124 created 2016-02-01 12:40:33
a_id 1124 created 2016-02-01 12:45:42
a_id 1124 created 2016-02-02 12:30:30
a_id 1124 created 2016-02-02 13:00:30
a_id 1115 created 2016-02-01 12:30:30
I'd want to return only:
a_id 1124 created 2016-02-01 12:30:30
a_id 1124 created 2016-02-02 12:30:30
a_id 1124 created 2016-02-02 13:00:30
a_id 1115 created 2016-02-01 12:30:30
I tried referencing this and this but I can't make the concepts there work for me. I know I probably need to do a SELECT * FROM (my existing query) but I can't seem to do that without ending up with tons of "multi-part identifier can't be bound" errors (and I have no experience creating temp tables, my attempts at that have failed thus far). I also am not exactly sure how to use DATEDIFF as the date filtering function.
Any help would be greatly appreciated! If you could dumb it down for a novice (or link to explanations) that would also be helpful!
This is a trickier problem than it initially appears. The hard part is capturing the previous good row and removing the next bad rows but not allowing those bad rows to influence whether or not the next row is good. Here is what I came up with. I've tried to explain what is going on with comments in the code.
--sample data since I don't have your table structure and your original query won't work for me
declare #events table
(
id int,
timestamp datetime
)
--note that I changed some of your sample data to test some different scenarios
insert into #events values( 1124, '2016-02-01 12:30:30')
insert into #events values( 1124, '2016-02-01 12:35:31')
insert into #events values( 1124, '2016-02-01 12:40:33')
insert into #events values( 1124, '2016-02-01 13:05:42')
insert into #events values( 1124, '2016-02-02 12:30:30')
insert into #events values( 1124, '2016-02-02 13:00:30')
insert into #events values( 1115, '2016-02-01 12:30:30')
--using a cte here to split the result set of your query into groups
--by id (you would want to partition by whatever criteria you use
--to determine that rows are talking about the same event)
--the row_number function gets the row number for each row within that
--id partition
--the over clause specifies how to break up the result set into groups
--(partitions) and what order to put the rows in within that group so
--that the numbering stays consistent
;with orderedEvents as
(
select id, timestamp, row_number() over (partition by id order by timestamp) as rn
from #events
--you would replace #events here with your query
)
--using a second recursive cte here to determine which rows are "good"
--and which ones are not.
, previousGoodTimestamps as
(
--this is the "seeding" part of the recursive cte where I pick the
--first rows of each group as being a desired result. Since they
--are the first in each group, I know they are good. I also assign
--their timestamp as the previous good timestamp since I know that
--this row is good.
select id, timestamp, rn, timestamp as prev_good_timestamp, 1 as is_good
from orderedEvents
where rn = 1
union all
--this is the recursive part of the cte. It takes the rows we have
--already added to this result set and joins those to the "next" rows
--(as defined by our ordering in the first cte). Then we output
--those rows and do some calculations to determine if this row is
--"good" or not. If it is "good" we set it's timestamp as the
--previous good row timestamp so that rows that come after this one
--can use it to determine if they are good or not. If a row is "bad"
--we just forward along the last known good timestamp to the next row.
--
--We also determine if a row is good by checking if the last good row
--timestamp plus 30 minutes is less than or equal to the current row's
--timestamp. If it is then the row is good.
select e2.id
, e2.timestamp
, e2.rn
, last_good_timestamp.timestamp
, case
when dateadd(mi, 30, last_good_timestamp.timestamp) <= e2.timestamp then 1
else 0
end
from previousGoodTimestamps e1
inner join orderedEvents e2 on e2.id = e1.id and e2.rn = e1.rn + 1
--I used a cross apply here to calculate the last good row timestamp
--once. I could have used two identical subqueries above in the select
--and case statements, but I would rather not duplicate the code.
cross apply
(
select case
when e1.is_good = 1 then e1.timestamp --if the last row is good, just use its timestamp
else e1.prev_good_timestamp --the last row was bad, forward on what it had for the last good timestamp
end as timestamp
) last_good_timestamp
)
select *
from previousGoodTimestamps
where is_good = 1 --only take the "good" rows
Links to MSDN for some of the more complicated things here:
CTEs and Recursive CTEs
CROSS APPLY
-- Sample data.
declare #Samples as Table ( Id Int Identity, A_Id Int, CreatedDate DateTime );
insert into #Samples ( A_Id, CreatedDate ) values
( 1124, '2016-02-01 12:30:30' ),
( 1124, '2016-02-01 12:35:31' ),
( 1124, '2016-02-01 12:40:33' ),
( 1124, '2016-02-01 12:45:42' ),
( 1124, '2016-02-02 12:30:30' ),
( 1124, '2016-02-02 13:00:30' ),
( 1115, '2016-02-01 12:30:30' );
select * from #Samples;
-- Calculate the windows of 30 minutes before and after each CreatedDate and check for conflicts with other rows.
with Ranges as (
select Id, A_Id, CreatedDate,
DateAdd( minute, -30, S.CreatedDate ) as RangeStart, DateAdd( minute, 30, S.CreatedDate ) as RangeEnd
from #Samples as S )
select Id, A_Id, CreatedDate, RangeStart, RangeEnd,
-- Check for a conflict with another row with:
-- the same A_Id value and an earlier CreatedDate that falls inside the +/-30 minute range.
case when exists ( select 42 from #Samples where A_Id = R.A_Id and CreatedDate < R.CreatedDate and R.RangeStart < CreatedDate and CreatedDate < R.RangeEnd ) then 1
else 0 end as Conflict
from Ranges as R;
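To actually drop the flagged rows rather than just list them, the same query can be wrapped and filtered on Conflict = 0 (a sketch; note it keeps the semantics above, i.e. a row conflicts with any earlier row in its window, not only with rows that were themselves kept):
with Ranges as (
    select Id, A_Id, CreatedDate,
           DateAdd( minute, -30, CreatedDate ) as RangeStart,
           DateAdd( minute, 30, CreatedDate ) as RangeEnd
    from #Samples ),
Flagged as (
    select Id, A_Id, CreatedDate,
           case when exists ( select 42 from #Samples
                              where A_Id = R.A_Id and CreatedDate < R.CreatedDate
                                and R.RangeStart < CreatedDate and CreatedDate < R.RangeEnd ) then 1
                else 0 end as Conflict
    from Ranges as R )
select Id, A_Id, CreatedDate
from Flagged
where Conflict = 0;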

Drop rows identified within moving time window

I have a dataset of hospitalisations ('spells') - 1 row per spell. I want to drop any spells recorded within a week after another (there could be multiple) - the rationale being that they're likely symptomatic of the same underlying cause. Here is some play data:
create table hif_user.rzb_recurse_src (
patid integer not null,
eventdate integer not null,
type smallint not null
);
insert into hif_user.rzb_recurse_src values (1,1,1);
insert into hif_user.rzb_recurse_src values (1,3,2);
insert into hif_user.rzb_recurse_src values (1,5,2);
insert into hif_user.rzb_recurse_src values (1,9,2);
insert into hif_user.rzb_recurse_src values (1,14,2);
insert into hif_user.rzb_recurse_src values (2,1,1);
insert into hif_user.rzb_recurse_src values (2,5,1);
insert into hif_user.rzb_recurse_src values (2,19,2);
Only spells of type 2 - within a week after any other - are to be dropped. Type 1 spells are to remain.
For patient 1, dates 1 & 9 should be kept. For patient 2, all rows should remain.
The issue is with patient 1. Spell date 9 is identified for dropping as it is close to spell date 5; however, as spell date 5 is close to spell date 1, it should itself be dropped, therefore allowing spell date 9 to live...
So, it seems a recursive problem. However, I've not used recursive programming in SQL before and I'm struggling to really picture how to do it. Can anyone help? I should add that I'm using Teradata which has more restrictions than most with recursive SQL (only UNION ALL sets allowed I believe).
This is cursor-like logic, checking one row after the other to see if it fits your rules, so recursion is the easiest (maybe the only) way to solve your problem.
To get decent performance you need a Volatile Table to facilitate this row-by-row processing:
CREATE VOLATILE TABLE vt (patid, eventdate, exac_type, rn) AS
(
SELECT r.*
,ROW_NUMBER() -- needed to facilitate the join
OVER (PARTITION BY patid ORDER BY eventdate) AS rn
FROM hif_user.rzb_recurse_src AS r
) WITH DATA ON COMMIT PRESERVE ROWS;
WITH RECURSIVE cte (patid, eventdate, exac_type, rn, startdate) AS
(
SELECT vt.*
,eventdate AS startdate
FROM vt
WHERE rn = 1 -- start with the first row
UNION ALL
SELECT vt.*
-- check if type = 1 or more than 7 days from the last eventdate
,CASE WHEN vt.eventdate > cte.startdate + 7
OR vt.exac_type = 1
THEN vt.eventdate -- new start date
ELSE cte.startdate -- keep old date
END
FROM vt JOIN cte
ON vt.patid = cte.patid
AND vt.rn = cte.rn + 1 -- proceed to next row
)
SELECT *
FROM cte
WHERE eventdate - startdate = 0 -- only new start days
order by patid, eventdate
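For reference, tracing the recursion by hand against the sample data, the final SELECT should return:
patid  eventdate  exac_type  rn  startdate
    1          1          1   1          1
    1          9          2   4          9
    2          1          1   1          1
    2          5          1   2          5
    2         19          2   3         19
which matches the expected result in the question (patient 1 keeps event dates 1 and 9; patient 2 keeps all its rows).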
I think the key to solving this is getting the first date more than 7 days from the current date and then doing a recursive subquery:
with recursive rrs as (
select rrs.*,
(select min(rrs2.eventdate)
from hif_user.rzb_recurse_src rrs2
where rrs2.patid = rrs.patid and
rrs2.eventdate > rrs.eventdate + 7
) as eventdate7
from hif_user.rzb_recurse_src rrs
),
cte as (
select patid, min(eventdate) as eventdate, min(eventdate7) as eventdate7
from rrs
group by patid
union all
select cte.patid, cte.eventdate7, rrs.eventdate7
from cte join
rrs
on rrs.patid = cte.patid and
rrs.eventdate = cte.eventdate7
)
select cte.patid, cte.eventdate
from cte;
If you want additional columns, then join in the original table at the last step.
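A sketch of that last step, replacing the final SELECT above (it assumes eventdate is unique per patid, as in the sample data, and that the type column is what you want back):
select cte.patid, cte.eventdate, src.type
from cte join
     hif_user.rzb_recurse_src src
     on src.patid = cte.patid and
        src.eventdate = cte.eventdate;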

SQL Looping for repeated value for next column?

I am trying to loop my values so that my result looks like:
ETA ETD
01/01/2013 03/01/2013 //Adding Days according to condition, Here 1 day
03/01/2013 06/01/2013 //Add 3 days
06/01/2013 18/01/2013
18/01/2013 21/01/2013
Here I need to loop the values so that each value is repeated on the next line.
For this I have done the following:
CREATE TABLE #TEMPETAETD(ROWNUM INT,ETA DATETIME,ETD DATETIME)
CREATE TABLE #TEMPETD(ID INT IDENTITY(1,1),ETD DATETIME,ROWNUM INT)
CREATE TABLE #TEMPETA(ID INT IDENTITY(1,1),ETA DATETIME,ROWNUM INT)
;WITH cte AS(
SELECT Row_Number() OVER(ORDER BY Sequence) AS RowID, @ETA AS ETA, DATEADD(DD, vd.NumHaltDays, @ETD) as ETD FROM VoyageDetails vd WHERE ID=1 and vd.Sequence BETWEEN 0 AND 1)
INSERT INTO #TEMPETAETD select * from cte
DECLARE @C INT, @C1 INT
SET @C=1
WHILE @C < (SELECT COUNT(*) FROM #TEMPETAETD)
BEGIN
    INSERT INTO #TEMPETA SELECT * FROM #TEMPETAETD WHERE ROWNUM=@C
    SET @C=@C+1
END
SET @C1=2
WHILE @C1 <= (SELECT COUNT(*) FROM #TEMPETAETD)
BEGIN
    INSERT INTO #TEMPETD SELECT * FROM #TEMPETAETD WHERE ROWNUM=@C1
    SET @C1=@C1+1
END
This is my looping logic, but I could not get the repeated values to come out on the next row. Can anyone please help?
It looks like you want values from both the current row and the row before it. In other words, you want to be able to pair up a row with the preceding row, and then select stuff from this pair.
I don't think you need loops for this. Looping is generally pretty slow.
The general idea is, like you did, number the rows. Then you can join the table to itself with the number. Below is an example of how you can do this pairing without using a loop. Schema:
create table T (a int);
insert into T values
(1), (7), (20), (30), (500), (800), (1300), (2112);
query:
with tNumbered as (
select row_number() over (order by a) as rowID, a
from T
)
select tLeft.a as l, tRight.a as r from tNumbered tLeft
left join tNumbered tRight on tLeft.rowID = tRight.rowID -1
Here's a fiddle showing it in action: http://sqlfiddle.com/#!3/a257a/2
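Applied to the ETA/ETD case, the same pairing idea (a sketch; it assumes #TEMPETAETD is populated with ROWNUM, ETA and ETD as in the question's code) lets each row pick up the previous row's ETD as its ETA without any WHILE loops:
;WITH numbered AS (
    SELECT ROWNUM, ETA, ETD
    FROM #TEMPETAETD
)
SELECT COALESCE(prev.ETD, cur.ETA) AS ETA, -- the first row keeps its own ETA
       cur.ETD
FROM numbered cur
LEFT JOIN numbered prev
       ON prev.ROWNUM = cur.ROWNUM - 1
ORDER BY cur.ROWNUM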