SQL get n results from each set with defined deviation

Using Oracle 11g and having a table like:
USER | TIME
----- | --------
User1 | 08:15:50
User1 | 10:42:22
User1 | 10:42:24
User1 | 10:42:35
User1 | 10:50:01
User2 | 13:23:05
User2 | 13:23:34
User2 | 13:24:01
User2 | 13:24:02
For each user I need to get (if available) exactly 3 records with a deviation between the first and last of less than a minute. If there are more than 3 such rows, they don't match the criteria. Could you give me some clue?
The result should look like:
User1 | 10:42:22
User1 | 10:42:24
User1 | 10:42:35

Here's my stab at this. I don't have a live Oracle and SQLFiddle isn't working, so please advise how it turns out:
CREATE TABLE t (
u VARCHAR(5),
t DATETIME
);
INSERT INTO t
(u, t)
VALUES
('User1', '2001-01-01 08:15:50'),
('User1', '2001-01-01 10:42:22'),
('User1', '2001-01-01 10:42:24'),
('User1', '2001-01-01 10:42:35'),
('User1', '2001-01-01 10:50:01'),
('User2', '2001-01-01 13:23:05'),
('User2', '2001-01-01 13:23:34'),
('User2', '2001-01-01 13:24:01'),
('User2', '2001-01-01 13:24:02');
SELECT
  z.u,
  min(z.t) evt_start,
  max(z.t) evt_end
FROM
(
  SELECT y.*,
         SUM(prev_or_2prev_not_within) OVER (PARTITION BY u ORDER BY t ROWS UNBOUNDED PRECEDING) AS ctr
  FROM
  (
    SELECT
      t.*,
      CASE WHEN
        t - LAG(t) OVER (PARTITION BY u ORDER BY t) < 1.0/1440.0 OR
        t - LAG(t, 2) OVER (PARTITION BY u ORDER BY t) < 1.0/1440.0
      THEN 0 ELSE 1
      END AS prev_or_2prev_not_within
    FROM t
  ) y
) z
GROUP BY
  z.u,
  z.ctr
HAVING COUNT(*) = 3
I believe it will establish an incrementing counter that doesn't increment when the previous or previous-previous row is within a minute of the current row. It does this by classing rows as 0 or 1; when a 0 occurs, the sum-all-preceding-rows operation generates a counter that doesn't change. It then groups on this counter, having exactly 3 occurrences. The partition makes the counter work per user.
You can see it in action here: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=018125210ecd071f3d11e3d4b3d3e670
It's SQL Server (as noted, I don't have an Oracle), but the terms used for SQL Server and the logic should be broadly similar for Oracle: Oracle supports LAG, unbounded sums, HAVING etc., and it does date math in terms of dateA - dateB -> a floating-point number representing whole or fractional days (at 1440 minutes per day, 1/1440 represents a float of one minute). The data types SQL Server uses might differ slightly from Oracle's, and this query does depend on the TIME column (I called it t; I dislike column names that are reserved words/keywords) being a datetime, not a string that looks like a time. If your data is a string, sort it out so it isn't (use an inner subquery to generate a datetime, or change your data storage so it's stored as a datetime type).
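For illustration, a quick sanity check of that Oracle date arithmetic (a made-up example, not from the original answer):
SELECT (TO_DATE('2001-01-01 10:43:22', 'YYYY-MM-DD HH24:MI:SS')
      - TO_DATE('2001-01-01 10:42:22', 'YYYY-MM-DD HH24:MI:SS')) * 1440 AS minutes_apart
FROM dual;
-- returns 1; the raw difference is 1/1440, i.e. one minute as a fraction of a day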
You said you wanted a result that tells the user and the event time; the simplest way to do that was to use MIN and MAX to give you the date range. If you're desperate to have all 3 rows on show, you can join the output of this query back to the table with the date between evt_start and evt_end, or you can use some sort of string-aggregate function to give you a list of times straight out of the outermost group operation.
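A sketch of that join-back (untested; same SQL Server syntax and caveats as above):
WITH evt AS (
  SELECT z.u, MIN(z.t) AS evt_start, MAX(z.t) AS evt_end
  FROM (
    SELECT y.*,
           SUM(prev_or_2prev_not_within) OVER (PARTITION BY u ORDER BY t ROWS UNBOUNDED PRECEDING) AS ctr
    FROM (
      SELECT t.*,
             CASE WHEN t - LAG(t)    OVER (PARTITION BY u ORDER BY t) < 1.0/1440.0
                    OR t - LAG(t, 2) OVER (PARTITION BY u ORDER BY t) < 1.0/1440.0
                  THEN 0 ELSE 1 END AS prev_or_2prev_not_within
      FROM t
    ) y
  ) z
  GROUP BY z.u, z.ctr
  HAVING COUNT(*) = 3
)
-- pull the original rows that fall inside each qualifying window
SELECT t.u, t.t
FROM t
JOIN evt e
  ON t.u = e.u
 AND t.t BETWEEN e.evt_start AND e.evt_end;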

I would use analytical count() with range clause:
select user_, to_char(time_, 'hh24:mi:ss') time_
from (
select user_, time_,
count(1) over (partition by user_ order by time_
range between interval '1' minute preceding
and interval '1' minute following) cnt
from (select user_, to_date(time_, 'hh24:mi:ss') time_ from tbl))
where cnt = 3
Result:
USER_ TIME_
----- --------
User1 10:42:22
User1 10:42:24
User1 10:42:35
Edit:
As @CaiusJard noticed, the first answer may show incorrect values when there are intervals like 10:52:01, 10:53:00, 10:53:59. There are some ways to correct this. The first is to find the min and max time in the group and check the condition numtodsinterval(max - min, 'day') <= interval '1' minute. The second is to number all rows, then assign a flag to those rows where the prior, current, and leading counts all equal 3.
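A sketch of the first approach (my addition, untested; min and max taken over the same window, then the numtodsinterval check):
select user_, to_char(time_, 'hh24:mi:ss') time_
from (
  select user_, time_,
         count(1) over (partition by user_ order by time_
                        range between interval '1' minute preceding
                            and interval '1' minute following) cnt,
         min(time_) over (partition by user_ order by time_
                          range between interval '1' minute preceding
                              and interval '1' minute following) mn,
         max(time_) over (partition by user_ order by time_
                          range between interval '1' minute preceding
                              and interval '1' minute following) mx
  from (select user_, to_date(time_, 'hh24:mi:ss') time_ from tbl))
where cnt = 3
  and numtodsinterval(mx - mn, 'day') <= interval '1' minute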
For the second approach, finally show the flagged rows joined with the original table:
with t as (
select row_number() over (order by user_, time_) rn, tbl.*,
count(1) over (partition by user_ order by time_
range between interval '1' minute preceding
and interval '1' minute following) cnt
from (select user_, to_date(time_, 'hh24:mi:ss') time_ from tbl) tbl),
r as (select rn,
case when 3 = lag(cnt) over (partition by user_ order by time_)
and 3 = cnt
and 3 = lead(cnt) over (partition by user_ order by time_)
then 1
end flag
from t )
select * from t
join (select rn-1 r1, rn r2, rn+1 r3 from r where flag = 1) r
on rn in (r1, r2, r3)

Related

How can I reference column values from previous rows in BigQuery SQL, in order to perform operations or calculations?

I have sorted my data by start time, and I want to create a new field that rolls up rows whose start times overlap the previous row's start and end time.
More specifically, I want to write logic that, for a given record X, if the start time is somewhere between the start and end time of the previous row, I want to give record X the same value for the new field as that previous row. If the start time happens after the end time of the previous row, it would get a new value for the new field.
Is something like this possible in BigQuery SQL? Was thinking maybe lag or window function, but not quite sure. Below are examples of what the base table looks like and what I want for the final table.
Any insight appreciated!
Below is for BigQuery Standard SQL
#standardSQL
SELECT recordID, startTime, endTime,
COUNTIF(newRange) OVER(ORDER BY startTime) AS newRecordID
FROM (
SELECT *,
startTime >= MAX(endTime) OVER(ORDER BY startTime ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS newRange
FROM `project.dataset.table`
)
You can test and play with the above using sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 recordID, TIME '12:35:00' startTime, TIME '12:50:00' endTime UNION ALL
SELECT 2, '12:46:00', '12:59:00' UNION ALL
SELECT 3, '14:27:00', '16:05:00' UNION ALL
SELECT 4, '15:48:00', '16:35:00' UNION ALL
SELECT 5, '16:18:00', '17:04:00'
)
SELECT recordID, startTime, endTime,
COUNTIF(newRange) OVER(ORDER BY startTime) AS newRecordID
FROM (
SELECT *,
startTime >= MAX(endTime) OVER(ORDER BY startTime ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS newRange
FROM `project.dataset.table`
)
-- ORDER BY startTime
with this result:
Row recordID startTime endTime newRecordID
1 1 12:35:00 12:50:00 0
2 2 12:46:00 12:59:00 0
3 3 14:27:00 16:05:00 1
4 4 15:48:00 16:35:00 1
5 5 16:18:00 17:04:00 1
This is a gaps-and-islands problem. What you want to do is assign a group id to non-intersecting groups. You can calculate the non-intersections using window functions.
A record starts a new group if the cumulative maximum of the end time, ordered by start time and ending at the previous record, is less than the current start time. The rest is just a cumulative sum to assign a group id.
For your data:
select t.*,
       sum(case when prev_endtime >= starttime then 0 else 1 end) over (order by starttime) as group_id
from (select t.*,
             max(endtime) over (order by starttime rows between unbounded preceding and 1 preceding) as prev_endtime
      from t
     ) t;
The only potential issue is if two records start at exactly the same time. If this can happen, the logic might need to be slightly more complex.
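For instance, one way to make the ordering deterministic (my addition; assumes recordID is unique, as in the sample data) is to add it as a tie-breaker in both window orderings:
select t.*,
       sum(case when prev_endtime >= starttime then 0 else 1 end)
           over (order by starttime, recordID) as group_id
from (select t.*,
             -- the tie-breaker makes "preceding rows" well-defined for equal start times
             max(endtime) over (order by starttime, recordID
                                rows between unbounded preceding and 1 preceding) as prev_endtime
      from t
     ) t;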

Exclude overlapping periods in time aggregate function

I have a table in which each row contains a start and an end date:
DROP TABLE temp_period;
CREATE TABLE public.temp_period
(
id integer NOT NULL,
"startDate" date,
"endDate" date
);
INSERT INTO temp_period(id,"startDate","endDate") VALUES(1,'2010-01-01','2010-03-31');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(2,'2013-05-17','2013-07-18');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(3,'2010-02-15','2010-05-31');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(7,'2014-01-01','2014-12-31');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(56,'2014-03-31','2014-06-30');
Now I want to know the total duration of all periods stored there. I need just the time as an interval. That's pretty easy:
SELECT sum(age("endDate","startDate")) FROM temp_period;
However, the problem is: those periods overlap. And I want to eliminate all overlapping periods, so that I get the total amount of time which is covered by at least one record in the table.
You see, there are quite some gaps in between the times, so passing the smallest start date and the most recent end date to the age function won't do the trick. I thought about doing that and subtracting the total amount of gaps, but no elegant way to do that came to mind.
I use PostgreSQL 9.6.
What about this:
WITH
/* get all time points where something changes */
points AS (
SELECT "startDate" AS p
FROM temp_period
UNION SELECT "endDate"
FROM temp_period
),
/*
* Get all date ranges between these time points.
* The first time range will start with NULL,
* but that will be excluded in the next CTE anyway.
*/
inter AS (
SELECT daterange(
lag(p) OVER (ORDER BY p),
p
) i
FROM points
),
/*
* Get all date ranges that are contained
* in at least one of the intervals.
*/
overlap AS (
SELECT DISTINCT i
FROM inter
CROSS JOIN temp_period
WHERE i <@ daterange("startDate", "endDate")
)
/* sum the lengths of the date ranges */
SELECT sum(age(upper(i), lower(i)))
FROM overlap;
For your data it will return:
┌──────────┐
│ interval │
├──────────┤
│ 576 days │
└──────────┘
(1 row)
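As a quick illustration of the <@ containment operator used above (my addition, not part of the original answer):
SELECT daterange('2010-01-01', '2010-02-15') <@ daterange('2010-01-01', '2010-03-31');
-- returns true: the first range lies entirely inside the second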
You could try a recursive CTE to calculate the period. For each record, we check whether it overlaps previous records; if it does, we only count the part that is not overlapping.
WITH RECURSIVE days_count AS
(
SELECT startDate,
endDate,
AGE(endDate, startDate) AS total_days,
rowSeq
FROM ordered_data
WHERE rowSeq = 1
UNION ALL
SELECT GREATEST(curr.startDate, prev.endDate) AS startDate,
GREATEST(curr.endDate, prev.endDate) AS endDate,
AGE(GREATEST(curr.endDate, prev.endDate), GREATEST(curr.startDate, prev.endDate)) AS total_days,
curr.rowSeq
FROM ordered_data curr
INNER JOIN days_count prev
ON curr.rowSeq > 1
AND curr.rowSeq = prev.rowSeq + 1),
ordered_data AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY startDate) AS rowSeq
FROM temp_period)
SELECT SUM(total_days) AS total_days
FROM days_count;
Actually there is a case that is not covered by the previous examples.
What if we have a period like this?
INSERT INTO temp_period(id,"startDate","endDate") VALUES(100,'2010-01-03','2010-02-10');
We have the following intervals:
Interval No. | start_date | end_date
-------------+------------+------------
           1 | 2010-01-01 | 2010-03-31
           2 | 2010-01-03 | 2010-02-10
           3 | 2010-02-15 | 2010-05-31
           4 | 2013-05-17 | 2013-07-18
           5 | 2014-01-01 | 2014-12-31
           6 | 2014-03-31 | 2014-06-30
Even though segment 3 overlaps segment 1, it's seen as a new segment, hence the (wrong) result:
sum
-----
620
(1 row)
The solution is to tweak the core of the query
CASE WHEN start_date < lag(end_date) OVER (ORDER BY start_date, end_date) then NULL ELSE start_date END
needs to be replaced by
CASE WHEN start_date < max(end_date) OVER (ORDER BY start_date, end_date rows between unbounded preceding and 1 preceding) then NULL ELSE start_date END
then it works as expected
sum
-----
576
(1 row)
Summary:
SELECT sum(e - s)
FROM (
SELECT left_edge as s, max(end_date) as e
FROM (
SELECT start_date, end_date, max(new_start) over (ORDER BY start_date, end_date) as left_edge
FROM (
SELECT start_date, end_date, CASE WHEN start_date < max(end_date) OVER (ORDER BY start_date, end_date rows between unbounded preceding and 1 preceding) then NULL ELSE start_date END AS new_start
FROM temp_period
) s1
) s2
GROUP BY left_edge
) s3;
This one requires two outer joins on a complex query: one join to identify all overlaps with a start date later than the current row's and to expand the timespan to the larger of the two, and a second join to match records with no overlaps. Take the min of the mins and the max of the maxes, including unmatched rows. I was using MSSQL, so the syntax may be a bit different.
DECLARE @temp_period TABLE
(
id int NOT NULL,
startDate datetime,
endDate datetime
)
INSERT INTO @temp_period(id,startDate,endDate) VALUES(1,'2010-01-01','2010-03-31')
INSERT INTO @temp_period(id,startDate,endDate) VALUES(2,'2013-05-17','2013-07-18')
INSERT INTO @temp_period(id,startDate,endDate) VALUES(3,'2010-02-15','2010-05-31')
INSERT INTO @temp_period(id,startDate,endDate) VALUES(3,'2010-02-15','2010-07-31')
INSERT INTO @temp_period(id,startDate,endDate) VALUES(7,'2014-01-01','2014-12-31')
INSERT INTO @temp_period(id,startDate,endDate) VALUES(56,'2014-03-31','2014-06-30')
;WITH OverLaps AS
(
SELECT
Main.id,
OverlappedID=Overlaps.id,
OverlapMinDate,
OverlapMaxDate
FROM
@temp_period Main
LEFT OUTER JOIN
(
SELECT
This.id,
OverlapMinDate=CASE WHEN This.StartDate<Prior.StartDate THEN This.StartDate ELSE Prior.StartDate END,
OverlapMaxDate=CASE WHEN This.EndDate>Prior.EndDate THEN This.EndDate ELSE Prior.EndDate END,
PriorID=Prior.id
FROM
@temp_period This
LEFT OUTER JOIN @temp_period Prior ON Prior.endDate > This.startDate AND Prior.startDate < This.endDate AND This.Id<>Prior.ID
) Overlaps ON Main.Id=Overlaps.PriorId
)
SELECT
T.Id,
--If this row has overlaps, take the expanded min/max from the overlap records; otherwise use the row's own start and end
MinDate=MIN(COALESCE(HasOverlapped.OverlapMinDate,startDate)),
MaxDate=MAX(COALESCE(HasOverlapped.OverlapMaxDate,endDate))
FROM
@temp_period T
LEFT OUTER JOIN OverLaps IsAOverlap ON IsAOverlap.OverlappedID=T.id
LEFT OUTER JOIN OverLaps HasOverlapped ON HasOverlapped.Id=T.id
WHERE
IsAOverlap.OverlappedID IS NULL -- Exclude older records that have overlaps
GROUP BY
T.Id
Beware: the answer by Laurenz Albe has a huge scalability issue.
I was more than happy when I found it. I customized it for our needs. We deployed to staging and very soon, the server took several minutes to return the results.
Then I found this answer on postgresql.org. Much more efficient.
https://wiki.postgresql.org/wiki/Range_aggregation
SELECT sum(e - s)
FROM (
SELECT left_edge as s, max(end_date) as e
FROM (
SELECT start_date, end_date, max(new_start) over (ORDER BY start_date, end_date) as left_edge
FROM (
SELECT start_date, end_date, CASE WHEN start_date < lag(end_date) OVER (ORDER BY start_date, end_date) then NULL ELSE start_date END AS new_start
FROM temp_period
) s1
) s2
GROUP BY left_edge
) s3;
Result:
sum
-----
576
(1 row)

How to select most dense 1 min in Oracle

I have a table with a timestamp column tmstmp; this table contains a log of certain events. I need to find the max number of events which occurred within any 1-minute interval.
Please read carefully! I do NOT want to extract the timestamp's minute fraction and count like this:
select count(*), TO_CHAR(tmstmp,'MI')
from log_table
group by TO_CHAR(tmstmp,'MI')
order by TO_CHAR(tmstmp,'MI');
It needs to take the 1st record, look ahead until it has selected all records within 1 minute of the 1st, and count them; then take the 2nd and do the same, etc.
And as the result there must be a recordset of (count, starting timestamp).
Does anyone have a snippet of code somewhere and care to share, please?
An analytic function with a logical window can provide this information directly:
select l.tmstmp,
count(*) over (order by tmstmp range between current row and interval '59.999999' second following) cnt
from log_table l
order by 1
;
TMSTMP CNT
--------------------------- ----------
01.01.16 00:00:00,000000000 4
01.01.16 00:00:10,000000000 4
01.01.16 00:00:15,000000000 3
01.01.16 00:00:20,000000000 2
01.01.16 00:01:00,000000000 3
01.01.16 00:01:40,000000000 2
01.01.16 00:01:50,000000000 1
Please adjust the interval length for your precision. It must be the highest possible value below 1 minute.
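For example, if tmstmp were a plain DATE (whole-second precision), the window would presumably shrink to (my adaptation of the query above):
select l.tmstmp,
       -- 59 seconds: the largest whole-second span still below one minute
       count(*) over (order by tmstmp
                      range between current row and interval '59' second following) cnt
from log_table l
order by 1;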
To get the maximal minute use the subquery (and don't forget you may receive more than one record - with the MAX count):
with tst as (
select l.tmstmp,
count(*) over (order by tmstmp range between current row and interval '59.999999' second following) cnt
from log_table l)
select * from tst where cnt = (select max(cnt) from tst);
TMSTMP CNT
--------------------------- ----------
01.01.16 00:00:00,000000000 4
01.01.16 00:00:10,000000000 4
I think you can achieve your goal using a subquery in the SELECT statement, as follows:
SELECT tmstmp, (
SELECT COUNT(*)
FROM log_table t2
WHERE t2.tmstmp >= t.tmstmp AND t2.tmstmp < t.tmstmp + 1 / (24*60)
) AS events
FROM log_table t;
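If only the densest window is needed, this could presumably be wrapped the same way as the join answer below (my addition, untested):
select *
from (
  select t.tmstmp, (
         select count(*)
         from log_table t2
         where t2.tmstmp >= t.tmstmp and t2.tmstmp < t.tmstmp + 1 / (24*60)
         ) as events
  from log_table t
  order by events desc -- densest window first
)
where rownum = 1;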
One method uses a join and aggregation:
select t.*
from (select l.tmstmp, count(*)
from log_table l join
log_table l2
on l2.tmstmp >= l.tmstmp and
l2.tmstmp < l.tmstmp + interval '1' minute
group by l.tmstmp
order by count(*) desc
) t
where rownum = 1;
Note: This assumes that tmstmp is unique on each row. If this is not true, then the subquery should be aggregating by some column that is unique.
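For instance, assuming a unique id column (a hypothetical name), the aggregation might become:
select t.*
from (select l.id, l.tmstmp, count(*) as cnt
      from log_table l join
           log_table l2
           on l2.tmstmp >= l.tmstmp and
              l2.tmstmp < l.tmstmp + interval '1' minute
      group by l.id, l.tmstmp -- id keeps duplicate timestamps in separate groups
      order by cnt desc
     ) t
where rownum = 1;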
EDIT:
For large data, there is a more efficient way that makes use of cumulative sums:
select tmstmp - interval '1' minute as starttm, tmstmp as endtm, cumulative
from (select tmstmp, sum(inc) over (order by tmstmp) as cumulative
      from (select tmstmp, 1 as inc from log_table union all
            select tmstmp + interval '1' minute, -1 as inc from log_table
           ) t
      order by sum(inc) over (order by tmstmp) desc
     ) t
where rownum = 1;

Counting an already counted column in SQL (db2)

I'm pretty new to SQL and have this problem:
I have a populated table with a date column and other, uninteresting columns.
date | name | name2
2015-03-20 | peter | pan
2015-03-20 | john | wick
2015-03-18 | harry | potter
What I'm doing right now is counting everything for a date:
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
What I want to do now is count the resulting lines and only return them if there are fewer than 10 resulting lines.
What I tried so far is surrounding the whole query with a temp table and then counting everything, which gives me the number of resulting lines (yeah):
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select count(*)
from temp_count
What is still missing is the check whether the number is smaller than 10.
I was searching in this forum and came across some HAVING constructs to use, but those forced me to use a GROUP BY, which I can't.
I was thinking about something like this:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select *
from temp_count
having count(*) < 10
Maybe I'm too tired to think of an easy solution, but I can't solve this so far.
Edit: a picture for clarification, since my English is horrible:
http://imgur.com/1O6zwoh
I want to see the two-column results ONLY IF there are fewer than 10 rows overall.
I think you just need to move your having clause to the inner query so that it is paired with the GROUP BY:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
having count(*) < 10
)
select *
from temp_count
If what you want is to check how many total records (after grouping) would be returned, then you could do this:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select date, counter
from (
select date, counter, row_number() over (order by date) as rseq
from temp_count
) x
group by date, counter
having max(rseq) >= 10
This will return 0 rows if there are fewer than 10 total, and will deliver ALL the results if there are 10 or more (you can also just take the first 10 rows if needed with this).
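For the opposite requirement in the question ("only if there are fewer than 10 rows overall"), a sketch using COUNT(*) OVER () might look like this (my suggestion; assumes a DB2 version with OLAP functions):
with temp_count (date, counter) as
(
  select date, count(*)
  from testtable
  where date >= current date - 10 days
  group by date
)
select date, counter
from (
  -- total_rows is the same overall row count attached to every row
  select t.*, count(*) over () as total_rows
  from temp_count t
) x
where total_rows < 10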
In your temp_count table, you can filter results with the WHERE clause:
with temp_count (date, counter) as
(
select date, count(distinct date)
from testtable
where date >= current date - 10 days
group by date
)
select *
from temp_count
where counter < 10
Something like:
with t(dt, rn, cnt) as (
select dt, row_number() over (order by dt) as rn
, count(1) as cnt
from testtable
where dt >= current date - 10 days
group by dt
)
select dt, cnt
from t where 10 >= (select max(rn) from t);
will do what you want (I think)

PostgreSQL: How to return rows with respect to a found row (relative results)?

Forgive my example if it does not make sense. I'm going to try with a simplified one to encourage more participation.
Consider a table like the following:
dt | mnth | foo
--------------+------------+--------
2012-12-01 | December |
...
2012-08-01 | August |
2012-07-01 | July |
2012-06-01 | June |
2012-05-01 | May |
2012-04-01 | April |
2012-03-01 | March |
...
1997-01-01 | January |
If you look for the record with dt closest to today w/o going over, what would be the best way to also return the 3 records beforehand and 7 records after?
I decided to try windowing functions:
WITH dates AS (
select row_number() over (order by dt desc)
, dt
, dt - now()::date as dt_diff
from foo
)
, closest_date AS (
select * from dates
where dt_diff = ( select max(dt_diff) from dates where dt_diff <= 0 )
)
SELECT *
FROM dates
WHERE row_number - (select row_number from closest_date) >= -3
AND row_number - (select row_number from closest_date) <= 7 ;
I feel like there must be a better way to return relative records with a window function, but it's been some time since I've looked at them.
create table foo (dt date);
insert into foo values
('2012-12-01'),
('2012-08-01'),
('2012-07-01'),
('2012-06-01'),
('2012-05-01'),
('2012-04-01'),
('2012-03-01'),
('2012-02-01'),
('2012-01-01'),
('1997-01-01'),
('2012-09-01'),
('2012-10-01'),
('2012-11-01'),
('2013-01-01')
;
select dt
from (
(
select dt
from foo
where dt <= current_date
order by dt desc
limit 4
)
union all
(
select dt
from foo
where dt > current_date
order by dt
limit 7
)) s
order by dt
;
dt
------------
2012-03-01
2012-04-01
2012-05-01
2012-06-01
2012-07-01
2012-08-01
2012-09-01
2012-10-01
2012-11-01
2012-12-01
2013-01-01
(11 rows)
You could use the window function lead():
SELECT dt_lead7 AS dt
FROM (
SELECT *, lead(dt, 7) OVER (ORDER BY dt) AS dt_lead7
FROM foo
) d
WHERE dt <= now()::date
ORDER BY dt DESC
LIMIT 11;
Somewhat shorter, but the UNION ALL version will be faster with a suitable index.
That leaves a corner case where "date closest to today" is within the first 7 rows. You can pad the original data with 7 rows of -infinity to take care of this:
SELECT d.dt_lead7 AS dt
FROM (
SELECT *, lead(dt, 7) OVER (ORDER BY dt) AS dt_lead7
FROM (
SELECT '-infinity'::date AS dt FROM generate_series(1,7)
UNION ALL
SELECT dt FROM foo
) x
) d
WHERE d.dt <= now()::date -- same as: WHERE dt <= now()::date ¹
ORDER BY d.dt_lead7 DESC -- same as: ORDER BY dt DESC ¹
LIMIT 11;
I table-qualified the columns in the second query to clarify what happens. See below.
The result will include NULL values if the "date closest to today" is within the last 7 rows of the base table. You can filter those with an additional sub-select if you need to.
¹ To address your doubts about output names versus column names in the comments, consider the following quotes from the manual.
Where to use an output column's name:
An output column's name can be used to refer to the column's value in
ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses;
there you must write out the expression instead.
Bold emphasis mine. WHERE dt <= now()::date references the column d.dt, not the output column of the same name - thereby working as intended.
Resolving conflicts:
If an ORDER BY expression is a simple name that matches both an output
column name and an input column name, ORDER BY will interpret it as
the output column name. This is the opposite of the choice that GROUP BY
will make in the same situation. This inconsistency is made to be
compatible with the SQL standard.
Bold emphasis mine again. ORDER BY dt DESC in the example references the output column's name - as intended. Anyway, either column would sort the same. The only difference could be with the NULL values of the corner case. But that falls flat, too, because:
the default behavior is NULLS LAST when ASC is specified or implied,
and NULLS FIRST when DESC is specified
As the NULL values come after the biggest values, the order is identical either way.
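A tiny demonstration of that resolution rule (my own example, not from the answer):
SELECT dt AS d, now()::date AS dt
FROM foo
WHERE dt <= now()::date -- input column foo.dt: output names are not allowed here
ORDER BY dt; -- output column (now()::date) wins over the input column foo.dt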
Or, without LIMIT (as per request in comment):
WITH x AS (
SELECT *
, row_number() OVER (ORDER BY dt) AS rn
, first_value(dt) OVER (ORDER BY (dt > '2011-11-02')
, dt DESC) AS dt_nearest
FROM foo
)
, y AS (
SELECT rn AS rn_nearest
FROM x
WHERE dt = dt_nearest
)
SELECT dt
FROM x, y
WHERE rn BETWEEN rn_nearest - 3 AND rn_nearest + 7
ORDER BY dt;
If performance is important, I would still go with @Clodoaldo's UNION ALL variant. It will be fastest. Database-agnostic SQL will only get you so far. Other RDBMSs do not have window functions at all yet (MySQL), or use different function names (like first_val instead of first_value). You might just as well replace LIMIT with TOP n (MS SQL) or whatever the local dialect offers.
You could use something like this:
select * from foo
where dt between now() - interval '7 months' and now() + interval '3 months'