Removing clusters of duplicates in a query resultset - sql

I have the following query returning the following results:
db=# SELECT t1.id as id1, t2.id as id2
db-# FROM table_1 As t1, table_2 As t2
db-# WHERE ST_DWithin(t2.lonlat, t1.lonlat, t2.range)
db-# ORDER BY t1.id, t2.id, ST_Distance(t2.lonlat, t1.lonlat);
id1 | id2
-------+------
4499 | 1118
4500 | 1118
4501 | 1119
4502 | 1119
4503 | 1118
4504 | 1118
4505 | 1119
4506 | 1119
4507 | 1118
4508 | 1118
4510 | 1118
4511 | 1118
4514 | 1117
4515 | 1117
4518 | 1117
4519 | 1117
4522 | 1117
4523 | 1117
4603 | 1116
4604 | 1116
4607 | 1116
And I want the resultset to look like this:
id1 | id2
-------+------
4499 | 1118
4501 | 1119
4503 | 1118
4505 | 1119
4507 | 1118
4514 | 1117
4603 | 1116
Essentially, the query is returning duplicates of id2. It's fine for id2 to occur many times in the results, but not in consecutive clusters.
The use case here is that id1 is the ID of a row in a table of GPS positions, while id2 is the ID of a waypoint, and I want a query that returns the closest passing point for each waypoint (so once waypoint #1118 is passed, it cannot be passed again until another waypoint has been passed).
Is there a way to make this happen using Postgres?

This is a gaps-and-islands problem, but rather subtle. In this case, you only want the rows where the previous row has a different id2. That suggests using LAG():
SELECT id1, id2
FROM (SELECT tt.*,
             LAG(id2) OVER (ORDER BY id1, id2, dist) AS prev_id2
      FROM (SELECT t1.id AS id1, t2.id AS id2,
                   ST_Distance(t2.lonlat, t1.lonlat) AS dist
            FROM table_1 t1 JOIN
                 table_2 t2
                 ON ST_DWithin(t2.lonlat, t1.lonlat, t2.range)
           ) tt
     ) tt
WHERE prev_id2 IS DISTINCT FROM id2
ORDER BY id1, id2, dist;
Note: I think the logic as presented could be simplified because id1 seems unique. Hence the distance calculation seems entirely superfluous. I left that logic in because it might be relevant in your actual query.
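ST_DWithin and ST_Distance require PostGIS, but the LAG() pattern itself is portable. Here is a minimal, runnable sketch of the same "keep the row only when the previous id2 differs" logic, checked against the id pairs from the question; the `hits` table with precomputed pairs is a stand-in for the spatial join (SQLite 3.25+ for window functions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hits (id1 INTEGER, id2 INTEGER);
INSERT INTO hits VALUES
  (4499,1118),(4500,1118),(4501,1119),(4502,1119),
  (4503,1118),(4504,1118),(4505,1119),(4506,1119),
  (4507,1118),(4508,1118),(4510,1118),(4511,1118),
  (4514,1117),(4515,1117),(4518,1117),(4519,1117),
  (4522,1117),(4523,1117),(4603,1116),(4604,1116),(4607,1116);
""")

# Keep a row only when the previous row (by id1 order) has a different id2.
rows = conn.execute("""
SELECT id1, id2
FROM (SELECT id1, id2,
             LAG(id2) OVER (ORDER BY id1) AS prev_id2
      FROM hits)
WHERE prev_id2 IS NULL OR prev_id2 <> id2
ORDER BY id1
""").fetchall()
print(rows)
```

The seven surviving rows match the desired result set in the question.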

Related

SQL: usage time of item between dates combining two tables

I'm trying to create a query that gives me the usage time of each car part between the dates when that part is used. For example, say part id 1 is installed on 2018-03-01, runs for 50 minutes on 2018-04-01, and then runs for 30 minutes on 2018-05-10; the total usage of this part should be 1:20:00.
These are examples of my tables.
Table1
| id | part_id | car_id | part_date |
|----|-------- |--------|------------|
| 1 | 1 | 3 | 2018-03-01 |
| 2 | 1 | 1 | 2018-03-28 |
| 3 | 1 | 3 | 2018-05-10 |
Table2
| id | car_id | run_date | puton_time | putoff_time |
|----|--------|------------|---------------------|---------------------|
| 1 | 3 | 2018-04-01 | 2018-04-01 12:00:00 | 2018-04-01 12:50:00 |
| 2 | 2 | 2018-04-10 | 2018-04-10 15:10:00 | 2018-04-10 15:20:00 |
| 3 | 3 | 2018-05-10 | 2018-05-10 10:00:00 | 2018-05-10 10:30:00 |
| 4 | 1 | 2018-05-11 | 2018-05-11 12:00:00 | 2018-04-01 12:50:00 |
Table1 contains the dates when each part is installed; table2 contains the usage time of each part; they are joined on car_id. I have tried to write a query, but it does not work well. If somebody can figure out my mistake in this query, that would be helpful.
My SQL query
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(TIMEDIFF(t1.puton_time, t1.putoff_time)))) AS total_time
FROM table2 t1
LEFT JOIN table1 t2 ON t1.car_id=t2.car_id
WHERE t2.id=1 AND t1.run_date BETWEEN t2.datum AND
(SELECT COALESCE(MIN(datum), '2100-01-01') AS NextDate FROM table1 WHERE
id=1 AND t2.part_date > part_date);
Expected result
| part_id | total_time |
|---------|------------|
| 1 | 1:20:00 |
I hope this problem makes sense; in my search I found nothing like this, so I need help.
Solution, thanks to Kota Mori
SELECT t1.id, SEC_TO_TIME(SUM(TIME_TO_SEC(TIMEDIFF(t2.putoff_time, t2.puton_time)))) AS total_time
FROM table1 t1
LEFT JOIN table2 t2 ON t1.car_id = t2.car_id
  AND t1.part_date <= t2.run_date
GROUP BY t1.id
You first need to join the two tables on car_id, with the additional condition that part_date is no greater than run_date.
Then compute the total minutes for each part_id separately.
The following is a query example for SQLite (the only SQL engine I have access to right now).
Since SQLite does not have a datetime type, I convert strings into unix timestamps with the strftime function. This part should be changed in accordance with the SQL engine you are using. Apart from that, this is fairly standard SQL and mostly valid for other SQL dialects.
SELECT t1.id,
       SUM(CAST(strftime('%s', t2.putoff_time) AS INTEGER) -
           CAST(strftime('%s', t2.puton_time)  AS INTEGER)) / 60 AS total_minutes
FROM table1 t1
LEFT JOIN table2 t2
  ON t1.car_id = t2.car_id
 AND t1.part_date <= t2.run_date
GROUP BY t1.id
The result is something like the below. Note that ID 1 gets 80 minutes (1:20) as expected.
id | total_minutes
---|--------------
1  | 80
2  | 80
3  | 30
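The SQLite query above can be verified end-to-end with Python's built-in sqlite3 module. One assumption in this sketch: row 4 of table2 in the question shows a putoff_time of 2018-04-01 12:50:00 against a run_date of 2018-05-11, which looks like a typo, so the sketch uses 2018-05-11 12:50:00 instead (with that correction, part 2 totals 50 minutes rather than the 80 shown above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, part_id INTEGER, car_id INTEGER, part_date TEXT);
CREATE TABLE table2 (id INTEGER, car_id INTEGER, run_date TEXT,
                     puton_time TEXT, putoff_time TEXT);
INSERT INTO table1 VALUES (1,1,3,'2018-03-01'),(2,1,1,'2018-03-28'),(3,1,3,'2018-05-10');
INSERT INTO table2 VALUES
 (1,3,'2018-04-01','2018-04-01 12:00:00','2018-04-01 12:50:00'),
 (2,2,'2018-04-10','2018-04-10 15:10:00','2018-04-10 15:20:00'),
 (3,3,'2018-05-10','2018-05-10 10:00:00','2018-05-10 10:30:00'),
 (4,1,'2018-05-11','2018-05-11 12:00:00','2018-05-11 12:50:00');  -- assumed typo fix
""")

# Join on car_id with part_date <= run_date, then sum run durations per part row.
rows = conn.execute("""
SELECT t1.id,
       SUM(CAST(strftime('%s', t2.putoff_time) AS INTEGER) -
           CAST(strftime('%s', t2.puton_time)  AS INTEGER)) / 60 AS total_minutes
FROM table1 t1
LEFT JOIN table2 t2
  ON t1.car_id = t2.car_id
 AND t1.part_date <= t2.run_date
GROUP BY t1.id
ORDER BY t1.id
""").fetchall()
print(rows)
```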

Get value of each latest record grouped by ID

I have a table of data records which are stored over time, looking roughly like this :
|| ID || timestamp || position || value || field1 || field2 ||
And another table representing geographic points looking roughly like this :
|| ID || position || field1 || field2 ||
Where field1 and field2 of each table are in the same category (which enables me to compare them)
I have a query that gives me the closest point (from the points table) to each record, looking like this :
SELECT B.ID, A.timestamp as date, A.value, A.field1, A.field2
FROM (SELECT DISTINCT ON (ID) * FROM records) AS A
CROSS JOIN LATERAL (SELECT *
FROM points
ORDER BY A.position <-> geom
LIMIT 1) AS B
WHERE A.field1 = B.field1
AND A.field2 = B.field2
Which allows me to know exactly from what point the value of a record comes from.
I need to get the latest value for each point, and I started like this :
SELECT B.ID, MAX(A.timestamp) as date, A.field1, A.field2
FROM (SELECT DISTINCT ON (ID) * FROM records) AS A
CROSS JOIN LATERAL (SELECT *
FROM points
ORDER BY A.position <-> geom
LIMIT 1) AS B
WHERE A.field1 = B.field1
AND A.field2 = B.field2
GROUP BY B.ID, A.field1, A.field2
But I don't know how to get the value from the data records in my result set, right now if I simply add it at the top, it asks me to add it in the GROUP BY clause.
I've read in other answers that I need to use an INNER JOIN or LATERAL JOIN, but in that case it searches for the closest point to each record a second time, which considerably slows the request. Is there any way to avoid running the query twice and then matching the results using field1 and field2?
EDIT:
Here's what the data records look like (the positions are really long and not relevant, so I decided not to show them):
ID  | timestamp           | position | value | field1 | field2
----|---------------------|----------|-------|--------|----------
001 | 2019-05-03 17:50:00 | {....}   | 5     | South  | Forward
002 | 2019-05-03 17:55:00 | {....}   | 17    | South  | Forward
003 | 2019-05-03 18:30:00 | {....}   | 0     | South  | Backward
004 | 2019-05-03 13:20:00 | {....}   | 25    | West   | Forward
005 | 2019-05-03 14:30:00 | {....}   | 36    | West   | Backward
006 | 2019-05-03 16:00:00 | {....}   | 12    | West   | Backward
After running my first query (to get the closest point), I get this :
B.ID | timestamp           | value | field1 | field2
-----|---------------------|-------|--------|----------
475  | 2019-05-03 17:50:00 | 5     | South  | Forward
263  | 2019-05-03 17:55:00 | 17    | South  | Forward
157  | 2019-05-03 18:30:00 | 0     | South  | Backward
957  | 2019-05-03 13:20:00 | 25    | West   | Forward
547  | 2019-05-03 14:30:00 | 36    | West   | Backward
547  | 2019-05-03 16:00:00 | 12    | West   | Backward
where B.ID corresponds to the closest point to the record's position.
What I get when running the query to get the latest record for each [ID / field1 / field2] combination is this :
B.ID | timestamp           | field1 | field2
-----|---------------------|--------|----------
475  | 2019-05-03 17:50:00 | South  | Forward
263  | 2019-05-03 17:55:00 | South  | Forward
157  | 2019-05-03 18:30:00 | South  | Backward
957  | 2019-05-03 13:20:00 | West   | Forward
547  | 2019-05-03 16:00:00 | West   | Backward
As you can see, only the next-to-last row disappeared, because it had the same (ID / field1 / field2) combination as the last one and was older.
And what I'd like is this :
B.ID | timestamp           | value | field1 | field2
-----|---------------------|-------|--------|----------
475  | 2019-05-03 17:50:00 | 5     | South  | Forward
263  | 2019-05-03 17:55:00 | 17    | South  | Forward
157  | 2019-05-03 18:30:00 | 0     | South  | Backward
957  | 2019-05-03 13:20:00 | 25    | West   | Forward
547  | 2019-05-03 16:00:00 | 12    | West   | Backward
Do you just want distinct on again?
SELECT DISTINCT ON (p.ID) p.ID, r.*
FROM (SELECT DISTINCT ON (r.ID) r.*
      FROM records r
     ) r CROSS JOIN LATERAL
     (SELECT p.*
      FROM points p
      ORDER BY r.position <-> p.geom
      LIMIT 1
     ) p
WHERE r.field1 = p.field1 AND r.field2 = p.field2
ORDER BY p.ID, r.timestamp DESC;
I cannot figure out what you intend by:
(SELECT DISTINCT ON (ID) *
FROM records
)
At a minimum, you should have an ORDER BY:
(SELECT DISTINCT ON (ID) *
FROM records
ORDER BY ID
)
However, your sample data and the name ID suggest that there are no duplicates, so the DISTINCT ON may not be necessary.
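DISTINCT ON is Postgres-specific; the portable idiom for "latest row per group" is ROW_NUMBER(). Below is a small sketch over the intermediate result table from the question, run in SQLite via Python; the `matched` table name and grouping by ID alone are assumptions of this sketch (in Postgres you would keep the DISTINCT ON form from the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE matched (id INTEGER, ts TEXT, value INTEGER, field1 TEXT, field2 TEXT);
INSERT INTO matched VALUES
 (475,'2019-05-03 17:50:00',5,'South','Forward'),
 (263,'2019-05-03 17:55:00',17,'South','Forward'),
 (157,'2019-05-03 18:30:00',0,'South','Backward'),
 (957,'2019-05-03 13:20:00',25,'West','Forward'),
 (547,'2019-05-03 14:30:00',36,'West','Backward'),
 (547,'2019-05-03 16:00:00',12,'West','Backward');
""")

# Number rows within each id, newest first, then keep only the first of each group.
rows = conn.execute("""
SELECT id, ts, value, field1, field2
FROM (SELECT m.*,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY ts DESC) AS rn
      FROM matched m)
WHERE rn = 1
ORDER BY id
""").fetchall()
print(rows)
```

Only the older of the two id-547 rows is dropped, matching the desired output in the question.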

Filling NULL values with preceding Non-NULL values using FIRST_VALUE

I am joining two tables.
In the first table, I have some items starting at a specific time. In the second table, I have values and timestamps for each minute in between the start and end time of each item.
First table
UniqueID Items start_time
123 one 10:00 AM
456 two 11:00 AM
789 three 11:30 AM
Second table
UniqueID Items time_hit value
123 one 10:00 AM x
123 one 10:05 AM x
123 one 10:10 AM x
123 one 10:30 AM x
456 two 11:00 AM x
456 two 11:15 AM x
789 three 11:30 AM x
So When joining the two tables I have this:
UniqueID Items start_time time_hit value
123 one 10:00 AM 10:00 AM x
123 null null 10:05 AM x
123 null null 10:10 AM x
123 null null 10:30 AM x
456 two 11:00 AM 11:00 AM x
456 null null 11:15 AM x
789 three 11:30 AM 11:30 AM x
I'd like to replace these null values with the values from the preceding non-null row...
So the expected result is
UniqueID Items start_time time_hit value
123 one 10:00 AM 10:00 AM x
123 one 10:00 AM 10:05 AM x
123 one 10:00 AM 10:10 AM x
123 one 10:00 AM 10:30 AM x
456 two 11:00 AM 11:00 AM x
456 two 11:00 AM 11:15 AM x
789 three 11:30 AM 11:30 AM x
I tried to build my join using the following function without success:
FIRST_VALUE(Items IGNORE NULLS) OVER (
PARTITION BY time_hit ORDER BY time_hit
ROWS BETWEEN CURRENT ROW AND
UNBOUNDED FOLLOWING) AS test
My question was a bit off. I found out that the UniqueIDs were inconsistent, which is why I had these null values in my output. So the validated answer is a good option for filling null values when joining two tables where one of your tables has more unique rows than the other.
You could use first_value (last_value would also work in this scenario). The important part is to specify rows between unbounded preceding and current row to set the boundaries of the window.
Answer updated to reflect updated question, and preference for first_value
select first_value(t1.UniqueId ignore nulls) over (
         partition by t2.UniqueId
         order by t2.time_hit
         rows between unbounded preceding and current row) as UniqueId,
       first_value(t1.items ignore nulls) over (
         partition by t2.UniqueId
         order by t2.time_hit
         rows between unbounded preceding and current row) as Items,
       first_value(t1.start_time ignore nulls) over (
         partition by t2.UniqueId
         order by t2.time_hit
         rows between unbounded preceding and current row) as start_time,
       t2.time_hit,
       t2.item_value
from table2 t2
left join table1 t1 on t1.start_time = t2.time_hit
order by t2.time_hit;
Result
| UNIQUEID | ITEMS | START_TIME | TIME_HIT | ITEM_VALUE |
|----------|-------|------------|----------|------------|
| 123 | one | 10:00:00 | 10:00:00 | x |
| 123 | one | 10:00:00 | 10:05:00 | x |
| 123 | one | 10:00:00 | 10:10:00 | x |
| 123 | one | 10:00:00 | 10:30:00 | x |
| 456 | two | 11:00:00 | 11:00:00 | x |
| 456 | two | 11:00:00 | 11:15:00 | x |
| 789 | three | 11:30:00 | 11:30:00 | x |
SQL Fiddle Example
Note: I had to use Oracle in SQL Fiddle (so I had to change the data types and a column name). But it should work for your database.
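Many engines reject the IGNORE NULLS clause (PostgreSQL and SQLite among them). For those, the same fill-down can be done outside SQL once the rows are sorted by partition and time. A minimal sketch of the logic, not tied to any particular driver; the function name is mine:

```python
def fill_down(values):
    """Replace each None with the most recent non-None value (forward fill).

    Assumes the sequence is already sorted the way the window would be
    (e.g. by UniqueID, then time_hit)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Mirrors the Items column of the joined result in the question.
print(fill_down(['one', None, None, None, 'two', None, 'three']))
```

Leading Nones stay None, exactly as a window frame with no preceding non-null value would behave.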
One alternative solution would be to use a NOT EXISTS clause as JOIN condition, with a correlated subquery that ensures that we are relating to the relevant record.
SELECT t1.items, t1.start_time, t2.time_hit, t2.value
FROM table1 t1
INNER JOIN table2 t2
ON t1.items = t2.items
AND t1.start_time <= t2.time_hit
AND NOT EXISTS (
SELECT 1 FROM table1 t10
WHERE
t10.items = t2.items
AND t10.start_time <= t2.time_hit
AND t10.start_time > t1.start_time
)
Demo on DB Fiddle:
| items | start_time | time_hit | value |
| ----- | ---------- | -------- | ----- |
| one | 10:00:00 | 10:00:00 | x |
| one | 10:00:00 | 10:05:00 | x |
| one | 10:00:00 | 10:10:00 | x |
| one | 10:00:00 | 10:30:00 | x |
| two | 11:00:00 | 11:00:00 | x |
| two | 11:00:00 | 11:15:00 | x |
| three | 11:30:00 | 11:30:00 | x |
Alternative solution to avoid using EXISTS on a JOIN condition (not allowed in Big Query): just move that condition to the WHERE clause.
SELECT t1.items, t1.start_time, t2.time_hit, t2.value
FROM table1 t1
INNER JOIN table2 t2
ON t1.items = t2.items
AND t1.start_time <= t2.time_hit
WHERE NOT EXISTS (
SELECT 1 FROM table1 t10
WHERE
t10.items = t2.items
AND t10.start_time <= t2.time_hit
AND t10.start_time > t1.start_time
)
DB Fiddle
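This NOT EXISTS variant runs unchanged on SQLite, so it can be checked with Python's sqlite3 module. One assumption of this sketch: times are stored as HH:MM strings, which compare correctly lexicographically:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (items TEXT, start_time TEXT);
CREATE TABLE table2 (items TEXT, time_hit TEXT, value TEXT);
INSERT INTO table1 VALUES ('one','10:00'),('two','11:00'),('three','11:30');
INSERT INTO table2 VALUES
 ('one','10:00','x'),('one','10:05','x'),('one','10:10','x'),('one','10:30','x'),
 ('two','11:00','x'),('two','11:15','x'),('three','11:30','x');
""")

# For each hit, keep the latest start_time that is not after the hit.
rows = conn.execute("""
SELECT t1.items, t1.start_time, t2.time_hit, t2.value
FROM table1 t1
INNER JOIN table2 t2
  ON t1.items = t2.items AND t1.start_time <= t2.time_hit
WHERE NOT EXISTS (
  SELECT 1 FROM table1 t10
  WHERE t10.items = t2.items
    AND t10.start_time <= t2.time_hit
    AND t10.start_time > t1.start_time
)
ORDER BY t2.time_hit
""").fetchall()
print(rows)
```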
I guess you are expecting this output by using an INNER JOIN. I'm not sure why you used FIRST_VALUE.
SELECT I.Item, I.Start_Time, ID.Time_hit, ID.Value
FROM Items I
INNER JOIN ItemDetails ID
ON I.Items = ID.Items
Please explain if there is a specific reason this approach would not work for you.

Getting first and last values from contiguous ranges

I want to get the first enter_date and the last leave_date for contiguous enter_day and leave_day values for each id. Given this example data:
+-----+------------+------------+-----------+-----------+
| id | enter_date | leave_date | enter_day | leave_day |
+-----+------------+------------+-----------+-----------+
| 111 | 2016-07-29 | 2016-12-01 | 1 | 75 |
| 111 | 2016-12-02 | 2017-01-13 | 76 | 95 |
| 111 | 2017-01-17 | 2017-06-02 | 96 | 181 |
| 222 | 2016-07-29 | 2016-12-02 | 1 | 76 |
| 222 | 2017-01-30 | 2017-06-02 | 105 | 181 |
| 333 | 2016-08-01 | 2017-06-02 | 1 | 180 |
+-----+------------+------------+-----------+-----------+
I want the following result:
+-----+------------+------------+
| id | enter_date | leave_date |
+-----+------------+------------+
| 111 | 2016-07-29 | 2017-06-02 |
| 222 | 2016-07-29 | 2016-12-02 |
| 222 | 2017-01-30 | 2017-06-02 |
| 333 | 2016-08-01 | 2017-06-02 |
+-----+------------+------------+
I want one record for ID 111 because there are no gaps between any enter_day and the previous leave_day.
I want both records for ID 222 because there is a gap (days 75 through 104).
EDIT: What I have so far, which isn't giving me the correct leave_date for ID 111:
with cte as (
select a.id, a.enter_date, a.leave_date, b.enter_date next_ed, b.leave_date next_ld
from #tbl a
join #tbl b on b.id = a.id and b.enter_day = a.leave_day + 1
)
select id, min(enter_date) enter_date, max(leave_date) leave_date
from cte
group by id
union
select a.id, a.enter_date, a.leave_date
from #tbl a
left join #tbl b on b.id = a.id and b.enter_day = a.leave_day + 1
left join cte c on c.id = a.id and c.next_ed = a.enter_date and c.next_ld = a.leave_date
where b.id is null and c.id is null
order by 1,3
Below is an example of Gaps-and-Islands for ranges.
I used an ad-hoc tally table, but an actual numbers/tally table would do the trick as well.
If you just run the inner query, you will see very quickly how your sample data of 6 rows explodes into 514 rows. Then it is a small matter of applying the grouped aggregation to get the final results.
Example
Declare #YourTable Table ([id] int,[enter_date] date,[leave_date] date,[enter_day] int,[leave_day] int)
Insert Into #YourTable Values
(111,'2016-07-29','2016-12-01',1,75)
,(111,'2016-12-02','2017-01-13',76,95)
,(111,'2017-01-17','2017-06-02',96,181)
,(222,'2016-07-29','2016-12-02',1,76)
,(222,'2017-01-30','2017-06-02',105,181)
,(333,'2016-08-01','2017-06-02',1,180)
Select ID
,[enter_date] = min([enter_date])
,[leave_date] = max([leave_date])
From (
Select *
,Grp = N - Row_Number() over (Partition By ID Order by N)
From #YourTable A
Join (
Select Top (Select max([leave_day]-[enter_day])+1 From #YourTable)
N=-1+Row_Number() Over (Order By (Select Null))
From master..spt_values n1,master..spt_values n2
) B on B.N between [enter_day] and [leave_day]
) A
Group By [ID],Grp
Order By [ID],min([enter_date])
Returns
ID enter_date leave_date
111 2016-07-29 2017-06-02
222 2016-07-29 2016-12-02
222 2017-01-30 2017-06-02
333 2016-08-01 2017-06-02
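The tally-table expansion is one way to do it; on engines with window functions, the same islands can be found without exploding the rows, by flagging each row whose enter_day is not adjacent to the previous leave_day and running-summing the flags. A sketch of that alternative in SQLite via Python (the `ranges` table name is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ranges (id INTEGER, enter_date TEXT, leave_date TEXT,
                     enter_day INTEGER, leave_day INTEGER);
INSERT INTO ranges VALUES
 (111,'2016-07-29','2016-12-01',1,75),
 (111,'2016-12-02','2017-01-13',76,95),
 (111,'2017-01-17','2017-06-02',96,181),
 (222,'2016-07-29','2016-12-02',1,76),
 (222,'2017-01-30','2017-06-02',105,181),
 (333,'2016-08-01','2017-06-02',1,180);
""")

# is_gap = 1 when this range does not start right after the previous one ends;
# the running sum of is_gap numbers the islands within each id.
rows = conn.execute("""
SELECT id, MIN(enter_date) AS enter_date, MAX(leave_date) AS leave_date
FROM (SELECT *,
             SUM(is_gap) OVER (PARTITION BY id ORDER BY enter_day) AS grp
      FROM (SELECT *,
                   CASE WHEN enter_day = LAG(leave_day)
                          OVER (PARTITION BY id ORDER BY enter_day) + 1
                        THEN 0 ELSE 1 END AS is_gap
            FROM ranges))
GROUP BY id, grp
ORDER BY id, enter_date
""").fetchall()
print(rows)
```

The output matches the four expected rows: one merged range for 111, two for 222, one for 333.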

How do I calculate minimum and maximum for groups in a sequence in SQL Server?

I am having the following data in my database table in SQL Server:
Id Date Val_A Val_B Val_C Avg Vector MINMAXPOINTS
329 2016-01-15 78.09 68.40 70.29 76.50 BELOW 68.40
328 2016-01-14 79.79 75.40 76.65 76.67 BELOW 75.40
327 2016-01-13 81.15 74.59 79.00 76.44 ABOVE 81.15
326 2016-01-12 81.95 77.04 78.95 76.04 ABOVE 81.95
325 2016-01-11 82.40 73.65 81.34 75.47 ABOVE 82.40
324 2016-01-08 78.75 73.40 77.20 74.47 ABOVE 78.75
323 2016-01-07 76.40 72.29 72.95 73.74 BELOW 72.29
322 2016-01-06 81.25 77.70 78.34 73.12 ABOVE 81.25
321 2016-01-05 81.75 76.34 80.54 72.08 ABOVE 81.75
320 2016-01-04 80.95 75.15 76.29 70.86 ABOVE 80.95
The column MINMAXPOINTS should contain the lowest Val_B over each contiguous run of 'BELOW' rows and the highest Val_A over each contiguous run of 'ABOVE' rows. So, we would have the following values in MINMAXPOINTS:
MINMAXPOINTS
68.40
68.40
82.40
82.40
82.40
82.40
72.29
81.75
81.75
81.75
Is it possible without a cursor?
Any help will be greatly appreciated!
First apply the classic gaps-and-islands technique to determine the groups (above/below islands), then calculate MIN and MAX for each group.
I assume that the ID column defines the order of rows.
Tested on SQL Server 2008. Here is SQL Fiddle.
Sample data
DECLARE #T TABLE
([Id] int, [dt] date, [Val_A] float, [Val_B] float, [Val_C] float, [Avg] float,
[Vector] varchar(5));
INSERT INTO #T ([Id], [dt], [Val_A], [Val_B], [Val_C], [Avg], [Vector]) VALUES
(329, '2016-01-15', 78.09, 68.40, 70.29, 76.50, 'BELOW'),
(328, '2016-01-14', 79.79, 75.40, 76.65, 76.67, 'BELOW'),
(327, '2016-01-13', 81.15, 74.59, 79.00, 76.44, 'ABOVE'),
(326, '2016-01-12', 81.95, 77.04, 78.95, 76.04, 'ABOVE'),
(325, '2016-01-11', 82.40, 73.65, 81.34, 75.47, 'ABOVE'),
(324, '2016-01-08', 78.75, 73.40, 77.20, 74.47, 'ABOVE'),
(323, '2016-01-07', 76.40, 72.29, 72.95, 73.74, 'BELOW'),
(322, '2016-01-06', 81.25, 77.70, 78.34, 73.12, 'ABOVE'),
(321, '2016-01-05', 81.75, 76.34, 80.54, 72.08, 'ABOVE'),
(320, '2016-01-04', 80.95, 75.15, 76.29, 70.86, 'ABOVE');
Query
To understand better how it works examine results of each CTE.
CTE_RowNumbers calculates two sequences of row numbers.
CTE_Groups assigns a number for each group (above/below).
CTE_MinMax calculates MIN/MAX for each group.
Final SELECT picks MIN or MAX to return.
WITH CTE_RowNumbers AS
(
    SELECT [Id], [dt], [Val_A], [Val_B], [Val_C], [Avg], [Vector]
          ,ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn1
          ,ROW_NUMBER() OVER (PARTITION BY Vector ORDER BY ID DESC) AS rn2
    FROM @T
)
,CTE_Groups AS
(
    SELECT [Id], [dt], [Val_A], [Val_B], [Val_C], [Avg], [Vector]
          ,rn1 - rn2 AS Groups
    FROM CTE_RowNumbers
)
,CTE_MinMax AS
(
    SELECT [Id], [dt], [Val_A], [Val_B], [Val_C], [Avg], [Vector]
          ,MAX(Val_A) OVER(PARTITION BY Groups) AS MaxA
          ,MIN(Val_B) OVER(PARTITION BY Groups) AS MinB
    FROM CTE_Groups
)
SELECT [Id], [dt], [Val_A], [Val_B], [Val_C], [Avg], [Vector]
      ,CASE
           WHEN [Vector] = 'BELOW' THEN MinB
           WHEN [Vector] = 'ABOVE' THEN MaxA
       END AS MINMAXPOINTS
FROM CTE_MinMax
ORDER BY ID DESC;
Result
+-----+------------+-------+-------+-------+-------+--------+--------------+
| Id | dt | Val_A | Val_B | Val_C | Avg | Vector | MINMAXPOINTS |
+-----+------------+-------+-------+-------+-------+--------+--------------+
| 329 | 2016-01-15 | 78.09 | 68.4 | 70.29 | 76.5 | BELOW | 68.4 |
| 328 | 2016-01-14 | 79.79 | 75.4 | 76.65 | 76.67 | BELOW | 68.4 |
| 327 | 2016-01-13 | 81.15 | 74.59 | 79 | 76.44 | ABOVE | 82.4 |
| 326 | 2016-01-12 | 81.95 | 77.04 | 78.95 | 76.04 | ABOVE | 82.4 |
| 325 | 2016-01-11 | 82.4 | 73.65 | 81.34 | 75.47 | ABOVE | 82.4 |
| 324 | 2016-01-08 | 78.75 | 73.4 | 77.2 | 74.47 | ABOVE | 82.4 |
| 323 | 2016-01-07 | 76.4 | 72.29 | 72.95 | 73.74 | BELOW | 72.29 |
| 322 | 2016-01-06 | 81.25 | 77.7 | 78.34 | 73.12 | ABOVE | 81.75 |
| 321 | 2016-01-05 | 81.75 | 76.34 | 80.54 | 72.08 | ABOVE | 81.75 |
| 320 | 2016-01-04 | 80.95 | 75.15 | 76.29 | 70.86 | ABOVE | 81.75 |
+-----+------------+-------+-------+-------+-------+--------+--------------+
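The rn1 - rn2 island numbering can be reproduced on any engine with window functions. The sketch below (SQLite via Python) partitions the final MIN/MAX by (vector, grp) rather than by grp alone, a slight tightening that prevents islands of different vectors from ever sharing a group number; lowercase table and column names are assumptions of the sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, val_a REAL, val_b REAL, vector TEXT);
INSERT INTO t VALUES
 (329,78.09,68.40,'BELOW'),(328,79.79,75.40,'BELOW'),
 (327,81.15,74.59,'ABOVE'),(326,81.95,77.04,'ABOVE'),
 (325,82.40,73.65,'ABOVE'),(324,78.75,73.40,'ABOVE'),
 (323,76.40,72.29,'BELOW'),(322,81.25,77.70,'ABOVE'),
 (321,81.75,76.34,'ABOVE'),(320,80.95,75.15,'ABOVE');
""")

# rn1 - rn2 is constant within each contiguous run of the same vector value.
rows = conn.execute("""
WITH groups AS (
  SELECT *,
         ROW_NUMBER() OVER (ORDER BY id DESC)
       - ROW_NUMBER() OVER (PARTITION BY vector ORDER BY id DESC) AS grp
  FROM t
)
SELECT id,
       CASE vector
         WHEN 'BELOW' THEN MIN(val_b) OVER (PARTITION BY vector, grp)
         ELSE              MAX(val_a) OVER (PARTITION BY vector, grp)
       END AS minmaxpoints
FROM groups
ORDER BY id DESC
""").fetchall()
print(rows)
```

The minmaxpoints column reproduces the expected values from the question, top to bottom.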
Modify the query so the subqueries only consider the data from the current record onward. The CASE statement below lets you select a conditional value based on the vector value for each row.
The query is
SELECT ID, DATE, VAL_A, VAL_B, VAL_C, AVG, VECTOR,
CASE
WHEN VECTOR = 'BELOW' THEN (SELECT MIN(VAL_B) FROM TABLE A WHERE ROWID >= B.ROWID)
WHEN VECTOR = 'ABOVE' THEN (SELECT MAX(VAL_A) FROM TABLE A WHERE ROWID >= B.ROWID)
END AS MINMAXVALUE
FROM TABLE B
GO
Check it; this should yield the result you are expecting from the data.
You can use the below query, which uses a CASE statement to select a conditional value based on the vector value for each row.
The query is
SELECT ID, DATE, VAL_A, VAL_B, VAL_C, AVG, VECTOR,
CASE
WHEN VECTOR = 'BELOW' THEN (SELECT MIN(VAL_B) FROM TABLE A)
WHEN VECTOR = 'ABOVE' THEN (SELECT MAX(VAL_A) FROM TABLE A)
END AS MINMAXVALUE
FROM TABLE B
GO
Check if this helps you.