SQL exclude rows based on value in another row - sql

I am trying to exclude rows where a value exists in another row.
select * from TABLE1
ROW SEQ VALUE
1 1 HIGH
1 2 HIGH
1 3 LOW
1 4 HIGH
2 1 MED
2 2 HIGH
2 3 HIGH
2 4 LOW
2 5 HIGH
2 6 HIGH
All the data comes from the same table. What I am trying to do is exclude the rows where VALUE = 'LOW' and all previous rows in the same ROW group, that is, rows whose SEQ <= the SEQ of the 'LOW' row. This is my desired result:
ROW SEQ VALUE
1 4 HIGH
2 5 HIGH
2 6 HIGH
Here's my work in progress, but it only excludes the one row:
select * from TABLE1
where not exists(select VALUE from TABLE1
where ROW = ROW and VALUE = 'LOW' and SEQ <= SEQ)
I need to write it into the where clause, as the select is hard-coded. I am lost; any help would be greatly appreciated. Thanks in advance!

select *
from table1
left outer join (
select row, max(seq) as seq
from table1
where value = 'LOW'
group by row
) lows on lows.row = table1.row
where lows.row is null
or table1.seq > lows.seq

You should be aliasing the tables. I'm surprised you are getting any results from this query as you don't have aliases at all.
select *
from TABLE1 As t0
where not exists(
select VALUE
from TABLE1 As t1
where t0.ROW = t1.ROW
and t1.VALUE = 'LOW'
and t0.SEQ <= t1.SEQ
)

You can use a window function with a cumulative approach (this assumes exactly one 'LOW' row per ROW group):
select t.*
from (select t.*, sum(case when value = 'LOW' then 1 else 0 end) over (partition by row order by seq) as cnt
from table1 t
) t
where cnt = 1 and value <> 'LOW';

For the results you mention, you seem to want the rows after the last "low". One method is:
select t1.*
from table1 t1
where t1.seq > (select max(tt1.seq) from table1 tt1 where tt1.row = t1.row and tt1.value = 'LOW');
(Note: This requires a "low" row. If there could be no "low" rows and you want all rows returned, that is easily added to the query.)
Or, similarly, using not exists:
select t1.*
from table1 t1
where not exists (select 1
from table1 tt1
where tt1.row = t1.row and
tt1.seq >= t1.seq and
tt1.value = 'LOW'
);
This might be the most direct translation of your question.
However, I would more likely use window functions:
select t1.*
from (select t1.*,
max(case when t1.value = 'LOW' then t1.seq end) over (partition by t1.row) as max_low_seqnum
from table1 t1
) t1
where seq > max_low_seqnum;
You might want to add or max_low_seqnum is null to return all rows if there are no "low" rows.
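To check the idea against the sample data, here is a minimal sketch that loads the question's rows into an in-memory SQLite database from Python and runs the left-join variant, with the literal written as 'LOW' to match the data. SQLite and the Python harness are assumptions (the question does not name a DBMS), and the group with ROW = 3 is a hypothetical addition that has no 'LOW' row, included only to show that such a group is returned whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "row" is quoted because ROW is a keyword in several SQL dialects.
conn.execute('CREATE TABLE table1 ("row" INTEGER, seq INTEGER, value TEXT)')
rows = [
    (1, 1, "HIGH"), (1, 2, "HIGH"), (1, 3, "LOW"), (1, 4, "HIGH"),
    (2, 1, "MED"), (2, 2, "HIGH"), (2, 3, "HIGH"), (2, 4, "LOW"),
    (2, 5, "HIGH"), (2, 6, "HIGH"),
    (3, 1, "HIGH"),  # hypothetical group with no LOW row
]
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)

query = """
SELECT t."row", t.seq, t.value
FROM table1 AS t
LEFT JOIN (
    -- last LOW position per group
    SELECT "row", MAX(seq) AS seq
    FROM table1
    WHERE value = 'LOW'
    GROUP BY "row"
) AS lows ON lows."row" = t."row"
-- keep groups without any LOW, and rows after the last LOW
WHERE lows."row" IS NULL OR t.seq > lows.seq
ORDER BY t."row", t.seq
"""
result = conn.execute(query).fetchall()
print(result)
```

On the sample data this keeps (1, 4), (2, 5) and (2, 6), plus the single row of the hypothetical third group.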

Related

SQL Partition by with conditions

I want to partition the data on the basis of two columns Type and Env and fetch the top 5 records for each partition order by count desc. The problem that I'm facing is that I need to partition the Env on the basis of LIKE condition.
Data -
Type Environment Count
T1 E1 1
T1 M1 2
T1 AB1 3
T2 E1 1
T2 M1 2
T2 CB1 3
T2 M1 5
The result that I want - Let's say I'm fetching top (1) record for now
Type Environment Count
T1 M1 2
T1 AB1 3
T2 CB1 3
T2 M1 5
Here I'm dividing the env on the condition (env LIKE "%M%" versus env NOT LIKE "%M%").
One approach that I can think of is using partition and union but this is a very expensive call due to the large amount of data that I'm filtering from. Is there a better way to achieve this?
SELECT
*
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Type ORDER BY Count DESC) AS maxCount
FROM
table
WHERE
Env LIKE '%M%'
) AS t1
WHERE
t1.maxCount <= 5
UNION
SELECT
*
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Type ORDER BY Count DESC) AS maxCount
FROM
table
WHERE
Env NOT LIKE '%M%'
) AS t1
WHERE
t1.maxCount <= 5
You would seem to want an additional partition by in your row_number():
select t.*
from (select t.*,
row_number() over (partition by type,
(case when environment like '%M%' then 1 else 2 end)
order by count desc
) as seqnum
from t
) t
where seqnum <= 5;
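As a rough check of the single-pass approach, the sketch below runs it on the question's sample data in an in-memory SQLite database (an assumption; window functions need SQLite 3.25 or later). The table name t, the column name cnt (used instead of Count to avoid clashing with the function name) and the TOP_N parameter are illustrative choices, not names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (type TEXT, environment TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("T1", "E1", 1), ("T1", "M1", 2), ("T1", "AB1", 3),
    ("T2", "E1", 1), ("T2", "M1", 2), ("T2", "CB1", 3), ("T2", "M1", 5),
])

TOP_N = 1  # the question ultimately wants 5; 1 matches its example output
query = """
SELECT type, environment, cnt
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               -- the CASE expression splits each type into two buckets
               PARTITION BY type,
                            CASE WHEN environment LIKE '%M%' THEN 1 ELSE 2 END
               ORDER BY cnt DESC
           ) AS seqnum
    FROM t
) AS ranked
WHERE seqnum <= ?
ORDER BY type, cnt
"""
top_rows = conn.execute(query, (TOP_N,)).fetchall()
print(top_rows)
```

Because the bucket is computed inside the partition clause, the table is scanned once instead of once per branch of the UNION.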

Remove matching pairs of rows from query

I am trying to produce a report from an SQL database.
The data consists of transactions. Sometimes, because of operator error, incorrect records are entered; later, to correct for this, the same record is entered again but with a negative quantity.
i.e.
ID, DESC , QTY
0 , ITEM1 , 2
1 , ITEM2 , 1
2 , ITEM3 , 2 // This record and
3 , ITEM2 , 1
4 , ITEM3 , -2 // this record cancel out
I would like to have a query that looks at pairs of rows that are identical besides the ID and have an opposite sign on the QTY and does not include them in the result.
Similar to the below.
ID, DESC , QTY
0 , ITEM1 , 2
1 , ITEM2 , 1
3 , ITEM2 , 1
What is the easiest way to achieve this in a query? I was thinking along the lines of an aggregate SUM function, but I only want to remove rows with a QTY of opposite sign but equal magnitude.
This is rather painful. The immediate answer to your question is not exists. However, you need to be careful about duplicates, so I would recommend enumerating the values first:
with t as (
select t.*,
row_number() over (partition by desc, qty order by id) as seqnum
from transactions t
)
select t.*
from t
where not exists (select 1
from t t2
where t2.desc = t.desc and
t2.seqnum = t.seqnum and
t2.qty = - t.qty
);
You could use a left join as an anti-join to exclude records for which another record exists with the same desc and an opposite qty.
select t.*
from mytable t
left join mytable t1 on t1.desc = t.desc and t1.qty = - t.qty
where t1.id is null
Or a not exists condition with a correlated subquery:
select t.*
from mytable t
where not exists (
select 1
from mytable t1
where t1.desc = t.desc and t1.qty = - t.qty
)
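The difference between pairing rows one-to-one and simply testing for any opposite row shows up only when there are duplicates. The sketch below, assuming SQLite 3.25 or later and a table named transactions, runs the enumerated version on the question's data plus one hypothetical extra row (id 5), a second ITEM3 entry of +2 that has no cancelling counterpart. The column desc is double-quoted because DESC is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE transactions (id INTEGER, "desc" TEXT, qty INTEGER)')
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", [
    (0, "ITEM1", 2), (1, "ITEM2", 1), (2, "ITEM3", 2),
    (3, "ITEM2", 1), (4, "ITEM3", -2),
    (5, "ITEM3", 2),  # hypothetical extra row: a second, uncancelled entry
])

query = """
WITH numbered AS (
    -- number identical (desc, qty) rows so each one pairs with at most one opposite
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY "desc", qty ORDER BY id) AS seqnum
    FROM transactions AS t
)
SELECT n.id, n."desc", n.qty
FROM numbered AS n
WHERE NOT EXISTS (
    SELECT 1
    FROM numbered AS n2
    WHERE n2."desc" = n."desc"
      AND n2.seqnum = n.seqnum
      AND n2.qty = -n.qty
)
ORDER BY n.id
"""
remaining = conn.execute(query).fetchall()
print(remaining)
```

Rows 2 and 4 cancel each other, while row 5 survives; a plain not exists without the numbering would drop row 5 as well.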

Postgres - SQL to match the first rownum

I have the following SQL to generate a row num for each record
MY_VIEW AS
( SELECT
my_id,
(case when col1 = 'A' then
1
when col1 = 'C' then
2
else
3
end) as rownum
from table_1 )
So my data looks like this:
my_id rownum
0001-A 1
0001-A 2
0001-B 2
Later, I want to use the smallest rownum for each unique "my_id" to do an inner join with another table, table_2. How should I proceed? This is what I have so far.
select * from table_2
inner join table_1
on table_2.my_id = table_1.my_id
and rownum = (...the smallest from MY_VIEW...)
In Postgres, I would recommend distinct on:
select distinct on (my_id) my_id,
(case when col1 = 'A' then 1
when col1 = 'C' then 2
else 3
end) as rownum
from table_1
order by my_id, rownum;
However, you can just as easily do this using group by:
select my_id,
min(case when col1 = 'A' then 1
when col1 = 'C' then 2
else 3
end) as rownum
from table_1
group by my_id;
The distinct on approach allows you to include other columns. It might be a bit faster. On the downside, it is Postgres-specific.
You can use the MIN() function on rownum for every my_id in MY_VIEW (where rownum is computed) and use that in the join.
You would need to make sure table_2 also has a my_id field to make the join work.
select *
from
table_2
inner join
(select my_id, MIN(rownum) as minimum_rownum from my_view group by my_id) t1
on table_2.my_id = t1.my_id;
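Here is a small sketch of the group by plus join idea, run against SQLite from Python purely for convenience (the question targets Postgres, and this aggregate form should be portable). The col1 values are chosen to reproduce the question's sample rownum values, and the note column in table_2 is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (my_id TEXT, col1 TEXT);
CREATE TABLE table_2 (my_id TEXT, note TEXT);  -- 'note' is a hypothetical column
INSERT INTO table_1 VALUES ('0001-A', 'A'), ('0001-A', 'C'), ('0001-B', 'C');
INSERT INTO table_2 VALUES ('0001-A', 'first'), ('0001-B', 'second');
""")

query = """
SELECT t2.my_id, t2.note, m.min_rownum
FROM table_2 AS t2
INNER JOIN (
    -- smallest computed rownum per my_id
    SELECT my_id,
           MIN(CASE WHEN col1 = 'A' THEN 1
                    WHEN col1 = 'C' THEN 2
                    ELSE 3 END) AS min_rownum
    FROM table_1
    GROUP BY my_id
) AS m ON m.my_id = t2.my_id
ORDER BY t2.my_id
"""
joined = conn.execute(query).fetchall()
print(joined)
```

Each my_id contributes exactly one row to the join, carrying its smallest rownum.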

SQL Get rows based on conditions

I'm currently having trouble writing the business logic to get rows from a table with IDs and a flag that I have appended to it.
For example,
id: id seq num: flag: Date:
A 1 N ..
A 2 N ..
A 3 N
A 4 Y
B 1 N
B 2 Y
B 3 N
C 1 N
C 2 N
The end result I'm trying to achieve is that:
For each unique ID I just want to retrieve one row with the condition for that row being that
If the flag was a "Y" then return that row.
Else return the last "N" row.
Another thing to note is that the 'Y' flag is not always necessarily the last
I've been trying to get a case condition using a partition like
OVER (PARTITION BY A."ID" ORDER BY A."Seq num") but so far no luck.
-- EDIT:
From the table, the sample result would be:
id: id seq num: flag: date:
A 4 Y ..
B 2 Y ..
C 2 N ..
Using a window clause is the right idea. You should partition the results by the ID (as you've done), and order them so the Y flag rows come first, then all the N flag rows in descending date order, and pick the first for each id:
SELECT id, id_seq_num, flag, date
FROM (SELECT id, id_seq_num, flag, date,
ROW_NUMBER() OVER (PARTITION BY id
ORDER BY CASE flag WHEN 'Y' THEN 0
ELSE 1
END ASC,
date DESC) AS rk
FROM mytable) t
WHERE rk = 1
My approach is to take a UNION of two queries. The first query simply selects all Yes records, assuming that Yes only appears once per ID group. The second query targets only those ID having no Yes anywhere. For those records, we use the row number to select the most recent No record.
WITH cte1 AS (
SELECT id
FROM yourTable
GROUP BY id
HAVING SUM(CASE WHEN flag = 'Y' THEN 1 ELSE 0 END) = 0
),
cte2 AS (
SELECT t1.*,
ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t1.id_seq_num DESC) rn
FROM yourTable t1
INNER JOIN cte1 t2
ON t1.id = t2.id
)
SELECT id, id_seq_num, flag, date
FROM yourTable
WHERE flag = 'Y'
UNION ALL
SELECT id, id_seq_num, flag, date
FROM cte2 t2
WHERE t2.rn = 1
Here's one way (with quite generic SQL):
select t1.*
from Table1 as t1
where t1.id_seq_num = COALESCE(
(select max(id_seq_num) from Table1 as T2 where t1.id = t2.id and t2.flag = 'Y') ,
(select max(id_seq_num) from Table1 as T3 where t1.id = t3.id and t3.flag = 'N') )
Available in a fiddle here: http://sqlfiddle.com/#!9/5f7f9/6
SELECT DISTINCT id, flag
FROM yourTable
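To see the "Y first, otherwise the last N" ordering at work, here is a minimal sketch on the question's sample rows using SQLite from Python (an assumption; any DBMS with window functions should behave the same). The date column is left out because its values are elided in the question, so the sketch orders the N rows by id_seq_num descending instead, which is an assumption about what "last" means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT, id_seq_num INTEGER, flag TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    ("A", 1, "N"), ("A", 2, "N"), ("A", 3, "N"), ("A", 4, "Y"),
    ("B", 1, "N"), ("B", 2, "Y"), ("B", 3, "N"),
    ("C", 1, "N"), ("C", 2, "N"),
])

query = """
SELECT id, id_seq_num, flag
FROM (
    SELECT id, id_seq_num, flag,
           ROW_NUMBER() OVER (
               PARTITION BY id
               -- 'Y' rows sort first; among the rest, the highest sequence wins
               ORDER BY CASE flag WHEN 'Y' THEN 0 ELSE 1 END,
                        id_seq_num DESC
           ) AS rk
    FROM mytable
) AS ranked
WHERE rk = 1
ORDER BY id
"""
picked = conn.execute(query).fetchall()
print(picked)
```

For B the 'Y' row at sequence 2 is picked even though an 'N' row follows it, which matches the expected result in the question.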

SQL: Get running row delta for records

Let's say we have this table with columns RowID and Call:
RowID Call DesiredOut
1 A 0
2 A 0
3 B
4 A 1
5 A 0
6 A 0
7 B
8 B
9 A 2
10 A 0
I want to SQL query the last column DesiredOut as follows:
Each time Call is 'A' go back until 'A' is found again and count the number of records which are in between two 'A' entries.
Example: RowID 4 has 'A' and the nearest predecessor is in RowID 2. Between RowID 2 and RowID 4 we have one Call 'B', so we count 1.
Is there an elegant and performant way to do this with ANSI SQL?
I would approach this by first finding the rowid of the previous "A" value. Then count the number of values in-between.
The following query implements this logic using correlated subqueries:
select t.*,
(case when t.call = 'A'
then (select count(*)
from table t3
where t3.rowid < t.rowid and t3.rowid > t.prevA
)
end) as InBetweenCount
from (select t.*,
(select max(rowid)
from table t2
where t2.call = 'A' and t2.rowid < t.rowid
) as prevA
from table t
) t;
If you know that rowid is sequential with no gaps, you can just use subtraction instead of a subquery for the calculation in the outer query.
You could use a query to find the previous Call = A row. Then, you could count the number of rows between that row and the current row:
select RowID
, `Call`
, (
select count(*)
from YourTable t2
where RowID < t1.RowID
and RowID > coalesce(
(
select RowID
from YourTable t3
where `Call` = 'A'
and RowID < t1.RowID
order by
RowID DESC
limit 1
),0)
)
from YourTable t1
Example at SQL Fiddle.
Here is another solution using window functions:
with flagged as (
select *,
case
when call = 'A' and lead(call) over (order by rowid) <> 'A' then 'end'
when call = 'A' and lag(call) over (order by rowid) <> 'A' then 'start'
end as change_flag
from calls
)
select t1.rowid,
t1.call,
case
when change_flag = 'start' then rowid - (select max(t2.rowid) from flagged t2 where t2.change_flag = 'end' and t2.rowid < t1.rowid) - 1
when call = 'A' then 0
end as desiredout
from flagged t1
order by rowid;
The CTE first marks the start and end of each "A"-Block and the final select then uses these markers to get the difference between the start of one block and the end of the previous one.
If the rowid is not gapless, you can easily add a gapless rownumber inside the CTE to calculate the difference.
I'm not sure about the performance though. I wouldn't be surprised if Gordon's answer is faster.
SQLFiddle example: http://sqlfiddle.com/#!15/e1840/1
Believe it or not, this will be pretty fast if the two columns are indexed.
select r1.RowID, r1.CallID, isnull( R1.RowID - R2.RowID - 1, 0 ) as DesiredOut
from RollCall R1
left join RollCall R2
on R2.RowID =(
select max( RowID )
from RollCall
where RowID < R1.RowID
and CallID = 'A')
and R1.CallID = 'A';
Here is the Fiddle.
You could do something like this:
SELECT a.rowid - b.rowid
FROM table as a,
(SELECT rowid FROM table where rowid < a.rowid order by rowid) as b
WHERE <something>
ORDER BY a.rowid
As I don't know which DBMS you are using, this is closer to pseudocode; it would need to be adapted to your system.
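For comparison, here is one more way to get the gap, sketched with SQLite from Python (window functions need SQLite 3.25 or later). It is not taken from any of the answers above: it numbers all rows, then applies LAG over the 'A' rows only, so the difference of positions minus one is the number of rows in between. The table name calls and the column names row_id and desired_out are assumptions; row_id is used instead of RowID to avoid SQLite's built-in rowid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (row_id INTEGER, call TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    list(enumerate(["A", "A", "B", "A", "A", "A", "B", "B", "A", "A"], start=1)),
)

query = """
WITH numbered AS (
    -- gapless position, so the result does not depend on row_id being dense
    SELECT row_id, call, ROW_NUMBER() OVER (ORDER BY row_id) AS rn
    FROM calls
),
a_rows AS (
    -- WHERE is applied before the window, so LAG sees only the 'A' rows
    SELECT row_id, rn - LAG(rn) OVER (ORDER BY rn) - 1 AS gap
    FROM numbered
    WHERE call = 'A'
)
SELECT n.row_id, n.call,
       CASE WHEN n.call = 'A' THEN COALESCE(a.gap, 0) END AS desired_out
FROM numbered AS n
LEFT JOIN a_rows AS a ON a.row_id = n.row_id
ORDER BY n.row_id
"""
deltas = conn.execute(query).fetchall()
print(deltas)
```

Rows 4 and 9 come out as 1 and 2, the other 'A' rows as 0, and the 'B' rows as NULL, matching the DesiredOut column in the question.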