Find max uninterrupted interval - sql

How I can find max uninterrupted interval in column?
Example
ID Result
1 1
2 2
3 3
4 4
5 5
6 6
10
11
12

You can use row_number(). Here is a simple way to get the first and lsat values:
select top (1) with ties min(id), max(id)
from (select t.*, row_number() over (order by id) as seqnum
from t
) t
group by (id - seqnum)
order by count(*) desc;
To get the actual original rows requires another level of window functions:
select top (1) with ties
from (select t.*, count(*) over (partition by id - seqnum) as cnt
from (select t.*, row_number() over (order by id) as seqnum
from t
) t
) t
order by cnt desc, id;

Related

How to get longest consecutive same value?

How to get the rows of the longest consecutive same value?
Table Learning:
rowID
values
1
1
2
1
3
0
4
0
5
0
6
1
7
0
8
1
9
1
10
1
Longest consecutive value is 1 (rowID 8-10 as rowID 1-2 is 2 and rowID 6-6 is 1). How to query to get the actual rows of consecutive values (not just rowStart and rowEnd values) like :
rowID
values
8
1
9
1
10
1
And for longest consecutive values of both 1 and 0?
DB Fiddle
I think that the simplest approach is to use a window count to define the islands. Then to get the "longest" island, we just need to aggregate, sort and limit:
select min(valueid) grp_start, max(valueid) grp_end
from (select t.*, sum(value = 0) over(order by valueid) grp from testing t) t
where value = 1
group by grp
order by count(*) desc limit 1
In the DB Fiddle that you provided, the query returns:
grp_start
grp_end
8
10
This is a gaps and islands problem, and one approach is to use the difference in row numbers method:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY rowID) rn1,
ROW_NUMBER() OVER (PARTITION BY values ORDER BY rowID) rn2
FROM yourTable
),
cte2 AS (
SELECT *,
MIN(rowID) OVER (PARTITION BY values, rn1 - rn2) AS minRowID,
MAX(rowID) OVER (PARTITION BY values, rn1 - rn2) AS maxRowID
FROM cte1
),
cte3 AS (
SELECT *, RANK() OVER (PARTITION BY values ORDER BY maxRowID - minRowID DESC) rnk
FROM cte2
)
SELECT rowID, values
FROM cte3
WHERE rnk = 1
ORDER BY values, rowID;

Selecting rows that have row_number more than 1

I have a table as following (using bigquery):
id
year
month
sales
row_number
111
2020
11
1000
1
111
2020
12
2000
2
112
2020
11
3000
1
113
2020
11
1000
1
Is there a way in which I can select rows that have row numbers more than one?
For example, my desired output is:
id
year
month
sales
row_number
111
2020
11
1000
1
111
2020
12
2000
2
I don't want to just exclusively select rows with row_number = 2 but also row_number = 1 as well.
The original code block I used for the first table result is:
SELECT
id,
year,
month,
SUM(sales) AS sales,
ROW_NUMBER() OVER (PARTITIONY BY id ORDER BY id ASC) AS row_number
FROM
table
GROUP BY
id, year, month
You can use window functions:
select t.* except (cnt)
from (select t.*,
count(*) over (partition by id) as cnt
from t
) t
where cnt > 1;
As applied to your aggregation query:
SELECT iym.* EXCEPT (cnt)
FROM (SELECT id, year, month,
SUM(sales) as sales,
ROW_NUMBER() OVER (Partition by id ORDER BY id ASC) AS row_number
COUNT(*) OVER(Partition by id ORDER BY id ASC) AS cnt
FROM table
GROUP BY id, year, month
) iym
WHERE cnt > 1;
You can wrap your query as in below example
select * except(flag) from (
select *, countif(row_number > 1) over(partition by id) > 0 flag
from (YOUR_ORIGINAL_QUERY)
)
where flag
so it can look as
select * except(flag) from (
select *, countif(row_number > 1) over(partition by id) > 0 flag
from (
SELECT id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month
)
)
where flag
so when applied to sample data in your question - it will produce below output
Try this:
with tmp as (SELECT id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month)
select * from tmp a where exists ( select 1 from tmp b where a.id = b.id and b.row_number =2)
It's a so clearly exists statement SQL
This is what I use, it's similar to #ElapsedSoul answer but from my understanding for static list "IN" is better than using "EXISTS" but I'm not sure if the performance difference, if any, is significant:
Difference between EXISTS and IN in SQL?
WITH T1 AS
(
SELECT
id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY id ASC) AS ROW_NUM
FROM table
GROUP BY id, year, month
)
SELECT *
FROM T1
WHERE id IN (SELECT id FROM T1 WHERE ROW_NUM > 1);

SQL The largest number of consecutive values for each value

I have Tabel MatchResults
id | player_win_id
------------------
1 | 1
2 | 1
3 | 3
4 | 1
5 | 2
6 | 3
7 | 3
8 | 1
9 | 1
10 | 1
I need to find out for each player ID the highest number of consecutive victories. I use MS SQL Server.
Expected Result
PLAYER_ID | WIN_COUNT
------------------
1 | 3
2 | 1
3 | 2
This is a type of gaps-and-islands problem. One solution uses the difference of row numbers. So, to get all streaks:
select player_win_id, count(*)
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by player_win_id order by id) as seqnum_p
from MatchResults t
) t
group by player_win_id, (seqnum - seqnum_p);
Why this works is a little tricky to explain. But if you look at the results of the subquery, you'll probably see how the difference between the row number values captures adjacent rows with the same player win id.
For the maximum, the simplest is probably just an aggregation query:
select player_win_id, max(cnt)
from (select player_win_id, count(*) as cnt
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by player_win_id order by id) as seqnum_p
from MatchResults t
) t
group by player_win_id, (seqnum - seqnum_p)
) p
group by player_win_id;
Now I understand the previous comment. The code for my table is:
select player_win_id, max(cnt)
from (select player_win_id, count(*) as cnt
from (select *,
row_number() over (order by id) as seqnum,
row_number() over (partition by player_win_id order by id) as seqnum_p
from MatchResults ) t
group by player_win_id, (seqnum - seqnum_p)
) p
group by player_win_id;

SQL Server : create group of N rows each and give group number for each group

I want to create a SQL query that SELECT a ID column and adds an extra column to the query which is a group number as shown in the output below.
Each group consists of 3 rows and should have the MIN(ID) as a GroupID for each group. The order by should be ASC on the ID column.
ID GroupNr
------------
100 100
101 100
102 100
103 103
104 103
105 103
106 106
107 106
108 106
I've tried solutions with ROW_NUMBER() and DENSE_RANK(). And also this query:
SELECT
*, MIN(ID) OVER (ORDER BY ID ASC ROWS 2 PRECEDING) AS Groupnr
FROM
Table
ORDER BY
ID ASC
Use row_number() to enumerate the rows, arithmetic to assign the group and then take the minimum of the id:
SELECT t.*, MIN(ID) OVER (PARTITION BY grp) as groupnumber
FROM (SELECT t.*,
( (ROW_NUMBER() OVER (ORDER BY ID) - 1) / 3) as grp
FROM Table
) t
ORDER BY ID ASC;
It is possible to do this without a subquery, but the logic is rather messy:
select t.*,
(case when row_number() over (order by id) % 3 = 0
then lag(id, 2) over (order by id)
when row_number() over (order by id) % 3 = 2
then lag(id, 1) over (order by id)
else id
end) as groupnumber
from table t
order by id;
Assuming you want the lowest value in the group, and they are always groups of 3, rather than the NTILE (as Saravantn suggests, which splits the data into that many even(ish) groups), you could use a couple of window functions:
WITH Grps AS(
SELECT V.ID,
(ROW_NUMBER() OVER (ORDER BY V.ID) -1) / 3 AS Grp
FROM (VALUES(100),
(101),
(102),
(103),
(104),
(105),
(106),
(107),
(108))V(ID))
SELECT G.ID,
MIN(G.ID) OVER (PARTITION BY G.Grp) AS GroupNr
FROM Grps G;
SELECT T2.ID, T1.ID
FROM (
SELECT MIN(ID) AS ID, GroupNr
FROM
(
SELECT ID, ( Row_number()OVER(ORDER BY ID) - 1 ) / 3 + 1 AS GroupNr
FROM Table
) AS T1
GROUP BY GroupNr
) AS T1
INNER JOIN (
SELECT ID, ( Row_number()OVER(ORDER BY ID) - 1 ) / 3 + 1 AS GroupNr
FROM Table
) T2 ON T1.GroupNr = T2.GroupNr

Select TOP 2 values for each group

I'm having problem with getting only TOP 2 values for each group (groups are in column).
Example :
ID Group Value
1 A 30
2 A 150
3 A 40
4 A 70
5 B 0
6 B 100
7 B 90
I expect my output to be
ID Group Value
1 A 150
2 A 70
3 B 100
4 B 90
Simply, for each group I want just 2 rows with the highest Value
Most databases support the ANSI standard row_number() function. You would use it as:
select group, value
from (select t.*,
row_number() over (partition by group order by value desc) as seqnum
from t
) t
where seqnum <= 2;
To set the id you can use row_number() in the outer query:
select row_number() over (order by group, value) as id,
group, value
from (select t.*,
row_number() over (partition by group order by value desc) as seqnum
from t
) t
where seqnum <= 2;
However, changing the id seems suspicious.
You can use CTE with rank function ROW_NUMBER() .
Here is query to get your result.
;WITH cte AS
( SELECT Group, value,
ROW_NUMBER() OVER (PARTITION BY Group ORDER BY value DESC) AS rn
FROM test
)
SELECT Group, value FROM cte
WHERE rn <= 2
ORDER BY value