I have Tabel MatchResults
id | player_win_id
------------------
1 | 1
2 | 1
3 | 3
4 | 1
5 | 2
6 | 3
7 | 3
8 | 1
9 | 1
10 | 1
I need to find out for each player ID the highest number of consecutive victories. I use MS SQL Server.
Expected Result
PLAYER_ID | WIN_COUNT
------------------
1 | 3
2 | 1
3 | 2
This is a type of gaps-and-islands problem. One solution uses the difference of row numbers. So, to get all streaks:
select player_win_id, count(*)
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by player_win_id order by id) as seqnum_p
from MatchResults t
) t
group by player_win_id, (seqnum - seqnum_p);
Why this works is a little tricky to explain. But if you look at the results of the subquery, you'll probably see how the difference between the row number values captures adjacent rows with the same player win id.
For the maximum, the simplest is probably just an aggregation query:
select player_win_id, max(cnt)
from (select player_win_id, count(*) as cnt
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by player_win_id order by id) as seqnum_p
from MatchResults t
) t
group by player_win_id, (seqnum - seqnum_p)
) p
group by player_win_id;
Now I understand the previous comment. The code for my table is:
select player_win_id, max(cnt)
from (select player_win_id, count(*) as cnt
from (select *,
row_number() over (order by id) as seqnum,
row_number() over (partition by player_win_id order by id) as seqnum_p
from MatchResults ) t
group by player_win_id, (seqnum - seqnum_p)
) p
group by player_win_id;
Related
I am trying to use the Row Number in SQL. However, it's not giving desired output.
Data :
ID Name Output should be
111 A 1
111 B 2
111 C 3
111 C 3
111 A 4
222 A 1
222 A 1
222 B 2
222 C 3
222 B 4
222 B 4
This is a gaps-and-islands problem. As a starter: for the question to just make sense, you need a column that defines the ordering of the rows - I assumed ordering_id. Then, I would recommend lag() to get the "previous" name, and a cumulative sum() that increases everytime the name changes in adjacent rows:
select id, name,
sum(case when name = lag_name then 0 else 1 end) over(partition by id order by ordering_id) as rn
from (
select t.*, lag(name) over(partition by id order by ordering_id) lag_name
from mytable t
) t
SQL Server 2008 makes this much trickier. You can identify the adjacent rows using a difference of rows numbers. Then you can assign the minimum id in each island and use dense_rank():
select t.*,
dense_rank() over (partition by name order by min_ordcol) as output
from (select t.*,
min(<ordcol>) over (partition by name, seqnum - seqnum_2) as min_ordcol
from (select t.*,
row_number() over (partition by name order by <ordcol>) as seqnum,
row_number() over (partition by name, id order by <ordcol>) as seqnum_2
from t
) t
) t;
ValId | PolicyId | Date | Value
------+----------+------------+-------
1 | 11 | 2020-06-01 | 2000
2 | 11 | 2020-06-03 | 3000
3 | 11 | 2020-06-03 | 4000
4 | 12 | 2020-06-02 | 8000
5 | 12 | 2020-06-03 | 8500
I wanted to get top 2 latest Val rows for each PolicyId but they cannot be from the same date.
Rows for PolicyId = 12 are returned correctly - ValId 4 and 5.
For PolicyId = 11, rows with ValId 2 and 3 are returned but as they are on the same date I wanted row of ValId 1 to be returned instead of ValId 2.
SELECT
V.ValId, V.PolicyId, V.Value, V.Date
FROM
(SELECT
ValId, PolicyId, Value, Date,
ROW_NUMBER() OVER (PARTITION BY PolicyId ORDER BY Date Desc, ValId DESC) AS RowNum
FROM
TVal) V
WHERE
RowNum <= 2
You can enumerate the rows by dates and within dates:
select t.*
from (select t.*,
dense_rank() over (partition by policyid order by date desc valId desc) as seqnum,
rank() over (partition by policyid, date order by valId desc) as seqnum_within_date
from tval
) t
where seqnum <= 2 and seqnum_within_date = 1;
Using the suggestion from Gordon Linoff I was able to complete the sql as below
Select v.* from
(
select t.*,
row_number() over (partition by policyid order by date desc valId desc) as seqnum,
from (select t.*
dense_rank() over (partition by policyid, date order by valId desc) as seqnum_within_date
from tval
) t where seqnum_within_date = 1
)v where seqnum <= 2
Hi I have a table like below, and I want to count the repeating values in the status column. I don't want to calculate the overall duplicate values. For example, I just want to count how many "Offline" appears until the value changes to "Idle".
This is the result I wanted. Thank you.
This is often called gaps-and-islands.
One way to do it is with two sequences of row numbers.
Examine each intermediate result of the query to understand how it works.
WITH
CTE_rn
AS
(
SELECT
status
,dt
,ROW_NUMBER() OVER (ORDER BY dt) as rn1
,ROW_NUMBER() OVER (PARTITION BY status ORDER BY dt) as rn2
FROM
T
)
SELECT
status
,COUNT(*) AS cnt
FROM
CTE_rn
GROUP BY
status
,rn1-rn2
ORDER BY
min(dt)
;
Result
| status | cnt |
|---------|-----|
| offline | 2 |
| idle | 1 |
| offline | 2 |
| idle | 1 |
WITH
cte1 AS ( SELECT status,
"date",
workstation,
CASE WHEN status = LAG(status) OVER (PARTITION BY workstation ORDER BY "date")
THEN 0
ELSE 1 END changed
FROM test ),
cte2 AS ( SELECT status,
"date",
workstation,
SUM(changed) OVER (PARTITION BY workstation ORDER BY "date") group_num
FROM cte1 )
SELECT status, COUNT(*) "count", workstation, MIN("date") "from", MAX("date") "till"
FROM cte2
GROUP BY group_num, status, workstation;
fiddle
I am trying to create a SQL query that will pull the number of rows since the last maximum value within a windows function over the last 5 rows. In the example below it would return 2 for row 8. The max value is 12 which is 2 rows from row 8.
For row 6 it would return 5 because the max value of 7 is 5 rows away.
|ID | Date | Amount
| 1 | 1/1/2019 | 7
| 2 | 1/2/2019 | 3
| 3 | 1/3/2019 | 4
| 4 | 1/4/2019 | 1
| 5 | 1/5/2019 | 1
| 6 | 1/6/2019 | 12
| 7 | 1/7/2019 | 2
| 8 | 1/8/2019 | 4
I tried the following:
SELECT ID, date, MAX(amount)
OVER (ORDER BY date ASC ROWS 5 PRECEDING) mymax
FROM tbl
This gets me to the max values but I am unable to efficiently determine how many rows away it is. I was able to get close using multiple variables within the SELECT but this did not seem efficient or scalable.
You can calculate the cumulative maximum and then use row_number() on that.
So:
select t.*,
row_number() over (partition by running_max order by date) as rows_since_last_max
from (select t.*,
max(amount) over (order by date rows between 5 preceding and current row) as running_max
from tbl t
) t;
I think this works for your sample data. It might not work if you have duplicates.
In that case, you can use date arithmetic:
select t.*,
datediff(day,
max(date) over (partition by running_max order by date),
date
) as days_since_most_recent_max5
from (select t.*,
max(amount) over (order by date rows between 5 preceding and current row) as running_max
from tbl t
) t;
EDIT:
Here is an example using row number:
select t.*,
(seqnum - max(case when amount = running_amount then seqnum end) over (partition by running_max order by date)) as rows_since_most_recent_max5
from (select t.*,
max(amount) over (order by date rows between 5 preceding and current row) as running_max,
row_number() over (order by date) as seqnum
from tbl t
) t;
It would be :
select *,ID-
(
SELECT ID
FROM
(
SELECT
ID,amount,
Maxamount =q.mymax
FROM
Table_4
) AS derived
WHERE
amount = Maxamount
) as result
from (
SELECT ID, date,
MAX(amount)
OVER (ORDER BY date ASC ROWS 5 PRECEDING) mymax
FROM Table_4
)as q
How I can find max uninterrupted interval in column?
Example
ID Result
1 1
2 2
3 3
4 4
5 5
6 6
10
11
12
You can use row_number(). Here is a simple way to get the first and lsat values:
select top (1) with ties min(id), max(id)
from (select t.*, row_number() over (order by id) as seqnum
from t
) t
group by (id - seqnum)
order by count(*) desc;
To get the actual original rows requires another level of window functions:
select top (1) with ties
from (select t.*, count(*) over (partition by id - seqnum) as cnt
from (select t.*, row_number() over (order by id) as seqnum
from t
) t
) t
order by cnt desc, id;