Redshift Row_Number() Query with partitions that restart

I have data with id, timestamp (ts), event, capital_event_bool, and prev_event_capital_bool.
id ts event capital_event_bool prev_event_capital_bool
001 00:01 a 0 0
002 00:02 b 0 0
002 00:03 b 0 0
002 00:04 b 1 0
002 00:05 c 0 1
003 00:03 c 0 0
003 00:04 b 0 0
003 00:05 b 1 0
003 00:06 b 0 1
003 00:07 b 0 0
003 00:08 b 1 0
Only "b" events can have a capital_event_bool = True.
What I would like is a way to count all b events with capital_event_bool = False that precede each capital_event_bool = True event, for every id. I originally thought I could accomplish this via the row_number() window function in Redshift with
ROW_NUMBER() OVER (PARTITION BY id, event, capital_event_bool ORDER BY ts) AS row_num
but the part that is tripping me up is how to get the count to restart after every capital_event_bool = True event. It is fine if the row numbering will stop at every capital_event_bool = True event and then restart because I can just use a case statement with the capital_event_bool to reach my final result.
row_num  DESIRED only row_num  Final Desired Result
1        1                     0
1        1                     0
2        2                     0
1        3                     2
1        1                     0
2        2                     0
1        1                     0
1        2                     1
2        1                     0
3        2                     0
2        3                     2

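This is a type of gaps-and-islands problem. Basically, you need to split each id's rows into groups, where each group ends at a row with capital_event_bool = 1. A reverse cumulative sum of capital_event_bool (a running sum ordered by ts descending) does exactly that: every row gets the number of "1" rows at or after it, so a "1" row and the rows leading up to it share the same value. Then you can use window functions on this group: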
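select t.*,
       (case when capital_event_bool = 1
             then sum( (event = 'b')::int ) over (partition by id, grp) - 1
             else 0
        end) as final_result
from (select t.*,
             -- Redshift requires an explicit frame when an aggregate window function has ORDER BY
             sum(capital_event_bool) over (partition by id
                                           order by ts desc
                                           rows between unbounded preceding and current row) as grp
      from t
     ) t;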
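To make the grouping concrete, here is a minimal sketch that runs the same idea against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for Redshift, which is an assumption made only so the example is self-contained; the `(event = 'b')::int` cast is replaced by SQLite's plain `event = 'b'`, which already evaluates to 0 or 1.

```python
import sqlite3

# Sample rows from the question: (id, ts, event, capital_event_bool)
rows = [
    ("001", "00:01", "a", 0),
    ("002", "00:02", "b", 0),
    ("002", "00:03", "b", 0),
    ("002", "00:04", "b", 1),
    ("002", "00:05", "c", 0),
    ("003", "00:03", "c", 0),
    ("003", "00:04", "b", 0),
    ("003", "00:05", "b", 1),
    ("003", "00:06", "b", 0),
    ("003", "00:07", "b", 0),
    ("003", "00:08", "b", 1),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Redshift here
conn.execute("create table t (id text, ts text, event text, capital_event_bool integer)")
conn.executemany("insert into t values (?, ?, ?, ?)", rows)

query = """
select id, ts, grp,
       case when capital_event_bool = 1
            then sum(event = 'b') over (partition by id, grp) - 1
            else 0
       end as final_result
from (select t.*,
             -- reverse running sum: number of capital events at or after this row
             sum(capital_event_bool) over (partition by id
                                           order by ts desc
                                           rows between unbounded preceding and current row) as grp
      from t) as s
order by id, ts
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each capital event row and the rows leading up to it share one grp value, so counting the b rows in that group and subtracting the capital row itself gives the number of preceding non-capital b events.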

Related

How to reset running counts and start from 1 based on the condition and make previous row values as 0 in SQL?

For example there is some table with below data:
No Id Value
1 100 1
2 100 0
3 100 1
4 100 2
1 101 1
2 101 2
1 102 0
2 102 1
I have to write a SQL query that returns a running count based on a specific condition. If the value is 0, the running count needs to reset and start again from 1, and the counts of the previous rows should be set to 0.
So the result will be like:
No Id Value Running Count
1 100 1 0
2 100 0 0
3 100 1 1
4 100 1 2
1 101 1 1
2 101 2 2
1 102 1 0
2 102 0 0

Your sample dataset is quite limited so I'm not sure of all edge cases but see if the following works for you. If not it might help get you there.
This gets a running count using a window & case expression and uses lead to check the next value.
If the current value or next value is 0 the count is 0, otherwise it's the running count subtracting 1 if there is a 0 in the Id block indicating the count was reset.
select No, Id, Value,
       case when Value = 0 or nv = 0
            then 0
            else rc - case when Min(Value) over(partition by Id) = 0 then 1 else 0 end
       end as Running_Count
from (
    select *,
           Sum(case when Value = 0 then 0 else 1 end) over(partition by Id order by No) as rc,
           Lead(Value) over(partition by Id order by No) as nv
    from t
) t;
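As a quick check of the query above, the sketch below loads the input table from the question into an in-memory SQLite database (an assumption made for portability; the question does not name a database) and prints the computed running count. The column No is double-quoted only as a precaution, since NO is a keyword in SQLite.

```python
import sqlite3

# Input rows from the question: (No, Id, Value)
rows = [
    (1, 100, 1), (2, 100, 0), (3, 100, 1), (4, 100, 2),
    (1, 101, 1), (2, 101, 2),
    (1, 102, 0), (2, 102, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute('create table t ("No" integer, Id integer, Value integer)')
conn.executemany("insert into t values (?, ?, ?)", rows)

query = """
select "No", Id, Value,
       case when Value = 0 or nv = 0
            then 0
            -- subtract 1 when the Id block contains a 0, i.e. the count was reset
            else rc - case when min(Value) over (partition by Id) = 0 then 1 else 0 end
       end as Running_Count
from (select *,
             -- running count of non-zero values within each Id
             sum(case when Value = 0 then 0 else 1 end)
                 over (partition by Id order by "No") as rc,
             -- value of the next row, used to zero out the row just before a reset
             lead(Value) over (partition by Id order by "No") as nv
      from t) as s
order by Id, "No"
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On this input the Running_Count column matches the one in the expected result from the question; as the answer notes, other edge cases (for example several zeros in one Id) are not covered by the sample.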

Number of Rows Between Polarity Changes SQL

I want to count the number of rows between polarity changes grouped by id in SQL. I'm thinking that there may be a clever way to use window functions to get the job done but I don't know what it is.
Consider data like this:
id  polarity  date
1   0         12/1
1   1         12/2
1   0         12/3
1   0         12/4
1   1         12/5
2   0         12/1
2   0         12/2
2   0         12/3
2   1         12/4
2   0         12/5
2   0         12/6
2   0         12/7
2   1         12/8
Is there a way to count the number of rows between each change in polarity to get something like this:
id  n
1   1
1   2
2   3
2   3

You can do:
select id, count(*) as n
from (
    select *,
           sum(i) over(partition by id order by date) as g
    from (
        select *,
               case when polarity <> lag(polarity) over(partition by id order by date)
                    then 1 else 0 end as i
        from t
    ) x
) y
group by id, g
having max(polarity) = 0;
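The query works in three steps: flag each row whose polarity differs from the previous row (i), turn the flags into a running group number (g), then count the rows per group and keep only the groups of zeros. A minimal sketch of this on the sample data, assuming an in-memory SQLite database and the question's date strings as-is (they sort correctly as text here only because every day is a single digit):

```python
import sqlite3

# Sample rows from the question: (id, polarity, date)
rows = [
    (1, 0, "12/1"), (1, 1, "12/2"), (1, 0, "12/3"), (1, 0, "12/4"), (1, 1, "12/5"),
    (2, 0, "12/1"), (2, 0, "12/2"), (2, 0, "12/3"), (2, 1, "12/4"),
    (2, 0, "12/5"), (2, 0, "12/6"), (2, 0, "12/7"), (2, 1, "12/8"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, polarity integer, date text)")
conn.executemany("insert into t values (?, ?, ?)", rows)

query = """
select id, count(*) as n
from (select *,
             -- running total of change flags = island number within each id
             sum(i) over (partition by id order by date) as g
      from (select *,
                   -- 1 when polarity differs from the previous row, else 0
                   case when polarity <> lag(polarity) over (partition by id order by date)
                        then 1 else 0 end as i
            from t) x) y
group by id, g
having max(polarity) = 0   -- keep only the runs of zeros
order by id, g
"""
result = conn.execute(query).fetchall()
print(result)
```

The first row of each id has no previous row, so lag returns NULL, the comparison is not true, and the flag falls through to 0.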

How to select an AlgoKey when any row in the [UserSelect] column has the value 1, otherwise fall back to the [SystemSelect] column having the value 1

I have a table with Scenario, Product, AlgoKey, User Select, and System Select columns. I have to select the AlgoKey for each scenario: the first priority goes to the user-selected row, otherwise the system-selected row is used.
I have shared my input and expected output below. Could you please help me write a query for this?
Scenario  Product  AlgoKey  UserSelect  SystemSelect
1         P101     1        0           1
1         P102     2        1           0
2         P101     1        0           1
2         P102     2        0           0
3         P101     1        1           1
3         P102     2        0           0
4         P101     1        0           0
4         P102     2        0           1
Output:
Scenario  AlgoKey  Columnselected
1         2        User
2         1        System
3         1        User
4         2        System

Here is how you can do it:
select Scenario, AlgoKey,
       case when UserSelect = 1 then 'User' else 'System' end as Columnselected
from (
    -- one row per scenario: user-selected rows sort first, then system-selected rows
    select *,
           ROW_NUMBER() over (partition by Scenario
                              order by UserSelect desc, SystemSelect desc) as rn
    from tableName
) t
where rn = 1;
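The expected output has exactly one row per scenario, so the sketch below partitions by Scenario alone and lets the ordering pick the user-selected row before the system-selected one. It runs against the sample input in an in-memory SQLite database; the database choice and the single-word column names UserSelect and SystemSelect are assumptions made for the example.

```python
import sqlite3

# Sample rows from the question: (Scenario, Product, AlgoKey, UserSelect, SystemSelect)
rows = [
    (1, "P101", 1, 0, 1), (1, "P102", 2, 1, 0),
    (2, "P101", 1, 0, 1), (2, "P102", 2, 0, 0),
    (3, "P101", 1, 1, 1), (3, "P102", 2, 0, 0),
    (4, "P101", 1, 0, 0), (4, "P102", 2, 0, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tableName (Scenario integer, Product text, AlgoKey integer, "
    "UserSelect integer, SystemSelect integer)"
)
conn.executemany("insert into tableName values (?, ?, ?, ?, ?)", rows)

query = """
select Scenario, AlgoKey,
       case when UserSelect = 1 then 'User' else 'System' end as Columnselected
from (select *,
             -- rank rows within a scenario: user picks first, then system picks
             row_number() over (partition by Scenario
                                order by UserSelect desc, SystemSelect desc) as rn
      from tableName) as ranked
where rn = 1
order by Scenario
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If a scenario has neither flag set on any row, this sketch still returns one arbitrary row labelled 'System'; the sample data does not contain such a case.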

Number the rows and reset the counter back to 1 on certain condition

How can I reset a counter in SQL Server on a keyword? In the following data, every time the string 'A' is found, the counter needs to be reset to 1:
Item Date
A 01.01.2019
B 02.01.2019
C 03.01.2019
D 04.01.2019
A 05.01.2019
B 06.01.2019
A 07.01.2019
B 08.01.2019
C 09.01.2019
D 10.01.2019
E 11.01.2019
A 12.01.2019
A 13.01.2019
A 14.01.2019
B 15.01.2019
And I need to reset the counter every time A is found:
Count Item Date
1 A 01.01.2019
2 B 02.01.2019
3 C 03.01.2019
4 D 04.01.2019
1 A 05.01.2019
2 B 06.01.2019
1 A 07.01.2019
2 B 08.01.2019
3 C 09.01.2019
4 D 10.01.2019
5 E 11.01.2019
1 A 12.01.2019
1 A 13.01.2019
1 A 14.01.2019
2 B 15.01.2019

Something like:
WITH cte AS (
    SELECT *,
           COUNT(CASE WHEN Item = 'A' THEN 1 END) OVER (ORDER BY Date) AS GroupNum
    FROM t
)
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY GroupNum ORDER BY Date) AS [Count]
FROM cte
ORDER BY Date;
The cte assigns a running count to each row that increments whenever A is encountered. Rows are then assigned a ROW_NUMBER() within each value of this counter.
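The running-count-then-row-number approach can be checked on the sample data with a small sketch. It assumes an in-memory SQLite database rather than SQL Server, stores the dates in ISO format so they sort correctly as text, and names the output column row_count instead of [Count]; all three are choices made for this example only.

```python
import sqlite3

# Items from the question in date order, one per day from 2019-01-01 to 2019-01-15
items = "ABCDABABCDEAAAB"
rows = [(item, f"2019-01-{day:02d}") for day, item in enumerate(items, start=1)]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (item text, date text)")
conn.executemany("insert into t values (?, ?)", rows)

query = """
with cte as (
    select item, date,
           -- running number of 'A' rows seen so far: increments at every 'A'
           count(case when item = 'A' then 1 end) over (order by date) as group_num
    from t
)
select item, date,
       -- numbering restarts at 1 inside each group
       row_number() over (partition by group_num order by date) as row_count
from cte
order by date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

COUNT ignores the NULL produced by the CASE for non-A rows, which is why the group number only moves when an A appears.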

Oracle SQL - Returning row of first occurrence by value

I have data in the following format:
ID YRMTH EVENT
1 201201 0
1 201202 0
1 201203 1
1 201204 0
1 201205 0
2 201304 0
2 201305 0
2 201306 0
3 201301 0
3 201302 0
3 201303 0
3 201304 1
3 201305 0
I want to return one row per ID with YRMTH and EVENT. If an EVENT occurred (= 1), then I want the YRMTH when the event occurred and EVENT = 1. If an EVENT did not occur (= 0), then I want to return the last month listed in YRMTH and EVENT = 0.
In this example, I want the following output:
ID YRMTH EVENT
1 201203 1
2 201306 0
3 201304 1
Essentially, the intent of this query is to identify the first month when an event occurred if one occurred at all. If no event occurred, then I just want the last month where we have data.
I am thinking that I would need to use a window function (PARTITION BY), but I am open to any possible solutions.

You can do this with aggregations. Analytic functions are not needed:
select id,
       (case when max(event) = 1
             then min(case when event = 1 then yrmth end)  -- first month with an event
             else max(yrmth)                               -- no event: last month with data
        end) as yrmth,
       max(event) as event
from your_table t
group by id;
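A minimal sketch of the aggregation approach on the sample data, assuming an in-memory SQLite database in place of Oracle and the placeholder table name your_table:

```python
import sqlite3

# Sample rows from the question: (ID, YRMTH, EVENT)
rows = [
    (1, 201201, 0), (1, 201202, 0), (1, 201203, 1), (1, 201204, 0), (1, 201205, 0),
    (2, 201304, 0), (2, 201305, 0), (2, 201306, 0),
    (3, 201301, 0), (3, 201302, 0), (3, 201303, 0), (3, 201304, 1), (3, 201305, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (id integer, yrmth integer, event integer)")
conn.executemany("insert into your_table values (?, ?, ?)", rows)

query = """
select id,
       case when max(event) = 1
            then min(case when event = 1 then yrmth end)  -- first month with an event
            else max(yrmth)                               -- no event: last month with data
       end as yrmth,
       max(event) as event
from your_table
group by id
order by id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The inner CASE returns NULL for rows without an event, and MIN skips NULLs, so the minimum is taken over event months only.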

Try this:
with cte as
 (select id,
         case event
           when 1 then yrmth
           else max(yrmth) over (partition by id order by yrmth)
         end as yrmth,
         event,
         row_number() over
           (partition by id
            order by case event when 1 then 1 else 0 end desc,
                     case when event = 1 then yrmth end,  -- earliest month among event rows
                     yrmth desc) as rn                    -- otherwise the latest month
  from your_table)
select id, yrmth, event
from cte
where rn = 1;
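The sample data has at most one event per ID, so it does not show what happens when an event occurs twice. The sketch below is one way to make the row_number() approach return the first event month in that case: event rows are ordered by ascending yrmth, non-event rows by descending yrmth. It assumes an in-memory SQLite database in place of Oracle, and the row (1, 201206, 1) is a hypothetical addition that is not in the question.

```python
import sqlite3

# Sample rows from the question: (ID, YRMTH, EVENT)
rows = [
    (1, 201201, 0), (1, 201202, 0), (1, 201203, 1), (1, 201204, 0), (1, 201205, 0),
    (2, 201304, 0), (2, 201305, 0), (2, 201306, 0),
    (3, 201301, 0), (3, 201302, 0), (3, 201303, 0), (3, 201304, 1), (3, 201305, 0),
    (1, 201206, 1),  # hypothetical second event for ID 1, added for illustration
]

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (id integer, yrmth integer, event integer)")
conn.executemany("insert into your_table values (?, ?, ?)", rows)

query = """
with cte as (
    select id, yrmth, event,
           row_number() over (
               partition by id
               order by event desc,                          -- event rows first
                        case when event = 1 then yrmth end,  -- earliest event month
                        yrmth desc                           -- else latest month
           ) as rn
    from your_table
)
select id, yrmth, event
from cte
where rn = 1
order by id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Ordering event rows by yrmth descending instead would pick 201206 for ID 1, the last event rather than the first.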
