Oracle- logically give each set of rows a group based on a value from ordered list - sql

I have this SQL Fiddle.
When ordered by the sequence_number field, these records need to be grouped and given a row_number based on the following logic:
All records up to, but not including, the next record with a line type of 0 are part of the same group.
For example, from the provided SQL Fiddle, sequence numbers 0, 1 and 2 are part of the same group, and sequence numbers 3 and 4 are part of another group. Basically, all rows up to the next 0 line type are part of a single group. The data I am trying to return will look like:
GROUP  LINE_TYPE  SEQUENCE_NUMBER  PRODUCT
------------------------------------------------
1      0          0                REM322
1      6          1                Discount
1      7          2                Loyalty Discount
2      0          3                RGM32
2      6          4                Discount
Another way to word what I am after: when ordered by the sequence number, the group number changes whenever it hits a 0 line type.
I've been racking my brain trying to think how to do this using partitions/lags and even self joins but am having trouble.
Any help appreciated.

Set a flag to 1 if line_type is 0 (and to 0 otherwise), then calculate the running sum of that flag, using SUM as an analytic function ordered by sequence_number.
select sum(case when line_type = 0 then 1
else 0 end
) over (order by sequence_number) as grp,
line_type,
sequence_number,
product
from ret_trand
order by sequence_number;
Demo.
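To see the running sum at work without an Oracle instance, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) is swapped in here purely as an assumption for illustration; the sample rows are the ones shown in the question, and the helper name `running_sum_groups` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (line_type, sequence_number, product).
RET_TRAND_ROWS = [
    (0, 0, "REM322"),
    (6, 1, "Discount"),
    (7, 2, "Loyalty Discount"),
    (0, 3, "RGM32"),
    (6, 4, "Discount"),
]

# Same logic as the answer: flag each line_type = 0 row with 1, then take
# a running sum of the flags in sequence_number order.
RUNNING_SUM_QUERY = """
SELECT SUM(CASE WHEN line_type = 0 THEN 1 ELSE 0 END)
           OVER (ORDER BY sequence_number) AS grp,
       line_type,
       sequence_number,
       product
FROM ret_trand
ORDER BY sequence_number
"""


def running_sum_groups(rows):
    """Load rows into an in-memory table and return the grouped result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ret_trand ("
        "line_type INTEGER, sequence_number INTEGER, product TEXT)"
    )
    conn.executemany("INSERT INTO ret_trand VALUES (?, ?, ?)", rows)
    result = conn.execute(RUNNING_SUM_QUERY).fetchall()
    conn.close()
    return result


for row in running_sum_groups(RET_TRAND_ROWS):
    print(row)
```

Each row with line_type 0 bumps the running total by one, so every row inherits the count of zeros seen so far, which is exactly the group number.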

Another way of doing the grouping is using a hierarchical query and CONNECT_BY_ROOT:
SELECT CONNECT_BY_ROOT sequence_number AS first_in_sequence,
line_type,
sequence_number,
product
FROM ret_trand
START WITH
line_type = 0
CONNECT BY
( sequence_number - 1 = PRIOR sequence_number
AND line_type <> 0)
ORDER SIBLINGS BY
sequence_number;
SQLFIDDLE
This will identify the groups by the initial sequence number of the group.
If you want to change this to a sequential ranking for the groups then you can use DENSE_RANK to do this:
WITH first_in_sequences AS
(
SELECT CONNECT_BY_ROOT sequence_number AS first_in_sequence,
line_type,
sequence_number,
product
FROM ret_trand
START WITH
line_type = 0
CONNECT BY
( sequence_number - 1 = PRIOR sequence_number
AND line_type <> 0)
ORDER SIBLINGS BY
sequence_number
)
SELECT DENSE_RANK() OVER ( ORDER BY first_in_sequence ) AS "group",
line_type,
sequence_number,
product
FROM first_in_sequences;
SQLFIDDLE
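CONNECT BY and CONNECT_BY_ROOT are Oracle-specific, so the sketch below does not run the query above. Instead it emulates the same idea with a recursive CTE in SQLite (an assumption made only so the example is runnable from Python): start a chain at every line_type = 0 row, extend it to the next sequence number while the line type is not 0, and carry the root's sequence number along. The CTE name `chain` and the helper name `hierarchical_groups` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (line_type, sequence_number, product).
RET_TRAND_ROWS = [
    (0, 0, "REM322"),
    (6, 1, "Discount"),
    (7, 2, "Loyalty Discount"),
    (0, 3, "RGM32"),
    (6, 4, "Discount"),
]

# Recursive-CTE stand-in for START WITH / CONNECT BY / CONNECT_BY_ROOT.
HIERARCHY_QUERY = """
WITH RECURSIVE chain(first_in_sequence, line_type, sequence_number, product) AS (
    -- Anchor: every line_type = 0 row is the root of its own group.
    SELECT sequence_number, line_type, sequence_number, product
    FROM ret_trand
    WHERE line_type = 0
    UNION ALL
    -- Step: attach the next sequence number unless it starts a new group.
    SELECT c.first_in_sequence, r.line_type, r.sequence_number, r.product
    FROM ret_trand AS r
    JOIN chain AS c ON r.sequence_number - 1 = c.sequence_number
    WHERE r.line_type <> 0
)
SELECT DENSE_RANK() OVER (ORDER BY first_in_sequence) AS grp,
       first_in_sequence,
       sequence_number,
       product
FROM chain
ORDER BY sequence_number
"""


def hierarchical_groups(rows):
    """Return (grp, first_in_sequence, sequence_number, product) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ret_trand ("
        "line_type INTEGER, sequence_number INTEGER, product TEXT)"
    )
    conn.executemany("INSERT INTO ret_trand VALUES (?, ?, ?)", rows)
    result = conn.execute(HIERARCHY_QUERY).fetchall()
    conn.close()
    return result


for row in hierarchical_groups(RET_TRAND_ROWS):
    print(row)
```

Like the CONNECT BY version, this relies on sequence numbers being consecutive within a group; a gap in the numbering would cut the chain short.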

Related

sql how to assign the same ID for the same group

I have a dataset as this:
ID SESSION DATE
1 A 2021/1/1
1 A 2021/1/2
1 B 2021/1/3
1 B 2021/1/4
1 A 2021/1/5
1 A 2021/1/6
What I want to create is a GROUP column that assigns the same number to consecutive rows where the ID column and the SESSION column are the same, as below:
ID SESSION DATE GROUP
1 A 2021/1/1 1
1 A 2021/1/2 1
1 B 2021/1/3 2
1 B 2021/1/4 2
1 A 2021/1/5 3
1 A 2021/1/6 3
Does anyone know how to do this in SQL in an efficient way because I have about 5 billion rows? Thank you in advance!
You have a kind of gaps and islands problem. You can create your groupings by counting each time the session changes, using lag, like so:
select Id, Session, Date,
Sum(case when session = prevSession then 0 else 1 end) over(partition by Id order by date) "Group"
from (
select *,
Lag(Session) over(partition by Id order by date) prevSession
from t
)t;
The example Fiddle uses MySQL, but this is ANSI SQL that should work in most DBMSs that support window functions.
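As a quick check of the lag-and-sum approach, the sketch below runs the same query on the six sample rows from the question, using an in-memory SQLite database from Python. SQLite and the ISO-formatted date strings are assumptions chosen for illustration (ISO text sorts in date order); the helper name `session_groups` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (Id, Session, Date).
# Dates are rewritten in ISO format (an assumption) so text order is date order.
SESSION_ROWS = [
    (1, "A", "2021-01-01"),
    (1, "A", "2021-01-02"),
    (1, "B", "2021-01-03"),
    (1, "B", "2021-01-04"),
    (1, "A", "2021-01-05"),
    (1, "A", "2021-01-06"),
]

# Inner query: fetch the previous row's session with LAG.
# Outer query: add 1 to a running sum whenever the session differs from it.
SESSION_QUERY = """
SELECT Id, Session, Date,
       SUM(CASE WHEN Session = prevSession THEN 0 ELSE 1 END)
           OVER (PARTITION BY Id ORDER BY Date) AS "Group"
FROM (
    SELECT *,
           LAG(Session) OVER (PARTITION BY Id ORDER BY Date) AS prevSession
    FROM t
) AS t
ORDER BY Id, Date
"""


def session_groups(rows):
    """Return (Id, Session, Date, Group) tuples for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (Id INTEGER, Session TEXT, Date TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(SESSION_QUERY).fetchall()
    conn.close()
    return result


for row in session_groups(SESSION_ROWS):
    print(row)
```

The first row of each Id has no previous session, so the comparison yields NULL and falls through to the ELSE branch, which starts the count at 1. Note that the second run of session A gets a new group number rather than reusing 1, which is what the question asks for.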

SQL Query getting the latest record of the Group and calculate the value of those particular records

I have the following table (just a sample) and would like to get the difference in Points between Record 2 and Record 1 (Record2 - Record1), taken from the latest record of each. Records are entered per Match: one Match consists of two records, Record 1 and Record 2.
The output here would be 3, as the newest records are IDs 3 and 4, from Match 2.
ID  Name      Points  TimeRecorded  Match
1   Record 1  3       2-Mar 2pm     1
2   Record 2  5       2-Mar 2pm     1
3   Record 1  5       4-Mar 5pm     2
4   Record 2  8       4-Mar 5pm     2
I tried to get the value by subtracting the results of two queries, as below. But I feel this is not a good way, as the Match and the Name of the record are hard-coded. How can I construct a better query that gets the latest records of the grouped match and calculates the points by subtracting Record1 from Record2?
SELECT
(select Points from RunRecord where Name= 'Record2' AND Match = 2)
- (select Points from RunRecord where Name= 'Record1' AND Match = 2)
You could use:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY TimeRecorded DESC) rn
FROM yourTable
)
SELECT
MAX(CASE WHEN Name = 'Record 2' THEN Points END) -
MAX(CASE WHEN Name = 'Record 1' THEN Points END) AS diff
FROM cte
WHERE rn = 1;
The CTE assigns a row number for each group of records of the same name, with 1 being assigned to the most recent record. Then, we aggregate over the entire table and pivot out the points to find the difference.
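The CTE approach can be tried out on the question's four sample rows with the sketch below, which uses an in-memory SQLite database from Python. Several things here are assumptions made for illustration: SQLite itself, the table name RunRecord standing in for yourTable, the quoting of the Match column, and the year in the timestamps (the question only gives "2-Mar 2pm" and "4-Mar 5pm"). The helper name `latest_points_diff` is hypothetical.

```python
import sqlite3

# Rows from the question: (ID, Name, Points, TimeRecorded, Match).
# The year 2024 is an arbitrary assumption; only the ordering matters.
RUN_RECORDS = [
    (1, "Record 1", 3, "2024-03-02 14:00", 1),
    (2, "Record 2", 5, "2024-03-02 14:00", 1),
    (3, "Record 1", 5, "2024-03-04 17:00", 2),
    (4, "Record 2", 8, "2024-03-04 17:00", 2),
]

# rn = 1 marks the most recent row for each Name; the outer query then
# pivots those two rows into a single difference.
LATEST_DIFF_QUERY = """
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY TimeRecorded DESC) AS rn
    FROM RunRecord
)
SELECT MAX(CASE WHEN Name = 'Record 2' THEN Points END) -
       MAX(CASE WHEN Name = 'Record 1' THEN Points END) AS diff
FROM cte
WHERE rn = 1
"""


def latest_points_diff(rows):
    """Return Record 2 minus Record 1 points, using each name's latest row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE RunRecord ("
        'ID INTEGER, Name TEXT, Points INTEGER, TimeRecorded TEXT, "Match" INTEGER)'
    )
    conn.executemany("INSERT INTO RunRecord VALUES (?, ?, ?, ?, ?)", rows)
    (diff,) = conn.execute(LATEST_DIFF_QUERY).fetchone()
    conn.close()
    return diff


print(latest_points_diff(RUN_RECORDS))
```

One caveat worth keeping in mind: the query picks the latest row per Name independently, so it gives the intended answer only when the latest Record 1 and the latest Record 2 belong to the same Match, as they do in the sample.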
You can use the rank() window function to rank the records by match descending. Then take the top of the ranked records and use conditional aggregation to control the sign of the points added.
SELECT sum(CASE x.name
WHEN 'Record2' THEN
x.points
WHEN 'Record1' THEN
-x.points
END)
FROM (SELECT rr.name,
rr.points,
rank() OVER (ORDER BY rr.match DESC) r
FROM runrecord rr
WHERE name IN ('Record1',
'Record2')) x
WHERE x.r = 1;

Using IF or Case with multiple in SQL Statement

I want to do something like the following. This query works:
Select ID, number, cost from table order by number
A number can appear two or more times, but the cost is the same each time:
1 A33 66.50
2 A34 73.50
3 A34 73.50
But I want to have
1 A33 66.50
2 A34 73.50
3 A34 0
I want to change the repeated cost to 0 in the SQL.
I tried distinct and if/then/else.
I want to do something like this:
declare #oldcost int;
Select ID, number,
if(cost=#oldcost) then
cost=0;
else
cost=cost;
end if
#oldcost=cost;
from table order by number
How can I do it in SQL?
You can use window functions and a case expression:
select ID, number,
(case when row_number() over (partition by number order by id) = 1
then cost else 0
end) as cost
from table
order by number, id;
Note that SQL generally does not take ordering into account, so results can be returned in any order -- and even with an order by, rows with the same keys can be in any order (and in different orders on different executions).
Hence, the order by includes id as well as number so you get the cost on the "first" row for each number.
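Here is a minimal sketch of that query run on the three sample rows from the question, using an in-memory SQLite database from Python. SQLite is an assumption for illustration, and because `table` is a reserved word the sketch uses a hypothetical table name, `items`; the helper name `first_cost_only` is hypothetical as well.

```python
import sqlite3

# Sample rows from the question: (id, number, cost).
ITEM_ROWS = [
    (1, "A33", 66.50),
    (2, "A34", 73.50),
    (3, "A34", 73.50),
]

# Keep the cost on the first row (lowest id) of each number, 0 on the rest.
FIRST_COST_QUERY = """
SELECT id, number,
       (CASE WHEN ROW_NUMBER() OVER (PARTITION BY number ORDER BY id) = 1
             THEN cost ELSE 0
        END) AS cost
FROM items
ORDER BY number, id
"""


def first_cost_only(rows):
    """Return (id, number, cost) with repeated costs replaced by 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, number TEXT, cost REAL)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    result = conn.execute(FIRST_COST_QUERY).fetchall()
    conn.close()
    return result


for row in first_cost_only(ITEM_ROWS):
    print(row)
```

Partitioning by number restarts the row numbering for every distinct number, so no variable holding the "previous cost" is needed.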

Oracle SQL - select last 3 rows after a specific row

Below is my data:
My requirement is to get the first 3 consecutive approvals. So from above data, ID 4, 5 and 6 are the rows that I need to select. ID 1 and 2 are not eligible, because ID 3 is a rejection and hence breaks the consecutive condition of actions. Basically, I am looking for the last rejection in the list and then finding the 3 consecutive approvals after that.
Also, if there are no rejections in the chain of actions then the first 3 actions should be the result. For below data:
So my output should be ID 11, 12 and 13.
And if there are less than 3 approvals, then the output should be the list of approvals. For below data:
output should be ID 21 and 22.
Is there any way to achieve this with SQL query only - i.e. no PL-SQL code?
Here is one method that uses window functions:
Find the first row where there are three approvals.
Find the minimum action_at among the rows with three approvals
Filter
Keep the three rows you want
This version uses fetch which is in Oracle 12+:
select t.*
from (select t.*,
min(case when has_approval_3 = 3 then action_at end) over () as first_action_at
from (select t.*,
sum(case when action = 'APPROVAL' then 1 else 0 end) over (order by action_at rows between current row and 2 following) as has_approval_3
from t
) t
) t
where action = 'APPROVAL' and
(action_at >= first_action_at or first_action_at is null)
order by action_at
fetch first 3 rows only;
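The steps above can be sketched against an in-memory SQLite database from Python. The question's data was not preserved, so every row below is hypothetical, built only to match the three scenarios the question describes; the action values 'APPROVAL' and 'REJECTION', the `actions` table name, and the helper name `first_three_approvals` are assumptions too. SQLite has no FETCH FIRST clause, so the sketch uses LIMIT 3 in its place.

```python
import sqlite3

# Look ahead three rows (current + 2 following) and count approvals, find the
# earliest row whose look-ahead count is 3, then keep approvals from there on.
APPROVAL_QUERY = """
SELECT id
FROM (SELECT w.*,
             MIN(CASE WHEN has_approval_3 = 3 THEN action_at END)
                 OVER () AS first_action_at
      FROM (SELECT a.*,
                   SUM(CASE WHEN action = 'APPROVAL' THEN 1 ELSE 0 END)
                       OVER (ORDER BY action_at
                             ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
                       AS has_approval_3
            FROM actions AS a
           ) AS w
     ) AS f
WHERE action = 'APPROVAL'
  AND (action_at >= first_action_at OR first_action_at IS NULL)
ORDER BY action_at
LIMIT 3
"""


def first_three_approvals(rows):
    """rows are (id, action, action_at) tuples; returns the selected ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE actions (id INTEGER, action TEXT, action_at TEXT)")
    conn.executemany("INSERT INTO actions VALUES (?, ?, ?)", rows)
    ids = [r[0] for r in conn.execute(APPROVAL_QUERY)]
    conn.close()
    return ids


# Hypothetical data: a rejection at id 3 breaks the first run of approvals.
WITH_REJECTION = [
    (1, "APPROVAL", "2020-01-01"),
    (2, "APPROVAL", "2020-01-02"),
    (3, "REJECTION", "2020-01-03"),
    (4, "APPROVAL", "2020-01-04"),
    (5, "APPROVAL", "2020-01-05"),
    (6, "APPROVAL", "2020-01-06"),
    (7, "APPROVAL", "2020-01-07"),
]
# Hypothetical data: no rejections at all.
NO_REJECTION = [(i, "APPROVAL", f"2020-02-{i:02d}") for i in range(11, 15)]
# Hypothetical data: fewer than three approvals.
FEW_APPROVALS = [(21, "APPROVAL", "2020-03-01"), (22, "APPROVAL", "2020-03-02")]

print(first_three_approvals(WITH_REJECTION))
print(first_three_approvals(NO_REJECTION))
print(first_three_approvals(FEW_APPROVALS))
```

When no row ever has three approvals in its look-ahead window, first_action_at is NULL and the filter falls back to returning whatever approvals exist.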
You can use IN and ROW_NUMBER analytical function as following:
SELECT * FROM
( SELECT
Y.*,
ROW_NUMBER() OVER(ORDER BY Y.ACTION_AT) AS RN
FROM YOUR_TABLE Y
WHERE Y.ACTION = 'APPROVE'
AND Y.ACTION_AT >= COALESCE(
(SELECT MAX(YIN.ACTION_AT)
FROM YOUR_TABLE YIN
WHERE YIN.ACTION = 'REJECT'
), Y.ACTION_AT) )
WHERE RN <= 3;
Cheers!!

Find Range in Sequence

I have a table, #NumberRange, with a start and an end number in each row. I have to find out which ranges are in sequence.
Declare #NumberRange table
(
Id int primary key,
ItemId int,
[start] int,
[end] int
)
INSERT INTO #NumberRange
VALUES
(1,1,1,10),
(2,1,11,20),
(3,1,21,30),
(4,1,40,50),
(5,1,51,60),
(6,1,61,70),
(7,1,80,90),
(8,1,100,200)
Expected Result:
Note: The Result column is calculated from whether the ranges are continuous, i.e. 1-10, 11-20 and 21-30 are continuous numbers, so the Result column is 1 for those rows. The range 40-50 is not continuous with them (the previous row ends with 30 and the next row starts with 40), which is why the Result column becomes 2, and so on.
The 4th row ends with 50 and the 5th starts with 51, so they are continuous; the result there would be 3, because I have to differentiate it from Result 1.
I have used the lead function but did not get the expected result. Can someone please help me get it?
Workaround:
select
*,
[Diff] = [Lead] - [end],
[Result] = Rank() OVER (PARTITION BY ([Lead] - [end]) ORDER BY Id)
from
(select
id, [start], [end], LEAD([start]) over (order by id) as [Lead]
from
#NumberRange) Z
order by
id
Use lag() to determine where the groups start. Then a cumulative sum to enumerate them:
select nr.*,
sum(case when startr = prev_endr + 1 then 0 else 1 end) over (partition by itemid order by startr) as grp
from (select nr.*, lag(endr) over (partition by itemid order by startr) as prev_endr
from numberrange nr
) nr;
Here is a db<>fiddle.
This answer assumes that ids 4 and 5 are continuous, which makes sense based on the rest of the question.
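To make the lag-plus-cumulative-sum idea concrete, here is a minimal sketch that runs the same query on the eight sample rows from the question, using an in-memory SQLite database from Python. SQLite is an assumption for illustration; the table and column names (numberrange, startr, endr) follow the answer's query rather than the question's declaration, and the helper name `range_groups` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, itemid, startr, endr).
RANGE_ROWS = [
    (1, 1, 1, 10),
    (2, 1, 11, 20),
    (3, 1, 21, 30),
    (4, 1, 40, 50),
    (5, 1, 51, 60),
    (6, 1, 61, 70),
    (7, 1, 80, 90),
    (8, 1, 100, 200),
]

# A row starts a new group unless it begins exactly one past the previous end.
RANGE_QUERY = """
SELECT nr.id, nr.startr, nr.endr,
       SUM(CASE WHEN startr = prev_endr + 1 THEN 0 ELSE 1 END)
           OVER (PARTITION BY itemid ORDER BY startr) AS grp
FROM (SELECT nr.*,
             LAG(endr) OVER (PARTITION BY itemid ORDER BY startr) AS prev_endr
      FROM numberrange AS nr
     ) AS nr
ORDER BY nr.itemid, nr.startr
"""


def range_groups(rows):
    """Return (id, startr, endr, grp) tuples for the given ranges."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE numberrange ("
        "id INTEGER PRIMARY KEY, itemid INTEGER, startr INTEGER, endr INTEGER)"
    )
    conn.executemany("INSERT INTO numberrange VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(RANGE_QUERY).fetchall()
    conn.close()
    return result


for row in range_groups(RANGE_ROWS):
    print(row)
```

With this data, ids 1 to 3 share a group, ids 4 to 6 share the next one, and ids 7 and 8 each get their own, since 80 does not follow 70 and 100 does not follow 90.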
Your expected result is not clear, and I have the same questions that were asked in the comments, but I think what you want to do is something similar to:
select N1.*,
case when N1.[end] + 1 = N2.[start] then 1 else 2 end Result
from #NumberRange N1
inner join #NumberRange N2 on N1.Id = N2.Id - 1