create sequence of numbers on grouped column in Oracle - sql

Consider below table with column a,b,c.
a b c
3 4 5
3 4 5
6 4 1
1 1 8
1 1 8
1 1 0
1 1 0
I need a select statement to get below output. i.e. increment column 'rn' based on group of column a,b,c.
a b c rn
3 4 5 1
3 4 5 1
6 4 1 2
1 1 8 3
1 1 8 3
1 1 0 4
1 1 0 4

You can use the DENSE_RANK analytic function to get a unique ID for each combination of A, B, and C. Just note that if a new value is inserted into the table, the IDs of each combination of A, B, and C will shift and may not be the same.
Query
WITH
my_table (a, b, c)
AS
(SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 6, 4, 1 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL)
SELECT t.*, DENSE_RANK () OVER (ORDER BY b desc, c desc, a) as rn
FROM my_table t;
Result
A B C RN
____ ____ ____ _____
3 4 5 1
3 4 5 1
6 4 1 2
1 1 8 3
1 1 8 3
1 1 0 4
1 1 0 4

As a starter: for your answer to make sense at all, you need a column that defines the ordering of the rows. Let me assume that you have such column, called id.
Then, you can use window functions:
select a, b, c,
sum(case when a = lag_a and b = lag_b and c = lag_c then 0 else 1 end) over(order by id) rn
from (
select t.*,
lag(a) over(order by id) lag_a,
lag(b) over(order by id) lag_b,
lag(c) over(order by id) lag_c
from mytable t
) t

Assuming you have some way of ordering your rows, then you can use MATCH_RECOGNIZE:
SELECT a, b, c, rn
FROM table_name
MATCH_RECOGNIZE (
ORDER BY id
MEASURES MATCH_NUMBER() AS rn
ALL ROWS PER MATCH
PATTERN ( FIRST_ROW EQUAL_ROWS* )
DEFINE EQUAL_ROWS AS (
EQUAL_ROWS.a = PREV( EQUAL_ROWS.a )
AND EQUAL_ROWS.b = PREV( EQUAL_ROWS.b )
AND EQUAL_ROWS.c = PREV( EQUAL_ROWS.c )
)
)
So, for your test data:
CREATE TABLE table_name ( id, a, b, c ) AS
SELECT 1, 3, 4, 5 FROM DUAL UNION ALL
SELECT 2, 3, 4, 5 FROM DUAL UNION ALL
SELECT 3, 6, 4, 1 FROM DUAL UNION ALL
SELECT 4, 1, 1, 8 FROM DUAL UNION ALL
SELECT 5, 1, 1, 8 FROM DUAL UNION ALL
SELECT 6, 1, 1, 0 FROM DUAL UNION ALL
SELECT 7, 1, 1, 0 FROM DUAL;
Outputs:
A | B | C | RN
-: | -: | -: | -:
3 | 4 | 5 | 1
3 | 4 | 5 | 1
6 | 4 | 1 | 2
1 | 1 | 8 | 3
1 | 1 | 8 | 3
1 | 1 | 0 | 4
1 | 1 | 0 | 4
db<>fiddle here

It can also be done without any ordering, by getting the distinct groups and numbering each group. Borrowing the first part from EJ Egjed:
WITH my_table (a, b, c) AS
(SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 6, 4, 1 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL)
, groups as (select distinct a, b, c
from my_table)
, groupnums as (select rownum as num, a, b, c
from groups)
select a, b, c, num
from my_table join groupnums using(a,b,c);

Related

Oracle add sequential query rows if missing

In Oracle 19c I have a data like:
with t as (
select 1 Cat, 1 id, 11 val from dual
union all
select 1, 3, 33 from dual
union all
select 2, 2, 22 from dual
union all
select 2, 4, 44 from dual)
select *
from t
In query result I want to get 4 rows per every cat with ids 1-4 and if there was no such id in that cat a val must be null:
cat
id
val
1
1
11
1
2
1
3
33
1
4
2
1
2
2
22
2
3
2
4
44
Use a PARTITIONed join with a row-generator:
SELECT t.cat,
i.id,
t.val
FROM (SELECT LEVEL AS id FROM DUAL CONNECT BY LEVEL <= 4) i
LEFT OUTER JOIN t
PARTITION BY (t.cat)
ON (i.id = t.id)
Which outputs:
CAT
ID
VAL
1
1
11
1
2
null
1
3
33
1
4
null
2
1
null
2
2
22
2
3
null
2
4
44
db<>fiddle here

Oracle - generate a running number by group

I need to generate a running number / group sequence inside a select statement for a group of data.
For example
Group Name Sequence
1 a 1
1 b 2
1 c 3
2 d 1
2 e 2
2 f 3
So for each group the sequence should be a running number starting with 1 depending on the order of column"Name".
I already pleayed around with Row_Number() and Level but I couldn't get a solution.
Any idea how to do it?
Analytic functions help.
SQL> with test (cgroup, name) as
2 (select 1, 'a' from dual union all
3 select 1, 'b' from dual union all
4 select 1, 'c' from dual union all
5 select 2, 'd' from dual union all
6 select 2, 'e' from dual union all
7 select 2, 'f' from dual
8 )
9 select cgroup,
10 name,
11 row_number() over (partition by cgroup order by name) sequence
12 from test
13 order by cgroup, name;
CGROUP N SEQUENCE
---------- - ----------
1 a 1
1 b 2
1 c 3
2 d 1
2 e 2
2 f 3
6 rows selected.
SQL>
Try this
SELECT
"Group",
Name,
DENSE_RANK() OVER (PARTITION BY "Group" ORDER BY Name) AS Sequence
FROM table;

Gradually aggregating a string column in Oracle SQL

I would like to gradually aggregate a string column in Oracle sql.
From this table:
col_1 | col_2
-------------
1 | A
1 | B
1 | C
2 | C
2 | D
to:
col_1 | col_2
-------------
1 | A
1 | A,B
1 | A,B,C
2 | C
2 | C,D
I tried LISTAGG but it won't return all rows due to group by. I have about 2 million rows in the table.
Oracle doesn't support accumulating string concatenation with a single listagg() expression. However, you can use a subquery.
Just one note: SQL tables represent unordered sets. You seem to have an ordering in mind. The following code adds an ordering column:
with t as (
select 1 as id, 1 as x, 'A' as y from dual union all
select 2, 1 as x, 'B' as y from dual union all
select 3, 1 as x, 'C' as y from dual union all
select 4, 2 as x, 'C' as y from dual union all
select 5, 2 as x, 'D' as y from dual
)
select t.*,
(select listagg(t2.y, ',') within group (order by t2.id)
from t t2
where t2.x = t.x and t2.id <= t.id
)
from t;
A hierarchical query option looks like this:
SQL> with t as (
2 select 1 as id, 1 as x, 'A' as y from dual union all
3 select 2, 1 as x, 'B' as y from dual union all
4 select 3, 1 as x, 'C' as y from dual union all
5 select 4, 2 as x, 'C' as y from dual union all
6 select 5, 2 as x, 'D' as y from dual
7 )
8 select x,
9 ltrim(sys_connect_by_path(y, ','), ',') result
10 from (select x,
11 y,
12 row_number() over (partition by x order by y) rn
13 from t
14 )
15 start with rn = 1
16 connect by prior rn = rn - 1 and prior x = x;
X RESULT
---------- --------------------
1 A
1 A,B
1 A,B,C
2 C
2 C,D
SQL>

BigQuery: select the nth smallest value in window, ordered by another value

My table has two integer columns: a and b. For each row, I want to select the nth smallest value of b among the rows with smaller a values. Here's a sample input/output, with n=2.
Input:
a | b
-------
1 | 4
2 | 2
3 | 5
4 | 3
5 | 9
6 | 1
7 | 7
8 | 6
9 | 0
Output:
a | 2th min b
-------------
1 | null ← only 1 element in [4], no 2nd min
2 | 4 ← 2nd min between [4,2]
3 | 4 ← 2nd min between [4,2,5]
4 | 3 ← 2nd min between [4,2,5,3]
5 | 3 ← etc.
6 | 2
7 | 2
8 | 2
9 | 1
I used n=2 here to keep it simple, but in practice, I want the 2000th smallest value (or some other large-ish constant). The column a can be assumed to contain distinct integers (and even 1, 2, 3, … if that's easier).
The problem is that if I use ORDER BY b in my window clause and NTH_VALUE, it just computes the answer on the wrong set of values:
WITH data AS (
SELECT 1 AS a, 4 AS b
UNION ALL SELECT 2 AS a, 2 AS b
UNION ALL SELECT 3 AS a, 5 AS b
UNION ALL SELECT 4 AS a, 3 AS b
UNION ALL SELECT 5 AS a, 9 AS b
UNION ALL SELECT 6 AS a, 1 AS b
)
SELECT nth_value(b, 2) over (order by a)
from data
returns [null, 2, 2, 2, 2, 2]: the values are ordered by a (so in the same order than they appear), so the value b=2 is always the one in second place. I want to order by a and then take the nth smallest value of b. Any idea how to write this in BigQuery (preferably Standard SQL)?
Below is for BigQuery Standard SQL and produces correct result for given example.
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 a, 4 b UNION ALL
SELECT 2, 2 UNION ALL
SELECT 3, 5 UNION ALL
SELECT 4, 3 UNION ALL
SELECT 5, 9 UNION ALL
SELECT 6, 1 UNION ALL
SELECT 7, 7 UNION ALL
SELECT 8, 6 UNION ALL
SELECT 9, 0
)
SELECT
a,
(SELECT b FROM
(SELECT b FROM UNNEST(c) b ORDER BY b LIMIT 2)
ORDER BY b DESC LIMIT 1
) b2
FROM (
SELECT a, IF(ARRAY_LENGTH(c) > 1, c, [NULL]) c
FROM (
SELECT a, ARRAY_AGG(b) OVER (ORDER BY a) c
FROM `project.dataset.table`
)
)
-- ORDER BY a
with expected result as below
Row a b2
1 1 null
2 2 4
3 3 4
4 4 3
5 5 3
6 6 2
7 7 2
8 8 2
9 9 1
Note: to make it work for 2000th element you might change 2 to 2000 in LIMIT 2
meantime, i can admit it looks a little ugly/messy to me and not sure about scalability but you can give it a shot
Quick Update
Below is a little less ugly looking version (same output of course)
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 a, 4 b UNION ALL
SELECT 2, 2 UNION ALL
SELECT 3, 5 UNION ALL
SELECT 4, 3 UNION ALL
SELECT 5, 9 UNION ALL
SELECT 6, 1 UNION ALL
SELECT 7, 7 UNION ALL
SELECT 8, 6 UNION ALL
SELECT 9, 0
)
SELECT a, c[SAFE_ORDINAL(2)] b2 FROM (
SELECT x.a, ARRAY_AGG(y.b ORDER BY y.b LIMIT 2) c
FROM `project.dataset.table` x
CROSS JOIN `project.dataset.table` y
WHERE y.a <= x.a
GROUP BY x.a
)
-- ORDER BY a
For 2000th element replace 2 to 2000 in LIMIT 2 and SAFE_ORDINAL(2)
Still potentially same issue with scalability because of (now) explicit CROSS JOIN

SQL pad query result for missing groups

Assume the following table:
TableA:
ID GroupName SomeValue
1 C 1
2 C 1
2 B 1
2 A 1
I need to construct a query that selects the following result:
ID GroupName SomeValue
1 C 1
1 B 0
1 A 0
2 C 1
2 B 1
2 A 1
The GroupName is actually derived from TableA column's CASE expression and can take only 3 values: A, B, C.
Are the analytic functions the way to go?
EDIT
Sorry, for not mentioning it, but the ID could consist of multiple columns. Consider this example:
ID1 ID2 GroupName SomeValue
1 1 C 1
1 2 C 1
2 2 C 1
2 2 B 1
2 2 A 1
I need to pad SomeValue with 0 for each unique combination ID1+ID2. So the result should be like this:
ID1 ID2 GroupName SomeValue
1 1 C 1
1 1 B 0
1 1 A 0
1 2 C 1
1 2 B 0
1 2 A 0
2 2 C 1
2 2 B 1
2 2 A 1
EDIT2
Seems like solution, proposed by #Laurence should work even for multiple-column 'ID'. I couldn't rewrite the query proposed by #Nicholas Krasnov to conform to this requirement. But could somebody compare these solutions performance-wise? Will the analytic function work faster than 'cross join + left outer join'?
To fill in gaps, you could write a similar query using partition by clause of outer join:
SQL> with t1(ID,GroupName,SomeValue) as
2 (
3 select 1, 'C', 1 from dual union all
4 select 2, 'C', 1 from dual union all
5 select 2, 'B', 1 from dual union all
6 select 2, 'A', 1 from dual
7 ),
8 groups(group_name) as(
9 select 'A' from dual union all
10 select 'B' from dual union all
11 select 'C' from dual
12 )
13 select t1.ID
14 , g.group_name
15 , nvl(SomeValue, 0) SomeValue
16 from t1
17 partition by (t1.Id)
18 right outer join groups g
19 on (t1.GroupName = g.group_name)
20 order by t1.ID asc, g.group_name desc
21 ;
ID GROUP_NAME SOMEVALUE
---------- ---------- ----------
1 C 1
1 B 0
1 A 0
2 C 1
2 B 1
2 A 1
6 rows selected
UPDATE: Response to the comment.
Specify ID2 column in the partition by clause as well:
SQL> with t1(ID1, ID2, GroupName,SomeValue) as
2 (
3 select 1, 1, 'C', 1 from dual union all
4 select 1, 2, 'C', 1 from dual union all
5 select 2, 2, 'C', 1 from dual union all
6 select 2, 2, 'B', 1 from dual union all
7 select 2, 2, 'A', 1 from dual
8 ),
9 groups(group_name) as(
10 select 'A' from dual union all
11 select 'B' from dual union all
12 select 'C' from dual
13 )
14 select t1.ID1
15 , t1.ID2
16 , g.group_name
17 , nvl(SomeValue, 0) SomeValue
18 from t1
19 partition by (t1.Id1, t1.Id2)
20 right outer join groups g
21 on (t1.GroupName = g.group_name)
22 order by t1.ID1, t1.ID2 asc , g.group_name desc
23 ;
ID1 ID2 GROUP_NAME SOMEVALUE
---------- ---------- ---------- ----------
1 1 C 1
1 1 B 0
1 1 A 0
1 2 C 1
1 2 B 0
1 2 A 0
2 2 C 1
2 2 B 1
2 2 A 1
9 rows selected
Select
i.Id1,
i.Id2,
g.GroupName,
Coalesce(a.SomeValue, 0) As SomeValue
From
(select distinct ID1, ID2 from TableA) as i
cross join
(select distinct GroupName from TableA) as g
left outer join
tableA a
on i.ID = a.ID and g.GroupName = a.GroupName
Order By
1,
2,
3 Desc