Gradually aggregating a string column in Oracle SQL

I would like to gradually aggregate a string column in Oracle SQL.
From this table:
col_1 | col_2
-------------
1 | A
1 | B
1 | C
2 | C
2 | D
to:
col_1 | col_2
-------------
1 | A
1 | A,B
1 | A,B,C
2 | C
2 | C,D
I tried LISTAGG but it won't return all rows due to group by. I have about 2 million rows in the table.

Oracle doesn't support accumulating string concatenation with a single listagg() expression. However, you can use a subquery.
Just one note: SQL tables represent unordered sets. You seem to have an ordering in mind. The following code adds an ordering column:
with t as (
  select 1 as id, 1 as x, 'A' as y from dual union all
  select 2, 1 as x, 'B' as y from dual union all
  select 3, 1 as x, 'C' as y from dual union all
  select 4, 2 as x, 'C' as y from dual union all
  select 5, 2 as x, 'D' as y from dual
)
select t.*,
       -- for each row, concatenate every y of the same x up to and including this id
       (select listagg(t2.y, ',') within group (order by t2.id)
        from t t2
        where t2.x = t.x and t2.id <= t.id
       ) as y_list
from t;
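The same running concatenation can also be written with recursive subquery factoring, a different technique from the correlated subquery above. The following is only a sketch: the names numbered, acc, rn and y_list are made up for illustration, it assumes Oracle 11gR2 or later, and it assumes each accumulated list fits in 4000 bytes.

```sql
with t as (
  select 1 as id, 1 as x, 'A' as y from dual union all
  select 2, 1, 'B' from dual union all
  select 3, 1, 'C' from dual union all
  select 4, 2, 'C' from dual union all
  select 5, 2, 'D' from dual
),
numbered as (
  -- number the rows within each x so every step can find its successor
  select x, y, row_number() over (partition by x order by id) as rn
  from t
),
acc (x, rn, y_list) as (
  -- anchor: the first row of each group; the cast leaves room for the list to grow
  select x, rn, cast(y as varchar2(4000))
  from numbered
  where rn = 1
  union all
  -- recursive step: append the next value of the same group
  select n.x, n.rn, a.y_list || ',' || n.y
  from acc a
  join numbered n on n.x = a.x and n.rn = a.rn + 1
)
select x, y_list
from acc
order by x, rn;
```

Each row builds on the previous row's list instead of re-aggregating the whole prefix. Whether that is faster on 2 million rows depends on the group sizes, so test it on real data.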

A hierarchical query option looks like this:
with t as (
  select 1 as id, 1 as x, 'A' as y from dual union all
  select 2, 1 as x, 'B' as y from dual union all
  select 3, 1 as x, 'C' as y from dual union all
  select 4, 2 as x, 'C' as y from dual union all
  select 5, 2 as x, 'D' as y from dual
)
select x,
       ltrim(sys_connect_by_path(y, ','), ',') result
from (select x,
             y,
             row_number() over (partition by x order by y) rn
      from t
     )
start with rn = 1
connect by prior rn = rn - 1 and prior x = x;

X RESULT
---------- --------------------
1 A
1 A,B
1 A,B,C
2 C
2 C,D

Related

create sequence of numbers on grouped column in Oracle

Consider the table below with columns a, b, c.
a b c
3 4 5
3 4 5
6 4 1
1 1 8
1 1 8
1 1 0
1 1 0
I need a select statement to get the output below, i.e. increment column 'rn' for each new group of columns a, b, c.
a b c rn
3 4 5 1
3 4 5 1
6 4 1 2
1 1 8 3
1 1 8 3
1 1 0 4
1 1 0 4
You can use the DENSE_RANK analytic function to get a unique ID for each combination of A, B, and C. Just note that the IDs are not stable: if a new combination of values is inserted into the table, the IDs of the existing combinations may shift.
Query
WITH
my_table (a, b, c)
AS
(SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 6, 4, 1 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL)
SELECT t.*, DENSE_RANK () OVER (ORDER BY b desc, c desc, a) as rn
FROM my_table t;
Result
A B C RN
____ ____ ____ _____
3 4 5 1
3 4 5 1
6 4 1 2
1 1 8 3
1 1 8 3
1 1 0 4
1 1 0 4
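To illustrate the caveat about shifting IDs, here is a small sketch that reruns the same DENSE_RANK expression after one extra row is added. The row (2, 4, 3) is hypothetical and not part of the question's data; duplicates are left out for brevity.

```sql
WITH my_table (a, b, c) AS (
  SELECT 3, 4, 5 FROM DUAL UNION ALL
  SELECT 6, 4, 1 FROM DUAL UNION ALL
  SELECT 1, 1, 8 FROM DUAL UNION ALL
  SELECT 1, 1, 0 FROM DUAL UNION ALL
  SELECT 2, 4, 3 FROM DUAL   -- hypothetical newly inserted combination
)
SELECT t.*,
       DENSE_RANK() OVER (ORDER BY b DESC, c DESC, a) AS rn
FROM my_table t
ORDER BY rn;
```

Because (2, 4, 3) sorts between (3, 4, 5) and (6, 4, 1), it takes rn = 2, and the combinations that previously had 2, 3 and 4 move to 3, 4 and 5.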
As a starter: for your question to make sense at all, you need a column that defines the ordering of the rows. Let me assume that you have such a column, called id.
Then, you can use window functions:
select a, b, c,
sum(case when a = lag_a and b = lag_b and c = lag_c then 0 else 1 end) over(order by id) rn
from (
select t.*,
lag(a) over(order by id) lag_a,
lag(b) over(order by id) lag_b,
lag(c) over(order by id) lag_c
from mytable t
) t
Assuming you have some way of ordering your rows, then you can use MATCH_RECOGNIZE:
SELECT a, b, c, rn
FROM table_name
MATCH_RECOGNIZE (
ORDER BY id
MEASURES MATCH_NUMBER() AS rn
ALL ROWS PER MATCH
PATTERN ( FIRST_ROW EQUAL_ROWS* )
DEFINE EQUAL_ROWS AS (
EQUAL_ROWS.a = PREV( EQUAL_ROWS.a )
AND EQUAL_ROWS.b = PREV( EQUAL_ROWS.b )
AND EQUAL_ROWS.c = PREV( EQUAL_ROWS.c )
)
)
So, for your test data:
CREATE TABLE table_name ( id, a, b, c ) AS
SELECT 1, 3, 4, 5 FROM DUAL UNION ALL
SELECT 2, 3, 4, 5 FROM DUAL UNION ALL
SELECT 3, 6, 4, 1 FROM DUAL UNION ALL
SELECT 4, 1, 1, 8 FROM DUAL UNION ALL
SELECT 5, 1, 1, 8 FROM DUAL UNION ALL
SELECT 6, 1, 1, 0 FROM DUAL UNION ALL
SELECT 7, 1, 1, 0 FROM DUAL;
Outputs:
A | B | C | RN
-: | -: | -: | -:
3 | 4 | 5 | 1
3 | 4 | 5 | 1
6 | 4 | 1 | 2
1 | 1 | 8 | 3
1 | 1 | 8 | 3
1 | 1 | 0 | 4
1 | 1 | 0 | 4
It can also be done without any ordering, by getting the distinct groups and numbering each group. Borrowing the first part from EJ Egjed:
WITH my_table (a, b, c) AS
(SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 6, 4, 1 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL)
, groups as (select distinct a, b, c
from my_table)
, groupnums as (select rownum as num, a, b, c
from groups)
select a, b, c, num
from my_table join groupnums using(a,b,c);

Oracle - generate a running number by group

I need to generate a running number / group sequence inside a select statement for a group of data.
For example
Group Name Sequence
1 a 1
1 b 2
1 c 3
2 d 1
2 e 2
2 f 3
So for each group the sequence should be a running number starting with 1, depending on the order of column "Name".
I already played around with Row_Number() and Level but I couldn't get a solution.
Any idea how to do it?
Analytic functions help.
with test (cgroup, name) as
  (select 1, 'a' from dual union all
   select 1, 'b' from dual union all
   select 1, 'c' from dual union all
   select 2, 'd' from dual union all
   select 2, 'e' from dual union all
   select 2, 'f' from dual
  )
select cgroup,
       name,
       row_number() over (partition by cgroup order by name) sequence
from test
order by cgroup, name;

CGROUP N SEQUENCE
---------- - ----------
1 a 1
1 b 2
1 c 3
2 d 1
2 e 2
2 f 3
6 rows selected.
Try this
SELECT
"Group",
Name,
DENSE_RANK() OVER (PARTITION BY "Group" ORDER BY Name) AS Sequence
FROM your_table; -- placeholder name: TABLE is a reserved word and cannot be used as a table name
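The two answers use different functions, and they only agree while the names are unique within a group. The sketch below puts ROW_NUMBER and DENSE_RANK side by side on sample data with a hypothetical duplicate name; the aliases rn_seq and dr_seq are made up for illustration.

```sql
with test (cgroup, name) as (
  select 1, 'a' from dual union all
  select 1, 'a' from dual union all   -- hypothetical duplicate name
  select 1, 'b' from dual
)
select cgroup,
       name,
       row_number() over (partition by cgroup order by name) as rn_seq,
       dense_rank() over (partition by cgroup order by name) as dr_seq
from test
order by cgroup, name, rn_seq;
```

ROW_NUMBER numbers the three rows 1, 2, 3, while DENSE_RANK gives both 'a' rows the value 1 and 'b' the value 2. For a strict running number, ROW_NUMBER is the better fit.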

Oracle SQL intersection of 2 comma separated string

What can I apply as function?
Query:
Select x, f(y) from table where y like '%ab%cd%ef';
Sample table (y is sorted alphabetically):
x. y
1 ab
2 ab,cd
3 cd,ef
4 ab,ef,gh,yu
5 de,ef,rt
Expected Output:
x y
1 ab
2 ab,cd
3 cd,ef
4 ab,ef
5 ef
Use the regexp_substr function with connect by level expressions, as follows:
with tab(x,y) as
(
select 1,'ab' from dual union all
select 2,'ab,cd' from dual union all
select 3,'cd,ef' from dual union all
select 4,'ab,ef,gh,yu' from dual union all
select 5,'de,ef,rt' from dual
), tab2 as
(
Select x, regexp_substr(y,'[^,]+',1,level) as y
from tab
connect by level <= regexp_count(y,',') + 1
and prior x = x
and prior sys_guid() is not null
), tab3 as
(
select x, y
from tab2
where y like '%ab%'
or y like '%cd%'
or y like '%ef%'
)
select x, listagg(y,',') within group (order by y) as y
from tab3
group by x;
X Y
1 ab
2 ab,cd
3 cd,ef
4 ab,ef
5 ef
Follow comments written within the code.
with test (x, y) as
  -- your sample table
  (select 1, 'ab' from dual union all
   select 2, 'ab,cd' from dual union all
   select 3, 'cd,ef' from dual union all
   select 4, 'ab,ef,gh,yu' from dual union all
   select 5, 'de,ef,rt' from dual
  ),
srch (val) as
  -- a search string, which is to be compared to the sample table's Y column values
  (select 'ab,cd,ef' from dual),
--
srch_rows as
  -- split search string into rows
  (select regexp_substr(val, '[^,]+', 1, level) val
   from srch
   connect by level <= regexp_count(val, ',') + 1
  ),
test_rows as
  -- split sample values into rows
  (select x,
          regexp_substr(y, '[^,]+', 1, column_value) y
   from test,
        table(cast(multiset(select level from dual
                            connect by level <= regexp_count(y, ',') + 1
                           ) as sys.odcinumberlist))
  )
-- the final result
select t.x, listagg(t.y, ',') within group (order by t.y) result
from test_rows t join srch_rows s on s.val = t.y
group by t.x
order by t.x;

X RESULT
---------- --------------------
1 ab
2 ab,cd
3 cd,ef
4 ab,ef
5 ef

How to combine and count two columns in oracle?

My table looks like this:
A B
1 100
1 102
1 105
2 100
2 105
3 100
3 102
I want output like this:
A Count(B)
1 3
1,2 2
1,2,3 3
2 2
3 2
2,3 2
How can I do this?
I tried to use listagg but it didn't work.
I suspect that you want to count the number of sets of A that are in the data -- and that your sample results are messed up.
If so:
select grp, count(*)
from (select listagg(a, ',') within group (order by a) as grp
      from t
      group by b
     ) b
group by grp;
This gives you the counts for the full combinations present in the data. The results would be:
1,2,3 1
1,3 1
1,2 1
You can get the original number of rows by doing:
select grp, sum(cnt)
from (select listagg(a, ',') within group (order by a) as grp, count(*) as cnt
      from t
      group by b
     ) b
group by grp;
Oracle Setup:
CREATE TABLE table_name ( A, B ) AS
SELECT 1, 100 FROM DUAL UNION ALL
SELECT 1, 102 FROM DUAL UNION ALL
SELECT 1, 105 FROM DUAL UNION ALL
SELECT 2, 100 FROM DUAL UNION ALL
SELECT 2, 105 FROM DUAL UNION ALL
SELECT 3, 100 FROM DUAL UNION ALL
SELECT 3, 102 FROM DUAL;
Query:
SELECT A,
COUNT(B)
FROM (
SELECT SUBSTR( SYS_CONNECT_BY_PATH( A, ',' ), 2 ) AS A,
B
FROM table_name
CONNECT BY PRIOR B = B
AND PRIOR A + 1 = A
)
GROUP BY A
ORDER BY A;
Output:
A COUNT(B)
----- ----------
1 3
1,2 2
1,2,3 1
2 2
2,3 1
3 2

Oracle advanced union

Are there any advanced Oracle SQL methods to solve this kind of situation?
Simplified:
Two queries return primary_key_value and other_value.
Both queries always return primary_key_value but other_value might be null.
So how can I union those two queries so that the result always contains the rows that have an other_value, but if both queries return other_value = null for the same primary key, only one row is returned?
I know this is such a stupid case, but the specifications were like this :)
Example:
First query:
A | B
=======
1 | X
2 |
3 |
4 | Z
Second query:
A | B
=======
1 | Y
2 |
3 | Z
4 |
So the result needs to be like this:
A | B
=======
1 | X
1 | Y
2 |
3 | Z
4 | Z
You could use analytics:
WITH q1 AS (
  SELECT 1 a, 'X' b FROM DUAL UNION ALL
  SELECT 2 a, '' b FROM DUAL UNION ALL
  SELECT 3 a, '' b FROM DUAL UNION ALL
  SELECT 4 a, 'Z' b FROM DUAL
), q2 AS (
  SELECT 1 a, 'Y' b FROM DUAL UNION ALL
  SELECT 2 a, '' b FROM DUAL UNION ALL
  SELECT 3 a, 'Z' b FROM DUAL UNION ALL
  SELECT 4 a, '' b FROM DUAL
)
SELECT a, b
FROM (SELECT a, b,
             rank() over(PARTITION BY a
                         ORDER BY decode(b, NULL, 2, 1)) rnk
      FROM (SELECT * FROM q1
            UNION
            SELECT * FROM q2))
WHERE rnk = 1;

A B
---------- -
1 X
1 Y
2
3 Z
4 Z
If you want to use something really advanced, use the MODEL clause: http://rwijk.blogspot.com/2007/10/sql-model-clause-tutorial-part-one.html
But in real life, needing such things usually means a badly designed data model.
Another way to look at it is that you want all possible values from the union of column A, then left outer join these with the non-null values from column B, thus only showing null in B when there is no non-null value to display.
roughly:
WITH q1 as (whatever),
q2 as (whatever)
SELECT All_A.A, BVals.B
FROM (SELECT DISTINCT A FROM (SELECT A FROM q1 UNION SELECT A FROM q2)) All_A,
(SELECT A,B FROM q1 WHERE B IS NOT NULL
UNION
SELECT A,B FROM q2 WHERE B IS NOT NULL) BVals
WHERE All_A.A = BVals.A (+)
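A runnable version of the rough outline above might look like the sketch below. It assumes the q1 and q2 sample data from the analytic answer, uses the ANSI LEFT JOIN syntax instead of the (+) operator, and moves the two inline views into CTEs named all_a and bvals; none of those choices come from the original answer.

```sql
WITH q1 AS (
  SELECT 1 a, 'X' b FROM DUAL UNION ALL
  SELECT 2 a, '' b FROM DUAL UNION ALL
  SELECT 3 a, '' b FROM DUAL UNION ALL
  SELECT 4 a, 'Z' b FROM DUAL
), q2 AS (
  SELECT 1 a, 'Y' b FROM DUAL UNION ALL
  SELECT 2 a, '' b FROM DUAL UNION ALL
  SELECT 3 a, 'Z' b FROM DUAL UNION ALL
  SELECT 4 a, '' b FROM DUAL
), all_a AS (
  -- every key returned by either query (UNION removes duplicates)
  SELECT a FROM q1
  UNION
  SELECT a FROM q2
), bvals AS (
  -- only the rows that carry a value; '' is NULL in Oracle
  SELECT a, b FROM q1 WHERE b IS NOT NULL
  UNION
  SELECT a, b FROM q2 WHERE b IS NOT NULL
)
SELECT all_a.a, bvals.b
FROM all_a
LEFT JOIN bvals ON bvals.a = all_a.a
ORDER BY all_a.a, bvals.b;
```

Keys with at least one value get one row per distinct value, and keys with no value at all (2 in the sample) still appear once with a null B.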
Also pruning the unwanted nulls explicitly could do the same job:
WITH q3 AS (q1_SELECT UNION q2_SELECT)
SELECT A,B
FROM q3 main
WHERE NOT ( B IS NULL AND
EXISTS (SELECT 1 FROM q3 x WHERE main.A = x.A and x.B IS NOT NULL) )