Splitting two columns containing comma-separated values in Oracle

I have two columns in a table with comma-separated values. How do I split them into rows?

Would this help?
with test (col1, col2) as
  (select 'Little,Foot,is,stupid', 'poor,bastard' from dual union all
   select 'Green,mile,is,a' , 'good,film,is,it,not?' from dual
  )
select regexp_substr(col1 ||','|| col2, '[^,]+', 1, column_value) str
from test cross join
  table(cast(multiset(select level from dual
                      connect by level <= regexp_count(col1 ||','|| col2, ',') + 1
                     ) as sys.odcinumberlist));
STR
--------------------------------------------------------------------------------
Little
Foot
is
stupid
poor
bastard
Green
mile
is
a
good
film
is
it
not?
15 rows selected.

Use a recursive sub-query factoring clause and simple string functions:
WITH splits ( id, c1, c2, idx, start_c1, end_c1, start_c2, end_c2 ) AS (
SELECT id,
c1,
c2,
1,
1,
INSTR( c1, ',', 1 ),
1,
INSTR( c2, ',', 1 )
FROM test_data
UNION ALL
SELECT id,
c1,
c2,
idx + 1,
CASE end_c1 WHEN 0 THEN NULL ELSE end_c1 + 1 END,
CASE end_c1 WHEN 0 THEN NULL ELSE INSTR( c1, ',', end_c1 + 1 ) END,
CASE end_c2 WHEN 0 THEN NULL ELSE end_c2 + 1 END,
CASE end_c2 WHEN 0 THEN NULL ELSE INSTR( c2, ',', end_c2 + 1 ) END
FROM splits
WHERE end_c1 > 0
OR end_c2 > 0
)
SELECT id,
idx,
CASE end_c1
WHEN 0
THEN SUBSTR( c1, start_c1 )
ELSE SUBSTR( c1, start_c1, end_c1 - start_c1 )
END AS c1,
CASE end_c2
WHEN 0
THEN SUBSTR( c2, start_c2 )
ELSE SUBSTR( c2, start_c2, end_c2 - start_c2 )
END AS c2
FROM splits s
ORDER BY id, idx;
So for the test data:
CREATE TABLE test_data ( id, c1, c2 ) AS
SELECT 1, 'a,b,c,d', 'e,f,g' FROM DUAL UNION ALL
SELECT 2, 'h', 'i' FROM DUAL UNION ALL
SELECT 3, NULL, 'j,k,l,m,n' FROM DUAL;
This outputs:
ID | IDX | C1 | C2
-: | --: | :--- | :---
1 | 1 | a | e
1 | 2 | b | f
1 | 3 | c | g
1 | 4 | d | null
2 | 1 | h | i
3 | 1 | null | j
3 | 2 | null | k
3 | 3 | null | l
3 | 4 | null | m
3 | 5 | null | n
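To make the start/end bookkeeping of the recursive query easier to follow, here is a minimal Python sketch of the same idea. It is only an illustration, not part of the answer: the names `tokens` and `split_pairs` are hypothetical, `str.find` returning -1 plays the role of `INSTR` returning 0, and `None` stands in for SQL NULL.

```python
from typing import Iterator, List, Optional, Tuple


def tokens(s: Optional[str]) -> Iterator[str]:
    """Yield the comma-separated items of s by scanning for the next comma."""
    if s is None:          # a NULL column contributes no items
        return
    start = 0
    while True:
        end = s.find(",", start)   # -1 means "no further comma"
        if end == -1:
            yield s[start:]        # last item: everything after the previous comma
            return
        yield s[start:end]         # item between the previous comma and this one
        start = end + 1


def split_pairs(c1: Optional[str], c2: Optional[str]) -> List[Tuple[int, Optional[str], Optional[str]]]:
    """Return (idx, item_of_c1, item_of_c2) rows, padding the shorter list with None."""
    it1, it2 = tokens(c1), tokens(c2)
    rows: List[Tuple[int, Optional[str], Optional[str]]] = []
    idx = 1
    while True:
        a, b = next(it1, None), next(it2, None)
        # Stop once both lists are exhausted, but always emit the first row,
        # as the anchor member of the recursive query does.
        if a is None and b is None and rows:
            break
        rows.append((idx, a, b))
        idx += 1
    return rows


for row in split_pairs("a,b,c,d", "e,f,g"):
    print(row)
```

For the first test row this prints four rows, with `None` in the second column of the last one, which corresponds to the `1 | 4 | d | null` line of the output above.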


SQL check if fields contains same letters

I want to check if there is a row in my table that contains the same letters but in different order, but it must have the exact same letters, no more and no less.
For example, I have the letters "abc":
bca -> true
acb -> true
abcd -> **false**
ab -> **false**
Thanks!
You can use recursive CTEs to split the parameter 'abc' and each column value to letters and compare them:
with
recursive paramletters as (
select 'abc' col, 1 pos, substr('abc', 1, 1) letter
union all
select col, pos + 1, substr(col, pos + 1, 1)
from paramletters
where pos < length(col)
),
param as (
select group_concat(letter, '') over (order by letter) paramvalue
from paramletters
order by paramvalue desc limit 1
),
cteletters as (
select col, 1 pos, substr(col, 1, 1) letter
from tablename
union all
select col, pos + 1, substr(col, pos + 1, 1)
from cteletters
where pos < length(col)
),
cte as (
select * from (
select col, group_concat(letter, '') over (partition by col order by letter) colvalue
from cteletters
)
where length(colvalue) = length(col)
)
select c.col, c.colvalue = p.paramvalue result
from cte c cross join param p
Results:
| col | result |
| ---- | ------ |
| ab | 0 |
| abcd | 0 |
| acb | 1 |
| bca | 1 |
If the letters of the parameter are already sorted (like 'abc') then this code can be simplified to use only the last 2 CTEs.
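The comparison the CTEs perform (split into letters, sort them, glue them back together, compare) can be sketched outside SQL as follows. This is only an illustration of the idea; the function name `same_letters` is hypothetical.

```python
def same_letters(candidate: str, letters: str) -> bool:
    """True if candidate uses exactly the same letters (with the same counts) as letters."""
    # Sorting the characters gives a canonical form that ignores the original order.
    return sorted(candidate) == sorted(letters)


for value in ["bca", "acb", "abcd", "ab"]:
    print(value, same_letters(value, "abc"))
```

Because whole sorted sequences are compared, a value with an extra or a missing letter never matches, which is the "no more and no less" requirement from the question.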

Using multilist column as foreign key reference

I have a table TABLEA that stores data in multilist columns, for example ColumnA with ',2562,2563,2564,' and ColumnB with ',121,122,123,'.
These columns actually hold foreign key values coming from another table.
Data is something like this in Table A.
ID NAME ColumnA ColumnB
1 ITEM1 ,2562,2563,2564, ,121,122,123
2 ITEM2 NULL ,6455,545,
3 ITEM3 ,1221,1546, NULL
4 ITEM4 NULL NULL
I want to join these columns with their parent tables and extract data.
I am hoping the result set would have 8 rows.
For example
ITEM ColumnA ColumnB
ITEM1 2562 121
ITEM1 2563 122
ITEM1 2564 123
ITEM2 NULL 6455
ITEM2 NULL 545
....
I have tried the query below with some help, but it does not work when I also try to use ColumnB, and it ignores the items with NULL values.
ColumnA stores IDs from the USER_GROUP table, while ColumnB holds IDs from some other table, let's say GROUP1, and there could be a further column, ColumnC, storing values from yet another table. That is the situation I am stuck in. I hope I have explained it clearly, but I am happy to add more detail.
SELECT ug.*
FROM USER_GROUP ug
WHERE EXISTS (SELECT 1
FROM TableA t1
WHERE t1.COLUMNA LIKE '%,' || ug.ID || ',%'
)
AND EXISTS (SELECT 1
FROM TableA t1
WHERE t1.COLUMNB LIKE '%,' || ug.ID || ',%'
);
Here's one option:
with test (id, name, cola, colb) as
  (select 1, 'item1', ',2562,2563,2564,', ',121,122,123,' from dual union all
   select 2, 'item2', null , ',6455,545,' from dual union all
   select 3, 'item3', ',1221,1546,' , null from dual union all
   select 4, 'item4', null , null from dual
  ),
remcom
  -- remove leading and trailing commas
  as (select id,
             name,
             rtrim(ltrim(cola, ','), ',') cola,
             rtrim(ltrim(colb, ','), ',') colb
      from test
     )
select id,
       name,
       regexp_substr(cola, '[^,]+', 1, column_value) cola,
       regexp_substr(colb, '[^,]+', 1, column_value) colb
from remcom r cross join
  table(cast(multiset(select level from dual
                      connect by level <= regexp_count(nvl(r.cola, r.colb), ',') + 1
                     ) as sys.odcinumberlist))
order by id, name, cola, colb;
ID NAME COLA COLB
---------- ----- ---------- ----------
1 item1 2562 121
1 item1 2563 122
1 item1 2564 123
2 item2 545
2 item2 6455
3 item3 1221
3 item3 1546
4 item4
8 rows selected.
Now that you have it, join this result with another table you have.
By the way, this example nicely shows why it is a bad idea to store multiple values in the same column. Don't do that.
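As a rough illustration of what the query does per row (trim the outer commas, split, then pair the items by position), here is a small Python sketch. The function name `explode` is hypothetical and `None` stands in for NULL. One deliberate difference: the query sizes its row generator from `nvl(r.cola, r.colb)`, so it appears to assume both lists have the same length whenever both are present, while this sketch pads to the longer list.

```python
from itertools import zip_longest
from typing import List, Optional, Tuple


def explode(cola: Optional[str], colb: Optional[str]) -> List[Tuple[Optional[str], Optional[str]]]:
    """Pair the items of two ',a,b,c,' style lists by position."""

    def items(s: Optional[str]) -> List[str]:
        # strip(",") drops the leading and trailing commas, like LTRIM/RTRIM in remcom
        return s.strip(",").split(",") if s else []

    rows = list(zip_longest(items(cola), items(colb)))  # pads the shorter list with None
    # Keep one all-NULL row so that an item with no values (item4) is not lost.
    return rows or [(None, None)]


print(explode(",2562,2563,2564,", ",121,122,123,"))
print(explode(None, ",6455,545,"))
print(explode(None, None))
```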
You don't need to use (slow) regular expressions and can do it with simple string functions in a recursive sub-query factoring clause:
WITH split_data ( id, name, columna, columnb, starta, enda, startb, endb ) AS (
SELECT id,
name,
columna,
columnb,
INSTR(columna,',',1,1),
INSTR(columna,',',1,2),
INSTR(columnb,',',1,1),
INSTR(columnb,',',1,2)
FROM test_data
UNION ALL
SELECT id,
name,
columna,
columnb,
enda,
CASE WHEN enda = 0 THEN 0 ELSE INSTR(columna,',',enda+1,1) END,
endb,
CASE WHEN endb = 0 THEN 0 ELSE INSTR(columnb,',',endb+1,1) END
FROM split_data
WHERE enda > 0
OR endb > 0
)
SELECT id,
name,
CASE
WHEN starta = 0 THEN NULL
WHEN enda = 0 THEN SUBSTR( columna, starta + 1 )
ELSE SUBSTR( columna, starta + 1, enda - starta - 1 )
END AS valuea,
CASE
WHEN startb = 0 THEN NULL
WHEN endb = 0 THEN SUBSTR( columnb, startb + 1 )
ELSE SUBSTR( columnb, startb + 1, endb - startb - 1 )
END as valueb
FROM split_data
ORDER BY id, starta, startb;
Which for your test data:
CREATE TABLE test_data ( ID, NAME, ColumnA, ColumnB ) AS
SELECT 1, 'ITEM1', ',2562,2563,2564', ',121,122,123' FROM DUAL UNION ALL
SELECT 2, 'ITEM2', NULL, ',6455,545' FROM DUAL UNION ALL
SELECT 3, 'ITEM3', ',1221,1546', NULL FROM DUAL UNION ALL
SELECT 4, 'ITEM4', NULL, NULL FROM DUAL;
Outputs:
ID | NAME | VALUEA | VALUEB
-: | :---- | :----- | :-----
1 | ITEM1 | 2562 | 121
1 | ITEM1 | 2563 | 122
1 | ITEM1 | 2564 | 123
2 | ITEM2 | null | 6455
2 | ITEM2 | null | 545
3 | ITEM3 | 1221 | null
3 | ITEM3 | 1546 | null
4 | ITEM4 | null | null

Unnest(String_to_array) conversion in oracle

I am migrating procedural structure of PostgreSQL code to Oracle. Is there any alternative function present in Oracle for PostgreSQL's unnest(string_to_array)?
select a.finalval
from (select unnest(string_to_array(vturs_id, ',')) as finalval) a
Use a table collection expression and, rather than using a delimited string, use a collection or VARRAY (like SYS.ODCINUMBERLIST):
SELECT COLUMN_VALUE as finalval
FROM TABLE( SYS.ODCINUMBERLIST( 1, 2, 4 ) )
outputs:
| FINALVAL |
| -------: |
| 1 |
| 2 |
| 4 |
If you have to use a delimited string (don't) then you can use a recursive sub-query factoring clause to parse the string:
WITH test_data ( delimited_string ) AS (
SELECT '1,2,40,-5,72' FROM DUAL
),
bounds ( delimited_string, start_idx, end_idx ) AS (
SELECT delimited_string,
1,
INSTR( delimited_string, ',', 1 )
FROM test_data
UNION ALL
SELECT delimited_string,
end_idx + 1,
INSTR( delimited_string, ',', end_idx + 1 )
FROM bounds
WHERE end_idx > 0
)
SELECT CASE end_idx
WHEN 0
THEN SUBSTR( delimited_string, start_idx )
ELSE SUBSTR( delimited_string, start_idx, end_idx - start_idx )
END AS finalval
FROM bounds;
outputs:
| FINALVAL |
| :------- |
| 1 |
| 2 |
| 40 |
| -5 |
| 72 |

Remove duplicate (combination of 2 columns) values in a row

I have a requirement to remove duplicate values present in a row.
For example:
C1 | C2 | C3 | C4 | C5 | C6
----------------------------
1 | 2 | 1 | 2 | 1 | 3
1 | 2 | 1 | 3 | 1 | 4
1 |NULL| 1 |NULL| 1 |NULL
OUTPUT of the query should be:
C1 | C2 | C3 | C4 | C5 | C6
----------------------------
1 | 2 | 1 | 3 |NULL|NULL
1 | 2 | 1 | 3 | 1 | 4
1 |NULL|NULL|NULL|NULL|NULL
As you can see, each combination of 2 columns should be unique within a row.
In row 1:
the combination 1/2 is duplicated, so it is removed, and the 1/3 in c5/c6 is moved to c3/c4.
In row 2:
there is no duplicate among the combinations 1/2, 1/3 and 1/4, so the result is unchanged.
In row 3:
all 3 combinations are the same (1/NULL), so c3 to c6 are set to NULL.
Thanks in advance
Maybe there is a more clever way... but you could convert them to pairs, distinct (union in this case does that), then pivot back.
with pairs as (
select id, c1 as x, c2 as y from mytable
union
select id, c3, c4 from mytable
union
select id, c5, c6 from mytable
)
select id,
max(decode(rn,1,x)) c1,
max(decode(rn,1,y)) c2,
max(decode(rn,2,x)) c3,
max(decode(rn,2,y)) c4,
max(decode(rn,3,x)) c5,
max(decode(rn,3,y)) c6
from (
select id, x, y, row_number() over (partition by id order by x, y) rn
from pairs
) foo
group by id
This one works (data included for testing), but it might take some time to understand.
A tip: un-comment the code snippets under the -- debug lines, copy the script up to and including those snippets, and paste that part into an SQL prompt to inspect the intermediate results.
The principle is: get a row identifier to "remember" the rows; then pivot vertically, not 3 columns to one, but 6 columns to 3 pairs of columns; then use DISTINCT to de-dupe; then get an index within the row identifier for the de-duped intermediate rows; then use that index to pivot horizontally again.
Like so:
WITH
input(c1,c2,c3,c4,c5,c6) AS (
SELECT 1, 2,1, 2,1, 3
UNION ALL SELECT 1, 2,1, 3,1, 4
UNION ALL SELECT 1,NULL::INT,1,NULL::INT,1,NULL::INT
)
,
-- need rowid
input_with_rowid AS (
SELECT ROW_NUMBER() OVER() AS rowid, * FROM input
)
,
-- three groups of 2 columns, so pivot using 3 indexes
idx3(idx) AS (SELECT 1 UNION SELECT 2 UNION SELECT 3)
,
-- pivot vertically, two columns at a time and de-dupe
pivot_pair AS (
SELECT DISTINCT
rowid
, CASE idx
WHEN 1 THEN c1
WHEN 2 THEN c3
WHEN 3 THEN c5
END AS c1
,
CASE idx
WHEN 1 THEN c2
WHEN 2 THEN c4
WHEN 3 THEN c6
END AS c2
FROM input_with_rowid CROSS JOIN idx3
)
-- debug
-- SELECT * FROM pivot_pair ORDER BY rowid;
,
-- add sequence per rowid
pivot_pair_with_seq AS (
SELECT
rowid
, ROW_NUMBER() OVER(PARTITION BY rowid) AS seq
, c1
, c2
FROM pivot_pair
)
-- debug
-- SELECT * FROM pivot_pair_with_seq;
SELECT
rowid
, MAX(CASE seq WHEN 1 THEN c1 END) AS c1
, MAX(CASE seq WHEN 1 THEN c2 END) AS c2
, MAX(CASE seq WHEN 2 THEN c1 END) AS c3
, MAX(CASE seq WHEN 2 THEN c2 END) AS c4
, MAX(CASE seq WHEN 3 THEN c1 END) AS c5
, MAX(CASE seq WHEN 3 THEN c2 END) AS c6
FROM pivot_pair_with_seq
GROUP BY rowid
ORDER BY rowid
;
rowid|c1|c2|c3|c4|c5|c6
1| 1| 2| 1| 3|- |-
2| 1| 2| 1| 3| 1| 4
3| 1|- |- |- |- |-
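The principle described above (unpivot to pairs, de-dupe, re-number, pivot back) can be sketched for a single row in a few lines of Python. This is an illustrative sketch only; the name `dedupe_pairs` is hypothetical and `None` stands in for NULL. It keeps the first occurrence of each pair, so the original pair order is preserved.

```python
from typing import List, Optional, Sequence


def dedupe_pairs(row: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Remove duplicate (c1,c2)/(c3,c4)/(c5,c6) pairs from one row and pad with None."""
    # "Pivot vertically": cut the row into consecutive pairs of columns.
    pairs = [tuple(row[i:i + 2]) for i in range(0, len(row), 2)]
    # De-dupe; dict keys keep insertion order, so the first occurrence wins.
    unique = list(dict.fromkeys(pairs))
    # "Pivot horizontally" again and fill the freed columns with None.
    flat = [value for pair in unique for value in pair]
    return flat + [None] * (len(row) - len(flat))


for r in ([1, 2, 1, 2, 1, 3], [1, 2, 1, 3, 1, 4], [1, None, 1, None, 1, None]):
    print(dedupe_pairs(r))
```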
Using marcothesane's idea with the PIVOT/UNPIVOT operators. This is easier to maintain if more input columns need to be deduplicated. It preserves the order of the source data (column pairs), whereas marcothesane's solution might reorder column pairs depending on the input data. It is also a little slower than marcothesane's. It works only in 11R1 and up.
WITH
input(c1,c2,c3,c4,c5,c6) AS (
SELECT 1, 2,1, 2,1, 3 from dual
UNION ALL SELECT 1, 2,1, 3,1, 4 from dual
UNION ALL SELECT 1,NULL ,1,NULL ,1,NULL from dual
)
,
-- need rowid
input_with_rowid AS (
SELECT ROW_NUMBER() OVER (order by 1) AS row_id, input.* FROM input
),
unpivoted_pairs as
(
select row_id, tuple_idx, val1, val2, row_number() over (partition by row_id, val1, val2 order by tuple_idx) as keep_first
from input_with_rowid
UnPivot include nulls(
(val1, val2) --measure
for tuple_idx in ((c1,c2) as 1,
(c3,c4) as 2,
(c5,c6) as 3)
)
)
select row_id,
t1_val1 as c1,
t1_val2 as c2,
t2_val1 as c3,
t2_val2 as c4,
t3_val1 as c5,
t3_val2 as c6
from (
select row_id,
val1, val2, row_number() over (partition by row_id order by tuple_idx) as tuple_order
from unpivoted_pairs
where keep_first = 1
)
pivot (sum(val1) as val1, sum(val2) as val2
for tuple_order in ('1' as t1, '2' as t2, '3' as t3)
)

oracle query to obtain all major version segmented message values

I need to create an Oracle SQL query to obtain the major (highest) version of each segmented message value.
I have the following tables, with their relationships, already filled with example records:
*MESSAGE_TABLE*
ID NAME
1 hello
2 bye
*SEGMENT_TABLE*
ID VALUE
1 development
2 production
*MESSAGE_VALUE_TABLE*
ID ID_MESSAGE ID_SEGMENT VERSION VALUE
1 1 1 2 hello
2 1 1 1 hi
3 1 2 1 hi
4 1 null 3 hi
5 1 null 4 hello
6 2 1 1 bye
7 2 1 2 good bye
MESSAGE_VALUE_TABLE UNIQUE_CONSTRAINT is (ID_MESSAGE, ID_SEGMENT, VERSION)
ID_SEGMENT is nullable because null segment indicates default values.
VERSION is a simple number field.
The query has to obtain the major versions of each segmented message values (query results must include the segment value):
Selected result rows from MESSAGE_VALUE_TABLE are:
ID ID_MESSAGE ID_SEGMENT VERSION VALUE
1 1 1 2 hello
3 1 2 1 hi
5 1 null 4 hello
7 2 1 2 good bye
Query return values should be (same order as the previous selected rows list):
NAME(MESSAGE_TABLE) VALUE (SEGMENT_TABLE) VALUE (MESSAGE_VALUE_TABLE)
hello development hello
hello production hi
hello null / empty hello
bye development good bye
The solution is here, thanks to San, who did the hard work:
WITH tab AS (SELECT ID,
id_message,
id_segment,
CASE WHEN lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY id_segment, version) IS NULL
THEN 1
WHEN (nvl(id_segment, -1) != lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY id_segment, version))
THEN 1
ELSE 0
END change_ind,
version,
VALUE
FROM MESSAGE_VALUE_TABLE)
SELECT b.NAME, nvl(c.VALUE, 'null/empty'), a.VALUE
FROM tab a
JOIN MESSAGE_TABLE b ON (b.ID=a.id_message)
LEFT OUTER JOIN SEGMENT_TABLE c ON (c.ID=a.id_segment)
WHERE change_ind = 1
As I understand it, you want something like the following.
You can get the first part with:
WITH MESSAGE_VALUE_TABLE(ID,ID_MESSAGE,ID_SEGMENT,VERSION,VALUE) as
(select 1,1,1,1,'hi' from dual union all
select 2,1,1,2,'hello' from dual union all
select 3,1,2,1,'hi' from dual union all
select 4,1,null,3,'hi' from dual union all
select 5,1,null,4,'hello' from dual union all
select 6,1,1,1,'hi' from dual union all
select 7,1,1,2,'hello' from dual union all
select 8,1,2,3,'hi' from dual union all
select 9,1,null,1,'hi' from dual union all
select 10,1,null,2,'hello' from dual union all
select 11,2,1,1,'bye' from dual union all
select 12,2,1,2,'good bye' from dual)
------
---End of data
------
SELECT id, id_message, id_segment, version, VALUE
from (
SELECT ID,
id_message,
id_segment,
CASE WHEN lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY ID) IS NULL
THEN 1
WHEN (nvl(id_segment, -1) != lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY ID))
THEN 1
ELSE 0
END change_ind,
version,
VALUE
FROM MESSAGE_VALUE_TABLE)
where change_ind = 1;
Output:
| ID | ID_MESSAGE | ID_SEGMENT | VERSION | VALUE |
|----|------------|------------|---------|----------|
| 2 | 1 | 1 | 2 | hello |
| 3 | 1 | 2 | 1 | hi |
| 5 | 1 | (null) | 4 | hello |
| 7 | 1 | 1 | 2 | hello |
| 8 | 1 | 2 | 3 | hi |
| 10 | 1 | (null) | 2 | hello |
| 12 | 2 | 1 | 2 | good bye |
And the second part with:
WITH MESSAGE_VALUE_TABLE(ID,ID_MESSAGE,ID_SEGMENT,VERSION,VALUE) as
(select 1,1,1,1,'hi' from dual union all
select 2,1,1,2,'hello' from dual union all
select 3,1,2,1,'hi' from dual union all
select 4,1,null,3,'hi' from dual union all
select 5,1,null,4,'hello' from dual union all
select 6,1,1,1,'hi' from dual union all
select 7,1,1,2,'hello' from dual union all
select 8,1,2,3,'hi' from dual union all
select 9,1,null,1,'hi' from dual union all
select 10,1,null,2,'hello' from dual union all
select 11,2,1,1,'bye' from dual union all
select 12,2,1,2,'good bye' from dual),
MESSAGE_TABLE(ID,NAME) AS
(SELECT 1 , 'hello' FROM dual UNION ALL
SELECT 2, 'bye' FROM dual),
SEGMENT_TABLE(ID,VALUE) AS
(SELECT 1,'development' FROM dual UNION ALL
SELECT 2,'production' FROM dual),
------
---End of data
------
tab AS (SELECT ID,
id_message,
id_segment,
CASE WHEN lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY ID) IS NULL
THEN 1
WHEN (nvl(id_segment, -1) != lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY ID))
THEN 1
ELSE 0
END change_ind,
version,
VALUE
FROM MESSAGE_VALUE_TABLE)
SELECT b.NAME, nvl(c.VALUE, 'null/empty') C_VALUE, a.VALUE
FROM tab a
JOIN MESSAGE_TABLE b ON (b.ID=a.id_message)
LEFT OUTER JOIN SEGMENT_TABLE c ON (c.ID=a.id_segment)
WHERE change_ind = 1
ORDER BY a.ID
Output:
| NAME | C_VALUE | VALUE |
|-------|-------------|----------|
| hello | development | hello |
| hello | production | hi |
| hello | null/empty | hello |
| hello | development | hello |
| hello | production | hi |
| hello | null/empty | hello |
| bye | development | good bye |
So your final query will be :
WITH tab AS (SELECT ID,
id_message,
id_segment,
CASE WHEN lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY ID) IS NULL
THEN 1
WHEN (nvl(id_segment, -1) != lead(nvl(id_segment, -1)) over (partition by id_message ORDER BY ID))
THEN 1
ELSE 0
END change_ind,
version,
VALUE
FROM MESSAGE_VALUE_TABLE)
SELECT b.NAME, nvl(c.VALUE, 'null/empty'), a.VALUE
FROM tab a
JOIN MESSAGE_TABLE b ON (b.ID=a.id_message)
LEFT OUTER JOIN SEGMENT_TABLE c ON (c.ID=a.id_segment)
WHERE change_ind = 1
ORDER BY a.ID
Not fully sure what you are trying to achieve. Try an analytic query.
select message_name, segment_value, message_value from (
select m.name message_name, s.value segment_value, v.value message_value,
dense_rank() over (partition by v.id_message, v.id_segment order by version desc nulls last) version_rank
from message_value_table v
inner join message_table m on v.id_message = m.id
left outer join segment_table s on v.id_segment = s.id
) where version_rank = 1
;
http://sqlfiddle.com/#!4/35aa7/10
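For readers who want to check the expected rows by hand, here is a minimal Python sketch of the "highest version per (message, segment)" selection that the ranking query performs. It is an illustration under assumptions: the rows are the example data from the question as plain tuples, the function name `latest_versions` is hypothetical, and `None` represents the NULL (default) segment, which forms its own group.

```python
from typing import Dict, List, Optional, Tuple

# (id, id_message, id_segment, version, value), taken from the question's example data
Row = Tuple[int, int, Optional[int], int, str]

ROWS: List[Row] = [
    (1, 1, 1, 2, "hello"),
    (2, 1, 1, 1, "hi"),
    (3, 1, 2, 1, "hi"),
    (4, 1, None, 3, "hi"),
    (5, 1, None, 4, "hello"),
    (6, 2, 1, 1, "bye"),
    (7, 2, 1, 2, "good bye"),
]


def latest_versions(rows: List[Row]) -> List[Row]:
    """Keep, for every (id_message, id_segment) group, the row with the highest version."""
    best: Dict[Tuple[int, Optional[int]], Row] = {}
    for row in rows:
        key = (row[1], row[2])                      # None is a valid dictionary key
        if key not in best or row[3] > best[key][3]:
            best[key] = row
    return sorted(best.values(), key=lambda r: r[0])  # order by ID for readability


for row in latest_versions(ROWS):
    print(row)
```

The unique constraint on (ID_MESSAGE, ID_SEGMENT, VERSION) means there can be no ties within a group, so picking a single maximum per group is enough here.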