Splitting a string field value in Oracle SQL

I have a string field in an Oracle SQL table. Is there a query that splits the string into lines with a fixed number of characters each, with any leftover characters on the last line?
e.g. ABCDEFGHIJ
I want lines of 4 characters each, as follows:
ABCD
EFGH
IJ
The remaining 2 letters should be on the last line. Is it possible to achieve this with an Oracle SQL query?

You can use a query like the one below, which uses CONNECT BY and LEVEL to generate one row per chunk, based on the length of the string.
WITH d AS (SELECT 'ABCDEFGHIJ' AS str FROM DUAL)
SELECT SUBSTR (str, ((LEVEL - 1) * 4) + 1, 4) AS four_letters
FROM d
CONNECT BY LEVEL < (LENGTH (str) / 4) + 1;
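As a hedged variation on the query above, the chunk size can be kept in one place instead of being repeated as a literal. This is only a sketch: the chunk_size and position column names are introduced here for illustration, and CEIL replaces the original "LEVEL < ... + 1" bound (both produce the same number of rows for a non-empty string).

```sql
-- Sketch: single-row input, chunk size held in one (hypothetical) column
WITH d AS (
  SELECT 'ABCDEFGHIJ' AS str, 4 AS chunk_size FROM DUAL
)
SELECT LEVEL AS position,
       -- chunk N starts at (N - 1) * chunk_size + 1
       SUBSTR(str, (LEVEL - 1) * chunk_size + 1, chunk_size) AS chunk
FROM   d
-- one row per full or partial chunk
CONNECT BY LEVEL <= CEIL(LENGTH(str) / chunk_size);
```

For the sample string this should give ABCD, EFGH and IJ at positions 1 to 3. Like the original, it only works as written when d contains a single row.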

If you have multiple rows, you can use OUTER APPLY with a hierarchical query:
SELECT s.split_value,
s.position
FROM table_name t
OUTER APPLY (
SELECT LEVEL AS position,
SUBSTR( t.value, 4 * LEVEL - 3, 4 ) AS split_value
FROM DUAL
CONNECT BY LEVEL <= CEIL( LENGTH( t.value ) / 4 )
) s
Which, for the sample data:
CREATE TABLE table_name ( value ) AS
SELECT 'ABCDEFGHIJ' FROM DUAL UNION ALL
SELECT '123456789012' FROM DUAL;
Outputs:
SPLIT_VALUE | POSITION
:---------- | -------:
ABCD | 1
EFGH | 2
IJ | 3
1234 | 1
5678 | 2
9012 | 3
Update
If I want 36 characters per line, how can I modify your first answer?
Change the chunk size from 4 to 36:
SELECT s.split_value,
s.position
FROM table_name t
OUTER APPLY (
SELECT LEVEL AS position,
SUBSTR( t.value, 36 * ( LEVEL - 1 ) + 1, 36 ) AS split_value
FROM DUAL
CONNECT BY LEVEL <= CEIL( LENGTH( t.value ) / 36 )
) s
Which, for the sample data:
CREATE TABLE table_name ( value ) AS
SELECT 'ABCDEFGHIJ' FROM DUAL UNION ALL
SELECT '________10________20________30________40________50________60________70________80' FROM DUAL;
Outputs:
SPLIT_VALUE | POSITION
:----------------------------------- | -------:
ABCDEFGHIJ | 1
________10________20________30______ | 1
__40________50________60________70__ | 2
______80 | 3

Related

SUBSTR to ADD value in oracle

I have a table with a column holding data in the format below, in an Oracle DB.
COL 1
abc,mno:EMP
xyz:EMP;tyu,opr:PROF
abc,mno:EMP;tyu,opr:PROF
I am trying to convert the data to the format below:
COL 1
abc:EMP;mno:EMP
xyz:EMP;tyu:PROF;opr:PROF
abc:EMP;mno:EMP;tyu:PROF;opr:PROF
Basically, I am trying to take everything after the : and before the ; and put it in place of each comma, so that every name carries its own code.
I tried some SUBSTR and LISTAGG combinations but couldn't get anything worth sharing.
Regards.
Here's one option; read the comments within the code.
with test (id, col) as
-- sample data
(select 1, 'abc,mno:EMP' from dual union all
select 2, 'xyz:EMP;tyu,opr:PROF' from dual union all
select 3, 'abc,mno:EMP;tyu,opr:PROF' from dual
),
temp as
-- split sample data to rows
(select id,
column_value cv,
regexp_substr(col, '[^;]+', 1, column_value) val
from test cross join
table(cast(multiset(select level from dual
connect by level <= regexp_count(col, ';') + 1
) as sys.odcinumberlist))
)
-- finally, replace each comma with the string that follows the colon sign
select id,
listagg(replace(val, ',', substr(val, instr(val, ':')) ||';'), ';') within group (order by cv) new_val
from temp
group by id
order by id;
ID NEW_VAL
---------- ----------------------------------------
1 abc:EMP;mno:EMP
2 xyz:EMP;tyu:PROF;opr:PROF
3 abc:EMP;mno:EMP;tyu:PROF;opr:PROF
SQL>
Building on Littlefoot's answer: if you use CROSS APPLY, you don't need the CAST ... MULTISET construct:
with test (id, col) as
-- sample data
(select 1, 'abc,mno:EMP' from dual union all
select 2, 'xyz:EMP;tyu,opr:PROF' from dual union all
select 3, 'abc,mno:EMP;tyu,opr:PROF' from dual
),
temp as
-- split sample data to rows
(select id,
column_value cv,
regexp_substr(col, '[^;]+', 1, column_value) val
from test
cross apply (select level as column_value
from dual
connect by level<= regexp_count(col, ';') + 1)
)
-- finally, replace comma with a string that follows a colon sign
select id,
listagg(replace(val, ',', substr(val, instr(val, ':')) ||';'), ';') within group (order by cv) new_val
from temp
group by id
order by id;
You do not need anything recursive, just a basic regex. If the pattern is always something,something2:someCode (i.e. there is no colon before the comma), this is sufficient:
with test (id, col) as (
select 1, 'abc,mno:EMP' from dual union all
select 2, 'xyz:EMP;tyu,opr:PROF' from dual union all
select 3, 'abc,mno:EMP;tyu,opr:PROF' from dual union all
select 4, 'abc,mno:EMP;tyu,opr:PROF;something:QWE;something2:QWE' from dual
)
select
/*
Grab these groups:
1) Everything before the comma
2) Then everything before the colon
3) And then everything between the colon and a semicolon
Then append group 3 to both group 1 and group 2
*/
trim(trailing ';' from regexp_replace(col || ';', '([^,]+),([^:]+):([^;]+)', '\1:\3;\2:\3')) as res
from test
| RES |
| :------------------------------------------------------------- |
| abc:EMP;mno:EMP |
| xyz:EMP;tyu:PROF;opr:PROF |
| abc:EMP;mno:EMP;tyu:PROF;opr:PROF |
| abc:EMP;mno:EMP;tyu:PROF;opr:PROF;something:QWE;something2:QWE |
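The regex answer states its own assumption (one comma per name list). A small sketch, using a made-up value 'a,b,c:X' that is not in the original sample data, compares the REPLACE expression from the earlier answer with the REGEXP_REPLACE expression when a list has three names. The CTE name demo and the column aliases are illustrative only.

```sql
-- Sketch: compare both expressions on a three-name list (hypothetical input)
WITH demo (val) AS (
  SELECT 'a,b,c:X' FROM dual
)
SELECT val,
       -- REPLACE swaps every comma for ':X;'
       REPLACE(val, ',', SUBSTR(val, INSTR(val, ':')) || ';') AS replace_result,
       -- the regex pairs only the first name with the rest of the list
       TRIM(TRAILING ';' FROM
            REGEXP_REPLACE(val || ';', '([^,]+),([^:]+):([^;]+)', '\1:\3;\2:\3')) AS regex_result
FROM   demo;
```

Here replace_result should be a:X;b:X;c:X, while regex_result should be a:X;b,c:X, because the second group captures "b,c" as a whole. If lists can hold more than two names, the split-and-aggregate approach looks like the safer choice.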

Oracle SQL: Extracting multiple text between two characters

I have a table like the one below:
|-------------|---------------------------------------------------|
|ID. | CONTENT |
|-------------|---------------------------------------------------|
|1 |<TITLE> <SUB-TITLE-1> Content <SUB-TITLE-2>Content.
|2 |<TITLE> <SUB-TITLE-1> Content <SUB-TITLE-2>Content.
|3 |<TITLE> <SUB-TITLE-1> Content <SUB-TITLE-2>Content. <SUB-TITLE-3> Content
|-------------|---------------------------------------------------|
I want to extract all the text between < and >, so that it becomes:
|-------------|-------------------------------------------------|
|ID. | CONTENT |
|-------------|-------------------------------------------------|
|1 |TITLE |
|1 |SUB-TITLE-1 |
|1 |SUB-TITLE-2 |
|2 |TITLE |
|2 |SUB-TITLE-1 |
|2 |SUB-TITLE-2 |
|3 |TITLE |
|3 |SUB-TITLE-1 |
|3 |SUB-TITLE-2 |
|3 |SUB-TITLE-3 |
|-------------|-------------------------------------------------|
How can I achieve this? I'm trying to do it with a regex, but I think I'm lost.
My Oracle version is 18c, if that helps.
You can use the 4th argument of REGEXP_SUBSTR to specify an occurrence for matching.
To get a row for the 1st, 2nd, and 3rd occurrence, you can cross-join with a sub-query from dual.
WITH test_data AS (
SELECT 1 AS content_id, '<TITLE> <SUB-TITLE-1> Content<SUB-TITLE-2>Content.<A third sub-title>' AS content_data FROM dual UNION
SELECT 2 AS content_id, '<TITLE> <SUB-TITLE-1> Content<SUB-TITLE-2>Content.' AS content_data FROM dual
)
SELECT t.content_id,
REGEXP_SUBSTR(t.content_data, '<(.*?)>', 1, s.match_occurrence, 'i', 1) AS content_match
FROM test_data t
CROSS JOIN (
SELECT 1 AS match_occurrence FROM dual UNION
SELECT 2 AS match_occurrence FROM dual UNION
SELECT 3 AS match_occurrence FROM dual UNION
SELECT 4 AS match_occurrence FROM dual
/* ... etc, with the number of rows equal to the maximum number of matches that can appear */
) s
WHERE REGEXP_SUBSTR(t.content_data, '<.*?>', 1, s.match_occurrence) IS NOT NULL /* Only return records that have a match for the given occurrence */
ORDER BY t.content_id, s.match_occurrence
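To make the role of the REGEXP_SUBSTR arguments used above concrete, here is a minimal sketch against a string literal. The column aliases are illustrative; the fourth argument picks the occurrence and the sixth picks the capture group.

```sql
-- Sketch: second occurrence of '<...>' in a literal
SELECT REGEXP_SUBSTR('<TITLE> <SUB-TITLE-1> Content', '<(.*?)>', 1, 2)          AS whole_match,
       REGEXP_SUBSTR('<TITLE> <SUB-TITLE-1> Content', '<(.*?)>', 1, 2, NULL, 1) AS group_1_only
FROM   dual;
```

whole_match should come back as <SUB-TITLE-1> (with the angle brackets) and group_1_only as SUB-TITLE-1, which is why the query above passes 1 as the last argument.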
Borrowing the CONNECT BY LEVEL technique from Barbaros' excellent answer, you could do it more concisely as:
WITH test_data AS (
SELECT 1 AS content_id, '<TITLE> <SUB-TITLE-1> Content<SUB-TITLE-2>Content.<A third sub-title>' AS content_data FROM dual UNION
SELECT 2 AS content_id, '<TITLE> <SUB-TITLE-1> Content<SUB-TITLE-2>Content.' AS content_data FROM dual
)
SELECT t.content_id,
REGEXP_SUBSTR(t.content_data, '<(.*?)>', 1, LEVEL, 'i', 1) AS content_match
FROM test_data t
CONNECT BY
LEVEL <= REGEXP_COUNT(t.content_data, '<.*?>')
AND PRIOR sys_guid() IS NOT NULL
AND PRIOR content_id = content_id
ORDER BY t.content_id, LEVEL
Note that the CONNECT BY LEVEL method might be slower on large datasets, so I would avoid it if performance is a concern.
One option would be to use the instr() and substr() functions together within a
SELECT .. FROM .. CONNECT BY level style query, iterating once for each > (or <) sign found in each string:
SELECT id, substr(content,
instr(content,'<',1,level)+1,
instr(content,'>',1,level)-instr(content,'<',1,level)-1) as content
FROM tab
CONNECT BY level <= regexp_count(content,'>')
AND PRIOR sys_guid() IS NOT NULL
AND PRIOR id = id
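Since the question mentions Oracle 18c, the row generator can also be moved into a CROSS APPLY, the same construct used for splitting strings elsewhere on this page, which avoids the PRIOR sys_guid() condition. This is a sketch under assumptions: the tab sample rows, the position and tag aliases, and the second "no tags" row are made up for illustration.

```sql
-- Sketch: one row per <...> tag, generated per source row (12c and later)
WITH tab (id, content) AS (
  SELECT 1, '<TITLE> <SUB-TITLE-1> Content <SUB-TITLE-2>Content.' FROM dual UNION ALL
  SELECT 2, 'No tags here' FROM dual
)
SELECT t.id, s.position, s.tag
FROM   tab t
CROSS APPLY (
  SELECT LEVEL AS position,
         -- sixth argument 1 returns only the text inside the brackets
         REGEXP_SUBSTR(t.content, '<([^>]*)>', 1, LEVEL, NULL, 1) AS tag
  FROM   dual
  CONNECT BY LEVEL <= REGEXP_COUNT(t.content, '<[^>]*>')
) s
-- CONNECT BY always yields at least one row, so drop rows with no match
WHERE  s.tag IS NOT NULL
ORDER BY t.id, s.position;
```

The WHERE filter matters for rows without any tag: the hierarchical query still returns its root row, and the filter removes it. It also drops empty tags (<>), which may or may not be what you want.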
I tried a more conventional approach using SUBSTR and INSTR. The following is only an outline and does not run as written:
With data as
Select column1,
Trim(CONTENT, '<', '>') as col2 FROM
TABLE
WITH subdata as
( Select column1,
SUBSTR(col2,0, INSTR(col2, ' '))
as s1
from
data) t1
Union
( Select t1.column1 as col1,
SUBSTR(col2, Length(t1.s1)+1
INSTR(
SUBSTR(
t1.col2, Length(t1.s1)+1,
LENGTH(col2)), ' '))) as col2
From
data) t3
Union
........ t3.... t4
From table
Perfect approach, @Josh Eller!
Only, @Gerry Gry needs the result without the less-than/greater-than signs.
Try grouping with parentheses:
WITH test_data(content_id,content_data) AS (
SELECT 1 , '<TITLE> <SUB-TITLE-1> Content<SUB-TITLE-2>Content.<SUB-TITLE-3>Content.' FROM dual
UNION SELECT 2 , '<TITLE> <SUB-TITLE-1> Content<SUB-TITLE-2>Content.' FROM dual
)
SELECT t.content_id
, match_occurrence
, REGEXP_SUBSTR(
t.content_data -- input string
, '[<]([^>]*)[>]' -- regex
, 1 -- starting position
, s.match_occurrence -- n-th occurrence
, '' -- regexp modifier
, 1 -- captured-subexp
) AS content_match
FROM test_data t
CROSS JOIN (
SELECT 1 AS match_occurrence FROM dual
UNION SELECT 2 FROM dual
UNION SELECT 3 FROM dual
UNION SELECT 4 FROM dual
) s
WHERE
REGEXP_SUBSTR(
t.content_data -- input string
, '[<]([^>]*)[>]' -- regex
, 1 -- starting position
, s.match_occurrence -- n-th occurrence
, '' -- regexp modifier
, 1 -- captured-subexp
)
IS NOT NULL
ORDER BY t.content_id, s.match_occurrence
;
-- out Time: First fetch (0 rows): 0.656 ms. All rows formatted: 0.667 ms
-- out content_id | match_occurrence | content_match
-- out ------------+------------------+---------------
-- out 1 | 1 | TITLE
-- out 1 | 2 | SUB-TITLE-1
-- out 1 | 3 | SUB-TITLE-2
-- out 1 | 4 | SUB-TITLE-3
-- out 2 | 1 | TITLE
-- out 2 | 2 | SUB-TITLE-1
-- out 2 | 3 | SUB-TITLE-2
-- out (7 rows)
-- out
-- out Time: First fetch (7 rows): 29.904 ms. All rows formatted: 29.947 ms

Get list of what special characters and how many times in oracle column

I'm searching for a way to get a list of special characters and how many times each appears in my column. I've tried regexp_count, which works, but I'm not sure how to extend it to cover all special characters in one query.
For example, for syntax = 'x=y*100', the following query
select *
from (
select regexp_count(syntax, '\*') as charCnt, syntax
from tblTemp
)
where charCnt > 0
returns charCnt=1 and syntax='x=y*100'.
That is correct, but I want to get back:
specChar Cnt
* 1
= 1
etc.
Oracle Setup:
CREATE TABLE table_name(
id INT,
value NVARCHAR2(200)
);
INSERT INTO table_name
SELECT 1, N'y=20x+3' FROM DUAL UNION ALL
SELECT 2, N'***^%$%$%*&*.&\?' FROM DUAL UNION ALL
SELECT 3, UNISTR('\00B5\00B6\00B5') FROM DUAL UNION ALL
SELECT 4, N'!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()'
|| N'!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()'
|| N'!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()'
|| N'!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()!"£$%^&*()' FROM DUAL;
CREATE OR REPLACE TYPE CHAR_LIST IS TABLE OF CHAR(1 CHAR);
/
Query:
SELECT t.id,
--MAX( t.value ) AS value,
CAST( c.COLUMN_VALUE AS CHAR(1 CHAR) ) AS character,
COUNT(1) AS frequency
FROM table_name t,
TABLE(
CAST(
MULTISET(
SELECT SUBSTR( t.value, LEVEL, 1 )
FROM DUAL
WHERE REGEXP_LIKE( SUBSTR( t.value, LEVEL, 1 ), '[^a-zA-Z0-9]' )
CONNECT BY LEVEL <= LENGTH( t.value )
) AS CHAR_LIST
)
) c
GROUP BY t.id, c.COLUMN_VALUE
ORDER BY id, character;
Output:
ID CHARACTER FREQUENCY
---------- --------- ----------
1 + 1
1 = 1
2 $ 2
2 % 3
2 & 2
2 * 5
2 . 1
2 ? 1
2 \ 1
2 ^ 1
3 µ 2
3 ¶ 1
4 ! 20
4 " 20
4 $ 20
4 % 20
4 & 20
4 ( 20
4 ) 20
4 * 20
4 ^ 20
4 £ 20
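For a single string, the same idea can be tried without creating a collection type. This is a simplified sketch, not a replacement for the multi-row query above: it assumes exactly one input row, and the CTE names s and chars and the aliases spec_char and cnt are illustrative.

```sql
-- Sketch: count non-alphanumeric characters in one literal
WITH s AS (
  SELECT 'x=y*100' AS syntax FROM dual
),
chars AS (
  -- one row per character position
  SELECT SUBSTR(syntax, LEVEL, 1) AS ch
  FROM   s
  CONNECT BY LEVEL <= LENGTH(syntax)
)
SELECT ch AS spec_char,
       COUNT(*) AS cnt
FROM   chars
WHERE  REGEXP_LIKE(ch, '[^a-zA-Z0-9]')
GROUP BY ch
ORDER BY ch;
```

For 'x=y*100' this should return two rows, one for * and one for =, each with a count of 1.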

Oracle SQL, Need to repeat a number sequence based on the count of regexp from another column

I have CSV data in a column named component_id, in the form 800230,6015,6312,6315,700255,800170.
Using the count of commas, I need to generate a unique sequence for the component_id values in another column named component_instance_id,
e.g. 1, 2, 3, 4, 5, 6
+-------------------------------------+----------------+
| Component_id | Component |
+-------------------------------------+----------------+
| 800230,6015,6312,6315,700255,800170 | 1,2,3,4,5,6 |
| 800230,6015,6312,6315,700255,800170 | 7,8,9,10,11,12 |
| 800230,6015,6312,6315 | 13,14,15,16 |
+-------------------------------------+----------------+
You could use:
REGEXP_COUNT: to count the number of occurrences of the comma
REGEXP_SUBSTR: to split the delimited string into rows
CONNECT BY: for row generation
LISTAGG: for string aggregation
See "Split comma delimited strings in a table in Oracle".
For example:
WITH t_1 AS
( SELECT '800230,6015,6312,6315,700255,800170' component_id FROM dual
UNION ALL
SELECT '800230,6015,6312,6315,700255,800170' FROM dual
UNION ALL
SELECT '800230,6015,6312,6315' FROM dual
),
t AS
( SELECT ROWNUM ID, component_id FROM t_1
)
SELECT listagg(text, ',') WITHIN GROUP(
ORDER BY cv) component_id,
listagg(rn, ',') WITHIN GROUP(
ORDER BY cv) component_instance_id
FROM
(SELECT id, lines.column_value cv,
rownum rn,
trim(regexp_substr(component_id, '[^,]+', 1, lines.column_value)) text
FROM t,
TABLE (CAST (MULTISET
(SELECT LEVEL FROM dual CONNECT BY LEVEL <= regexp_count(component_id, ',')+1
) AS sys.odciNumberList ) ) lines
ORDER BY id
)
GROUP BY id;
COMPONENT_ID COMPONENT_INSTANCE_ID
----------------------------------- ------------------------------------------------
800230,6015,6312,6315,700255,800170 1,2,3,4,5,6
800230,6015,6312,6315,700255,800170 7,8,9,10,11,12
800230,6015,6312,6315 13,14,15,16
SQL>
NOTE: The WITH clause only builds the sample data for demonstration. You only need the main query, with the table and column names changed to match your requirements.
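The query above relies on ROWNUM to number the items across rows. As a hedged alternative for the numbering step only, a running total of the per-row item counts gives the first and last instance id for each row explicitly. This is a sketch under assumptions: it presumes an id column that defines row order, and the names n, offset_n, first_instance_id and last_instance_id are made up here.

```sql
-- Sketch: derive each row's id range from a running total of item counts
WITH t (id, component_id) AS (
  SELECT 1, '800230,6015,6312,6315,700255,800170' FROM dual UNION ALL
  SELECT 2, '800230,6015,6312,6315,700255,800170' FROM dual UNION ALL
  SELECT 3, '800230,6015,6312,6315' FROM dual
),
counted AS (
  SELECT id,
         component_id,
         REGEXP_COUNT(component_id, ',') + 1 AS n,
         -- items in all earlier rows; NVL covers the first row
         NVL(SUM(REGEXP_COUNT(component_id, ',') + 1)
               OVER (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS offset_n
  FROM   t
)
SELECT component_id,
       offset_n + 1 AS first_instance_id,
       offset_n + n AS last_instance_id
FROM   counted
ORDER BY id;
```

For the sample rows the ranges should be 1 to 6, 7 to 12 and 13 to 16, matching the sequences in the output above. Expanding each range into a comma-separated list would still need a row generator plus LISTAGG as in the main query.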

Preserve order when converting a delimited string to a column

I want to preserve the record order, which is provided as a comma-delimited string. The 5th item in my delimited string is null, and I need the 5th row to be null as well.
with test as
(select 'ABC,DEF,GHI,JKL,,MNO' str from dual
)
select rownum, regexp_substr (str, '[^,]+', 1, rownum) split
from test
connect by level <= length (regexp_replace (str, '[^,]+' )) + 1
The current result I'm getting puts this in the 6th position:
1 ABC
2 DEF
3 GHI
4 JKL
5 MNO
6
Order is preserved by your expression, but your regular expression doesn't match nulls correctly, so the 5th item disappears. The 6th row is NULL because there are no more matches after the 5th match.
You could do this instead:
with test as
(select 'ABC,DEF,GHI,JKL,,MNO' str from dual
)
SELECT rownum,
rtrim(regexp_substr(str || ',', '[^,]*,', 1, rownum), ',') split
FROM test
CONNECT BY LEVEL <= length(regexp_replace(str, '[^,]+')) + 1;
ROWNUM SPLIT
---------- ---------------------------------------------------------------
1 ABC
2 DEF
3 GHI
4 JKL
5
6 MNO
6 rows selected
Or this:
with test as
(select 'ABC,DEF,GHI,JKL,,MNO' str from dual
)
SELECT rownum,
regexp_substr(str, '([^,]*)(,|$)', 1, rownum, 'i', 1) split
FROM test
CONNECT BY LEVEL <= length(regexp_replace(str, '[^,]+')) + 1;
ROWNUM SPLIT
---------- ------------------------------------------------------------
1 ABC
2 DEF
3 GHI
4 JKL
5
6 MNO
6 rows selected
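The difference between the two patterns is easiest to see by asking each one for the 5th occurrence directly. The following is a minimal sketch on the same literal; the column aliases are illustrative.

```sql
-- Sketch: 5th occurrence under each pattern
SELECT REGEXP_SUBSTR('ABC,DEF,GHI,JKL,,MNO', '[^,]+', 1, 5)                   AS plus_pattern,
       REGEXP_SUBSTR('ABC,DEF,GHI,JKL,,MNO', '([^,]*)(,|$)', 1, 5, NULL, 1)   AS star_pattern
FROM   dual;
```

plus_pattern should return MNO, because '[^,]+' needs at least one character and so skips the empty slot between the two commas. star_pattern should return NULL, because '[^,]*' is allowed to match zero characters before the delimiter, which keeps the empty item in position 5.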
Try something like this (note that it only extracts single-character items, because SUBSTR is called with a length of 1):
SELECT
STR,
REPLACE ( SUBSTR ( STR,
CASE LEVEL
WHEN 1
THEN
0
ELSE
INSTR ( STR,
'~',
1,
LEVEL
- 1 )
END
+ 1,
1 ),
'~' )
FROM
(SELECT 'A~~C~~E' AS STR FROM DUAL)
CONNECT BY
LEVEL <= LENGTH ( REGEXP_REPLACE ( STR,
'[^~]+' ) )
+ 1;
This one works with items of any length:
SELECT
ROWNUM,
CAST ( REGEXP_SUBSTR ( STR,
'(.*?)(,|$)',
1,
LEVEL,
NULL,
1 ) AS CHAR ( 12 ) )
OUTPUT
FROM
(SELECT 'ABC,DEF,GHI,JKL,,MNO' AS STR FROM DUAL)
CONNECT BY
LEVEL <= REGEXP_COUNT ( STR,
',' )
+ 1;