I have the following SQL question:
How can I split the text in one column into two separate columns using a SELECT statement?
I need to split the text data at the space character.
An example makes this easier to explain:
SELECT COLUMN_A FROM TABLE1
output:
COLUMN_A
-----------
LORE IPSUM
desired output:
COLUMN_A COLUMN_B
--------- ----------
LORE IPSUM
Thank you all for the help.
Depends on the consistency of the data - assuming a single space is the separator between what you want to appear in column one vs two:
WITH TEST_DATA AS
(SELECT 'LOREM IPSUM' COLUMN_A FROM DUAL)
SELECT SUBSTR(t.COLUMN_A, 1, INSTR(t.COLUMN_A, ' ')-1) AS COLUMN_A,
SUBSTR(t.COLUMN_A, INSTR(t.COLUMN_A, ' ')+1) AS COLUMN_B
FROM test_data T;
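The query above relies on a space being present. When there is none, INSTR returns 0, the first expression becomes SUBSTR(col, 1, -1), which is NULL, and the whole value lands in COLUMN_B. A minimal sketch of one possible guard, assuming a hypothetical one-word row 'LOREM' that is not in the question, is to search in the value with a space appended:

```sql
WITH test_data AS (
  SELECT 'LOREM IPSUM' AS column_a FROM DUAL UNION ALL
  SELECT 'LOREM'       AS column_a FROM DUAL  -- hypothetical row without a space
)
SELECT -- Searching in column_a || ' ' guarantees that INSTR finds a space,
       -- so a one-word value stays in the first column.
       SUBSTR(t.column_a, 1, INSTR(t.column_a || ' ', ' ') - 1) AS column_a,
       -- A start position past the end of the string yields NULL,
       -- so a one-word value leaves the second column empty.
       SUBSTR(t.column_a, INSTR(t.column_a || ' ', ' ') + 1)    AS column_b
FROM test_data t;
```

Whether a one-word value belongs in the first or the second column depends on your data; swap the expressions if you need the opposite behaviour.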
You can also use the query below, which uses REGEXP_SUBSTR:
WITH TEST_DATA AS
(SELECT 'LOREM IPSUM' COLUMN_A FROM DUAL)
SELECT REGEXP_SUBSTR(t.COLUMN_A, '[^ ]+', 1, 1) COLUMN_A,
REGEXP_SUBSTR(t.COLUMN_A, '[^ ]+', 1, 2) COLUMN_B
FROM test_data T;
Oracle 10g+ has regex support, allowing more flexibility depending on the situation you need to solve. It also has a regex substring method...
EDIT:
3 WORDS SPLIT:
WITH TEST_DATA AS
(SELECT 'LOREM IPSUM DIMSUM' COLUMN_A FROM DUAL)
SELECT REGEXP_SUBSTR(t.COLUMN_A, '[^ ]+', 1, 1) COLUMN_A,
REGEXP_SUBSTR(t.COLUMN_A, '[^ ]+', 1, 2) COLUMN_B,
REGEXP_SUBSTR(t.COLUMN_A, '[^ ]+', 1, 3) COLUMN_C
FROM test_data T;
Reference:
SUBSTR
INSTR
The solution can be generalized using a counter and the PIVOT operator: the counter supplies the word number, and PIVOT turns the rows into columns.
WITH Counter (N) AS (
SELECT LEVEL FROM DUAL
CONNECT BY LEVEL <= (SELECT MAX(regexp_count( COLUMN_A, ' ')) + 1
FROM Table1)
)
SELECT Word_1, Word_2, Word_3, Word_4
FROM (SELECT t.COLUMN_A
, c.N N
, REGEXP_SUBSTR(t.COLUMN_A, '[^ ]+', 1, c.N) Word
FROM Table1 t
LEFT JOIN Counter c ON c.N <= regexp_count( COLUMN_A, ' ') + 1) b
PIVOT
(MAX(Word) FOR N IN (1 Word_1, 2 Word_2, 3 Word_3, 4 Word_4)) pvt
However, this still has a fixed column list in the PIVOT definition. For a truly general query, a dynamic pivot or PIVOT XML is needed.
INSERT INTO Rough (Tag_Id,Status_ , new_)
WITH TEST_DATA AS
(SELECT regexp_replace('&data' ,'\s+',' ') COLUMN_A FROM DUAL)
SELECT REGEXP_SUBSTR (REGEXP_SUBSTR (t.COLUMN_A, '[^-]+', 1, LEVEL), '[^ ]+', 1, 1) AS Col1,
REGEXP_SUBSTR (REGEXP_SUBSTR (t.COLUMN_A, '[^-]+', 1, LEVEL), '[^ ]+', 1, 2) AS Col2,
REGEXP_SUBSTR (REGEXP_SUBSTR (t.COLUMN_A, '[^-]+', 1, LEVEL), '[^ ]+', 1, 3) AS Col3
FROM test_data T
CONNECT BY LEVEL <= LENGTH (REGEXP_REPLACE (t.COLUMN_A, '[^-]+')) + 1;
I am struggling with a regex to split a string into columns in an Oracle database.
select (REGEXP_SUBSTR(replace('1:::9999', ' ',''), '[^: ]+', 1, 4)) from dual;
I need to obtain the 4th value from that string as a column value. Sometimes the values at positions 2 and 3 are empty, and then my query doesn't work. I am trying to figure out which regex will work.
You can use
select (REGEXP_SUBSTR(replace('1:::9999', ' ',''), '([^: ]*)(:|$)', 1, 4, 'i', 1)) from dual;
Here, the ([^: ]*)(:|$) matches
([^: ]*) - Group 1: any zero or more chars other than : and space
(:|$) - Group 2, either : or end of string.
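To make the difference concrete, here is a small side-by-side sketch using the string from the question. The column aliases are illustrative only, and the sixth argument (the subexpression number) assumes Oracle 11g or later:

```sql
SELECT -- '[^:]+' only matches non-empty runs, so '1:::9999' contains just two
       -- such tokens ('1' and '9999') and asking for the 4th returns NULL.
       REGEXP_SUBSTR('1:::9999', '[^:]+', 1, 4)                  AS skips_empty,
       -- '([^:]*)(:|$)' also matches empty fields, so the 4th match is the
       -- 4th field; subexpression 1 returns it without the delimiter.
       REGEXP_SUBSTR('1:::9999', '([^:]*)(:|$)', 1, 4, NULL, 1)  AS keeps_empty
FROM dual;
```

Here skips_empty is NULL and keeps_empty is 9999.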
You do not need a (slower) regex for this task, use simple substr/instr functions:
with input_(val) as (
select '1:::9999' from dual
union all
select '1:2::' from dual
union all
select '1:2::3:5' from dual
)
, replaced as (
select input_.*, replace(val, ' ', '') as val_replaced
from input_
)
select
val,
substr(
val_replaced,
/*Locate the third occurrence of a colon and get the substring after it ...*/
instr(val_replaced, ':', 1, 3) + 1,
/*.. until the end, if the next colon is absent, or until the next colon*/
nvl(nullif(instr(val_replaced, ':', 1, 4), 0), length(val_replaced) + 1) - instr(val_replaced, ':', 1, 3) - 1
) as col
from replaced
VAL       COL
--------  ----
1:::9999  9999
1:2::     null
1:2::3:5  3
fiddle with performance difference.
Example
123\.456.578.910.ABC
123\.456.578.910
Expected result
123\.456.578
123\.456.578
For both inputs I should get only the first 3 dot-separated parts.
I tried regexp, substring and instr, but I'm not getting the expected results.
We can use REGEXP_SUBSTR here with a capture group:
SELECT REGEXP_SUBSTR(col, '^(\d+(\.\d+){0,2})', 1, 1, NULL, 1)
FROM yourTable;
The traditional substr + instr combination is another option:
Sample data:
SQL> with test (col) as
2 (select '123\.456.578.910.ABC' from dual union all
3 select '123\.456.578.910' from dual
4 )
Query begins here:
5 select col,
6 substr(col, 1, instr(col, '.', 1, 3) - 1) result
7 from test;
COL RESULT
-------------------- --------------------
123\.456.578.910.ABC 123\.456.578
123\.456.578.910 123\.456.578
SQL>
If your value will always have at least 3 . characters, then you can use:
SELECT value,
SUBSTR(value, 1, INSTR(value, '.', 1, 3) - 1) AS expected
FROM table_name;
If it may have fewer and you want the entire string in those cases then:
SELECT value,
CASE INSTR(value, '.', 1, 3)
WHEN 0
THEN value
ELSE SUBSTR(value, 1, INSTR(value, '.', 1, 3) - 1)
END AS expected
FROM table_name;
Which, for your sample data:
CREATE TABLE table_name (value) AS
SELECT '123\.456.578.910.ABC' FROM DUAL UNION ALL
SELECT '123\.456.578.910' FROM DUAL;
Both outputs:
VALUE                EXPECTED
-------------------  -----------
123.456.578.910.ABC  123.456.578
123.456.578.910      123.456.578
I want to extract 2019/GA/0000104 from this string:
select REGEXP_SUBSTR('2019/0000015,2019/GA/0000104,2cdb376e-2966-4f24-9063-f4c6f31a6f35', '[^,]+')
from dual;
Desired output = 2019/GA/0000104
Can you help?
It seems that you want to extract the second substring. If that's so, then you could use
regexp_substr (result), or
substr + instr combination (result2)
SQL> with test (col) as
2 (select '2019/0000015,2019/GA/0000104,2cdb376e-2966-4f24-9063-f4c6f31a6f35' from dual)
3 select regexp_substr(col, '[^,]+', 1, 2) result,
4 --
5 substr(col, instr(col, ',', 1, 1) + 1,
6 instr(col, ',', 1, 2) - instr(col, ',', 1, 1) - 1
7 ) result2
8 from test;
RESULT RESULT2
--------------- ---------------
2019/GA/0000104 2019/GA/0000104
SQL>
Try using REGEXP_SUBSTR with a capture group:
SELECT
REGEXP_SUBSTR(input, ',(.*),', 1, 1, NULL, 1)
FROM yourTable;
This form of the regex returns the second occurrence of a string of characters that are followed by a comma or the end of the line. It returns the correct element if the first one should ever be NULL.
with tbl(str) as (
select '2019/0000015,2019/GA/0000104,2cdb376e-2966-4f24-9063-f4c6f31a6f35' from dual
)
select regexp_substr(str, '(.*?)(,|$)', 1, 2, NULL, 1)
from tbl;
I need to split a text like "Aa:One|Bb:Two,Three,Four|Cc:Five,Six" into rows and columns for the result to look like -
Col1 Col2
Aa One
Bb Two
Bb Three
Bb Four
Cc Five
Cc Six
I have tried using
SELECT REGEXP_SUBSTR (str, '[^:]+', 1, 1) AS COL1
,REGEXP_SUBSTR (str, '[^:]+', 1, 2) AS COL2
FROM (SELECT REGEXP_SUBSTR('Aa:One|Bb:Two,Three,Four|Cc:Five,Six', '[^|]+', 1, LEVEL) AS str
FROM DUAL
CONNECT BY INSTR('Aa:One|Bb:Two,Three,Four|Cc:Five,Six', '|', 1, LEVEL - 1) > 0
)
But I could only create
Col1 Col2
Aa One
Bb Two,Three,Four
Cc Five,Six
I am not sure how to split Col2 further on the comma (,) into separate rows, each keeping its Col1 value.
Any help in this regard would be greatly appreciated.
Thanks in advance! :-)
I did this for SQL Server; you can get the same result in Oracle by substituting the equivalent Oracle functions. Note that dbo.Split is a user-defined split function, not a built-in.
DECLARE @Id VARCHAR(100)='Aa:One|Bb:Two,Three,Four|Cc:Five,Six'
SELECT LEFT(items, CHARINDEX(':',items)-1)Col1
,RIGHT(items, CHARINDEX(':',REVERSE(items))-1)Col2
INTO #Temp
FROM dbo.Split(@Id,'|')
SELECT p.Col1, colortable.items as Col2
FROM #Temp p
cross apply split(p.Col2, ',') as colortable
DROP TABLE #Temp
And my Result is:
Col1 Col2
Aa One
Bb Two
Bb Three
Bb Four
Cc Five
Cc Six
I used a common table expression ('WITH t AS ()') as a temporary result set and built the final output with the query below:
WITH t AS (
SELECT TRIM(REGEXP_SUBSTR (str, '[^:]+', 1, 1)) AS tempCol1
,TRIM(REGEXP_SUBSTR (str, '[^:]+', 1, 2)) AS tempCol2
FROM (
SELECT TRIM(REGEXP_SUBSTR('Aa:One|Bb:Two,Three,Four|Cc:Five,Six|Dd:Seven, Eight, Nine, Ten', '[^|]+', 1, LEVEL)) str
FROM DUAL
CONNECT BY INSTR('Aa:One|Bb:Two,Three,Four|Cc:Five,Six|Dd:Seven, Eight, Nine, Ten', '|', 1, LEVEL - 1) > 0)
)
SELECT DISTINCT(Col1)
, Col2
FROM (
SELECT t.tempCol1 AS Col1
,TRIM(REGEXP_SUBSTR(t.tempCol2, '[^,]+', 1, LEVEL)) Col2
FROM t
CONNECT BY INSTR(t.tempCol2, ',', 1, LEVEL - 1) > 0)
ORDER BY Col1, Col2
I had to use DISTINCT because the inner query returns duplicate rows: CONNECT BY without a PRIOR condition connects the rows of t to each other at every level.
Thanks all for the help ! :)
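As a sketch of one way to avoid the duplicates instead of filtering them with DISTINCT, the hierarchy can be restricted so that each row only connects to itself. The inline sample rows below stand in for the intermediate result, and the approach assumes that col1 is unique per row and that REGEXP_COUNT is available (Oracle 11g or later):

```sql
WITH t (col1, col2) AS (
  -- Sample rows standing in for the Col1 / comma-separated Col2 result.
  SELECT 'Aa', 'One'            FROM DUAL UNION ALL
  SELECT 'Bb', 'Two,Three,Four' FROM DUAL UNION ALL
  SELECT 'Cc', 'Five,Six'       FROM DUAL
)
SELECT t.col1,
       TRIM(REGEXP_SUBSTR(t.col2, '[^,]+', 1, LEVEL)) AS col2
FROM t
-- Generate one level per comma-separated item in the current row.
CONNECT BY LEVEL <= REGEXP_COUNT(t.col2, ',') + 1
       -- Keep every branch of the hierarchy within its own source row.
       AND PRIOR t.col1 = t.col1
       -- A non-deterministic PRIOR expression keeps Oracle from reporting
       -- a loop when a row is connected to itself.
       AND PRIOR SYS_GUID() IS NOT NULL;
```

This returns one row per item (six rows for the sample data). Add an ORDER BY if the row order matters.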
I have a table from which I need to extract only certain parts of a record, separated by commas.
For example, I have
ABCD [1000-1987] BCD[101928-876] adgs[10987-786]
I want to get a result like:
1000-1987,101928-876,10987-786
Can you please help me get the result shown above?
If you don't use 11g and do not want to use wm_concat:
WITH
my_data AS (
SELECT 'ABCD [1000-1987] BCD[101928-876] adgs[10987-786]' AS val FROM dual
)
SELECT
ltrim(
MAX(
sys_connect_by_path(
rtrim(ltrim(regexp_substr(val, '\[[0-9-]*\]', 1, level, NULL), '['), ']'),
',')
),
',') AS val_part
FROM my_data
CONNECT BY regexp_substr(val, '\[[0-9-]*\]', 1, level, NULL) IS NOT NULL
;
If using wm_concat is ok for you (note that it is an undocumented, unsupported function):
WITH
my_data AS (
SELECT 'ABCD [1000-1987] BCD[101928-876] adgs[10987-786]' AS val FROM dual
)
SELECT
wm_concat(rtrim(ltrim(regexp_substr(val, '\[[0-9-]*\]', 1, level, NULL), '['), ']')) AS val_part
FROM my_data
CONNECT BY regexp_substr(val, '\[[0-9-]*\]', 1, level, NULL) IS NOT NULL
;
If you use 11g Release 2 or later (LISTAGG):
WITH
my_data AS (
SELECT 'ABCD [1000-1987] BCD[101928-876] adgs[10987-786]' AS val FROM dual
)
SELECT
listagg(regexp_substr(val, '[a-z ]*\[([0-9-]*)\] ?', 1, level, 'i', 1), ',') WITHIN GROUP (ORDER BY level) AS val_part
FROM my_data
CONNECT BY regexp_substr(val, '[a-z ]*\[([0-9-]*)\] ?', 1, level, 'i', 1) IS NOT NULL
;
Read more about string aggregation techniques: Tim Hall about aggregation techniques
Read more about regexp_substr: regexp_substr - Oracle Documentation - 10g
Read more about regexp_substr: regexp_substr - Oracle Documentation - 11g
You don't have to split and then aggregate it. You can use regexp_replace to keep only those characters within square brackets, then replace the square brackets by comma.
WITH my_data
AS (SELECT 'ABCD [1000-1987] BCD[101928-876] adgs[10987-786]' AS val
FROM DUAL)
SELECT RTRIM (
REPLACE (
REGEXP_REPLACE (val, '(\[)(.*?\])|(.)', '\2'),
']', ','),
',')
FROM my_data;