Split a string that does not have a delimiter in Oracle - sql

I have been banging my head against a problem. The following is the input string:
1034536455702130340053769240340002208520191202134036
What I need to do is split this string into the following
03453645570
03400537692
03400022085
Here, every string that needs to get picked starts with a '03'.
I can do it in PL/SQL by picking each substring that starts with '03' in a loop, trimming the extra characters on the left and right so that only 11 characters remain in each iteration, concatenating the values, and then using REGEXP_SUBSTR to get the desired result. However, this approach involves too much code. Is there a way to achieve this with a single SQL query?
SELECT UPPER (
REGEXP_SUBSTR ('03453645570,03400537692,03400022085',
'[^,]+',
1,
LEVEL))
AS VAL
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('03453645570,03400537692,03400022085',
'[^,]+',
1,
LEVEL)
IS NOT NULL

You can use your existing code with the original input string, and just change the regex to match 03 followed by 9 digits:
SELECT REGEXP_SUBSTR ('1034536455702130340053769240340002208520191202134036',
'03[0-9]{9}',
1,
LEVEL)
AS VAL
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('1034536455702130340053769240340002208520191202134036',
'03[0-9]{9}',
1,
LEVEL)
IS NOT NULL
Output
VAL
03453645570
03400537692
03400022085
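As a variation on the query above, the row generator can be driven by a match count instead of evaluating REGEXP_SUBSTR a second time in the CONNECT BY clause. This is only a sketch, assuming Oracle 11g or later (where REGEXP_COUNT is available); the CTE name src and its column str are made up for illustration. If the input contains no match at all, CONNECT BY still produces one row, with a NULL value.

```sql
-- Sketch: one row per match of '03' followed by 9 digits.
-- Assumes Oracle 11g+ for REGEXP_COUNT; "src" and "str" are illustrative names.
WITH src AS (
  SELECT '1034536455702130340053769240340002208520191202134036' AS str
  FROM DUAL
)
SELECT REGEXP_SUBSTR(str, '03[0-9]{9}', 1, LEVEL) AS VAL
FROM src
CONNECT BY LEVEL <= REGEXP_COUNT(str, '03[0-9]{9}');
```

Writing the input string once in the CTE also avoids having to keep two copies of the literal in sync.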

Building on @Nick's solution, the following is the code I used to optimize it. I have tested it on only 1k records, and it takes less than 1 second. I hope it helps.
--Table to store all sorts of strings
INSERT INTO TBL_SCRM_MSISDN
SELECT '0342244357903452274515236320191201091147' NUMBERS FROM DUAL
UNION
SELECT '03457064700420191201124242' FROM DUAL
UNION
SELECT '03414221723620191201130431' FROM DUAL
UNION
SELECT '1034536455702130340053769240340002208520191202134036' FROM DUAL;
-- Table used to store unique values
create table TBL_MSISDN
(
msisdn VARCHAR2(500)
);
--Using a loop to evaluate a single value one at a time
BEGIN
EXECUTE IMMEDIATE 'TRUNCATE TABLE TBL_MSISDN';
FOR C IN
(
SELECT * FROM TBL_SCRM_MSISDN
)
LOOP
INSERT INTO TBL_MSISDN
SELECT REGEXP_SUBSTR (C.NUMBERS,
'03[0-9]{9}',
1,
LEVEL)
AS VAL
FROM DUAL
CONNECT BY REGEXP_SUBSTR (C.NUMBERS,
'03[0-9]{9}',
1,
LEVEL) is not null;
END LOOP;
commit;
END;
/
SELECT * FROM TBL_MSISDN WHERE MSISDN IS NOT NULL;
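The same extraction can probably be done without the PL/SQL loop, as a single INSERT ... SELECT. The following is a sketch rather than a tested replacement: it reuses the table names TBL_SCRM_MSISDN and TBL_MSISDN from the code above, assumes Oracle 11g or later for REGEXP_COUNT, and generates the match index per row with a correlated collection of type SYS.ODCINUMBERLIST. The alias names t and n are illustrative.

```sql
-- Sketch: one INSERT for all source rows, no row-by-row loop.
-- Assumes Oracle 11g+ (REGEXP_COUNT) and the tables created above.
INSERT INTO TBL_MSISDN (msisdn)
SELECT REGEXP_SUBSTR(t.NUMBERS, '03[0-9]{9}', 1, n.COLUMN_VALUE)
FROM TBL_SCRM_MSISDN t,
     TABLE(CAST(MULTISET(
         -- One integer per match found in the current row
         SELECT LEVEL FROM DUAL
         CONNECT BY LEVEL <= REGEXP_COUNT(t.NUMBERS, '03[0-9]{9}')
     ) AS SYS.ODCINUMBERLIST)) n
-- CONNECT BY always yields at least one row, so skip strings with no match
WHERE REGEXP_COUNT(t.NUMBERS, '03[0-9]{9}') > 0;
```

Because rows without a match are filtered out, the final IS NOT NULL check on MSISDN should no longer be needed with this form.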

Try the query below:
select trim(regexp_substr('1$03453645570213$034005376924$0340002208520191202134$036','[^$]+', 1, level) ) value, level
from dual
connect by regexp_substr('1$03453645570213$034005376924$0340002208520191202134$036', '[^$]+', 1, level) is not null
order by level;

Related

How do I only have to input data once?

I am currently using the regexp_substr function to take a string of data (user input) and convert it into a list. Is there a way to make it so I don't have to input the data twice? This is what I currently have:
select regexp_substr((WITH X AS (SELECT ('&EmployeeID') a FROM DUAL) SELECT REPLACE(X.a,' ',',') FROM X),'[^,]+', 1, level) "Employee"
from dual
connect by regexp_substr ((WITH X AS (SELECT ('&EmployeeID') a FROM DUAL) SELECT REPLACE(X.a,' ',',') FROM X),'[^,]+', 1, level)
is not null;
Yes, you can clean it up a bit like this:
WITH X AS
(SELECT REPLACE('&EmployeeID',' ',',') a FROM DUAL)
select regexp_substr(X.a,'[^,]+', 1, level) "Employee"
from X
connect by regexp_substr(X.a,'[^,]+', 1, level) is not null;
Using xmltable with tokenize:
select *
from xmltable('tokenize(replace(.," ",","),",")'
passing '&EmployeeID'
columns s varchar2(100) path '.');
DBFiddle: https://dbfiddle.uk/?rdbms=oracle_21&fiddle=19ab9c444502f6e1fd3bdaa44e04ab27
It appears your client is sqlplus, where the ampersand ('&') is an indication to the client to prompt for user input. You can avoid being re-prompted for the same variable by adding a second ampersand in front of the variable name.
select regexp_substr((WITH X AS (SELECT ('&&EmployeeID') a FROM DUAL) SELECT REPLACE(X.a,' ',',') FROM X),'[^,]+', 1, level) "Employee"
from dual
connect by regexp_substr ((WITH X AS (SELECT ('&&EmployeeID') a FROM DUAL) SELECT REPLACE(X.a,' ',',') FROM X),'[^,]+', 1, level)
is not null;
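Note that a variable defined through && stays defined for the rest of the SQL*Plus session, so a later run will silently reuse the old value. One way to keep control over this is to prompt explicitly with ACCEPT and clear the variable with UNDEFINE afterwards. The script below is a sketch that assumes the client is SQL*Plus (or a tool that supports the same commands); the prompt text is illustrative.

```sql
-- SQL*Plus script sketch: prompt once, reuse the value, then clear it.
ACCEPT EmployeeID CHAR PROMPT 'Employee IDs (space-separated): '

WITH X AS
  (SELECT REPLACE('&EmployeeID', ' ', ',') a FROM DUAL)
SELECT REGEXP_SUBSTR(X.a, '[^,]+', 1, LEVEL) "Employee"
FROM X
CONNECT BY REGEXP_SUBSTR(X.a, '[^,]+', 1, LEVEL) IS NOT NULL;

-- Remove the definition so the next run prompts again.
UNDEFINE EmployeeID
```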

Get substring with REGEXP_SUBSTR

I need to use regexp_substr, but I can't use it properly
I have column (l.id) with numbers, for example:
1234567891123!123 EXPECTED OUTPUT: 1234567891123
123456789112!123 EXPECTED OUTPUT: 123456789112
12345678911!123 EXPECTED OUTPUT: 12345678911
1234567891123!123 EXPECTED OUTPUT: 1234567891123
I want to use regexp_substr to get the part before the exclamation mark (!):
SELECT REGEXP_SUBSTR(l.id,'[%!]',1,13) from l.table
Is this OK?
You can try using INSTR() and SUBSTR():
select substr(l.id,1,INSTR(l.id,'!', 1, 1)-1) from mytable l
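One caveat with the INSTR/SUBSTR form: when a value contains no '!', INSTR returns 0, the requested length becomes -1, and SUBSTR returns NULL rather than the whole value. The sketch below puts this form next to two regular-expression forms so the difference can be checked on your own data; the CTE tbl and its sample rows are made up for illustration.

```sql
-- Sketch: compare three ways of taking the text before the first '!'.
-- Sample rows are illustrative; the second one has no '!' at all.
WITH tbl (str) AS (
  SELECT '12345678911!123' FROM dual UNION ALL
  SELECT '12345'           FROM dual
)
SELECT str,
       SUBSTR(str, 1, INSTR(str, '!') - 1) AS via_instr,    -- NULL when no '!'
       REGEXP_REPLACE(str, '!.*', '')      AS via_replace,  -- whole value when no '!'
       REGEXP_SUBSTR(str, '^[^!]+')        AS via_substr    -- whole value when no '!'
FROM tbl;
```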
You want to remove the exclamation mark and all following characters it seems. That is simply:
select regexp_replace(id, '!.*', '') from mytable;
Look at it like a delimited string where the bang is the delimiter and you want the first element, even if it is NULL. Make sure to test all possibilities, even the unexpected ones (ALWAYS expect the unexpected)! Here the assumption is if there is no delimiter you'll want what's there.
The regex returns the first element followed by a bang or the end of the line. Note this form of the regex handles a NULL first element.
SQL> with tbl(id, str) as (
select 1, '1234567891123!123' from dual union all
select 2, '123456789112!123' from dual union all
select 3, '12345678911!123' from dual union all
select 4, '1234567891123!123' from dual union all
select 5, '!123' from dual union all
select 6, '123!' from dual union all
select 7, '' from dual union all
select 8, '12345' from dual
)
select id, regexp_substr(str, '(.*?)(!|$)', 1, 1, NULL, 1)
from tbl
order by id;
ID REGEXP_SUBSTR(STR
---------- -----------------
1 1234567891123
2 123456789112
3 12345678911
4 1234567891123
5
6 123
7
8 12345
8 rows selected.
SQL>
If you prefer REGEXP_SUBSTR over regexp_replace, you can use
SELECT REGEXP_SUBSTR(l.id,'^\d+')
assuming there are only digits before the !
If I understand correctly, this is the pattern that you want:
SELECT REGEXP_SUBSTR(l.id,'^[^!]+', 1)
FROM (SELECT '1234567891123!123' as id from dual) l

Retrieve certain number from data set in Oracle 10g

1. <0,0><120.96,2000><241.92,4000><362.88,INF>
2. <0,0><143.64,2000><241.92,4000><362.88,INF>
3. <0,0><125.5,2000><241.92,4000><362.88,INF>
4. <0,0><127.5,2000><241.92,4000><362.88,INF>
Above is the data set I have in Oracle 10g. I need output as below
1. 120.96
2. 143.64
3. 125.5
4. 127.5
The value I want is the number before the comma in the second group (120.96 in the first row). I tried using REGEXP_SUBSTR but could not get any output. It would be really helpful if someone could suggest an effective way to solve this.
Here is one method that first parses out the second element and then gets the first number in it:
select regexp_substr(regexp_substr(x, '<[^>]*>', 1, 2), '[0-9.]+', 1, 1)
Another method just gets the third number in the string:
select regexp_substr(x, '[0-9.]+', 1, 3)
Here is an approach without using regular expressions.
Find the position of the second occurrence of '<', then the position of the second occurrence of ',', and use those values in SUBSTR.
with
data as
(
select '<0,0><120.96,2000><241.92,4000><362.88,INF>' x from dual
UNION ALL
select '<0,0><143.64,2000><241.92,4000><362.88,INF>' x from dual
UNION ALL
select '<0,0><125.5,2000><241.92,4000><362.88,INF>' from dual
)
select substr(x, instr(x,'<',1,2)+1, instr(x,',',1,2)- instr(x,'<',1,2)-1)
from data
Approach using Regexp:
Identify the 2nd occurrence of a numeric value followed by a comma.
Then remove the trailing comma.
with
data as
(
select '<0,0><120.96,2000><241.92,4000><362.88,INF>' x from dual
UNION ALL
select '<0,0><143.64,2000><241.92,4000><362.88,INF>' x from dual
UNION ALL
select '<0,0><125.5,2000><241.92,4000><362.88,INF>' from dual
)
select
trim(TRAILING ',' FROM regexp_substr(x,'[0-9.]+,',1,2))
from data
This example uses regexp_substr to get the string contained between the 2nd occurrence of a less-than sign and the comma that follows it:
SQL> with tbl(id, str) as (
select 1, '<0,0><120.96,2000><241.92,4000><362.88,INF>' from dual union
select 2, '<0,0><143.64,2000><241.92,4000><362.88,INF>' from dual union
select 3, '<0,0><125.5,2000><241.92,4000><362.88,INF>' from dual union
select 4, '<0,0><127.5,2000><241.92,4000><362.88,INF>' from dual
)
select id,
regexp_substr(str, '<(.*?),', 1, 2, null, 1) value
from tbl;
ID VALUE
---------- -------------------------------------------
1 120.96
2 143.64
3 125.5
4 127.5
EDIT: I realized the OP specified 10g and the regexp_substr example I gave used the 6th argument (subgroup) which was added in 11g. Here is an example using regexp_replace instead which should work with 10g:
SQL> with tbl(id, str) as (
select 1, '<0,0><120.96,2000><241.92,4000><362.88,INF>' from dual union
select 2, '<0,0><143.64,2000><241.92,4000><362.88,INF>' from dual union
select 3, '<0,0><125.5,2000><241.92,4000><362.88,INF>' from dual union
select 4, '<0,0><127.5,2000><241.92,4000><362.88,INF>' from dual
)
select id,
regexp_replace(str, '^(.*?)><(.*?),.*$', '\2') value
from tbl;
ID VALUE
---------- ----------
1 120.96
2 143.64
3 125.5
4 127.5
SQL>

Regexp_replace processing result

I have a string with groups of numbers, and I'd like to make every group a constant length. Currently I use two regexp_replace calls: the first prepends 10 zeros to each group, and the second cuts each group down to its last 10 digits:
with s(txt) as ( select '1030123:12031:1341' from dual)
select regexp_replace(
regexp_replace(txt, '(\d+)','0000000000\1')
,'\d+(\d{10})','\1') from s ;
But I'd like to use only one regex, something like
regexp_replace(txt, '(\d+)',lpad('\1',10,'0'))
But it doesn't work, because lpad is executed before the regexp. Do you have any ideas?
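To see why the single-call attempt fails, it can help to evaluate the LPAD expression on its own. LPAD receives the literal two-character string '\1', not the matched digits, so it returns eight zeros followed by '\1', and that fixed string is what REGEXP_REPLACE then uses as the replacement. The sketch below only illustrates this evaluation order; the alias names are made up.

```sql
-- LPAD is evaluated first, on the two-character literal '\1'.
SELECT LPAD('\1', 10, '0') AS replacement_string FROM dual;

-- So the single call behaves like prefixing every group with eight zeros,
-- regardless of how many digits the group already has.
WITH s (txt) AS (SELECT '1030123:12031:1341' FROM dual)
SELECT REGEXP_REPLACE(txt, '(\d+)', '00000000\1') AS padded FROM s;
```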
With a slightly different approach, you can try the following:
with s(id, txt) as
(
select rownum, txt
from (
select '1030123:12031:1341' as txt from dual union all
select '1234:0123456789:1341' from dual
)
)
SELECT listagg(lpad(regexp_substr(s.txt, '[^:]+', 1, lines.column_value), 10, '0'), ':') within group (order by column_value) txt
FROM s,
TABLE (CAST (MULTISET
(SELECT LEVEL FROM dual CONNECT BY instr(s.txt, ':', 1, LEVEL - 1) > 0
) AS sys.odciNumberList )) lines
group by id
TXT
-----------------------------------
0001030123:0000012031:0000001341
0000001234:0123456789:0000001341
This uses the CONNECT BY to split every string based on the separator ':', then uses LPAD to pad to 10 and then aggregates the strings to build rows containing the concatenation of padded values
This works for non-empty groups; an empty group (e.g. the one in 123::456) is left empty:
with s(txt) as ( select '1030123:12031:1341' from dual)
select regexp_replace (regexp_replace (txt,'(\d+)',lpad('0',10,'0') || '\1'),'0*(\d{10})','\1')
from s
;

oracle SQL: get every 2nd value of a string

I am searching for a way to get every 2nd value of a string via SQL.
My string looks like this:
12:115:22:98
and I would like to get every 2nd value out of it, in this case 115 and 98.
Tried around with regexp_substr, but best I could do was to get every value with this:
select regexp_substr('3:113:1:14','[^ :]+', 2, level)
from dual
connect by regexp_substr('3:113:1:14','[^ :]+', 2, level)
is not null;
or just the 2nd value with this:
select regexp_substr('3: 113:1:14','[^ :]+', 2, 1)
from dual;
Is there a way or maybe another function to get this to work?
Thanks in advance,
Wod
edit:
it would also be possible to make the string look like this:
4-116:3-113:22-12
You were already so very very close...
select * from(
select regexp_substr('3:113:1:14:5:6:7:8','[^ :]+', 2, level), level lvl
from dual
connect by regexp_substr('3:113:1:14:5:6:7:8','[^ :]+', 2, level)
is not null
) where mod(lvl,2)=1;
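Note that starting the search at position 2 only skips the first value when that value is a single character, as in the sample string above. For an input such as 12:115:22:98 it may be safer to start at position 1 and keep the even-numbered tokens instead. The following is a sketch under that assumption; it also assumes Oracle 11g or later for REGEXP_COUNT, and the aliases val and lvl are illustrative.

```sql
-- Sketch: split on ':' from position 1 and keep tokens 2, 4, 6, ...
-- Assumes Oracle 11g+ for REGEXP_COUNT.
SELECT val
FROM (
  SELECT REGEXP_SUBSTR('12:115:22:98', '[^:]+', 1, LEVEL) AS val,
         LEVEL AS lvl
  FROM dual
  CONNECT BY LEVEL <= REGEXP_COUNT('12:115:22:98', '[^:]+')
)
WHERE MOD(lvl, 2) = 0
ORDER BY lvl;
```

For the string 12:115:22:98 this keeps the second and fourth tokens, 115 and 98.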
Using the normal SUBSTR and INSTR :
SQL> WITH DATA AS
( SELECT '12:115:22:98' STR FROM DUAL
UNION ALL
SELECT '3:113:1:14' FROM DUAL
)
SELECT SUBSTR(STR, INSTR(STR,':',1,1)+1, INSTR(STR,':',1,2)-INSTR(STR,':',1,1)-1)
||'-'
|| SUBSTR(STR, INSTR(STR,':',1,3)+1)
||':'
||str str
FROM data
/
STR
--------------------------------------
115-98:12:115:22:98
113-14:3:113:1:14