Substring Regular Expression for occurrence - sql

I want to select the part of the string that occurs after the first underline _ and before the second one, however many more underlines _ occur in the string.
For example I have strings such as:
75618_LORIK1_2_BABA_ODD_GENERIC
19_GENTRIT3_CC_DD_FF_BROWSERTC
75618_BETIM2
Output should be:
LORIK1
GENTRIT3
BETIM2
I can't seem to find a substring expression that gets that part. I tried using:
SELECT SUBSTR(COLNAME, 0, INSTR(COLNAME, '_')-1) FROM DUAL;
But it seems to get only the part before the first occurrence of '_'.

Here's one way to do this with regular expressions.
with
test_data (str) as (
select '75618_LORIK1_2_BABA_ODD_GENERIC' from dual union all
select '19_GENTRIT3_CC_DD_FF_BROWSERTC' from dual union all
select '75618_BETIM2' from dual union all
select 'NO UNDERLINES HERE' from dual
)
select str, regexp_substr(str, '[^_]+', 1, 2) as second_token
from test_data
;
STR                             SECOND_TOKEN
------------------------------- -------------------------------
75618_LORIK1_2_BABA_ODD_GENERIC LORIK1
19_GENTRIT3_CC_DD_FF_BROWSERTC  GENTRIT3
75618_BETIM2                    BETIM2
NO UNDERLINES HERE
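An occurrence-based pattern such as '[^_]+' counts only non-empty runs, so an empty token (two adjacent underlines) is skipped and every later token shifts by one position. If empty tokens matter, a capture group placed right after the first underline is one possible alternative. The sketch below assumes Oracle 11g or later (for the sixth, subexpression argument of REGEXP_SUBSTR); the row '75618__EMPTY' is a made-up test value, not from the question.

```sql
with
test_data (str) as (
select '75618_LORIK1_2_BABA_ODD_GENERIC' from dual union all
select '75618_BETIM2' from dual union all
select '75618__EMPTY' from dual union all   -- hypothetical row: empty second token
select 'NO UNDERLINES HERE' from dual
)
select str,
       -- capture group 1 is whatever sits between the first underline
       -- and the next underline (or the end of the string)
       regexp_substr(str, '_([^_]*)', 1, 1, null, 1) as second_token
from test_data;
```

For '75618__EMPTY' this should return NULL (the second token really is empty), whereas '[^_]+' with occurrence 2 would return EMPTY.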

Related

Ltrim trimming extra character

I have the below code:
SELECT
ltrim('REASON_ACTIVE_DCA', 'REASON_') reason
FROM
dual
However, I'm obtaining 'CTIVE_DCA'. What's happening and how can I get 'ACTIVE_DCA' with ltrim?
Because LTRIM() treats its second argument as a set of characters, not as a prefix. So all leading "R"s and "E"s and so on are removed, including the "A" of ACTIVE. In fact, the ordering of the characters in the second string is irrelevant, so you would get the same result with '_NOSAER'.
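A quick way to see the set behaviour is to run LTRIM with the same characters in a different order. This is only an illustrative query; the third column uses a made-up value whose remainder shares no characters with the trim set.

```sql
select ltrim('REASON_ACTIVE_DCA', 'REASON_') as same_order,   -- CTIVE_DCA
       ltrim('REASON_ACTIVE_DCA', '_NOSAER') as shuffled,     -- CTIVE_DCA
       ltrim('REASON_XYZ', 'REASON_')        as no_overlap    -- XYZ (hypothetical input)
from dual;
```

The first two columns match because trimming stops at the first character that is not in the set, which here is the C of ACTIVE.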
If you want to remove the leading string of REASON_ -- if present -- then you don't use trim(). Instead, one method is:
select (case when 'REASON_ACTIVE_DCA' LIKE 'REASON$_%' ESCAPE '$'
then substr('REASON_ACTIVE_DCA', 8)
else 'REASON_ACTIVE_DCA'
end) as reason
from dual
There are other ways, such as:
select regexp_replace('REASON_ACTIVE_DCA', '^REASON_', '') from dual
I would do it with regular string functions (not regular expressions), and using INSTR instead of LIKE so I don't have to worry about escaping underscore.
Something like this - including a few sample strings in the WITH clause for testing:
with
inputs (i_str) as (
select 'REASON_ACTIVE_DCA' from dual union all
select 'REASON_NOT_GIVEN' from dual union all
select null from dual union all
select 'REASON-SPECIAL' from dual union all
select 'REASON_' from dual union all
select 'REASON' from dual
)
select i_str, substr(i_str, case instr(i_str, 'REASON_')
when 1 then 1 + length('REASON_')
else 1 end) as new_str
from inputs;
I_STR             NEW_STR
----------------- -----------------
REASON_ACTIVE_DCA ACTIVE_DCA
REASON_NOT_GIVEN  NOT_GIVEN
REASON-SPECIAL    REASON-SPECIAL
REASON_
REASON            REASON

How to remove all characters except 'E' in oracle

I have a string like 'AA_0331L_02317_R5_P', and I want to remove any letter other than 'E' from the second part (after splitting on the _ character). So 0331N should become 0331, while 0331E should stay 0331E. In other words, AA_0331N_02317_R5_P should become AA_0331_02317_R5_P, and AA_0331E_02317_R5_P should stay AA_0331E_02317_R5_P. I tried the following, without any luck:
SELECT REGEXP_REPLACE(REGEXP_SUBSTR( 'AA_0331L_02317_R5_P' , '[^_]+', 1, 2 ), '[^0-9]', '')
FROM dual
You might try something like the following -- keeping in mind that REGEXP_REPLACE() will return the original string if nothing is actually replaced. Here I'm using backreferences (if Oracle regexes had lookahead I could have omitted the 2nd capturing group and backreference):
WITH mytable AS (
SELECT 'AA_0331L_02317_R5_P' AS myvalue
FROM dual
UNION ALL
SELECT 'AA_0331N_02317_R5_P'
FROM dual
UNION ALL
SELECT 'AA_0331E_02317_R5_P'
FROM dual
)
SELECT myvalue
, REGEXP_REPLACE(myvalue, '^([^_]+_[^_]+)[^E](_)', '\1\2') mynewvalue
FROM mytable;
MYVALUE                   MYNEWVALUE
------------------------- -------------------------
AA_0331L_02317_R5_P       AA_0331_02317_R5_P
AA_0331N_02317_R5_P       AA_0331_02317_R5_P
AA_0331E_02317_R5_P       AA_0331E_02317_R5_P
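Another approach extracts the first two parts, strips a trailing character from them if it is not 'E', and then appends the rest of the string:
with s as (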
select 'AA_0331L_02317_R5_P' str from dual union all
select 'AA_0331E_02317_R5_P' str from dual)
select str,
regexp_replace(regexp_substr(str, '[^_]+_[^_]+'), '[^E]$') || regexp_replace(str, '[^_]+_[^_]+', '', 1, 1) new_str
from s;
STR                            NEW_STR
------------------------------ ------------------------------
AA_0331L_02317_R5_P            AA_0331_02317_R5_P
AA_0331E_02317_R5_P            AA_0331E_02317_R5_P
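One edge case worth testing: when the second part has no trailing letter at all, a bare [^E] can match the last digit instead, so a digit gets dropped. The sketch below compares the '[^E]' pattern with a stricter variant. It assumes the second part is always digits followed by at most one letter, which the question suggests but does not state; the third row is a made-up test value.

```sql
with s (str) as (
select 'AA_0331L_02317_R5_P' from dual union all
select 'AA_0331E_02317_R5_P' from dual union all
select 'AA_0331_02317_R5_P' from dual    -- hypothetical row: no trailing letter
)
select str,
       -- [^E] also accepts a digit, so row 3 loses its final 1
       regexp_replace(str, '^([^_]+_[^_]+)[^E](_)', '\1\2') as loose_pattern,
       -- only a single non-digit, non-E, non-underscore character is removed
       regexp_replace(str, '^([^_]+_[0-9]*)[^0-9E_](_)', '\1\2') as stricter_pattern
from s;
```

With the stricter pattern, the first row should become AA_0331_02317_R5_P and the other two should come back unchanged; the loose pattern should turn the third row into AA_033_02317_R5_P.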

Get substring with REGEXP_SUBSTR

I need to use regexp_substr, but I can't get it to work properly.
I have column (l.id) with numbers, for example:
1234567891123!123 EXPECTED OUTPUT: 1234567891123
123456789112!123 EXPECTED OUTPUT: 123456789112
12345678911!123 EXPECTED OUTPUT: 12345678911
1234567891123!123 EXPECTED OUTPUT: 1234567891123
I want to use regexp_substr to get the part before the exclamation mark (!):
SELECT REGEXP_SUBSTR(l.id,'[%!]',1,13) from l.table
Is that OK?
You can try using INSTR() and SUBSTR():
select substr(l.id, 1, instr(l.id, '!', 1, 1) - 1) from mytable l
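Note that a plain SUBSTR/INSTR combination returns NULL when the string contains no '!', because INSTR returns 0 and the requested length becomes -1. If such rows should keep their full value, a CASE guard is one way to handle it. This is a minimal sketch with inline test rows; the values '!123' and '12345' are added here purely for illustration.

```sql
with tbl (id) as (
select '1234567891123!123' from dual union all
select '!123' from dual union all     -- nothing before the delimiter
select '12345' from dual              -- no delimiter at all
)
select id,
       substr(id, 1, instr(id, '!') - 1) as plain_substr,    -- NULL for '12345'
       case when instr(id, '!') > 0
            then substr(id, 1, instr(id, '!') - 1)
            else id                                           -- keep the whole value
       end as guarded_substr
from tbl;
```

Both columns should be NULL for '!123', since there is nothing before the delimiter; only the guarded version returns 12345 for the last row.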
You want to remove the exclamation mark and all following characters it seems. That is simply:
select regexp_replace(id, '!.*', '') from mytable;
Look at it like a delimited string where the bang is the delimiter and you want the first element, even if it is NULL. Make sure to test all possibilities, even the unexpected ones (ALWAYS expect the unexpected)! Here the assumption is if there is no delimiter you'll want what's there.
The regex returns the first element followed by a bang or the end of the line. Note this form of the regex handles a NULL first element.
with tbl(id, str) as (
select 1, '1234567891123!123' from dual union all
select 2, '123456789112!123' from dual union all
select 3, '12345678911!123' from dual union all
select 4, '1234567891123!123' from dual union all
select 5, '!123' from dual union all
select 6, '123!' from dual union all
select 7, '' from dual union all
select 8, '12345' from dual
)
select id, regexp_substr(str, '(.*?)(!|$)', 1, 1, NULL, 1)
from tbl
order by id;
        ID REGEXP_SUBSTR(STR
---------- -----------------
         1 1234567891123
         2 123456789112
         3 12345678911
         4 1234567891123
         5
         6 123
         7
         8 12345
8 rows selected.
If you like to use REGEXP_SUBSTR rather than regexp_replace then you can use
SELECT REGEXP_SUBSTR(l.id,'^\d+')
assuming you have only numbers before !
If I understand correctly, this is the pattern that you want:
SELECT REGEXP_SUBSTR(l.id,'^[^!]+', 1)
FROM (SELECT '1234567891123!123' as id from dual) l

How to select exactly 7 or 10 digits in Oracle using a regular expression

I am working on the query below. I expect it to select only the values that are exactly 7 or 10 digits, using a regular expression. I used the expression in Oracle's regexp_like() function, but it's not working. Please help.
Query :
select * from
(select '1234567CELL' "a" from dual
union
select '123CaLLAsasd12' "a" from dual
union
select 'as9960488188CELLas12' "a" from dual
union
select '1234567' "a" from dual
union
select '9960488188' "a" from dual
union
select 'asdCELLqw' "a" from dual) b
where b."a" like '%CELL%' and regexp_like(b."a",'^(\d{7}|\d{10})$');
Expected output:
"1234567"
"9960488188"
Only the two rows above should be returned.
^ and $ match the start and end of the string, so a value cannot both contain the string CELL and consist solely of a 7- or 10-digit number. Instead you could use the regular expression (^|\D)(\d{7}|\d{10})($|\D), which matches either the start of the string or a non-digit character (^|\D), then either 7 or 10 digits, and then either the end of the string or a non-digit character ($|\D).
Like this:
WITH data ( a ) AS (
select '1234567CELL' from dual union
select '123CaLLAsasd12' from dual union
select 'as9960488188CELLas12' from dual union
select '1234567' from dual union
select '9960488188' from dual union
select 'asdCELLqw' from dual
)
SELECT a,
REGEXP_SUBSTR( a, '(^|\D)(\d{7}|\d{10})($|\D)', 1, 1, NULL, 2 ) AS val
FROM data
WHERE a LIKE '%CELL%'
AND REGEXP_LIKE( a, '(^|\D)(\d{7}|\d{10})($|\D)');
Output:
A                    VAL
-------------------- ----------
1234567CELL          1234567
as9960488188CELLas12 9960488188
You may just use
where regexp_like(b."a",'^([[:digit:]]{7}|[[:digit:]]{10})$')
Since the pattern is anchored (^ matches the start of the string and $ matches the end of the string) there can't be CELL inside the entries you fetch, and you can remove where b."a" like '%CELL%' from the query.
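To check the anchored pattern on its own, the sample rows can be run without the LIKE filter. The sketch below reuses the question's values and adds one made-up row, '12345678', to show that a digit count other than 7 or 10 is rejected.

```sql
with data (a) as (
select '1234567CELL' from dual union all
select 'as9960488188CELLas12' from dual union all
select '1234567' from dual union all
select '9960488188' from dual union all
select '12345678' from dual            -- hypothetical row: 8 digits
)
select a
from data
-- ^ and $ force the whole value to be exactly 7 or exactly 10 digits
where regexp_like(a, '^([[:digit:]]{7}|[[:digit:]]{10})$');
```

Only 1234567 and 9960488188 should come back.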

Fetching value from Pipe-delimited String using Regex (Oracle)

I have sample source strings like the ones below, in pipe-delimited format, in which the value obr can appear anywhere. I need the second value after the first occurrence of obr. So for the source strings below, the expected output is as follows.
Source string:
select 'asd|dfg|obr|1|value1|end' text from dual
union all
select 'a|brx|123|obr|2|value2|end' from dual
union all
select 'hfv|obr|3|value3|345|pre|end' from dual
Expected output:
value1
value2
value3
I tried the regexp below in Oracle SQL, but it does not work properly.
with t as (
select 'asd|dfg|obr|1|value1|end' text from dual
union all
select 'a|brx|123|obr|2|value2|end' from dual
union all
select 'hfv|obr|3|value3|345|pre|end' from dual
)
select text,to_char(regexp_replace(text,'*obr\|([^|]*\|)([^|]*).*$', '\2')) output from t;
It works when the string starts with OBR, but not when OBR is in the middle, as in the samples above.
Any help would be appreciated.
Not sure of how Oracle handles regular expressions, but starting with an asterisk usually implies that you're looking for zero or more null characters.
Have you tried '^.*obr\|([^|]*\|)([^|]*).*$' ?
This handles null elements and is wrapped in a NVL() call which supplies a value if 'obr' is not found or occurs too far toward the end of a record so a value 2 away is not possible:
with t(id, text) as (
select 1, 'asd|dfg|obr|1|value1|end' from dual
union
select 2, 'a|brx|123|obr|2|value2|end' from dual
union
select 3, 'hfv|obr|3|value3|345|pre|end' from dual
union
select 4, 'hfv|obr||value4|345|pre|end' from dual
union
select 5, 'a|brx|123|obriem|2|value5|end' from dual
union
select 6, 'a|brx|123|obriem|2|value6|obr' from dual
)
select
id,
nvl(regexp_substr(text, '\|obr\|[^|]*\|([^|]*)(\||$)', 1, 1, null, 1), 'value not found') value
from t;
        ID VALUE
---------- -----------------------------
         1 value1
         2 value2
         3 value3
         4 value4
         5 value not found
         6 value not found
6 rows selected.
The regex basically can be read as "look for a pattern of a pipe, followed by 'obr', followed by a pipe, followed by zero or more characters that are not a pipe, followed by a pipe, followed by zero or more characters that are not a pipe (remembered in a captured group), followed by a pipe or the end of the line". The regexp_substr() call then returns the 1st captured group which is the set of characters between the pipes 2 fields from the 'obr'.
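Because that pattern requires a pipe in front of 'obr', a record whose very first field is 'obr' would not match. If that case can occur, one possible variation is to accept either the start of the string or a pipe before 'obr', which moves the wanted value to the second capture group. This is a sketch, not a tested drop-in replacement; the first row is a made-up test value.

```sql
with t (id, text) as (
select 1, 'obr|1|value1|end' from dual union all            -- hypothetical: 'obr' is the first field
select 2, 'a|brx|123|obr|2|value2|end' from dual union all
select 3, 'a|brx|123|obriem|2|value5|end' from dual
)
select id,
       -- group 1 is now (^|\|), so the value two fields after 'obr' is group 2
       nvl(regexp_substr(text, '(^|\|)obr\|[^|]*\|([^|]*)(\||$)', 1, 1, null, 2),
           'value not found') as value
from t;
```

Rows 1 and 2 should return value1 and value2, and row 3 should still report 'value not found' because 'obriem' is not the field 'obr'.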