Regexp substr until underscore + digit - sql

I have this string: 'STRING_EXA2MP_3LE'. I want to obtain all the characters until there is an underscore followed by a digit, which in this case would output 'STRING_EXA2MP'. How could I obtain this?
This is what I have tried so far.
SELECT
regexp_substr('STRING_EXA2MP_3LE', '[^(_0-9]+', 1, 1)
FROM
dual

Select regexp_replace('STRING_EXA2MP_3LE', '_[0-9].*') from dual

I suggest
SELECT regexp_substr('STRING_EXA2MP_3LE', '(.+)_[0-9]+', 1, 1, NULL, 1)
FROM dual
The subexpression in parentheses ((.+)), which is subexpression #1 (important later) says "Match 1 or more of any character". The rest of the expression (_[0-9]+) says "Match an underscore followed by one or more numeric digits". The last argument to REGEXP_SUBSTR says "Return the value of subexpression #1". So using the subexpression here is important as it allows you to extract a portion of the matched string.
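One thing worth checking is how the greedy (.+) behaves when the input contains more than one underscore-plus-digit position. The sketch below uses a made-up input, 'AB_1CD_2EF' (not from the question), and compares the greedy group with a non-greedy (.+?) variant, which stops at the first such position.

```sql
-- 'AB_1CD_2EF' is a hypothetical input with two "_digit" positions.
SELECT regexp_substr('AB_1CD_2EF', '(.+)_[0-9]',  1, 1, NULL, 1) AS greedy_result,  -- AB_1CD
       regexp_substr('AB_1CD_2EF', '(.+?)_[0-9]', 1, 1, NULL, 1) AS lazy_result     -- AB
FROM dual;
```

For the string in the question both forms should give the same result, since it contains only one underscore followed by a digit.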

Try this:
SELECT regexp_substr('STRING_EXA2MP_3LE', '(.+_?)_[0-9]?', 1, 1,'i',1)
FROM dual;

Related

Get the data from a string between double quotes in Oracle

I have a string with double quotes inside.
EG:
<cosmtio :ff "intermit"ksks>
I need the data between the ""
I have tried the regexp_substr but still couldn't get the value between double-quotes.
We could try using REGEXP_REPLACE here:
SELECT
string,
REGEXP_REPLACE(string, '.*"([^"]+)".*', '\1') AS quoted_term
FROM yourTable;
Data:
WITH yourTable AS (
SELECT '<cosmtio :ff "intermit"ksks>' AS string FROM dual
)
Another option, using REGEXP_SUBSTR:
SELECT
string,
TRIM(BOTH '"' FROM REGEXP_SUBSTR(string, '".*"'))
FROM yourTable;
But this approach requires nesting two function calls, which means it might not outperform the REGEXP_REPLACE version.
You need to use REGEXP_SUBSTR:
SELECT REGEXP_SUBSTR('<cosmtio :ff "intermit"ksks>', '"([^"]+)"', 1, 1, NULL, 1) AS Result FROM DUAL
The regex is simple: "([^"]+)" matches ", then captures one or more characters other than " into Group 1, and then matches the closing ". The last argument, 1, tells Oracle REGEXP_SUBSTR to return the Group 1 value. The two 1s before NULL are the position and occurrence arguments, left at their default value of 1. NULL means no match options are passed to the regex engine.
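As a small illustration of the occurrence argument, the sketch below runs the same pattern against a made-up string with two quoted values (the input is an assumption, not from the question) and asks for the second match.

```sql
-- Hypothetical input containing two quoted terms.
SELECT REGEXP_SUBSTR('<tag "first" and "second">', '"([^"]+)"', 1, 1, NULL, 1) AS first_term,   -- first
       REGEXP_SUBSTR('<tag "first" and "second">', '"([^"]+)"', 1, 2, NULL, 1) AS second_term   -- second
FROM DUAL;
```

Only the fourth argument changes between the two calls; the capture group and the final 1 stay the same.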
You can try the following:
SELECT REGEXP_REPLACE('<cosmtio :ff "intermit"ksks>', '^[^"]*("([^"]*)")?.*', '\2') FROM dual
It is possible with regexp_substr as follows:
Select
regexp_substr('<cosmtio :ff "intermit"ksks>', '[^"]+', 1, 2)
from dual;
Cheers!!

How to get a file name without the extension using Regular Expressions

I have a field with the following values. I want to extract only those rows with "xyz" in the field value. Can you please help?
Mydata_xyz_aug21
Mydata2_zzz_aug22
Mydata3_xyz_aug33
One more requirement
I want to extract only "aIBM_MyProjectFile" from the string below. Can you please help me with this?
finaldata/mydata/aIBM_MyProjectFile.exe.ld
I've tried this but it didn't work.
select
regexp_substr('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld','([^/]*)[\.]') exp
from dual;
To extract substrings between the first pair of underscores, you need to use
regexp_substr('Mydata_xyz_aug21','_([^_]+)_', 1, 1, NULL, 1)
To get the file name without the extension, you need
regexp_substr('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld','.*/([^.]+)', 1, 1, NULL, 1)
Note that each regex contains a capturing group (a pattern inside (...)) and this value is accessed with the last 1 argument to the regexp_substr function.
The _([^_]+)_ pattern finds the first _, then places 1 or more chars other than _ into Group 1 and then matches another _.
The .*/([^.]+) pattern matches the whole text up to the last /, then captures 1 or more chars other than . into Group 1 using ([^.]+).
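To tie the first pattern back to the sample rows, here is a minimal sketch that filters on the token between the first pair of underscores. The CTE name sample_rows and its column val are placeholders chosen for illustration.

```sql
-- sample_rows / val are hypothetical names; the values come from the question.
WITH sample_rows (val) AS (
  SELECT 'Mydata_xyz_aug21'  FROM dual UNION ALL
  SELECT 'Mydata2_zzz_aug22' FROM dual UNION ALL
  SELECT 'Mydata3_xyz_aug33' FROM dual
)
SELECT val
FROM sample_rows
-- Keep rows whose first underscore-delimited token is exactly 'xyz'.
WHERE regexp_substr(val, '_([^_]+)_', 1, 1, NULL, 1) = 'xyz';
```

Unlike a plain substring search, this compares only the middle token, so a value with xyz elsewhere in the string would not qualify.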
For the first requirement, it would suffice to use LIKE, as posted in answer above:
SELECT column
FROM table
WHERE column LIKE '%xyz%';
For your second requirement (extraction) you will have to use REGEXP_SUBSTR function:
SELECT REGEXP_SUBSTR ('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld', '.*/([^.]+)', 1, 1, NULL, 1)
FROM DUAL
I hope it helped!
Another way to do this is to skip regexp completely:
WITH
aset AS
(SELECT 'with_extension.txt' txt FROM DUAL
UNION ALL
SELECT 'without_extension' FROM DUAL)
SELECT CASE
WHEN INSTR (txt, '.', -1) > 0
THEN
SUBSTR (txt, 1, INSTR (txt, '.', -1) - 1)
ELSE
txt
END
txt
FROM aset
The result of this is
with_extension
without_extension
A BIG Caveat where the regexp is better:
My method doesn't handle this case correctly:
\this\is.a\test
So after I have gone to all this effort, stay with the regexp solutions. I'll leave this here so that others may learn from it.
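For what it is worth, the INSTR approach can probably be patched for that case by stripping only when the last dot comes after the last backslash. This is a sketch under the assumption that backslash is the only path separator in the data; it is not a replacement for the regexp answers.

```sql
WITH aset (txt) AS (
  SELECT 'with_extension.txt' FROM dual UNION ALL
  SELECT 'without_extension'  FROM dual UNION ALL
  SELECT '\this\is.a\test'    FROM dual
)
SELECT txt,
       CASE
         -- Strip only when the last dot sits after the last backslash.
         WHEN INSTR(txt, '.', -1) > INSTR(txt, '\', -1)
         THEN SUBSTR(txt, 1, INSTR(txt, '.', -1) - 1)
         ELSE txt
       END AS without_ext
FROM aset;
```

With this guard the third row is left unchanged, because its only dot belongs to a directory name rather than to the final path element.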

regexp_substr strip text between first forward slash and second one

/abc/required_string/2/ should return abc with regexp_substr
SELECT REGEXP_SUBSTR ('/abc/blah/blah/', '/([a-zA-Z0-9]+)/', 1, 1, NULL, 1) first_val
from dual;
You might try the following:
SELECT TRIM('/' FROM REGEXP_SUBSTR(mycolumn, '^\/([^\/]+)'))
FROM mytable;
This regular expression will match the first occurrence of a pattern starting with / (I habitually escape /s in regular expressions, hence \/ which won't hurt anything) and including any non-/ characters that follow. If there are no such characters then it will return NULL.
Hope this helps.
You can search for /([^/]+)/, which says:
/ forward slash
( start of subexpression (usually called "group" in other languages)
[^/] any character other than forward slash
+ match the preceding expression one or more times
) end of subexpression
/ forward slash
You can use the 6th argument to regexp_substr to select a subexpression.
Here we pass 1 to match only the characters between the /s:
select regexp_substr(txt, '/([^/]+)/', 1, 1, null, 1)
from t1
Classic SUBSTR + INSTR offer a simple solution; I know you specified regular expressions, but - consider this too, might work better for a large data volume.
with test (col) as
(select '/abc/required_string/2/' from dual)
select substr(col, 2, instr(col, '/', 1, 2) - 2) result
from test;
RES
---
abc
Here's another way to get the 2nd occurrence of a string of characters followed by a forward slash. It handles the problem if that element happens to be NULL as well. Always expect the unexpected!
Note: If you use the regex form of [^/]+, and that element is NULL it will return "required string" which is NOT what you expect! That form does NOT handle NULL elements. See here for more info: [https://stackoverflow.com/a/31464699/2543416]
with tbl(str) as (
select '/abc/required_string/2/' from dual union all
select '//required_string1/3/' from dual
)
select regexp_substr(str, '(.*?)(/)', 1, 2, null, 1)
from tbl;
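To make the difference concrete, the following sketch runs both forms against the row whose first element is empty. The column aliases are illustrative only.

```sql
WITH tbl (str) AS (
  SELECT '//required_string1/3/' FROM dual
)
SELECT regexp_substr(str, '[^/]+', 1, 1)             AS skips_empty,  -- required_string1
       regexp_substr(str, '(.*?)(/)', 1, 2, NULL, 1) AS keeps_empty   -- NULL
FROM tbl;
```

The [^/]+ form cannot match an empty element, so it silently moves on to the next non-empty one, while the (.*?)(/) form counts the empty element and returns NULL for it.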

Removal of first characters in a string oracle sql

It may be very simple question, but I have run out of ideas.
I would like to remove first 5 characters from string.
Example string will look like:
1Y40K100R
I would like to display only digits that are after '%K' which in this case should give me result of 100R.
Please note that number after 'K' can have different amount of digits. It can be 4 digit number or 2 digit number.
Just use substr():
select substr(col, 6)
This returns all characters starting at the sixth.
There are multiple ways to return all characters after the K. If you know the string has a K, then use instr():
select substr(col, instr(col, 'K') + 1)
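If the K might be missing, instr() returns 0 and the expression above falls back to the whole string. A guarded variant could look like the sketch below; the second sample value is invented to show that case, and the CTE and alias names are placeholders.

```sql
WITH samples (col) AS (
  SELECT '1Y40K100R' FROM dual UNION ALL
  SELECT '1Y40100R'  FROM dual   -- hypothetical value without a K
)
SELECT col,
       substr(col, instr(col, 'K') + 1) AS unguarded,  -- whole string when K is absent
       CASE
         WHEN instr(col, 'K') > 0
         THEN substr(col, instr(col, 'K') + 1)
       END AS guarded                                  -- NULL when K is absent
FROM samples;
```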
You can use regexp_substr
select regexp_substr('1Y40K100R', '(K)(.*)', 1, 1, 'i', 2) from dual
A way without regexp:
select substr('1Y40K100R', instr('1Y40K100R', 'K') +1) from dual
This may appear not so elegant, but it usually performs better than the regexp way.

How to extract value between 2 slashes

I have a string like "1490/2334/5166400411000434" from which I need to derive value after second slash. I tried below logic
select REGEXP_SUBSTR('1490/2334/5166400411000434','[^/]+',1,3) from dual;
It is working fine. But when I don't have a value between the first and second slash, it returns blank.
For example my string is "1490//5166400411000434" and am trying
select REGEXP_SUBSTR('1490//5166400411000434','[^/]+',1,3) from dual;
It is returning blank. Please suggest what I am missing.
If I understand well, you may need
regexp_substr(t, '(([^/]*/){2})([^/]*)', 1, 1, 'i', 3)
This handles the first 2 parts like 'xxx/' and then checks for a sequence of non / characters; the parameter 3 is used to get the 3rd matching subexpression, which is what you want.
For example:
with test(t) as (
select '1490/2334/5166400411000434' from dual union all
select '1490//5166400411000434' from dual union all
select '1490//5166400411000434/ramesh/3344' from dual
)
select t, regexp_substr(t, '(([^/]*/){2})([^/]*)', 1, 1, 'i', 3) as substr
from test
gives:
T                                  SUBSTR
---------------------------------- ----------------------------------
1490/2334/5166400411000434         5166400411000434
1490//5166400411000434             5166400411000434
1490//5166400411000434/ramesh/3344 5166400411000434
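The same idea seems to generalize to other positions by changing the repeat count in {2}. A short sketch, using one of the test strings above and illustrative aliases:

```sql
WITH test (t) AS (
  SELECT '1490//5166400411000434' FROM dual
)
SELECT regexp_substr(t, '(([^/]*/){1})([^/]*)', 1, 1, NULL, 3) AS second_field,  -- NULL (empty field)
       regexp_substr(t, '(([^/]*/){2})([^/]*)', 1, 1, NULL, 3) AS third_field    -- 5166400411000434
FROM test;
```

Because [^/]* may match zero characters, an empty field comes back as NULL instead of shifting the result to the next field.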
You can REVERSE() your string and take the value before the first slash. And then reverse again to obtain the desired output.
select reverse(regexp_substr(reverse('1490//5166400411000434'), '[^/]+', 1, 1)) from dual;
It can also be done with basic substring and instr function:
select reverse(SUBSTR(reverse('1490//5166400411000434'), 0, INSTR(reverse('1490//5166400411000434'), '/')-1)) from dual;
Use other options in REGEXP_SUBSTR to match a pattern:
select REGEXP_SUBSTR('1490//5166400411000434','(/\d*)/(\d+)',1,1,'x',2) from dual
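Basically, the pattern looks for a slash optionally followed by digits, then a second slash followed by one or more digits. The search starts at position 1 and takes the first occurrence. The 'x' option ignores whitespace characters in the pattern, and the final argument, 2, returns the second subexpression, that is, the digits captured by the second pair of parentheses.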
... pattern,1,1,'x',subexp2)
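The 'x' option only matters when the pattern itself contains whitespace. As a rough sketch, the same pattern can be written with spaces between its parts for readability, and with 'x' it should behave like the compact version:

```sql
-- The spaces inside the pattern are ignored because of the 'x' match option.
SELECT REGEXP_SUBSTR('1490//5166400411000434', '(/\d*) / (\d+)', 1, 1, 'x', 2) AS spaced_pattern
FROM dual;
```

Without 'x', those spaces would have to match literal spaces in the input, so the call would be expected to return NULL for this string.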