Oracle: How to use regexp_substr in this case - sql

I have a table in Oracle where one of the columns contains user IDs in the form domain\username, e.g. "fin\george", "sales\andy", etc. How can I use the REGEXP_SUBSTR function to get only the username part of each user ID? That is, I want to fetch only "george", "andy", etc. I have achieved the desired result using the SUBSTR function, but I want to use REGEXP_SUBSTR in this case.
I tried doing this:
SELECT REGEXP_SUBSTR('fin\george','\[^\]+,') "UserName" FROM DUAL;
but it didn't help. Can anyone please point out my mistake?

I believe you want to use REGEXP_REPLACE with a backreference. I'm assuming that all the characters before and after the \ are alphabetic. If you allow numbers, you'd want to use [[:alnum:]] rather than [[:alpha:]].
SELECT REGEXP_REPLACE('fin\george',
'([[:alpha:]]+\\)([[:alpha:]]+)$',
'\2') "UserName"
FROM DUAL;
UserNa
------
george

SQL> SELECT REGEXP_SUBSTR('fin\george', '[^\]+', 1, 2) AS userId from dual;
USERID
------
george
See this Oracle Base article

select regexp_replace( 'fin\george', '.*\\', null ) from dual;
returns george.
The regex .*\\ matches any run of characters followed by a literal \ (which is escaped as \\). Because .* is greedy, it consumes as many characters as possible,
so the match extends up to and including the final \.
Then the matching string is replaced with null.
null is the default so
select regexp_replace('fin\george', '.*\\' ) from dual;
does the same thing
Same expression can extract filename from the end of pathname e.g.
select regexp_replace ('fin\fin2\fin3\fin4\george', '.*\\' ) from dual;
will also return george.

You have to use escaping: \\ instead of \

The easiest way (IMHO) to do this is the following:
SELECT REGEXP_SUBSTR('fin\george', '[^\\]+$') AS "UserName" FROM DUAL;
The issues with your original query were (a) that the \ character was not escaped and (b) there was an extraneous comma in the regular expression. I've used the end-of-string anchor $ here, assuming that there are not more than two elements delimited by \. If there are more than two, and you need only the second one, you can use the following:
SELECT REGEXP_SUBSTR('fin\george\ringo', '[^\\]+', 1, 2) AS "UserName"
FROM DUAL;
This tells Oracle to start looking at the first character of the string and return the second match.
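The logic of the two REGEXP_SUBSTR patterns above can be tried outside the database. What follows is a minimal sketch in Python, with the `re` module standing in for Oracle's regex engine. The two engines differ in details, so treat it as an illustration of the pattern logic only. The names `last_segment` and `nth_segment` are hypothetical helpers, not Oracle functions.

```python
import re
from typing import Optional


def last_segment(user_id: str) -> Optional[str]:
    # Run of non-backslash characters anchored at the end of the string,
    # in the spirit of REGEXP_SUBSTR(user_id, '[^\\]+$').
    # Returns None when nothing matches (e.g. the string ends with a backslash).
    match = re.search(r"[^\\]+$", user_id)
    return match.group(0) if match else None


def nth_segment(user_id: str, n: int) -> Optional[str]:
    # n-th run of non-backslash characters (1-based),
    # in the spirit of REGEXP_SUBSTR(user_id, '[^\\]+', 1, n).
    segments = re.findall(r"[^\\]+", user_id)
    return segments[n - 1] if 1 <= n <= len(segments) else None


print(last_segment(r"fin\george"))          # george
print(nth_segment(r"fin\george\ringo", 2))  # george
```

Anchoring with $ always returns the last element, while the occurrence argument picks a fixed position, so the two only agree when there are exactly two elements.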

Related

SQL Query to select a string after last delimiter

I want to retrieve the substring after the last occurrence of the ~ delimiter. I have a whole string like "Attachments:Attachments~Attachment" and I want the substring after the ~ character, so the output should be Attachment. How can this be done in a SQL/Oracle select statement?
Use REGEXP_SUBSTR
select regexp_substr('Attachments:Attachments~Attachment','[^~]+$') from dual;
[^ ] - Used to specify a nonmatching list where you are trying to match any character except for the ones in the list.
+ - Matches one or more occurrences
$ - Matches the end of a string
For simple pattern matching like this, you can use a combination of SUBSTR and INSTR instead, since a regular expression is typically more expensive than SUBSTR plus INSTR.
You can try the following:
substr(str,instr(str,'~',-1) + 1)
Example:
SQL> select substr('Attachments:Attachments~Attachment1~Attachment2',
instr('Attachments:Attachments~Attachment1~Attachment2','~',-1) + 1)
from dual;
SUBSTR('ATT
-----------
Attachment2
SQL>
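As an illustration of the two approaches in this thread (position arithmetic versus a regex anchored at the end), here is a small Python sketch. It is not Oracle code. The function names are hypothetical, and the sketch assumes a single-character delimiter.

```python
import re
from typing import Optional


def after_last_delimiter(text: str, delimiter: str = "~") -> str:
    # Position-based approach, analogous to SUBSTR(str, INSTR(str, '~', -1) + 1).
    # str.rfind returns -1 when the delimiter is absent, so the slice then
    # starts at index 0 and the whole string is returned.
    return text[text.rfind(delimiter) + 1:]


def after_last_delimiter_regex(text: str, delimiter: str = "~") -> Optional[str]:
    # Regex approach, analogous to the pattern '[^~]+$': one or more
    # non-delimiter characters anchored at the end of the string.
    match = re.search(f"[^{re.escape(delimiter)}]+$", text)
    return match.group(0) if match else None


sample = "Attachments:Attachments~Attachment1~Attachment2"
print(after_last_delimiter(sample))        # Attachment2
print(after_last_delimiter_regex(sample))  # Attachment2
```

The two versions differ when the string ends with the delimiter: the slice gives an empty string, while the regex finds no match at all.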

Oracle SQL - select parts of a string

How can I select abcdef.txt from the following string?
abcdef.123.txt
I only know how to select abcdef by doing select substr('abcdef.123.txt',1,6) from dual;
You can use || for concatenation and SUBSTR with -3 for the right-hand part:
select substr('abcdef.123.txt',1,6) || '.' ||substr('abcdef.123.txt',-3) from dual;
or, avoiding one concatenation (as suggested by Luc M):
select substr('abcdef.123.txt',1,7) || substr('abcdef.123.txt',-3) from dual;
A general solution, assuming the input string has exactly two periods (.) and you want to extract the first and third tokens, separated by a single period. The tokens in the input string can be of arbitrary length (including zero!) and can contain any characters other than the period. The separating periods are escaped as \. so that they match only a literal period.
select regexp_replace('abcde.123.xyz', '([^.]*)\.([^.]*)\.([^.]*)', '\1.\3') as result
from dual;
RESULT
---------
abcde.xyz
Explanation:
[ ] means match any of the characters between the brackets.
^ means do NOT match the characters in the brackets, so [^.] means match any character OTHER THAN .
* means match zero or more occurrences, as many as possible ("greedy" match).
( ... ) is called a subexpression; see below.
'\1.\3' means replace the original string with the first subexpression, followed by ., followed by the THIRD subexpression.
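The subexpression and backreference mechanism explained above behaves the same way in most regex engines. Here is a minimal Python sketch of it (Python's `re.sub`, not Oracle's REGEXP_REPLACE). The function name `drop_middle_token` is hypothetical, and the sketch adds ^ and $ anchors plus escaped periods as its own assumptions, so that strings without exactly two periods are left untouched.

```python
import re


def drop_middle_token(name: str) -> str:
    # Three subexpressions, each a (possibly empty) run of non-period
    # characters, separated by literal periods. The replacement keeps
    # subexpression 1, a period, and subexpression 3.
    return re.sub(r"^([^.]*)\.([^.]*)\.([^.]*)$", r"\1.\3", name)


print(drop_middle_token("abcdef.123.txt"))  # abcdef.txt
print(drop_middle_token("abcde.123.xyz"))   # abcde.xyz
```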
Replace the substring of anything surrounded by dots (inclusive) with a single dot. No dependence on lengths of components of the string:
SQL> select regexp_replace('abcdef.123.txt', '\..*\.', '.') fixed
from dual;
FIXED
----------
abcdef.txt
SQL>

Remove last two characters from each database value

I run the following query:
select * from my_temp_table
And get this output:
PNRP1-109/RT
PNRP1-200-16
PNRP1-209/PG
013555366-IT
How can I alter my query to strip the last two characters from each value?
Use the SUBSTR() function.
SELECT SUBSTR(my_column, 1, LENGTH(my_column) - 2) FROM my_table;
Another way using a regular expression:
select regexp_replace('PNRP1-109/RT', '^(.*).{2}$', '\1') from dual;
This replaces your string with group 1 from the regular expression, where group 1 (inside of the parens) includes the set of characters after the beginning of the line, not including the 2 characters just before the end of the line.
While not as simple for your example, arguably more powerful.
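For comparison, both ideas (cutting by length and cutting with a capture group) can be sketched in Python. This only illustrates the logic and is not Oracle code. The function names are hypothetical.

```python
import re


def strip_last_two(value: str) -> str:
    # Length-based approach, analogous to SUBSTR(col, 1, LENGTH(col) - 2).
    return value[:-2]


def strip_last_two_regex(value: str) -> str:
    # Regex approach with the same pattern as above: group 1 is everything
    # except the final two characters, and the whole match is replaced by it.
    return re.sub(r"^(.*).{2}$", r"\1", value)


for code in ["PNRP1-109/RT", "PNRP1-200-16", "PNRP1-209/PG", "013555366-IT"]:
    print(strip_last_two(code), strip_last_two_regex(code))
```

In this sketch the two versions differ for values shorter than two characters: the slice returns an empty string, while the regex does not match and leaves the value unchanged.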

How to get part of the string that matched with regular expression in Oracle SQL

Let's say I have the following string: 'product=1627;color=45;size=7' in some field of the table.
I want to query for the color and get 45.
With this query:
SELECT REGEXP_SUBSTR('product=1627;color=45;size=7', 'color\=([^;]+);?') "colorID"
FROM DUAL;
I get:
colorID
---------
color=45;
1 row selected.
Is it possible to get part of the matched string - 45 for this example?
One way to do it is with REGEXP_REPLACE. You need to define a pattern that matches the whole string and then use just the element you want as the replacement string. In this example the color ID is the third capture group in the pattern:
SELECT REGEXP_REPLACE('product=1627;color=45;size=7'
, '(.*)(color\=)([^;]+);?(.*)'
, '\3') "colorID"
FROM DUAL;
It is possible there may be less clunky regex solutions, but this one definitely works. Here's a SQL Fiddle.
Try something like this:
SELECT REGEXP_SUBSTR(REGEXP_SUBSTR('product=1627;color=45;size=7', 'color\=([^;]+);?'), '[[:digit:]]+') "colorID"
FROM DUAL;
From Oracle 11g onwards we can specify capture groups in REGEXP_SUBSTR.
SELECT REGEXP_SUBSTR('product=1627;color=45;size=7', 'color=(\d+);', 1, 1, 'i', 1) "colorID"
FROM DUAL;
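The capture-group idea (match the key and the value, return only the value) can be sketched in Python as follows. This is an illustration with Python's `re` module rather than Oracle's REGEXP_SUBSTR. The function name `get_field` is hypothetical, and the `(?:^|;)` prefix is an extra assumption of this sketch so that a key is only matched at the start of a field.

```python
import re
from typing import Optional


def get_field(record: str, key: str) -> Optional[str]:
    # Match "key=" at the start of the string or right after a semicolon,
    # then capture everything up to the next semicolon (group 1).
    match = re.search(rf"(?:^|;){re.escape(key)}=([^;]+)", record)
    return match.group(1) if match else None


record = "product=1627;color=45;size=7"
print(get_field(record, "color"))  # 45
print(get_field(record, "size"))   # 7
```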

How to Select a substring in Oracle SQL up to a specific character?

Say I have a table column that has results like:
ABC_blahblahblah
DEFGH_moreblahblahblah
IJKLMNOP_moremoremoremore
I would like to be able to write a query that selects this column from said table, but only returns the substring up to the Underscore (_) character. For example:
ABC
DEFGH
IJKLMNOP
The SUBSTRING function doesn't seem to be up to the task because it is position-based and the position of the underscore varies.
I thought about the TRIM function (the RTRIM function specifically):
SELECT RTRIM('listofchars' FROM somecolumn)
FROM sometable
But I'm not sure how I'd get this to work since it only seems to remove a certain list/set of characters and I'm really only after the characters leading up to the Underscore character.
Using a combination of SUBSTR, INSTR, and NVL (for strings without an underscore) will return what you want:
SELECT NVL(SUBSTR('ABC_blah', 0, INSTR('ABC_blah', '_')-1), 'ABC_blah') AS output
FROM DUAL
Result:
output
------
ABC
Use:
SELECT NVL(SUBSTR(t.column, 0, INSTR(t.column, '_')-1), t.column) AS output
FROM YOUR_TABLE t
Reference:
SUBSTR
INSTR
Addendum
If using Oracle10g+, you can use regex via REGEXP_SUBSTR.
This can be done easily using REGEXP_SUBSTR:
REGEXP_SUBSTR('STRING_EXAMPLE','[^_]+',1,1)
where STRING_EXAMPLE is your string.
For example:
SELECT
REGEXP_SUBSTR('STRING_EXAMPLE','[^_]+',1,1)
from dual
For this input it returns STRING.
You need to get the position of the first underscore (using INSTR) and then take the part of the string from the first character to (pos-1) using SUBSTR.
select 'ABC_blahblahblah' test_string,
instr('ABC_blahblahblah','_',1,1) position_underscore,
substr('ABC_blahblahblah',1,instr('ABC_blahblahblah','_',1,1)-1) result
from dual;
TEST_STRING      POSITION_UNDERSCORE RES
---------------- ------------------- ---
ABC_blahblahblah                   4 ABC
INSTR documentation
SUBSTR documentation
SELECT REGEXP_SUBSTR('STRING_EXAMPLE','[^_]+',1,1) from dual
is the right answer, as posted by user1717270
INSTR gives you a position on the assumption that the string contains "_". What if it doesn't? Then INSTR returns 0, so when you use that position to print the string, you get NULL.
Example: suppose you want to remove the domain from "host.domain". In some cases you will only have the short name, i.e. "host", and most likely you would like to print "host". With INSTR you get NULL, because no "." was found, i.e. it prints from 0 to 0. With REGEXP_SUBSTR you get the right answer in all cases:
SELECT REGEXP_SUBSTR('HOST.DOMAIN','[^.]+',1,1) from dual;
HOST
and
SELECT REGEXP_SUBSTR('HOST','[^.]+',1,1) from dual;
HOST
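The requirement in this thread (take the prefix before a delimiter, and fall back to the whole string when the delimiter is absent) can be sketched in Python to make the edge cases concrete. This is an illustration only, not Oracle code. The function names are hypothetical, and a single-character delimiter is assumed.

```python
import re
from typing import Optional


def prefix_by_partition(text: str, delimiter: str = "_") -> str:
    # str.partition returns the whole string as the head when the
    # delimiter is absent, which gives the desired fallback for free.
    head, _sep, _tail = text.partition(delimiter)
    return head


def prefix_by_regex(text: str, delimiter: str = "_") -> Optional[str]:
    # Same pattern shape as '[^_]+': the first run of non-delimiter
    # characters, wherever it starts in the string.
    match = re.search(f"[^{re.escape(delimiter)}]+", text)
    return match.group(0) if match else None


print(prefix_by_partition("ABC_blahblahblah"))  # ABC
print(prefix_by_regex("HOST.DOMAIN", "."))      # HOST
print(prefix_by_regex("HOST", "."))             # HOST
```

One case worth checking before choosing the regex form: the pattern finds the first run of non-delimiter characters, which is not necessarily at the start of the string. For an input such as "_abc", `prefix_by_regex` returns "abc" while `prefix_by_partition` returns an empty string.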
Another possibility would be the use of REGEXP_SUBSTR.
If the position of the string is not fixed, the SELECT statement below produces the expected output.
Table structure:
ID      VARCHAR2(100 BYTE)
CLIENT  VARCHAR2(4000 BYTE)
Data:
ID    CLIENT
1001  {"clientId":"con-bjp","clientName":"ABC","providerId":"SBS"}
1002  {"IdType":"AccountNo","Id":"XXXXXXXX3521","ToPricingId":"XXXXXXXX3521","clientId":"Test-Cust","clientName":"MFX"}
Requirement: search for the clientId key in the CLIENT column and return the corresponding value. For example, from "clientId":"con-bjp" the expected output is con-bjp.
select CLIENT,substr(substr(CLIENT,instr(CLIENT,'"clientId":"')+length('"clientId":"')),1,instr(substr(CLIENT,instr(CLIENT,'"clientId":"')+length('"clientId":"')),'"',1 )-1) cut_str from TEST_SC;
--
CLIENT cut_str
----------------------------------------------------------- ----------
{"clientId":"con-bjp","clientName":"ABC","providerId":"SBS"} con-bjp
{"IdType":"AccountNo","Id":"XXXXXXXX3521","ToPricingId":"XXXXXXXX3521","clientId":"Test-Cust","clientName":"MFX"} Test-Cust
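The nested SUBSTR/INSTR expression above boils down to two steps: find where the marker ends, then cut at the next double quote. A minimal Python sketch of those two steps follows. It illustrates the logic and is not Oracle code. The function name `value_after_marker` is hypothetical, and returning None when the marker is missing is a choice made for this sketch.

```python
from typing import Optional


def value_after_marker(text: str, marker: str = '"clientId":"') -> Optional[str]:
    # Step 1: locate the marker and move to the first character after it.
    start = text.find(marker)
    if start == -1:
        return None  # marker not present in this row
    start += len(marker)
    # Step 2: cut at the next double quote (or take the rest if there is none).
    end = text.find('"', start)
    return text[start:end] if end != -1 else text[start:]


row1 = '{"clientId":"con-bjp","clientName":"ABC","providerId":"SBS"}'
row2 = ('{"IdType":"AccountNo","Id":"XXXXXXXX3521","ToPricingId":"XXXXXXXX3521",'
        '"clientId":"Test-Cust","clientName":"MFX"}')
print(value_after_marker(row1))  # con-bjp
print(value_after_marker(row2))  # Test-Cust
```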
Keep this in mind if not all of the strings in the column contain an underscore
(otherwise the output will be NULL for those rows):
SELECT COALESCE
(SUBSTR("STRING_COLUMN" , 0, INSTR("STRING_COLUMN", '_')-1),
"STRING_COLUMN")
AS OUTPUT FROM DUAL
To extract a substring from a larger string:
string_value := 'This is String,Please search string';
Then, to find the string 'Ple' in string_value:
select substr(string_value,instr(string_value,'Ple'),length('Ple')) from dual;
The result is: Ple