I have the following text string stored in an Oracle 11g table:
"MGK8M76HRT Confirmed. You have received Kshs 6,678.00 from Peter 0700123456 on 1/1/2018"
I would like to extract the following from the text using regexp
6,678.00 - amount paid
MGK8M76HRT - unique payment transaction code (the pattern changes every time)
0700123456 - phone number
1/1/2018 - payment date
I have tried multiple Oracle regexp patterns to extract the texts without any success. Any assistance or ideas will be appreciated.
I tried:
CONFIRMATION_CODE_PATTERN = "[A-Z0-9]+ Confirmed.";
PHONE_PATTERN = "07[\\d]{8}";
AMOUNT_PATTERN = "Ksh[,|.|\\d]+";
DATETIME_PATTERN = "d/M/yy hh:mm a";
Note that in Oracle regex you cannot use regex escapes inside bracket expressions: [\d] does not match a digit, it matches either a \ or a d character. Use [0-9] or [[:digit:]] instead. Next, use capturing groups, (...), to wrap the parts of the pattern that you want to extract.
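To see the bracket-expression behavior described above, a minimal sketch can compare the three forms on a made-up input ('a7d' is an illustrative value, not from the question):

```sql
-- 'a7d' is a hypothetical test string chosen so the difference is visible.
SELECT REGEXP_SUBSTR('a7d', '[\d]')        AS bracket_escape, -- per the note above: matches the letter d, not the digit
       REGEXP_SUBSTR('a7d', '[0-9]')       AS digit_range,    -- matches 7
       REGEXP_SUBSTR('a7d', '[[:digit:]]') AS posix_class     -- matches 7
FROM dual;
```

Outside brackets, \d keeps its usual meaning, which is why the patterns in the queries in this answer can still use it.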
You may use the following regular expressions:
select regexp_substr('MGK8M76HRT Confirmed. You have received Kshs 6,678.00 from Peter 0700123456 on 1/1/2018',
'Kshs\s*(\d([,.0-9]*\d)?)', 1, 1, NULL, 1) as Paid from dual;
select regexp_substr('MGK8M76HRT Confirmed. You have received Kshs 6,678.00 from Peter 0700123456 on 1/1/2018',
'(\D|^)(07\d{8})(\D|$)', 1, 1, NULL, 2) as Phone from dual;
select regexp_substr('MGK8M76HRT Confirmed. You have received Kshs 6,678.00 from Peter 0700123456 on 1/1/2018',
'(\S+)\s+Confirmed\.', 1, 1, NULL, 1) as Code from dual;
select regexp_substr('MGK8M76HRT Confirmed. You have received Kshs 6,678.00 from Peter 0700123456 on 1/1/2018',
'\d{1,2}/\d{1,2}/\d{4}') as TrDate from dual;
Please combine these as your requirements dictate; that part does not seem to be in the scope of the question.
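One possible way to combine the four patterns is a single SELECT over the source column. The sketch below is only an illustration: the names payments and msg are hypothetical stand-ins for the real table and column, and the conversions assume a day-first date and comma-grouped amounts, which the question does not state outright.

```sql
-- Sketch only: the WITH clause stands in for the real table;
-- "payments" and "msg" are hypothetical names.
WITH payments (msg) AS (
  SELECT 'MGK8M76HRT Confirmed. You have received Kshs 6,678.00 from Peter 0700123456 on 1/1/2018'
  FROM dual
)
SELECT REGEXP_SUBSTR(msg, '(\S+)\s+Confirmed\.', 1, 1, NULL, 1)   AS code,
       -- assumption: amounts use a comma as thousands separator and a period for decimals
       TO_NUMBER(REGEXP_SUBSTR(msg, 'Kshs\s*(\d([,.0-9]*\d)?)', 1, 1, NULL, 1),
                 '999,999,999.99')                                AS paid,
       REGEXP_SUBSTR(msg, '(\D|^)(07\d{8})(\D|$)', 1, 1, NULL, 2) AS phone,
       -- assumption: the date is day/month/year
       TO_DATE(REGEXP_SUBSTR(msg, '\d{1,2}/\d{1,2}/\d{4}'),
               'DD/MM/YYYY')                                      AS tr_date
FROM payments;
```

If the conversions are not wanted, dropping TO_NUMBER and TO_DATE leaves the four values as plain strings.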
I have values in a table in a format similar to the one below. I only want to retrieve the string between A-E>> and the first >> that follows it, so in the case below it would be Chubb Fire & Security Pty Ltd - ABN 47000067541.
A-E>>Chubb Fire & Security Pty Ltd - ABN 47000067541>>C2004/10539>>My Docs
I have tried using REGEXP_SUBSTR(path,'A-E>>([^.]+)>>',1,1,NULL,1) and other variations, but it also returns values past the >>. For example, it would return
Chubb Fire & Security Pty Ltd - ABN 47000067541>>C2004/10539>>My Docs
Any ideas what I have missed in my Regex?
One option is REGEXP_REPLACE() with capture groups: match the whole string and replace it with the second group, which holds the piece between A-E>> and the next >, such as
SELECT REGEXP_REPLACE(path,'^(.*A-E>>)([^>]*).*','\2') AS new_path
FROM t -- your table
I'd rather suggest simple & fast substr + instr combination; extract substring between the 1st and the 2nd occurrence of the >> sign.
Sample data:
with test(col) as
  (select 'A-E>>Chubb Fire & Security Pty Ltd - ABN 47000067541>>C2004/10539>>My Docs' from dual)
Query:
select substr(col, instr(col, '>>', 1, 1) + 2,
              instr(col, '>>', 1, 2) - instr(col, '>>', 1, 1) - 2
             ) result
from test;
RESULT
-----------------------------------------------
Chubb Fire & Security Pty Ltd - ABN 47000067541
You could try this regex pattern in your query: A-E>>([^.>]+)>>.*$
SELECT
REGEXP_SUBSTR(
'A-E>>Chubb Fire & Security Pty Ltd - ABN 47000067541>>C2004/10539>>My Docs',
'A-E>>([^.>]+)>>.*$',1,1,NULL,1)
FROM DUAL
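Note that the class [^.>] in the pattern above also excludes periods, so a name such as 'Pty. Ltd' would stop the match early. A minimal variant, assuming the wanted segment never contains a > character, excludes only > and drops the trailing .*$, which REGEXP_SUBSTR does not need:

```sql
-- Sketch: the WITH clause reuses the sample value from the question.
WITH t (path) AS (
  SELECT 'A-E>>Chubb Fire & Security Pty Ltd - ABN 47000067541>>C2004/10539>>My Docs' FROM dual
)
-- [^>]+ cannot run past a '>', so the capture ends at the first '>>' after 'A-E>>'
SELECT REGEXP_SUBSTR(path, '^A-E>>([^>]+)>>', 1, 1, NULL, 1) AS first_segment
FROM t;
```

This also shows why the original attempt with [^.]+ overshot: that class only excludes periods, so it happily consumes the >> separators as well.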
SELECT REGEXP_REPLACE('A-E>>Chubb Fire Security Pty Ltd - ABN 47000067541>>C2004/10539>>My DoC>>some other text',
'^[^>]*>>([^>]+).*$', '\1') AS extracted_text
FROM DUAL;
'^[^>]*': Matches any characters at the start of the string that are not ">" (i.e. the characters before the first ">>").
'>>': Matches the first occurrence of ">>"
'([^>]+)': Matches one or more characters that are not ">" and captures them in a group (i.e. the characters between the first and second ">>")
'.*$': Matches any characters to the end of the string.
Alternatively, use the same function with the pattern (.+[a-zA-Z])(>>)([a-zA-Z].+?)>>.* and the substitution "&3" or "\3".
I am trying to extract a value between the brackets from a string.
This question (How to extract a string between brackets in oracle sql query) explains how to do it.
But in my situation the string spans two lines, and with that approach I only get NULL.
SELECT REGEXP_SUBSTR('Gupta, Abha (01792)', '\((.+)\)', 1, 1, NULL, 1) FROM dual --01792
SELECT REGEXP_SUBSTR('Gupta, Abha (01
792)', '\((.+)\)', 1, 1, NULL, 1) FROM dual -- NULL
I know that I can remove the line break and then use REGEXP_SUBSTR, but I need to keep the line break.
I would address this with the following regex:
\(([^)]*)\)
This makes use of a custom character class, [^)], which means: everything but a closing parenthesis. This way, you do not have to worry about line breaks (since, obviously, a line break is not a closing parenthesis), or any other special character:
Demo on DB Fiddle:
SELECT REGEXP_SUBSTR('Gupta, Abha (01
792)', '\(([^)]*)\)') res FROM dual
| RES |
| :------------------- |
| (01 |
| 792) |
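The result above still includes the parentheses because the whole match is returned. A minimal sketch of two ways to return only the inner value follows; it builds the line break with CHR(10) purely so the example is easy to copy. The first column asks for capture group 1 of the negated-class pattern, and the second keeps the original \((.+)\) pattern but passes the 'n' match parameter, which lets the dot match a newline.

```sql
SELECT REGEXP_SUBSTR('Gupta, Abha (01' || CHR(10) || '792)',
                     '\(([^)]*)\)', 1, 1, NULL, 1) AS negated_class,       -- group 1 only, no parentheses
       REGEXP_SUBSTR('Gupta, Abha (01' || CHR(10) || '792)',
                     '\((.+)\)',    1, 1, 'n',  1) AS dot_matches_newline  -- 'n': the dot also matches newline
FROM dual;
```

Both columns keep the line break inside the returned value. The original query returned NULL because, by default, the dot does not match a newline.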
I've got pretty dirty data of client addresses. For each client, there are 2 or more addresses in one string. Using regular expressions in Oracle, I want to extract the first one.
It would be very easy if the addresses always had the same separator, such as ';'. But sometimes the separator is a comma, and a comma is also used within an address to separate city, street, and building.
I've got Russian addresses so I translated them for you.
For example, I have a string with multiple addresses:
A comma is a separator, but it also separates blocks inside addresses.
So I could match the first address by matching everything until the second '\sul\.'.
But I don't know how to do it. Regexp_substr(address, '.*,\sul') will return
This is far from what I need.
So how can I extract everything up to the second ,\sul\. ?
Russia, Moscow, ul. Tverskaya, d.32 should be returned.
You could address this requirement using SUBSTR and INSTR instead of regexes. The following expression should give you what you need:
SUBSTR(v, 1, INSTR(v, ', ul.', 1, 2) - 1)
INSTR() finds the position of the second occurrence of the string ', ul.' in the source string, and SUBSTR() selects everything from the beginning of the string up to that position (minus 1).
Example:
WITH t AS (
SELECT 'Russia, Moscow, ul. Tverskaya, d.32, ul. Yakimanka, d21, ul. Kalinina, d.43' address FROM DUAL
)
SELECT SUBSTR(address, 1, INSTR(address, ', ul.', 1, 2) - 1) adress1 FROM t
| ADRESS1 |
| :---------------------------------- |
| Russia, Moscow, ul. Tverskaya, d.32 |
NB: this works as long as there are indeed at least two occurrences of the given pattern in the string. If you happen to have values that do not match this spec and that you want to preserve, you would need an additional level of testing, like:
CASE INSTR(address, ', ul.', 1, 2)
WHEN 0 THEN address
ELSE SUBSTR(address, 1, INSTR(address, ', ul.', 1, 2) - 1)
END adress1
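Since the question asked for a regular expression, the same guarded logic can be sketched with REGEXP_INSTR, which accepts an occurrence argument just like INSTR. The pattern ',\s*ul\.' is an assumption that tolerates a varying amount of whitespace after the comma, and the second row is a hypothetical single-address value added to exercise the guard.

```sql
WITH t (address) AS (
  SELECT 'Russia, Moscow, ul. Tverskaya, d.32, ul. Yakimanka, d21, ul. Kalinina, d.43' FROM dual
  UNION ALL
  SELECT 'Russia, Moscow, ul. Tverskaya, d.32' FROM dual  -- hypothetical row with only one address
)
SELECT CASE REGEXP_INSTR(address, ',\s*ul\.', 1, 2)       -- position of the 2nd ", ul." (0 if absent)
         WHEN 0 THEN address                              -- fewer than two matches: keep the value as-is
         ELSE SUBSTR(address, 1, REGEXP_INSTR(address, ',\s*ul\.', 1, 2) - 1)
       END AS address1
FROM t;
```

For both sample rows this yields Russia, Moscow, ul. Tverskaya, d.32.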
In a table (football_team), the values in a column (names) look like this: Andrew Luck , QB . I want to split this column into 3 columns (first_name, last_name, position) using PL/SQL functions.
I tried this:
select regexp_substr(names,'[^ ,"]+',1,1) as first_name,
regexp_substr(names,'[^ ,"]+',1,2) as last_name,
regexp_substr(names,'[^ ,"]+',1,3) as position from football_team;
It doesn't work.
Can I do it using only the SUBSTR and INSTR functions?
Please help me. Thanks in advance.
Yes you could use string functions too but IMO regexp is much simpler here. Well as long as you can read regexps, YMMV.
The problem is in regular expressions. Try this instead:
with names(name) as (
select 'Andrew Luck , QB' from dual
union all
select 'John Doe , WB' from dual
)
select
regexp_substr(name, '^([[:alpha:]]+)', 1, 1, '', 1) as firstname
,regexp_substr(name, '^[[:alpha:]]+[[:space:]]+([[:alpha:]]+)', 1, 1, '', 1) as lastname
,regexp_substr(name, '([[:alpha:]]+)$', 1, 1, '', 1) as position
from names
;
Returns:
FIRSTNAME LASTNAME POSITION
--------- -------- --------
Andrew Luck QB
John Doe WB
The firstname matching regular expressions explained:
^ -- start of the string
( -- start of subexpression group that is referenced by regexp_substr parameter #6 (subexpr)
[ -- start of matching character list
[:alpha:] -- character class expression: alphabet.
] -- end of matching character list
+ -- matches one or more occurrences of the preceding subexpression
) -- end of subexpression group
The explanation of the other two regexps is left as an exercise for the OP.
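For the SUBSTR and INSTR route mentioned in the question, a minimal sketch might look like the following. It assumes a single-word first name followed by a space and exactly one comma before the position, and it uses TRIM (beyond the two functions asked about) to drop the stray spaces around the comma.

```sql
WITH names (name) AS (
  SELECT 'Andrew Luck , QB' FROM dual
  UNION ALL
  SELECT 'John Doe , WB' FROM dual
)
SELECT SUBSTR(name, 1, INSTR(name, ' ') - 1)                 AS first_name, -- up to the first space
       TRIM(SUBSTR(name,
                   INSTR(name, ' ') + 1,
                   INSTR(name, ',') - INSTR(name, ' ') - 1)) AS last_name,  -- between first space and comma
       TRIM(SUBSTR(name, INSTR(name, ',') + 1))              AS position    -- everything after the comma
FROM names;
```

Names with middle names or extra commas would need additional handling, which is one reason the regexp version can be easier to maintain.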
I need to retrieve a specific part of a string which has values separated by asterisks.
In the example below I need to retrieve the string Client Contact Centre Seniors2, which sits between the 6th and 7th asterisks.
I am fairly new to regular expressions and have only managed to select a value between 2 asterisks using *[\w]+*
Is there a way to specify which number of asterisk to look at using regular expression, or is there a better way for me to retrieve the string I am after?
String:
2*J25*Owner11*Owner Group2*L231*CLIENTCONTACTCENTRESENIORSQUEUE29*Client Contact Centre Seniors2*K20*0*2*C110*SR_STAT_ID2*N18*Referred2*O10*
Note: I will be using this regular expression in Oracle SQL using REGEXP_LIKE(string, regex).
* is a regex operator and needs to be escaped, unless it is used inside a bracket expression (character list). You can use this simplified pattern to extract the seventh word.
regexp_substr(Audits.audit_log,'[^*]+',1,7)
Query 1:
with x(y) as (
select '2*J25*Owner11*Owner Group2*L231*CLIENTCONTACTCENTRESENIORSQUEUE29*Client Contact Centre Seniors2*K20*0*2*C110*SR_STAT_ID2*N18*Referred2*O10*'
from dual
)
select regexp_substr(y,'([^*]+)\*',1,7,null,1)
from x
Results:
| REGEXP_SUBSTR(Y,'([^*]+)\*',1,7,NULL,1) |
|-----------------------------------------|
| Client Contact Centre Seniors2 |
Query 2:
with x(y) as (
select '2*J25*Owner11*Owner Group2*L231*CLIENTCONTACTCENTRESENIORSQUEUE29*Client Contact Centre Seniors2*K20*0*2*C110*SR_STAT_ID2*N18*Referred2*O10*'
from dual
)
select regexp_substr(y,'[^*]+',1,7)
from x
Results:
| REGEXP_SUBSTR(Y,'[^*]+',1,7) |
|--------------------------------|
| Client Contact Centre Seniors2 |
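One caveat with '[^*]+': it counts only non-empty runs, so an empty field (two adjacent asterisks) shifts the numbering. The sample string has no empty field before the seventh value, so both queries above are fine for it. If empty fields can occur, a commonly used alternative is a lazy match up to the next delimiter or the end of the string. The sketch below uses a made-up value 'a**c' to show the difference:

```sql
WITH x (y) AS (
  SELECT 'a**c' FROM dual  -- hypothetical value whose second field is empty
)
SELECT REGEXP_SUBSTR(y, '[^*]+', 1, 2)                AS skips_empty, -- c: the empty field is not counted
       REGEXP_SUBSTR(y, '(.*?)(\*|$)', 1, 2, NULL, 1) AS keeps_empty  -- NULL: the second field really is empty
FROM x;
```

With the second pattern, the field number passed as the occurrence argument always lines up with the delimiter count.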
You could also use INSTR and SUBSTR for that. Simple and fast, but not as concise as the REGEXP_SUBSTR.
with t as (
select '2*J25*Owner11*Owner Group2*L231*CLIENTCONTACTCENTRESENIORSQUEUE29*Client Contact Centre Seniors2*K20*0*2*C110*SR_STAT_ID2*N18*Referred2*O10*' testvalue
from dual
)
select substr(testvalue, instr(testvalue, '*', 1, 6)+1, instr(testvalue, '*', 1, 7) - instr(testvalue, '*', 1, 6) - 1)
from t;