Extracting text from a column using regexp_substr

I have a table with a varchar column with data like this:
"<tasa>
<parametros>
<parametro>
<nombre>ea</nombre>
<valor>35</valor>
</parametro>
</parametros>
<valorTasa>3.15</valorTasa>
</tasa>"
I need to extract the value between the valorTasa tags, but I don't know how to use the function and can't access the Oracle documentation.
I'm trying something like
select regexp_substr(field, '<valorTasa>[0-9]{0-3}</valorTasa') from dual;
With no results.
Any help would be greatly appreciated

A simpler way is to use the extractvalue function to extract the value of the node.
-- sample of data
with t1(col) as(
select '<tasa>
<parametros>
<parametro>
<nombre>ea</nombre>
<valor>35</valor>
</parametro>
</parametros>
<valorTasa>3.15</valorTasa>
</tasa>'
from dual
)
select extractvalue(xmltype(col), '/tasa/valorTasa') as res
from t1
/
RES
-------
3.15

Actually REGEXP_REPLACE will work best for this. If you put a part of the search expression in parentheses you can refer to it in the third "replace-with" parameter - the first such expression is \1, the second is \2, and so on up to \9 (you can't do more than 9).
For your requirement, try this:
SELECT REGEXP_REPLACE(myXMLCol, '^.*<valorTasa>(.*)</valorTasa>.*$', '\1') FROM myTable
The part in the parentheses above - (.*) maps to \1. The Oracle REGEXP_REPLACE docs explain this better than I can :)

SELECT regexp_replace(
regexp_substr(field, '<valorTasa>[0-9.]+</valorTasa>'),
'<valorTasa>([0-9.]+)</valorTasa>',
'\1')
from dual;
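On Oracle 11g or later, the two nested calls above can probably be collapsed into a single regexp_substr call that returns only the captured group. The following is a minimal sketch; the inline row t1 and the column name col are placeholders for the real table and column, and the XML is shortened to one line.

```sql
-- Sketch only: t1 and col stand in for the real table and column.
WITH t1 (col) AS (
  SELECT '<tasa><valorTasa>3.15</valorTasa></tasa>' FROM dual
)
SELECT regexp_substr(
         col,
         '<valorTasa>([0-9.]+)</valorTasa>',
         1,     -- start searching at the first character
         1,     -- take the first occurrence
         NULL,  -- no match_parameter flags
         1      -- return only the first parenthesised group
       ) AS res
FROM t1;
```

Because the pattern does not rely on the dot matching across lines, the same call should also work on the multi-line document from the question.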

For multiline XML documents, as we have here, regexp_replace can be used, but only with the right match_parameter. The 'n' flag lets the period match newline characters, and 'm' makes ^ and $ match at line boundaries:
with t1(col) as(
select '<tasa>
<parametros>
<parametro>
<nombre>ea</nombre>
<valor>35</valor>
</parametro>
</parametros>
<valorTasa>3.15</valorTasa>
</tasa>'
from dual
)
select
REGEXP_REPLACE(col, '^.*<valorTasa>(.*)</valorTasa>.*$', '\1', 1, 0, 'mn') as res
from t1
/
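To see what the match_parameter changes, the same pattern can be run with and without a flag. This is a small sketch on a shortened, hypothetical three-line document built with chr(10); the column aliases are made up for illustration.

```sql
-- Sketch: the same pattern with and without the 'n' flag.
WITH t1 (col) AS (
  SELECT '<tasa>' || chr(10) ||
         '<valorTasa>3.15</valorTasa>' || chr(10) ||
         '</tasa>'
  FROM dual
)
SELECT regexp_replace(col, '^.*<valorTasa>(.*)</valorTasa>.*$', '\1')
         AS no_flag,   -- the dot stops at the newline, so nothing matches
       regexp_replace(col, '^.*<valorTasa>(.*)</valorTasa>.*$', '\1', 1, 0, 'n')
         AS with_n     -- the dot crosses newlines, so the whole text is replaced
FROM t1;
```

When nothing matches, regexp_replace returns its input unchanged, so no_flag comes back as the full document while with_n is 3.15.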

Related

Need Regex pattern for right side of 10 digits from mobile Number

I need help writing a regex that gets the last 10 digits of a mobile number, counting from the right.
For example:
Input is: 919345678901
output is: 9345678901
input2 is: 09934567892
output is: 9934567892
PL/SQL means Oracle; in that case, you don't need slow regular expressions, as the fast substr function does the job nicely:
Sample data:
with test (col) as
(select '919345678901' from dual union all
select '09934567892' from dual
)
Query begins here:
select col,
substr(col, -10) result
from test;
COL RESULT
------------ ----------------------------------------
919345678901 9345678901
09934567892 9934567892
regexp_replace(target,'^\d*(\d{10})$', '\1')
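The two approaches above agree on the sample numbers but differ on values shorter than 10 characters. The sketch below adds a hypothetical short value ('12345', not from the question) to show the difference; the aliases by_substr and by_regexp are made up for illustration.

```sql
-- Sketch: substr vs. regexp_replace on inputs of different lengths.
WITH test (col) AS (
  SELECT '919345678901' FROM dual UNION ALL
  SELECT '09934567892'  FROM dual UNION ALL
  SELECT '12345'        FROM dual   -- hypothetical short value
)
SELECT col,
       substr(col, -10)                           AS by_substr,
       regexp_replace(col, '^\d*(\d{10})$', '\1') AS by_regexp
FROM test;
```

For '12345', substr returns NULL because the negative offset reaches past the start of the string, while regexp_replace finds no match and returns the value unchanged. Which behaviour is preferable depends on how short or malformed numbers should be treated.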

How to get first string after character Oracle SQL

I'm trying to get first string after a character.
Example is like
ABCDEF||GHJ||WERT
I need only
GHJ
I tried to use REGEXP but I couldn't do it.
Can anyone help me with this, please?
Thank you
Somewhat simpler:
-- the 4th argument (2) asks for the 2nd "word"
select regexp_substr('ABCDEF||GHJ||WERT', '\w+', 1, 2) result from dual;
RES
---
GHJ
which reads as: give me the 2nd word out of that string. Won't work properly if GHJ consists of several words (but that's not what your example suggests).
Here is how I read it, with a separator in place; in this case the separator is || (or |). The example is for an Oracle database.
-- pattern --> [^||] (the same as [^|]) matches any character other than |, and + means one or more of them
-- 3rd parameter --> starting position
-- 4th parameter --> nth occurrence
WITH tbl(str) AS
(SELECT 'ABCDEF||GHJ||WERT' str FROM dual)
SELECT regexp_substr(str
,'[^||]+'
,1
,2) output
FROM tbl;
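The difference between the \w+ pattern and the [^|]+ pattern shows up once an element contains more than one word. A small sketch, using a hypothetical input with a space in the first element:

```sql
-- Sketch: "2nd word" vs. "2nd pipe-delimited field" on a made-up input.
WITH tbl (str) AS (
  SELECT 'AB CD||GHJ||WERT' FROM dual
)
SELECT regexp_substr(str, '\w+', 1, 2)   AS second_word,   -- CD
       regexp_substr(str, '[^|]+', 1, 2) AS second_field   -- GHJ
FROM tbl;
```

Note that [^|]+ still treats a single | inside an element as a separator, so neither pattern is fully general.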
I think the most general solution is:
WITH tbl(str) AS (
SELECT 'ABCDEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABC|DEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABClDEF||GHJ||WERT' str FROM dual
)
SELECT regexp_replace(str, '^.*\|\|(.*)\|\|.*', '\1')
FROM tbl;
Note that this works even if the individual elements contain punctuation or a single vertical bar, which the other solutions do not handle.
Presumably, the double vertical bar is being used for maximum flexibility.
You should use the regexp_substr function:
select regexp_substr('ABCDEF||GHJ||WERT ', '\|{2}([^|]+)', 1, 1, 'i', 1) str
from dual;
STR
---
GHJ

ORACLE regexp_substr extract everything after specific char

How to get rest of string after specific char?
I have a string 'a|b|c|2|:x80|3|rr|' and I would like to get everything after the 3rd occurrence of |. So the result should be 2|:x80|3|rr|
The query
select REGEXP_SUBSTR('a|b|c|2|:x80|3|rr|','[^|]+$',1,4)
from dual
Returned me NULL
Use SUBSTR / INSTR combination
WITH t ( s ) AS (
SELECT 'a|b|c|2|:x80|3|rr|'
FROM dual
) SELECT substr(s,instr(s,'|',1,3) + 1)
FROM t;
REGEXP_REPLACE() will do the trick. Skip 3 groups of anything followed by a pipe, then replace with the 2nd group, which is the rest of the line (anchored to the end).
select regexp_replace('a|b|c|2|:x80|3|rr|', '(.*?\|){3}(.*)$', '\2') trimmed
from dual;
TRIMMED
------------
2|:x80|3|rr|
Here is a nice but long way, using regexp_substr, regexp_count and listagg together:
select listagg(str) within group (order by lvl)
as "Result String"
from
(
with t(str) as
(
select 'a|b|c|2|:x80|3|rr|' from dual
)
select level-1 as lvl,
regexp_substr(str,'(.*?)(\||$)',1,level) as str
from dual
cross join t
connect by level <= regexp_count('a|b|c|2|:x80|3|rr|','\|')
)
where lvl >= 3;
If you use Oracle 11g or later, you can specify which subexpression to return, like this:
select REGEXP_SUBSTR('a|b|c|2|:x80|3|rr|','([^|]+\|){3}(.+)$',1,1,null,2) from dual
Erkko,
You need to combine SUBSTR with either INSTR or REGEXP_INSTR.
Without regex, your query looks like this:
SELECT SUBSTR('a|b|c|2|:x80|3|rr|',INSTR('a|b|c|2|:x80|3|rr|','|',1,3)+1) from dual;
With regex, as you wanted, it looks like this:
SELECT SUBSTR('a|b|c|2|:x80|3|rr|',REGEXP_INSTR('a|b|c|2|:x80|3|rr|','\|',1,3)+1) from dual;
Explanation:
First, find the position of the delimiter you mentioned. In your case the third | is at position 6, so position 6 + 1 is where the substring starts.
Second, take the substring of the original string from that position to the end of the string.
Example:
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=6fd782db95f575201eded084493232ee
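The intermediate values in that explanation can be inspected directly. The sketch below also includes a simplified variant of the pattern from the question to show why it finds nothing; the column aliases are made up for illustration.

```sql
-- Sketch: position of the 3rd pipe, the remainder after it,
-- and why a '[^|]+$' pattern cannot match this string.
WITH t (s) AS (
  SELECT 'a|b|c|2|:x80|3|rr|' FROM dual
)
SELECT instr(s, '|', 1, 3)            AS third_pipe_pos,  -- 6
       substr(s, instr(s, '|', 1, 3) + 1) AS rest,        -- 2|:x80|3|rr|
       regexp_substr(s, '[^|]+$')     AS tail_attempt     -- NULL
FROM t;
```

The pattern [^|]+$ needs at least one non-pipe character immediately before the end of the string. This string ends with |, so there is no match at all, whichever occurrence is requested.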

Insert character between string Oracle SQL

I need to insert a character between the characters of a string in Oracle SQL.
Example:
ABC will be A,B,C
DEFG will be D,E,F,G
This question covers inserting only a single character into a string:
Oracle insert character into a string
Edit: As others have pointed out, Oracle does not support this regex. So my approach is to match every character, append a comma to each one, and then remove the trailing comma.
WITH regex AS (SELECT REGEXP_REPLACE('ABC', '(.)', '\1,') as reg FROM dual) SELECT SUBSTR(reg, 1, length(reg)-1) FROM regex;
Note that the rtrim solution can give wrong results if the string you want to parse ends with a comma that you want to keep.
Previous solution (does not work on Oracle):
Check if this does the trick:
SELECT REGEXP_REPLACE('ABC', '(.)(?!$)', '\1,') FROM dual;
It replaces every character except the last one with the same character followed by a comma.
To see how regexp_replace works, I recommend: https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions130.htm
SELECT rtrim(REGEXP_REPLACE('ABC', '(.)', '\1,'),',') "REGEXP_REPLACE" FROM dual;
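The difference between removing the last character with SUBSTR and trimming with RTRIM only appears when the input itself ends in a comma. A short sketch, using a hypothetical second input 'AB,' and made-up names sample_data and expanded:

```sql
-- Sketch: SUBSTR drops exactly one trailing comma, RTRIM drops all of them.
WITH sample_data (str) AS (
  SELECT 'ABC' FROM dual UNION ALL
  SELECT 'AB,' FROM dual            -- hypothetical input ending in a comma
),
expanded AS (
  SELECT str, regexp_replace(str, '(.)', '\1,') AS reg
  FROM sample_data
)
SELECT str,
       substr(reg, 1, length(reg) - 1) AS by_substr,  -- ABC -> A,B,C   AB, -> A,B,,
       rtrim(reg, ',')                 AS by_rtrim    -- ABC -> A,B,C   AB, -> A,B
FROM expanded;
```

For 'AB,' the intermediate value is 'A,B,,,'. SUBSTR keeps the original comma as its own element, whereas RTRIM removes it along with the added ones.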
You could do it using:
REGEXP_REPLACE
RTRIM
For example,
WITH sample_data AS(
SELECT 'ABC' str FROM dual UNION ALL
SELECT 'DEFG' str FROM dual UNION ALL
SELECT 'XYZ' str FROM dual
)
-- end of sample_data mimicking a real table
SELECT str,
rtrim(regexp_replace(str, '(\w?)', '\1,'),',') new_str
FROM sample_data;
STR NEW_STR
---- ----------
ABC A,B,C
DEFG D,E,F,G
XYZ X,Y,Z
Since there is no way to negate the end of string in an Oracle regex (that does not support lookarounds), you may use
SELECT REGEXP_REPLACE(
REGEXP_REPLACE('ABC', '([^,])([^,])','\1,\2'),
'([^,])([^,])',
'\1,\2')
AS Result from dual
The point here is to use REGEXP_REPLACE with the ([^,])([^,]) pattern twice to cater for consecutive matches. Matches cannot overlap, so a single pass turns ABC into A,BC, and the second pass then splits the remaining BC.
The ([^,])([^,]) pattern matches any non-comma char into Group 1 (\1) and then any non-comma char into Group 2 (\2), and inserts a comma in between them.

Extract string within delimiters

I have a string {1:F01BPHKPLPKXXX0000000000} from which I need to extract 1:F01BPHKPLPKXXX0000000000 using regexp_substr. Can you please help me with this?
Why use REGEXP_SUBSTR? Using a pistol to kill a mouse?
You just need to TRIM those braces.
WITH DATA AS(
SELECT q'[{1:F01BPHKPLPKXXX0000000000}]' STR FROM DUAL)
select rtrim(ltrim(str,'{'),'}') str from data
/
STR
--------------------------
1:F01BPHKPLPKXXX0000000000
What about this:
select regexp_replace('{1:F01BPHKPLPKXXX0000000000}', '{(.*)}', '\1')
from dual
It takes everything between the braces and outputs that.
This can be much easier if you ask me, using substr:
select substr(var, 2, length(var)-2)
from (select '{1:F01BPHKPLPKXXX0000000000}' var from dual)
Try the following:
SELECT REGEXP_SUBSTR('{1:F01BPHKPLPKXXX0000000000}', '[^{].*[^}]') FROM DUAL
Share and enjoy.
I would just use regexp_replace. I just use the alternation operator, |, so that I ask to replace { at the beginning (using the anchor ^) and } at the end of the string (using the anchor $).
WITH a_tab AS
(SELECT '{1:F01BPHKPLPKXXX0000000000}' a_col FROM dual
)
SELECT regexp_replace (a_col, '^{|}$') FROM a_tab
/
REGEXP_REPLACE(A_COL,'^{|}$')
==========================
1:F01BPHKPLPKXXX0000000000
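The trim-based and substr-based answers above give the same result for the sample string but behave differently when the value itself starts or ends with extra braces. A minimal sketch, using a hypothetical second input '{{nested}}' and made-up column aliases:

```sql
-- Sketch: LTRIM/RTRIM strip every leading/trailing brace,
-- SUBSTR strips exactly one character from each end.
WITH data (str) AS (
  SELECT '{1:F01BPHKPLPKXXX0000000000}' FROM dual UNION ALL
  SELECT '{{nested}}'                   FROM dual   -- hypothetical input
)
SELECT str,
       rtrim(ltrim(str, '{'), '}')     AS by_trim,    -- {{nested}} -> nested
       substr(str, 2, length(str) - 2) AS by_substr   -- {{nested}} -> {nested}
FROM data;
```

If only the outermost pair of braces should go, the substr form (or the anchored '^{|}$' replacement) is probably the safer choice.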