REGEXP_REPLACE back-reference with function call

Can I call a function on a REGEXP_REPLACE back-reference value?
For example, I want to call chr() or some other function on the back-reference value, but this
SELECT REGEXP_REPLACE('a 98 c 100', '(\d+)', ASCII('\1')) FROM dual;
just returns the ASCII value of '\':
'a 92 c 92'
I want the last parameter (the replacement string) to be evaluated first and the string to be replaced afterwards, so the result would be:
'a b c d'

Just for fun really, you could do the tokenization, conversion of numbers to characters, and aggregation using XPath:
select *
from xmltable(
'string-join(
for $t in tokenize($s, " ")
return if ($t castable as xs:integer) then codepoints-to-string(xs:integer($t)) else $t,
" ")'
passing 'a 98 c 100' as "s"
);
Result Sequence
--------------------------------------------------------------------------------
a b c d
The initial string value is passed in as $s; tokenize() splits that up using a space as the delimiter; each $t that generates is evaluated to see if it's an integer, and if it is then it's converted to the equivalent character via codepoints-to-string, otherwise it's left alone; then all the tokens are recombined with string-join().
If the original has runs of multiple spaces those will collapse to a single space (as they will with Littlefoot's regex).

I'm not smart enough to do it with a single regular expression, but a step-by-step approach like this might help. It splits the source string into rows, checks whether each part is a number and, if so, takes its CHR. Finally, everything is aggregated back into a single string.
SQL> with test (col) as
  (select 'a 98 c 100' from dual),
inter as
  -- split the string on spaces, one token per row
  (select level lvl,
          regexp_substr(col, '[^ ]+', 1, level) c_val
   from test
   connect by level <= regexp_count(col, ' ') + 1
  ),
inter_2 as
  -- numeric tokens become characters, everything else is kept
  (select lvl,
          case when regexp_like(c_val, '^\d+$') then chr(c_val)
               else c_val
          end c_val_2
   from inter
  )
select listagg(c_val_2, ' ') within group (order by lvl) result
from inter_2;
RESULT
--------------------
a b c d
SQL>
It can be shortened by one step (I intentionally left it as is so that you could execute one query at a time and check the result, to make things clearer):
SQL> with test (col) as
  (select 'a 98 c 100' from dual),
inter as
  (select level lvl,
          case when regexp_like(regexp_substr(col, '[^ ]+', 1, level), '^\d+$')
               then chr(regexp_substr(col, '[^ ]+', 1, level))
               else regexp_substr(col, '[^ ]+', 1, level)
          end c_val
   from test
   connect by level <= regexp_count(col, ' ') + 1
  )
select listagg(c_val, ' ') within group (order by lvl) result
from inter;
RESULT
--------------------
a b c d
SQL>
[EDIT: what if the input looks different?]
That is somewhat simpler. Use REGEXP_SUBSTR to extract the runs of digits: the arguments ..., 1, 1 return the first one and ..., 1, 2 the second one. A plain REPLACE then replaces those numbers with their CHR values.
SQL> with test (col) as
  (select 'a98c100e' from dual)
select
  replace(replace(col, regexp_substr(col, '\d+', 1, 1), chr(regexp_substr(col, '\d+', 1, 1))),
          regexp_substr(col, '\d+', 1, 2), chr(regexp_substr(col, '\d+', 1, 2))) result
from test;
RESULT
--------------------
abcde
SQL>
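The nested REPLACE above is tied to exactly two numbers, and REPLACE changes every occurrence of the matched digits, not only the one that was extracted. As a rough sketch of a more general variant (my own illustration, not part of the answer above; the CTE name tokens and the column tok are made up for the example), the string can be cut into alternating runs of digits and non-digits, each digit run converted with CHR, and the pieces glued back together in their original order. It assumes a single input row and that every number is a valid character code:

```sql
with test (col) as
  (select 'a98c100e' from dual),
tokens as
  -- one row per run of digits (\d+) or run of non-digits (\D+)
  (select level lvl,
          regexp_substr(col, '\d+|\D+', 1, level) tok
   from test
   connect by level <= regexp_count(col, '\d+|\D+')
  )
select listagg(case when regexp_like(tok, '^\d+$')
                    then chr(to_number(tok))   -- digit run -> character
                    else tok                   -- anything else is kept as is
               end) within group (order by lvl) result
from tokens;
```

For 'a98c100e' this should give abcde, and for the original 'a 98 c 100' it should give a b c d with the spacing preserved, because the spaces travel along inside the non-digit runs.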

Related

How to replace numbers in an alphanumeric string with words

I have a table named department_details with a column dept_id which contains values like
10_prod
20_r&d
80_sales
etc. I want a query which will give me output like
ten_prod
twenty_r&d
eighty_sales
etc.
Here's one option:
SQL> with test (col) as
2 (select '10_prod' from dual union all
3 select '20_r&d' from dual union all
4 select '80_sales' from dual
5 )
6 select col,
7 regexp_substr(col, '^\d+') num,
8 to_char(to_date(substr(col, 1, instr(col, '_') - 1), 'j'), 'jsp') wrd,
9 --
10 to_char(to_date(substr(col, 1, instr(col, '_') - 1), 'j'), 'jsp') ||
11 substr(col, instr(col, '_')) result
12 from test;
COL      NUM                              WRD        RESULT
-------- -------------------------------- ---------- --------------------
10_prod  10                               ten        ten_prod
20_r&d   20                               twenty     twenty_r&d
80_sales 80                               eighty     eighty_sales
SQL>
What does it do (step-by-step, so that you could follow it):
lines #1 - 5: sample data
line #7: one way to extract the number from the beginning of the string (using regular expressions)
line #8: another way (using substr + instr; probably better). In addition, it converts the number to a date using the 'J' (Julian day) format and then to characters using the 'JSP' format, which spells the number out. This is the usual way of spelling numbers in Oracle
lines #10 - 11: combine spelled number (line #10) with the rest of the string (line #11)
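To see the J/JSP trick from line #8 in isolation, here is a small sketch (the column aliases are only illustrative). The number is first read as a Julian day, and the resulting date is then printed in Julian format with the SP (spelled) suffix; the letter case of the format element controls the case of the result. It assumes the value is a whole number between 1 and 5373484, the range of Julian days that TO_DATE accepts:

```sql
select to_char(to_date(10, 'j'), 'jsp') as lower_case,   -- ten
       to_char(to_date(10, 'j'), 'Jsp') as init_cap,     -- Ten
       to_char(to_date(10, 'j'), 'JSP') as upper_case    -- TEN
from dual;
```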
You could do something like this:
with list1 as
(select '10_prod' as val from dual union
select '20_randd' as val from dual union
select '80_sales' as val from dual)
select a.*,
to_char(to_date(substr(val,1,2),'j'), 'jsp')||substr(val,3,20) as text_val
from list1 a

Comma-separated string match

I have this query:
SELECT regexp_replace (var_called_num, '^' ||ROUTING_PREFIX) INTO Num
FROM INCOMING_ROUTING_PREFIX
WHERE var_called_num LIKE ROUTING_PREFIX ||'%';
INCOMING_ROUTING_PREFIX table has two rows
1) 007743
2) 007742
var_called_num is 0077438843212123. So above query gives the result 8843212123.
So basically, the query is removing prefix (longest match from table) from var_called_num.
Now my table has changed. Now it has only 1 row which is comma-separated.
Modified Table:
INCOMING_ROUTING_PREFIX table has one row which is comma-separated:
1) 007743,007742
How can I modify the query to get the same behavior? I need to remove the longest matching prefix from var_called_num.
Here's one option: you'd have to split the prefix list into rows, and then use them in REGEXP_REPLACE.
SQL> with
  calnum (var_called_num) as
    (select '0077438843212123' from dual),
  incoming_routing_prefix (routing_prefix) as
    (select '007743,007742' from dual),
  -- split the comma-separated list into one prefix per row
  irp_split as
    (select regexp_substr(i.routing_prefix, '[^,]+', 1, level) routing_prefix
     from incoming_routing_prefix i
     connect by level <= regexp_count(i.routing_prefix, ',') + 1
    )
select regexp_replace(c.var_called_num, '^' || s.routing_prefix) result
from calnum c
join irp_split s
  on s.routing_prefix = substr(c.var_called_num, 1, length(s.routing_prefix));
RESULT
----------------
8843212123
SQL>
By the way, why did you change the model to a worse version than it was before?
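One caveat: the join returns one row for every prefix that matches, so if the list ever contains overlapping prefixes (say 0077 and 007743) the query above returns more than one row, which would raise TOO_MANY_ROWS in a SELECT ... INTO. A minimal sketch of keeping only the longest match, assuming a hypothetical extra prefix 0077 that I added to the list purely for illustration:

```sql
with
  calnum (var_called_num) as
    (select '0077438843212123' from dual),
  incoming_routing_prefix (routing_prefix) as
    -- '0077' is a made-up extra prefix so that two prefixes match
    (select '0077,007743,007742' from dual),
  irp_split as
    (select regexp_substr(i.routing_prefix, '[^,]+', 1, level) routing_prefix
     from incoming_routing_prefix i
     connect by level <= regexp_count(i.routing_prefix, ',') + 1
    )
select max(substr(c.var_called_num, length(s.routing_prefix) + 1))
         keep (dense_rank first order by length(s.routing_prefix) desc) result
from calnum c
join irp_split s
  on s.routing_prefix = substr(c.var_called_num, 1, length(s.routing_prefix));
```

Here both 0077 and 007743 match, and the KEEP (DENSE_RANK FIRST ...) clause keeps the remainder that belongs to the longest of them, 8843212123. Because the aggregate has no GROUP BY, the query always returns exactly one row.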
You can split the values:
with test as (
  select regexp_substr('007743,007742', '[^,]+', 1, level) as ROUTING_PREFIX from dual
  connect by regexp_substr('007743,007742', '[^,]+', 1, level) is not null
)
and then use that subquery in your select:
SELECT regexp_replace('0077438843212123', '^' || ROUTING_PREFIX)
FROM test WHERE '0077438843212123' LIKE ROUTING_PREFIX || '%';

Regexp_substr expression

I have a problem with my REGEXP expression. I want to run it in a loop so that every iteration deletes the text after the last slash. My expression currently looks like this:
REGEXP_SUBSTR('L1161148/1/10', '.*(/)')
I'm getting L1161148/1/ instead of L1161148/1
You said you wanted to loop.
CAVEAT: Both of these solutions assume there are no NULL list elements (all slashes have a value in between them).
SQL> with tbl(data) as (
select 'L1161148/1/10' from dual
)
select level, nvl(substr(data, 1, instr(data, '/', 1, level)-1), data) formatted
from tbl
connect by level <= regexp_count(data, '/') + 1 -- Loop # of delimiters +1 times
order by level desc;
     LEVEL FORMATTED
---------- -------------
         3 L1161148/1/10
         2 L1161148/1
         1 L1161148
SQL>
EDIT: To handle multiple rows:
SQL> with tbl(rownbr, col1) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual
union
select 2, 'ALKDFJV1161148/123/456/789/1/2/3' from dual
)
SELECT rownbr, column_value substring_nbr,
nvl(substr(col1, 1, instr(col1, '/', 1, column_value)-1), col1) formatted
FROM tbl,
TABLE(
CAST(
MULTISET(SELECT LEVEL
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT(col1, '/')+1
) AS sys.OdciNumberList
)
)
order by rownbr, substring_nbr desc
;
    ROWNBR SUBSTRING_NBR FORMATTED
---------- ------------- --------------------------------
         1             7 L1161148/1/10/2/34/5/6
         1             6 L1161148/1/10/2/34/5
         1             5 L1161148/1/10/2/34
         1             4 L1161148/1/10/2
         1             3 L1161148/1/10
         1             2 L1161148/1
         1             1 L1161148
         2             7 ALKDFJV1161148/123/456/789/1/2/3
         2             6 ALKDFJV1161148/123/456/789/1/2
         2             5 ALKDFJV1161148/123/456/789/1
         2             4 ALKDFJV1161148/123/456/789
         2             3 ALKDFJV1161148/123/456
         2             2 ALKDFJV1161148/123
         2             1 ALKDFJV1161148
14 rows selected.
SQL>
You can try removing the string after the last slash:
select regexp_replace('L1161148/1/10', '/([^/]*)$', '') from dual
You are trying to go as far as the last / and then "look back" and retain what was before it. With regular expressions you can do that with a subexpression, like this:
select regexp_substr('L1161148/1/10', '(.*)/.*', 1, 1, null, 1) from dual;
Here, as usual, the first argument "1" means where to start the search, the second "1" means which matching substring to choose, "null" means no special matching modifiers (like case-insensitive matching and such - not needed here), and the last "1" means return the first subexpression - the first thing in parentheses in the "match pattern."
However, regular expressions should only be used when you can't do it with the standard substr and instr (and translate) functions. Here the job is quite easy:
instr(text_string, '/', -1)
will give you the position of the LAST / in text_string (the -1 means find the last occurrence, instead of the first: count from the end of the string). So the whole thing can be written as:
select substr('L1161148/1/10', 1, instr('L1161148/1/10', '/', -1) - 1) from dual;
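As a quick check of the pieces, here is a small sketch (the second sample row and the column aliases are added only for illustration). It shows the position that INSTR reports and guards against strings without any slash, where INSTR returns 0 and the plain SUBSTR expression would return NULL:

```sql
with tbl (data) as (
  select 'L1161148/1/10' from dual union all
  select 'L1161148'      from dual   -- hypothetical input without a slash
)
select data,
       instr(data, '/', -1) as last_slash_pos,
       case
         when instr(data, '/', -1) > 0
           then substr(data, 1, instr(data, '/', -1) - 1)  -- cut before the last slash
         else data                                         -- nothing to strip
       end as trimmed
from tbl;
```

For 'L1161148/1/10' the last slash is at position 11, so the result is L1161148/1; the row without a slash is returned unchanged.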
Edit: In the spirit of Gary_W's solution, here is a generalization to several input strings that strips successive layers from each one. It still avoids regular expressions (resulting in slightly faster performance) and uses a recursive CTE, available since Oracle 11g Release 2; I believe Gary's solution works only from Oracle 12c on.
Query: (I changed Gary's second input string a bit, to make sure the query works properly)
with tbl(item_id, input_str) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual union all
select 2, 'ALKD/FJV11/61148/123/456/789/1/2/3' from dual
),
r (item_id, proc_string, stage) as (
select item_id, input_str, 0 from tbl
union all
select item_id, substr(proc_string, 1, instr(proc_string, '/', -1) - 1), stage + 1
from r
where instr(proc_string, '/') > 0
)
select * from r
order by item_id, stage;
Output:
   ITEM_ID PROC_STRING                                   STAGE
---------- ---------------------------------------- ----------
         1 L1161148/1/10/2/34/5/6                            0
         1 L1161148/1/10/2/34/5                              1
         1 L1161148/1/10/2/34                                2
         1 L1161148/1/10/2                                   3
         1 L1161148/1/10                                     4
         1 L1161148/1                                        5
         1 L1161148                                          6
         2 ALKD/FJV11/61148/123/456/789/1/2/3                0
         2 ALKD/FJV11/61148/123/456/789/1/2                  1
         2 ALKD/FJV11/61148/123/456/789/1                    2
         2 ALKD/FJV11/61148/123/456/789                      3
         2 ALKD/FJV11/61148/123/456                          4
         2 ALKD/FJV11/61148/123                              5
         2 ALKD/FJV11/61148                                  6
         2 ALKD/FJV11                                        7
         2 ALKD                                              8

Comma-separated list

I have a procedure with a parameter that takes a comma-separated value,
so when I enter Parameter = '1,0,1'
I want it to return 'One, Zero, One'.
You could use the REPLACE function.
For example,
SQL> WITH DATA(str) AS (
  SELECT '1,0,1' FROM dual
)
SELECT str,
       REPLACE(REPLACE(str, '0', 'Zero'), '1', 'One') new_str
FROM DATA;
STR   NEW_STR
----- ------------------------------------------------------------
1,0,1 One,Zero,One
SQL>
This query splits the list into numbers, converts the numbers into words and joins them back together with the listagg function:
with t1 as (select '7, 0, 11, 132' col from dual),
t2 as (select level lvl,to_number(regexp_substr(col,'[^,]+', 1, level)) col
from t1 connect by regexp_substr(col, '[^,]+', 1, level) is not null)
select listagg(case
when col=0 then 'zero'
else to_char(to_date(col,'j'), 'jsp')
end,
', ') within group (order by lvl) col
from t2
Output:
COL
-------------------------------------------
seven, zero, eleven, one hundred thirty-two
The limitation of this solution is that the values must be whole numbers between 0 and 5373484: 0 is handled by the CASE, and 5373484 is the largest Julian day that to_date accepts.
If you need higher values you can find hints in this article.

Extract words from a comma separated string in oracle

Suppose I have string
Str = 'Aaa,Bbb,Abb,Ccc'
I want to separate the above str in two parts as follows
Str1 = 'Aaa,Abb'
Str2 = 'Bbb,Ccc'
That is any word in str starting with A should go in str1 rest all in str2.
How can I achieve this using Oracle queries?
That is any word in str starting with A should go in str1 rest all in str2.
To achieve it in pure SQL, I will use the following:
REGEXP_SUBSTR
LISTAGG
SUBSTR
INLINE VIEW
So, first I will split the comma delimited string using the techniques as demonstrated here Split single comma delimited string into rows.
And then, I will aggregate them using LISTAGG in an order.
For example,
SQL> WITH
  t1 AS (
    SELECT 'Aaa,Bbb,Abb,Ccc' str FROM dual
  ),
  t2 AS (
    -- split the comma-delimited string into one word per row
    SELECT trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
    FROM t1
    CONNECT BY LEVEL <= regexp_count(str, ',')+1
  )
SELECT
  -- words starting with A; ordering inside LISTAGG makes the result deterministic
  (SELECT listagg(str, ',') WITHIN GROUP (ORDER BY str)
   FROM t2
   WHERE SUBSTR(str, 1, 1) = 'A'
  ) str1,
  -- all remaining words
  (SELECT listagg(str, ',') WITHIN GROUP (ORDER BY str)
   FROM t2
   WHERE SUBSTR(str, 1, 1) <> 'A'
  ) str2
FROM dual
/
STR1       STR2
---------- ----------
Aaa,Abb    Bbb,Ccc
SQL>
The WITH clause is just for demonstration purposes; in your real scenario, remove it and use your table name directly, though the WITH clause does keep the query neat.
Use a regular expression and the LISTAGG function.
NOTE: the LISTAGG function is available since Oracle 11g Release 2!
select listagg(s.name, ',') within group (order by name)
from (select regexp_substr('Aaa,Bbb,Abb,Ccc,Add,Ddd','[^,]+', 1, level) name from dual
connect by regexp_substr('Aaa,Bbb,Abb,Ccc,Add,Ddd', '[^,]+', 1, level) is not null) s
group by decode(substr(name,1,1),'A', 1, 0);
This query gives you the desired output as two columns of a single row:
with temp as (select trim(both ',' from 'Aaa,Bbb,Abb,Ccc') as str from dual),
base_table as
  -- split the string into one word per row
  (select trim(regexp_substr(t.str, '[^,]+', 1, level)) str
   from temp t
   connect by instr(t.str, ',', 1, level - 1) > 0),
ult_table as
  -- flag each word: 1 if it starts with A, otherwise 2
  (select str,
          case upper(substr(str, 1, 1)) when 'A' then 1 else 2 end as l
   from base_table)
select listagg(case when l = 1 then str else null end, ',')
         within group (order by str) str1,
       listagg(case when l = 2 then str else null end, ',')
         within group (order by str) str2
from ult_table;
Output
STR1       STR2
---------- ----------
Aaa,Abb    Bbb,Ccc