Performance of regexp_replace vs translate in Oracle?

For simple things is it better to use the translate function on the premise that it is less CPU intensive or is regexp_replace the way to go?
This question comes forth from How can I replace brackets to hyphens within Oracle REGEXP_REPLACE function?

I think you're running into a simple optimization. The regular expression is so expensive to compute that the result is cached in the hope that it will be used again. If you convert distinct strings instead, you will see that the modest translate is naturally faster, because it is a specialized function.
Here's my example, running on 11.1.0.7.0:
DECLARE
   TYPE t IS TABLE OF VARCHAR2(4000);
   l       t;
   l_level NUMBER := 1000;
   l_time  TIMESTAMP;
   l_char  VARCHAR2(4000);
BEGIN
   -- init
   EXECUTE IMMEDIATE 'ALTER SESSION SET PLSQL_OPTIMIZE_LEVEL=2';
   SELECT dbms_random.STRING('p', 2000)
     BULK COLLECT
     INTO l FROM dual
   CONNECT BY LEVEL <= l_level;
   -- regex
   l_time := systimestamp;
   FOR i IN 1 .. l.count LOOP
      l_char := regexp_replace(l(i), '[]()[]', '-', 1, 0);
   END LOOP;
   dbms_output.put_line('regex :' || (systimestamp - l_time));
   -- translate
   l_time := systimestamp;
   FOR i IN 1 .. l.count LOOP
      l_char := translate(l(i), '()[]', '----');
   END LOOP;
   dbms_output.put_line('translate :' || (systimestamp - l_time));
END;
/
regex :+000000000 00:00:00.979305000
translate :+000000000 00:00:00.238773000
PL/SQL procedure successfully completed
on 11.2.0.3.0 :
regex :+000000000 00:00:00.617290000
translate :+000000000 00:00:00.138205000
Conclusion: In general I suspect translate will win.
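As a quick sanity check that the two calls in the benchmark perform the same replacement, a query like the following can be run first. The sample string 'a(b)[c]' is an assumption chosen for illustration; it is not part of the original test.

```sql
-- Both expressions replace each of ( ) [ ] with a hyphen.
-- The input literal is a made-up sample, not data from the benchmark.
SELECT translate('a(b)[c]', '()[]', '----')           AS via_translate,
       regexp_replace('a(b)[c]', '[]()[]', '-', 1, 0) AS via_regexp
  FROM dual;
-- both columns: a-b--c-
```

In the bracket expression `[]()[]`, the leading `]` is taken literally, so the pattern matches any one of `]`, `(`, `)` and `[`.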

For SQL, I tested this with the following script:
set timing on
select sum(length(x)) from (
select translate('(<FIO>)', '()[]', '----') x
from (
select *
from dual
connect by level <= 2000000
)
);
select sum(length(x)) from (
select regexp_replace('[(<FIO>)]', '[\(\)\[]|\]', '-', 1, 0) x
from (
select *
from dual
connect by level <= 2000000
)
);
and found that the performance of translate and regexp_replace were almost always the same, but it could be that the cost of the other operations is overwhelming the cost of the functions I'm trying to test.
Next, I tried a PL/SQL version:
set timing on
declare
x varchar2(100);
begin
for i in 1..2500000 loop
x := translate('(<FIO>)', '()[]', '----');
end loop;
end;
/
declare
x varchar2(100);
begin
for i in 1..2500000 loop
x := regexp_replace('[(<FIO>)]', '[\(\)\[]|\]', '-', 1, 0);
end loop;
end;
/
Here the translate version takes just under 10 seconds, while the regexp_replace version takes around 0.2 seconds, which is around 2 orders of magnitude faster(!)
Based on this result, I will be using regular expressions much more often in my performance critical code -- both SQL and PL/SQL.
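Both loops above pass constant arguments, so the timings may reflect an optimization of repeated identical calls rather than the raw cost of each function. A minimal sketch of a variant that changes the input on every iteration follows. Appending the loop counter i to the string is an assumption made here for illustration, and no timings are claimed for it.

```sql
set timing on
-- translate with an input that differs on every iteration
declare
   x varchar2(100);
begin
   for i in 1 .. 2500000 loop
      x := translate('(<FIO' || i || '>)', '()[]', '----');
   end loop;
end;
/
-- regexp_replace with an input that differs on every iteration
declare
   x varchar2(100);
begin
   for i in 1 .. 2500000 loop
      x := regexp_replace('[(<FIO' || i || '>)]', '[\(\)\[]|\]', '-', 1, 0);
   end loop;
end;
/
```

If the gap between the two versions changes noticeably compared with the constant-input run, that would suggest the earlier figures were measuring the optimization rather than the functions themselves.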

Related

Oracle SQL Developer Query on recommended password setup

Setting up a random password for user using
select
dbms_random.string('L',2) || dbms_random.string('X',6) || '1!' as deflvrpwd,
'${access_request_cri_acc_cas9}' as ACNTDN
from dual
New requirement, given these new-hire details:
Name: John Doe
Region: America
WDID: 876214
The password should be the WDID reversed and split in half, with the region in the middle and the letter A replaced with the # symbol.
Following that formula, it should read:
412#meric#s678
Please suggest how to do this; the attributes are the same as mentioned above.
Thank you

Here's one option; read comments within code.
WITH
   -- sample data
   test (name, region, wdid)
   AS
      (SELECT 'John Doe', 'America', '876214' FROM DUAL),
   temp
   AS
      -- reverse WDID; don't use undocumented REVERSE function
      -- replace "A" (or "a") with "#" in REGION
      (  SELECT name,
                REPLACE (REPLACE (region, 'A', '#'), 'a', '#') new_region,
                LISTAGG (letter, '') WITHIN GROUP (ORDER BY lvl DESC) new_wdid
           FROM (    SELECT SUBSTR (wdid, LEVEL, 1) letter,
                            LEVEL lvl,
                            name,
                            region
                       FROM test
                 CONNECT BY LEVEL <= LENGTH (wdid))
       GROUP BY name, region)
-- finally
SELECT SUBSTR (new_wdid, 1, 3) || new_region || SUBSTR (new_wdid, 4) AS result
  FROM temp;
RESULT
--------------------------------------------------------------------------------
412#meric#678
I don't know where s in your result comes from (this: 412#meric#s678).

There's a small cost in context switching between SQL and PL/SQL, but this doesn't sound like a high-volume or performance-critical thing, so you might find it cleaner to put the logic in a function:
create or replace function get_password (p_wdid varchar2, p_region varchar2)
return varchar2 as
l_split pls_integer;
l_password varchar2(30);
begin
-- split WDID halfway, but allow for odd lengths
l_split := floor(length(p_wdid)/2);
-- iterate over the WDID in reverse
for i in reverse 1..length(p_wdid) LOOP
-- when we reach the split point, append the modified region
if i = l_split then
l_password := l_password || translate(p_region, 'Aax', '##x');
end if;
-- append each WDID character, in reverse order
l_password := l_password || substr(p_wdid, i, 1);
end loop;
return l_password;
end get_password;
/
The WDID is reversed in a loop, and the modified region is included at the midway point, based on the length of the WDID value.
You can then do:
select get_password('876214', 'America') from dual;
GET_PASSWORD('876214','AMERICA')
--------------------------------
412#meric#678
This also doesn't have the unexplained 's' from the example in your question.
If you can't create a function but are on Oracle 12c or later, you can define an ad hoc function in a CTE:
with
function invert (p_input varchar2) return varchar2 as
l_output varchar2(30);
begin
for i in reverse 1..length(p_input) LOOP
l_output := l_output || substr(p_input, i, 1);
end loop;
return l_output;
end invert;
t (wdid, region) as (
select invert('876214'), translate('America', 'Aax', '##x')
from dual
)
select substr(wdid, 1, floor(length(wdid)/2))
|| region
|| substr(wdid, floor(length(wdid)/2) + 1)
from t;
which gets the same result. (I've called the function invert to avoid confusion with the undocumented reverse function.)
db<>fiddle showing both.

ORA-01830 when converting number to words

The HIGH value is in decimal format, e.g. 100.10. I want to convert it into words, so I wrote the script below, but it does not run:
SELECT SYMBOL, HIGH, UPPER(TO_CHAR(TO_DATE(HIGH,'J'),'JSP'))
AMT_IN_WORDS FROM BHAV;
I get the error
ORA-01830
Please correct me where I am wrong.
Thank you in advance.

You can create a function.
CREATE OR REPLACE FUNCTION big_amt_in_words (p_input VARCHAR2) RETURN VARCHAR2
IS
v_running_input NUMBER;
v_num NUMBER;
v_amt_in_words VARCHAR2(2000);
BEGIN
v_running_input := P_input;
FOR i IN (
SELECT RPAD(1, (rownum*3)+1, 0) num_value,
CASE LENGTH(RPAD(1, (rownum*3)+1, 0))
WHEN 4 THEN 'THOUSAND'
WHEN 7 THEN 'MILLION'
WHEN 10 THEN 'BILLION'
WHEN 13 THEN 'TRILLION'
WHEN 16 THEN 'QUADRILLION'
WHEN 19 THEN 'QUINTILLION'
WHEN 22 THEN 'SEXTILLION'
WHEN 25 THEN 'SEPTILLION'
WHEN 28 THEN 'OCTILLION'
END place_value
FROM DUAL
CONNECT BY rownum < 10
ORDER BY rownum desc)
LOOP
v_num := TRUNC(v_running_input/i.num_value,0);
IF v_num > 0 THEN
v_amt_in_words := v_amt_in_words||' '||TO_CHAR(TO_DATE(v_num,'J'), 'JSP')||' '||i.place_value;
v_running_input := v_running_input - (v_num * i.num_value);
END IF;
END LOOP;
v_amt_in_words := v_amt_in_words||' '||TO_CHAR(TO_DATE(TRUNC(v_running_input),'J'), 'JSP')
||' AND '||UPPER(TO_CHAR(TO_DATE((ROUND(v_running_input-TRUNC(v_running_input),2)*100),'J'),'JSP'))||' CENTS';
RETURN TRIM(v_amt_in_words);
END;
/
To use it,
SELECT BIG_AMT_IN_WORDS(65763245345658.12) amt_in_words
FROM DUAL;
Output
---------------------------------------------
SIXTY-FIVE TRILLION SEVEN HUNDRED SIXTY-THREE BILLION TWO HUNDRED FORTY-FIVE MILLION THREE HUNDRED FORTY-FIVE THOUSAND SIX HUNDRED FIFTY-EIGHT AND TWELVE CENTS

The error is raised because the value of HIGH you have shown is a decimal that, unlike 100.00, cannot be implicitly cast to an integer, so it cannot be converted to a Julian date.
SELECT UPPER(TO_CHAR(TO_DATE(100.10,'J'),'JSP'))AMT_IN_WORDS FROM DUAL;
This causes
ORA-01830: date format picture ends before converting entire input
string
This can be resolved by rounding the decimal to the nearest integer.
SELECT UPPER(TO_CHAR(TO_DATE(ROUND(100.10),'J'),'JSP'))AMT_IN_WORDS FROM DUAL;
| AMT_IN_WORDS |
|--------------|
| ONE HUNDRED |
If you really want the fractional component as well, although the approach is limited, you may refer to EDIT2 of this answer: How to convert number to words - ORACLE
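One way to keep the fractional part, sketched here under the assumption that it should be spelled out as two-digit cents (as in the function shown in the other answer), is to convert the integer and fractional parts separately:

```sql
-- Spell the integer part and the two-digit fractional part separately.
-- Caveat: the 'J' (Julian day) format only accepts values from 1 to 5373484,
-- so this sketch fails when either part is 0 (e.g. 100.00 or 0.10).
SELECT UPPER(TO_CHAR(TO_DATE(TRUNC(100.10), 'J'), 'JSP'))
       || ' AND '
       || UPPER(TO_CHAR(TO_DATE(ROUND(MOD(100.10, 1) * 100), 'J'), 'JSP'))
       || ' CENTS' AS amt_in_words
  FROM dual;
-- ONE HUNDRED AND TEN CENTS
```

Handling the zero cases would need a CASE expression around each part; that is left out to keep the sketch short.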

Oracle query to find all occurrences of a character in a string

I have to write an Oracle query in toad to find all the occurrences of a character in a string. For example if I'm searching for R in the string SSSRNNSRSSR, it should return positions 4, 8 and 11.
I am new to Oracle and tried this.
select instr(mtr_ctrl_flags, 'R', pos + 1, 1) as pos1
from mer_trans_reject
where pos in ( select instr(mtr_ctrl_flags, 'R', 1, 1) as pos
from mer_trans_reject
);
where mtr_ctrl_flags is the column name. I'm getting an error indicating that pos is an invalid identifier.

Extending GolezTrol's answer you can use regular expressions to significantly reduce the number of recursive queries you do:
select instr('SSSRNNSRSSR','R', 1, level)
from dual
connect by level <= regexp_count('SSSRNNSRSSR', 'R')
REGEXP_COUNT() returns the number of times the pattern matches, in this case the number of times R occurs in SSSRNNSRSSR. This limits the level of recursion to the exact number you need.
INSTR() simply searches for the index of R in your string. level is the depth of the recursion, but in this case it is also the level-th occurrence of the string, because the recursion is restricted to the number of occurrences.
If the string you want to pick out is more complicated, you could use regular expressions and REGEXP_INSTR() instead of INSTR(), but it will be slower (not by much) and it's unnecessary unless required.
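For a pattern that INSTR() cannot express, the same CONNECT BY shape can be reused with REGEXP_INSTR(). The character class [RN] below is an assumption picked purely for illustration; it finds every position holding either R or N.

```sql
-- Positions of every character matching the class [RN] in the sample string.
-- REGEXP_INSTR(source, pattern, start_position, occurrence)
SELECT regexp_instr('SSSRNNSRSSR', '[RN]', 1, level) AS pos
  FROM dual
CONNECT BY level <= regexp_count('SSSRNNSRSSR', '[RN]')
 ORDER BY pos;
-- returns 4, 5, 6, 8, 11
```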
Simple benchmark as requested:
The two CONNECT BY solutions would indicate that using REGEXP_COUNT is 20% quicker on a string of this size.
set timing on

-- CONNECT BY with REGEX
declare
   type t__num is table of number index by binary_integer;
   t_num t__num;
begin
   for i in 1 .. 100000 loop
      select instr('SSSRNNSRSSR','R', 1, level)
        bulk collect into t_num
        from dual
     connect by level <= regexp_count('SSSRNNSRSSR', 'R')
             ;
   end loop;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:03.94

-- CONNECT BY with filter
declare
   type t__num is table of number index by binary_integer;
   t_num t__num;
begin
   for i in 1 .. 100000 loop
      select pos
        bulk collect into t_num
        from ( select substr('SSSRNNSRSSR', level, 1) as character
                    , level as pos
                 from dual t
              connect by level <= length('SSSRNNSRSSR') )
       where character = 'R'
             ;
   end loop;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:04.80
The pipelined table function is a fair bit slower, though it would be interesting to see how it performs over large strings with lots of matches.
-- PIPELINED TABLE FUNCTION
declare
   type t__num is table of number index by binary_integer;
   t_num t__num;
begin
   for i in 1 .. 100000 loop
      select *
        bulk collect into t_num
        from table(string_indexes('SSSRNNSRSSR','R'))
             ;
   end loop;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.54

This is a solution:
select
pos
from
(select
substr('SSSRNNSRSSR', level, 1) as character,
level as pos
from
dual
connect by
level <= length('SSSRNNSRSSR'))
where
character = 'R'
dual is a built in table that just returns a single row. Very convenient!
connect by lets you build recursive queries. This is often used to generate lists from tree-like data (parent/child relations). It allows you to more or less repeat the query in front of it. And you've got special fields, like level that allows you to check how deeply the recursion went.
In this case, I use it to split the string to characters and return a row for each character. Using level, I can repeat the query and get a character until the end of the string is reached.
Then it is just a matter of returning the pos for all rows containing the character 'R'
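To see the row-generating step on its own, here is a minimal sketch that splits a short string into one row per character. The literal 'ABC' is a made-up sample.

```sql
-- CONNECT BY LEVEL on dual yields one row per level: 1, 2, 3.
-- Each row picks the character at that position.
SELECT level                   AS pos,
       substr('ABC', level, 1) AS ch
  FROM dual
CONNECT BY level <= length('ABC');
-- (1, A), (2, B), (3, C)
```

Filtering those rows on ch is then all that the full query adds.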

To take up a_horse_with_no_name's challenge here is another answer with a pipelined table function.
A pipelined function returns an array, which you can query normally. I would expect this to perform better than the recursive query on strings with large numbers of matches but, as with everything, test it yourself first.
create type num_array as table of number
/
create function string_indexes (
PSource_String in varchar2
, PSearch_String in varchar2
) return num_array pipelined is
begin
for i in 1 .. length(PSource_String) loop
if substr(PSource_String, i, 1) = PSearch_String then
pipe row(i);
end if;
end loop;
return;
end;
/
Then in order to access it:
select *
from table(string_indexes('SSSRNNSRSSR','R'))

oracle pl/sql ora-01722 error

I have a simple oracle statement in my procedure:
update org.security_training_question a
set a.actv_indr = 'N' where a.qstn_id in (v_qstns_to_delete);
v_qstns_to_delete is a parameter being passed. It is a varchar2 field and a.qstn_id is a numeric field.
When calling the Stored Procedure, for v_qstns_to_delete I am passing the following String: "24, 43, 23, 44, 21".
When I run the statement outside the stored procedure it runs fine, but when I run it as a stored procedure I get an error on the above line saying Invalid Number.
Any clue?

You can't use an "in" clause with a variable like that. One way around it is
declare stmt varchar2(4000);
begin
stmt := 'update org.security_training_question a set a.actv_indr = ''N'' where a.qstn_id in ('||v_qstns_to_delete||')';
execute immediate stmt;
end;

If v_qstns_to_delete is a varchar, you would need to convert it somehow so that Oracle understands there may be several items in it. One method would be to convert the string to a table of items.
Supposing qstn_id is a NUMBER column, you would:
CREATE TYPE tab_number AS TABLE OF NUMBER;
/
Type created
CREATE OR REPLACE FUNCTION to_tab_number(p_in        VARCHAR2,
                                         p_separator VARCHAR2 DEFAULT ',')
   RETURN tab_number AS
   l_result tab_number := tab_number();
   l_tail   LONG := p_in;
BEGIN
   WHILE l_tail IS NOT NULL LOOP
      l_result.EXTEND;
      IF instr(l_tail, p_separator) != 0 THEN
         l_result(l_result.COUNT) := to_number(substr(l_tail,
                                                      1,
                                                      instr(l_tail, p_separator) - 1));
         l_tail := substr(l_tail, instr(l_tail, p_separator) + 1);
      ELSE
         l_result(l_result.COUNT) := to_number(l_tail);
         l_tail := NULL;
      END IF;
   END LOOP;
   RETURN l_result;
END;
/
Function created
You could then convert a string to a table of number from SQL:
SELECT * FROM TABLE(to_tab_number('24, 43, 23, 44, 21'));
COLUMN_VALUE
------------
24
43
23
44
21
To do a variable in-list:
SELECT object_id, owner
  FROM all_objects
 WHERE object_id IN (SELECT column_value FROM TABLE(to_tab_number('18,19,20')));
OBJECT_ID OWNER
---------- ------------------------------
18 SYS
19 SYS
20 SYS
More on the same subject on askTom.
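Applied to the statement from the question, the same idea might look like the sketch below. It assumes the tab_number type and to_tab_number function defined above already exist; the local variable stands in for the procedure parameter.

```sql
DECLARE
   -- stand-in for the procedure parameter from the question
   v_qstns_to_delete VARCHAR2(100) := '24, 43, 23, 44, 21';
BEGIN
   -- The string is split into rows first, so each id is compared as a NUMBER.
   UPDATE org.security_training_question a
      SET a.actv_indr = 'N'
    WHERE a.qstn_id IN (SELECT column_value
                          FROM TABLE(to_tab_number(v_qstns_to_delete)));
END;
/
```

Unlike the dynamic SQL approach, the list is passed as a bound value here rather than concatenated into the statement text.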

Determine Oracle null == null

I wish to search a database table on a nullable column. Sometimes the value I'm searching for is itself NULL. Since NULL is equal to nothing, not even NULL, saying
where MYCOLUMN=SEARCHVALUE
will fail. Right now I have to resort to
where ((MYCOLUMN=SEARCHVALUE) OR (MYCOLUMN is NULL and SEARCHVALUE is NULL))
Is there a simpler way of saying that?
(I'm using Oracle if that matters)

You can do the IsNull or NVL stuff, but it's just going to make the engine do more work. You'll be calling functions to do column conversions which then have to have the results compared.
Use what you have
where ((MYCOLUMN=SEARCHVALUE) OR (MYCOLUMN is NULL and SEARCHVALUE is NULL))

@Andy Lester asserts that the original form of the query is more efficient than using NVL. I decided to test that assertion:
DECLARE
   CURSOR B IS
      SELECT batch_id, equipment_id
        FROM batch;
   v_t1 NUMBER;
   v_t2 NUMBER;
   v_c1 NUMBER;
   v_c2 NUMBER;
   v_b  INTEGER;
BEGIN
   -- Form 1 of the where clause
   v_t1 := dbms_utility.get_time;
   v_c1 := dbms_utility.get_cpu_time;
   FOR R IN B LOOP
      SELECT COUNT(*)
        INTO v_b
        FROM batch
       WHERE equipment_id = R.equipment_id OR (equipment_id IS NULL AND R.equipment_id IS NULL);
   END LOOP;
   v_t2 := dbms_utility.get_time;
   v_c2 := dbms_utility.get_cpu_time;
   dbms_output.put_line('For clause: WHERE equipment_id = R.equipment_id OR (equipment_id IS NULL AND R.equipment_id IS NULL)');
   dbms_output.put_line('CPU seconds used: '||(v_c2 - v_c1)/100);
   dbms_output.put_line('Elapsed time: '||(v_t2 - v_t1)/100);

   -- Form 2 of the where clause
   v_t1 := dbms_utility.get_time;
   v_c1 := dbms_utility.get_cpu_time;
   FOR R IN B LOOP
      SELECT COUNT(*)
        INTO v_b
        FROM batch
       WHERE NVL(equipment_id,'xxxx') = NVL(R.equipment_id,'xxxx');
   END LOOP;
   v_t2 := dbms_utility.get_time;
   v_c2 := dbms_utility.get_cpu_time;
   dbms_output.put_line('For clause: WHERE NVL(equipment_id,''xxxx'') = NVL(R.equipment_id,''xxxx'')');
   dbms_output.put_line('CPU seconds used: '||(v_c2 - v_c1)/100);
   dbms_output.put_line('Elapsed time: '||(v_t2 - v_t1)/100);
END;
/
For clause: WHERE equipment_id = R.equipment_id OR (equipment_id IS NULL AND R.equipment_id IS NULL)
CPU seconds used: 84.69
Elapsed time: 84.8
For clause: WHERE NVL(equipment_id,'xxxx') = NVL(R.equipment_id,'xxxx')
CPU seconds used: 124
Elapsed time: 124.01
PL/SQL procedure successfully completed
select count(*) from batch;
COUNT(*)
----------
20903
I was kind of surprised to find out just how correct Andy is. It costs nearly 50% more to do the NVL solution. So, even though one piece of code might not look as tidy or elegant as another, it could be significantly more efficient. I ran this procedure multiple times, and the results were nearly the same each time. Kudos to Andy...

In Expert Oracle Database Architecture I saw:
WHERE DECODE(MYCOLUMN, SEARCHVALUE, 1) = 1
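This works because DECODE treats two NULLs as equivalent when comparing its arguments. A small self-contained sketch, using made-up sample rows, shows which pairs pass the filter:

```sql
-- Sample pairs are invented for illustration.
WITH t AS (
   SELECT 'a' AS mycolumn, 'a' AS searchvalue FROM dual UNION ALL
   SELECT 'a',             'b'                FROM dual UNION ALL
   SELECT NULL,            NULL               FROM dual UNION ALL
   SELECT 'a',             NULL               FROM dual
)
SELECT mycolumn, searchvalue
  FROM t
 WHERE DECODE(mycolumn, searchvalue, 1) = 1;
-- returns the ('a', 'a') row and the (NULL, NULL) row
```

When there is no match, DECODE returns NULL here (no default is given), so the comparison with 1 is not true and the row is filtered out.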

I don't know if it's simpler, but I've occasionally used
WHERE ISNULL(MyColumn, -1) = ISNULL(SearchValue, -1)
Replacing "-1" with some value that is valid for the column type but also not likely to be actually found in the data.
NOTE: I use MS SQL, not Oracle, so not sure if "ISNULL" is valid.

Use NVL to replace null with some dummy value on both sides, as in:
WHERE NVL(MYCOLUMN,0) = NVL(SEARCHVALUE,0)

Another alternative, which is probably optimal from the executed-query point of view but useful only if you are generating queries, is to build the exact query you need based on the search value.
Pseudocode follows.
if (SEARCHVALUE IS NULL) {
condition = 'MYCOLUMN IS NULL'
} else {
condition = 'MYCOLUMN=SEARCHVALUE'
}
runQuery(query,condition)
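In PL/SQL, the pseudocode above could be sketched roughly as follows. The table name mytable, the variable names and the COUNT(*) query are all hypothetical stand-ins; only the branching on a NULL search value comes from the answer.

```sql
DECLARE
   l_search VARCHAR2(30) := NULL;  -- hypothetical search value
   l_sql    VARCHAR2(200);
   l_count  PLS_INTEGER;
BEGIN
   IF l_search IS NULL THEN
      -- No bind needed: the condition is a plain IS NULL test.
      l_sql := 'select count(*) from mytable where mycolumn is null';
      EXECUTE IMMEDIATE l_sql INTO l_count;
   ELSE
      -- Bind the value instead of concatenating it into the statement.
      l_sql := 'select count(*) from mytable where mycolumn = :b1';
      EXECUTE IMMEDIATE l_sql INTO l_count USING l_search;
   END IF;
   dbms_output.put_line('matching rows: ' || l_count);
END;
/
```

Each branch runs a statement with a simple predicate, which is the point of the approach: the optimizer never sees the combined OR condition.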

If an out-of-band value is possible:
where coalesce(mycolumn, 'out-of-band')
= coalesce(searchvalue, 'out-of-band')

Try
WHERE NVL(mycolumn,'NULL') = NVL(searchvalue,'NULL')

This can also do the job in Oracle.
WHERE MYCOLUMN || 'X' = SEARCHVALUE || 'X'
There are some situations where it beats the IS NULL test with the OR.
I was also surprised that DECODE lets you check NULL against NULL.
WITH
TEST AS
(
SELECT NULL A FROM DUAL
)
SELECT DECODE (A, NULL, 'NULL IS EQUAL', 'NULL IS NOT EQUAL')
FROM TEST

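I would think that what you have is OK. You could maybe use NVL with a placeholder value that cannot occur in the data (an empty string will not work as the placeholder, because Oracle treats '' as NULL):
where NVL(MYCOLUMN, '~') = NVL(SEARCHVALUE, '~')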

This is a situation we find ourselves in a lot with our Oracle functions that drive reports. We want to allow users to enter a value to restrict results or leave it blank to return all records. This is what I have used and it has worked well for us.
WHERE rte_pending.ltr_rte_id = prte_id
OR ((rte_pending.ltr_rte_id IS NULL OR rte_pending.ltr_rte_id IS NOT NULL)
AND prte_id IS NULL)
