Oracle Regexp_replace multiple occurrence - sql

Hi, I want to prepend the letter C to a string if it starts with a number.
Also, any punctuation in the string should be replaced with an underscore (_).
E.g.: 5-2-2-1 ==> C5_2_2_1
I tried the following, but I am not able to replace multiple occurrences of the punctuation. I am missing something simple and can't spot it.
SELECT REGEXP_REPLACE('9-1-1','^(\d)(-)','C\1_' ) FROM DUAL;

SELECT case when REGEXP_LIKE('9-1-1','^[[:digit:]]') then 'C' END
|| REGEXP_REPLACE('9-1-1', '[[:punct:]]', '_')
FROM DUAL;
[:digit:] any digit
[:punct:] punctuation symbol
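The attempt in the question replaces only one hyphen because its pattern is anchored with ^, so it can match just once, at the start of the string. As a minimal sketch of an all-regex alternative (the sample rows and the names samples, val and new_val are made up for illustration), the two steps can also be nested: replace every punctuation character first, then prefix C if the result starts with a digit:

```sql
-- Hypothetical sample data; only '5-2-2-1' comes from the question.
WITH samples (val) AS (
  SELECT '5-2-2-1' FROM DUAL UNION ALL
  SELECT 'A-1.2'   FROM DUAL
)
SELECT val,
       REGEXP_REPLACE(
         -- Inner call: no occurrence argument, so every punctuation character is replaced.
         REGEXP_REPLACE(val, '[[:punct:]]', '_'),
         -- Outer call: anchored at the start, re-emits the leading digit after a 'C'.
         '^([[:digit:]])', 'C\1') AS new_val
FROM samples;
-- '5-2-2-1' gives 'C5_2_2_1'; 'A-1.2' gives 'A_1_2' (no leading digit, so no C)
```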
If you have a lot of rows with different values, try to avoid regular expressions:
SELECT case when substr('9-1-1',1,1) between '0' and '9' then 'C' end
|| translate('9-1-1', ',.!-', '____')
FROM DUAL;
Check here for example: Performance of regexp_replace vs translate in Oracle?

Try this:
select (case when substr(val, 1, 1) between '0' and '9' then 'C' else '' end) ||
regexp_replace(val, '([-+.,;:''"!])', '_')

Related

Oracle REGEXP_REPLACE for both space and "%" at the same time

I have following Oracle SQL code:
SELECT TO_NUMBER(TRIM(REGEXP_REPLACE(per_growth, '(%)(\s)')),
'FM99999999999999999990D099999999999999999',
'NLS_NUMERIC_CHARACTERS = '', ''') AS per_growth
FROM sometable;
This code is supposed to look for a percent sign followed by a space and remove them from the result. However, it raises the error
ORA-01722: invalid number
I am still learning SQL and do not know the exact cause. Is something wrong with (%)(\s)? The value in the table is 50%
You can use TRANSLATE to get rid of all instances of unwanted characters:
SELECT TO_NUMBER(
TRANSLATE(per_growth, '0% ', '0'),
'FM99999999999999999990D099999999999999999',
'NLS_NUMERIC_CHARACTERS = '', '''
) as per_growth
FROM sometable;
Note: TRANSLATE(expr, from_string, to_string) works by swapping all instances of the characters in from_string with the corresponding characters in to_string and if there are more characters in from_string than to_string then the remaining characters are removed. It is faster than using regular expressions and on a par with using REPLACE but it can handle multiple replacements at once, which REPLACE cannot.
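A small sketch of that behaviour, using made-up literals and column aliases: when to_string is as long as from_string every listed character is mapped, and when it is shorter the extra characters are dropped.

```sql
SELECT TRANSLATE('9-1,1', '-,', '__') AS mapped,    -- '-' and ',' both become '_': 9_1_1
       TRANSLATE('9-1,1', '-,', '_')  AS shortened, -- '-' becomes '_', ',' is removed: 9_11
       TRANSLATE('3 %', '0% ', '0')   AS stripped   -- '0' maps to itself, '%' and ' ' are removed: 3
FROM DUAL;
```

The leading 0 in the last call is only a placeholder so that to_string is not empty; an empty to_string would make TRANSLATE return NULL.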
If you did want to use the slower REGEXP_REPLACE then you can replace all whitespace characters and all percent characters, whether together or not, using:
SELECT TO_NUMBER(
REGEXP_REPLACE(per_growth, '[%[:space:]]'),
'FM99999999999999999990D099999999999999999',
'NLS_NUMERIC_CHARACTERS = '', '''
) as per_growth
FROM sometable;
Which, for the sample data:
CREATE TABLE sometable (per_growth) AS
SELECT '1%' FROM DUAL UNION ALL
SELECT '%2' FROM DUAL UNION ALL
SELECT '3 %' FROM DUAL UNION ALL
SELECT '4% ' FROM DUAL UNION ALL
SELECT '5,0%' FROM DUAL UNION ALL
SELECT '%%% 123 456 789,0123456 %' FROM DUAL;
Both output:
PER_GROWTH
1
2
3
4
5
123456789.0123456
db<>fiddle here
Did you try good, old REPLACE?
select replace(replace(per_growth, '%', ''), ' ', '') as result
from your_table
You can use
REGEXP_REPLACE(per_growth, '( )(%)')
to remove a space immediately followed by a % sign (a % with no space before it, as in 50%, is left in place),
or
TRIM(REPLACE(per_growth,'%'))
to remove every % sign first and then the leading and trailing spaces,
before the numerical conversion.
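On the likely cause of the ORA-01722 in the question: the pattern (%)(\s) only matches a % that is followed by a whitespace character, so a value such as 50% is left unchanged and TO_NUMBER is then handed a string that still contains the % sign. A small sketch comparing the patterns on that value (the column aliases are illustrative):

```sql
SELECT REGEXP_REPLACE('50%', '(%)(\s)')      AS original_pattern, -- no whitespace after %, nothing removed: 50%
       REGEXP_REPLACE('50%', '[%[:space:]]') AS char_class,       -- each % or whitespace removed: 50
       TRIM(REPLACE('50%', '%'))             AS plain_replace     -- % removed, then trimmed: 50
FROM DUAL;
```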

varchar number starting with +

I am using the condition below to check whether the value of a column (dis_num) is numeric, and it works fine.
REGEXP_LIKE(dis_num, '^[[:digit:]]+$')
Now the dis_num column can also start with + followed by digits, such as +8143434344. How can I modify the regex above so that a value starting with + is also treated as numeric?
Thanks
To match a literal + sign, escape it as \+; to make it optional, follow it with ?:
REGEXP_LIKE(dis_num, '^\+?[[:digit:]]+$')
Very quick demo:
with t (dis_num) as (
select '1234' from dual
union all select 'abc' from dual
union all select '+8143434344' from dual
)
select dis_num,
case when REGEXP_LIKE(dis_num, '^[[:digit:]]+$') then 'Yes' else 'No' end as check1,
case when REGEXP_LIKE(dis_num, '^\+?[[:digit:]]+$') then 'Yes' else 'No' end as check2
from t;
DIS_NUM CHE CHE
----------- --- ---
1234 Yes Yes
abc No No
+8143434344 No Yes
As Littlefoot says, you need to remove the + and then evaluate the string.
You can also use TRANSLATE. I find it useful to create a function in any Oracle database that you can call anywhere, like this:
CREATE OR REPLACE FUNCTION only_numbers(p_value VARCHAR2) RETURN VARCHAR2 IS
BEGIN
  -- Inner TRANSLATE strips the digits, leaving only the non-digit characters;
  -- outer TRANSLATE then removes those characters from p_value.
  RETURN TRANSLATE(p_value, '1' || TRANSLATE(p_value, 'a1234567890', 'a'), '1');
END only_numbers;
/
SELECT only_numbers(dis_num) FROM your_table should work too.
Remove + first, then check it:
REGEXP_LIKE(replace(dis_num, '+', ''), '^[[:digit:]]+$')
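Note that the two approaches are not equivalent: REPLACE removes a + anywhere in the value, while the anchored pattern accepts at most one + and only at the start. A short sketch of the difference (the sample values other than +8143434344 and the column aliases are made up):

```sql
WITH t (dis_num) AS (
  SELECT '+8143434344' FROM DUAL UNION ALL
  SELECT '81+43'       FROM DUAL UNION ALL
  SELECT '++81'        FROM DUAL
)
SELECT dis_num,
       -- Optional single leading +, then digits only.
       CASE WHEN REGEXP_LIKE(dis_num, '^\+?[[:digit:]]+$')
            THEN 'Yes' ELSE 'No' END AS anchored,
       -- Every + is stripped before the digits-only check.
       CASE WHEN REGEXP_LIKE(REPLACE(dis_num, '+', ''), '^[[:digit:]]+$')
            THEN 'Yes' ELSE 'No' END AS plus_removed
FROM t;
-- +8143434344: Yes / Yes
-- 81+43:       No  / Yes
-- ++81:        No  / Yes
```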

Manipulating with regexp_substr

I have an ETL task for data-warehousing purposes. I need to extract the second part of a string, after the occurrence of a delimiter such as '#', 'ý' or '-'. For example, for these test strings:
'Tori 1#MHK-MahallaKingaveKD' I should retrieve only 'MHK'
'HPHelm2ýFFS-Tredddline' I should retrieve only 'FFS'
I already tried the CASE expression below:
TRIM(CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline', '#',1,1) > 0
THEN (REPLACE(
REGEXP_SUBSTR('HPHelm2ýFFS-Tredddline', '[^#]+', 1,2),
'#'
))
ELSE (CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline', '-',1,1) > 0
THEN (REPLACE(
REGEXP_SUBSTR('HPHelm2ýFFS-Tredddline', '[^-]+', 1,2),
'-'
))
ELSE (CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline','-') = 0 AND INSTR('HPHelm2ýFFS-Tredddline','ý') = 0 AND INSTR('HPHelm2ýFFS-Tredddline','#') = 0
THEN 'HPHelm2ýFFS-Tredddline'
ELSE (CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline','ý',1,1) > 0
THEN (REPLACE(
REGEXP_SUBSTR('HPHelm2ýFFS-Tredddline', '[^ý]+', 1,2),
'ý'
))
END)
END)
END)
END)
Using the code above I can retrieve:
'Tori 1#MHK-MahallaKingaveKD' ====> 'MHK-MahallaKingaveKD'
'HPHelm2ýFFS-Tredddline' ====> 'FFS-Tredddline'
Expected output:
'Tori 1#MHK-MahallaKingaveKD' ====> 'MHK'
'HPHelm2ýFFS-Tredddline' ====> 'FFS'
So I have to exclude '-' and the string after.
I guess I should modify the regexp_substr pattern but can't seem to find a clear solution since '-' is specified in the case when statements as a delimiter.
I suggest retrieving the second occurrence of 1+ chars other than your delimiter chars:
regexp_substr(col, '[^#ý-]+', 1, 2)
Here, the search starts with the first char in the record (1), and the second occurrence is returned (2).
The [^#ý-]+ pattern matches one or more (+) chars other than #, ý and -.
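A minimal sketch of that pattern applied to the two test strings from the question. The third row, the names samples and second_part, and the NVL fallback are assumptions added here: NVL mirrors the question's CASE branch that returns the whole string when no delimiter is present, since REGEXP_SUBSTR returns NULL when there is no second occurrence.

```sql
WITH samples (col) AS (
  SELECT 'Tori 1#MHK-MahallaKingaveKD' FROM DUAL UNION ALL
  SELECT 'HPHelm2ýFFS-Tredddline'      FROM DUAL UNION ALL
  SELECT 'NoDelimiterHere'             FROM DUAL  -- hypothetical row without any delimiter
)
SELECT col,
       -- Second run of characters that are not '#', 'ý' or '-'; fall back to col if there is none.
       NVL(REGEXP_SUBSTR(col, '[^#ý-]+', 1, 2), col) AS second_part
FROM samples;
-- 'Tori 1#MHK-MahallaKingaveKD' gives 'MHK'
-- 'HPHelm2ýFFS-Tredddline'      gives 'FFS'
-- 'NoDelimiterHere'             gives 'NoDelimiterHere'
```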
The following will give you what you're looking for:
WITH cteData AS (SELECT 'Tori 1#MHK-MahallaKingaveKD' AS STRING FROM DUAL UNION ALL
SELECT 'HPHelm2ýFFS-Tredddline' FROM DUAL)
SELECT STRING, REGEXP_SUBSTR(STRING, '[#ý-](.*)[#ý-]', 1, 1, NULL, 1) AS SUB_STRING
FROM cteData;
The parentheses around the .* between the delimiter groups make the .* a sub-expression, and the final ,1 in the parameter list tells REGEXP_SUBSTR to give you back the value of sub-expression #1. Since there's only one sub-expression in the regular expression, it gives you back the value of the .*, which is what you're looking for.
sqlfiddle here

REGEXP_REPLACE in Oracle

I need to use REGEXP_REPLACE to do the following :
If word starts with 'ABCD' then replace first four(4) chars with 'FFFF'
else
If word starts with 'XYZ' then replace first three(3) chars with 'GGG'
How do I use REGEXP_REPLACE to do conditional replace ?
You can use case and string operations:
select (case when word like 'ABCD%'
then 'FFFF' || substr(word, 5)
when word like 'XYZ%'
then 'GGG' || substr(word, 4)
else word
end) as new_word
If it has to be REGEXP_REPLACE you'll have to combine two function calls:
REGEXP_REPLACE(
REGEXP_REPLACE(word,'^ABCD','FFFF')
,'^XYZ', 'GGG')
But I would prefer Gordon's CASE approach...
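For comparison, here is a small sketch that runs both variants side by side. The sample words and the names words, via_case and via_regexp are made up for illustration; both columns should agree for these inputs.

```sql
WITH words (word) AS (
  SELECT 'ABCD123' FROM DUAL UNION ALL
  SELECT 'XYZ123'  FROM DUAL UNION ALL
  SELECT 'OTHER'   FROM DUAL
)
SELECT word,
       -- CASE with plain string functions.
       CASE WHEN word LIKE 'ABCD%' THEN 'FFFF' || SUBSTR(word, 5)
            WHEN word LIKE 'XYZ%'  THEN 'GGG'  || SUBSTR(word, 4)
            ELSE word
       END AS via_case,
       -- Two nested REGEXP_REPLACE calls, each anchored at the start of the word.
       REGEXP_REPLACE(REGEXP_REPLACE(word, '^ABCD', 'FFFF'), '^XYZ', 'GGG') AS via_regexp
FROM words;
-- ABCD123 gives FFFF123, XYZ123 gives GGG123, OTHER is returned unchanged (in both columns)
```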

oracle searching word within the input string

I have to find out whether an input word occurs in another, pipe-delimited string. I am trying the approach below, but it surprisingly returns 'Y' instead of 'N'. Please let me know what I am doing wrong in the CASE expression below.
CASE
WHEN REGEXP_INSTR('TCS|XY|XZ','CS',1,1,1,'i') > 0
THEN 'Y'
ELSE 'N'
END
Regards,
Raj
There is really no need to use the regexp_instr() regular expression function. If you just need to know whether one character literal is part of another, the instr() function will completely cover your needs:
with t1(col) as(
select 'TCS|XY|XZ' from dual union all
select 'TAB|XY|XZ' from dual
)
select col
, case
when instr(col, 'CS') > 0
then 'Y'
else 'N'
end as Is_Part
from t1
Result:
COL IS_PART
--------- -------
TCS|XY|XZ Y
TAB|XY|XZ N
Edit
If you need to take vertical bars into consideration - returning yes only if there is a standalone CS sub-string surrounded by vertical bars |CS| then yes, you could use regexp_instr() regular expression function as follows:
with t1(col) as(
select 'TCS|XY|XZ|' from dual
)
select col
, case
when regexp_instr(col, '(\||^)CS(\||$)', 1, 1, 0, 'i') > 0
then 'YES'
else 'NO'
end as res
from t1
Result:
COL RES
---------- ---
TCS|XY|XZ| NO
Note: If a character literal is dynamic you could use a concatenation operator || to form a search pattern '(\||^)' || <<'character literal', column or variable>> || '(\||$)'
The first field (TCS) contains CS, which counts as a match.
If you want to match an entire field, you can do it like this:
CASE
WHEN REGEXP_INSTR('|' || 'TCS|XY|XZ' || '|' , '\|' || 'CS' || '\|',1,1,1,'i') > 0
THEN 'Y'
ELSE 'N'
END
Add the delimiter to your query string to "anchor" the search to whole fields. To be able to match the first and last field I also added the delimiter to the searched string.
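A short sketch of that whole-field check over a few rows. The rows other than TCS|XY|XZ and the names t, whole_field and via_instr are illustrative; the second column is an assumed regex-free variant that uses UPPER to stand in for the 'i' flag.

```sql
WITH t (col) AS (
  SELECT 'TCS|XY|XZ' FROM DUAL UNION ALL
  SELECT 'CS|XY|XZ'  FROM DUAL UNION ALL
  SELECT 'AB|XY|cs'  FROM DUAL
)
SELECT col,
       -- Delimiters are added on both sides so the first and last fields can match too.
       CASE WHEN REGEXP_INSTR('|' || col || '|', '\|' || 'CS' || '\|', 1, 1, 1, 'i') > 0
            THEN 'Y' ELSE 'N' END AS whole_field,
       -- Same idea with plain INSTR; UPPER makes the comparison case-insensitive.
       CASE WHEN INSTR('|' || UPPER(col) || '|', '|CS|') > 0
            THEN 'Y' ELSE 'N' END AS via_instr
FROM t;
-- TCS|XY|XZ: N / N
-- CS|XY|XZ:  Y / Y
-- AB|XY|cs:  Y / Y
```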