Remove characters between specific characters in PL/SQL

I need to get a substring from the example below:
luvi.luci#gma
I want to return luci. So basically I need to remove everything before '.' and after '#'.
More examples:
pd.prd#gded

You can do this with regexp_substr(). Here is an example:
select translate(regexp_substr(email, '[.].*#', 1, 1), 'x.#', 'x')
from (select 'luvi.luci#gma' as email from dual) x

with data (val) as
(
select null from dual union all
select 'luvi.luci' from dual union all
select 'luvi.luci#gma' from dual union all
select 'pd.prd#gded' from dual
)
-- step:1
-- find the second group (\2) within the match
-- ie. (any word/sequence of characters (\w+) flanked by a dot and a #)
-- step:2
-- |. OR any other character not matched in step:1 - will be ignored
-- step:3
-- \2 for each match found while parsing, for the entire match,
-- replace it with the second group - so the dot and the # are dropped from the match
select val, regexp_replace (val, '(\.(\w+)#)|.', '\2') ss from data;
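
A shorter variant is sketched below. It assumes Oracle 11g or later, where regexp_substr() accepts a sixth argument selecting a subexpression; the sample rows and the alias ss are reused from the answer above. The pattern captures everything between a dot and the next '#', and only that captured group is returned.

```sql
-- Return only the first capture group: the text between '.' and the next '#'.
-- Rows without both delimiters yield NULL.
with data (val) as (
  select 'luvi.luci#gma' from dual union all
  select 'pd.prd#gded' from dual
)
select val,
       regexp_substr(val, '\.([^#]*)#', 1, 1, null, 1) as ss
from data;
-- luvi.luci#gma -> luci
-- pd.prd#gded   -> prd
```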

Related

How to select the list of words containing a particular substring as part of a SQL query (oracle)?

I'm trying to return the list of "words" (separated by spaces) containing a certain substring within a string as part of an Oracle Sql query. Would like to return the result as a comma separated list. Separate rows for each match would also work.
Example String in [text_col] field:
some words 123-asdf-789A and also this one 456-asdf-555A more words etc.
Desired result: 123-asdf-789A, 456-asdf-555A
This is what I have so far but it only returns the first result and the fact that it's two separate regular expressions makes it difficult to concatenate all matches as I would like to do.
CONCAT(REGEXP_SUBSTR(text_col, ''(([^[:space:]]+)\asdf)'', 1, 1, ''i'', 1),
REGEXP_SUBSTR(text_col, ''\asdf([^[:space:]]+)'', 1, 1, ''i'', 1))
You can combine several regexp functions:
with tab(str) as
(
select 'some words 123-asdf-789A and also this one 456-asdf-555A more words etc' from dual
), t as
(
select regexp_substr(str,'[^[:space:]]+',1,level) as str, level as lvl
from tab
connect by level <= regexp_count(str,'[[:space:]]+') + 1
)
select listagg(str,',') within group (order by lvl) as "Result"
from t
where regexp_like(str,'-');
Result
---------------------------------
123-asdf-789A,456-asdf-555A
First split the string on whitespace (using the POSIX class [:space:]), keep the words that contain a dash, and finally concatenate them with the listagg() function.
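
The same idea can also be sketched by matching the words that contain the substring directly, instead of splitting on whitespace and filtering on a dash. This is only an illustration for a single input row; the 'i' flag (case-insensitive) and the ', ' separator are assumptions taken from the question rather than from the answer above.

```sql
-- Generate one row per word containing 'asdf', then aggregate them.
-- Works as written for a single source row only.
with tab(str) as (
  select 'some words 123-asdf-789A and also this one 456-asdf-555A more words etc.' from dual
)
select listagg(regexp_substr(str, '[^[:space:]]*asdf[^[:space:]]*', 1, level, 'i'), ', ')
         within group (order by level) as "Result"
from tab
connect by level <= regexp_count(str, '[^[:space:]]*asdf[^[:space:]]*', 1, 'i');
-- Result: 123-asdf-789A, 456-asdf-555A
```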
Use a recursive sub-query factoring clause and iterate through all the matches concatenating the string as you go:
Oracle Setup:
CREATE TABLE test_data ( value ) AS
SELECT 'some words 123-asdf-789A and also this one 456-asdf-555A more words etc.' FROM DUAL UNION ALL
SELECT 'some words without the expected sub-string' FROM DUAL UNION ALL
SELECT 'asdf asdf-123 456-asdf 78-asdf-90' FROM DUAL
Query:
WITH matches ( value, idx, cnt, match ) AS (
SELECT value,
0,
REGEXP_COUNT( value, '\S*asdf\S*' ),
CAST( NULL AS VARCHAR2(4000) )
FROM test_data
UNION ALL
SELECT value,
idx + 1,
cnt,
CASE idx WHEN 0 THEN '' ELSE match || ' ' END
|| REGEXP_SUBSTR( value, '\S*asdf\S*', 1, idx + 1 )
FROM matches
WHERE idx < cnt
)
SELECT value, match
FROM matches
WHERE idx = cnt;
Output:
VALUE | MATCH
:----------------------------------------------------------------------- | :--------------------------------
some words without the expected sub-string | null
some words 123-asdf-789A and also this one 456-asdf-555A more words etc. | 123-asdf-789A 456-asdf-555A
asdf asdf-123 456-asdf 78-asdf-90 | asdf asdf-123 456-asdf 78-asdf-90

Update ID value to format XXXXXXXX-X using oracle SQL

Table name: TEST
Column name: ID [VARCHAR(200)]
The format of ID is ‘XXXXXXXX-X’, where ‘X’ is a number from 0 to 9.
Additional operations in case above format is not satisfied:
if the ID consists of 9 digits and there is a double dash between the eighth and ninth digits, the extra dash is removed (e.g. 08452142--6 -> 08452142-6)
if the ID consists of 9 digits and there are spaces and/or other non-digit, non-letter symbols between the eighth and ninth digits, replace them with a dash (e.g. 08452142 - . 3 -> 08452142-3)
if the ID consists of 9 digits and starts or ends with non-digit, non-letter symbols, delete those symbols up to the nearest digit (e.g. 08452142-2.. -> 08452142-2)
if the ID contains only 9 digits, put a dash before the last digit (e.g. 123456789 -> 12345678-9)
I have achieved the necessary format by using the snippet below.
UPDATE TEST
SET ID = (SELECT REGEXP_REPLACE(ID,'^\d{8}-\d{1}$','') AS "ID"
from TEST
WHERE PK = 11
);
What are the possible ways to add the transformations listed in points [1-4] above in a single query?
Using REGEXP_REPLACE, I can handle an ID that is already in the above format. But when the format is incorrect and the ID needs to be transformed (for example, removing an extra dash, or adding a dash when 9 digits are received) to reach the required format, how can that be achieved in a single UPDATE query?
In any case, you first need to extract the 9 digits from your string, and then
add a hyphen before the last digit. Use the regexp_replace() function for both steps:
with test(id) as
(
select '08452142--6' from dual union all
select '08452142 - . 3' from dual union all
select '08452142-2..' from dual union all
select '123456789' from dual union all
select '1234567890' from dual
)
select case when length(regexp_replace(id,'(\D)'))=9 then
regexp_replace(regexp_replace(id,'(\D)'),
'(^[[:digit:]]{8})(.*)([[:digit:]]{1}$)','\1-\3')
end as id
from test;
ID
----------
08452142-6
08452142-3
08452142-2
12345678-9
<null>
You can use the following I think:
UPDATE TEST
SET ID = REGEXP_REPLACE(ID,'^\D*(\d{8})\D*(\d)\D*$','\1-\2')
WHERE REGEXP_LIKE(ID,'^\D*(\d{8})\D*(\d)\D*$')
This way you ignore all non-digit characters and search for an 8-digit number followed by a 1-digit number. Take these two numbers and put a single '-' in between.
This is a little more generous than you might need, but it should work with all the examples you provided.
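
Before running the UPDATE, the pattern can be checked against sample values with a plain SELECT. The sketch below reuses the sample IDs from the earlier answer; the new_id alias and the choice to leave non-matching rows unchanged are assumptions made for illustration.

```sql
-- Preview what the UPDATE would do: rows that match are normalised,
-- rows that do not match (e.g. 10 digits) are shown unchanged.
with test(id) as (
  select '08452142--6' from dual union all
  select '08452142 - . 3' from dual union all
  select '08452142-2..' from dual union all
  select '123456789' from dual union all
  select '1234567890' from dual
)
select id,
       case when regexp_like(id, '^\D*(\d{8})\D*(\d)\D*$')
            then regexp_replace(id, '^\D*(\d{8})\D*(\d)\D*$', '\1-\2')
            else id
       end as new_id
from test;
-- 08452142--6    -> 08452142-6
-- 08452142 - . 3 -> 08452142-3
-- 08452142-2..   -> 08452142-2
-- 123456789      -> 12345678-9
-- 1234567890     -> 1234567890 (no match, left as is)
```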
I think you want the first 8 digits, then a hyphen, then the 9th digit:
select ( substr(regexp_replace(id, '[^0-9]', ''), 1, 8) ||
'-' ||
substr(regexp_replace(id, '[^0-9]', ''), 9, 1)
) as id
from test;
I tried an approach based on the suggestion by @BarbarosÖzhan:
with source as (
select '02426467--6' id from dual union all
select '02426467-6' id from dual union all
select '02597718 -- . 3' id from dual union all
select '02597718 --dF5 . 3' id from dual union all
select '00120792-2..' id from dual union all
select '..00120792-2..' id from dual union all
select '123456789' id from dual union all
select '1234567890' id from dual
)
select
case
when regexp_like(id, '\d{8}-\d{1}')
then id
else
case
when regexp_count(id, '\d') = 9
then
case
when
regexp_like(
regexp_replace(
regexp_replace(
id, '(\d{8}-)(-)(\d{1})', '\1\3'
), '(\d{8})([^A-Za-z1-9])(\d{1})', '\1-\3'
)
, '\d{8}-\d{1}')
then
regexp_replace(
regexp_replace(
id, '(\d{8}-)(-)(\d{1})', '\1\3'
), '(\d{8})([^A-Za-z1-9])(\d{1})', '\1-\3'
)
else id
end
else id
end
end id_tr
from source
However, in cases 3 and 4, I cannot get rid of the space, dot and letters. I think something is wrong with the logic when the length is more than 9: I end up with "id" as it is, so the result is the same as the input, without any modifications.
Any suggestions to improve this?

Regexp_replace processing result

I have a string with groups of numbers, and I'd like every group to have a constant length. For now I use two regexp_replace calls: the first prepends 10 zeros to each group, and the second cuts each group down to its last 10 digits:
with s(txt) as ( select '1030123:12031:1341' from dual)
select regexp_replace(
regexp_replace(txt, '(\d+)','0000000000\1')
,'\d+(\d{10})','\1') from s ;
But I'd like to use only one regex, something like
regexp_replace(txt, '(\d+)',lpad('\1',10,'0'))
But it doesn't work: lpad is executed before the regexp. Do you have any ideas?
With a slightly different approach, you can try the following:
with s(id, txt) as
(
select rownum, txt
from (
select '1030123:12031:1341' as txt from dual union all
select '1234:0123456789:1341' from dual
)
)
SELECT listagg(lpad(regexp_substr(s.txt, '[^:]+', 1, lines.column_value), 10, '0'), ':') within group (order by column_value) txt
FROM s,
TABLE (CAST (MULTISET
(SELECT LEVEL FROM dual CONNECT BY instr(s.txt, ':', 1, LEVEL - 1) > 0
) AS sys.odciNumberList )) lines
group by id
TXT
-----------------------------------
0001030123:0000012031:0000001341
0000001234:0123456789:0000001341
This uses CONNECT BY to split every string on the separator ':', then uses LPAD to pad each part to 10 characters, and finally aggregates the padded values back into one string per row.
This variant pads each non-empty sequence of digits and leaves empty ones untouched (e.g. 123::456):
with s(txt) as ( select '1030123:12031:1341' from dual)
select regexp_replace (regexp_replace (txt,'(\d+)',lpad('0',10,'0') || '\1'),'0*(\d{10})','\1')
from s
;
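
To see why the single-call attempt from the question does not pad each group, it may help to evaluate the lpad() call on its own. The sketch below uses only the sample string from the question; the column aliases are made up for illustration.

```sql
-- lpad() is an ordinary function argument, so it is evaluated first,
-- on the literal two-character string '\1', before regexp_replace runs.
with s(txt) as (select '1030123:12031:1341' from dual)
select lpad('\1', 10, '0') as replacement_string,
       regexp_replace(txt, '(\d+)', lpad('\1', 10, '0')) as result
from s;
-- replacement_string: 00000000\1
-- result:             000000001030123:0000000012031:000000001341
```

Every group simply gets eight zeros prepended, regardless of its length, which is why a second step (or a split, pad and aggregate approach) is needed.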

substring, after last occurrence of character?

I need help with this problem:
I have a column named phone_number and I want to query this column to get the string to the right of the last occurrence of '.' for all kinds of numbers in a single SQL query.
Example numbers:
515.123.1277
011.44.1345.629268
I need to get 1277 and 629268 respectively.
I have this so far:
select phone_number,
case when length(phone_number) <= 12
then
substr(phone_number,-4)
else
substr (phone_number, -6) end
from employees;
This works for this example, but I want it for all kinds of number formats.
Would be great to get some input.
Thanks
It should be as easy as this regex:
SELECT phone_number, REGEXP_SUBSTR(phone_number, '[^.]*$')
FROM employees;
With the end anchor $ it returns everything after the final '.' character, that is, the trailing run of characters that are not '.'. If the last character is '.', it returns NULL.
Search for a pattern consisting of a period, [.], followed by digits, \d+, followed by the end of the string, $.
Associate the digits with a capture group by placing the pattern, \d+, in parentheses (see below). This group is referenced with the subexpr parameter, 1 (the last parameter).
Here is the solution:
WITH t AS
( SELECT '414.352.3100' p_number FROM dual
UNION ALL
SELECT '515.123.1277' FROM dual
UNION ALL
SELECT '011.44.1345.629268' FROM dual
)
SELECT regexp_substr(t.p_number, '[.](\d+)$', 1, 1, NULL, 1) end_num FROM t;
END_NUM
========================================================================
3100
1277
629268
You can do something like this in Oracle:
select regexp_substr(num,'[^\.]+',1,regexp_count(num,'\.')+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );
Before 11gR2 you can use regexp_replace instead of regexp_count:
select regexp_substr(num,'[^\.]+',1,length(regexp_replace (num , '[^\.]+'))+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );
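
A regex is not strictly required here. As a minimal sketch, assuming the same two sample numbers from the question, instr() with a negative start position finds the last '.' and substr() takes everything after it; the alias last_part is made up for illustration.

```sql
-- instr() with a negative start position searches backwards from the end,
-- so it returns the position of the last '.' (or 0 if there is none).
with t(phone_number) as (
  select '515.123.1277' from dual union all
  select '011.44.1345.629268' from dual
)
select phone_number,
       substr(phone_number, instr(phone_number, '.', -1) + 1) as last_part
from t;
-- 515.123.1277       -> 1277
-- 011.44.1345.629268 -> 629268
```

If the value contains no '.', this returns the whole string; if it ends with '.', it returns NULL.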

How to add a delimiter at a particular position in Oracle

Hi, I have a string like this:
ABCDEFGH, and I want the output to be ABCDEF.GH
If it's a number like 1234567, then I want the output to be 12345.67
Basically I want the delimiter (.) before the last 2 characters.
You can use regular expressions for this:
with v_data(val) as (
select '123456' from dual union all
select 'abcdef' from dual union all
select '678' from dual
)
select
val,
regexp_replace(val, '(\d+)(\d{2})', '\1.\2')
from v_data
This matches
one or more digits (\d+) (capturing them in group #1)
followed by exactly two digits (\d{2}) (capturing them in group #2)
and replaces this with the contents of group #1 followed by a . followed by the contents of group #2: \1.\2
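
Since \d only matches digits, the query above leaves alphabetic values such as 'abcdef' unchanged. A possible generalisation, sketched here with the sample values from the question, is to match any characters instead; the alias with_delimiter is made up for illustration.

```sql
-- Capture everything except the last two characters in group 1
-- and the last two characters in group 2, then join them with a dot.
-- Values shorter than three characters do not match and stay unchanged.
with v_data(val) as (
  select 'ABCDEFGH' from dual union all
  select '1234567' from dual union all
  select '678' from dual
)
select val,
       regexp_replace(val, '^(.+)(.{2})$', '\1.\2') as with_delimiter
from v_data;
-- ABCDEFGH -> ABCDEF.GH
-- 1234567  -> 12345.67
-- 678      -> 6.78
```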