substring, after last occurrence of character? - sql

I need help with this problem:
I have a column named phone_number and I want to query it to get the string to the right of the last occurrence of '.' for all kinds of numbers in a single SQL query.
Example numbers:
515.123.1277
011.44.1345.629268
I need to get 1277 and 629268 respectively.
I have this so far:
select phone_number,
case when length(phone_number) <= 12
then
substr(phone_number,-4)
else
substr (phone_number, -6) end
from employees;
This works for these examples, but I want it to work for all kinds of number formats.
Would be great to get some input.
Thanks

It should be as easy as this regex:
SELECT phone_number, REGEXP_SUBSTR(phone_number, '[^.]*$')
FROM employees;
With the end anchor $, it should return everything after the final '.' that is not a '.' character. If the last character is '.', it will return NULL.
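For comparison, the same result can be obtained without regular expressions. This is only a sketch, assuming Oracle's INSTR with a negative start position (which searches backward from the end of the string); the sample_numbers rows are a stand-in for the employees table.

```sql
-- Sketch: take everything after the last '.' using plain string functions.
-- INSTR(..., '.', -1) searches backward and returns the position of the last '.',
-- or 0 when there is no '.', in which case SUBSTR(..., 1) returns the whole string.
WITH sample_numbers (phone_number) AS (
  SELECT '515.123.1277' FROM dual UNION ALL
  SELECT '011.44.1345.629268' FROM dual
)
SELECT phone_number,
       SUBSTR(phone_number, INSTR(phone_number, '.', -1) + 1) AS last_part
FROM sample_numbers;
```

For the two sample rows this yields 1277 and 629268. As with the regex version, a value that ends in '.' gives NULL.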

Search for a pattern consisting of a period, [.], followed by digits, \d+, followed by the end of the string, $.
Make the digits a capture group by placing the pattern, \d+, in parentheses (see below). The group is referenced with the subexpr parameter, 1 (the last parameter).
Here is the solution:
WITH t AS
( SELECT '414.352.3100' p_number FROM dual
UNION ALL
SELECT '515.123.1277' FROM dual
UNION ALL
SELECT '011.44.1345.629268' FROM dual
)
SELECT regexp_substr(t.p_number, '[.](\d+)$', 1, 1, NULL, 1) end_num FROM t;
END_NUM
========================================================================
3100
1277
629268

You can do something like this in Oracle:
select regexp_substr(num,'[^\.]+',1,regexp_count(num,'\.')+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );
Previous to 11gR2 you can use regexp_replace instead of regexp_count:
select regexp_substr(num,'[^\.]+',1,length(regexp_replace (num , '[^\.]+'))+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );

Related

SQL query to get 6 digits in the output value?

I have a query that gives the output like
select position_id from per_all_people_F
Position_id
FRT567
GFT890
GFT000876
ABC00046
How do I make sure that, after the first 3 letters, the numeric part always has 6 digits?
For example:
FRT567 should be FRT000567,
GFT890 should be GFT000890,
and ABC00046 should be ABC000046.
How can I tweak my query to accommodate this change?
You can use
SELECT SUBSTR(position_id,1,3)||LPAD(REGEXP_REPLACE(position_id,'\D+'),6,'0')
FROM per_all_people_F
assuming all your data follows the format of the presented samples.
first piece: ordinary substring extraction of the first three letters
second piece: only the digits are extracted from the string using REGEXP_REPLACE(), then left-padded with zeroes up to six characters
then the pieces are concatenated with the double pipe (||) operator
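A minimal sketch of that query against inline sample rows (the t CTE is a hypothetical stand-in for per_all_people_F), showing each piece in its own column. One caveat worth checking on real data: LPAD truncates values longer than the target length, so a numeric part with more than six digits would be cut down to six.

```sql
WITH t (position_id) AS (
  SELECT 'FRT567'    FROM dual UNION ALL
  SELECT 'GFT000876' FROM dual UNION ALL
  SELECT 'ABC00046'  FROM dual
)
SELECT position_id,
       SUBSTR(position_id, 1, 3)          AS letters,      -- first piece
       REGEXP_REPLACE(position_id, '\D+') AS digits_only,  -- non-digits removed
       SUBSTR(position_id, 1, 3)
         || LPAD(REGEXP_REPLACE(position_id, '\D+'), 6, '0') AS padded_id
FROM t;
```

For these rows padded_id comes out as FRT000567, GFT000876 and ABC000046.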
Using only standard string functions (no regular expressions) - split the string after the initial three letters and concatenate the required number of zeros in the middle. This will work correctly even when there are no digits to begin with (the entire input string is just the three letters).
with
t (position_id) as (
select 'FRT567' from dual union all
select 'GFT890' from dual union all
select 'GFT000876' from dual union all
select 'ABC00046' from dual union all
select 'XQY' from dual
)
select position_id,
substr(position_id, 1, 3) || rpad('0', 9 - length(position_id), '0') ||
substr(position_id, 4) as valid_position_id
from t;
POSITION_ID VALID_POSITION_ID
------------ --------------------
FRT567 FRT000567
GFT890 GFT000890
GFT000876 GFT000876
ABC00046 ABC000046
XQY XQY000000
You can use simple string functions: take the first 3 characters with SUBSTR(position_id, 1, 3) and concatenate (||) them with the remaining characters left-padded with zeroes to a length of 6 using LPAD(SUBSTR(position_id, 4), 6, '0'). If you can have 3-character strings, you can use COALESCE to make sure there are always 6 digits:
SELECT position_id,
SUBSTR(position_id, 1, 3) || LPAD(SUBSTR(position_id, 4), 6, '0')
AS expanded_position_id,
-- Optional version for short strings
SUBSTR(position_id, 1, 3)
|| COALESCE(LPAD(SUBSTR(position_id, 4), 6, '0'), '000000')
AS expanded_position_id2
FROM per_all_people_F
Which, for the sample data:
CREATE TABLE per_all_people_F (position_id) as
SELECT 'FRT567' FROM DUAL UNION ALL
SELECT 'GFT890' FROM DUAL UNION ALL
SELECT 'GFT000876' FROM DUAL UNION ALL
SELECT 'ABC00046' FROM DUAL UNION ALL
SELECT 'ABC' FROM DUAL;
Outputs:
POSITION_ID  EXPANDED_POSITION_ID  EXPANDED_POSITION_ID2
-----------  --------------------  ---------------------
FRT567       FRT000567             FRT000567
GFT890       GFT000890             GFT000890
GFT000876    GFT000876             GFT000876
ABC00046     ABC000046             ABC000046
ABC          ABC                   ABC000000

Ltrim trimming extra character

I have the below code:
SELECT
ltrim('REASON_ACTIVE_DCA', 'REASON_') reason
FROM
dual
However, I'm obtaining 'CTIVE_DCA'. What's happening, and how can I get 'ACTIVE_DCA' with ltrim?
Because LTRIM() treats its second argument as a set of characters, not as a string. All leading "R"s, "E"s and so on are removed until a character outside the set is reached. In fact, the order of the characters in the second argument is irrelevant, so you would get the same result with '_NOSAER'.
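To see the set-based behavior, the two argument orders can be compared side by side. A small sketch, reusing the literal from the question (the column aliases are made up for illustration):

```sql
-- Both calls strip leading characters that belong to the set {R, E, A, S, O, N, _}.
-- Trimming stops at the first character outside the set (the 'C'),
-- so the 'A' of ACTIVE is removed as well.
SELECT LTRIM('REASON_ACTIVE_DCA', 'REASON_') AS trimmed_in_order,
       LTRIM('REASON_ACTIVE_DCA', '_NOSAER') AS trimmed_shuffled
FROM dual;
```

Both columns return the same value.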
If you want to remove the leading string REASON_ (if present), then you don't use ltrim(). Instead, one method is:
select (case when 'REASON_ACTIVE_DCA' LIKE 'REASON$_%' ESCAPE '$'
then substr('REASON_ACTIVE_DCA', 8)
else 'REASON_ACTIVE_DCA'
end)
from dual
There are other ways, such as:
select regexp_replace('REASON_ACTIVE_DCA', '^REASON_', '')
from dual
I would do it with regular string functions (not regular expressions), and using INSTR instead of LIKE so I don't have to worry about escaping underscore.
Something like this - including a few sample strings in the WITH clause for testing:
with
inputs (i_str) as (
select 'REASON_ACTIVE_DCA' from dual union all
select 'REASON_NOT_GIVEN' from dual union all
select null from dual union all
select 'REASON-SPECIAL' from dual union all
select 'REASON_' from dual union all
select 'REASON' from dual
)
select i_str, substr(i_str, case instr(i_str, 'REASON_')
when 1 then 1 + length('REASON_')
else 1 end) as new_str
from inputs;
I_STR NEW_STR
----------------- -----------------
REASON_ACTIVE_DCA ACTIVE_DCA
REASON_NOT_GIVEN NOT_GIVEN
REASON-SPECIAL REASON-SPECIAL
REASON_
REASON REASON

Update ID value to format XXXXXXXX-X using oracle SQL

Table name: TEST
Column name: ID [VARCHAR(200)]
The format of ID is ‘XXXXXXXX-X’, where ‘X’ is a number from 0 to 9.
Additional operations in case above format is not satisfied:
if the ID consists of 9 digits and there is a double dash between the eighth and ninth digits, the extra dash is removed (e.g. 08452142--6 -> 08452142-6)
if the ID consists of 9 digits and there are spaces and/or other non-digit, non-letter symbols between the eighth and ninth digits, replace them with a dash (e.g. 08452142 - . 3 -> 08452142-3)
if the ID consists of 9 digits and starts/ends with non-digit, non-letter symbols, delete those symbols up to the nearest digit (e.g. 08452142-2.. -> 08452142-2)
if the ID contains only 9 digits, put a dash before the last digit (e.g. 123456789 -> 12345678-9)
I have achieved the necessary format by using the below snippet.
UPDATE TEST
SET ID = (SELECT REGEXP_REPLACE(ID,'^\d{8}-\d{1}$','') AS "ID"
from TEST
WHERE PK = 11
);
What are the possible ways to add the transformations mentioned in points 1-4 above in a single query?
Using REGEXP_REPLACE, I can match an ID in the above format. But when the format is incorrect and the ID needs to be transformed (for example, removing an extra dash, or adding a dash when 9 digits are received) to reach a valid format, how can that be achieved in a single UPDATE query?
In any case, you first need to extract the 9 digits from your string, and then add a hyphen before the last digit. Use the regexp_replace() function for both steps:
with test(id) as
(
select '08452142--6' from dual union all
select '08452142 - . 3' from dual union all
select '08452142-2..' from dual union all
select '123456789' from dual union all
select '1234567890' from dual
)
select case when length(regexp_replace(id,'(\D)'))=9 then
regexp_replace(regexp_replace(id,'(\D)'),
'(^[[:digit:]]{8})(.*)([[:digit:]]{1}$)','\1-\3')
end as id
from test;
ID
----------
08452142-6
08452142-3
08452142-2
12345678-9
<null>
You can use the following I think:
UPDATE TEST
SET ID = REGEXP_REPLACE(ID,'^\D*(\d{8})\D*(\d)\D*$','\1-\2')
WHERE REGEXP_LIKE(ID,'^\D*(\d{8})\D*(\d)\D*$')
This way you ignore all non-digit characters and search for an 8-digit number followed by a 1-digit number. Take these 2 numbers and put a single '-' in between.
This is a little more generous than you might need, but it should work with all your provided examples.
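Before running the UPDATE, it may help to preview what the pattern would do. The sketch below uses the sample IDs from the question (the samples CTE and the 'not updated' marker are hypothetical, for illustration only) and flags rows that the WHERE clause would skip.

```sql
WITH samples (id) AS (
  SELECT '08452142--6'    FROM dual UNION ALL
  SELECT '08452142 - . 3' FROM dual UNION ALL
  SELECT '08452142-2..'   FROM dual UNION ALL
  SELECT '123456789'      FROM dual UNION ALL
  SELECT '1234567890'     FROM dual
)
SELECT id,
       CASE
         -- same pattern as in the UPDATE: 8 digits, then 1 digit, non-digits ignored
         WHEN REGEXP_LIKE(id, '^\D*(\d{8})\D*(\d)\D*$')
           THEN REGEXP_REPLACE(id, '^\D*(\d{8})\D*(\d)\D*$', '\1-\2')
         ELSE 'not updated'
       END AS new_id
FROM samples;
```

The first four rows become 08452142-6, 08452142-3, 08452142-2 and 12345678-9; the ten-digit value does not match the pattern, so the UPDATE would leave it alone.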
I think you want the first 8 digits, then a hyphen, then the 9th digit:
select ( substr(regexp_replace(id, '[^0-9]', ''), 1, 8) ||
'-' ||
substr(regexp_replace(id, '[^0-9]', ''), 9, 1)
) as id
from test
I tried an approach based on the suggestion by @BarbarosÖzhan:
with source as (
select '02426467--6' id from dual union all
select '02426467-6' id from dual union all
select '02597718 -- . 3' id from dual union all
select '02597718 --dF5 . 3' id from dual union all
select '00120792-2..' id from dual union all
select '..00120792-2..' id from dual union all
select '123456789' id from dual union all
select '1234567890' id from dual
)
select
  case
    when regexp_like(id, '\d{8}-\d{1}')
      then id
    else
      case
        when regexp_count(id, '\d') = 9
          then
            case
              when regexp_like(
                     regexp_replace(
                       regexp_replace(id, '(\d{8}-)(-)(\d{1})', '\1\3'),
                       '(\d{8})([^A-Za-z1-9])(\d{1})', '\1-\3'),
                     '\d{8}-\d{1}')
                then regexp_replace(
                       regexp_replace(id, '(\d{8}-)(-)(\d{1})', '\1\3'),
                       '(\d{8})([^A-Za-z1-9])(\d{1})', '\1-\3')
              else id
            end
        else id
      end
  end id_tr
from source
However, in cases 3 and 4, I cannot get rid of the space, dot and letters. I think something is wrong with the logic when the length is more than 9. I fall through to "id" as it is, so the result is the same as the input, without any modifications.
Any suggestions to improve this?

find invalid characters in string

I need a select statement that will show any invalid characters in the Customer number field.
A valid customer number starts with the capital letter N followed by 10 digits, each from 0 to 9.
Something like,
SELECT (CustomerField, 'N[0-9](10)') <> ''
FROM CustomerTable;
Use regexp_like.
select customerfield
from CustomerTable
where not regexp_like(CustomerField, '^N[0-9]{10}$')
This will show the customerfield values that don't follow the specified pattern.
If you really need to find the invalid characters in the string (and not to just simply find the strings that are invalid) perhaps this more complex query will help. You didn't state in what format you may need the output, so I made up my own. I also created several strings for testing (in particular, it is always important to check that the NULL input is treated correctly).
The column len shows the length of the input, if it's not 11. The length of the empty string (null in Oracle) is shown as 0. The first-nondigit columns refer to characters starting at the SECOND position in the string (ignoring the first character, for which the rules are different and which is checked for validity separately).
with
inputs ( str ) as (
select 'N0123456789' from dual union all
select '' from dual union all
select '02324434323' from dual union all
select 'N02345678' from dual union all
select 'A2140480080' from dual union all
select 'N93049c4995' from dual union all
select 'N4448883333' from dual union all
select 'PAR3993949Z' from dual union all
select 'AN39E' from dual
)
-- end of test data; query begins below this line
select str,
case when regexp_like(str, '^N\d{10}$') then 'valid'
else 'invalid' end as classif,
case when length(str) != 11 then length(str)
when str is null then 0 end as len,
case when substr(str, 1, 1) != 'N'
then substr(str, 1, 1) end as first_char,
regexp_substr(str, '[^0-9]', 2) as first_nondigit,
nullif(regexp_instr( str, '[^0-9]', 2), 0) as first_nondigit_pos
from inputs
;
OUTPUT
STR         CLASSIF   LEN FIRST_CHAR FIRST_NONDIG FIRST_NONDIGIT_POS
----------- ------- ----- ---------- ------------ ------------------
N0123456789 valid
            invalid     0
02324434323 invalid       0
N02345678   invalid     9
A2140480080 invalid       A
N93049c4995 invalid                  c                             7
N4448883333 valid
PAR3993949Z invalid       P          A                             2
AN39E       invalid     5 A          N                             2
9 rows selected.
\d stands for a digit; see "Perl-influenced Extensions in Oracle Regular Expressions" in the Oracle documentation.
The rest of the regular expression elements are listed under "Regular Expression Operator Multilingual Enhancements".
select *
from CustomerTable
where not regexp_like (CustomerField,'^N\d{10}$')

How to add a delimiter at a particular position in Oracle

Hi, I have a string like this:
ABCDEFGH, and I want the output to be ABCDEF.GH
If it's a number like 1234567, then I want the output to be 12345.67
Basically, I want the delimiter (.) before the last 2 characters.
You can use regular expressions for this:
with v_data(val) as (
select '123456' from dual union all
select 'abcdef' from dual union all
select '678' from dual
)
select
val,
regexp_replace(val, '(\d+)(\d{2})', '\1.\2')
from v_data
This matches
one or more digits (\d+) (capturing them in group #1)
followed by exactly two digits (\d{2}) (capturing them in group #2)
and replaces this with the contents of group #1 followed by a . followed by the contents of group #2: \1.\2
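Because \d matches digits only, the pattern above leaves a value such as 'abcdef' unchanged. If letters should be handled as well, as in the question's ABCDEFGH example, one possible variant (an assumption about the intent, not part of the answer above) anchors on the last two characters of any kind:

```sql
WITH v_data (val) AS (
  SELECT 'ABCDEFGH' FROM dual UNION ALL
  SELECT '1234567'  FROM dual
)
SELECT val,
       -- (.{2})$ captures the final two characters;
       -- the replacement puts a '.' in front of them
       REGEXP_REPLACE(val, '(.{2})$', '.\1') AS delimited
FROM v_data;
```

This gives ABCDEF.GH and 12345.67 for the two sample values. Note that a value of exactly two characters would come back with a leading '.', and shorter values are returned unchanged, so a length check may be needed depending on the data.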