Ltrim trimming extra character - sql

I have the below code:
SELECT
ltrim('REASON_ACTIVE_DCA', 'REASON_') reason
FROM
dual
However, I'm getting 'CTIVE_DCA'. What's happening, and how can I get 'ACTIVE_DCA' with ltrim?

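Because LTRIM() treats its second argument as a set of characters, not as a prefix. It strips leading characters for as long as each one is in that set, so the leading "R", "E", "A" and so on are all removed, including the "A" of ACTIVE. The order of the characters in the second string is irrelevant, so you would get the same result with '_NOSAER'.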
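A minimal sketch of that set behaviour, using only the literal from the question (the column aliases are made up for illustration). Both calls pass the same set of characters in a different order, so both should strip the same leading characters:

```sql
-- LTRIM stops at the first character that is NOT in the set.
-- R, E, A, S, O, N, _ and the following A are all in the set; C is not.
select ltrim('REASON_ACTIVE_DCA', 'REASON_') as original_order,   -- CTIVE_DCA
       ltrim('REASON_ACTIVE_DCA', '_NOSAER') as shuffled_order    -- CTIVE_DCA
from dual;
```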
If you want to remove the leading string of REASON_ -- if present -- then you don't use trim(). Instead, one method is:
select (case when 'REASON_ACTIVE_DCA' LIKE 'REASON$_%' ESCAPE '$'
then substr('REASON_ACTIVE_DCA', 8)
else 'REASON_ACTIVE_DCA'
end) as reason
from dual;
There are other ways, such as:
select regexp_replace('REASON_ACTIVE_DCA', '^REASON_', '') as reason
from dual;

I would do it with regular string functions (not regular expressions), and using INSTR instead of LIKE so I don't have to worry about escaping underscore.
Something like this - including a few sample strings in the WITH clause for testing:
with
inputs (i_str) as (
select 'REASON_ACTIVE_DCA' from dual union all
select 'REASON_NOT_GIVEN' from dual union all
select null from dual union all
select 'REASON-SPECIAL' from dual union all
select 'REASON_' from dual union all
select 'REASON' from dual
)
select i_str, substr(i_str, case instr(i_str, 'REASON_')
when 1 then 1 + length('REASON_')
else 1 end) as new_str
from inputs;
I_STR             NEW_STR
----------------- -----------------
REASON_ACTIVE_DCA ACTIVE_DCA
REASON_NOT_GIVEN  NOT_GIVEN
REASON-SPECIAL    REASON-SPECIAL
REASON_
REASON            REASON

Related

Find value that is not a number or a predefined string

I have to test a column of a sql table for invalid values and for NULL.
Valid values are: Any number and the string 'n.v.' (with and without the dots and in every possible combination as listed in my sql command)
So far, I've tried this:
select count(*)
from table1
where column1 is null
or not REGEXP_LIKE(column1, '^[0-9,nv,Nv,nV,NV,n.v,N.v,n.V,N.V]+$');
The regular expression also matches the single character values 'n','N','v','V' (with and without a following dot). This shouldn't be the case, because I only want the exact character combinations as written in the sql command to be matched. I guess the problem has to do with using REGEXP_LIKE. Any ideas?
I guess this regexp will work:
NOT REGEXP_LIKE(column1, '^([0-9]+|n\.?v\.?)$', 'i')
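Note that inside a bracket expression [...] every character, including the comma, is a literal member of the set, so , is not a separator between alternatives. Outside brackets, . means any character and \. means the dot character itself. The 'i' flag makes the match case-insensitive, so there is no need to hard-code every combination of upper and lower case characters.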
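As a rough check of the pattern above, here is a sketch that plugs it into the counting query from the question. The inline table1 rows are made-up test values, not data from the question:

```sql
with table1 (column1) as (
  select '123'  from dual union all  -- valid: digits only
  select 'n.v.' from dual union all  -- valid
  select 'NV'   from dual union all  -- valid: the 'i' flag ignores case
  select 'n'    from dual union all  -- invalid: single letter
  select 'n,v'  from dual union all  -- invalid: comma is not allowed
  select null   from dual            -- invalid: NULL
)
select count(*) as invalid_rows      -- expected: 3
from table1
where column1 is null
   or not regexp_like(column1, '^([0-9]+|n\.?v\.?)$', 'i');
```

The explicit IS NULL test is still needed, because REGEXP_LIKE on a NULL value evaluates to unknown rather than false.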
There is no need for a regexp here (avoiding it should perform better on large data); plain old TRANSLATE is good enough for your validation.
Note that the first call, translate(column1,'x0123456789','x'), removes all numeric characters from the string, so if you end up with null, the string is OK.
The second call, translate(lower(column1),'x.','x'), removes all dots from the lowercased string, so you expect the result nv.
To reject cases such as n.....v...., you also limit the string length.
select
column1,
case when
translate(column1,'x0123456789','x') is null or /* numeric string */
translate(lower(column1),'x.','x') = 'nv' and length(column1) <= 4 then 'OK'
end as status
from table1
COLUMN1     STATUS
----------- ------
1010101     OK
1012828n
1012828nv
n.....v....
n.V         OK
Test data
create table table1 as
select '1010101' column1 from dual union all -- OK numbers
select '1012828n' from dual union all -- invalid
select '1012828nv' from dual union all -- invalid
select 'n.....v....' from dual union all -- invalid
select 'n.V' from dual; -- OK nv
You can use the following, which counts the rows that are valid:
select count(*)
from table1
WHERE TRANSLATE(column1, ' 0123456789', ' ') IS NULL
OR LOWER(column1) IN ('nv', 'n.v', 'nv.', 'n.v.');
Which, for the sample data:
CREATE TABLE table1 (column1) AS
SELECT '12345' FROM DUAL UNION ALL
SELECT 'nv' FROM DUAL UNION ALL
SELECT 'NV' FROM DUAL UNION ALL
SELECT 'nV' FROM DUAL UNION ALL
SELECT 'n.V.' FROM DUAL UNION ALL
SELECT '...................n.V.....................' FROM DUAL UNION ALL
SELECT '..nV' FROM DUAL UNION ALL
SELECT 'n..V' FROM DUAL UNION ALL
SELECT 'nV..' FROM DUAL UNION ALL
SELECT 'xyz' FROM DUAL UNION ALL
SELECT '123nv' FROM DUAL;
Outputs:
COUNT(*)
5
Or, if you want to allow any number of dots:
select count(*)
from table1
WHERE TRANSLATE(column1, ' 0123456789', ' ') IS NULL
OR REPLACE(LOWER(column1), '.') = 'nv';
Which outputs:
COUNT(*)
9

How to remove all characters except 'E' in oracle

I have a string like 'AA_0331L_02317_R5_P' and I want to remove all characters except 'E' from the end of the second part after splitting on the _ character. So 0331N should become 0331, while 0331E should stay 0331E. In other words, if I have a string like AA_0331N_02317_R5_P, I want it to become AA_0331_02317_R5_P, and if I have AA_0331E_02317_R5_P, it should stay AA_0331E_02317_R5_P. I tried the following without any luck:
SELECT REGEXP_REPLACE(REGEXP_SUBSTR( 'AA_0331L_02317_R5_P' , '[^_]+', 1, 2 ), '[^0-9]', '')
FROM dual
You might try something like the following -- keeping in mind that REGEXP_REPLACE() will return the original string if nothing is actually replaced. Here I'm using backreferences (if Oracle regexes had lookahead I could have omitted the 2nd capturing group and backreference):
WITH mytable AS (
SELECT 'AA_0331L_02317_R5_P' AS myvalue
FROM dual
UNION ALL
SELECT 'AA_0331N_02317_R5_P'
FROM dual
UNION ALL
SELECT 'AA_0331E_02317_R5_P'
FROM dual
)
SELECT myvalue
, REGEXP_REPLACE(myvalue, '^([^_]+_[^_]+)[^E](_)', '\1\2') mynewvalue
FROM mytable;
MYVALUE                   MYNEWVALUE
------------------------- -------------------------
AA_0331L_02317_R5_P       AA_0331_02317_R5_P
AA_0331N_02317_R5_P       AA_0331_02317_R5_P
AA_0331E_02317_R5_P       AA_0331E_02317_R5_P
with s as (
select 'AA_0331L_02317_R5_P' str from dual union all
select 'AA_0331E_02317_R5_P' str from dual)
select str,
regexp_replace(regexp_substr(str, '[^_]+_[^_]+'), '[^E]$') || regexp_replace(str, '[^_]+_[^_]+', '', 1, 1) new_str
from s;
STR                            NEW_STR
------------------------------ ------------------------------
AA_0331L_02317_R5_P            AA_0331_02317_R5_P
AA_0331E_02317_R5_P            AA_0331E_02317_R5_P

Is there any function in oracle to keep a string length fixed?

I have a scenario where, when a column value exceeds 10 characters, I need to take only the leftmost 10 characters as a substring, but if it is shorter than that, it should be left-padded with zeroes. I tried the following:
with data1 as (select '1234567890123' as dummy1 from dual)
select CASE when (length(dummy1)>10) then substr(dummy1,1,10) else lpad(dummy1,10,'0') end from data1;
But this seems like a rather long way to do it. Is there a shorter way to achieve this, maybe an Oracle function?
I tried to Google this but could not find any relevant result.
LPAD is enough to do the job:
SELECT LPAD( '1234567890123', 10, '0' ) AS formatted
FROM dual;
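This works because LPAD returns only the portion of the string that fits when the input is longer than the requested length, so it both truncates and pads. A small sketch with one long and one short value (the data CTE and its values are illustrative only):

```sql
with data (value) as (
  select '1234567890123' from dual union all  -- longer than 10: truncated
  select '1'             from dual            -- shorter than 10: padded
)
select value,
       lpad(value, 10, '0') as formatted      -- 1234567890 and 0000000001
from data;
```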
Just use SUBSTR and LPAD together:
WITH data ( value ) AS (
SELECT '1234567890123' FROM DUAL UNION ALL
SELECT '1' FROM DUAL
)
SELECT LPAD( SUBSTR( value, 1, 10 ), 10, '0' ) AS formatted
FROM data;
Output:
FORMATTED
----------
1234567890
0000000001

find invalid characters in string

I need a select statement that will show any invalid characters in the customer number field.
A valid customer number starts with the capital letter N followed by 10 digits, each from 0 to 9.
Something like,
SELECT (CustomerField, 'N[0-9](10)') <> ''
FROM CustomerTable;
Use regexp_like.
select customerfield
from CustomerTable
where not regexp_like(CustomerField, '^N[0-9]{10}$')
This will show the customerfield values that don't follow the specified pattern.
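One caveat: NOT REGEXP_LIKE evaluates to unknown for a NULL value, so rows with a NULL customer number are not returned by that query. If a missing number should also count as invalid (an assumption, since the question does not say), a sketch with an explicit NULL test could look like this; the inline CustomerTable rows are made-up test values:

```sql
with CustomerTable (CustomerField) as (
  select 'N0123456789' from dual union all  -- valid, not returned
  select 'N02345678'   from dual union all  -- too short
  select 'A2140480080' from dual union all  -- wrong first letter
  select null          from dual            -- missing value
)
select CustomerField
from CustomerTable
where CustomerField is null
   or not regexp_like(CustomerField, '^N[0-9]{10}$');
```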
If you really need to find the invalid characters in the string (and not to just simply find the strings that are invalid) perhaps this more complex query will help. You didn't state in what format you may need the output, so I made up my own. I also created several strings for testing (in particular, it is always important to check that the NULL input is treated correctly).
The column len shows the length of the input, if it's not 11. The length of the empty string (null in Oracle) is shown as 0. The first-nondigit columns refer to characters starting at the SECOND position in the string (ignoring the first character, for which the rules are different and which is checked for validity separately).
with
inputs ( str ) as (
select 'N0123456789' from dual union all
select '' from dual union all
select '02324434323' from dual union all
select 'N02345678' from dual union all
select 'A2140480080' from dual union all
select 'N93049c4995' from dual union all
select 'N4448883333' from dual union all
select 'PAR3993949Z' from dual union all
select 'AN39E' from dual
)
-- end of test data; query begins below this line
select str,
case when regexp_like(str, '^N\d{10}$') then 'valid'
else 'invalid' end as classif,
case when length(str) != 11 then length(str)
when str is null then 0 end as len,
case when substr(str, 1, 1) != 'N'
then substr(str, 1, 1) end as first_char,
regexp_substr(str, '[^0-9]', 2) as first_nondigit,
nullif(regexp_instr( str, '[^0-9]', 2), 0) as first_nondigit_pos
from inputs
;
OUTPUT
STR         CLASSIF   LEN FIRST_CHAR FIRST_NONDIG FIRST_NONDIGIT_POS
----------- ------- ----- ---------- ------------ ------------------
N0123456789 valid
            invalid     0
02324434323 invalid       0
N02345678   invalid     9
A2140480080 invalid       A
N93049c4995 invalid                  c                             7
N4448883333 valid
PAR3993949Z invalid       P          A                             2
AN39E       invalid     5 A          N                             2
9 rows selected.
\d stands for a digit; see "Perl-influenced Extensions in Oracle Regular Expressions".
The rest of the regular expression elements can be found under "Regular Expression Operator Multilingual Enhancements".
select *
from CustomerTable
where not regexp_like (CustomerField,'^N\d{10}$')

substring, after last occurrence of character?

I need help with this problem:
I have a column named phone_number and I want to query it to get the string to the right of the last occurrence of '.' for all kinds of numbers in a single SQL query.
example #:
515.123.1277
011.44.1345.629268
I need to get 1277 and 629268 respectively.
I have this so far:
select phone_number,
case when length(phone_number) <= 12
then
substr(phone_number,-4)
else
substr (phone_number, -6) end
from employees;
This works for this example, but I want it to work for all kinds of number formats.
Would be great to get some input.
Thanks
It should be as easy as this regex:
SELECT phone_number, REGEXP_SUBSTR(phone_number, '[^.]*$')
FROM employees;
With the end anchor $, the pattern matches the trailing run of characters that are not a '.', which is everything after the final '.'. If the last character is a '.', it returns NULL.
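For comparison, a non-regex alternative (not part of the answer above) is to locate the last '.' with INSTR searching backwards and take everything after it with SUBSTR. A minimal sketch, with an inline employees CTE standing in for the real table and a made-up third value that has no dot:

```sql
with employees (phone_number) as (
  select '515.123.1277'       from dual union all
  select '011.44.1345.629268' from dual union all
  select '5551234'            from dual          -- no dot at all
)
select phone_number,
       -- a position of -1 makes INSTR search from the end of the string;
       -- with no dot it returns 0, so SUBSTR starts at 1 and keeps everything
       substr(phone_number, instr(phone_number, '.', -1) + 1) as last_part
from employees;
```

For the three rows this should give 1277, 629268 and 5551234. Like the regex version, a value that ends in '.' yields NULL.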
Search for a pattern made of the period, [.], followed by digits, \d+, followed by the end of the string, $.
Put the digits in a capture group by wrapping the pattern \d+ in parentheses (see below). The group is selected with the subexpr parameter, 1 (the last parameter).
Here is the solution:
WITH t AS
( SELECT '414.352.3100' p_number FROM dual
UNION ALL
SELECT '515.123.1277' FROM dual
UNION ALL
SELECT '011.44.1345.629268' FROM dual
)
SELECT regexp_substr(t.p_number, '[.](\d+)$', 1, 1, NULL, 1) end_num FROM t;
END_NUM
========================================================================
3100
1277
629268
You can do something like this in Oracle:
select regexp_substr(num,'[^\.]+',1,regexp_count(num,'\.')+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );
Prior to 11gR2 you can use regexp_replace instead of regexp_count:
select regexp_substr(num,'[^\.]+',1,length(regexp_replace (num , '[^\.]+'))+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );