Regexp_substr expression - sql

I have a problem with my REGEXP expression. I want to run it in a loop so that every iteration deletes the text after the last slash. My expression currently looks like this:
REGEXP_SUBSTR('L1161148/1/10', '.*(/)')
I'm getting L1161148/1/ instead of L1161148/1.

You said you wanted to loop.
CAVEAT: Both of these solutions assume there are no NULL list elements (all slashes have a value in between them).
SQL> with tbl(data) as (
select 'L1161148/1/10' from dual
)
select level, nvl(substr(data, 1, instr(data, '/', 1, level)-1), data) formatted
from tbl
connect by level <= regexp_count(data, '/') + 1 -- Loop # of delimiters +1 times
order by level desc;
LEVEL FORMATTED
---------- -------------
3 L1161148/1/10
2 L1161148/1
1 L1161148
SQL>
EDIT: To handle multiple rows:
SQL> with tbl(rownbr, col1) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual
union
select 2, 'ALKDFJV1161148/123/456/789/1/2/3' from dual
)
SELECT rownbr, column_value substring_nbr,
nvl(substr(col1, 1, instr(col1, '/', 1, column_value)-1), col1) formatted
FROM tbl,
TABLE(
CAST(
MULTISET(SELECT LEVEL
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT(col1, '/')+1
) AS sys.OdciNumberList
)
)
order by rownbr, substring_nbr desc
;
ROWNBR SUBSTRING_NBR FORMATTED
---------- ------------- --------------------------------
1 7 L1161148/1/10/2/34/5/6
1 6 L1161148/1/10/2/34/5
1 5 L1161148/1/10/2/34
1 4 L1161148/1/10/2
1 3 L1161148/1/10
1 2 L1161148/1
1 1 L1161148
2 7 ALKDFJV1161148/123/456/789/1/2/3
2 6 ALKDFJV1161148/123/456/789/1/2
2 5 ALKDFJV1161148/123/456/789/1
2 4 ALKDFJV1161148/123/456/789
2 3 ALKDFJV1161148/123/456
2 2 ALKDFJV1161148/123
2 1 ALKDFJV1161148
14 rows selected.
SQL>

You can try removing the string after the last slash:
select regexp_replace('L1161148/1/10', '/([^/]*)$', '') from dual

You are trying to go as far as the last / and then "look back" and retain what was before it. With regular expressions you can do that with a subexpression, like this:
select regexp_substr('L1161148/1/10', '(.*)/.*', 1, 1, null, 1) from dual;
Here, as usual, the first argument "1" means where to start the search, the second "1" means which matching substring to choose, "null" means no special matching modifiers (like case-insensitive matching and such - not needed here), and the last "1" means return the first subexpression - the first thing in parentheses in the "match pattern."
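To see the difference side by side, here is a minimal sketch contrasting the whole match with the first subexpression. The column aliases are my own, not from the question, and the sixth argument of REGEXP_SUBSTR requires Oracle 11g or later.

```sql
-- Greedy .* runs up to the LAST slash in both patterns.
-- Without a subexpression index the whole match (slash included) is returned;
-- with subexpression 1 only the part captured by (.*) is returned.
select regexp_substr('L1161148/1/10', '.*(/)')                 as whole_match,   -- L1161148/1/
       regexp_substr('L1161148/1/10', '(.*)/', 1, 1, null, 1)  as first_subexpr  -- L1161148/1
from dual;
```

The first column reproduces the result from the question; the second is the value the question asks for.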
However, regular expressions should only be used when you can't do it with the standard substr and instr (and translate) functions. Here the job is quite easy:
instr(text_string, '/', -1)
will give you the position of the LAST / in text_string (the -1 means find the last occurrence, instead of the first: count from the end of the string). So the whole thing can be written as:
select substr('L1161148/1/10', 1, instr('L1161148/1/10', '/', -1) - 1) from dual;
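One edge case to keep in mind: if the string contains no slash at all, instr returns 0, the length argument becomes -1, and substr returns NULL. The following is only a sketch of one way to guard against that; the samples CTE and its column s are names I made up for illustration.

```sql
-- Strip the last '/'-delimited segment; leave slash-free strings unchanged.
with samples (s) as (
  select 'L1161148/1/10' from dual union all
  select 'L1161148'      from dual          -- no slash: plain SUBSTR would yield NULL
)
select s,
       case
         when instr(s, '/', -1) > 0
           then substr(s, 1, instr(s, '/', -1) - 1)
         else s
       end as stripped
from samples;
```

Whether a slash-free string should come back unchanged or as NULL depends on what the loop is supposed to do at its last step, so treat the ELSE branch as an assumption.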
Edit: In the spirit of Gary_W's solution, here is a generalization to several input strings that strips successive layers from each one. It still avoids regular expressions (which should make it slightly faster) and uses a recursive CTE, available since Oracle 11g Release 2; I believe Gary's solution works only from Oracle 12c on.
Query: (I changed Gary's second input string a bit, to make sure the query works properly)
with tbl(item_id, input_str) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual union all
select 2, 'ALKD/FJV11/61148/123/456/789/1/2/3' from dual
),
r (item_id, proc_string, stage) as (
select item_id, input_str, 0 from tbl
union all
select item_id, substr(proc_string, 1, instr(proc_string, '/', -1) - 1), stage + 1
from r
where instr(proc_string, '/') > 0
)
select * from r
order by item_id, stage;
Output:
ITEM_ID PROC_STRING STAGE
---------- ---------------------------------------- ----------
1 L1161148/1/10/2/34/5/6 0
1 L1161148/1/10/2/34/5 1
1 L1161148/1/10/2/34 2
1 L1161148/1/10/2 3
1 L1161148/1/10 4
1 L1161148/1 5
1 L1161148 6
2 ALKD/FJV11/61148/123/456/789/1/2/3 0
2 ALKD/FJV11/61148/123/456/789/1/2 1
2 ALKD/FJV11/61148/123/456/789/1 2
2 ALKD/FJV11/61148/123/456/789 3
2 ALKD/FJV11/61148/123/456 4
2 ALKD/FJV11/61148/123 5
2 ALKD/FJV11/61148 6
2 ALKD/FJV11 7
2 ALKD 8

Related

REGEXP to validate a specific number

How can I search for a specific number in an array using REGEXP?
I have an array and need to verify if it has a specific number.
Ex: [5,2,1,4,6,19] and I am looking for number 1, but just the number 1 and not any number that contain the digit 1.
I had to do this:
case when REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][,]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][,]{1}')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][]]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][]]') <>0
then 'DIGIT_ONE' else 'NO_DIGIT_ONE'
end
Is there anything simpler?
You can use
(^|\D)1(\D|$)
This will search for a 1 that is not adjacent to other digits.
See this regex demo.
Details
(^|\D) - start of string or non-digit
1 - a 1 char
(\D|$) - non-digit or end of string.
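Applied in Oracle, the pattern could be used roughly like this. This is a sketch with inline sample arrays instead of the JSON_QUERY call from the question; the samples CTE and the arr column are hypothetical names. Oracle's regex functions accept the Perl-style \D class (as far as I know since 10g Release 2).

```sql
with samples (arr) as (
  select '[5,2,1,4,6,19]' from dual union all   -- contains a standalone 1
  select '[5,2,4,6,19]'   from dual union all   -- 1 only appears inside 19
  select '[11]'           from dual union all   -- 1 only appears inside 11
  select '[1]'            from dual             -- single-element array
)
select arr,
       case
         when regexp_like(arr, '(^|\D)1(\D|$)') then 'DIGIT_ONE'
         else 'NO_DIGIT_ONE'
       end as result
from samples;
```

Note that the pattern only looks at neighbouring characters, so values such as -1 or 1.5 would also count as a match. If the arrays can contain those, the pattern needs tightening.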
Do NOT use regular expressions, use a proper JSON parser and then filter for the number you want:
SELECT my_json_column,
CASE
WHEN JSON_EXISTS( my_json_column, '$?(@.path[*] == 1)' )
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END AS has_one
FROM table_name;
or (if you are using Oracle 12.1 and cannot use path filter expressions with JSON_EXISTS, which is only available from Oracle 12.2):
SELECT my_json_column,
CASE
WHEN EXISTS(
SELECT 'X'
FROM JSON_TABLE(
t.my_json_column,
'$.path[*]'
COLUMNS (
value NUMBER PATH '$'
)
)
WHERE value = 1
)
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END
FROM table_name t;
Which, for the sample data:
CREATE TABLE table_name (
my_json_column CHECK ( my_json_column IS JSON )
) AS
SELECT '{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[11],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[2],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[1,11]}' FROM DUAL;
Both output:
MY_JSON_COLUMN | HAS_ONE
:-------------------------------------------------- | :-----------
{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]} | DIGIT ONE
{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]} | NO DIGIT ONE
{"path":[11],"not_this_path":[1]} | NO DIGIT ONE
{"path":[2],"not_this_path":[1]} | NO DIGIT ONE
{"path":[1,11]} | DIGIT ONE
db<>fiddle here
Alternatively, with a little bit more typing (a little bit? Am I kidding?!), splitting the string into rows and comparing values to the search string:
SQL> with test (col) as
  (select '[5,2,1,4,6,19]' from dual)
select t.col,
  case when '&par_search_string' in
    -- strip both enclosing brackets (hence length - 2), then split on commas
    (select regexp_substr(substr(col, 2, length(col) - 2), '[^,]+', 1, level) val
     from test
     connect by level <= regexp_count(col, ',') + 1
    )
  then 'Search string exists'
  else 'Search string does not exist'
  end result
from test t;
Enter value for par_search_string: 1
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string exists
SQL> /
Enter value for par_search_string: 24
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string does not exist
SQL>

Comma-separated string match

I have this query:
SELECT regexp_replace (var_called_num, '^' ||ROUTING_PREFIX) INTO Num
FROM INCOMING_ROUTING_PREFIX
WHERE var_called_num LIKE ROUTING_PREFIX ||'%';
INCOMING_ROUTING_PREFIX table has two rows
1) 007743
2) 007742
var_called_num is 0077438843212123, so the above query gives the result 8843212123.
So basically, the query removes the prefix (the longest match from the table) from var_called_num.
Now my table has changed: it has only one row, which holds the prefixes as a comma-separated list.
Modified table:
INCOMING_ROUTING_PREFIX has one row, which is comma-separated:
1) 007743,007742
How can I modify the query to achieve the same behavior? I need to remove the longest matching prefix from var_called_num.
Here's one option: you'd have to split the prefix list into rows, and then use it in REGEXP_REPLACE.
SQL> with
  calnum (var_called_num) as
    (select '0077438843212123' from dual),
  incoming_routing_prefix (routing_prefix) as
    (select '007743,007742' from dual),
  --
  irp_split as
    (select regexp_substr(i.routing_prefix, '[^,]+', 1, level) routing_prefix
     from incoming_routing_prefix i
     connect by level <= regexp_count(i.routing_prefix, ',') + 1
    )
select regexp_replace(c.var_called_num, '^' || s.routing_prefix) result
from calnum c join irp_split s on s.routing_prefix = substr(c.var_called_num, 1, length(s.routing_prefix));
RESULT
----------------
8843212123
SQL>
By the way, why did you change the model to a worse version than it was before?
You can split the values:
with test as (
select regexp_substr('007743,007742','[^,]+', 1, level) as ROUTING_PREFIX from dual
connect by regexp_substr('007743,007742', '[^,]+', 1, level) is not null
)
and then use that subquery in your select:
SELECT regexp_replace ('0077438843212123', '^' ||ROUTING_PREFIX)
FROM test WHERE '0077438843212123' LIKE ROUTING_PREFIX ||'%';
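The question asks for the longest matching prefix, which matters as soon as one prefix in the list is itself a prefix of another. Here is a hedged sketch of one way to pick it, assuming a hypothetical list in which 0077 and 007743 both match; the extra 0077 entry is made up for illustration and is not in the original data.

```sql
with calnum (var_called_num) as (
  select '0077438843212123' from dual
),
incoming_routing_prefix (routing_prefix) as (
  select '0077,007743,007742' from dual   -- hypothetical overlapping prefixes
),
irp_split as (
  -- one row per comma-separated prefix
  select regexp_substr(i.routing_prefix, '[^,]+', 1, level) as routing_prefix
  from incoming_routing_prefix i
  connect by level <= regexp_count(i.routing_prefix, ',') + 1
)
-- among all matching prefixes keep the longest one, then return what follows it
select min(substr(c.var_called_num, length(s.routing_prefix) + 1))
         keep (dense_rank first order by length(s.routing_prefix) desc) as result
from calnum c
join irp_split s
  on s.routing_prefix = substr(c.var_called_num, 1, length(s.routing_prefix));
```

With this data the longest match is 007743, so the query should return 8843212123. Like the answers above, the CONNECT BY split assumes incoming_routing_prefix holds a single row.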

to find minimum missing number in oracle

I want to find the minimum missing number in a column named s_no of a table named test_table in Oracle, and I wrote the following code:
select
min_s_no-1+level missing_number
from (
select min(s_no) min_s_no, max(s_no) max_s_no
from test_table
) connect by level <= max_s_no-min_s_no+1
minus
select s_no from test_table
;
It gives me all the missing numbers as a result, but I want to select only the minimum one. Can anyone help me, please?
Thanks in advance.
Using the analytic function LEAD you can get the number from the next row in ascending order. Comparing this value with the original number increased by 1 gives you the missing values (wherever the two numbers do not match).
Getting the first missing value in ascending order is the same as selecting the MIN value:
select
num,
lead(num) over (order by num) num_lead,
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
order by num;
       NUM   NUM_LEAD MISSING_NUM
---------- ---------- -----------
         4          5
         5          6
         6          9           7
         9         10
        10         13          11
        13
-- first missing number = MIN missing number
select min(missing_num)
from (
select
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
);
MIN(MISSING_NUM)
----------------
7
ADDENDUM
A good practice in writing SQL is to consider edge cases - here a table that contains a complete interval without holes. The first missing value will be the successor of the last number.
select nvl(min(missing_num),max(num)+1) first_missing_value
from (
select
num,
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
);
A complete table returns no MISSING_NUM, so the original query returns NULL. With NVL, the expected result is returned instead.
The best way to find the gaps is to use the analytic functions lead or lag. An example with lag:
with test_data as (
select 1 num from dual union all
select 4 from dual union all
select 6 from dual union all
select 8 from dual union all
select 3 from dual union all
select 9 from dual union all
select 0 from dual
)
select min(gap) min_gap
from (
select num, lag(num) over (order by num)+1 gap
from test_data
)
where num != gap
;
MIN_GAP
------------------
2
More about how to find the gaps here
In Oracle 12.1 and above, MATCH_RECOGNIZE can make quick work of this kind of problem:
Edited. Initially I was picking the "next number" where a gap exists (in the example, the value 9). But that is not what the OP wants, he wants the first missing number (7 in this case). I edited to change the measures clause, to find the first missing number as requested. End Edit
with test_data (num) as (
select 4 from dual union all
select 5 from dual union all
select 6 from dual union all
select 9 from dual union all
select 10 from dual union all
select 13 from dual
)
-- end of test data; when you use the SQL query below,
-- replace test_data and num with your actual table and column names.
select result as num
from test_data
match_recognize (
order by num
measures last(b.num) + 1 as result
pattern ( ^ a b* c )
define b as num = prev(num) + 1,
c as num > prev(num) + 1
)
;
NUM
---
7

SQL function REGEXP_SUBSTR: Regular Expression how to get the content between two characters but not include them

For these strings
RSLR_AIRL19_ID3454_T20030913091226
RSLR_AIRL19_ID3122454_T20030913091226
RSLR_AIRL19_ID34_T20030913091226
How to get the number after ID ?
Or how to get the content between two characters but not include them ?
I used '/\_ID([^_]+)/' and got matches like Array ( [0] => _ID3454 [1] => 3454 ).
Is this the right way?
To extract a number after an ID, you could write a similar query.
SQL> with t1 as(
  select 'RSLR_AIRL19_ID3454_T20030913091226' as col from dual union all
  select 'RSLR_AIRL19_ID3122454_T20030913091226' from dual union all
  select 'RSLR_AIRL19_ID34_T20030913091226' from dual
)
select regexp_substr(col, '^([[:alnum:]]+_){2}ID([[:digit:]]+)_([[:alnum:]]+){1}$', 1, 1, 'i', 2) as ID
from t1
;
ID
-------------
3454
3122454
34
Or, if you want to extract digits from a first occurrence of the pattern without verifying if an entire string matches a specific format:
SQL> with t1 as(
  select 'RSLR_AI_RL19_ID3454_T20030913091226' as col from dual union all
  select 'RSLR_AIRL19_ID3122454_T20030913091226' from dual union all
  select 'RSLR_AIRL19_ID34_T20030913091226' from dual
)
select regexp_substr(col, 'ID([[:digit:]]+)', 1, 1, 'i', 1) as ID
from t1
;
ID
--------------
3454
3122454
34
With PCRE and Perl engines:
ID\K\w+
NOTE
\K resets the start of the reported match, so the preceding ID is matched but not included in the result.
See http://www.phpfreaks.com/blog/pcre-regex-spotlight-k (PHP uses PCRE)

Oracle custom sort

The query...
select distinct name from myTable
returns a bunch of values that start with the following character sequences...
ADL*
FG*
FH*
LAS*
TWUP*
Where '*' is the remainder of the string.
I want to do an order by that sorts in the following manner...
ADL*
LAS*
TWUP*
FG*
FH*
But then I also want to sort within each name in the standard order by fashion. So, an example, if I have the following values
LAS-21A
TWUP-1
FG999
FH3
ADL99999
ADL88888
ADL77777
LAS2
I want it to be sorted like this...
ADL77777
ADL88888
ADL99999
LAS2
TWUP-1
FG999
FH3
I initially thought I could accomplish this by doing an order by decode(blah) with some LIKE trickery inside of the decode, but I've been unable to make it work. Any insights?
Goofy and verbose, but should work:
select name, case when substr (name, 1, 3) = 'ADL' then 1
when substr (name, 1, 3) = 'LAS' then 2
when substr (name, 1, 4) = 'TWUP' then 3
when substr (name, 1, 2) = 'FG' then 4
when substr (name, 1, 2) = 'FH' then 5
else 6
end SortOrder
from myTable
order by 2, 1;
Not sure if 6 is the correct place to sort the other items, but it is obvious how to fix that. At least it is clear what is going on, even if I have no idea why you are doing it this way.
EDIT: If these are the only values, you could merge lines 4 and 5 (the FG and FH branches) into one:
select name, case when substr (name, 1, 3) = 'ADL' then 1
when substr (name, 1, 3) = 'LAS' then 2
when substr (name, 1, 4) = 'TWUP' then 3
when substr (name, 1, 1) = 'F' then 4
else 6
end SortOrder
from myTable
order by 2, 1;
ANOTHER EDIT: And again, if these are the only values, you can simplify even more. Since the only ones out of order are the F* series, you can force them to the end and sort all the others by the name itself. This is simpler, but relies too much on the exact values for my preference. On the other hand, it does remove many of the seemingly unnecessary calls to substr:
select name, case when substr (name, 1, 1) = 'F' then 'Z'
else name
end SortOrder
from myTable
order by 2, 1;
The problem is that your prefix contains a variable number of characters. This is a good time to deploy regular expressions (if you have 10g or higher).
SQL> select cola
from t34
order by decode( regexp_substr(cola, '[[:alpha:]]+')
               , 'ADL' , 10
               , 'LAS', 20
               , 'TWUP', 30
               , 'FG' , 40
               , 'FH' , 50
               , 60 )
       , cola
/
COLA
----------
ADL77777
ADL88888
ADL99999
LAS-21A
LAS2
TWUP-1
FG999
FH3
8 rows selected.
SQL>
In earlier versions of Oracle we can use the OWA_PATTERN.AMATCH() function to the same effect:
SQL> select cola
from t34
order by decode( owa_pattern.amatch(cola, 1, '^[A-Z]+')
               , 'ADL' , 10
               , 'LAS', 20
               , 'TWUP', 30
               , 'FG' , 40
               , 'FH' , 50
               , 60 )
       , cola
/
COLA
----------
ADL77777
ADL88888
ADL99999
FG999
FH3
LAS-21A
LAS2
TWUP-1
8 rows selected.
SQL>