SQL function REGEXP_SUBSTR: how to get the content between two characters without including them

For these strings
RSLR_AIRL19_ID3454_T20030913091226
RSLR_AIRL19_ID3122454_T20030913091226
RSLR_AIRL19_ID34_T20030913091226
How do I get the number after ID?
Or, more generally, how do I get the content between two characters without including them?
I used '/\_ID([^_]+)/' and got matches like Array ( [0] => _ID3454 [1] => 3454 )
Is this the right way?

To extract the number after ID, you could write a query like this one:
SQL> with t1 as(
  select 'RSLR_AIRL19_ID3454_T20030913091226' as col from dual union all
  select 'RSLR_AIRL19_ID3122454_T20030913091226' from dual union all
  select 'RSLR_AIRL19_ID34_T20030913091226' from dual
)
select regexp_substr(col, '^([[:alnum:]]+_){2}ID([[:digit:]]+)_([[:alnum:]]+){1}$', 1, 1, 'i', 2) as ID
from t1
;
ID
-------------
3454
3122454
34
Or, if you want to extract the digits from the first occurrence of the pattern without verifying that the entire string matches a specific format:
SQL> with t1 as(
  select 'RSLR_AI_RL19_ID3454_T20030913091226' as col from dual union all
  select 'RSLR_AIRL19_ID3122454_T20030913091226' from dual union all
  select 'RSLR_AIRL19_ID34_T20030913091226' from dual
)
select regexp_substr(col, 'ID([[:digit:]]+)', 1, 1, 'i', 1) as ID
from t1
;
ID
--------------
3454
3122454
34
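If you would rather keep the shape of the pattern from the question, a minimal sketch (assuming Oracle 11g or later, where REGEXP_SUBSTR accepts a subexpression argument) is to wrap the wanted part in a capture group and ask for that group only. The alias between_chars is just an illustrative name.

```sql
-- Sketch: return only the text captured between "_ID" and the next "_".
-- The delimiters are matched but not returned, because the sixth argument (1)
-- selects the first capture group instead of the whole match.
with t1 (col) as (
  select 'RSLR_AIRL19_ID3454_T20030913091226' from dual union all
  select 'RSLR_AIRL19_ID34_T20030913091226'   from dual
)
select col,
       regexp_substr(col, '_ID([^_]+)_', 1, 1, null, 1) as between_chars
from t1;
```

For the two sample rows this should give 3454 and 34. The same idea works for any pair of delimiters: put them outside the parentheses and a negated character class inside.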

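With PCRE and Perl engines:
ID\K\d+
NOTE
\K resets the start of the reported match, so the "ID" matched before it is not included in the result. Use \d+ here, not \w+: \w also matches the underscore, so it would run on into _T20030913091226.
See http://www.phpfreaks.com/blog/pcre-regex-spotlight-k (PHP uses PCRE)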

Related

Workaround for REGEXP_REPLACE in Oracle SQL | Regular Expression too long

I am using REGEXP_REPLACE in a select statement to search for multiple strings (>1000) in column1 of table1 and replace them with 'xyz'. But I am getting the error below, as REGEXP_REPLACE has a limit of 512 bytes on the pattern.
ORA-12733: regular expression too long
I was wondering if there is any work around for it.
Below is my initial query.
select REGEXP_REPLACE(table1.Column1,'SearchString1|SearchString2|SearchString1|.....SearchString1000','xyz')
from table1
My query would be very long if I used the solution below.
Can it be done in a loop using a shell script?
https://stackoverflow.com/questions/21921658/oracle-regular-expression-regexp-like-too-long-error-ora-12733
I don't know whether you can do it in a loop using a shell script, but why would you? Regular expressions still work if you adjust the approach a little.
I'd suggest storing the search strings in a separate table (or using a CTE, as in the following example). Then outer join it to the source table (test in my example) and see the result.
Sample data:
SQL> with
  test (col) as
    (select 'Littlefoot' from dual union all
     select 'Bigfoot' from dual union all
     select 'Footloose' from dual union all
     select 'New York' from dual union all
     select 'Yorkshire' from dual union all
     select 'None' from dual
    ),
  search_strings (sstring) as
    (select 'foot' from dual union all
     select 'york' from dual
    )
Query:
select t.col,
       regexp_replace(t.col, s.sstring, 'xyz', 1, 1, 'i') result
from test t left join search_strings s on regexp_instr(t.col, s.sstring, 1, 1, 0, 'i') > 0;
COL RESULT
---------- --------------------
Littlefoot Littlexyz
Bigfoot Bigxyz
Footloose xyzloose
New York New xyz
Yorkshire xyzshire
None None
6 rows selected.
SQL>
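If the search strings have to stay inline, another possible workaround is to split the long alternation into chunks that each fit under the limit and nest the calls. This is only a sketch: the chunk contents below are made up for the sample data, and it assumes the replacement text 'xyz' is not itself matched by a later chunk.

```sql
-- Sketch: two nested REGEXP_REPLACE calls, each with a shorter pattern.
-- Occurrence 0 replaces every match; 'i' makes the match case-insensitive.
with test (col) as (
  select 'Littlefoot' from dual union all
  select 'New York'   from dual union all
  select 'None'       from dual
)
select col,
       regexp_replace(
         regexp_replace(col, 'foot|loose', 'xyz', 1, 0, 'i'),  -- first chunk of alternatives
         'york|shire', 'xyz', 1, 0, 'i') as result              -- second chunk
from test;
```

With more than a handful of chunks this gets unwieldy, which is where the lookup-table join above is easier to maintain.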

how to get the number after '-' in Oracle

I have some strings in my table. They are like 1101-1, 1101-2, 1101-10, 1101-11, pulse, shock, abc, 1104-2, 1104-11, 2201-1, 2202-4. I want to sort them like below:
1101-1
1101-2
1101-10
1101-11
1104-2
1104-11
2201-1
2202-4
abc
pulse
shock
But I can't get the sort right. Below is my code:
select column from table
order by regexp_substr(column, '^\D*') nulls first,
to_number(substr(regexp_substr(column, '\d+'),1,4)) asc
Sort numbers as numbers:
first the ones in front of the hyphen (line #16)
then the ones after it (line #17),
then the rest (line #18)
Mind the to_number function! Without it, you'll be sorting strings and get the wrong result.
SQL> with test (col) as
2 ( select '1101-1' from dual union all
3 select '1101-2' from dual union all
4 select '1101-10' from dual union all
5 select '1101-11' from dual union all
6 select 'pulse' from dual union all
7 select 'shock' from dual union all
8 select 'abc' from dual union all
9 select '1104-2' from dual union all
10 select '1104-11' from dual union all
11 select '2201-1' from dual union all
12 select '2202-4' from dual
13 )
14 select col
15 from test
16 order by to_number(regexp_substr(col, '^\d+')),
17 to_number(regexp_substr(col, '\d+$')),
18 col;
COL
-------
1101-1
1101-2
1101-10
1101-11
1104-2
1104-11
2201-1
2202-4
abc
pulse
shock
11 rows selected.
SQL>
For your examples, this should do:
order by regexp_substr(column, '^[^-]+'), -- everything before the hyphen
length(column),
column
To get the number after '-' specifically:
with ttt (col) as (
select cast(column_value as varchar2(10))
from table(sys.dbms_debug_vc2coll
( '1101-1'
, '1101-2'
, '1101-10'
, '1101-11'
, '1104-2'
, '1104-11'
, '2201-1'
, '2202-4'
, 'abc'
, 'pulse'
, 'shock'
))
)
select col
, regexp_substr(col, '(^\d+-)(\d+)', 1, 1, '', 2) as second_str
from ttt;
COL SECOND_STR
---------- ----------
1101-1 1
2201-1 1
1101-10 10
1101-11 11
1104-11 11
1101-2 2
1104-2 2
2202-4 4
abc
pulse
shock
11 rows selected
This treats the text string as two values, (^\d+-) followed by (\d+), and takes the second substring (the final '2' parameter). As only positional parameters are allowed for built-in SQL functions, you also have to specify position (1), occurrence (1) and match param (null, as we don't care about case etc).
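Note that the extracted values above are strings, which is why 10 and 11 are listed before 2. A small sketch, assuming you want to sort or compare by that value, wraps the call in to_number (second_num is an illustrative alias, and the sample rows are a subset of the ones above):

```sql
-- Sketch: convert the captured digits to a number so that 2 sorts before 10.
-- Rows without a "digits-digits" shape yield NULL, which Oracle sorts last by default.
with ttt (col) as (
  select '1101-2'  from dual union all
  select '1101-10' from dual union all
  select 'abc'     from dual
)
select col,
       to_number(regexp_substr(col, '^\d+-(\d+)$', 1, 1, null, 1)) as second_num
from ttt
order by second_num, col;
```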

how to find the same two-digit numbers in a sequence of characters regular expressions SQL

I would like to print the strings in which at least two of the numbers are the same, in SQL.
Example:
11-22-33
11-22-44
22-22-33
55-22-33
11-66-33
11-88-33
33-88-33
77-77-22
OUTPUT :
22-22-33
77-77-22
33-88-33
But I have no idea how to write a regular expression that would help me
For this fixed format of NN-NN-NN, you could just use string functions and test the three possible combinations (the parts start at positions 1, 4 and 7, since positions 3 and 6 hold the hyphens):
select *
from mytable
where substr(val, 1, 2) = substr(val, 4, 2)
or substr(val, 1, 2) = substr(val, 7, 2)
or substr(val, 4, 2) = substr(val, 7, 2)
We could get a little fancy and use a lateral join instead of the repeated or conditions. This scales better if you have more than 3 parts (the number of combinations increases rapidly, which makes the or solution less convenient):
select t.*
from mytable t
cross apply (
select count(distinct part) cnt_distinct_part
from (
select substr(t.val, 1, 2) part from dual
union all select substr(t.val, 4, 2) from dual
union all select substr(t.val, 7, 2) from dual
) x
) x
where x.cnt_distinct_part < 3
You can use the regular expression (^|-)(\d+)(-\d+)*-\2(-|$) to match pairs of numbers of any number of digits or number of terms.
(^|-) matches either the start-of-the-string ^ or a hyphen - contained in the first capturing group ();
followed by one-or-more digit characters \d+ contained in the second capturing group () to match the first of the pair of the numbers;
then a third capturing group () which is matched zero-or-more times * containing a - followed by one-or-more digits \d+ to match any amount of numbers between the pair of matched numbers;
then a hyphen -;
then a duplicate of the second capturing group \2 which will match the second of the pair of numbers;
then either a hyphen - or the end-of-the-string $
Giving the query:
SELECT value
FROM table_name
WHERE REGEXP_LIKE( value, '(^|-)(\d+)(-\d+)*-\2(-|$)' );
Which, for the sample data:
CREATE TABLE table_name ( value ) AS
SELECT '11-22-33' FROM DUAL UNION ALL
SELECT '11-22-44' FROM DUAL UNION ALL
SELECT '22-22-33' FROM DUAL UNION ALL
SELECT '55-22-33' FROM DUAL UNION ALL
SELECT '11-66-33' FROM DUAL UNION ALL
SELECT '11-88-33' FROM DUAL UNION ALL
SELECT '33-88-33' FROM DUAL UNION ALL
SELECT '77-77-22' FROM DUAL UNION ALL
SELECT '11-77-77' FROM DUAL UNION ALL
SELECT '11-177-77' FROM DUAL UNION ALL
SELECT '11-77-771' FROM DUAL UNION ALL
SELECT '123-456-123' FROM DUAL UNION ALL
SELECT '1-2-2' FROM DUAL UNION ALL
SELECT '99999-99999-0' FROM DUAL UNION ALL
SELECT '1-2-3-4-5-6-7-8-9-0-11-2-13' FROM DUAL;
Outputs:
| VALUE |
| :-------------------------- |
| 22-22-33 |
| 33-88-33 |
| 77-77-22 |
| 11-77-77 |
| 123-456-123 |
| 1-2-2 |
| 99999-99999-0 |
| 1-2-3-4-5-6-7-8-9-0-11-2-13 |
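To see which number is repeated, rather than just filter the rows, the same pattern can be passed to REGEXP_SUBSTR asking for the second capturing group. A sketch, assuming Oracle 11g or later for the subexpression argument; repeated_number is an illustrative alias:

```sql
-- Sketch: return the value captured by group 2, i.e. the number that occurs twice.
-- Rows with no repeated number produce NULL.
WITH table_name ( value ) AS (
  SELECT '22-22-33' FROM DUAL UNION ALL
  SELECT '33-88-33' FROM DUAL UNION ALL
  SELECT '11-22-33' FROM DUAL
)
SELECT value,
       REGEXP_SUBSTR( value, '(^|-)(\d+)(-\d+)*-\2(-|$)', 1, 1, NULL, 2 ) AS repeated_number
FROM   table_name;
```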

How can I get a natural numeric sort order in Oracle?

I have a column with a letter followed by either numbers or letters:
ID_Col
------
S001
S1001
S090
SV911
SV800
Sfoofo
Szap
Sbart
How can I order it naturally, with the numbers first (ASC) and then the letters alphabetically? If it starts with S and the remaining characters are numbers, sort by the numbers. Otherwise, sort by the letters. So SV911 should be sorted at the end with the letters since it also contains a V. E.g.
ID_Col
------
S001
S090
S1001
Sbart
Sfoofo
SV800
SV911
Szap
I see this solution uses regex combined with the TO_NUMBER function, but since I also have entries with no numbers this doesn't seem to work for me. I tried the expression:
ORDER BY
TO_NUMBER(REGEXP_SUBSTR(ID_Col, '^S\d+$')),
ID_Col
/* gives ORA-01722: invalid number */
Would this help?
SQL> with test (col) as
  (select 'S001' from dual union all
   select 'S1001' from dual union all
   select 'S090' from dual union all
   select 'SV911' from dual union all
   select 'SV800' from dual union all
   select 'Sfoofo' from dual union all
   select 'Szap' from dual union all
   select 'Sbart' from dual
  )
select col
from test
order by substr(col, 1, 1),
         case when regexp_like(col, '^[[:alpha:]]\d') then to_number(regexp_substr(col, '\d+$')) end,
         substr(col, 2);
COL
------
S001
S090
S1001
Sbart
Sfoofo
SV800
SV911
Szap
8 rows selected.
SQL>
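One thing to watch: the output above lists Sbart before SV800, which depends on the session's sort settings (NLS_SORT). Under a binary sort, uppercase letters come before lowercase ones, so SV800 would be listed first. A hedged variant that does not rely on the session settings, assuming the rule from the question (S followed only by digits sorts numerically, everything else alphabetically regardless of case):

```sql
-- Sketch: explicit grouping plus a case-insensitive tiebreaker.
with test (col) as (
  select 'S001'  from dual union all
  select 'S1001' from dual union all
  select 'SV800' from dual union all
  select 'Sbart' from dual
)
select col
from test
order by case when regexp_like(col, '^S\d+$') then 0 else 1 end,     -- purely numeric suffixes first
         to_number(regexp_substr(col, '^S(\d+)$', 1, 1, null, 1)),  -- numeric order within that group
         upper(col);                                                 -- case-insensitive order for the rest
```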

Get substring with REGEXP_SUBSTR

I need to use regexp_substr, but I can't use it properly
I have column (l.id) with numbers, for example:
1234567891123!123 EXPECTED OUTPUT: 1234567891123
123456789112!123 EXPECTED OUTPUT: 123456789112
12345678911!123 EXPECTED OUTPUT: 12345678911
1234567891123!123 EXPECTED OUTPUT: 1234567891123
I want to use regexp_substr to get the part before the exclamation mark (!)
SELECT REGEXP_SUBSTR(l.id,'[%!]',1,13) from l.table
Is that OK?
You can try using INSTR() and SUBSTR():
select substr(l.id, 1, instr(l.id, '!', 1, 1) - 1) from mytable l
You want to remove the exclamation mark and all following characters it seems. That is simply:
select regexp_replace(id, '!.*', '') from mytable;
Look at it like a delimited string where the bang is the delimiter and you want the first element, even if it is NULL. Make sure to test all possibilities, even the unexpected ones (ALWAYS expect the unexpected)! Here the assumption is if there is no delimiter you'll want what's there.
The regex returns the first element followed by a bang or the end of the line. Note this form of the regex handles a NULL first element.
SQL> with tbl(id, str) as (
select 1, '1234567891123!123' from dual union all
select 2, '123456789112!123' from dual union all
select 3, '12345678911!123' from dual union all
select 4, '1234567891123!123' from dual union all
select 5, '!123' from dual union all
select 6, '123!' from dual union all
select 7, '' from dual union all
select 8, '12345' from dual
)
select id, regexp_substr(str, '(.*?)(!|$)', 1, 1, NULL, 1)
from tbl
order by id;
ID REGEXP_SUBSTR(STR
---------- -----------------
1 1234567891123
2 123456789112
3 12345678911
4 1234567891123
5
6 123
7
8 12345
8 rows selected.
SQL>
If you like to use REGEXP_SUBSTR rather than regexp_replace then you can use
SELECT REGEXP_SUBSTR(l.id,'^\d+')
assuming you have only numbers before !
If I understand correctly, this is the pattern that you want:
SELECT REGEXP_SUBSTR(l.id,'^[^!]+', 1)
FROM (SELECT '1234567891123!123' as id from dual) l
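The approaches in this thread differ mainly when there is no exclamation mark in the value. A small side-by-side sketch (the column aliases are illustrative) makes the difference visible:

```sql
-- Sketch: compare three ways of taking the part before '!'.
-- When '!' is missing, instr returns 0, so substr(str, 1, -1) yields NULL,
-- while the two regular-expression versions return the whole string.
with tbl (str) as (
  select '12345678911!123' from dual union all
  select '12345'           from dual
)
select str,
       substr(str, 1, instr(str, '!') - 1) as via_substr,
       regexp_substr(str, '^[^!]+')        as via_regexp_substr,
       regexp_replace(str, '!.*')          as via_regexp_replace
from tbl;
```

Which behaviour is right depends on whether a value without a delimiter should count as a complete first element or as missing data.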