REGEXP to validate a specific number - sql

How can I search for a specific number in an array using REGEXP?
I have an array and need to verify if it has a specific number.
Ex: [5,2,1,4,6,19] and I am looking for number 1, but just the number 1 and not any number that contain the digit 1.
I had to do this:
case when REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][,]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][,]{1}')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][]]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][]]') <>0
then 'DIGIT_ONE' else 'NO_DIGIT_ONE'
end
Is there anything simpler?

You can use
(^|\D)1(\D|$)
This will seach for 1 not enclosed with other digits.
See this regex demo.
Details
(^|\D) - start of string or non-digit
1 - a 1 char
(\D|$) - non-digit or end of string.

Do NOT use regular expressions, use a proper JSON parser and then filter for the number you want:
SELECT my_json_column,
CASE
WHEN JSON_EXISTS( my_json_column, '$?(#.path[*] == 1)' )
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END AS has_one
FROM table_name;
or (if you are using Oracle 12.1 and cannot use path filter expressions with JSON_EXISTS, which is only available from Oracle 12.2):
SELECT my_json_column,
CASE
WHEN EXISTS(
SELECT 'X'
FROM JSON_TABLE(
t.my_json_column,
'$.path[*]'
COLUMNS (
value NUMBER PATH '$'
)
)
WHERE value = 1
)
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END
FROM table_name t;
Which, for the sample data:
CREATE TABLE table_name (
my_json_column CHECK ( my_json_column IS JSON )
) AS
SELECT '{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[11],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[2],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[1,11]}' FROM DUAL;
Both output:
MY_JSON_COLUMN | HAS_ONE
:-------------------------------------------------- | :-----------
{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]} | DIGIT ONE
{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]} | NO DIGIT ONE
{"path":[11],"not_this_path":[1]} | NO DIGIT ONE
{"path":[2],"not_this_path":[1]} | NO DIGIT ONE
{"path":[1,11]} | DIGIT ONE
db<>fiddle here

Alternatively, with a little bit more typing (a little bit? Am I kidding?!), splitting the string into rows and comparing values to the search string:
SQL> with test (col) as
2 (select '[5,2,1,4,6,19]' from dual)
3 select t.col,
4 case when '&par_search_string' in
5 (select regexp_substr(substr(col, 2, length(col) - 1), '[^,]+', 1, level) val
6 from test
7 connect by level <= regexp_count(col, ',') + 1
8 )
9 then 'Search string exists'
10 else 'Search string does not exist'
11 end result
12 from test t;
Enter value for par_search_string: 1
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string exists
SQL> /
Enter value for par_search_string: 24
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string does not exist
SQL>

Related

Split a Column with Delimited Values and Compare Each Value

I have a column that contains multiple values in a delimited(comma-separated) format -
id | code
------------
1 11,19,21
2 55,87,33
3 3,11
4 11
I want to be able to compare to each value inside the 'code' column as below -
SELECT id FROM myTbl WHERE code = '11'
This should return -
1
3
4
I've tried the solution below but it does not work for all cases -
SELECT id FROM myTbl WHERE POSITION('11' IN code) <> 0
This will work with a 2 digit number like '11' as it will return a value that is <> 0 if it finds a match. But it will fail when searching for say '3' because rows with 'id' 2 and 3 both will be returned.
Here is link that talks about the POSITION function in REDSHIFT.
Any other approach that will solve this problem?
you can get the count of this string
SELECT id FROM myTbl WHERE regexp_count(user_action, '[11]') > 0
I think we can use regexp_substr() as follow.
select tb .id from myTbl tb where '11' in (
select regexp_substr( (select code from myTbl where id=tb.id),'[^,]+', 1, LEVEL) from dual
connect by regexp_substr((select code from myTbl where id=tb.id) , '[^,]+', 1, LEVEL) is not null);
just try this.
Use split_part() function
SELECT distinct id
FROM myTbl
WHERE '11' in ( split_part( code||',' , ',', 1 ),
split_part( code||',' , ',', 2 ),
split_part( code||',' , ',', 3 ) )
This is a very, very bad data model. You should be storing this information in a junction/association table, with one row per value.
But, if you have no choice, you can use like:
SELECT id
FROM myTbl
WHERE ',' || code || ',' LIKE '%,11,%';

Search comma separated value in oracle 12

I have a Table - Product In Oracle, wherein p_spc_cat_id is stored as comma separated values.
p_id p_name p_desc p_spc_cat_id
1 AA AAAA 26,119,27,15,18
2 BB BBBB 0,0,27,56,57,4
3 BB CCCC 26,0,0,15,3,8
4 CC DDDD 26,0,27,7,14,10
5 CC EEEE 26,119,0,48,75
Now I want to search p_name which have p_spc_cat_id in '26,119,7' And this search value are not fixed it will some time '7,27,8'. The search text combination change every time
my query is:
select p_id,p_name from product where p_spc_cat_id in('26,119,7');
when i execute this query that time i can't find any result
I am little late in answering however i hope that i understood the question correctly.
Read further if: you have a table storing records like
1. 10,20,30,40
2. 50,40,20,70
3. 80,60,30,40
And a search string like '10,60', in which cases it should return rows 1 & 3.
Please try below, it worked for my small table & data.
create table Temp_Table_Name (some_id number(6), Ab varchar2(100))
insert into Temp_Table_Name values (1,'112,120')
insert into Temp_Table_Name values (2,'7,8,100,26')
Firstly lets breakdown the logic:
The table contains comma separated data in one of the columns[Column AB].
We have a comma separated string which we need to search individually in that string column. ['26,119,7,18'-X_STRING]
ID column is primary key in the table.
1.) Lets multiple each record in the table x times where x is the count of comma separated values in the search string [X_STRING]. We can use below query to create the cartesian join sub-query table.
Select Rownum Sequencer,'26,119,7,18' X_STRING
from dual
CONNECT BY ROWNUM <= (LENGTH( '26,119,7,18') - LENGTH(REPLACE( '26,119,7,18',',',''))) + 1
Small note: Calculating count of comma separated values =
Length of string - length of string without ',' + 1 [add one for last value]
2.) Create a function PARSING_STRING such that PARSING_STRING(string,position). So If i pass:
PARSING_STRING('26,119,7,18',3) it should return 7.
CREATE OR REPLACE Function PARSING_STRING
(String_Inside IN Varchar2, Position_No IN Number)
Return Varchar2 Is
OurEnd Number; Beginn Number;
Begin
If Position_No < 1 Then
Return Null;
End If;
OurEnd := Instr(String_Inside, ',', 1, Position_No);
If OurEnd = 0 Then
OurEnd := Length(String_Inside) + 1;
End If;
If Position_No = 1 Then
Beginn := 1;
Else
Beginn := Instr(String_Inside, ',', 1, Position_No-1) + 1;
End If;
Return Substr(String_Inside, Beginn, OurEnd-Beginn);
End;
/
3.) Main query, with the join to multiply records.:
select t1.*,PARSING_STRING(X_STRING,Sequencer)
from Temp_Table_Name t1,
(Select Rownum Sequencer,'26,119,7,18' X_STRING from dual
CONNECT BY ROWNUM <= (Select (LENGTH( '26,119,7,18') - LENGTH(REPLACE(
'26,119,7,18',',',''))) + 1 from dual)) t2
Please note that with each multiplied record we are getting 1 particular position value from the comma separated string.
4.) Finalizing the where condition:
Where
/* For when the value is in the middle of the strint [,value,] */
AB like '%,'||PARSING_STRING(X_STRING,Sequencer)||',%'
OR
/* For when the value is in the start of the string [value,]
parsing the first position comma separated value to match*/
PARSING_STRING(AB,1) = PARSING_STRING(X_STRING,Sequencer)
OR
/* For when the value is in the end of the string [,value]
parsing the last position comma separated value to match*/
PARSING_STRING(AB,(LENGTH(AB) - LENGTH(REPLACE(AB,',',''))) + 1) =
PARSING_STRING(X_STRING,Sequencer)
5.) Using distinct in the query to get unique ID's
[Final Query:Combination of all logic stated above: 1 Query to find them all]
select distinct Some_ID
from Temp_Table_Name t1,
(Select Rownum Sequencer,'26,119,7,18' X_STRING from dual
CONNECT BY ROWNUM <= (Select (LENGTH( '26,119,7,18') - LENGTH(REPLACE( '26,119,7,18',',',''))) + 1 from dual)) t2
Where
AB like '%,'||PARSING_STRING(X_STRING,Sequencer)||',%'
OR
PARSING_STRING(AB,1) = PARSING_STRING(X_STRING,Sequencer)
OR
PARSING_STRING(AB,(LENGTH(AB) - LENGTH(REPLACE(AB,',',''))) + 1) = PARSING_STRING(X_STRING,Sequencer)
You can use like to find it:
select p_id,p_name from product where p_spc_cat_id like '%26,119%'
or p_spc_cat_id like '%119,26%' or p_spc_cat_id like '%119,%,26%' or p_spc_cat_id like '%26,%,119%';
Use the Oracle function instr() to achieve what you want. In your case that would be:
SELECT p_name
FROM product
WHERE instr(p_spc_cat_id, '26,119') <> 0;
Oracle Doc for INSTR
If the string which you are searching will always have 3 values (i.e. 2 commas present) then you can use below approach.
where p_spc_cat_id like regexp_substr('your_search_string, '[^,]+', 1, 1)
or p_spc_cat_id like regexp_substr('your_search_string', '[^,]+', 1, 2)
or p_spc_cat_id like regexp_substr('your_search_string', '[^,]+', 1, 3)
If you cant predict how many values will be there in your search string
(rather how many commas) in that case you may need to generate dynamic query.
Unfortunately sql fiddle is not working currently so could not test this code.
SELECT p_id,p_name
FROM product
WHERE p_spc_cat_id
LIKE '%'||'&i_str'||'%'`
where i_str is 26,119,7 or 7,27,8
This solution uses CTE's. "product" builds the main table. "product_split" turns products into rows so each element in p_spc_cat_id is in it's own row. Lastly, product_split is searched for each value in the string '26,119,7' which is turned into rows by the connect by.
with product(p_id, p_name, p_desc, p_spc_cat_id) as (
select 1, 'AA', 'AAAA', '26,119,27,15,18' from dual union all
select 2, 'BB', 'BBBB', '0,0,27,56,57,4' from dual union all
select 3, 'BB', 'CCCC', '26,0,0,15,3,8' from dual union all
select 4, 'CC', 'DDDD', '26,0,27,7,14,10' from dual union all
select 5, 'CC', 'EEEE', '26,119,0,48,75' from dual
),
product_split(p_id, p_name, p_spc_cat_id) as (
select p_id, p_name,
regexp_substr(p_spc_cat_id, '(.*?)(,|$)', 1, level, NULL, 1)
from product
connect by level <= regexp_count(p_spc_cat_id, ',')+1
and prior p_id = p_id
and prior sys_guid() is not null
)
-- select * from product_split;
select distinct p_id, p_name
from product_split
where p_spc_cat_id in(
select regexp_substr('26,119,7', '(.*?)(,|$)', 1, level, NULL, 1) from dual
connect by level <= regexp_count('26,119,7', ',') + 1
)
order by p_id;
P_ID P_
---------- --
1 AA
3 BB
4 CC
5 CC
SQL>

find invalid characters in string

I need a select statement that will show any invalid characters in Customer number field.
A vaild customer number starts with the captial letter N then 10 digits, can be zero to 9.
Something like,
SELECT (CustomerField, 'N[0-9](10)') <> ''
FROM CustomerTable;
Use regexp_like.
select customerfield
from CustomerTable
where not regexp_like(CustomerField, '^N[0-9]{10}$')
This will show the customerfield's that don't follow the pattern specified.
If you really need to find the invalid characters in the string (and not to just simply find the strings that are invalid) perhaps this more complex query will help. You didn't state in what format you may need the output, so I made up my own. I also created several strings for testing (in particular, it is always important to check that the NULL input is treated correctly).
The column len shows the length of the input, if it's not 11. The length of the empty string (null in Oracle) is shown as 0. The first-nondigit columns refer to characters starting at the SECOND position in the string (ignoring the first character, for which the rules are different and which is checked for validity separately).
with
inputs ( str ) as (
select 'N0123456789' from dual union all
select '' from dual union all
select '02324434323' from dual union all
select 'N02345678' from dual union all
select 'A2140480080' from dual union all
select 'N93049c4995' from dual union all
select 'N4448883333' from dual union all
select 'PAR3993949Z' from dual union all
select 'AN39E' from dual
)
-- end of test data; query begins below this line
select str,
case when regexp_like(str, '^N\d{10}$') then 'valid'
else 'invalid' end as classif,
case when length(str) != 11 then length(str)
when str is null then 0 end as len,
case when substr(str, 1, 1) != 'N'
then substr(str, 1, 1) end as first_char,
regexp_substr(str, '[^0-9]', 2) as first_nondigit,
nullif(regexp_instr( str, '[^0-9]', 2), 0) as first_nondigit_pos
from inputs
;
OUTPUT
STR CLASSIF LEN FIRST_CHAR FIRST_NONDIG FIRST_NONDIGIT_POS
----------- ------- ----- ---------- ------------ ------------------
N0123456789 valid
invalid 0
02324434323 invalid 0
N02345678 invalid 9
A2140480080 invalid A
N93049c4995 invalid c 7
N4448883333 valid
PAR3993949Z invalid P A 2
AN39E invalid 5 A N 2
9 rows selected.
\d stands for digit
Perl-influenced Extensions in Oracle Regular Expressions
The rest if the regular expression elements can be found here
Regular Expression Operator Multilingual Enhancements
select *
from CustomerTable
where not regexp_like (CustomerField,'^N\d{10}$')

Regexp_substr expression

I have problem with my REGEXP expression which I want to loop and every iteration deletes text after slash. My expression looks like this now
REGEXP_SUBSTR('L1161148/1/10', '.*(/)')
I'm getting L1161148/1/ instead of L1161148/1
You said you wanted to loop.
CAVEAT: Both of these solutions assume there are no NULL list elements (all slashes have a value in between them).
SQL> with tbl(data) as (
select 'L1161148/1/10' from dual
)
select level, nvl(substr(data, 1, instr(data, '/', 1, level)-1), data) formatted
from tbl
connect by level <= regexp_count(data, '/') + 1 -- Loop # of delimiters +1 times
order by level desc;
LEVEL FORMATTED
---------- -------------
3 L1161148/1/10
2 L1161148/1
1 L1161148
SQL>
EDIT: To handle multiple rows:
SQL> with tbl(rownbr, col1) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual
union
select 2, 'ALKDFJV1161148/123/456/789/1/2/3' from dual
)
SELECT rownbr, column_value substring_nbr,
nvl(substr(col1, 1, instr(col1, '/', 1, column_value)-1), col1) formatted
FROM tbl,
TABLE(
CAST(
MULTISET(SELECT LEVEL
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT(col1, '/')+1
) AS sys.OdciNumberList
)
)
order by rownbr, substring_nbr desc
;
ROWNBR SUBSTRING_NBR FORMATTED
---------- ------------- --------------------------------
1 7 L1161148/1/10/2/34/5/6
1 6 L1161148/1/10/2/34/5
1 5 L1161148/1/10/2/34
1 4 L1161148/1/10/2
1 3 L1161148/1/10
1 2 L1161148/1
1 1 L1161148
2 7 ALKDFJV1161148/123/456/789/1/2/3
2 6 ALKDFJV1161148/123/456/789/1/2
2 5 ALKDFJV1161148/123/456/789/1
2 4 ALKDFJV1161148/123/456/789
2 3 ALKDFJV1161148/123/456
2 2 ALKDFJV1161148/123
2 1 ALKDFJV1161148
14 rows selected.
SQL>
You can try removing the string after the last slash:
select regexp_replace('L1161148/1/10', '/([^/]*)$', '') from dual
You are trying to go as far as the last / and then "look back" and retain what was before it. With regular expressions you can do that with a subexpression, like this:
select regexp_substr('L1161148/1/10', '(.*)/.*', 1, 1, null, 1) from dual;
Here, as usual, the first argument "1" means where to start the search, the second "1" means which matching substring to choose, "null" means no special matching modifiers (like case-insensitive matching and such - not needed here), and the last "1" means return the first subexpression - the first thing in parentheses in the "match pattern."
However, regular expressions should only be used when you can't do it with the standard substr and instr (and translate) functions. Here the job is quite easy:
instr(text_string, '/', -1)
will give you the position of the LAST / in text_string (the -1 means find the last occurrence, instead of the first: count from the end of the string). So the whole thing can be written as:
select substr('L1161148/1/10', 1, instr('L1161148/1/10', '/', -1) - 1) from dual;
Edit: In the spirit of Gary_W's solution, here is a generalization to several strings and stripping successive layers from each input string; still not using regular expressions (resulting in slightly faster performance) and using a recursive CTE, available since Oracle version 11; I believe Gary's solution works only from Oracle 12c on.
Query: (I changed Gary's second input string a bit, to make sure the query works properly)
with tbl(item_id, input_str) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual union all
select 2, 'ALKD/FJV11/61148/123/456/789/1/2/3' from dual
),
r (item_id, proc_string, stage) as (
select item_id, input_str, 0 from tbl
union all
select item_id, substr(proc_string, 1, instr(proc_string, '/', -1) - 1), stage + 1
from r
where instr(proc_string, '/') > 0
)
select * from r
order by item_id, stage;
Output:
ITEM_ID PROC_STRING STAGE
---------- ---------------------------------------- ----------
1 L1161148/1/10/2/34/5/6 0
1 L1161148/1/10/2/34/5 1
1 L1161148/1/10/2/34 2
1 L1161148/1/10/2 3
1 L1161148/1/10 4
1 L1161148/1 5
1 L1161148 6
2 ALKD/FJV11/61148/123/456/789/1/2/3 0
2 ALKD/FJV11/61148/123/456/789/1/2 1
2 ALKD/FJV11/61148/123/456/789/1 2
2 ALKD/FJV11/61148/123/456/789 3
2 ALKD/FJV11/61148/123/456 4
2 ALKD/FJV11/61148/123 5
2 ALKD/FJV11/61148 6
2 ALKD/FJV11 7
2 ALKD 8

substring, after last occurrence of character?

I need help with this problem:
I have a column named phone_number and I wanted to query this column to get the the string right of the last occurrence of '.' for all kinds of numbers in one single sql query.
example #:
515.123.1277
011.44.1345.629268
I need to get 1277 and 629268 respectively.
I have this so far:
select phone_number,
case when length(phone_number) <= 12
then
substr(phone_number,-4)
else
substr (phone_number, -6) end
from employees;
This works for this example, but I want it for all kinds of # formats.
Would be great to get some input.
Thanks
It should be as easy as this regex:
SELECT phone_number, REGEXP_SUBSTR(phone_number, '[^.]*$')
FROM employees;
With the end anchor $ it should get everything that is not a . character after the final .. If the last character is . then it will return NULL.
Search for a pattern including the period, [.] with digits, \d, followed by the end of the string, $.
Associate the digits with a character group by placing the pattern, \d, in parenthesis (see below). This is referenced with the subexpr parameter, 1 (last parameter).
Here is the solution:
SCOTT#dev> list
1 WITH t AS
2 ( SELECT '414.352.3100' p_number FROM dual
3 UNION ALL
4 SELECT '515.123.1277' FROM dual
5 UNION ALL
6 SELECT '011.44.1345.629268' FROM dual
7 )
8* SELECT regexp_substr(t.p_number, '[.](\d+)$', 1, 1, NULL, 1) end_num FROM t
SCOTT#dev> /
END_NUM
========================================================================
3100
1277
629268
You can do something like this in oracle:
select regexp_substr(num,'[^\.]+',1,regexp_count(num,'\.')+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );
Previous to 11gR2 you can use regexp_replace instead regexp_count:
select regexp_substr(num,'[^\.]+',1,length(regexp_replace (num , '[^\.]+'))+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );