How can I search for a specific number in an array using REGEXP?
I have an array and need to verify if it has a specific number.
Ex: [5,2,1,4,6,19] and I am looking for the number 1, but just the number 1 and not any number that contains the digit 1.
I had to do this:
case when REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][,]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][,]{1}')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][]]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][]]') <>0
then 'DIGIT_ONE' else 'NO_DIGIT_ONE'
end
Is there anything simpler?
You can use
(^|\D)1(\D|$)
This will search for a 1 that is not adjacent to any other digit.
See this regex demo.
Details
(^|\D) - start of string or non-digit
1 - a 1 char
(\D|$) - non-digit or end of string.
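As a rough check of the pattern inside Oracle, a query along these lines could be used. The sample arrays and the samples/arr names are made up for illustration, and it assumes an Oracle version whose regular expressions support \D (10g Release 2 or later).

```sql
-- Sample arrays are made up for illustration.
WITH samples (arr) AS (
  SELECT '[5,2,1,4,6,19]' FROM dual UNION ALL
  SELECT '[11,19,21]'     FROM dual UNION ALL
  SELECT '[1]'            FROM dual
)
SELECT arr,
       CASE
         -- a 1 with no digit directly before or after it
         WHEN REGEXP_LIKE(arr, '(^|\D)1(\D|$)') THEN 'DIGIT_ONE'
         ELSE 'NO_DIGIT_ONE'
       END AS result
FROM samples;
```

Only the first and last rows should be flagged as DIGIT_ONE. Note that the pattern only looks at neighbouring characters, so values such as -1 or 1.5 would also match.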
Do NOT use regular expressions, use a proper JSON parser and then filter for the number you want:
SELECT my_json_column,
CASE
WHEN JSON_EXISTS( my_json_column, '$?(@.path[*] == 1)' )
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END AS has_one
FROM table_name;
or (if you are using Oracle 12.1 and cannot use path filter expressions with JSON_EXISTS, which are only available from Oracle 12.2):
SELECT my_json_column,
CASE
WHEN EXISTS(
SELECT 'X'
FROM JSON_TABLE(
t.my_json_column,
'$.path[*]'
COLUMNS (
value NUMBER PATH '$'
)
)
WHERE value = 1
)
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END AS has_one
FROM table_name t;
Which, for the sample data:
CREATE TABLE table_name (
my_json_column CHECK ( my_json_column IS JSON )
) AS
SELECT '{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[11],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[2],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[1,11]}' FROM DUAL;
Both output:
MY_JSON_COLUMN | HAS_ONE
:-------------------------------------------------- | :-----------
{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]} | DIGIT ONE
{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]} | NO DIGIT ONE
{"path":[11],"not_this_path":[1]} | NO DIGIT ONE
{"path":[2],"not_this_path":[1]} | NO DIGIT ONE
{"path":[1,11]} | DIGIT ONE
db<>fiddle here
Alternatively, with a little bit more typing (a little bit? Am I kidding?!), splitting the string into rows and comparing values to the search string:
SQL> with test (col) as
2 (select '[5,2,1,4,6,19]' from dual)
3 select t.col,
4 case when '&par_search_string' in
5 (select regexp_substr(substr(col, 2, length(col) - 2), '[^,]+', 1, level) val
6 from test
7 connect by level <= regexp_count(col, ',') + 1
8 )
9 then 'Search string exists'
10 else 'Search string does not exist'
11 end result
12 from test t;
Enter value for par_search_string: 1
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string exists
SQL> /
Enter value for par_search_string: 24
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string does not exist
SQL>
I have a table with two columns:
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,25,26,24 |1,26,24,25
1,56,55,54 |1,55,54
1 |1
1,2 |1
1,96,95,94 |1,96,94,95
1 |1
1 |1
1 |1
1 |1
1,2 |1,2
1 |1
1 |1
1 |1
1 |1
For each row there will be a list of revisions for a document (comma separated)
The comma separated list might be the same in both columns but the order/sort might be different - e.g.
2,1 |1,2
I would like to find all the instances where the highest revision in the NEW_REVISIONS column is lower than the highest revision in OLD_REVISIONS.
The following would fit that criteria
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,2 |1
1,56,55,54 |1,55,54
I tried a solution using the MINUS option (joining the table to itself), but it returns differences even when the list is the same but in a different order.
I tried the function GREATEST (i.e. where greatest(new_Revisions) < greatest(old_revisions)), but greatest(OLD_REVISIONS) always just returns the whole comma separated value rather than the max value. I suspect it is comparing strings because the columns are VARCHAR.
Also, the MAX function expects a single number.
Is there another way I can achieve the above? I am looking for a pure SQL option so I can print out the results (or a PL/SQL option that can print out the results).
Edit
Apologies for not mentioning this, but for the NEW_REVISIONS I do actually have the data in a table where each revision is in a separate row:
"DOCNUMBER" "REVISIONNUMBER"
67 1
67 24
67 25
67 26
75 1
75 54
75 55
75 56
78 1
79 1
79 2
83 1
83 96
83 94
Just to give some context, a few weeks ago I suspected that revisions were disappearing.
To investigate this, I decided to take a count of all revisions for all documents and take a snapshot to compare later, to see if revisions are indeed missing.
The snapshot that I took contained the following columns:
docnumber, count, revisions
The revisions were stored in a comma separated list using the listagg function.
The trouble I have now is that new revisions have been added to the live table, so when I compare the main table and the snapshot using a MINUS I get a difference because of the new revisions in the main table.
Even though in the actual table the revisions are individual rows, in the snapshot table I don't have the individual rows.
I am thinking the only way is to recreate the snapshot in the same format and compare them to find out if the maximum revision in the main table is lower than the max revision in the snapshot table (hence why I'm trying to find out how to find the max in a comma separated string).
Enjoy.
select xmlcast(xmlquery(('max((' || OLD_REVISIONS || '))') RETURNING CONTENT) as int) as OLD_REVISIONS_max
,xmlcast(xmlquery(('max((' || NEW_REVISIONS || '))') RETURNING CONTENT) as int) as NEW_REVISIONS_max
from t
;
Assuming your base table has an id column (versions of what?) - here is a solution based on splitting the rows.
Edit: If you like this solution, check out vkp's solution, which is better than mine. I explain why his solution is better in a Comment to his Answer.
with
t ( id, old_revisions, new_revisions ) as (
select 101, '1,25,26,24', '1,26,24,25' from dual union all
select 102, '1,56,55,54', '1,55,54' from dual union all
select 103, '1' , '1' from dual union all
select 104, '1,2' , '1' from dual union all
select 105, '1,96,95,94', '1,96,94,95' from dual union all
select 106, '1' , '1' from dual union all
select 107, '1' , '1' from dual union all
select 108, '1' , '1' from dual union all
select 109, '1' , '1' from dual union all
select 110, '1,2' , '1,2' from dual union all
select 111, '1' , '1' from dual union all
select 112, '1' , '1' from dual union all
select 113, '1' , '1' from dual union all
select 114, '1' , '1' from dual
)
-- END of TEST DATA; the actual solution (SQL query) begins below.
select id, old_revisions, new_revisions
from (
select id, old_revisions, new_revisions, 'old' as flag,
to_number(regexp_substr(old_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(old_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
union all
select id, old_revisions, new_revisions, 'new' as flag,
to_number(regexp_substr(new_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(new_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
)
group by id, old_revisions, new_revisions
having max(case when flag = 'old' then rev_no end) !=
max(case when flag = 'new' then rev_no end)
order by id -- ORDER BY is optional
;
ID OLD_REVISION NEW_REVISION
--- ------------ ------------
102 1,56,55,54 1,55,54
104 1,2 1
You can compare every value by putting together the revisions in the same order using listagg function.
SELECT listagg(o,',') WITHIN GROUP (ORDER BY o) old_revisions,
listagg(n,',') WITHIN GROUP (ORDER BY n) new_revisions
FROM (
SELECT DISTINCT rowid r,
regexp_substr(old_revisions, '[^,]+', 1, LEVEL) o,
regexp_substr(new_revisions, '[^,]+', 1, LEVEL) n
FROM table
WHERE regexp_substr(old_revisions, '[^,]+', 1, LEVEL) IS NOT NULL
CONNECT BY LEVEL<=(SELECT greatest(MAX(regexp_count(old_revisions,',')),MAX(regexp_count(new_revisions,',')))+1 c FROM table)
)
GROUP BY r
HAVING listagg(o,',') WITHIN GROUP (ORDER BY o)<>listagg(n,',') WITHIN GROUP (ORDER BY n);
This could be a way:
select
OLD_REVISIONS,
NEW_REVISIONS
from
REVISIONS t,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.OLD_REVISIONS, '[^,]+')) + 1
) as sys.OdciNumberList
)
) levels_old,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.NEW_REVISIONS, '[^,]+')) + 1
)as sys.OdciNumberList
)
) levels_new
group by t.ROWID,
OLD_REVISIONS,
NEW_REVISIONS
having max(to_number(trim(regexp_substr(t.OLD_REVISIONS, '[^,]+', 1, levels_old.column_value)))) >
max(to_number(trim(regexp_substr(t.new_REVISIONS, '[^,]+', 1, levels_new.column_value))))
This uses a double string split to pick the values from every field, and then simply finds the rows where the max values among the two collections match your requirement.
You should edit this by adding some unique key in the GROUP BY clause, or a rowid if you don't have any unique key on your table.
One way to do it is to split the columns on the comma separator using regexp_substr and check whether the max values of the two columns differ.
Sample Demo
with rownums as (select t.*,row_number() over(order by old_revisions) rn from t)
select old_revisions,new_revisions
from rownums
where rn in (select rn
from rownums
group by rn
connect by regexp_substr(old_revisions, '[^,]+', 1, level) is not null
or regexp_substr(new_revisions, '[^,]+', 1, level) is not null
having max(cast(regexp_substr(old_revisions,'[^,]+', 1, level) as int))
<> max(cast(regexp_substr(new_revisions,'[^,]+', 1, level) as int))
)
Comments say to normalise the data. I agree, but I also understand that it may not be possible. I would try something like the query below:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
-- to_number so that max() compares numbers, not strings
select to_number(regexp_substr(v1,'[^,]+', 1, level)) val, rowid r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by r) t1
inner join (
select max(val) val2, r from (
select to_number(regexp_substr(v2,'[^,]+', 1, level)) val, rowid r from tab1
connect by regexp_substr(v2, '[^,]+', 1, level) is not null
) group by r) t2
on (t1.r = t2.r);
Tested on:
create table tab1 (v1 varchar2(100), v2 varchar2(100));
insert into tab1 values ('1,3,5','1,4,7');
insert into tab1 values ('1,3,5','1,2,9');
insert into tab1 values ('1,3,5','1,3,5');
insert into tab1 values ('1,3,5','1,4');
and seems to work fine. I left rowid for reference. I guess you have some id in table.
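One point worth keeping in mind with split values: regexp_substr returns text, so MAX on the raw pieces compares strings rather than numbers unless they are converted. A small sketch with made-up values (the column aliases are only for illustration):

```sql
-- '9' sorts after '10' as text, but 10 is greater than 9 as a number.
SELECT MAX(val)            AS max_as_text,
       MAX(TO_NUMBER(val)) AS max_as_number
FROM (
  SELECT '9'  AS val FROM dual UNION ALL
  SELECT '10' AS val FROM dual
);
```

Here max_as_text comes back as 9 while max_as_number is 10, which is why single-digit test data can hide the problem.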
After your edit I would change the query to:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select to_number(regexp_substr(v1,'[^,]+', 1, level)) val, DOCNUMBER r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by r) t1
inner join (
-- highest revision per document from the row-per-revision table
select max(REVISIONNUMBER) val2, DOCNUMBER r from NEW_REVISIONS group by DOCNUMBER) t2
on (t1.r = t2.r);
You may write a PL/SQL function parsing the string and returning the maximal number
select max_num( '1,26,24,25') max_num from dual;
MAX_NUM
----------
26
The query is then very simple:
select OLD_REVISIONS, NEW_REVISIONS
from revs
where max_num(OLD_REVISIONS) < max_num(NEW_REVISIONS);
A prototype function without validation and error handling:
create or replace function max_num(str_in VARCHAR2) return NUMBER as
x varchar2(1);
n number := 0;
max_n number := 0;
pow number := 0;
begin
-- scan the string from right to left, building each number digit by digit
for i in 0.. length(str_in)-1 loop
x := substr(str_in,length(str_in)-i,1);
if x = ',' then
-- check max number
if n > max_n then
max_n := n;
end if;
-- reset
n := 0;
pow := 0;
else
n := n + to_number(x)*power(10,pow);
pow := pow +1;
end if;
end loop;
-- the leftmost number has no comma before it, so compare it here
if n > max_n then
max_n := n;
end if;
return(max_n);
end;
/
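As a variation on the same idea, the digit-by-digit arithmetic could be replaced with regexp_substr. This is only a sketch: the name max_num_regexp is hypothetical, it assumes REGEXP_COUNT is available (Oracle 11g or later), and it returns NULL when the input contains no digits.

```sql
CREATE OR REPLACE FUNCTION max_num_regexp(str_in VARCHAR2) RETURN NUMBER AS
  max_n NUMBER;          -- stays NULL until the first number is seen
  tok   VARCHAR2(40);    -- current run of digits
BEGIN
  -- NVL guards against a NULL input, where REGEXP_COUNT returns NULL
  FOR i IN 1 .. NVL(REGEXP_COUNT(str_in, '\d+'), 0) LOOP
    tok := REGEXP_SUBSTR(str_in, '\d+', 1, i);
    IF max_n IS NULL OR TO_NUMBER(tok) > max_n THEN
      max_n := TO_NUMBER(tok);
    END IF;
  END LOOP;
  RETURN max_n;
END;
/
```

It can be called the same way, for example select max_num_regexp('1,26,24,25') from dual, which should give 26.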
I have two columns with a hashtag delimited value, i.e. Email#Web#Telephone#SMS#MMS & 0#0#0#1#0 Note that each delimited value of the second column matches up with its corresponding delimited value in the first column, i.e. Email = 0, Web = 0, Telephone = 0, SMS = 1 etc.
Based on a parameter, I want to return the matching value of the second column. i.e. incoming param = Web#Telephone#SMS, thus the value that I want to return is 0#0#1.
This needs to be done in PL/SQL, and I have no clue where to start, which explains the lack of sample code.
Any help please?
There are a couple of very useful utility functions in an Oracle package called APEX_UTIL. (This package concerns Oracle Application Express aka APEX, but can be used anywhere). They are:
apex_util.string_to_table
apex_util.table_to_string
Using string_to_table you can convert the delimited string into a table of values:
declare
v_table apex_application_global.vc_arr2; -- This is the table type apex_util uses
begin
v_table := apex_util.string_to_table ('Email#Web#Telephone#SMS#MMS', '#');
end;
You now have an array with 5 elements ('Email', 'Web', 'Telephone', 'SMS', 'MMS').
You can do the same with the values string to get a table with elements ('0', '0', '0', '1', '0'). And you can do the same with the parameter to get a table with elements ('Web', 'Telephone', 'SMS').
You can then use PL/SQL logic to build a new array with elements for the values you need to return, i.e. ('0', '0', '1'). I have left this part to you!
Finally you can turn that back into a delimited string:
return apex_util.table_to_string (v_return_table, '#');
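For reference, the matching step in the middle could look roughly like the anonymous block below. It is a sketch under a few assumptions: APEX is installed so that apex_util is available, the variable names are made up, every requested name occurs at most once in the attribute list, and names that are not found are simply skipped.

```sql
-- Run with SET SERVEROUTPUT ON to see the result.
DECLARE
  v_names  apex_application_global.vc_arr2;  -- 'Email', 'Web', ...
  v_values apex_application_global.vc_arr2;  -- '0', '0', ...
  v_wanted apex_application_global.vc_arr2;  -- names from the parameter
  v_result apex_application_global.vc_arr2;  -- values to return
BEGIN
  v_names  := apex_util.string_to_table('Email#Web#Telephone#SMS#MMS', '#');
  v_values := apex_util.string_to_table('0#0#0#1#0', '#');
  v_wanted := apex_util.string_to_table('Web#Telephone#SMS', '#');

  -- for each requested name, find its position and copy the paired value
  FOR w IN 1 .. v_wanted.COUNT LOOP
    FOR n IN 1 .. v_names.COUNT LOOP
      IF v_names(n) = v_wanted(w) THEN
        v_result(v_result.COUNT + 1) := v_values(n);
        EXIT;
      END IF;
    END LOOP;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE(apex_util.table_to_string(v_result, '#'));
END;
/
```

With the sample strings from the question this should print 0#0#1.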
Firstly, you should normalize the table and have the attributes in different columns rather than delimited strings in a single column.
Anyway, you could do it in many ways using the techniques to Split comma delimited strings in a table
For example, using REGEXP_SUBSTR and CONNECT BY clause:
SQL> WITH DATA(attr, val) AS(
2 SELECT 'Email#Web#Telephone#SMS#MMS', '0#0#0#1#0' FROM dual
3 )
4 SELECT lines.COLUMN_VALUE,
5 trim(regexp_substr(t.attr, '[^#]+', 1, lines.COLUMN_VALUE)) attr,
6 trim(regexp_substr(t.val, '[^#]+', 1, lines.COLUMN_VALUE)) val
7 FROM data t,
8 TABLE (CAST (MULTISET
9 (SELECT LEVEL FROM dual CONNECT BY LEVEL <= regexp_count(t.attr, '#')+1
10 ) AS sys.odciNumberList ) ) lines
11 /
COLUMN_VALUE ATTR VAL
------------ --------------------------- ---------
1 Email 0
2 Web 0
3 Telephone 0
4 SMS 1
5 MMS 0
SQL>
Now, you can get the respective values for each attribute.
You could put the entire logic in a FUNCTION and return the corresponding values of each attribute and call the function in SELECT statement.
For example,
SQL> CREATE OR REPLACE
2 FUNCTION get_val_from_attr(
3 attr_name VARCHAR2)
4 RETURN NUMBER
5 IS
6 var_val NUMBER;
7 BEGIN
8 WITH DATA(attr, val) AS
9 ( SELECT 'Email#Web#Telephone#SMS#MMS', '0#0#0#1#0' FROM dual
10 ),
11 t2 AS
12 (SELECT lines.COLUMN_VALUE,
13 trim(regexp_substr(t.attr, '[^#]+', 1, lines.COLUMN_VALUE)) attr,
14 trim(regexp_substr(t.val, '[^#]+', 1, lines.COLUMN_VALUE)) val
15 FROM data t,
16 TABLE (CAST (MULTISET
17 (SELECT LEVEL FROM dual CONNECT BY LEVEL <= regexp_count(t.attr, '#')+1
18 ) AS sys.odciNumberList ) ) lines
19 )
20 SELECT val INTO var_val FROM t2 WHERE attr = attr_name;
21 RETURN var_val;
22 END;
23 /
Function created.
Let's call the function:
SQL> SELECT get_val_from_attr('Email') FROM dual;
GET_VAL_FROM_ATTR('EMAIL')
--------------------------
0
SQL> SELECT get_val_from_attr('SMS') FROM dual;
GET_VAL_FROM_ATTR('SMS')
------------------------
1
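To handle the incoming parameter from the question in one go, the same split could be applied to the parameter and joined back to the attribute/value pairs. The following is only a sketch: the param column, the pairs and wanted subqueries, and the pos alias are names introduced here for illustration. It assumes LISTAGG is available (11g Release 2 or later) and that every requested name appears exactly once in the attribute list.

```sql
WITH data (attr, val, param) AS (
  SELECT 'Email#Web#Telephone#SMS#MMS', '0#0#0#1#0', 'Web#Telephone#SMS'
  FROM dual
),
pairs AS (
  -- one row per attribute with its value at the same position
  SELECT LEVEL AS pos,
         REGEXP_SUBSTR(attr, '[^#]+', 1, LEVEL) AS attr,
         REGEXP_SUBSTR(val,  '[^#]+', 1, LEVEL) AS val
  FROM data
  CONNECT BY LEVEL <= REGEXP_COUNT(attr, '#') + 1
),
wanted AS (
  -- one row per requested attribute, keeping the requested order
  SELECT LEVEL AS pos,
         REGEXP_SUBSTR(param, '[^#]+', 1, LEVEL) AS attr
  FROM data
  CONNECT BY LEVEL <= REGEXP_COUNT(param, '#') + 1
)
SELECT LISTAGG(p.val, '#') WITHIN GROUP (ORDER BY w.pos) AS result
FROM wanted w
JOIN pairs p ON p.attr = w.attr;
```

For the sample data this should return 0#0#1. The plain CONNECT BY here relies on data having a single row; with a real table the split would need to be correlated per row.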
I have a procedure with a parameter that takes a comma separated value,
so when I enter Parameter = '1,0,1'
I want it to return 'One, Zero, One'.
You could use the REPLACE function, as long as every value in the list is a single digit (0 or 1 here).
For example,
SQL> WITH DATA(str) AS(
2 SELECT '1,0,1' FROM dual
3 )
4 SELECT str,
5 REPLACE(REPLACE(str, '0', 'Zero'), '1', 'One') new_str
6 FROM DATA;
STR NEW_STR
----- ------------------------------------------------------------
1,0,1 One,Zero,One
SQL>
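A quick way to see where nested REPLACE calls stop being safe is to feed them a multi-digit value. The input '10,1' below is made up for illustration:

```sql
-- Each digit is replaced on its own, so 10 is not treated as one value.
SELECT REPLACE(REPLACE('10,1', '0', 'Zero'), '1', 'One') AS new_str
FROM dual;
```

This returns OneZero,One, so the approach fits lists that only ever contain single digits.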
This query splits the list into numbers, converts the numbers into words and joins them together again with the listagg function:
with t1 as (select '7, 0, 11, 132' col from dual),
t2 as (select level lvl,to_number(regexp_substr(col,'[^,]+', 1, level)) col
from t1 connect by regexp_substr(col, '[^,]+', 1, level) is not null)
select listagg(case
when col=0 then 'zero'
else to_char(to_date(col,'j'), 'jsp')
end,
', ') within group (order by lvl) col
from t2
Output:
COL
-------------------------------------------
seven, zero, eleven, one hundred thirty-two
The limitation of this solution is that values must be between 0 and 5373484 (because 5373484 is the maximum Julian day number that to_date accepts).
If you need higher values you can find hints in this article.
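The conversion itself can be tried in isolation. The sketch below uses the value 132 from the sample list; the column aliases are made up. The case of the format element controls the case of the spelled-out words, so 'Jsp' may be closer to the capitalised output asked for in the question.

```sql
SELECT TO_CHAR(TO_DATE(132, 'j'), 'jsp') AS lower_case,  -- one hundred thirty-two
       TO_CHAR(TO_DATE(132, 'j'), 'Jsp') AS init_caps    -- words capitalised
FROM dual;
```

TO_DATE(0, 'j') raises an error because Julian day numbers start at 1, which is why the query above handles zero in a separate CASE branch.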