I have the following query. I need to pull from a table, a reference for every order number (unique identifier). I will have in the end something like:
Order ID Ref
A Xz|Yz
But I want to have:
Order ID Ref
A Xz
A Yz
The catch is for each order ID, I could have more or less references concatenated but they are always separated by '|'.
I need to use a Select statement somehow (to read the data in Tableau in the proposed format above).
Does anybody have any ideas on how to achieve this?
This version handles NULL list elements and multiple rows:
WITH tbl(order_id, ref) AS (
SELECT 'A', 'Xz|Yz' FROM DUAL union all
SELECT 'B', 'B1||B3' FROM DUAL union all
SELECT 'C', '|C2|C3B3' FROM DUAL
)
SELECT order_id, regexp_substr(ref, '(.*?)(\||$)', 1, level, NULL, 1) ref
FROM tbl
CONNECT BY level <= regexp_count(ref, '\|') + 1
and prior order_id = order_id
and prior sys_guid() is not null
order by order_id;
DB Fiddle example
In Oracle, you can use a recursive query with CONNECT BY and REGEXP_SUBSTR:
SELECT order_id, TRIM(REGEXP_SUBSTR(ref, '[^|]+', 1, level)) ref
FROM t
CONNECT BY instr(ref, '|', 1, level - 1) > 0
ORDER BY order_id, ref
Demo on DB Fiddle:
WITH t AS (
SELECT 'A' order_id, 'Xz|Yz' ref FROM DUAL
)
SELECT order_id, trim(regexp_substr(ref, '[^|]+', 1, level)) ref
FROM t
CONNECT BY instr(ref, '|', 1, level - 1) > 0
order by order_id, ref
ORDER_ID | REF
:------- | :--
A | Xz
A | Yz
You will always have difficulty with that data, because its data is not normalised.
Specifically, the values in each column are currently not atomic; they fail “first normal form”.
The |-separated values should instead be broken out to separate values; for example, rows in some other table.
I have this query:
SELECT regexp_replace (var_called_num, '^' ||ROUTING_PREFIX) INTO Num
FROM INCOMING_ROUTING_PREFIX
WHERE var_called_num LIKE ROUTING_PREFIX ||'%';`
INCOMING_ROUTING_PREFIX table has two rows
1) 007743
2) 007742
var_called_num is 0077438843212123. So above query gives the result 8843212123.
So basically, the query is removing prefix (longest match from table) from var_called_num.
Now my table has changed. Now it has only 1 row which is comma-separated.
Modified Table:
INCOMING_ROUTING_PREFIX table has one row which is comma-separated:
1) 007743,007742
How to modify the query to achieve the same behavior. Need to remove longest match prefix from var_called_num.
Here's one option: you'd have to split the prefix into rows, and the use it in REGEXP_REPLACE.
SQL> with
2 calnum (var_called_num) as
3 (select '0077438843212123' from dual),
4 incoming_routing_prefix (routing_prefix) as
5 (select '007743,007742' from dual),
6 --
7 irp_split as
8 (select regexp_substr(i.routing_prefix, '[^,]+', 1, level) routing_prefix
9 from incoming_routing_prefix i
10 connect by level <= regexp_count(i.routing_prefix, ',') + 1
11 )
12 select regexp_replace(c.var_called_num, '^' || s.routing_prefix) result
13 from calnum c join irp_split s on s.routing_prefix = substr(c.var_called_num, 1, length(s.routing_prefix));
RESULT
----------------
8843212123
SQL>
By the way, why did you change the model to a worse version than it was before?
you can split the values
with test as (
select regexp_substr('007743,007742','[^,]+', 1, level) as ROUTING_PREFIX from dual
connect by regexp_substr('007743,007742S', '[^,]+', 1, level) is not null
)
and that use the view in your select
SELECT regexp_replace ('0077438843212123', '^' ||ROUTING_PREFIX)
FROM test WHERE '0077438843212123' LIKE ROUTING_PREFIX ||'%';
Thank you mathguy for your suggestion and assistance. The example you provided is a near perfect description of the issue. That being said I've used and edited your text to help describe this issue:
I receive a string that contains comma delimited digits in the form of 18656, 16380, 16424 (call this param1). The string only contains commas and digits.
In mytable I have a column named t with values such as 18656.01.02, 10.02.02, 16380.02.03, 16424.05.66, 16424.55.23.14.
I want to select the all rows that match all of the comma-separated digits in param1; where the first numeric component in column t is like 18656, 16380, 16424. Is there a way to use regexp_substr in this case.
Where param1 = 18656, 16380, 16424
the following works:
select * from mytable where t.mycolumn IN
(
(SELECT regexp_substr(:param1,'[^,]+', 1, level) as NUMLIST
FROM DUAL
CONNECT BY regexp_substr(:param1, '[^,]+', 1, level) IS NOT NULL)
);
How to use wildcard if data I seek from t.mycolumn = 18656.00.01, 16380.09.34, 16424.023.8
Can LIKE be used as search criteria? If possible please provide example.
Obviously, the following will not work but I am hoping to find a solution.
select * from mytable where t.mycolumn LIKE
(
(SELECT regexp_substr(:param1||'%','[^,]+', 1, level) as NUMLIST
FROM DUAL
CONNECT BY regexp_substr(:param1||'%', '[^,]+', 1, level) IS NOT NULL)
);
Assumptions:
There is a table named mytable with a column named t which
contains values as follows:
SELECT * FROM mytable;
T |
---------------|
18656.01.02 |
10.02.02 |
16380.02.03 |
16424.05.66 |
16424.55.23.14 |
There is a string received as a parameter, that contains comma delimited digits in the form of 18656, 16380, 16424. The string only contains commas and digits. This string is parsed into indyvidual rows with a help of a query that looks similar to the folowing one:
SELECT regexp_substr(param1,'[^,]+', 1, level) as NUMLIST
FROM (
select '18656,16380,16424' as param1 FROM DUAL
)
CONNECT BY regexp_substr(param1, '[^,]+', 1, level) IS NOT NULL
;
NUMLIST |
--------|
18656 |
16380 |
16424 |
Requirement
Can LIKE be used as search criteria? If possible please provide
example.
LIKE keyword is used below as a condition in JOIN ... ON clause:
SELECT * FROM mytable
WHERE t IN (
SELECT t
FROM mytable m
JOIN (
SELECT regexp_substr(param1,'[^,]+', 1, level) as NUMLIST
FROM (
select '18656,16380,16424' as param1 FROM DUAL
)
CONNECT BY regexp_substr(param1, '[^,]+', 1, level) IS NOT NULL
) x
ON m.t LIKE '%' || x.NUMLIST || '%'
)
T |
---------------|
18656.01.02 |
16380.02.03 |
16424.05.66 |
16424.55.23.14 |
I have a table with two columns:
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,25,26,24 |1,26,24,25
1,56,55,54 |1,55,54
1 |1
1,2 |1
1,96,95,94 |1,96,94,95
1 |1
1 |1
1 |1
1 |1
1,2 |1,2
1 |1
1 |1
1 |1
1 |1
For each row there will be a list of revisions for a document (comma separated)
The comma separated list might be the same in both columns but the order/sort might be different - e.g.
2,1 |1,2
I would like to find all the instances where the highest revision in the OLD_REVISIONS column is lower than than the highest revision in NEW_REVISIONS
The following would fit that criteria
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,2 |1
1,56,55,54 |1,55,54
I tried a solution using the MINUS option (joining the table to itself) but it returns differences even for when the list is the same but in the wrong order
I tried the function GREATEST (i.e where greatest(new_Revisions) < greatest(old_revisions)) but i am not sure why greatest(OLD_REVISIONS) always just returns the comma separated value. It does not return the max value. I suspect it is comparing strings because the columns are VARCHAR.
Also, MAX function expects a single number.
Is there another way i can achieve the above? I am looking for a pure SQL option so i can print out the results (or a PL/SQL option that can print out the results)
Edit
Apologies for not mentioning this but for the NEW_REVISIONS i do actually have the data in a table where each revision is in a separate row:
"DOCNUMBER" "REVISIONNUMBER"
67 1
67 24
67 25
67 26
75 1
75 54
75 55
75 56
78 1
79 1
79 2
83 1
83 96
83 94
Just to give some content, a few weeks ago i suspected that there are revisions disappearing.
To investigate this, i decided to take a count of all revisions for all documents and take a snapshot to compare later to see if revisions are indeed missing.
The snapshot that i took contained the following columns:
docnumber, count, revisions
The revisions were stored in a comma separated list using the listagg function.
The trouble i have now is the on live table, new revisions have been added so when i compare the main table and the snapshot using a MINUS i get a difference because
of the new revisions in the main table.
Even though in the actual table the revisions are individual rows, in the snapshot table i dont have the individual rows.
I am thinking the only way to recreate the snapshot in the same format and compare them find out if maximum revision in the main table is lower than the max revision in the snapshot table (hence why im trying to find out how to find out the max in a comma separated string)
Enjoy.
select xmlcast(xmlquery(('max((' || OLD_REVISIONS || '))') RETURNING CONTENT) as int) as OLD_REVISIONS_max
,xmlcast(xmlquery(('max((' || NEW_REVISIONS || '))') RETURNING CONTENT) as int) as NEW_REVISIONS_max
from t
;
Assuming your base table has an id column (versions of what?) - here is a solution based on splitting the rows.
Edit: If you like this solution, check out vkp's solution, which is better than mine. I explain why his solution is better in a Comment to his Answer.
with
t ( id, old_revisions, new_revisions ) as (
select 101, '1,25,26,24', '1,26,24,25' from dual union all
select 102, '1,56,55,54', '1,55,54' from dual union all
select 103, '1' , '1' from dual union all
select 104, '1,2' , '1' from dual union all
select 105, '1,96,95,94', '1,96,94,95' from dual union all
select 106, '1' , '1' from dual union all
select 107, '1' , '1' from dual union all
select 108, '1' , '1' from dual union all
select 109, '1' , '1' from dual union all
select 110, '1,2' , '1,2' from dual union all
select 111, '1' , '1' from dual union all
select 112, '1' , '1' from dual union all
select 113, '1' , '1' from dual union all
select 114, '1' , '1' from dual
)
-- END of TEST DATA; the actual solution (SQL query) begins below.
select id, old_revisions, new_revisions
from (
select id, old_revisions, new_revisions, 'old' as flag,
to_number(regexp_substr(old_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(old_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
union all
select id, old_revisions, new_revisions, 'new' as flag,
to_number(regexp_substr(new_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(new_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
)
group by id, old_revisions, new_revisions
having max(case when flag = 'old' then rev_no end) !=
max(case when flag = 'new' then rev_no end)
order by id -- ORDER BY is optional
;
ID OLD_REVISION NEW_REVISION
--- ------------ ------------
102 1,56,55,54 1,55,54
104 1,2 1
You can compare every value by putting together the revisions in the same order using listagg function.
SELECT listagg(o,',') WITHIN GROUP (ORDER BY o) old_revisions,
listagg(n,',') WITHIN GROUP (ORDER BY n) new_revisions
FROM (
SELECT DISTINCT rowid r,
regexp_substr(old_revisions, '[^,]+', 1, LEVEL) o,
regexp_substr(new_revisions, '[^,]+', 1, LEVEL) n
FROM table
WHERE regexp_substr(old_revisions, '[^,]+', 1, LEVEL) IS NOT NULL
CONNECT BY LEVEL<=(SELECT greatest(MAX(regexp_count(old_revisions,',')),MAX(regexp_count(new_revisions,',')))+1 c FROM table)
)
GROUP BY r
HAVING listagg(o,',') WITHIN GROUP (ORDER BY o)<>listagg(n,',') WITHIN GROUP (ORDER BY n);
This could be a way:
select
OLD_REVISIONS,
NEW_REVISIONS
from
REVISIONS t,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.OLD_REVISIONS, '[^,]+')) + 1
) as sys.OdciNumberList
)
) levels_old,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.NEW_REVISIONS, '[^,]+')) + 1
)as sys.OdciNumberList
)
) levels_new
group by t.ROWID,
OLD_REVISIONS,
NEW_REVISIONS
having max(to_number(trim(regexp_substr(t.OLD_REVISIONS, '[^,]+', 1, levels_old.column_value)))) >
max(to_number(trim(regexp_substr(t.new_REVISIONS, '[^,]+', 1, levels_new.column_value))))
This uses a double string split to pick the values from every field, and then simply finds the rows where the max values among the two collections match your requirement.
You should edit this by adding some unique key in the GROUP BYclause, or a rowid if you don't have any unique key on your table.
One way to do is to split the columns on comma separation using regexp_substr and checking if the max and min values are different.
Sample Demo
with rownums as (select t.*,row_number() over(order by old_revisions) rn from t)
select old_revisions,new_revisions
from rownums
where rn in (select rn
from rownums
group by rn
connect by regexp_substr(old_revisions, '[^,]+', 1, level) is not null
or regexp_substr(new_revisions, '[^,]+', 1, level) is not null
having max(cast(regexp_substr(old_revisions,'[^,]+', 1, level) as int))
<> max(cast(regexp_substr(new_revisions,'[^,]+', 1, level) as int))
)
Comments say normalise data. I agree but also I understand it may be not possible. I would try something like query below:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select regexp_substr(v1,'[^,]+', 1, level) val, rowid r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by r) t1
inner join (
select max(val) val2, r from (
select regexp_substr(v2,'[^,]+', 1, level) val, rowid r from tab1
connect by regexp_substr(v2, '[^,]+', 1, level) is not null
) group by r) t2
on (t1.r = t2.r);
Tested on:
create table tab1 (v1 varchar2(100), v2 varchar2(100));
insert into tab1 values ('1,3,5','1,4,7');
insert into tab1 values ('1,3,5','1,2,9');
insert into tab1 values ('1,3,5','1,3,5');
insert into tab1 values ('1,3,5','1,4');
and seems to work fine. I left rowid for reference. I guess you have some id in table.
After your edit I would change query to:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select regexp_substr(v1,'[^,]+', 1, level) val, DOCNUMBER r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by DOCNUMBER) t1
inner join (
select max(DOCNUMBER) val2, DOCNUMBER r from NEW_REVISIONS) t2
on (t1.r = t2.r);
You may write a PL/SQL function parsing the string and returning the maximal number
select max_num( '1,26,24,25') max_num from dual;
MAX_NUM
----------
26
The query ist than very simple:
select OLD_REVISIONS NEW_REVISIONS
from revs
where max_num(OLD_REVISIONS) < max_num(NEW_REVISIONS);
A prototyp function without validation and error handling
create or replace function max_num(str_in VARCHAR2) return NUMBER as
i number;
x varchar2(1);
n number := 0;
max_n number := 0;
pow number := 0;
begin
for i in 0.. length(str_in)-1 loop
x := substr(str_in,length(str_in)-i,1);
if x = ',' then
-- check max number
if n > max_n then
max_n := n;
end if;
-- reset
n := 0;
pow := 0;
else
n := n + to_number(x)*power(10,pow);
pow := pow +1;
end if;
end loop;
return(max_n);
end;
/
Below is the interview question, can some please help me resolve it?
select 'a1b2c3d4e5f6g7' from dual;
Output is sum of given integer number(1+2+3+4+5+6+7)=28.
Any help?
Use a Regex to keep only the numbers,then connect by to add each number
With T
as (select regexp_replace('a1b2c3d4e5f6g7', '[A-Za-z]') as col from dual)
select sum(val)
From
(
select substr(col,level,1) val from t connect by level <= length(col)
)
FIDDLE
Since it is only 1 digit numbers you can use SUBSTR() to extract every other character:
SQL Fiddle
Oracle 11g R2 Schema Setup:
Query 1:
WITH data ( value ) AS (
select 'a1b2c3d4e5f6g7' from dual
)
SELECT SUM( TO_NUMBER( SUBSTR( value, 2*LEVEL, 1 ) ) ) AS total
FROM data
CONNECT BY 2 * LEVEL <= LENGTH( value )
Results:
| TOTAL |
|-------|
| 28 |
However, if you have two digit numbers then you can do:
Query 2:
WITH data ( value ) AS (
select 'a1b2c3d4e5f6g7h8i9j10' from dual
)
SELECT SUM( TO_NUMBER( REGEXP_SUBSTR( value, '\d+', 1, LEVEL ) ) ) AS total
FROM data
CONNECT BY LEVEL <= REGEXP_COUNT( value, '\d+' )
Results:
| TOTAL |
|-------|
| 55 |
You can use regexp_substr to extract exactly the numbers, then just sum them:
with t as (select 'a1b2c3d4e5f6g7' expr from dual)
select sum(regexp_substr(t.expr, '[0-9]+',1, level)) as col
from dual
connect by level < regexp_instr(t.expr, '[0-9]+',1, level);
example:
select sum(regexp_substr('a1b2c3d4e5f6g7r22g4', '[0-9]+',1, level)) as col
from dual
connect by level < regexp_instr('a1b2c3d4e5f6g7r22g4', '[0-9]+',1, level);
Result:
54
This solution works with numbers with more than 1 digit and it doesn't matter how many characters are between the numbers:
with t as (select 'a1b2c3d4e5f6g7' as str from dual)
select sum(to_number(regexp_substr(str,'[0-9]+',1,level)))
from t
connect by regexp_substr(str,'[0-9]+',1,level) is not null