SQL How to perform multiple look-ups from a list, in one query - sql

We have a weird database table (wt) for which I can construct a query that can return a single row with these fields:
wt.thing_a_id = 5, wt.thing_b_id = 12, wt.thing_c_id = 9
Then, there's another lookup table (dt) that holds descriptions for these numbers, you could imagine it like this:
id desc
5 "flour"
12 "cups"
9 "barley"
what I need to end up with is numbers from wt, along with its description from dt.
I can do 3 simple queries, one to look up each of my three thing_ values (select desc from dt where id = ) but I was hoping to do it all in one query.
Is there a way to do this?
Even better, is there way to do my query to get my single row of thing id's and combine them with their descriptions? I think the fundamental problem/challenge is that my thing id's are not one per row, but that they come back as fields in just one row. This makes it really hard to join against them, for example.
Michael

You seem to want conditional aggregation:
select
max(case when id = 3 then descr end) descr_3,
max(case when id = 12 then descr end) descr_12,
max(case when id = 9 then descr end) descr_9
from dt
where id in (3, 12, 9)
Note that desc is a SQL keyword, hence a poor choice for a column name. I renamed it descr in the query.

You will need multiple joins to the dt table to get the description of each of the "things" you want in a single row:
SELECT thing_a_id, dta.desc AS thing_a_desc,
thing_b_id, dtb.desc AS thing_b_desc,
thing_c_id, dtc.desc AS thing_c_desc
FROM wt
JOIN dt dta ON dta.id = wt.thing_a_id
JOIN dt dtb ON dtb.id = wt.thing_b_id
JOIN dt dtc ON dtc.id = wt.thing_c_id

I love to play with common table expressions (CTE), this is an ideal candidate for one.
In the example below, decriptions and dataset are substitutes for the actual tables you use. I am just building them in memory rather than an actual table.
In the "breakdown" CTE I am splitting up the CSV value from dataset into multiple rows.
In the last part of the select I am converting everything after the = sign to a number, and then matching that on id from the descriptions CTE. The resulting dataset is I believe what you requested.
WITH
descriptions AS
(SELECT 5 AS id, 'flour' AS description FROM DUAL
UNION ALL
SELECT 12 AS id, 'cups' AS description FROM DUAL
UNION ALL
SELECT 9 AS id, 'barley' AS description FROM DUAL),
dataset AS
(SELECT 'wt.thing_a_id = 5, wt.thing_b_id = 12, wt.thing_c_id = 9' AS result FROM DUAL),
breakdown ( result, REMAINDER ) AS
(SELECT TRIM( SUBSTR( result
, 1
, INSTR( result || ',', ',' ) - 1 ) ) AS result
, TRIM( SUBSTR( result, INSTR( result || ',', ',' ) + 1 ) || ',' ) AS REMAINDER
FROM dataset
UNION ALL
SELECT TRIM( SUBSTR( REMAINDER
, 1
, INSTR( REMAINDER, ',' ) - 1 ) )
, SUBSTR( REMAINDER, INSTR( REMAINDER || ',', ',' ) + 1 ) AS REMAINDER
FROM breakdown
WHERE REMAINDER IS NOT NULL)
SELECT result, TO_NUMBER( TRIM( SUBSTR( result, INSTR( result, '=' ) + 1 ) ) ) AS id, description
FROM breakdown
LEFT OUTER JOIN descriptions
ON TO_NUMBER( TRIM( SUBSTR( breakdown.result, INSTR( breakdown.result, '=' ) + 1 ) ) ) =
descriptions.id
Results:
Result ID DESCRIPTION
wt.thing_a_id = 5 5 flour
wt.thing_b_id = 12 12 cups
wt.thing_c_id = 9 9 barley

Related

Count and order comma separated values

I have the below one column "table" (apologies for the data model, not my fault :():
COL_IN
------
2K, E
E, 2K
O
I would like to obtain the below output, ordered by count descending:
COL_OUT COUNT
----------
K 4
E 2
O 1
COUNT is a reserved keyword, so it's not a good column name - even in the final output. I use COUNT_ instead (with an underscore).
Other than that, you can modify the input strings so they become valid JSON arrays, so that you can then use JSON functions to split them. After you split the strings into tokens, it's a simple matter to separate the leading number (if present) from the rest of the string, and to aggregate. NVL in the sum adds 1 for each token without a leading integer.
Including the sample data for testing only (if you have an actual table, remove the WITH clause at the top):
with
tbl (col_in) as (
select '2K, E' from dual union all
select 'E, 2K' from dual union all
select 'O' from dual
)
select ltrim(col, '0123456789') as col_out
, sum(nvl(to_number(regexp_substr(col, '^\d*')), 1)) as count_
from tbl,
json_table('["' || regexp_replace(col_in, ', *', '","') || '"]', '$[*]'
columns col path '$')
group by ltrim(col, '0123456789')
order by count_ desc, col_out
;
COL_OUT COUNT_
------- ------
K 4
E 2
O 1
You can use hierarchical query in such a way that
WITH t2 AS
(
SELECT TRIM(REGEXP_SUBSTR(col_in,'[^,]+',1,level)) AS s
FROM t
CONNECT BY level <= REGEXP_COUNT(col_in,',')+1
AND PRIOR SYS_GUID() IS NOT NULL
AND PRIOR col_in = col_in
)
SELECT REGEXP_SUBSTR(s,'[^0-9]') AS col_out,
SUM(NVL(REGEXP_SUBSTR(s,'[^[:alpha:]]'),1)) AS count
FROM t2
GROUP BY REGEXP_SUBSTR(s,'[^0-9]'),REGEXP_SUBSTR(s,'[^[:alpha:]]')
ORDER BY count DESC
presuming all of the data are alphanumeric only(eg.not containing special charaters such as $,#,! ..etc.)

Oracle SQL - Combining columns with 'OR' bit function

Oracle 12.2 - I have a table with 3 columns... ID, ParentID and ProductList. ID is unique, with multiple IDs rolling up to a ParentID. (this is a account model... basically multiple accounts have the same parent...) ProductList is a string...also exactly 20 bytes... right now it is 20 letters of 'Y' and 'N', such as YYNYNYNYNNNY... but I can change the 'Y' and 'N' to 1 and 0 if it will help... what I need to do is within a group of ParentID, calculate a bitwise OR of the ProductList. The end result I need is a 20 byte string (or some type of 20 bits of data) that says - for each respective letter/bit - if any 'Y' then return 'Y'. Again, I can use 1/0 if easier than Y/N.
Here is pseudoCode of what I am trying to do... Any help appreciated.
with T1 as
(
select 10 as ID, 20 as ParentID, 'YYNNYNYNYNYYNNYNYNYN' as ProductList from dual
union
select 11 as ID, 20 as ParentID, 'NNNNNNNNNNYYYYYYYYYY' as ProductList from dual
union
select 22 as ID, 20 as ParentID, 'YYNNNNNNNNNNNNNNNNNN' as ProductList from dual
)
SELECT ParentID, BitWiseOr(ProductList) FROM t1
group by ParentID;
You can use the brute force method of taking the maximum of each character and then using ||:
SELECT ParentID,
(max(substr(productlist, 1, 1)) ||
max(substr(productlist, 2, 1)) ||
max(substr(productlist, 3, 1)) ||
. . .
max(substr(productlist, 20, 1)) ||
)
FROM t1
GROUP BY ParentID;
This works because 'Y' > 'N'.
Note: This is a lousy data model. You should have a separate table with one row per id and product.
You can destruct string to atomic values, compute result of or operation and assemble it back into string. (Credit to #GordonLinoff for Y>N trick.) dbfiddle here.
Unfortunately, Oracle does not allow something like unpivot (val FOR substring(ProductList,i,1 in ... and also Oracle does not have equivalent to Postgres bool_or, which would both made solution simpler. At least this solution scales with ProductList length.
Anyway you should avoid violating 1st normal form. If you cannot, it IMHO does not matter how boolean is modelled.
with T1 as
(
select 10 as ID, 20 as ParentID, 'YYNNYNYNYNYYNNYNYNYN' as ProductList from dual
union
select 11 as ID, 20 as ParentID, 'NNNNNNNNNNYYYYYYYYYY' as ProductList from dual
union
select 22 as ID, 20 as ParentID, 'YYNNNNNNNNNNNNNNNNNN' as ProductList from dual
), series (i) as (
select level as i
from dual
connect by level <= 20
), applied_or as (
select t1.parentid
, max(substr(t1.productlist, series.i, 1)) as or_result
, series.i
from t1
cross join series
group by t1.parentid, series.i
)
select parentid
, listagg(or_result) within group (order by i)
from applied_or
group by parentid

Split a Column with Delimited Values and Compare Each Value

I have a column that contains multiple values in a delimited(comma-separated) format -
id | code
------------
1 11,19,21
2 55,87,33
3 3,11
4 11
I want to be able to compare to each value inside the 'code' column as below -
SELECT id FROM myTbl WHERE code = '11'
This should return -
1
3
4
I've tried the solution below but it does not work for all cases -
SELECT id FROM myTbl WHERE POSITION('11' IN code) <> 0
This will work with a 2 digit number like '11' as it will return a value that is <> 0 if it finds a match. But it will fail when searching for say '3' because rows with 'id' 2 and 3 both will be returned.
Here is link that talks about the POSITION function in REDSHIFT.
Any other approach that will solve this problem?
you can get the count of this string
SELECT id FROM myTbl WHERE regexp_count(user_action, '[11]') > 0
I think we can use regexp_substr() as follow.
select tb .id from myTbl tb where '11' in (
select regexp_substr( (select code from myTbl where id=tb.id),'[^,]+', 1, LEVEL) from dual
connect by regexp_substr((select code from myTbl where id=tb.id) , '[^,]+', 1, LEVEL) is not null);
just try this.
Use split_part() function
SELECT distinct id
FROM myTbl
WHERE '11' in ( split_part( code||',' , ',', 1 ),
split_part( code||',' , ',', 2 ),
split_part( code||',' , ',', 3 ) )
This is a very, very bad data model. You should be storing this information in a junction/association table, with one row per value.
But, if you have no choice, you can use like:
SELECT id
FROM myTbl
WHERE ',' || code || ',' LIKE '%,11,%';

Finding out the highest number in a comma separated string using Oracle SQL

I have a table with two columns:
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,25,26,24 |1,26,24,25
1,56,55,54 |1,55,54
1 |1
1,2 |1
1,96,95,94 |1,96,94,95
1 |1
1 |1
1 |1
1 |1
1,2 |1,2
1 |1
1 |1
1 |1
1 |1
For each row there will be a list of revisions for a document (comma separated)
The comma separated list might be the same in both columns but the order/sort might be different - e.g.
2,1 |1,2
I would like to find all the instances where the highest revision in the OLD_REVISIONS column is lower than than the highest revision in NEW_REVISIONS
The following would fit that criteria
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,2 |1
1,56,55,54 |1,55,54
I tried a solution using the MINUS option (joining the table to itself) but it returns differences even for when the list is the same but in the wrong order
I tried the function GREATEST (i.e where greatest(new_Revisions) < greatest(old_revisions)) but i am not sure why greatest(OLD_REVISIONS) always just returns the comma separated value. It does not return the max value. I suspect it is comparing strings because the columns are VARCHAR.
Also, MAX function expects a single number.
Is there another way i can achieve the above? I am looking for a pure SQL option so i can print out the results (or a PL/SQL option that can print out the results)
Edit
Apologies for not mentioning this but for the NEW_REVISIONS i do actually have the data in a table where each revision is in a separate row:
"DOCNUMBER" "REVISIONNUMBER"
67 1
67 24
67 25
67 26
75 1
75 54
75 55
75 56
78 1
79 1
79 2
83 1
83 96
83 94
Just to give some content, a few weeks ago i suspected that there are revisions disappearing.
To investigate this, i decided to take a count of all revisions for all documents and take a snapshot to compare later to see if revisions are indeed missing.
The snapshot that i took contained the following columns:
docnumber, count, revisions
The revisions were stored in a comma separated list using the listagg function.
The trouble i have now is the on live table, new revisions have been added so when i compare the main table and the snapshot using a MINUS i get a difference because
of the new revisions in the main table.
Even though in the actual table the revisions are individual rows, in the snapshot table i dont have the individual rows.
I am thinking the only way to recreate the snapshot in the same format and compare them find out if maximum revision in the main table is lower than the max revision in the snapshot table (hence why im trying to find out how to find out the max in a comma separated string)
Enjoy.
select xmlcast(xmlquery(('max((' || OLD_REVISIONS || '))') RETURNING CONTENT) as int) as OLD_REVISIONS_max
,xmlcast(xmlquery(('max((' || NEW_REVISIONS || '))') RETURNING CONTENT) as int) as NEW_REVISIONS_max
from t
;
Assuming your base table has an id column (versions of what?) - here is a solution based on splitting the rows.
Edit: If you like this solution, check out vkp's solution, which is better than mine. I explain why his solution is better in a Comment to his Answer.
with
t ( id, old_revisions, new_revisions ) as (
select 101, '1,25,26,24', '1,26,24,25' from dual union all
select 102, '1,56,55,54', '1,55,54' from dual union all
select 103, '1' , '1' from dual union all
select 104, '1,2' , '1' from dual union all
select 105, '1,96,95,94', '1,96,94,95' from dual union all
select 106, '1' , '1' from dual union all
select 107, '1' , '1' from dual union all
select 108, '1' , '1' from dual union all
select 109, '1' , '1' from dual union all
select 110, '1,2' , '1,2' from dual union all
select 111, '1' , '1' from dual union all
select 112, '1' , '1' from dual union all
select 113, '1' , '1' from dual union all
select 114, '1' , '1' from dual
)
-- END of TEST DATA; the actual solution (SQL query) begins below.
select id, old_revisions, new_revisions
from (
select id, old_revisions, new_revisions, 'old' as flag,
to_number(regexp_substr(old_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(old_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
union all
select id, old_revisions, new_revisions, 'new' as flag,
to_number(regexp_substr(new_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(new_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
)
group by id, old_revisions, new_revisions
having max(case when flag = 'old' then rev_no end) !=
max(case when flag = 'new' then rev_no end)
order by id -- ORDER BY is optional
;
ID OLD_REVISION NEW_REVISION
--- ------------ ------------
102 1,56,55,54 1,55,54
104 1,2 1
You can compare every value by putting together the revisions in the same order using listagg function.
SELECT listagg(o,',') WITHIN GROUP (ORDER BY o) old_revisions,
listagg(n,',') WITHIN GROUP (ORDER BY n) new_revisions
FROM (
SELECT DISTINCT rowid r,
regexp_substr(old_revisions, '[^,]+', 1, LEVEL) o,
regexp_substr(new_revisions, '[^,]+', 1, LEVEL) n
FROM table
WHERE regexp_substr(old_revisions, '[^,]+', 1, LEVEL) IS NOT NULL
CONNECT BY LEVEL<=(SELECT greatest(MAX(regexp_count(old_revisions,',')),MAX(regexp_count(new_revisions,',')))+1 c FROM table)
)
GROUP BY r
HAVING listagg(o,',') WITHIN GROUP (ORDER BY o)<>listagg(n,',') WITHIN GROUP (ORDER BY n);
This could be a way:
select
OLD_REVISIONS,
NEW_REVISIONS
from
REVISIONS t,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.OLD_REVISIONS, '[^,]+')) + 1
) as sys.OdciNumberList
)
) levels_old,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.NEW_REVISIONS, '[^,]+')) + 1
)as sys.OdciNumberList
)
) levels_new
group by t.ROWID,
OLD_REVISIONS,
NEW_REVISIONS
having max(to_number(trim(regexp_substr(t.OLD_REVISIONS, '[^,]+', 1, levels_old.column_value)))) >
max(to_number(trim(regexp_substr(t.new_REVISIONS, '[^,]+', 1, levels_new.column_value))))
This uses a double string split to pick the values from every field, and then simply finds the rows where the max values among the two collections match your requirement.
You should edit this by adding some unique key in the GROUP BYclause, or a rowid if you don't have any unique key on your table.
One way to do is to split the columns on comma separation using regexp_substr and checking if the max and min values are different.
Sample Demo
with rownums as (select t.*,row_number() over(order by old_revisions) rn from t)
select old_revisions,new_revisions
from rownums
where rn in (select rn
from rownums
group by rn
connect by regexp_substr(old_revisions, '[^,]+', 1, level) is not null
or regexp_substr(new_revisions, '[^,]+', 1, level) is not null
having max(cast(regexp_substr(old_revisions,'[^,]+', 1, level) as int))
<> max(cast(regexp_substr(new_revisions,'[^,]+', 1, level) as int))
)
Comments say normalise data. I agree but also I understand it may be not possible. I would try something like query below:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select regexp_substr(v1,'[^,]+', 1, level) val, rowid r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by r) t1
inner join (
select max(val) val2, r from (
select regexp_substr(v2,'[^,]+', 1, level) val, rowid r from tab1
connect by regexp_substr(v2, '[^,]+', 1, level) is not null
) group by r) t2
on (t1.r = t2.r);
Tested on:
create table tab1 (v1 varchar2(100), v2 varchar2(100));
insert into tab1 values ('1,3,5','1,4,7');
insert into tab1 values ('1,3,5','1,2,9');
insert into tab1 values ('1,3,5','1,3,5');
insert into tab1 values ('1,3,5','1,4');
and seems to work fine. I left rowid for reference. I guess you have some id in table.
After your edit I would change query to:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select regexp_substr(v1,'[^,]+', 1, level) val, DOCNUMBER r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by DOCNUMBER) t1
inner join (
select max(DOCNUMBER) val2, DOCNUMBER r from NEW_REVISIONS) t2
on (t1.r = t2.r);
You may write a PL/SQL function parsing the string and returning the maximal number
select max_num( '1,26,24,25') max_num from dual;
MAX_NUM
----------
26
The query ist than very simple:
select OLD_REVISIONS NEW_REVISIONS
from revs
where max_num(OLD_REVISIONS) < max_num(NEW_REVISIONS);
A prototyp function without validation and error handling
create or replace function max_num(str_in VARCHAR2) return NUMBER as
i number;
x varchar2(1);
n number := 0;
max_n number := 0;
pow number := 0;
begin
for i in 0.. length(str_in)-1 loop
x := substr(str_in,length(str_in)-i,1);
if x = ',' then
-- check max number
if n > max_n then
max_n := n;
end if;
-- reset
n := 0;
pow := 0;
else
n := n + to_number(x)*power(10,pow);
pow := pow +1;
end if;
end loop;
return(max_n);
end;
/

Concatenate results from a SQL query in Oracle

I have data like this in a table
NAME PRICE
A 2
B 3
C 5
D 9
E 5
I want to display all the values in one row; for instance:
A,2|B,3|C,5|D,9|E,5|
How would I go about making a query that will give me a string like this in Oracle? I don't need it to be programmed into something; I just want a way to get that line to appear in the results so I can copy it over and paste it in a word document.
My Oracle version is 10.2.0.5.
-- Oracle 10g --
SELECT deptno, WM_CONCAT(ename) AS employees
FROM scott.emp
GROUP BY deptno;
Output:
10 CLARK,MILLER,KING
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD
I know this is a little late but try this:
SELECT LISTAGG(CONCAT(CONCAT(NAME,','),PRICE),'|') WITHIN GROUP (ORDER BY NAME) AS CONCATDATA
FROM your_table
Usually when I need something like that quickly and I want to stay on SQL without using PL/SQL, I use something similar to the hack below:
select sys_connect_by_path(col, ', ') as concat
from
(
select 'E' as col, 1 as seq from dual
union
select 'F', 2 from dual
union
select 'G', 3 from dual
)
where seq = 3
start with seq = 1
connect by prior seq+1 = seq
It's a hierarchical query which uses the "sys_connect_by_path" special function, which is designed to get the "path" from a parent to a child.
What we are doing is simulating that the record with seq=1 is the parent of the record with seq=2 and so fourth, and then getting the full path of the last child (in this case, record with seq = 3), which will effectively be a concatenation of all the "col" columns
Adapted to your case:
select sys_connect_by_path(to_clob(col), '|') as concat
from
(
select name || ',' || price as col, rownum as seq, max(rownum) over (partition by 1) as max_seq
from
(
/* Simulating your table */
select 'A' as name, 2 as price from dual
union
select 'B' as name, 3 as price from dual
union
select 'C' as name, 5 as price from dual
union
select 'D' as name, 9 as price from dual
union
select 'E' as name, 5 as price from dual
)
)
where seq = max_seq
start with seq = 1
connect by prior seq+1 = seq
Result is: |A,2|B,3|C,5|D,9|E,5
As you're in Oracle 10g you can't use the excellent listagg(). However, there are numerous other string aggregation techniques.
There's no particular need for all the complicated stuff. Assuming the following table
create table a ( NAME varchar2(1), PRICE number);
insert all
into a values ('A', 2)
into a values ('B', 3)
into a values ('C', 5)
into a values ('D', 9)
into a values ('E', 5)
select * from dual
The unsupported function wm_concat should be sufficient:
select replace(replace(wm_concat (name || '#' || price), ',', '|'), '#', ',')
from a;
REPLACE(REPLACE(WM_CONCAT(NAME||'#'||PRICE),',','|'),'#',',')
--------------------------------------------------------------------------------
A,2|B,3|C,5|D,9|E,5
But, you could also alter Tom Kyte's stragg, also in the above link, to do it without the replace functions.
Here is another approach, using model clause:
-- sample of data from your question
with t1(NAME1, PRICE) as(
select 'A', 2 from dual union all
select 'B', 3 from dual union all
select 'C', 5 from dual union all
select 'D', 9 from dual union all
select 'E', 5 from dual
) -- the query
select Res
from (select name1
, price
, rn
, res
from t1
model
dimension by (row_number() over(order by name1) rn)
measures (name1, price, cast(null as varchar2(101)) as res)
(res[rn] order by rn desc = name1[cv()] || ',' || price[cv()] || '|' || res[cv() + 1])
)
where rn = 1
Result:
RES
----------------------
A,2|B,3|C,5|D,9|E,5|
SQLFiddle Example
Something like the following, which is grossly inefficient and untested.
create function foo returning varchar2 as
(
declare bar varchar2(8000) --arbitrary number
CURSOR cur IS
SELECT name,price
from my_table
LOOP
FETCH cur INTO r;
EXIT WHEN cur%NOTFOUND;
bar:= r.name|| ',' ||r.price || '|'
END LOOP;
dbms_output.put_line(bar);
return bar
)
Managed to get till here using xmlagg: using oracle 11G from sql fiddle.
Data Table:
COL1 COL2 COL3
1 0 0
1 1 1
2 0 0
3 0 0
3 1 0
SELECT
RTRIM(REPLACE(REPLACE(
XMLAgg(XMLElement("x", col1,',', col2, col3)
ORDER BY col1), '<x>'), '</x>', '|')) AS COLS
FROM ab
;
Results:
COLS
1,00| 3,00| 2,00| 1,11| 3,10|
* SQLFIDDLE DEMO
Reference to read on XMLAGG