How to get unique data from multiple columns in Db2 (SQL)

I want to combine the data from two columns in the following way:
Id1 id2 id3
1 1 2
2 3 null
2 4 null
Output:
Id1 data
1 1,2
2 3,4
Here id1 is the primary key, and id2 and id3 are foreign keys to another table.

Try this as is:
WITH TAB (ID1, ID2, ID3) AS
(
VALUES
(1, 1, 2)
, (2, 3, NULL)
, (2, 4, NULL)
)
SELECT ID1, LISTAGG(DISTINCT ID23, ',') AS DATA
FROM
(
SELECT T.ID1, CASE V.ID WHEN 2 THEN T.ID2 ELSE T.ID3 END AS ID23
FROM TAB T
CROSS JOIN (VALUES 2, 3) V(ID)
)
WHERE ID23 IS NOT NULL
GROUP BY ID1;

This is a bit strange -- concatenating both within the same row and across multiple rows. One method is to unpivot and then aggregate:
select id1, listagg(id2, ',') within group (order by id2)
from (select id1, id2 from t union all
select id1, id3 from t
) t
where id2 is not null
group by id1;
Assuming that only id3 could be NULL, you can also express this as:
select id1,
listagg(concat(id2, coalesce(concat(',', id3), '')), ',') within group (order by id2)
from t
group by id1;
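To make the unpivot-then-aggregate idea concrete, here is a minimal plain-Python sketch of the same logic. It is not Db2 code; the function name `merge_columns` is made up for illustration, `None` stands in for SQL NULL, and the set mirrors the DISTINCT used in the first query.

```python
from collections import defaultdict

# Sample rows from the question: (id1, id2, id3); None stands in for SQL NULL.
rows = [(1, 1, 2), (2, 3, None), (2, 4, None)]

def merge_columns(rows):
    """Unpivot id2/id3 into one value column, drop NULLs, then join per id1."""
    grouped = defaultdict(set)
    for id1, id2, id3 in rows:
        for value in (id2, id3):          # the "unpivot" step
            if value is not None:         # WHERE ... IS NOT NULL
                grouped[id1].add(value)   # a set keeps only distinct values
    # Sort the values so the joined string is deterministic,
    # similar to WITHIN GROUP (ORDER BY ...) in LISTAGG.
    return {
        key: ",".join(str(v) for v in sorted(values))
        for key, values in sorted(grouped.items())
    }

result = merge_columns(rows)
print(result)
```

For the sample data this should give `1,2` for id1 = 1 and `3,4` for id1 = 2, which is the output asked for in the question.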

Related

Merge Columns in Oracle with distinct values

Need help to merge columns in Oracle with distinct values.
I've one table called TEST with below data.
ID ID1 ID2 ID3
1 A B C
1 B P A
2 X Y Z
2 Y Z K
Need output as follows
ID MergedValues
1 A;B;C;P
2 X;Y;Z;K
This solution is close:
SELECT id, listagg(v, ';') WITHIN GROUP (ORDER BY v) AS MergedValues
FROM (
SELECT id, id1 AS v
FROM test
UNION
SELECT id, id2 AS v
FROM test
UNION
SELECT id, id3 AS v
FROM test
) t
GROUP BY id
It does not retain the order of encounter of MergedValues as you seem to have requested implicitly, but produces this:
| ID | MERGEDVALUES |
|----|--------------|
| 1 | A;B;C;P |
| 2 | K;X;Y;Z |
You can unpivot the columns into rows, and find the distinct values to remove duplicates:
select distinct id, val
from test
unpivot (val for pos in (id1 as 1, id2 as 2, id3 as 3));
And then apply listagg() to that:
select id,
listagg(val, ';') within group (order by val) as mergedvalues
from (
select distinct id, val
from test
unpivot (val for pos in (id1 as 1, id2 as 2, id3 as 3))
)
group by id
order by id;
With your sample data as a CTE:
with test (ID, ID1, ID2, ID3) as (
select 1, 'A', 'B', 'C' from dual
union all select 1, 'B', 'P', 'A' from dual
union all select 2, 'X', 'Y', 'Z' from dual
union all select 2, 'Y', 'Z', 'K' from dual
)
select id,
listagg(val, ';') within group (order by val) as mergedvalues
from (
select distinct id, val
from test
unpivot (val for pos in (id1 as 1, id2 as 2, id3 as 3))
)
group by id
order by id;
ID MERGEDVALUES
---------- ------------------------------
1 A;B;C;P
2 K;X;Y;Z
If the order within the list needs to match what you showed then it seems almost to be based on the first column the value was seen in, so you can do:
select id,
listagg(val, ';') within group (order by min_pos) as mergedvalues
from (
select id, val, min(pos) as min_pos
from test
unpivot (val for pos in (id1 as 1, id2 as 2, id3 as 3))
group by id, val
)
group by id
order by id;
ID MERGEDVALUES
---------- ------------------------------
1 A;B;P;C
2 X;Y;Z;K
which is closer but has C and P reversed; it isn't clear what should control that. Perhaps there is another column you haven't shown which implies a row order.
Here's my approach:
(Note: after posting, I see this resembles Alex Poole's approach, except that I order the input rows first.)
Order the input rows within each ID: you don't say how, I order by ID1,ID2,ID3
Unpivot the data, assigning numbers from 1 to 3 to the columns
Assign priorities to each value based on row order then column order
When a value appears more than once, keep only the minimum "priority"
Use LISTAGG, ordering by priority.
with data_with_rn as (
select t.*,
row_number() over(partition by id order by ID1,ID2,ID3) rn
from t
)
, unpivoted as (
select id, val,
row_number() over(partition by id order by rn, col) priority
from data_with_rn
unpivot(val for col in(ID1 as 1, ID2 as 2, ID3 as 3))
)
, grouped as (
select id, val, min(priority) priority
from unpivoted
group by id, val
)
select id, listagg(val, ';') within group(order by priority) vals
from grouped
group by id
order by id;
ID VALS
-- --------
1 A;B;C;P
2 X;Y;Z;K
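The five steps above can also be sketched in plain Python, which may help when checking the expected order by hand. This is only an illustration under the same assumption as the query (rows within an ID are ordered by ID1, ID2, ID3); the function name `merge_in_encounter_order` is hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (id, id1, id2, id3).
rows = [
    (1, "A", "B", "C"),
    (1, "B", "P", "A"),
    (2, "X", "Y", "Z"),
    (2, "Y", "Z", "K"),
]

def merge_in_encounter_order(rows):
    """Join the distinct values per id, ordered by first appearance."""
    by_id = defaultdict(list)
    for id_, *values in rows:
        by_id[id_].append(tuple(values))

    merged = {}
    for id_, group in sorted(by_id.items()):
        seen = {}  # a dict remembers first-insertion order
        # Row order first (sorted by id1, id2, id3), then column order.
        for row in sorted(group):
            for value in row:
                # Keeping only the first occurrence matches MIN(priority).
                seen.setdefault(value, None)
        merged[id_] = ";".join(seen)
    return merged

result = merge_in_encounter_order(rows)
print(result)
```

With the sample data this produces `A;B;C;P` for ID 1 and `X;Y;Z;K` for ID 2, the same lists as the query output above.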

Select a subcategory ID to associate with a primary ID based off which has the highest sum

I have a primary ID, ID1, and a secondary ID, ID2. ID1 can be associated with multiple ID2 values, and vice versa. I want to sum a third Values column by ID2 under each ID1, and pull the ID2 with the highest sum. The source data is structured like:
ID1 ID2 Value
1 10 1
1 10 2
1 20 1
2 10 1
2 30 2
And I want the final results to look like:
ID1 ID2
1 10
2 30
So far, I only have a nonfunctioning query:
SELECT ID1,
CASE WHEN ID2_Value = MAX(ID2_Value) THEN ID2
ELSE NULL
END AS PrimaryID2
FROM ( SELECT ID1,
ID2,
SUM(Value) AS ID2_Value
FROM SOME_SCHEMA
GROUP BY ID1, ID2
) AS ID2_Value
GROUP BY ID1;
My query doesn't work right now because it expects me to include ID2_Value in the GROUP BY clause, but I don't want to group by those values.
I would use row_number():
select id1, id2
from (select id1, id2, sum(value) as sumv,
row_number() over (partition by id1 order by sum(value) desc) as seqnum
from t
group by id1, id2
) t
where seqnum = 1;
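As a rough cross-check of what the row_number() query computes, here is the same logic in plain Python: sum the values per (id1, id2) pair, then keep the id2 with the largest sum for each id1. The function name `top_id2_per_id1` is invented for this sketch, and on a tie it keeps whichever pair it met first, whereas row_number() without a tie-breaker picks one arbitrarily.

```python
from collections import defaultdict

# Sample rows from the question: (id1, id2, value).
rows = [(1, 10, 1), (1, 10, 2), (1, 20, 1), (2, 10, 1), (2, 30, 2)]

def top_id2_per_id1(rows):
    """Return {id1: id2} where id2 has the highest summed value for that id1."""
    sums = defaultdict(int)
    for id1, id2, value in rows:
        sums[(id1, id2)] += value          # GROUP BY id1, id2 with SUM(value)

    best = {}
    for (id1, id2), total in sums.items():
        # Keep the pair with the largest sum per id1 (the seqnum = 1 row).
        if id1 not in best or total > best[id1][1]:
            best[id1] = (id2, total)
    return {id1: id2 for id1, (id2, _) in sorted(best.items())}

result = top_id2_per_id1(rows)
print(result)
```

For the sample data the sums are 3 and 1 under ID1 = 1 and 1 and 2 under ID1 = 2, so the result pairs 1 with 10 and 2 with 30.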

Recursive SQL retrieve all levels

I am unable to retrieve the desired result from my query when using Oracle's recursive approach:
Foo
ID1 ID2
1 2
1 3
4 2
4 3
4 5
Query:
select sys_connect_by_path(id2,' -> ')
FROM Foo
START WITH id1 = 1
CONNECT BY PRIOR id1 = id2
ORDER BY 1;
Outputs only level 1 hierarchy (2,3). I want it to detect the tree ( 1 -> (2,3) -> 4 -> 5 ), such that selecting distinct ID2 yields (2,3,5). Thank you.
If you are using Oracle 11.2 or above, you can use a recursive CTE (common table expression) instead of Oracle's CONNECT BY clause.
WITH
aset -- Create pseudo table with ID2 as ID1 and vice versa
AS
(SELECT id1, id2
FROM (SELECT id1, id2
FROM foo
UNION
SELECT id2, id1
FROM foo)
WHERE id1 < id2),
bset (id1, id2) -- Extract hierarchy from pseudo table
AS
(SELECT id1, id2
FROM aset
WHERE id1 = 1
UNION ALL
SELECT aset.id1, aset.id2
FROM bset INNER JOIN aset ON bset.id2 = aset.id1
WHERE bset.id1 <> aset.id2)
SELECT DISTINCT bset.id2 -- Only keep values that were originally ID2
FROM bset INNER JOIN foo ON bset.id2 = foo.id2
ORDER BY id2;
Here is the same thing using CONNECT BY:
WITH
aset
-- Create pseudo table with ID2 as ID1 and vice versa
AS
(SELECT id1, id2
FROM (SELECT id1, id2
FROM foo
UNION
SELECT id2, id1
FROM foo)
WHERE id1 < id2),
bset
-- Extract hierarchy from pseudo table
AS
( SELECT id2
FROM aset
START WITH id1 = 1
CONNECT BY PRIOR id2 = id1)
SELECT DISTINCT bset.id2
-- Only keep values that were originally ID2
FROM bset INNER JOIN foo ON bset.id2 = foo.id2
ORDER BY id2
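The idea behind both queries can be sketched in plain Python: orient every pair so the smaller value comes first (the aset step), walk outward from the start value (the bset step), and keep only values that were originally in the ID2 column. This is an illustrative sketch rather than a translation of Oracle semantics, and the function name `reachable_id2` is hypothetical.

```python
from collections import defaultdict

# Sample rows of Foo from the question: (id1, id2).
foo = [(1, 2), (1, 3), (4, 2), (4, 3), (4, 5)]

def reachable_id2(foo, start):
    """Walk the links from `start` and return the reachable original id2 values."""
    # aset: treat each pair as undirected, then orient it small -> large.
    edges = {(min(a, b), max(a, b)) for a, b in foo if a != b}
    children = defaultdict(set)
    for low, high in edges:
        children[low].add(high)

    # bset: follow the oriented links starting at `start`.
    visited = set()
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for child in children[node]:
            if child not in visited:
                visited.add(child)
                frontier.append(child)

    # Only keep values that were originally in the id2 column.
    original_id2 = {b for _, b in foo}
    return sorted(visited & original_id2)

result = reachable_id2(foo, 1)
print(result)
```

Starting from 1, the walk reaches 2, 3, 4, and 5; dropping 4 (which only ever appears as ID1) leaves 2, 3, 5, as requested in the question.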

Oracle 11.2 SQL - help to condense data in ordered set

I have a data-set with a timestamp column and multiple identifier columns. I want to condense it to a single row for each "block" of adjacent rows with equal identifiers, when ordered by the timestamp. The min and max timestamp for each block is required.
Source Data:
TSTAMP ID1 ID2
t1 A B <= start of new block
t2 A B
t3 C D <= start of new block
t4 E F <= start of new block
t5 E F
t6 E F
t7 A B <= start of new block
t8 G H <= start of new block
Desired Result:
MIN_TSTAMP MAX_TSTAMP ID1 ID2
t1 t2 A B
t3 t3 C D
t4 t6 E F
t7 t7 A B
t8 t8 G H
I thought this was ripe for a window-ing analytic function but I cannot partition without grouping ALL equal combinations of IDn - rather than only those in adjacent rows, when ordered by timestamp.
A workaround is to create a key column first in an in-line view that I can later group by i.e. with same value for each row in the block and different value for each block. I can do this using LAG analytic function to compare row values and then calling a PL/SQL function to return nextval/currval values of a sequence (calling nextval/currval directly in the SQL is restricted in this context).
select min(ilv.tstamp), max(ilv.tstamp), id1, id2
from (
select case when (id1 != lag(id1,1,'*') over (partition by (1) order by tstamp)
or id2 != lag(id2,1,'*') over (partition by (1) order by tstamp))
then
pk_seq_utils.gav_get_nextval
else
pk_seq_utils.gav_get_currval
end ident, t.*
from tab1 t
order by tstamp) ilv
group by ident, id1, id2
order by 1;
where the gav_get_xxx functions simply return currval/nextval from a sequence.
But I would like to use SQL only and avoid PL/SQL (as I could also write this easily in PL/SQL and pipe out the result-rows from a pipeline function).
Any ideas?
Thanks.
Tabibitosan to the rescue!
with sample_data as (select 't1' tstamp, 'A' id1, 'B' id2 from dual union all
select 't2' tstamp, 'A' id1, 'B' id2 from dual union all
select 't3' tstamp, 'C' id1, 'D' id2 from dual union all
select 't4' tstamp, 'E' id1, 'F' id2 from dual union all
select 't5' tstamp, 'E' id1, 'F' id2 from dual union all
select 't6' tstamp, 'E' id1, 'F' id2 from dual union all
select 't7' tstamp, 'A' id1, 'B' id2 from dual union all
select 't8' tstamp, 'G' id1, 'H' id2 from dual)
select min(tstamp) min_tstamp, max(tstamp) max_tstamp, id1, id2
from (select tstamp,
id1,
id2,
row_number() over (order by tstamp) - row_number() over (partition by id1, id2 order by tstamp) grp
from sample_data)
group by id1,
id2,
grp
order by min(tstamp);
MIN_TSTAMP MAX_TSTAMP ID1 ID2
---------- ---------- --- ---
t1 t2 A B
t3 t3 C D
t4 t6 E F
t7 t7 A B
t8 t8 G H
You can use an analytic 'trick' to identify the gaps and islands: compare each row's position when ordered by tstamp across all rows with its position when ordered by tstamp within just its id1, id2 combination:
select tstamp, id1, id2,
row_number() over (partition by id1, id2 order by tstamp)
- row_number() over (order by tstamp) as block_id
from tab1;
TS I I BLOCK_ID
-- - - ----------
t1 A B 0
t2 A B 0
t3 C D -2
t4 E F -3
t5 E F -3
t6 E F -3
t7 A B -4
t8 G H -7
The actual value of block_id doesn't matter, just that it's unique for each block for the combination. You can then group using that:
select min(tstamp) as min_tstamp, max(tstamp) as max_tstamp, id1, id2
from (
select tstamp, id1, id2,
row_number() over (partition by id1, id2 order by tstamp)
- row_number() over (order by tstamp) as block_id
from tab1
)
group by id1, id2, block_id
order by min(tstamp);
MI MA I I
-- -- - -
t1 t2 A B
t3 t3 C D
t4 t6 E F
t7 t7 A B
t8 t8 G H
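If it helps to see why the difference of the two row numbers stays constant inside a block, here is a small plain-Python sketch of the same arithmetic. It assumes tstamp values are unique and sort correctly as given; the function name `condense` is made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (tstamp, id1, id2).
rows = [
    ("t1", "A", "B"), ("t2", "A", "B"), ("t3", "C", "D"), ("t4", "E", "F"),
    ("t5", "E", "F"), ("t6", "E", "F"), ("t7", "A", "B"), ("t8", "G", "H"),
]

def condense(rows):
    """Collapse adjacent rows with equal (id1, id2) into (min_ts, max_ts, id1, id2)."""
    per_pair_rn = defaultdict(int)   # row_number() over (partition by id1, id2 ...)
    blocks = {}
    # enumerate gives row_number() over (order by tstamp).
    for overall_rn, (tstamp, id1, id2) in enumerate(sorted(rows), start=1):
        per_pair_rn[(id1, id2)] += 1
        # Within one block both counters grow by 1 per row, so the
        # difference is constant; a gap makes it jump.
        block_id = per_pair_rn[(id1, id2)] - overall_rn
        key = (id1, id2, block_id)
        if key not in blocks:
            blocks[key] = [tstamp, tstamp]
        else:
            blocks[key][1] = tstamp
    return sorted((lo, hi, id1, id2) for (id1, id2, _), (lo, hi) in blocks.items())

result = condense(rows)
for line in result:
    print(line)
```

The block_id values computed here are 0, 0, -2, -3, -3, -3, -4, -7, the same as in the query output above, and grouping on them yields the five expected blocks.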
You should be able to use the row_number window function to do this, like below:
select
min(tstamp) mints, max(tstamp) maxts, id1, id2
from (
select
t.*,
row_number() over (order by tstamp)
- row_number() over (partition by id1, id2 order by tstamp) as rn
from t
) subq
group by id1, id2, rn
order by min(tstamp)
I haven't been able to test it with any Oracle db, but the approach works with MSSQL and should work in Oracle too, as the window function works the same way. Note that Oracle needs t.* rather than a bare * next to other select-list columns and does not accept AS before a table alias.
You need to do this step by step:
Detect ID changes with LAG marking each change with a flag = 1.
Generate keys for the groups (i.e. adjacent records with the same ID) with SUM over the ID change flags (running total).
Group by generated group key and get min/max timestamp.
Query:
select
min(tstamp) as min_tstamp,
max(tstamp) as max_tstamp,
min(id1) as id1,
min(id2) as id2
from
(
select
grouped.*,
sum(newgroup) over (order by tstamp) as groupkey
from
(
select
mytable.*,
case when id1 <> lag(id1) over (order by tstamp)
or id2 <> lag(id2) over (order by tstamp)
then 1 else 0 end as newgroup
from mytable
order by tstamp
) grouped
)
group by groupkey
order by groupkey;
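The three steps (change flag, running total, group by the running total) can be traced in plain Python as well. This is only a sketch of the logic, assuming unique tstamp values that sort correctly as given; the function name `condense_by_change_flags` is hypothetical.

```python
from itertools import accumulate, groupby

# Sample rows from the question: (tstamp, id1, id2).
rows = [
    ("t1", "A", "B"), ("t2", "A", "B"), ("t3", "C", "D"), ("t4", "E", "F"),
    ("t5", "E", "F"), ("t6", "E", "F"), ("t7", "A", "B"), ("t8", "G", "H"),
]

def condense_by_change_flags(rows):
    """Flag id changes, turn the flags into group keys, then take min/max per key."""
    ordered = sorted(rows)  # order by tstamp
    # Step 1: flag = 1 when (id1, id2) differs from the previous row (LAG).
    # The first row has no predecessor and gets 0, as in the SQL.
    flags = [0] + [
        1 if cur[1:] != prev[1:] else 0
        for prev, cur in zip(ordered, ordered[1:])
    ]
    # Step 2: running total of the flags is the group key.
    keys = list(accumulate(flags))
    # Step 3: group by the key and take the first/last timestamp.
    result = []
    for _, members in groupby(zip(keys, ordered), key=lambda pair: pair[0]):
        group = [row for _, row in members]
        result.append((group[0][0], group[-1][0], group[0][1], group[0][2]))
    return result

result = condense_by_change_flags(rows)
for line in result:
    print(line)
```

For the sample data the flags are 0, 0, 1, 1, 0, 0, 1, 1 and the group keys 0, 0, 1, 2, 2, 2, 3, 4, which gives the same five blocks as the desired result.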

How to select maximum value of two identical columns in same table?

id1 id2 name
1 1 2 a
2 3 4 b
3 5 6 c
4 7 8 d
5 9 10 e
select id1, id2, name
from Emp3
where id2 in (select MAX(id2) from Emp3)
How can I print only the maximum number?
Use
select id2
from Emp3
where id2 in (select MAX(id2) from Emp3)
This will print only 10.
If you want the maximum among those 2 columns, then use
SELECT
CASE
WHEN MAX(id1) >= MAX(id2) THEN MAX(id1)
WHEN MAX(id2) >= MAX(id1) THEN MAX(id2)
END AS MaxValue
FROM Emp3
Use TOP and ORDER BY to get the result
SELECT TOP 1 ID2 FROM Emp3 ORDER BY ID2 DESC
select max(id) from (
select max(id1) as id from Emp3
union
select max(id2) as id from Emp3
) t
DECLARE @T TABLE (ID1 INT ,ID2 INT ,NAME VARCHAR(80))
INSERT INTO @T VALUES (1,2,'NME1')
INSERT INTO @T VALUES (3,4,'NME2')
INSERT INTO @T VALUES (5,6,'NME3')
INSERT INTO @T VALUES (7,8,'NME4')
INSERT INTO @T VALUES (9,10,'NME5')
INSERT INTO @T VALUES (11,12,'NME6')
SELECT * FROM @T
SELECT MAX(ID1) ID FROM
(
SELECT ID1 FROM @T T1
UNION
SELECT ID2 FROM @T T2
)TT
If both id columns are indexed, use a derived table with UNION ALL:
select max(id)
from
(
select max(id1) as id from Emp3
union all
select max(id2) from Emp3
) t
If they are not indexed, use CASE:
select max(case when id1 > id2 then id1
when id2 > id1 then id2
else coalesce(id1,id2) end)
from Emp3
This answer uses COALESCE to handle NULLs (if there are any...) The table will be read only once. (With the UNION solution the table will be read twice, and you don't want to do that without any index!)
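To see how the CASE expression behaves when one of the columns is NULL, here is a small plain-Python sketch of the same single-pass logic, with `None` standing in for NULL. The function names `row_greatest` and `max_of_two_columns` are invented for illustration.

```python
# (id1, id2) pairs from the sample table in the question.
rows = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

def row_greatest(id1, id2):
    """Larger of the two ids; if one is None (NULL), fall back to the other."""
    if id1 is not None and id2 is not None:
        return max(id1, id2)
    # Comparisons with NULL are unknown in SQL, so the CASE falls through
    # to ELSE COALESCE(id1, id2).
    return id1 if id1 is not None else id2

def max_of_two_columns(rows):
    """Single pass over the rows, like MAX(CASE ...) over the table."""
    candidates = [row_greatest(a, b) for a, b in rows]
    candidates = [c for c in candidates if c is not None]  # MAX ignores NULLs
    return max(candidates) if candidates else None

result = max_of_two_columns(rows)
print(result)
```

For the sample table this prints 10. A row such as (None, 3) contributes 3, and a table where every id is NULL yields None, just as MAX returns NULL.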
Older answers:
I guess you want the row with the highest id1/id2 value?
Return a row when no other row with higher id value exists:
select id1, id2, name
from Emp3 e1
where not exists (select 1 from Emp3 e2
where e2.id1 > e1.id1
or e2.id1 > e1.id2
or e2.id2 > e1.id1
or e2.id2 > e1.id2)
Will return both rows if there's a tie. (Two or more rows with same highest value.)
Alternative solution: use TOP, combined with ORDER BY and a CASE expression, to find each row's larger id value:
select TOP 1 id1, id2, name
from Emp3
order by case when id1 > id2 then id1 else id2 end desc
Alternative 3, a sub-query with CASE to find the largest id:
select TOP 1 id1, id2, name
from Emp3
where (select max(case when id1 > id2 then id1 else id2 end) from Emp3) in (id1,id2)