Oracle join view for best matches - sql

I wish to create a view which joins two tables together.
T1 =
Col1 Col2
AA BB
EE FF
YY ZZ
11 00
T2 =
Col1 Col2 Col3
AA BB 1
AA CC 2
CC BB 3
GG FF 4
GG HH 5
EE HH 6
XX YY 7
XX WW 8
YY RR 9
The view should return the best match for each T1 row, based on the following rules:
1. Return Col3 from T2 if T1.Col1 & T1.Col2 = T2.Col1 & T2.Col2
ELSE
2. Return Col3 if T1.Col2 = T2.Col2
ELSE
3. Return Col3 if T1.Col1 = T2.Col1
ELSE
4. Return NULL
So in these examples I would expect the final view to contain:
AA BB 1 (Rule 1 match)
EE FF 4 (Rule 2 match)
YY ZZ 9 (Rule 3 match)
11 00 NULL (Rule 4 match)
The difficulty I am having is with the cases where multiple rules are hit (e.g. rows 1 and 3 of T2, where rules 1 and 2 are hit, or rows 4 and 6, where rules 2 and 3 are hit separately).
I realise that in this example Rule 3 is hit multiple times. This is fine, as the idea is that Rule 3 only applies when the other rules aren't true, which should only ever yield one result (as in example 3).
Is there a way to do a similar union to cater for these cascading rules, or will additional views need to be created with pre-filtering (such as having count < 2)?
A formula for this in excel would be:
=IF(AND(A3=$F$2,B3=$G$2),"Rule1",IF((B3=$G$2),"Rule 2",IF((A3=$F$2),"Rule 3","NULL")))
Where A3 = T2.Col1, B3 = T2.Col2 G2 = T1.Col2 and F2 = T1.Col1.

I'd do it like this:
with t1 as (select 'AA' col1, 'BB' col2 from dual union all
select 'EE' col1, 'FF' col2 from dual union all
select 'YY' col1, 'ZZ' col2 from dual union all
select '11' col1, '00' col2 from dual),
t2 as (select 'AA' col1, 'BB' col2, 1 col3 from dual union all
select 'AA' col1, 'CC' col2, 2 col3 from dual union all
select 'CC' col1, 'BB' col2, 3 col3 from dual union all
select 'GG' col1, 'FF' col2, 4 col3 from dual union all
select 'GG' col1, 'HH' col2, 5 col3 from dual union all
select 'EE' col1, 'HH' col2, 6 col3 from dual union all
select 'XX' col1, 'YY' col2, 7 col3 from dual union all
select 'XX' col1, 'WW' col2, 8 col3 from dual union all
select 'YY' col1, 'RR' col2, 9 col3 from dual),
res as (select t1.col1,
t1.col2,
t2.col3,
case when t1.col1 = t2.col1 and t1.col2 = t2.col2 then 1
when t1.col2 = t2.col2 then 2
when t1.col1 = t2.col1 then 3
end join_level,
min (case when t1.col1 = t2.col1 and t1.col2 = t2.col2 then 1
when t1.col2 = t2.col2 then 2
when t1.col1 = t2.col1 then 3
end) over (partition by t1.col1, t1.col2) min_join_level
from t1
left outer join t2 on (t1.col1 = t2.col1 or t1.col2 = t2.col2))
select col1,
col2,
col3
from res
where join_level = min_join_level
or join_level is null;
COL1 COL2 COL3
---- ---- ----------
11 00
AA BB 1
EE FF 4
YY ZZ 9
I.e., do the joins first (in this case, t1 left outer join t2 on (t2.col1 = t1.col1 or t2.col2 = t1.col2), which also includes rows where t1.col1 = t2.col1 and t1.col2 = t2.col2), and then filter the results based on which join condition takes precedence.
Here's a slightly different alternative, using aggregates instead of analytic functions like the above answer:
with t1 as (select 'AA' col1, 'BB' col2 from dual union all
select 'EE' col1, 'FF' col2 from dual union all
select 'YY' col1, 'ZZ' col2 from dual union all
select '11' col1, '00' col2 from dual),
t2 as (select 'AA' col1, 'BB' col2, 1 col3 from dual union all
select 'AA' col1, 'CC' col2, 2 col3 from dual union all
select 'CC' col1, 'BB' col2, 3 col3 from dual union all
select 'GG' col1, 'FF' col2, 4 col3 from dual union all
select 'GG' col1, 'HH' col2, 5 col3 from dual union all
select 'EE' col1, 'HH' col2, 6 col3 from dual union all
select 'XX' col1, 'YY' col2, 7 col3 from dual union all
select 'XX' col1, 'WW' col2, 8 col3 from dual union all
select 'YY' col1, 'RR' col2, 9 col3 from dual)
select t1.col1,
t1.col2,
min(t2.col3) keep (dense_rank first order by case when t1.col1 = t2.col1 and t1.col2 = t2.col2 then 1
when t1.col2 = t2.col2 then 2
when t1.col1 = t2.col1 then 3
end) col3
from t1
left outer join t2 on (t1.col1 = t2.col1 or t1.col2 = t2.col2)
group by t1.col1,
t1.col2;
COL1 COL2 COL3
---- ---- ----------
11 00
AA BB 1
EE FF 4
YY ZZ 9
N.B. These could return different results if there happened to be more than one row that met the highest priority available join condition. The first query would return each row with a (potentially) different col3, whereas the second query would return just one row, with the lowest available col3 value.
What would you expect to see if T2 contained:
COL1 COL2 COL3
---- ---- ----------
AA BB 1
AA CC 2
CC BB 3
GG FF 4
GG HH 5
EE HH 6
XX YY 7
XX WW 8
YY RR 9
YY SS 10
The first query will give you:
COL1 COL2 COL3
---- ---- ----------
11 00
AA BB 1
EE FF 4
YY ZZ 10
YY ZZ 9
The second query will give you:
COL1 COL2 COL3
---- ---- ----------
11 00
AA BB 1
EE FF 4
YY ZZ 9
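To see the difference between the two queries without an Oracle instance, here is a minimal sketch using Python's sqlite3 module. It is not the answer's Oracle code: SQLite has no KEEP (DENSE_RANK FIRST), so the "one row per T1 row" variant is emulated with a plain MIN over the best-level rows, and the CTE names res, best and picked are made up for this illustration. The data is the T2 from the question plus the extra YY SS 10 row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (col1 TEXT, col2 TEXT);
    CREATE TABLE t2 (col1 TEXT, col2 TEXT, col3 INTEGER);
    INSERT INTO t1 VALUES ('AA','BB'), ('EE','FF'), ('YY','ZZ'), ('11','00');
    INSERT INTO t2 VALUES
        ('AA','BB',1), ('AA','CC',2), ('CC','BB',3), ('GG','FF',4), ('GG','HH',5),
        ('EE','HH',6), ('XX','YY',7), ('XX','WW',8), ('YY','RR',9), ('YY','SS',10);
""")

# Join on either column first, then tag each joined row with the rule it satisfies.
CTES = """
    WITH res AS (
        SELECT t1.col1 AS col1, t1.col2 AS col2, t2.col3 AS col3,
               CASE WHEN t1.col1 = t2.col1 AND t1.col2 = t2.col2 THEN 1
                    WHEN t1.col2 = t2.col2 THEN 2
                    WHEN t1.col1 = t2.col1 THEN 3
               END AS join_level
        FROM t1
        LEFT JOIN t2 ON t1.col1 = t2.col1 OR t1.col2 = t2.col2),
    best AS (
        -- best (lowest) rule number available for each T1 row
        SELECT col1, col2, MIN(join_level) AS lvl FROM res GROUP BY col1, col2),
    picked AS (
        SELECT res.col1 AS col1, res.col2 AS col2, res.col3 AS col3
        FROM res
        JOIN best ON best.col1 = res.col1 AND best.col2 = res.col2
        WHERE res.join_level = best.lvl OR res.join_level IS NULL)
"""

# Like the first query: every row at the best level is kept, so ties show up.
all_ties = conn.execute(
    CTES + "SELECT col1, col2, col3 FROM picked ORDER BY col1, col3").fetchall()

# Like the second query: ties collapse to the lowest col3.
one_per_row = conn.execute(
    CTES + "SELECT col1, col2, MIN(col3) FROM picked "
           "GROUP BY col1, col2 ORDER BY col1").fetchall()

print(all_ties)
print(one_per_row)
```

With the extra row, the first list contains two YY ZZ rows (9 and 10), while the second keeps only 9.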

Perhaps this method, which chains together the result sets of three common table expressions, each of which implements a different join and checks whether the rowid of the row in T1 has already been projected from a successful join:
with
first_join as (
select t1.col1,
t1.col2,
t2.col3,
t1.rowid as rid -- alias the rowid so it can be selected from the CTE
from t1 join t2 on t1.col1 = t2.col1 and t1.col2 = t2.col2),
second_join as (
select t1.col1,
t1.col2,
t2.col3,
t1.rowid as rid
from t1 join t2 on t1.col2 = t2.col2
where t1.rowid not in (select rid from first_join)),
third_join as (
select t1.col1,
t1.col2,
t2.col3
from t1 join t2 on t1.col1 = t2.col1
where t1.rowid not in (select rid from first_join union all
select rid from second_join))
-- note: T1 rows that match none of the joins (rule 4) are not returned
select col1, col2, col3 from first_join union all
select col1, col2, col3 from second_join union all
select col1, col2, col3 from third_join
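The chained approach can be tried out in SQLite as a rough sketch, using Python's sqlite3 module and the sample data from the question. This is an adaptation, not the Oracle query itself: the rowid is aliased as rid (an illustrative name), and a fourth CTE, no_match, is an addition here to cover rule 4 (return NULL when nothing matches).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (col1 TEXT, col2 TEXT);
    CREATE TABLE t2 (col1 TEXT, col2 TEXT, col3 INTEGER);
    INSERT INTO t1 VALUES ('AA','BB'), ('EE','FF'), ('YY','ZZ'), ('11','00');
    INSERT INTO t2 VALUES
        ('AA','BB',1), ('AA','CC',2), ('CC','BB',3), ('GG','FF',4), ('GG','HH',5),
        ('EE','HH',6), ('XX','YY',7), ('XX','WW',8), ('YY','RR',9);
""")

QUERY = """
WITH first_join AS (   -- rule 1: both columns match
    SELECT t1.rowid AS rid, t1.col1 AS col1, t1.col2 AS col2, t2.col3 AS col3
    FROM t1 JOIN t2 ON t1.col1 = t2.col1 AND t1.col2 = t2.col2),
second_join AS (       -- rule 2: col2 matches, only for rows rule 1 missed
    SELECT t1.rowid AS rid, t1.col1 AS col1, t1.col2 AS col2, t2.col3 AS col3
    FROM t1 JOIN t2 ON t1.col2 = t2.col2
    WHERE t1.rowid NOT IN (SELECT rid FROM first_join)),
third_join AS (        -- rule 3: col1 matches, only for rows rules 1 and 2 missed
    SELECT t1.rowid AS rid, t1.col1 AS col1, t1.col2 AS col2, t2.col3 AS col3
    FROM t1 JOIN t2 ON t1.col1 = t2.col1
    WHERE t1.rowid NOT IN (SELECT rid FROM first_join UNION ALL
                           SELECT rid FROM second_join)),
no_match AS (          -- rule 4 (added for this sketch): nothing matched
    SELECT t1.rowid AS rid, t1.col1 AS col1, t1.col2 AS col2, NULL AS col3
    FROM t1
    WHERE t1.rowid NOT IN (SELECT rid FROM first_join UNION ALL
                           SELECT rid FROM second_join UNION ALL
                           SELECT rid FROM third_join))
SELECT col1, col2, col3 FROM first_join UNION ALL
SELECT col1, col2, col3 FROM second_join UNION ALL
SELECT col1, col2, col3 FROM third_join UNION ALL
SELECT col1, col2, col3 FROM no_match
ORDER BY 1
"""

cascade_rows = conn.execute(QUERY).fetchall()
print(cascade_rows)
```

Like the analytic version, this returns every T2 row that ties at the winning rule, so it would also give two YY ZZ rows if T2 contained a second YY row.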

Related

SQL query to append comma with successor value in Oracle [duplicate]

I have a table of two columns
Col1 Col2
A 1
A 2
A 3
B 1
B 2
B 3
Output I need is like this
Col1 Col2
A 1
A 1,2
A 1,2,3
B 1
B 1,2
B 1,2,3
Thank you in advance.
Here is a solution which would work for MySQL. It uses a correlated subquery in the select clause to group concatenate together Col2 values. The logic is that we only aggregate values which are less than or equal to the current row, for a given group of records sharing the same Col1 value.
SELECT
Col1,
(SELECT GROUP_CONCAT(t2.Col2 ORDER BY t2.Col2) FROM yourTable t2
WHERE t2.Col2 <= t1.Col2 AND t1.Col1 = t2.Col1) Col2
FROM yourTable t1
ORDER BY
t1.Col1,
t1.Col2;
Here is the same query in Oracle:
SELECT
Col1,
(SELECT LISTAGG(t2.Col2, ',') WITHIN GROUP (ORDER BY t2.Col2) FROM yourTable t2
WHERE t2.Col2 <= t1.Col2 AND t1.Col1 = t2.Col1) Col2
FROM yourTable t1
ORDER BY
t1.Col1,
t1.Col2;
Note that the only real change is substituting LISTAGG for GROUP_CONCAT.
with s (Col1, Col2) as (
select 'A', 1 from dual union all
select 'A', 2 from dual union all
select 'A', 3 from dual union all
select 'B', 1 from dual union all
select 'B', 2 from dual union all
select 'B', 3 from dual)
select col1, ltrim(sys_connect_by_path(col2, ','), ',') path
from s
start with col2 = 1
connect by prior col2 = col2 - 1 and prior col1 = col1;
C PATH
- ----------
A 1
A 1,2
A 1,2,3
B 1
B 1,2
B 1,2,3
6 rows selected.
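The logic of the correlated subquery (for each row, concatenate every Col2 value in the same Col1 group that is less than or equal to the current one) can be checked with a small plain-Python sketch. The function name running_concat is made up for this illustration, and the nested scan is deliberately naive to mirror the subquery rather than to be efficient.

```python
# Sample rows from the question: (Col1, Col2)
rows = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2), ("B", 3)]

def running_concat(rows):
    """For each row, join all Col2 values <= the current one within the same Col1."""
    out = []
    for col1, col2 in sorted(rows):
        # Equivalent of: WHERE t2.Col2 <= t1.Col2 AND t1.Col1 = t2.Col1
        prefix = sorted(v for k, v in rows if k == col1 and v <= col2)
        out.append((col1, ",".join(str(v) for v in prefix)))
    return out

result = running_concat(rows)
for col1, path in result:
    print(col1, path)
```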

SQL Query - Indirect joining of two tables

I have two tables like the following
Table1
COL1 COL2 COL3
A 10 ABC
A 11 ABC
A 1 DEF
A 2 DEF
B 10 ABC
B 11 ABC
B 1 DEF
C 3 DEF
C 12 ABC
C 21 GHI
Table2
COL1 GHI ABC DEF
A1 21 10 1
A2 21 12 1
A3 21 10 1
A4 23 10 1
A5 25 11 3
A6 21 14 3
A7 25 11 1
A8 23 10 1
A9 29 10 2
A10 21 12 3
I have created another temporary table that returns all the distinct values from tbl1.col1
The values of col3 in tbl1 are columns in tbl2, which are populated by some values.
What I need is for each of these distinct values of table1.column1, (A, B, C) in this case, return a combination of table2.column1 and table1.column1 such that
the ABC value of table2.column1 matches any of the ABC value of the "group" from table1,
AND the DEF value of table2.column1 matches any of the DEF value of the "group" from table1,
AND IF THE GROUP CONTAINS GHI VALUES, the GHI value of table2.column1 matches any of the GHI value of the "group" from table1
So, I would need something like the following
Output Table
Table2.COL1 Table1.Col1
A1 A
A3 A
A4 A
A7 A
A8 A
A9 A
A1 B
A3 B
A4 B
A7 B
A8 B
A10 C
I tried something like this, but I'm not sure if this is the right approach:
select table2.col1, temp_distinct_table.column1
from table2, temp_distinct_table
where table2.def IN (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'DEF')
AND table2.abc IN (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'ABC')
AND (
table2.ghi IN (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'GHI')
OR NOT EXISTS (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'GHI')
)
where temp_distinct_table contains all the distinct values from table1.col1.
Could someone guide me on the matter?
Another approach, counting how many matches there are for each t1.col1/t2.col1 combination after joining all the possible matches:
select distinct t2_col1, t1_col1
from (
select t2.col1 as t2_col1, t1.col1 as t1_col1, t1.ghi_count as t1_ghi_count,
count(case when t1.col3 = 'ABC' then 1 end)
over (partition by t1.col1, t2.col1) as abc_matches,
count(case when t1.col3 = 'DEF' then 1 end)
over (partition by t1.col1, t2.col1) as def_matches,
count(case when t1.col3 = 'GHI' then 1 end)
over (partition by t1.col1, t2.col1) as ghi_matches
from (
select t1.*,
count(case when t1.col3 = 'GHI' then 1 end)
over (partition by t1.col1) as ghi_count
from table1 t1
) t1
join table2 t2
on (t1.col3 = 'ABC' and t2.abc = t1.col2)
or (t1.col3 = 'DEF' and t2.def = t1.col2)
or (t1.col3 = 'GHI' and t2.ghi = t1.col2)
)
where abc_matches > 0
and def_matches > 0
and (t1_ghi_count = 0 or ghi_matches > 0)
order by t1_col1, t2_col1;
Which with your sample data gets:
T2_COL T1_COL
------ ------
A1 A
A3 A
A4 A
A7 A
A8 A
A9 A
A1 B
A3 B
A4 B
A7 B
A8 B
A10 C
Not sure if the efficiency of that will be significantly different to MTO's cross join with your real data.
This becomes quite simple when you use collections (and you only need to do one table scan for each table):
Oracle Setup:
CREATE TYPE intlist AS TABLE OF INT;
/
Query:
SELECT t2.col1 AS t2_col1,
t1.col1 AS t1_col1
FROM (
SELECT col1,
CAST( COLLECT( CASE col3 WHEN 'ABC' THEN col2 END ) AS INTLIST ) AS abc,
CAST( COLLECT( CASE col3 WHEN 'DEF' THEN col2 END ) AS INTLIST ) AS def,
CAST( COLLECT( CASE col3 WHEN 'GHI' THEN col2 END ) AS INTLIST ) AS ghi
FROM table1
GROUP BY col1
) t1
INNER JOIN table2 t2
ON ( t2.abc MEMBER OF t1.abc
AND t2.def MEMBER OF t1.def
AND ( t2.ghi MEMBER OF t1.ghi OR t1.ghi IS EMPTY ) );
Output:
t2_col1 t1_col1
------- -------
A1 A
A3 A
A4 A
A7 A
A8 A
A9 A
A1 B
A3 B
A4 B
A7 B
A8 B
A10 C
Update
An alternative query without using collections (it is going to be more efficient than your query but probably less efficient than collections):
SELECT t2.col1,
t1.col1
FROM table1 t1
CROSS JOIN
table2 t2
GROUP BY t1.col1, t2.col1
HAVING COUNT( CASE WHEN t1.col2 = t2.abc AND t1.col3 = 'ABC' THEN 1 END ) > 0
AND COUNT( CASE WHEN t1.col2 = t2.def AND t1.col3 = 'DEF' THEN 1 END ) > 0
AND ( COUNT( CASE WHEN t1.col2 = t2.ghi AND t1.col3 = 'GHI' THEN 1 END ) > 0
OR COUNT( CASE t1.col3 WHEN 'GHI' THEN 1 END ) = 0 )
ORDER BY t1.col1, t2.col1;
Update 2:
Changed from CROSS JOIN to INNER JOIN:
SELECT t2.col1 AS t2_col1,
t1.col1 AS t1_col1
FROM (
SELECT t1.*,
COUNT( CASE col3 WHEN 'GHI' THEN 1 END )
OVER ( PARTITION BY col1 ) AS has_ghi
FROM table1 t1
) t1
INNER JOIN table2 t2
ON ( t1.col3 = 'ABC' AND t2.abc = t1.col2 )
OR ( t1.col3 = 'DEF' AND t2.def = t1.col2 )
OR ( t1.col3 = 'GHI' AND t2.ghi = t1.col2 )
GROUP BY t1.col1, t2.col1, t1.has_ghi
HAVING COUNT( CASE t1.col3 WHEN 'ABC' THEN 1 END ) > 0
AND COUNT( CASE t1.col3 WHEN 'DEF' THEN 1 END ) > 0
AND ( COUNT( CASE t1.col3 WHEN 'GHI' THEN 1 END ) > 0 OR has_ghi = 0 )
ORDER BY t1.col1, t2.col1;
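The idea behind the collection-based query (collapse each Table1 group into one set of allowed values per category, then test each Table2 row for membership) can be sketched in plain Python with the sample data from the question. The variable names groups, allowed and matches are illustrative only.

```python
from collections import defaultdict

# Table1 rows: (col1, col2, col3)
table1 = [
    ("A", 10, "ABC"), ("A", 11, "ABC"), ("A", 1, "DEF"), ("A", 2, "DEF"),
    ("B", 10, "ABC"), ("B", 11, "ABC"), ("B", 1, "DEF"),
    ("C", 3, "DEF"), ("C", 12, "ABC"), ("C", 21, "GHI"),
]
# Table2 rows: (col1, ghi, abc, def)
table2 = [
    ("A1", 21, 10, 1), ("A2", 21, 12, 1), ("A3", 21, 10, 1), ("A4", 23, 10, 1),
    ("A5", 25, 11, 3), ("A6", 21, 14, 3), ("A7", 25, 11, 1), ("A8", 23, 10, 1),
    ("A9", 29, 10, 2), ("A10", 21, 12, 3),
]

# One pass over Table1: group -> category -> set of allowed values.
groups = defaultdict(lambda: defaultdict(set))
for col1, col2, col3 in table1:
    groups[col1][col3].add(col2)

matches = []
for group in sorted(groups):
    allowed = groups[group]
    for t2_col1, ghi, abc, def_ in table2:
        if (abc in allowed["ABC"]
                and def_ in allowed["DEF"]
                # GHI is only checked when the group actually has GHI values
                and (not allowed["GHI"] or ghi in allowed["GHI"])):
            matches.append((t2_col1, group))

for pair in matches:
    print(*pair)
```

This reproduces the twelve rows of the expected output table.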

Pick minimum value and update all the rows SQL

I have a scenario where I need to find the row with the minimum value of a priority column, take its product, and put that product in all the rows of the group.
SD PL PRIO PRDT PNAME
1 29 10 MM CAR
1 LI 20 SS BRAKE
1 AA 30 AA ZZZZ
Since priority 10 is the minimum for group SD 1, MM should replace the other products, as shown below.
SD PL PRIO PRDT PNAME
1 29 10 MM CAR
1 LI 20 MM BRAKE
1 AA 30 MM ZZZZ
Could you please help with the select query.
You can use ROW_NUMBER:
SELECT
t.SD, t.PL, t.PRIO, t2.PRDT, t.PNAME
FROM YourTable t
INNER JOIN(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY SD ORDER BY PRIO) AS rn
FROM YourTable
) t2
ON t.SD = t2.SD
WHERE t2.rn = 1
How about using a correlated subquery:
UPDATE YourTable t
SET PRDT = (
SELECT PRDT
FROM YourTable t2
WHERE
t2.SD = t.SD
AND t2.PRIO = (SELECT MIN(t3.PRIO) FROM YourTable t3 WHERE t3.SD = t.SD)
)
Try this if you want only SELECT Query.
SELECT COL1,
COL2,
COL3,
FIRST_VALUE(COL4) OVER(PARTITION BY COL1 ORDER BY COL3 ASC) COL4, -- product of the lowest-priority row in each group
COL5
FROM
(SELECT 1 COL1, '29' COL2, 10 COL3, 'MM' COL4, 'CAR' COL5 FROM DUAL
UNION ALL
SELECT 1 COL1, 'LI' COL2, 20 COL3, 'SS' COL4, 'BRAKE' COL5 FROM DUAL
UNION ALL
SELECT 1 COL1, 'AA' COL2, 30 COL3, 'AA' COL4 , 'ZZZZ' COL5 FROM DUAL
UNION ALL
SELECT 2 COL1, '29' COL2, 10 COL3, 'MM' COL4, 'CAR' COL5 FROM DUAL
UNION ALL
SELECT 2 COL1, 'LI' COL2, 05 COL3, 'SS' COL4, 'BRAKE' COL5 FROM DUAL
UNION ALL
SELECT 2 COL1, 'AA' COL2, 30 COL3, 'AA' COL4 , 'ZZZZ' COL5 FROM DUAL
);
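As a quick sanity check of the intended result, here is a minimal plain-Python sketch of the same "take the product of the lowest-priority row per group" logic, using the sample rows from the question. The names best and result are illustrative, and ties on PRIO are resolved by keeping the first row seen, which is an assumption the question does not address.

```python
# Rows from the question: (SD, PL, PRIO, PRDT, PNAME)
rows = [
    (1, "29", 10, "MM", "CAR"),
    (1, "LI", 20, "SS", "BRAKE"),
    (1, "AA", 30, "AA", "ZZZZ"),
]

# First pass: remember the (PRIO, PRDT) of the lowest-priority row per SD.
best = {}
for sd, pl, prio, prdt, pname in rows:
    if sd not in best or prio < best[sd][0]:
        best[sd] = (prio, prdt)

# Second pass: replace PRDT in every row with the winning product of its group.
result = [(sd, pl, prio, best[sd][1], pname) for sd, pl, prio, prdt, pname in rows]

for row in result:
    print(*row)
```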

How to select rows with column containing provided values

Hi all consider following as my table structure
Col1 Col2 Col3
A 1 Aa
A 2 Bb
A 1 Aa
A 4 Bb
B 2 Bb
C 1 Aa
C 5 Bb
D 3 Aa
As you can see, Col3 contains the distinct values Aa and Bb.
I am trying to write a query which returns only rows whose Col1 value is associated with both Aa and Bb, or with Aa alone.
The point is to remove those rows whose Col1 value only has Bb associated with it.
Example: the distinct Col1 value A should have Aa and Bb, or just Aa, in the corresponding Col3. This requirement is violated by the value B in Col1, hence the result set should not contain the rows associated with B.
Expected output -
Col1 Col2 Col3
A 1 Aa
A 2 Bb
A 1 Aa
A 4 Bb
C 1 Aa
C 5 Bb
D 3 Aa
SELECT *
FROM TableName T
WHERE EXISTS ( SELECT 1
FROM TableName
WHERE T.Col1 = Col1
AND Col3 = 'Aa')
One other approach is to use intersect and union.
select * from t where col1 in (
select col1 from t where col3 = 'Aa'
intersect
select col1 from t where col3 = 'Bb'
union
select col1 from t where col3 = 'Aa')
Select table1.Col1, table1.Col2, table1.Col3
From Table1
join (SELECT Col1
,SUM(case when col3 = 'Aa' then 1 when Col3 = 'Bb' then 0 end) AS [Counting]
FROM [dbo].[Table1]
group by Col1
having SUM(case when col3 = 'Aa' then 1 when Col3 = 'Bb' then 0 end) > 0) keep on table1.Col1 = keep.Col1
Here's another option, using your values instead of a table.:
Select
T.*
From
(Select
'A' As Col1, 1 As Col2, 'Aa' As Col3
Union All
Select
'A', 2, 'Bb'
Union All
Select
'A',1,'Aa'
Union All
Select
'A',4,'Bb'
Union All
Select
'B',2,'Bb'
Union All
Select
'C',1,'Aa'
Union All
Select
'C',5,'Bb'
Union All
Select
'D',3,'Aa'
) T
Inner Join
(Select
Col1
,Sum(Case When Col3 = 'Aa' Then 1 Else 0 End) As CountAa
From
(Select
'A' As Col1, 1 As Col2, 'Aa' As Col3
Union All
Select
'A', 2, 'Bb'
Union All
Select
'A',1,'Aa'
Union All
Select
'A',4,'Bb'
Union All
Select
'B',2,'Bb'
Union All
Select
'C',1,'Aa'
Union All
Select
'C',5,'Bb'
Union All
Select
'D',3,'Aa'
) T2
Group By
Col1
) T3 On T.Col1 = T3.Col1
Where
T3.CountAa > 0
Simplified with a table, that's:
Select
T.*
From
YourTable T
Inner Join
(Select
Col1
,Sum(Case When Col3 = 'Aa' Then 1 Else 0 End) As CountAa
From
YourTable T2
Group By
Col1
) T3 On T.Col1 = T3.Col1
Where
T3.CountAa > 0
The nice thing about this method is that you can add in lots of conditions in that case statement on a more complicated data set.
Depending on what your actual data looks like, row_number() may be another possibility:
select Col1, Col2, Col3
from (
select *, i = row_number() over(partition by Col1 order by Col1, Col3)
from TableName
) a
where i > 1 or Col3 like 'Aa'
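All of the answers above boil down to the same rule: keep every row whose Col1 group contains at least one Aa. A short plain-Python sketch with the sample data from the question makes that rule explicit; the names groups_with_aa and kept are made up for this illustration.

```python
# Rows from the question: (Col1, Col2, Col3)
rows = [
    ("A", 1, "Aa"), ("A", 2, "Bb"), ("A", 1, "Aa"), ("A", 4, "Bb"),
    ("B", 2, "Bb"),
    ("C", 1, "Aa"), ("C", 5, "Bb"),
    ("D", 3, "Aa"),
]

# Equivalent of the EXISTS subquery: which Col1 values have an 'Aa' row?
groups_with_aa = {col1 for col1, _, col3 in rows if col3 == "Aa"}

# Keep all rows (Aa and Bb alike) that belong to one of those groups.
kept = [row for row in rows if row[0] in groups_with_aa]

for row in kept:
    print(*row)
```

Only the single B row is dropped, which matches the expected output.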

How to select duplicate columns data from table

I have a table like this:
col1 col2 col3 col4 col5
test1 1 13 15 1
test2 1 13 15 4
test3 2 7 3 5
test4 3 11 14 18
test5 3 11 14 8
test6 3 11 14 11
I want to select the col1, col2, col3, col4 data of rows where col2, col3, col4 are duplicated.
For example, the result must be:
col1 col2 col3 col4
test1 1 13 15
test2 1 13 15
test4 3 11 14
test5 3 11 14
test6 3 11 14
How can I do it?
Presuming SQL-Server >= 2005 you can use COUNT(*) OVER:
WITH CTE AS
(
SELECT col1, col2, col3, col4, cnt = COUNT(*) OVER (PARTITION BY col2, col3, col4)
FROM dbo.TableName t
)
SELECT col1, col2, col3, col4
FROM CTE WHERE cnt > 1
If I understand correctly:
select col1, col2, col3, col4
from TableName t
where exists (select 1
from TableName t2
where t2.col2 = t.col2 and
t2.col3 = t.col3 and
t2.col4 = t.col4 and
t2.col1 <> t.col1); -- another row with the same col2, col3, col4
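A simple self-join can work (assuming col1 is unique):
select distinct m1.col1,m1.col2,m1.col3,m1.col4 from Mytable m1
join Mytable m2
on m1.col2 =m2.col2
and m1.col3=m2.col3
and m1.col4 =m2.col4
and m1.col1 <> m2.col1 -- do not match a row with itself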
You can use the following code for that:
SELECT col1,col2,col3,col4,col5 FROM your_table
GROUP BY col1,col2,col3,col4,col5
HAVING COUNT(*) > 1
EDIT: sorry, this works only for complete duplicates. If you want to exclude the first column, you can use
SELECT col2,col3,col4 FROM your_table
GROUP BY col2,col3,col4
HAVING COUNT(*) > 1
and afterwards join the result back to the table itself (ON col2, col3 and col4) to retrieve col1.
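The counting idea shared by these answers (count how often each col2, col3, col4 combination occurs, then keep the rows whose combination occurs more than once) can be sketched in plain Python with the sample data from the question. The names counts and duplicates are illustrative.

```python
from collections import Counter

# Rows from the question: (col1, col2, col3, col4, col5)
rows = [
    ("test1", 1, 13, 15, 1),
    ("test2", 1, 13, 15, 4),
    ("test3", 2, 7, 3, 5),
    ("test4", 3, 11, 14, 18),
    ("test5", 3, 11, 14, 8),
    ("test6", 3, 11, 14, 11),
]

# Equivalent of COUNT(*) OVER (PARTITION BY col2, col3, col4)
counts = Counter((c2, c3, c4) for _, c2, c3, c4, _ in rows)

# Keep col1..col4 of every row whose (col2, col3, col4) appears more than once.
duplicates = [
    (c1, c2, c3, c4)
    for c1, c2, c3, c4, _ in rows
    if counts[(c2, c3, c4)] > 1
]

for row in duplicates:
    print(*row)
```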