How to extract the rows where a group appears more than a certain number of times - sql

I have the following table
col1 col2 col3 key
A B C 1
A B B 2
A B B 3
A B D 4
B D C 5
I would like to extract the rows where the group col1, col2, col3 appears more than once in the table.
A B B 2
A B B 3
So far, I have:
SELECT col1, col2, col3, count(*)
FROM db.table
GROUP BY col1, col2, col3
HAVING count(*) > 1
col1 col2 col3 count(*)
A B B 2
Is there a way to extract those rows with A B B without having to join the final table with the initial table?

You could use exists logic:
SELECT col1, col2, col3, "key"
FROM yourTable t1
WHERE EXISTS (SELECT 1 FROM yourTable t2
WHERE t2.col1 = t1.col1 AND t2.col2 = t1.col2 AND
t2.col3 = t1.col3 AND
t2."key" <> t1."key");

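Try the query below, which uses a CTE with a window function. No ORDER BY is needed inside the OVER clause, because the count should cover the whole partition:
with MyCTE
as
(
select col1,col2,col3,"key",COUNT(*) over(PARTITION BY col1,col2,col3) as Duplicate
from yourtable
)
select col1,col2,col3,"key" from MyCTE where Duplicate>1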
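Both answers above can be tried on the sample data from the question. The following is a minimal sketch that runs them through Python's built-in sqlite3 module; the in-memory database and the variable names are assumptions made for illustration, and the window-function query assumes an SQLite build of 3.25 or later.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourTable (col1 TEXT, col2 TEXT, col3 TEXT, "key" INTEGER)')
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        ("A", "B", "C", 1),
        ("A", "B", "B", 2),
        ("A", "B", "B", 3),
        ("A", "B", "D", 4),
        ("B", "D", "C", 5),
    ],
)

# Window-function approach: count the rows in each (col1, col2, col3) group.
window_sql = """
WITH counted AS (
    SELECT col1, col2, col3, "key",
           COUNT(*) OVER (PARTITION BY col1, col2, col3) AS duplicate
    FROM yourTable
)
SELECT col1, col2, col3, "key"
FROM counted
WHERE duplicate > 1
ORDER BY "key"
"""

# EXISTS approach: keep a row if another row with a different key shares its group.
exists_sql = """
SELECT col1, col2, col3, "key"
FROM yourTable t1
WHERE EXISTS (SELECT 1 FROM yourTable t2
              WHERE t2.col1 = t1.col1 AND t2.col2 = t1.col2
                AND t2.col3 = t1.col3 AND t2."key" <> t1."key")
ORDER BY "key"
"""

window_rows = conn.execute(window_sql).fetchall()
exists_rows = conn.execute(exists_sql).fetchall()
print(window_rows)
print(exists_rows)
```

Both queries return the two A B B rows (keys 2 and 3). To require more than N occurrences, the window version only needs a different threshold in the WHERE clause, whereas the EXISTS version only answers "more than once".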

Related

Oracle self join starting with minimum value for each partition

I have this table:
COL1 COL2 COL3
--------------------
A 1 VAL1
A 2 VAL2
A 4 VAL3
B 2 VAL4
B 4 VAL5
B 5 VAL6
And I would like to obtain this output:
COL1 COL2 COL3
--------------------
A 1 VAL1
A 2 VAL2
A 3 NULL
B 2 VAL4
B 3 NULL
B 4 VAL5
Logic:
Starting from the smallest COL2 value in each COL1 partition, take three consecutive numbers (the minimum and the next two). If the combination of COL1 and COL2 is present in the first table, show its COL3; otherwise show NULL.
Your question is a good example of what PARTITIONED OUTER JOIN was created for: DBFiddle
with top3 as (
select *
from (
select
col1, col2, col3
,min(col2)over(partition by col1) min_col2
,col2 - min(col2)over(partition by col1) + 1 as rn
from t
)
where col2 < min_col2 + 3
)
select
top3.col1
,r3.n as col2
,top3.col3
from
top3
partition by (col1)
right join
(select level n from dual connect by level<=3) r3
on r3.n=top3.rn;
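As you can see, the first step is to get top3, and then you just use partition by (col1) right join r3, where r3 is a generator of 3 rows. Because the query selects r3.n as col2, the col2 shown in the results is the position within each partition (1 to 3), counted from that partition's minimum, rather than the original col2 value.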
Results:
COL1 COL2 COL3
----- ---------- ----
A 1 VAL1
A 2 VAL2
A 3
B 1 VAL4
B 2
B 3 VAL5
6 rows selected.
Note, this approach allows you to scan your table just once!
Let's see. Here is the table
select * from t order by col1, col2;
COL1 COL2 COL3
----- ---------- -----
A 1 VAL1
A 2 VAL2
A 4 VAL3
B 2 VAL4
B 4 VAL5
B 5 VAL6
6 rows selected
and now let's try to apply the described logic
with offsets as
(select level - 1 offset from dual connect by level <= 3),
smallest_col2 as
(select col1, min(col2) min_col2 from t group by col1)
select sc2.col1, sc2.min_col2 + o.offset col2, t.col3
from smallest_col2 sc2
cross join offsets o
left join t
on t.col1 = sc2.col1
and t.col2 = sc2.min_col2 + o.offset
order by 1, 2;
COL1 COL2 COL3
----- ---------- -----
A 1 VAL1
A 2 VAL2
A 3
B 2 VAL4
B 3
B 4 VAL5
6 rows selected
Use a recursive CTE to get the COL2s from the min of each COL1 up to the next 2 and then a left join to the table:
WITH cte(COL1, COL2, max_col2) AS (
SELECT COL1, MIN(COL2), MIN(COL2) + 2
FROM tablename
GROUP BY COL1
UNION ALL
SELECT COL1, COL2 + 1, max_col2
FROM cte
WHERE COL2 < max_col2
)
SELECT c.COL1, c.COL2, t.COL3
FROM cte c LEFT JOIN tablename t
ON t.COL1 = c.COL1 AND t.COL2 = c.COL2
ORDER BY c.COL1, c.COL2
See the demo.
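The recursive CTE above is not Oracle-specific, so it can be checked on other engines. Here is a minimal sketch using Python's built-in sqlite3 module; the in-memory database is an assumption for illustration, and SQLite spells the clause WITH RECURSIVE.

```python
import sqlite3

# In-memory copy of the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (col1 TEXT, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("A", 1, "VAL1"), ("A", 2, "VAL2"), ("A", 4, "VAL3"),
        ("B", 2, "VAL4"), ("B", 4, "VAL5"), ("B", 5, "VAL6"),
    ],
)

densify_sql = """
WITH RECURSIVE cte(col1, col2, max_col2) AS (
    -- Anchor: one row per col1, starting at its smallest col2.
    SELECT col1, MIN(col2), MIN(col2) + 2
    FROM tablename
    GROUP BY col1
    UNION ALL
    -- Recursive step: generate the next number until min + 2 is reached.
    SELECT col1, col2 + 1, max_col2
    FROM cte
    WHERE col2 < max_col2
)
SELECT c.col1, c.col2, t.col3
FROM cte c
LEFT JOIN tablename t ON t.col1 = c.col1 AND t.col2 = c.col2
ORDER BY c.col1, c.col2
"""

dense_rows = conn.execute(densify_sql).fetchall()
for row in dense_rows:
    print(row)
```

The generated (col1, col2) pairs that have no match in the table come back with col3 as None, which is how sqlite3 reports NULL.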
The partitioned outer join, already demonstrated in Sayan's answer, is probably the best approach for that part of the assignment (data densification).
For the first part, in Oracle 12.1 and higher you can use the match_recognize clause:
select col1, col2, col3
from this_table
match_recognize(
partition by col1
order by col2
measures col2 - a.col2 + 1 as rn
all rows per match
pattern ( ^ a b* )
define b as col2 <= a.col2 + 2
)
partition by (col1)
right outer join
(select level as rn from dual connect by level <= 3) using (rn)
;
Another solution with the "recursive WITH clause"
With rws_numbered (COL1, COL2, COL3, rn) as (
select COL1, COL2, COL3
, row_number()over(order by col1, col3) rn
from Your_table
)
, cte ( COL1, COL2, COL3, rn ) as (
select COL1, COL2, COL3, rn
from rws_numbered
where rn = 1
union all
select
t.COL1
, case when t.col1 = c.col1 then c.col2 + 1 else t.col2 end COL2
, t.COL3
, t.rn
from rws_numbered t
join cte c
on c.rn + 1 = t.rn
)
select COL1, COL2, case when exists (select null from Your_table t where t.COL1 = cte.COL1 and t.COL2 = cte.COL2) then COL3 else null end COL3
from cte
order by 1, 2
;
db<>fiddle

Compare values in Different column and row

I have the following table:
ID COl1 COl2
1 13 15
2 13 16
3 13 17
4 17 13
What I need is to select all rows whose Col1 value appears in Col2 of another row, and vice versa.
In this case only row 4 or row 3 should be returned. They hold the same pair of values (13 and 17).
Think of Col1 as the buyer and Col2 as the seller.
I want to know which users bought from and sold to each other.
If user a bought from user b, then user b must also have bought from user a for the row to be returned.
SELECT
a.*
FROM
yourTable a
INNER JOIN
yourTable b
ON a.Col1 = b.Col2
AND a.Col2 = b.Col1
AND a.id != b.id
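This can be done by using subqueries. Note that this version returns every row whose Col1 appears anywhere in Col2 or whose Col2 appears anywhere in Col1, so it is broader than matching reciprocal pairs: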
SELECT ID, COl1, COl2
FROM table1 WHERE COl1 IN (SELECT DISTINCT COl2 FROM table1)
UNION
SELECT ID, COl1, COl2
FROM table1 WHERE COl2 IN (SELECT DISTINCT COl1 FROM table1)
This sounds like exists:
select t.*
from t
where exists (select 1 from t t2 where t2.col1 = t.col2) and
exists (select 1 from t t2 where t2.col2 = t.col1) ;
If you want them in the same row, I would still use exists:
select t.*
from t
where exists (select 1 from t t2 where t2.col1 = t.col2 AND t2.col2 = t.col1) ;
I recommend this over a self-join because it will not generate multiple rows if there are multiple examples of the buyers and sellers on either side.
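The difference between the self-join and the EXISTS version shows up once a reciprocal trade occurs more than once. The sketch below uses Python's built-in sqlite3 module; the table name trades and the extra row with ID 5 are hypothetical additions to the question's data, included only to trigger the duplication.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (id INTEGER, col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [
        (1, 13, 15),
        (2, 13, 16),
        (3, 13, 17),
        (4, 17, 13),
        (5, 17, 13),  # hypothetical repeat of row 4, not in the original data
    ],
)

# Self-join: one output row per matching counterpart.
self_join_ids = [r[0] for r in conn.execute("""
    SELECT a.id
    FROM trades a
    INNER JOIN trades b
        ON a.col1 = b.col2 AND a.col2 = b.col1 AND a.id != b.id
    ORDER BY a.id
""")]

# EXISTS: each qualifying row appears once, however many counterparts it has.
exists_ids = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM trades t
    WHERE EXISTS (SELECT 1 FROM trades t2
                  WHERE t2.col1 = t.col2 AND t2.col2 = t.col1)
    ORDER BY t.id
""")]

print(self_join_ids)
print(exists_ids)
```

Row 3 has two counterparts (rows 4 and 5), so the self-join lists it twice, while EXISTS lists each of rows 3, 4 and 5 exactly once.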
This also works
SELECT * FROM your_table WHERE
col1 IN (SELECT col2 FROM your_table)
AND
col2 IN (SELECT col1 FROM your_table);

Select distinct values one column into multiple columns

I have the following data: column 1 holds categories and column 2 holds the values for each category. I need to pivot this data so that the values of each category are spread across multiple columns.
col1 col2
----------------
1 a
2 b
2 c
2 d
3 e
3 f
4 g
4 h
And need this result:
col1 col2 col3 col4 col5 col6
-----------------------------------------------
1 a
2 b c d
3 e f
4 g h
There are no more than seven values in column 2 for any column 1 group. All values in column 2 are distinct, and the table has about 50 or more records.
You want to pivot your table, but your table doesn't currently contain the field that you want to pivot on ("col1", "col2", "col3", etc...). You need a row number, partitioned by col1. The Jet database does not provide a ROW_NUMBER function, so you have to fake it by joining the table to itself:
select t1.col1, t1.col2, count(*) as row_num
from [Sheet1$] t1
inner join [Sheet1$] t2 on t2.col1 = t1.col1 and t2.col2 <= t1.col2
group by t1.col1, t1.col2
Now you can pivot on row_num:
transform Min(x.col2) select x.col1
from(
select t1.col1, t1.col2, count(*) as row_num
from [Sheet1$] t1
inner join [Sheet1$] t2 on t2.col1 = t1.col1 and t2.col2 <= t1.col2
group by t1.col1, t1.col2
) x
group by x.col1
pivot x.row_num
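TRANSFORM and PIVOT are specific to Jet/Access, but the self-join row numbering carries over to other engines. The sketch below runs the same numbering in SQLite through Python's sqlite3 module and replaces the pivot with conditional aggregation; the table name sheet1, the output column names v1 to v3, and the limit of three value columns are assumptions chosen to fit the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sheet1 (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO sheet1 VALUES (?, ?)",
    [(1, "a"), (2, "b"), (2, "c"), (2, "d"), (3, "e"), (3, "f"), (4, "g"), (4, "h")],
)

pivot_sql = """
WITH numbered AS (
    -- Emulated row number: count the rows in the same group whose
    -- col2 is less than or equal to the current row's col2.
    SELECT t1.col1, t1.col2, COUNT(*) AS row_num
    FROM sheet1 t1
    INNER JOIN sheet1 t2 ON t2.col1 = t1.col1 AND t2.col2 <= t1.col2
    GROUP BY t1.col1, t1.col2
)
SELECT col1,
       MIN(CASE WHEN row_num = 1 THEN col2 END) AS v1,
       MIN(CASE WHEN row_num = 2 THEN col2 END) AS v2,
       MIN(CASE WHEN row_num = 3 THEN col2 END) AS v3
FROM numbered
GROUP BY col1
ORDER BY col1
"""

pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

The numbering relies on col2 being distinct within each col1 group, which the question states is the case; tied values would receive the same row number and collapse into one column.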

Combine multiple tables in one

If I have tlb1 as :
col1
1
2
3
Now I have tlb2 as:
col2 col3
4 Four
5 Five
6 SIX
Now I have tlb3 as
col4 col5
sample14 sample15
sample24 sample25
sample34 sample35
What can be the query if I want result as :
col1 col2 col3 col4 col5
1 4 Four sample14 sample15
2 5 Five sample24 sample25
3 6 Six sample34 sample35
I tried with :
select ( (select * from tlb1), (select * from tlb2),(select * from tlb3)) T
But this failed.
Please help me.
with t1 as (select col1, row_number() over (order by col1) rn from tlb1),
t2 as (select col2, col3, row_number() over (order by col2) rn from tlb2),
t3 as (select col4, col5, row_number() over (order by col4) rn from tlb3)
select t1.col1, t2.col2, t2.col3, t3.col4, t3.col5
from t1
full outer join t2 on t2.rn = t1.rn
full outer join t3 on t3.rn = coalesce(t1.rn, t2.rn)
try something like this...

Oracle - Find matched records with a different value for one field

Suppose I have the following table in my Oracle DB:
Col1: Col2: ... Coln:
1 a ... 1
1 a ... 1
1 b ... 1
1 b ... 1
1 c ... 1
1 a ... 1
2 d ... 1
2 d ... 1
2 d ... 1
3 e ... 1
3 f ... 1
3 e ... 1
3 e ... 1
4 g ... 1
4 g ... 1
And, what I want to get is a distinct list of records where, for Col1, Col2 is different - Ignoring any times that Col2 matches for all of Col1.
So, in this example I would like to get the result set:
Col1: Col2:
1 a
1 b
1 c
3 e
3 f
Now, I figured out how to do this using a query that feels fairly complex for the question at hand:
With MyData as
(
SELECT b.Col1, b.Col2, count(b.Col2) over(Partition By b.Col1) as cnt from
(
Select distinct a.Col1, a.Col2 from MyTable a
) b
)
select Col1, Col2
from MyData
where cnt > 1
order by Col1
What I'm wondering is whether there is a nicer way to do this. I didn't manage to do it using GROUP BY and HAVING, and I think it could probably be done with a self-join. This is mostly a question to see and learn new ways to get the result with a nicer (and perhaps more efficient) query.
Thanks!!!
Try this query:
SELECT distinct t1.col1, t1.col2
FROM table1 t1
WHERE EXISTS
( SELECT 1 FROM table1 t2
WHERE t1.col2 <> t2.col2
AND t1.col1 = t2.col1
)
order by 1,2
demo: http://www.sqlfiddle.com/#!4/9ce10/12
----- EDIT -------
Yes, there are other ways to do this:
SELECT distinct col1, col2
FROM table1 t1
WHERE col2 <> ANY (
SELECT col2 FROM table1 t2
WHERE t1.col1 = t2.col1
)
order by 1,2;
SELECT distinct col1, col2
FROM table1 t1
WHERE NOT col2 = ALL (
SELECT col2 FROM table1 t2
WHERE t1.col1 = t2.col1
)
order by 1,2
;
SELECT distinct t1.col1, t1.col2
FROM table1 t1
JOIN table1 t2
ON t1.col1 = t2.col1 AND t1.col2 <> t2.col2
order by 1, 2
;
SELECT t1.col1, t1.col2
FROM table1 t1
JOIN table1 t2
ON t1.col1 = t2.col1
GROUP BY t1.col1, t1.col2
HAVING COUNT( distinct t2.col2 ) > 1
order by 1, 2
;
SELECT t1.col1, t1.col2
FROM
table1 t1
JOIN (
SELECT col1
FROM table1
GROUP BY col1
HAVING COUNT( distinct col2 ) > 1
) t2
ON t1.col1 = t2.col1
GROUP BY t1.col1, t1.col2
ORDER BY t1.col1, t1.col2
;
Demo --> http://www.sqlfiddle.com/#!4/9ce10/33
Try them all, I really don't know how they will perform on your data.
However, creating a composite index:
CREATE INDEX name ON table1( col1, col2 )
will most likely speed up all of these queries.
Here is a method that uses aggregation and an analytic function:
with t as (
select col1, col2,
count(*) over (partition by col1) as cnt
from table1
group by col1, col2
)
select col1, col2
from t
where cnt > 1;
What I would like to do is:
select col1, col2,
count(*) over (partition by col1) as cnt
from table1
group by col1, col2
having count(*) over (partition by col1) > 1;
However, this is not valid SQL because the analytic functions are not allowed in the having clause.
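The aggregation-plus-analytic-function query can be tried outside Oracle as well. The sketch below runs it in SQLite through Python's sqlite3 module (assuming SQLite 3.25 or later for window functions) against the Col1/Col2 values from the question, and compares it with a plain GROUP BY ... HAVING subquery; the in-memory table and variable names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (1, "a"), (1, "a"), (1, "b"), (1, "b"), (1, "c"), (1, "a"),
        (2, "d"), (2, "d"), (2, "d"),
        (3, "e"), (3, "f"), (3, "e"), (3, "e"),
        (4, "g"), (4, "g"),
    ],
)

# GROUP BY collapses duplicates first; the window count then sees one row
# per distinct (col1, col2), so cnt is the number of distinct col2 per col1.
analytic_sql = """
WITH grouped AS (
    SELECT col1, col2,
           COUNT(*) OVER (PARTITION BY col1) AS cnt
    FROM table1
    GROUP BY col1, col2
)
SELECT col1, col2 FROM grouped WHERE cnt > 1 ORDER BY col1, col2
"""

# Equivalent filter without a window function.
having_sql = """
SELECT DISTINCT col1, col2
FROM table1
WHERE col1 IN (SELECT col1 FROM table1
               GROUP BY col1
               HAVING COUNT(DISTINCT col2) > 1)
ORDER BY col1, col2
"""

analytic_rows = conn.execute(analytic_sql).fetchall()
having_rows = conn.execute(having_sql).fetchall()
print(analytic_rows)
print(having_rows)
```

Both queries return the five rows the question asks for; which one performs better depends on the data and on the engine's optimizer.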