How to select duplicate columns data from table - sql

I have a table like this:
col1 col2 col3 col4 col5
test1 1 13 15 1
test2 1 13 15 4
test3 2 7 3 5
test4 3 11 14 18
test5 3 11 14 8
test6 3 11 14 11
I want to select col1, col2, col3, col4 for the rows whose col2, col3, col4 combination is duplicated.
For example, the result must be:
col1 col2 col3 col4
test1 1 13 15
test2 1 13 15
test4 3 11 14
test5 3 11 14
test6 3 11 14
How can I do this?

Presuming SQL-Server >= 2005 you can use COUNT(*) OVER:
WITH CTE AS
(
SELECT col1, col2, col3, col4, cnt = COUNT(*) OVER (PARTITION BY col2, col3, col4)
FROM dbo.TableName t
)
SELECT col1, col2, col3, col4
FROM CTE WHERE cnt > 1
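To check the query against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server here (an assumption made only for illustration), so the T-SQL alias form cnt = COUNT(*) is written as COUNT(*) AS cnt and the dbo schema prefix is dropped. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableName (col1 TEXT, col2 INT, col3 INT, col4 INT, col5 INT)"
)
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?, ?, ?)",
    [
        ("test1", 1, 13, 15, 1),
        ("test2", 1, 13, 15, 4),
        ("test3", 2, 7, 3, 5),
        ("test4", 3, 11, 14, 18),
        ("test5", 3, 11, 14, 8),
        ("test6", 3, 11, 14, 11),
    ],
)

# Count the rows sharing each (col2, col3, col4) combination,
# then keep only the rows whose combination occurs more than once.
rows = conn.execute(
    """
    WITH CTE AS (
        SELECT col1, col2, col3, col4,
               COUNT(*) OVER (PARTITION BY col2, col3, col4) AS cnt
        FROM TableName
    )
    SELECT col1, col2, col3, col4
    FROM CTE
    WHERE cnt > 1
    ORDER BY col1
    """
).fetchall()

for row in rows:
    print(row)
```

The row test3 is the only one dropped, because its (col2, col3, col4) combination occurs once.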

If I understand correctly:
select col1, col2, col3, col4
from table t
where exists (select 1
from table t2
where t2.col2 = t.col2 and
t2.col3 = t.col3 and
t2.col4 = t.col4 and
t2.col1 <> t.col1);

A simple self-join can work, as long as a row is not matched with itself:
select distinct m1.col1, m1.col2, m1.col3, m1.col4 from Mytable m1
join Mytable m2
on m1.col2 = m2.col2
and m1.col3 = m2.col3
and m1.col4 = m2.col4
and m1.col1 <> m2.col1

You can find the duplicated combinations with GROUP BY and HAVING:
SELECT col2, col3, col4 FROM your_table
GROUP BY col2, col3, col4
HAVING COUNT(*) > 1
Afterwards, join the result with the table itself (ON col2, col3, col4) to get the full rows.
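A minimal sketch of the two-step idea: first find the (col2, col3, col4) combinations that occur more than once, then join them back to the table to get the full rows. The first step here uses GROUP BY with HAVING COUNT(*) > 1. It runs on an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for illustration; the table name your_table and the sample rows come from the question.

```python
import sqlite3

# In-memory SQLite database used only to try the queries out.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (col1 TEXT, col2 INT, col3 INT, col4 INT, col5 INT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?, ?)",
    [
        ("test1", 1, 13, 15, 1),
        ("test2", 1, 13, 15, 4),
        ("test3", 2, 7, 3, 5),
        ("test4", 3, 11, 14, 18),
        ("test5", 3, 11, 14, 8),
        ("test6", 3, 11, 14, 11),
    ],
)

# Step 1: combinations of (col2, col3, col4) that appear more than once.
dup_keys = conn.execute(
    """
    SELECT col2, col3, col4
    FROM your_table
    GROUP BY col2, col3, col4
    HAVING COUNT(*) > 1
    ORDER BY col2
    """
).fetchall()

# Step 2: join those combinations back to the table to recover col1.
rows = conn.execute(
    """
    SELECT t.col1, t.col2, t.col3, t.col4
    FROM your_table t
    JOIN (
        SELECT col2, col3, col4
        FROM your_table
        GROUP BY col2, col3, col4
        HAVING COUNT(*) > 1
    ) d ON d.col2 = t.col2 AND d.col3 = t.col3 AND d.col4 = t.col4
    ORDER BY t.col1
    """
).fetchall()

print(dup_keys)
print([r[0] for r in rows])
```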

Related

Oracle self join starting with minimum value for each partition

I have this table:
COL1 COL2 COL3
--------------------
A 1 VAL1
A 2 VAL2
A 4 VAL3
B 2 VAL4
B 4 VAL5
B 5 VAL6
And I would like to obtain this output:
COL1 COL2 COL3
--------------------
A 1 VAL1
A 2 VAL2
A 3 NULL
B 2 VAL4
B 3 NULL
B 4 VAL5
Logic:
Starting from the smallest COL2 value in each COL1 partition, take three consecutive numbers. If the COL1, COL2 combination is present in the first table, show its COL3; otherwise show NULL.
Your question is a good example of what PARTITIONED OUTER JOIN was created for:
with top3 as (
select *
from (
select
col1, col2, col3
,min(col2)over(partition by col1) min_col2
,col2 - min(col2)over(partition by col1) + 1 as rn
from t
)
where col2 < min_col2 + 3
)
select
top3.col1
,r3.n as col2
,top3.col3
from
top3
partition by (col1)
right join
(select level n from dual connect by level<=3) r3
on r3.n=top3.rn;
As you can see, the first step is to get top3; then partition by (col1) right join r3 fills the gaps, where r3 is just a generator of 3 rows.
Results:
COL1 COL2 COL3
----- ---------- ----
A 1 VAL1
A 2 VAL2
A 3
B 1 VAL4
B 2
B 3 VAL5
6 rows selected.
Note, this approach allows you to scan your table just once!
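For databases without the partitioned outer join syntax, the same densification can be emulated, since a partitioned outer join behaves like an outer join applied to each partition separately. The sketch below is an emulation under stated assumptions, not the Oracle feature itself: it uses SQLite through Python's sqlite3 module, a VALUES list instead of CONNECT BY for the three-row generator, and a hypothetical helper CTE named parts holding the distinct col1 values. Unlike the Oracle query, it references the top3 rows more than once.

```python
import sqlite3

# In-memory SQLite database with the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", 1, "VAL1"), ("A", 2, "VAL2"), ("A", 4, "VAL3"),
        ("B", 2, "VAL4"), ("B", 4, "VAL5"), ("B", 5, "VAL6"),
    ],
)

rows = conn.execute(
    """
    WITH top3 AS (
        -- rows within 3 of the partition minimum, numbered by their offset
        SELECT col1, col3, rn
        FROM (
            SELECT col1, col2, col3,
                   MIN(col2) OVER (PARTITION BY col1) AS min_col2,
                   col2 - MIN(col2) OVER (PARTITION BY col1) + 1 AS rn
            FROM t
        )
        WHERE col2 < min_col2 + 3
    ),
    r3(n) AS (VALUES (1), (2), (3)),           -- 3-row generator
    parts AS (SELECT DISTINCT col1 FROM top3)  -- one row per partition
    SELECT parts.col1, r3.n AS col2, top3.col3
    FROM parts
    CROSS JOIN r3
    LEFT JOIN top3 ON top3.col1 = parts.col1 AND top3.rn = r3.n
    ORDER BY parts.col1, r3.n
    """
).fetchall()

for row in rows:
    print(row)
```

Every partition gets all three generator rows, and the missing positions come back with col3 as None (NULL).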
Let's see. Here is the table
select * from t order by col1, col2;
COL1 COL2 COL3
----- ---------- -----
A 1 VAL1
A 2 VAL2
A 4 VAL3
B 2 VAL4
B 4 VAL5
B 5 VAL6
6 rows selected
and now let's try to apply the described logic
with offsets as
(select level - 1 offset from dual connect by level <= 3),
smallest_col2 as
(select col1, min(col2) min_col2 from t group by col1)
select sc2.col1, sc2.min_col2 + o.offset col2, t.col3
from smallest_col2 sc2
cross join offsets o
left join t
on t.col1 = sc2.col1
and t.col2 = sc2.min_col2 + o.offset
order by 1, 2;
COL1 COL2 COL3
----- ---------- -----
A 1 VAL1
A 2 VAL2
A 3
B 2 VAL4
B 3
B 4 VAL5
6 rows selected
Use a recursive CTE to get the COL2s from the min of each COL1 up to the next 2 and then a left join to the table:
WITH cte(COL1, COL2, max_col2) AS (
SELECT COL1, MIN(COL2), MIN(COL2) + 2
FROM tablename
GROUP BY COL1
UNION ALL
SELECT COL1, COL2 + 1, max_col2
FROM cte
WHERE COL2 < max_col2
)
SELECT c.COL1, c.COL2, t.COL3
FROM cte c LEFT JOIN tablename t
ON t.COL1 = c.COL1 AND t.COL2 = c.COL2
ORDER BY c.COL1, c.COL2
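The recursive CTE can be tried on the sample data with a small sketch like the following. It assumes SQLite (through Python's sqlite3 module) in place of Oracle and writes the CTE with the RECURSIVE keyword, which SQLite accepts; the table name tablename is taken from the query above.

```python
import sqlite3

# In-memory SQLite database with the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (COL1 TEXT, COL2 INT, COL3 TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("A", 1, "VAL1"), ("A", 2, "VAL2"), ("A", 4, "VAL3"),
        ("B", 2, "VAL4"), ("B", 4, "VAL5"), ("B", 5, "VAL6"),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE cte(COL1, COL2, max_col2) AS (
        -- anchor: the smallest COL2 per COL1, plus the upper bound min + 2
        SELECT COL1, MIN(COL2), MIN(COL2) + 2
        FROM tablename
        GROUP BY COL1
        UNION ALL
        -- step: next consecutive COL2 until the bound is reached
        SELECT COL1, COL2 + 1, max_col2
        FROM cte
        WHERE COL2 < max_col2
    )
    SELECT c.COL1, c.COL2, t.COL3
    FROM cte c
    LEFT JOIN tablename t ON t.COL1 = c.COL1 AND t.COL2 = c.COL2
    ORDER BY c.COL1, c.COL2
    """
).fetchall()

for row in rows:
    print(row)
```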
The partitioned outer join, already demonstrated in Sayan's answer, is probably the best approach for that part of the assignment (data densification).
For the first part, in Oracle 12.1 and higher you can use the match_recognize clause:
select col1, col2, col3
from this_table
match_recognize(
partition by col1
order by col2
measures col2 - a.col2 + 1 as rn
all rows per match
pattern ( ^ a b* )
define b as col2 <= a.col2 + 2
)
partition by (col1)
right outer join
(select level as rn from dual connect by level <= 3) using (rn)
;
Another solution with the "recursive WITH clause"
With rws_numbered (COL1, COL2, COL3, rn) as (
select COL1, COL2, COL3
, row_number()over(order by col1, col3) rn
from Your_table
)
, cte ( COL1, COL2, COL3, rn ) as (
select COL1, COL2, COL3, rn
from rws_numbered
where rn = 1
union all
select
t.COL1
, case when t.col1 = c.col1 then c.col2 + 1 else t.col2 end COL2
, t.COL3
, t.rn
from rws_numbered t
join cte c
on c.rn + 1 = t.rn
)
select COL1, COL2, case when exists (select null from Your_table t where t.COL1 = cte.COL1 and t.COL2 = cte.COL2) then COL3 else null end COL3
from cte
order by 1, 2
;

How to extract the rows where a group appears more than a certain number of times

I have the following table
col1 col2 col3 key
A B C 1
A B B 2
A B B 3
A B D 4
B D C 5
I would like to extract the rows where the group col1, col2, col3 appears more than once in the table.
A B B 2
A B B 3
So far, I have:
SELECT col1, col2, col3, count(*)
FROM db.table
GROUP BY col1, col2, col3
HAVING count(*) > 1
col1 col2 col3 count(*)
A B B 2
Is there a way to extract those rows with A B B without having to join the final table with the initial table?
You could use exists logic:
SELECT col1, col2, col3, "key"
FROM yourTable t1
WHERE EXISTS (SELECT 1 FROM yourTable t2
WHERE t2.col1 = t1.col1 AND t2.col2 = t1.col2 AND
t2.col3 = t1.col3 AND
t2."key" <> t1."key");
Try the query below with a CTE:
with MyCTE
as
(
select col1, col2, col3, "key", COUNT(*) over (PARTITION BY col1, col2, col3) as Duplicate
from yourtable
)
select col1, col2, col3, "key" from MyCTE where Duplicate > 1
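A minimal sketch of the window-count approach on the sample rows, which avoids joining the grouped result back to the table. It assumes SQLite 3.25 or newer through Python's sqlite3 module (the question does not name a database), and quotes the column name "key" because it is a keyword in several dialects.

```python
import sqlite3

# In-memory SQLite database with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourTable (col1 TEXT, col2 TEXT, col3 TEXT, "key" INT)')
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        ("A", "B", "C", 1),
        ("A", "B", "B", 2),
        ("A", "B", "B", 3),
        ("A", "B", "D", 4),
        ("B", "D", "C", 5),
    ],
)

# Attach the group size to every row, then filter on it; no self-join needed.
rows = conn.execute(
    """
    SELECT col1, col2, col3, "key"
    FROM (
        SELECT col1, col2, col3, "key",
               COUNT(*) OVER (PARTITION BY col1, col2, col3) AS duplicate
        FROM yourTable
    )
    WHERE duplicate > 1
    ORDER BY "key"
    """
).fetchall()

print(rows)
```

Changing the filter to duplicate > n would keep only groups that appear more than n times.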

Why do left and right join ignore some values in a query?

I have a table with 3 columns and these values:
col1 col2 col3
-------------------
1 2 8
1 3 5
1 10 15
2 4 6
2 9 7
3 5 6
I combine a LEFT JOIN query and a RIGHT JOIN query with UNION, each joining grouped counts per column, to count how often each number occurs in every column (MS Access).
SELECT Col1 AS Num, t1.CON1, t2.CON2, t3.CON3
FROM
(((SELECT col1, COUNT(col1) AS CON1 FROM table GROUP BY col1) AS t1
LEFT JOIN (SELECT col2, COUNT(col2) AS CON2 FROM table GROUP BY col2) AS t2
ON t1.col1 = t2.col2)
LEFT JOIN (SELECT col3, COUNT(col3) AS CON3 FROM table GROUP BY col3) AS t3
ON t2.col2 = t3.col3)
UNION
SELECT col3 AS Num, t1.CON1, t2.CON2, t3.CON3
FROM
(((SELECT col1, COUNT(col1) AS CON1 FROM table GROUP BY col1) AS t1
RIGHT JOIN (SELECT col2, COUNT(col2) AS CON2 FROM table GROUP BY col2) AS t2
ON t1.col1 = t2.col2)
RIGHT JOIN (SELECT col3, COUNT(col3) AS CON3 FROM table GROUP BY col3) AS t3
ON t2.col1 = t3.col3)
It results like this:
Num CON1 CON2 CON3
--------------------------
1 3
2 2 1
3 1 1
5 1 1
6 2
7 1
8 1
15 1
But this query ignores count of values from column 2 of table
Num CON2
---------------
4 1
9 1
10 1
What is missing in my query?
If you want to count each value and the number of times in each column, then use union all to split the data and then group by:
select num, sum(col1), sum(col2), sum(col3)
from ((select col1 as num, 1 as col1, 0 as col2, 0 as col3
from t
) union all
(select col2 as num, 0 as col1, 1 as col2, 0 as col3
from t
) union all
(select col3 as num, 0 as col1, 0 as col2, 1 as col3
from t
)
) as x
group by num
order by num;
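As a quick check of the UNION ALL approach on the question's data, here is a sketch that runs it in SQLite through Python's sqlite3 module instead of MS Access (an assumption for illustration). SQLite does not accept parentheses around the individual SELECTs of a compound query, so they are dropped, and the flag columns are given the hypothetical names c1, c2, c3 to avoid reusing the source column names.

```python
import sqlite3

# In-memory SQLite database with the six rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INT, col2 INT, col3 INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 2, 8), (1, 3, 5), (1, 10, 15), (2, 4, 6), (2, 9, 7), (3, 5, 6)],
)

# Stack the three columns into one "num" column, flagging the source column,
# then sum the flags per number.
rows = conn.execute(
    """
    SELECT num, SUM(c1) AS con1, SUM(c2) AS con2, SUM(c3) AS con3
    FROM (
        SELECT col1 AS num, 1 AS c1, 0 AS c2, 0 AS c3 FROM t
        UNION ALL
        SELECT col2, 0, 1, 0 FROM t
        UNION ALL
        SELECT col3, 0, 0, 1 FROM t
    ) AS x
    GROUP BY num
    ORDER BY num
    """
).fetchall()

for row in rows:
    print(row)
```

Each number appears exactly once, including 4, 9 and 10, which occur only in col2.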
The query below gives correct totals, but I can't figure out how to summarise the results (it lists 3 twice, for example) ...
(
(SELECT col1 as num, COUNT(col1) AS CON1, null as CON2, null as CON3 FROM mytable t1
LEFT JOIN (SELECT col2, COUNT(col2) AS CON2 FROM mytable GROUP BY col2) t2
ON t1.col1 = t2.col2
LEFT JOIN (SELECT col3, COUNT(col3) AS CON3 FROM mytable GROUP BY col3) t3
ON t2.col2 = t3.col3
GROUP BY t1.col1)
UNION
(SELECT col2 as num, null as CON1, COUNT(col2) AS CON2, null as CON3 FROM mytable t4
left JOIN (SELECT col1, COUNT(col1) AS CON1 FROM mytable GROUP BY col1) t5
ON t4.col2 = t5.col1
left JOIN (SELECT col3, COUNT(col3) AS CON3 FROM mytable GROUP BY col3) t6
ON t4.col2 = t6.col3
GROUP BY t4.col2)
UNION
(SELECT col3 as num, null as CON1, null as CON2, COUNT(col3) AS CON3 FROM mytable t7
left JOIN (SELECT col1, COUNT(col1) AS CON1 FROM mytable GROUP BY col1) t8
ON t7.col3 = t8.col1
left JOIN (SELECT col2, COUNT(col2) AS CON2 FROM mytable GROUP BY col2) t9
ON t7.col3 = t9.col2
GROUP BY t7.col3)
)
RESULTS:
num CON1 CON2 CON3
1 3
2 1
2 2
3 1
3 1
4 1
5 1
5 1
6 2
7 1
8 1
9 1
10 1
15 1
... Anyone?
SQL TEST: https://sqltest.net/#979886

Sql Query for Unique and Duplicates in oracle sql?

I need to display unique records in one column and duplicates in another column in Oracle?
COL1 COL2
1 10
1 10
2 20
3 30
3 30
unique in one set    duplicate in one set
col1 col2            col1 col2
2    20              1    10
                     1    10
                     3    30
                     3    30
You can use the group by for both cases with the having clause:
Unique records
select *
from table t
inner join (
select col1, col2, count(*) as times
from table
group by col1, col2
having count(*) = 1) t2 ON t.col1 = t2.col1 and t.col2 = t2.col2
Duplicate records:
select *
from table t
inner join (
select col1, col2, count(*) as times
from table
group by col1, col2
having count(*) > 1) t2 ON t.col1 = t2.col1 and t.col2 = t2.col2
Would something like this do? See comments within code.
with
test (col1, col2) as
-- sample data
(select 1, 10 from dual union all
select 1, 10 from dual union all
select 2, 20 from dual union all
select 3, 30 from dual union all
select 3, 30 from dual
),
uni as
-- unique values
(select col1, col2
from test
group by col1, col2
having count(*) = 1
),
dup as
-- duplicate values
(select col1, col2
from test
group by col1, col2
having count(*) > 1
)
-- the final result
select u.col1 ucol1,
u.col2 ucol2,
d.col1 dcol1,
d.col2 dcol2
from uni u full outer join dup d on u.col1 = d.col1;
     UCOL1      UCOL2      DCOL1      DCOL2
---------- ---------- ---------- ----------
                               1         10
                               3         30
         2         20
You can identify the duplicate values using window functions, and then filter on the count. To get unique records:
select col1, col2
from (select t.*, count(*) over (partition by col1, col2) as cnt
from t
) t
where cnt = 1;
To get duplicates:
select col1, col2
from (select t.*, count(*) over (partition by col1, col2) as cnt
from t
) t
where cnt > 1;
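The window-count idea can be sketched on the sample data as follows. This assumes SQLite 3.25 or newer through Python's sqlite3 module rather than Oracle, and it partitions by both col1 and col2 so that a record counts as a duplicate only when the whole (col1, col2) pair repeats. The count is computed once and the split into the two sets happens in Python.

```python
import sqlite3

# In-memory SQLite database with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INT, col2 INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 10), (1, 10), (2, 20), (3, 30), (3, 30)],
)

# Attach to every row the number of rows sharing its (col1, col2) pair.
counted = conn.execute(
    """
    SELECT col1, col2, COUNT(*) OVER (PARTITION BY col1, col2) AS cnt
    FROM t
    ORDER BY col1, col2
    """
).fetchall()

# Split into the two sets the question asks for.
unique = [(c1, c2) for c1, c2, cnt in counted if cnt == 1]
duplicates = [(c1, c2) for c1, c2, cnt in counted if cnt > 1]

print("unique:", unique)
print("duplicates:", duplicates)
```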

Count per category

I have a table as below:
COL1 | COL2 | COL3
1 1 1
1 1 2
1 2 0
1 2 1
2 3 1
2 3 2
2 4 0
2 4 1
3 1 0
3 2 0
.
.
.
I want to select the COL1 values for which every COL2 has sum(COL3) > 0. If I know there are 20 distinct values in COL2, how can I pull all COL1 values that have all 20 COL2 values with sum(COL3) > 0? The end result should be:
COL1 | COL2 | COL3
1 1 3
1 2 1
2 3 3
2 4 1
I have tried a lot of ways to do this but no success.
Just use group by and having.
select col1,col2,sum(col3)
from tbl
group by col1,col2
having sum(col3)>0
select t1.*
from yourTable t1
inner join
(
select t.col1
from
(
select col1, col2, sum(col3) as col_sum
from yourTable
group by col1, col2
) t
group by t.col1
having sum(case when t.col_sum = 0 then 1 else 0 end) = 0
) t2
on t1.col1 = t2.col1
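A sketch of a variation on this idea that returns the summed rows shown in the question's expected result rather than the raw rows. It assumes SQLite through Python's sqlite3 module (the question does not name a database), includes only the ten sample rows that are visible in the question, and uses a hypothetical CTE name sums for the per-(col1, col2) totals.

```python
import sqlite3

# In-memory SQLite database with the visible sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (col1 INT, col2 INT, col3 INT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1),
        (2, 3, 1), (2, 3, 2), (2, 4, 0), (2, 4, 1),
        (3, 1, 0), (3, 2, 0),
    ],
)

rows = conn.execute(
    """
    WITH sums AS (
        -- total of col3 for every (col1, col2) pair
        SELECT col1, col2, SUM(col3) AS col_sum
        FROM yourTable
        GROUP BY col1, col2
    )
    SELECT s.col1, s.col2, s.col_sum
    FROM sums s
    WHERE s.col1 IN (
        -- keep only col1 values with no zero-sum col2 group;
        -- adding "AND COUNT(*) = 20" would also require all 20 col2 values
        SELECT col1
        FROM sums
        GROUP BY col1
        HAVING SUM(CASE WHEN col_sum = 0 THEN 1 ELSE 0 END) = 0
    )
    ORDER BY s.col1, s.col2
    """
).fetchall()

for row in rows:
    print(row)
```

The rows for col1 = 3 are dropped because both of its col2 groups sum to 0.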
I use a CTE and a Group by with a where condition
;WITH CTE as (
select COL1,COL2,SUM(COL3) as COL3 FROM table1
Group By
COL1,COL2
)
select * from CTE
where COL3>0
Just group by col1 and col2 and check whether the sum is bigger than 0:
select col1,col2,sum(col3)
from tbl
group by col1,col2
having sum(col3)>0
http://sqlfiddle.com/#!9/537f8c/1
See if the query below gives you the result you are after. It selects col1, col2 and the sum of col3 from a derived table that excludes the rows where col3 is 0:
select col1, col2, sum(col3)
from
(
select col1, col2, col3 from tbl where col3 <> 0
) as ds
group by col1, col2