"duplicate" rows - how to select distinct - sql

I have a table with this structure
id1 id2
--------------
10 2
2 10
12 15
I need to select "distinct" using SQL in the sense that rows 1 and 2 are considered the same
So I need a query that results in
10 2
12 15
or
2 10
12 15
Both are fine.
Any good ideas? This problem is driving me crazy :-)

One simple method is:
select t.*
from t
where id1 < id2 or
not exists (select 1 from t t2 where t2.id2 = t.id1 and t2.id1 = t.id2)

In a DBMS that supports LEAST and GREATEST you can use these to get ordered pairs:
select distinct
least(id1, id2) as lesser_id,
greatest(id1, id2) as greater_id
from mytable;
In a DBMS that doesn't support these functions, you can use CASE expressions to achieve the same:
select distinct
case when id1 <= id2 then id1 else id2 end as lesser_id,
case when id1 >= id2 then id1 else id2 end as greater_id
from mytable;
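As a quick check of the CASE-based approach, here is a minimal sketch that runs it against the sample rows from the question. It assumes an in-memory SQLite database driven from Python as a stand-in for whatever DBMS is actually in use; the table name mytable comes from the answer above, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the asker's DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id1 INTEGER, id2 INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(10, 2), (2, 10), (12, 15)])

# Put the smaller id first in every pair, then let DISTINCT remove the repeats.
query = """
    SELECT DISTINCT
        CASE WHEN id1 <= id2 THEN id1 ELSE id2 END AS lesser_id,
        CASE WHEN id1 >= id2 THEN id1 ELSE id2 END AS greater_id
    FROM mytable
    ORDER BY lesser_id, greater_id
"""
pairs = conn.execute(query).fetchall()
print(pairs)  # [(2, 10), (12, 15)]
```

Note that the pair (10, 2) comes back as (2, 10): this approach returns normalized pairs, not necessarily a row as it was stored.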

I would do:
SELECT DISTINCT id1, id2
FROM (
SELECT id1, id2 FROM mytable
UNION
SELECT id2, id1 FROM mytable
) AS combinations

Another solution, using relations instead of a DISTINCT clause:
SELECT A.id1, A.id2
FROM mytable A LEFT JOIN mytable B ON A.id1 > B.id1 AND A.id1 = B.id2 AND A.id2 = B.id1
WHERE B.id1 IS NULL
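To see how this anti-join behaves, here is a small sketch, again assuming an in-memory SQLite database from Python rather than the asker's real DBMS. The third sample row is changed to (15, 12) (my own variation, not from the question) to show that rows without a mirror image keep their stored column order.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the asker's DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id1 INTEGER, id2 INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(10, 2), (2, 10), (15, 12)])

# Anti-join: a row is dropped only when its mirror image exists with a smaller id1.
query = """
    SELECT A.id1, A.id2
    FROM mytable A
    LEFT JOIN mytable B
        ON A.id1 > B.id1 AND A.id1 = B.id2 AND A.id2 = B.id1
    WHERE B.id1 IS NULL
    ORDER BY A.id1
"""
kept = conn.execute(query).fetchall()
print(kept)  # [(2, 10), (15, 12)]
```

Of the mirrored pair only (2, 10) survives, while (15, 12) is returned exactly as stored.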

Related

Recursive SQL retrieve all levels

I am unable to retrieve the desired result from my query when using Oracle's recursive approach:
Foo
ID1 ID2
1 2
1 3
4 2
4 3
4 5
Query:
select sys_connect_by_path(id2,' -> ')
FROM Foo
START WITH id1 = 1
CONNECT BY PRIOR id1 = id2
ORDER BY 1;
This outputs only the first level of the hierarchy (2,3). I want it to detect the tree ( 1 -> (2,3) -> 4 -> 5 ), such that selecting distinct ID2 yields (2,3,5). Thank you.
If you are using Oracle 11.2 or above, a recursive CTE (Common Table Expression) is preferred over Oracle's CONNECT BY clause:
WITH
aset -- Create pseudo table with ID2 as ID1 and vice versa
AS
(SELECT id1, id2
FROM (SELECT id1, id2
FROM foo
UNION
SELECT id2, id1
FROM foo)
WHERE id1 < id2),
bset (id1, id2) -- Extract hierarchy from pseudo table
AS
(SELECT id1, id2
FROM aset
WHERE id1 = 1
UNION ALL
SELECT aset.id1, aset.id2
FROM bset INNER JOIN aset ON bset.id2 = aset.id1
WHERE bset.id1 <> aset.id2)
SELECT DISTINCT bset.id2 -- Only keep values that were originally ID2
FROM bset INNER JOIN foo ON bset.id2 = foo.id2
ORDER BY id2;
Here is the same thing using CONNECT BY
WITH
aset
-- Create pseudo table with ID2 as ID1 and vice versa
AS
(SELECT id1, id2
FROM (SELECT id1, id2
FROM foo
UNION
SELECT id2, id1
FROM foo)
WHERE id1 < id2),
bset
-- Extract hierarchy from pseudo table
AS
( SELECT id2
FROM aset
START WITH id1 = 1
CONNECT BY PRIOR id2 = id1)
SELECT DISTINCT bset.id2
-- Only keep values that were originally ID2
FROM bset INNER JOIN foo ON bset.id2 = foo.id2
ORDER BY id2
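The recursive CTE variant can be tried out on the sample data with the sketch below. It assumes SQLite (run in memory from Python) instead of Oracle, so the WITH RECURSIVE keyword is added; the logic of aset and bset is otherwise the same as in the answer.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id1 INTEGER, id2 INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [(1, 2), (1, 3), (4, 2), (4, 3), (4, 5)],
)

query = """
    WITH RECURSIVE
    aset AS (  -- every edge in both directions, kept only as (smaller, larger)
        SELECT id1, id2
        FROM (SELECT id1, id2 FROM foo
              UNION
              SELECT id2, id1 FROM foo)
        WHERE id1 < id2
    ),
    bset (id1, id2) AS (  -- walk the edges starting from id1 = 1
        SELECT id1, id2 FROM aset WHERE id1 = 1
        UNION ALL
        SELECT aset.id1, aset.id2
        FROM bset INNER JOIN aset ON bset.id2 = aset.id1
        WHERE bset.id1 <> aset.id2
    )
    SELECT DISTINCT bset.id2  -- keep only values that were originally ID2
    FROM bset INNER JOIN foo ON bset.id2 = foo.id2
    ORDER BY bset.id2
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)  # [2, 3, 5]
```

The value 4 is reached during the walk but is filtered out by the final join, because it never appears in the original ID2 column.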

SQL Complex join not giving distinct result

I have two tables :-
Table1:-
ID1
1
1
1
1
4
5
Table2:-
Id2
2
2
1
1
1
8
I want to show all the ID2 from table2 which are present in ID1 of table1 by using joins
I used :-
select ID2 from Table2 t2 left join Table1 t1
on t2.Id2=t1.Id1
But this was giving repeated result as :-
Id2
1
1
1
1
1
1
1
It should show 1 only 3 times, as it is present in Table2 3 times.
Please help.
You're matching the value 1 against 4 rows in Table1 and 3 rows in Table2, which is why you're seeing 12 rows. You need an additional JOIN condition. You can add a ROW_NUMBER and do an INNER JOIN to achieve your desired result.
WITH Cte1 AS(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY Id1 ORDER BY (SELECT NULL))
FROM Table1
),
Cte2 AS(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY Id2 ORDER BY (SELECT NULL))
FROM Table2
)
SELECT c2.Id2
FROM Cte2 c2
INNER JOIN Cte1 c1
ON c1.Id1 = c2.Id2
AND c1.rn = c2.rn
However, you can achieve the desired result without using a JOIN.
SELECT *
FROM Table2 t2
WHERE EXISTS(
SELECT 1 FROM Table1 t1 WHERE t1.Id1 = t2.Id2
)
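To make the row multiplication concrete, the sketch below loads the sample data from the question into an in-memory SQLite database (an assumption; the original question appears to target SQL Server) and compares a plain join with the EXISTS filter.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the asker's DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Id1 INTEGER)")
conn.execute("CREATE TABLE Table2 (Id2 INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?)", [(1,), (1,), (1,), (1,), (4,), (5,)])
conn.executemany("INSERT INTO Table2 VALUES (?)", [(2,), (2,), (1,), (1,), (1,), (8,)])

# A join pairs each of the 3 matching Table2 rows with each of the 4 Table1 rows.
joined = conn.execute(
    "SELECT t2.Id2 FROM Table2 t2 INNER JOIN Table1 t1 ON t1.Id1 = t2.Id2"
).fetchall()

# EXISTS only asks whether a match is present, so each Table2 row appears once.
filtered = conn.execute(
    "SELECT t2.Id2 FROM Table2 t2 "
    "WHERE EXISTS (SELECT 1 FROM Table1 t1 WHERE t1.Id1 = t2.Id2)"
).fetchall()

print(len(joined))  # 12
print(filtered)     # [(1,), (1,), (1,)]
```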
This is the expected behavior of a join. It matches every row of one table with every matching row of the other, so you get 12 rows containing the value 1 in the result of the join query.
You can use either of the queries below to get the desired result.
select ID2 from Table2 t2 WHERE ID2 IN (SELECT ID1 FROM Table1 t1)
select id2 from table2 t2 where exists ( select 1 from table1 t1 where t1.id1 = t2.id2)
Your join logic works fine; the problem is that each of your ID2 values matches every ID1 row with the same value. A simple solution would be to join with a table of distinct ID1s to avoid this duplication.
select
t2.ID2
from Table2 t2
left join (select distinct * from Table1) t1
on t1.Id1=t2.Id2
where t1.ID1 is not null
;
The left join selects your entire ID2 list with ID1 populated in a column; ID1 is null where there was no match. The WHERE clause then leaves out those null rows, so only the ID2 values that have a match remain.

SQL Server get column not in Group By clause?

How to get the following result from this table?
ID1|ID2| Date
----------------------
1 | 1 | 01-01-2014
1 | 2 | 02-01-2014
2 | 3 | 03-01-2014
I want to get ID1 & ID2 for the maximum date when grouped by ID1
Result:
1,2
2,3
My code:
SELECT
ID1, MAX(DATE)
FROM
Table
GROUP BY
ID1
I need something like
SELECT
ID1, ID2, MAX(DATE)
FROM
Table
GROUP BY
ID1
Can someone help me?
There are three ways to do it.
One, a subquery:
SELECT t1.ID1, t1.ID2, t2.MAX_DATE
FROM Table t1
INNER JOIN (
SELECT ID1, MAX(DATE) AS "MAX_DATE" FROM Table GROUP BY ID1) t2
ON t1.ID1 = t2.ID1
Or you can use the OVER() clause if you're on SQL Server 2005+, a recent version of Oracle or PostgreSQL, or most other current databases (older MySQL and MariaDB releases do not support it):
SELECT ID1,
ID2,
MAX(DATE) OVER(PARTITION BY ID1)
FROM Table
Or you can use a correlated subquery:
SELECT t1.ID1,
t1.ID2,
(SELECT MAX(DATE) FROM Table WHERE ID1 = t1.ID1)
FROM Table t1
You can accomplish this by joining the table to the aggregate, like this:
SELECT t.*
FROM
Table t
INNER JOIN
(
SELECT
ID1,
MAX(Date) MaxDate
FROM Table
GROUP BY ID1
) MaxDate ON
t.ID1 = MaxDate.ID1 AND
t.Date = MaxDate.MaxDate
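Here is a minimal sketch of the join-to-aggregate approach on the sample rows. It assumes an in-memory SQLite database rather than SQL Server, and uses the hypothetical names events and event_date in place of Table and Date (both are awkward identifiers); the dates are stored as ISO strings, which is also an assumption about the original column.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
# "events" and "event_date" are hypothetical names replacing Table and Date.
conn.execute("CREATE TABLE events (ID1 INTEGER, ID2 INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 1, "2014-01-01"), (1, 2, "2014-01-02"), (2, 3, "2014-01-03")],
)

# Find the latest date per ID1, then join back to pick up the matching ID2.
query = """
    SELECT t.ID1, t.ID2
    FROM events t
    INNER JOIN (
        SELECT ID1, MAX(event_date) AS max_date
        FROM events
        GROUP BY ID1
    ) m ON t.ID1 = m.ID1 AND t.event_date = m.max_date
    ORDER BY t.ID1
"""
latest = conn.execute(query).fetchall()
print(latest)  # [(1, 2), (2, 3)]
```

If two rows in the same ID1 group share the maximum date, this join returns both of them.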
You can use the ROW_NUMBER analytic function:
SELECT *
FROM
(SELECT *,
ROW_NUMBER() over ( partition by ID1 order by [date] desc) as seq
FROM Table1
) T
WHERE T.seq =1

How do I find groups of rows where all rows in each group have a specific column value

Sample data:
ID1 ID2 Num Type
---------------------
1 1 1 'A'
1 1 2 'A'
1 2 3 'A'
1 2 4 'A'
2 1 1 'A'
2 2 1 'B'
3 1 1 'A'
3 2 1 'A'
Desired result:
ID1 ID2
---------
1 1
1 2
3 1
3 2
Notice that I'm grouping by ID1 and ID2, but not Num, and that I'm looking specifically for groups where Type = 'A'. I know it's doable by joining two queries on the same table: one query to find all groups that have a single distinct Type, and another query to filter rows with Type = 'A'. But I was wondering if this can be done in a more efficient way.
I'm using SQL Server 2008, and my current query is:
SELECT ID1, ID2
FROM (
SELECT ID1, ID2
FROM T
GROUP BY ID1, ID2
HAVING COUNT( DISTINCT Type ) = 1
) AS SingleType
INNER JOIN (
SELECT ID1, ID2
FROM T
WHERE Type = 'A'
GROUP BY ID1, ID2
) AS TypeA ON
TypeA.ID1 = SingleType.ID1 AND
TypeA.ID2 = SingleType.ID2
EDIT: Updated sample data and query to indicate that I'm grouping on two columns, not just one.
SELECT ID1, ID2
FROM MyTable
GROUP BY ID1, ID2
HAVING COUNT(Type) = SUM(CASE WHEN Type = 'A' THEN 1 ELSE 0 END)
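The HAVING condition above can be checked against the sample data with the sketch below, which assumes an in-memory SQLite database in place of SQL Server 2008 and uses the table name T from the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server 2008 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID1 INTEGER, ID2 INTEGER, Num INTEGER, Type TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "A"), (1, 1, 2, "A"), (1, 2, 3, "A"), (1, 2, 4, "A"),
        (2, 1, 1, "A"), (2, 2, 1, "B"), (3, 1, 1, "A"), (3, 2, 1, "A"),
    ],
)

# A group qualifies when the number of 'A' rows equals the number of rows.
query = """
    SELECT ID1, ID2
    FROM T
    GROUP BY ID1, ID2
    HAVING COUNT(Type) = SUM(CASE WHEN Type = 'A' THEN 1 ELSE 0 END)
    ORDER BY ID1, ID2
"""
groups = conn.execute(query).fetchall()
print(groups)  # [(1, 1), (1, 2), (2, 1), (3, 1), (3, 2)]
```

On this data the query also returns (2, 1), because that (ID1, ID2) group contains only 'A' rows, whereas the desired result listed in the question leaves it out. Which of the two is intended depends on whether the check should apply per (ID1, ID2) group or per ID1.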
There are two alternatives that don't require the aggregation (but do require distinct)
ANTI-JOIN
SELECT DISTINCT t1.ID1, t1.ID2
FROM
table t1
LEFT JOIN table t2
ON t1.ID1 = t2.ID1
and t1.Type <> t2.Type
WHERE
t1.Type = 'A'
AND
t2.ID1 IS NULL
See it working at this data.se query Sample for 9132209 (Anti-Join)
NOT EXISTS
SELECT DISTINCT t1.ID1, t1.ID2
FROM
table t1
WHERE
t1.Type = 'A'
AND
NOT EXISTS
(SELECT 1
FROM table t2
WHERE t1.ID1 = t2.ID1 AND t2.Type <> 'A')
See it working at this data.se query Sample for 9132209 Not Exists

Sql query for comparing rows

Let's suppose we have a table:
id1 id2
1 2
2 1
3 4
4 3
The expected output is
id1 id2
1 2
3 4
Rows 1,2 and 2,1 are the same, and only one needs to be output.
What's the SQL query for this?
Assuming your RDBMS supports LEAST and GREATEST (Oracle does):
SELECT DISTINCT LEAST(id1, id2), GREATEST(id1, id2)
FROM mytable
Cross-platform version:
SELECT DISTINCT
CASE WHEN id1 < id2 THEN id1 ELSE id2 END,
CASE WHEN id1 > id2 THEN id1 ELSE id2 END
FROM mytable
Select ...
From MyTable As T
Where Exists (
Select 1
From MyTable As T2
Where T2.id1 = T.id2
And T2.id2 = T.id1
)
And T.id1 < T.id2
Another solution using Union
Select T.id1, T.id2
From MyTable As T
Where T.id1 <= T.id2
Union
Select T.id2, T.id1
From MyTable As T
Where T.id1 > T.id2
My interpretation of what you're trying to do is: return rows where id1 matches id2 and id2 matches id1, but only return rows from that set when id1 is also less than or equal to id2.
select x.id1, x.id2 from
myTable x, myTable y
where x.id1 = y.id2 and y.id1 = x.id2 and x.id1 <= y.id1
I had to solve the exact same question recently. See Eliminating duplicates.
select id1, id2
from t
where not exists (
select 1
from t t2
where t2.id1 = t.id2
and t2.id2 = t.id1
and t2.rowid > t.rowid
);
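A small sketch of this rowid-based idea, with the inner copy of the table given its own alias (t2) so the comparison is between two different rows. It assumes SQLite's implicit rowid as a stand-in for Oracle's ROWID pseudocolumn, with rows inserted in the order shown in the question.

```python
import sqlite3

# In-memory SQLite database; its implicit rowid stands in for Oracle's ROWID (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id1 INTEGER, id2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (2, 1), (3, 4), (4, 3)])

# Keep a row unless its mirror image was stored later (has a larger rowid).
query = """
    SELECT t.id1, t.id2
    FROM t
    WHERE NOT EXISTS (
        SELECT 1
        FROM t t2
        WHERE t2.id1 = t.id2
          AND t2.id2 = t.id1
          AND t2.rowid > t.rowid
    )
    ORDER BY t.rowid
"""
survivors = conn.execute(query).fetchall()
print(survivors)  # [(2, 1), (4, 3)]
```

This keeps the later-stored row of each mirrored pair, so the output is (2, 1) and (4, 3) rather than (1, 2) and (3, 4); changing the comparison to t2.rowid < t.rowid keeps the earlier row instead.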