How do I find duplicate rows, and at the same time distinct?


I have a SQL query question.
I have a table with first name, last name and mobile numbers of clients.
Thor Prestby 98726364
Thor Prestby 98726364
Lars Testrud 12938485
Lise Robol 12938485
I want to find rows with the same mobile number, that have different names. As you see above Thor has 2 rows, and that's right. Lars and Lise have the same mobile number and that is what I want to find.

You have pretty much outlined the required steps in your question yourself.
In a nutshell:
use a subselect to get all the distinct rows
group the unique result set from the subselect on mobilenumber
retain only those mobilenumbers that occur at least twice
SQL statement:
SELECT mobilenumber, COUNT(*)
FROM (
SELECT DISTINCT mobilenumber, firstname, lastname
FROM YourTable
) AS q
GROUP BY
mobilenumber
HAVING COUNT(*) > 1
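The three steps above can be tried without a database server. The following is a minimal sketch that assumes Python's built-in sqlite3 module and an in-memory table; the table and column names are taken from the query above, and the column types are an assumption.

```python
import sqlite3

# In-memory SQLite database; column types are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (firstname TEXT, lastname TEXT, mobilenumber INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        ("Thor", "Prestby", 98726364),
        ("Thor", "Prestby", 98726364),
        ("Lars", "Testrud", 12938485),
        ("Lise", "Robol", 12938485),
    ],
)

# Step 1: DISTINCT collapses the two identical Thor rows into one.
# Step 2: group the de-duplicated rows by number.
# Step 3: keep numbers that still occur more than once.
shared_numbers = conn.execute(
    """
    SELECT mobilenumber, COUNT(*)
    FROM (
        SELECT DISTINCT mobilenumber, firstname, lastname
        FROM YourTable
    ) AS q
    GROUP BY mobilenumber
    HAVING COUNT(*) > 1
    """
).fetchall()

print(shared_numbers)  # [(12938485, 2)]
```

Only the number shared by Lars and Lise is reported, because Thor's number belongs to a single distinct name once the exact duplicates are collapsed.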

I am assuming that you are using MS SQL Server here but you could use:
Declare @t table
(
FirstName varchar(100),
LastName varchar(100),
Mobile bigint
)
Insert Into @t
values ('Thor','Prestby',98726364),
('Thor','Prestby', 98726364),
('Lars','Testrud',12938485),
('Lise','Robol', 12938485),
('AN','Other', 12345868)
-- numbers that appear on more than one row ...
Select Mobile
From @t
Group By Mobile
Having Count(*) > 1
EXCEPT
-- ... minus numbers whose rows are exact duplicates of each other
Select Mobile
From @t
Group By FirstName, LastName, Mobile
Having Count(*) > 1
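For readers without SQL Server at hand, here is a rough sketch of the same EXCEPT idea using Python's sqlite3 module. The table name t is an assumption (SQLite has no table variables), and the extra row inserted at the end is hypothetical data added only to show one edge case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (FirstName TEXT, LastName TEXT, Mobile INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("Thor", "Prestby", 98726364),
        ("Thor", "Prestby", 98726364),
        ("Lars", "Testrud", 12938485),
        ("Lise", "Robol", 12938485),
        ("AN", "Other", 12345868),
    ],
)

EXCEPT_QUERY = """
    SELECT Mobile FROM t GROUP BY Mobile HAVING COUNT(*) > 1
    EXCEPT
    SELECT Mobile FROM t GROUP BY FirstName, LastName, Mobile HAVING COUNT(*) > 1
"""

# Numbers on several rows, minus numbers that have exactly duplicated rows.
result = conn.execute(EXCEPT_QUERY).fetchall()
print(result)  # [(12938485,)]

# Edge case (hypothetical row): Thor's number is now also used by Lise,
# but the exact Thor duplicates still cause EXCEPT to drop that number.
conn.execute("INSERT INTO t VALUES ('Lise', 'Robol', 98726364)")
result_with_edge_case = conn.execute(EXCEPT_QUERY).fetchall()
print(result_with_edge_case)  # [(12938485,)]
```

The second run suggests a limitation worth knowing: a number that has both an exact duplicate row and a differently named row is removed by the EXCEPT, so the DISTINCT subselect approach may be the safer choice for such data.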

SELECT * FROM the_table tt
WHERE EXISTS (
SELECT * FROM the_table xx
WHERE xx.mobileno = tt.mobileno
AND (xx.fname <> tt.fname OR xx.lname <> tt.lname)
);

For these records:
Thor Prestby 98726364
Thor Prestby 98726364
Lars Testrud 12938485
Lise Robol 12938485
AN Other 12345868
Try it:
select t.mobile, count(*) from new_table t
where t.mobile in
(select t1.mobile from new_table t1
where t1.mobile=t.mobile
group by t1.firstname, t1.lastname, t1.mobile
having count(*)=1)
group by t.mobile
having count(*)>1
You will get this result:
12938485 2

SELECT t1.phone, count(t1.phone) CountNo
FROM (SELECT distinct * FROM YourTable) t1
Left Outer Join YourTable t2 ON t1.phone = t2.phone AND ( t1.FirstName <> t2.FirstName OR t1.LastName <> t2.LastName)
WHERE t2.FirstName IS NOT NULL
GROUP BY t1.phone
HAVING count(t1.phone) > 1
ORDER BY CountNo desc

I am using standard SQL syntax and joins to achieve the results
Setup:
create table dummy
(
firstname varchar2(20),
lastname varchar2(20),
phone number
);
insert into dummy values('Thor','Prestby',98726364);
insert into dummy values('Thor','Prestby',98726364);
insert into dummy values('Lars','Testrud',12938485);
insert into dummy values('Lise','Robol',12938485);
Query:
select a.firstname, a.lastname, a.phone from dummy a inner join dummy b
on a.phone = b.phone and (a.firstname != b.firstname or a.lastname != b.lastname);
Result
FIRSTNAME LASTNAME PHONE
-------------------- -------------------- ----------------------
Lise Robol 12938485
Lars Testrud 12938485

Related

I have a table without any primary key and I want all its duplicate records

I have a table without any primary key, and I want to retrieve all of its duplicate records.
Table --
EmpName City
-------------
Shivam Noida
Ankit Delhi
Mani Gurugram
Shivam Faizabad
Mukesh Noida
and want output like this --
EmpName City
-------------
Shivam Noida
Shivam Faizabad
Mukesh Noida
Thanks in Advance.

I think you want exists:
select t.*
from t
where exists (select 1
from t t2
where (t2.empname = t.empname and t2.city <> t.city) or
(t2.city = t.city and t2.empname <> t.empname)
);

You seem to be looking for all rows where the name or city also appears in another row of the table.
select *
from mytable
where city in (select city from mytable group by city having count(*) > 1)
or empname in (select empname from mytable group by empname having count(*) > 1);
You say there is no primary key. This suggests that there can be duplicate rows (same name and city). This makes it impossible in many DBMS to use EXISTS here to look up the other rows. This is why I am suggesting IN and COUNT.
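As a quick check of the IN and COUNT approach, here is a small sketch using Python's sqlite3 module with the sample data from the question. The table name mytable comes from the query above; the column types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (empname TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("Shivam", "Noida"),
        ("Ankit", "Delhi"),
        ("Mani", "Gurugram"),
        ("Shivam", "Faizabad"),
        ("Mukesh", "Noida"),
    ],
)

# A row qualifies if its city or its name occurs on more than one row.
rows = conn.execute(
    """
    SELECT *
    FROM mytable
    WHERE city IN (SELECT city FROM mytable GROUP BY city HAVING COUNT(*) > 1)
       OR empname IN (SELECT empname FROM mytable GROUP BY empname HAVING COUNT(*) > 1)
    """
).fetchall()

# Sorted only to make the output order deterministic.
print(sorted(rows))
```

The three rows from the desired output (Shivam/Noida, Shivam/Faizabad, Mukesh/Noida) come back; Ankit and Mani do not, since neither their names nor their cities repeat.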

Use EXISTS with an OR condition:
with cte as
(
select 'Shivam' as name, 'Noida' as city union all
select 'Ankit' , 'Delhi' union all
select 'Mani' , 'Gurugram' union all
select 'Shivam' , 'Faizabad' union all
select 'Mukesh' , 'Noida'
) select t1.* from cte t1 where exists ( select 1 from cte t2
where t1.name=t2.name
group by name
having count(*)>1
)
or exists
(
select 1 from cte t2
where t1.city=t2.city
group by city
having count(*)>1
)
output
name city
Shivam Noida
Shivam Faizabad
Mukesh Noida

Do a UNION ALL to put the employee names and the city names in one column (d1).
GROUP BY its result, and use HAVING to return only the values that occur more than once (d2).
Then JOIN the table against d2:
select EmpName, City
from tablename t1
join (select name from
(select EmpName name from tablename
union all
select City from tablename) d1
group by name
having count(*) > 1) d2
on d2.name in (EmpName, City)

select distinct * from table
where col1 in (select col1 from table group by col1 having count(1) > 1)
or col2 in (select col2 from table group by col2 having count(1) > 1)

Find non Identical values using Group by in SQL Server

I need to find out whether non-identical values exist within a particular group. Kindly have a look at the following table, Contact:
ContactId FirstName LastName Mobile
_________________________________________________
1 Emma Watsan 9991234567
2 Jhon Wick 8887654321
1 Emma Watsan 9990001111
Here I need to fetch Emma Watsan and find out whether the mobile numbers are identical (bool/bit): 1 if all the mobile numbers are identical, otherwise 0.
I tried the following Query
SELECT COUNT(*) FROM Contact c
GROUP BY c.ContactId, c.FirstName, c.LastName
HAVING COUNT(*) >1
Kindly assist me how to find the result.

To get the information related to Emma Watsan you could use:
select * from Contact
where ContactId in (
select c.ContactId FROM Contact c
group by c.ContactId
having COUNT(*) >1
)
To get the contacts that have more than one distinct number:
select c.ContactId, COUNT(distinct Mobile) FROM Contact c
group by c.ContactId
having COUNT(distinct Mobile)>1

Use Count(Distinct [Mobile]) to get the number of distinct mobile numbers per ContactId. Then use a CASE expression to give 0 or 1 based on the count: if the count is greater than 1 then 0, else 1.
Query
select t.[Name], case when t.[Mobile] > 1 then 0 else 1 end as [Mobile_Identity] from(
select ContactId,
max([FirstName] + ' ' + [LastName]) as [Name],
count(distinct [Mobile]) as [Mobile]
from contacts
group by ContactId
)t;
And if you want to retrieve only the rows with multiple mobile numbers, then use a having clause.
select t.[Name], case when t.[Mobile] > 1 then 0 else 1 end as [Mobile_Identity] from(
select ContactId,
max([FirstName] + ' ' + [LastName]) as [Name],
count(distinct [Mobile]) as [Mobile]
from contacts
group by ContactId
having count(distinct [Mobile]) > 1
)t;
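The COUNT(DISTINCT ...) plus CASE idea can be sketched on the sample data with Python's sqlite3 module. Note the assumptions: SQLite concatenates strings with || rather than +, the subquery is folded into a single SELECT for brevity, and the table is named Contact as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Contact (ContactId INTEGER, FirstName TEXT, LastName TEXT, Mobile TEXT)"
)
conn.executemany(
    "INSERT INTO Contact VALUES (?, ?, ?, ?)",
    [
        (1, "Emma", "Watsan", "9991234567"),
        (2, "Jhon", "Wick", "8887654321"),
        (1, "Emma", "Watsan", "9990001111"),
    ],
)

# One row per contact: 1 if every row has the same mobile number, else 0.
flags = conn.execute(
    """
    SELECT MAX(FirstName || ' ' || LastName) AS Name,
           CASE WHEN COUNT(DISTINCT Mobile) > 1 THEN 0 ELSE 1 END AS Mobile_Identity
    FROM Contact
    GROUP BY ContactId
    ORDER BY ContactId
    """
).fetchall()

print(flags)  # [('Emma Watsan', 0), ('Jhon Wick', 1)]
```

Emma Watsan has two different numbers and is flagged 0; Jhon Wick has a single number and is flagged 1.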

Try This:
create table Contacts(ContactId int,FirstName varchar(15),LastName varchar(15),Mobile bigint)
insert into Contacts
select 1 ,'Emma','Watsan',9991234567
union all
select 2,'Jhon','Wick',8887654321
union all
select 1,'Emma','Watsan',9990001111
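-- cnt is the number of distinct mobile numbers per contact; exactly one means identical
select c.ContactId, c.FirstName, c.LastName, IIF(cnt = 1, 1, 0) IsIdentical
from (
SELECT c.ContactId, c.FirstName, c.LastName,
COUNT(DISTINCT c.Mobile) cnt FROM Contacts c
GROUP BY c.ContactId, c.FirstName, c.LastName) c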

How about:
Select distinct a.ContactId, case when b.mobile is null then 0 else 1 end as [is_duplicate]
from Contact a
left join Contact b
on a.ContactId = b.ContactId
and a.mobile = b.mobile
and a.id <> b.id
Where [id] is the primary key column in the table (you should have one).
Hope this helps.
PS: the table isn't normalized properly - if the ContactIDs repeat, the first and last name should not be in the same table.

I would do it something like this...(sample table variable with data included)
DECLARE @TABLE TABLE (ContactID INT, Firstname VARCHAR(55), Lastname VARCHAR(55), Mobile VARCHAR(55));
INSERT INTO @TABLE VALUES (1, 'Emma', 'Watsan', '9991234567');
INSERT INTO @TABLE VALUES (2, 'Jhon', 'Wick', '8887654321');
INSERT INTO @TABLE VALUES (1, 'Emma', 'Watsan', '9990001111');
INSERT INTO @TABLE VALUES (1, 'Emma', 'Watsan', '9990001111');
SELECT
T1.FirstName + ' ' + T1.LastName AS Name
,T1.Mobile
,MAX(CASE WHEN T2.RowID IS NULL THEN 0 ELSE 1 END) AS Duplicate
FROM
(
SELECT
ROW_NUMBER() OVER (ORDER BY FirstName,LastName, Mobile) AS RowID
,*
FROM @TABLE
) T1
LEFT JOIN
(
SELECT
ROW_NUMBER() OVER (ORDER BY FirstName,LastName, Mobile) AS RowID
,*
FROM @TABLE
) T2
ON T1.ContactID = T2.ContactID
AND T1.Mobile = T2.Mobile
AND T1.RowID <> T2.RowID
GROUP BY T1.FirstName + ' ' + T1.LastName, T1.Mobile
;
If the actual table already has row numbers, then the row_number() function can be skipped and the actual row ID of the table used in its place.
In the example here, Emma Watsan has the same number twice (on purpose), and another number that shows only once in the table. The duplicate mobile number is marked (Duplicate = 1), but the other numbers are not, as desired.

Show all rows that have certain columns duplicated

Suppose I have the following SQL table:
objid firstname lastname active
1 test test 0
2 test test 1
3 test1 test1 1
4 test2 test2 0
5 test2 test2 0
6 test3 test3 1
Now, the result I am interested in is as follows:
objid firstname lastname active
1 test test 0
2 test test 1
4 test2 test2 0
5 test2 test2 0
How can I achieve this?
I have tried the following query,
select firstname,lastname from table
group by firstname,lastname
having count(*) > 1
But this query gives results like
firstname lastname
test test
test2 test2

You've found your duplicated records but you're interested in getting all the information attached to them. You need to join your duplicates to your main table to get that information.
select *
from my_table a
join ( select firstname, lastname
from my_table
group by firstname, lastname
having count(*) > 1 ) b
on a.firstname = b.firstname
and a.lastname = b.lastname
This is the same as an inner join and means that, for every record in your sub-query (which finds the duplicated records), you get everything from your main table that has the same firstname and lastname combination.
You can also do this with in, though you should test the difference:
select *
from my_table a
where ( firstname, lastname ) in
( select firstname, lastname
from my_table
group by firstname, lastname
having count(*) > 1 )
Further Reading:
A visual representation of joins from Coding Horror
Join explanation from Wikipedia
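To compare the JOIN and the IN variants from this answer side by side, here is a small sketch using Python's sqlite3 module and the sample data from the question. The column types and the ORDER BY are additions for the demo, and the row-value IN form is assumed to be supported by the SQLite version in use (it is not accepted by every DBMS).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (objid INTEGER, firstname TEXT, lastname TEXT, active INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "test", "test", 0),
        (2, "test", "test", 1),
        (3, "test1", "test1", 1),
        (4, "test2", "test2", 0),
        (5, "test2", "test2", 0),
        (6, "test3", "test3", 1),
    ],
)

# Variant 1: join the grouped duplicates back to the full table.
via_join = conn.execute(
    """
    SELECT a.objid, a.firstname, a.lastname, a.active
    FROM my_table a
    JOIN (SELECT firstname, lastname
          FROM my_table
          GROUP BY firstname, lastname
          HAVING COUNT(*) > 1) b
      ON a.firstname = b.firstname AND a.lastname = b.lastname
    ORDER BY a.objid
    """
).fetchall()

# Variant 2: the same filter written with a row-value IN.
via_in = conn.execute(
    """
    SELECT objid, firstname, lastname, active
    FROM my_table
    WHERE (firstname, lastname) IN (SELECT firstname, lastname
                                    FROM my_table
                                    GROUP BY firstname, lastname
                                    HAVING COUNT(*) > 1)
    ORDER BY objid
    """
).fetchall()

print([row[0] for row in via_join])  # [1, 2, 4, 5]
print(via_join == via_in)            # True
```

Both variants return the rows with objid 1, 2, 4 and 5 here; which one performs better on a large table depends on the DBMS and its optimizer, so it is worth measuring.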

SELECT DISTINCT t1.*
FROM myTable AS t1
INNER JOIN myTable AS t2
ON t1.firstname = t2.firstname
AND t1.lastname = t2.lastname
AND t1.objid <> t2.objid
This will output every row that has a duplicate, based on firstname and lastname.

Here's a little more legible way to do Ben's first answer:
WITH duplicates AS (
select firstname, lastname
from my_table
group by firstname, lastname
having count(*) > 1
)
SELECT a.*
FROM my_table a
JOIN duplicates b ON (a.firstname = b.firstname and a.lastname = b.lastname)

SELECT user_name,email_ID
FROM User_Master WHERE
email_ID
in (SELECT email_ID
FROM User_Master GROUP BY
email_ID HAVING COUNT(*)>1)

A nice option to get all rows with a duplicated value from a table:
select * from Employee where Name in (select Name from Employee group by Name having COUNT(*)>1)

This is the easiest way:
SELECT * FROM yourtable a WHERE EXISTS (SELECT * FROM yourtable b WHERE a.firstname = b.firstname AND a.lastname = b.lastname AND a.objid <> b.objid)

If you want to print all duplicate IDs from the table:
select * from table where id in (select id from table group By id having count(id)>1)

I'm surprised that there is no answer using Window function. I just came across this use case and this helped me.
select t.objid, t.firstname, t.lastname, t.active
from
(
select t.*, count(*) over (partition by firstname, lastname) as cnt
from my_table t
) t
where t.cnt > 1;
Fiddle - https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=c0cc3b679df63c4d7d632cbb83a9ef13
The format goes like
select
tbl.relevantColumns
from
(
select t.*, count(*) over (partition by key_columns) as cnt
from desiredTable t
) as tbl
where tbl.cnt > 1;
This format selects whatever columns you require from the table (sometimes all columns) where the count > 1 for the key_columns being used to identify the duplicate rows. key_columns can be any number of columns.
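The window-function pattern can be tried locally as well. This is a minimal sketch with Python's sqlite3 module, assuming an SQLite build recent enough to support window functions (3.25 or later); the data is the sample table from the question and the inner alias m is chosen for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (objid INTEGER, firstname TEXT, lastname TEXT, active INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "test", "test", 0),
        (2, "test", "test", 1),
        (3, "test1", "test1", 1),
        (4, "test2", "test2", 0),
        (5, "test2", "test2", 0),
        (6, "test3", "test3", 1),
    ],
)

# cnt holds, on every row, the size of that row's (firstname, lastname) group.
windowed = conn.execute(
    """
    SELECT t.objid, t.firstname, t.lastname, t.active
    FROM (
        SELECT m.*, COUNT(*) OVER (PARTITION BY firstname, lastname) AS cnt
        FROM my_table m
    ) t
    WHERE t.cnt > 1
    ORDER BY t.objid
    """
).fetchall()

print([row[0] for row in windowed])  # [1, 2, 4, 5]
```

Unlike the GROUP BY variants, the table is referenced only once, which is the main appeal of this form.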

This answer may not be great one, but I think it is simple to understand.
SELECT * FROM table1 WHERE (firstname, lastname) IN ( SELECT firstname, lastname FROM table1 GROUP BY firstname, lastname having count(*) > 1);

This query returns the duplicates, leaving out the row with the lowest objid in each group:
SELECT * FROM (
SELECT a.*
FROM table a
WHERE (`firstname`,`lastname`) IN (
SELECT `firstname`,`lastname` FROM table
GROUP BY `firstname`,`lastname` HAVING COUNT(*)>1
)
)z WHERE z.`objid` NOT IN (
SELECT MIN(`objid`) FROM table
GROUP BY `firstname`,`lastname` HAVING COUNT(*)>1
)

Please try
WITH cteTemp AS (
SELECT EmployeeID, JoinDT,
row_number() OVER(PARTITION BY EmployeeID, JoinDT ORDER BY EmployeeID) AS [RowFound]
FROM dbo.Employee
)
SELECT * FROM cteTemp WHERE [RowFound] > 1 ORDER BY JoinDT

Find duplicates in SQL

I have a large table with the following data on users.
social security number
name
address
I want to find all possible duplicates in the table
where the ssn is equal but the name is not
My attempt is:
SELECT * FROM Table t1
WHERE (SELECT count(*) from Table t2 where t1.name <> t2.name) > 1

A grouping on SSN should do it
SELECT
ssn
FROM
Table t1
GROUP BY
ssn
HAVING COUNT(*) > 1
...or, if you have many rows per ssn and only want those with more than one distinct name:
...
HAVING COUNT(DISTINCT name) > 1
Edit, oops, misunderstood
SELECT
ssn
FROM
Table t1
GROUP BY
ssn
HAVING MIN(name) <> MAX(name)
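To see why the plain COUNT(*) version was revised, here is a small sketch using Python's sqlite3 module. The table name users and all rows are hypothetical, invented only to contrast the two HAVING clauses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (ssn TEXT, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("111", "Ann", "A Street"),  # same ssn, same name: not a conflict
        ("111", "Ann", "B Street"),
        ("222", "Bob", "C Street"),  # same ssn, different names: a conflict
        ("222", "Rob", "C Street"),
        ("333", "Cy", "D Street"),
    ],
)

# First attempt: any ssn that appears on more than one row.
repeated_ssn = conn.execute(
    "SELECT ssn FROM users GROUP BY ssn HAVING COUNT(*) > 1 ORDER BY ssn"
).fetchall()

# Revised version: only ssn values whose rows disagree on the name.
conflicting_ssn = conn.execute(
    "SELECT ssn FROM users GROUP BY ssn HAVING MIN(name) <> MAX(name) ORDER BY ssn"
).fetchall()

print(repeated_ssn)     # [('111',), ('222',)]
print(conflicting_ssn)  # [('222',)]
```

On this data HAVING COUNT(DISTINCT name) > 1 would give the same answer as the MIN/MAX form; both ignore rows where name is NULL.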

This will handle more than two records with duplicate ssn's:
select count(*), name from table t1, (
select count(*) ssn_count, ssn
from table
group by ssn
having count(*) > 1
) t2
where t1.ssn = t2.ssn
group by t1.name
having count(*) <> t2.ssn_count

SQL DISTINCT Value Question

How can I filter my results in a Query? example
I have 5 Records
John,Smith,apple
Jane,Doe,apple
Fred,James,apple
Bill,evans,orange
Willma,Jones,grape
Now I want a query that brings back 3 records, one per DISTINCT fruit, BUT... and here is the tricky part: I still want the columns for first name and last name. PS: I do not care which of the 3 apple rows it returns, but it must return only 3 rows (or however many DISTINCT fruits there are).
ex return would be
John,Smith,apple
Bill,evans,orange
Willma,Jones,grape
Thanks in advance I've been banging my head on this all day.

Oddly enough, the best solution doesn't involve GROUP BY.
WITH DistinctFruit AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY Fruit ORDER BY LastName) AS FruitNo,
LastName,
FirstName,
Fruit
FROM table)
SELECT FirstName, LastName, Fruit
FROM DistinctFruit
WHERE FruitNo = 1;
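Here is a rough sketch of the ROW_NUMBER approach using Python's sqlite3 module and the five records from the question. The table name people_fruit is a stand-in (table is a reserved word), the final ORDER BY is added for a stable output, and an SQLite version with window-function support (3.25 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people_fruit (FirstName TEXT, LastName TEXT, Fruit TEXT)")
conn.executemany(
    "INSERT INTO people_fruit VALUES (?, ?, ?)",
    [
        ("John", "Smith", "apple"),
        ("Jane", "Doe", "apple"),
        ("Fred", "James", "apple"),
        ("Bill", "evans", "orange"),
        ("Willma", "Jones", "grape"),
    ],
)

# Number the rows within each fruit by last name, then keep the first of each.
one_per_fruit = conn.execute(
    """
    WITH DistinctFruit AS (
        SELECT ROW_NUMBER() OVER (PARTITION BY Fruit ORDER BY LastName) AS FruitNo,
               LastName, FirstName, Fruit
        FROM people_fruit
    )
    SELECT FirstName, LastName, Fruit
    FROM DistinctFruit
    WHERE FruitNo = 1
    ORDER BY Fruit
    """
).fetchall()

print(one_per_fruit)
# [('Jane', 'Doe', 'apple'), ('Willma', 'Jones', 'grape'), ('Bill', 'evans', 'orange')]
```

The apple row returned is Jane Doe rather than John Smith because the partition is ordered by LastName; since the question does not care which apple row comes back, any ORDER BY inside the OVER clause would do.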

If you have a small amount of data (not tens of thousands of rows), you can do sub-queries.
select distinct t1.fruit as Fruit,
(select top 1 t2.lastname
from t1 as t2
where t1.fruit = t2.fruit
order by t2.lastname) as LastName,
(select top 1 t2.firstname
from t1 as t2
where t1.fruit = t2.fruit
order by t2.lastname, t2.firstname) as FirstName
from t1
Note the FirstName column is sorted the same as the LastName column. This will give you a matching last name with the correct first name.
Here is my test data:
create table t1
(firstname varchar(20),
lastname varchar(20),
fruit varchar(20))
insert into t1
values ('John','Smith','apple')
insert into t1
values ('Jane','Doe','apple')
insert into t1
values ('Fred','James','apple')
insert into t1
values ('Bill','evans','orange')
insert into t1
values ('Willma','Jones','grape')

Just another solution
select distinct x.*,fruit from t1
cross apply
(select top 1 firstname, lastname from t1 t2 where t1.fruit=t2.fruit) x

SELECT DISTINCT x.*,fruit FROM peopleFruit pf
CROSS APPLY
(SELECT TOP 1 firstname, lastname FROM peopleFruit pf1 WHERE pf.fruit=pf1.fruit) x
