List All Rows Containing Duplicates SQL - sql

I want to List All Rows Containing Duplicates by a name in the table for instance, name = TestName or TestName1 or Test name 2..
but also want to know what else is attached to them..
I have done this query.
select firstname,lastname from table
group by firstname,lastname
having count(*) > 1
I have found duplicated records but you're interested in getting all the information attached to them, however this is all on one table..
I have tried to do this..
select * from my_table a join ( select firstname, lastname from my_table group by firstname, lastname having count(*) > 1 ) b on a.firstname = b.firstname and a.lastname = b.lastname
but this is if the data was on 2 separate tables?

You can use window functions:
select t.*
from (select t.*, count(*) over (partition by firstname, lastname) as cnt
from table t
) t
where cnt > 1

Related

SQL Select column which is not used in select section of subquery which find duplicates

I am trying to find in my database records which has duplicated fields like name, surname and type.
Example:
SELECT name, surname, type, COUNT(*)
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1
Query results:
Robb|Stark|1|2
Tyrion|Lannister|1|3
So we have duplicated customer with name and surname "Robb Stark" 2 times and "Tyrion Lannister" 3 times
Now, I want to know the id of these records.
I found similar problem described here:
Finding duplicate values in a SQL table
there is answer but no example.
Use COUNT as an analytic function:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, surname) cnt
FROM customers
)
SELECT * -- return all columns
FROM cte
WHERE cnt > 1
ORDER BY name, surname;
The simplest way will be to use the EXISTS as follows:
SELECT t.*
FROM customers t
where exists
(select 1 from customers tt
where tt.name = t.name
and tt.surname = t.surname
and tt.id <> t.id)
Or use your original query in IN clause as follows:
select * from customers where (name, surname) in
(SELECT name, surname
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1)
If you want one row per group of duplicate, with the list of id in a comma separated string, you can just use string aggration with your existing query:
SELECT name, surname, COUNT(*) as cnt,
STRING_AGG(id, ',') WITHIN GROUP (ORDER BY id) as all_ids
FROM customers
GROUP BY name, surname
HAVING COUNT(*) > 1

show extra columns where group has count of 1

Suppose I have a table Students with just 2 columns LastName and FirstName. I know I can get all the LastNames that only have a 1 FirstName with:
select LastName from Students group by LastName having count(*) = 1
But what if I also want to show the FirstName for those rows?
You could filter with a correlated subquery:
select s.*
from students s
where (select count(*) from students s1 where s1.LastName = s.LastName) = 1
Or, if you have a primary column, you can use not exists:
select s.*
from student s
where not exists (
select 1 from students s1 where s1.LastName = s.LastName and s1.id <> s.id
)
This query would take advantage of an index on (id, LastName).
Finally, another common option is to do a window count:
select *
from (select s.*, count(*) over(partition by LastName) cnt from students s) t
where cnt = 1
You could also do
Select * from Students where lastname in (Select lastname from Students group by lastname having count(*) =1)
Just add it to the select:
select s.LastName, min(s.firstname)
from Students s
group by s.LastName
having count(*) = 1;
If only one row matches, then min() returns the value on that row.
You need to get it using any aggregate function that can return the desired value. Min, Max, first_value(), ir string_agg() you may want all names with the lastname in other use case
select LastName,
first_value(firstname) over (order by firstname) firstname,
Min(firstname)
from Students
group by LastName having count(*) = 1
I found another one myself:
with LastNames as (
select LastName from Students group by LastName having count(*) = 1
)
select LastName, FirstName from Students
where LastName in (select LastName from LastNames)

how to find people with the same first and last name

so i had a table with 3 columns:
id \ first_name \ last_name
and i need to find how many of people share the same full name.
i had something like this:
SELECT COUNT(*)
FROM ACTOR
WHERE FIRST_NAME IN (SELECT FIRST_NAME,LAST_NAME
FROM ACTOR
HAVING COUNT(FIRST_NAME,LAST_NAME) >1);
Use GROUP BY
SELECT FIRST_NAME, LAST_NAME, Count(*) AS CNT
FROM ACTOR
GROUP BY FIRST_NAME, LAST_NAME
HAVING COUNT(*) > 1
This returns the first- and lastname and how often they appear for all which have duplicates. If you only want to know how many that are you can use:
In SQL-Server:
SELECT TOP 1 COUNT(*) OVER () AS RecordCount -- TOP 1 because the total-count is repeated for every row
FROM ACTOR
GROUP BY FIRST_NAME, LAST_NAME
HAVING COUNT(*) > 1
all others:
Select COUNT(*) AS RecordCount
From
(
SELECT FIRST_NAME, LAST_NAME
FROM ACTOR
GROUP BY FIRST_NAME, LAST_NAME
HAVING COUNT(*) > 1
) As X
Use concatenate and group by
select id,FIRST_NAME,LAST_NAME,count(*)
from
(
select id,FIRST_NAME,LAST_NAME,FIRST_NAME||LAST_NAME as full_name
from
actor)x
group by id,FIRST_NAME,LAST_NAME
having count(*) > 1;
Try this:
SELECT COUNT(*) as Totals, NAME
FROM
(SELECT FIRST_NAME+LAST_NAME AS NAME
FROM ACTOR)A
GROUP BY NAME
There are several possibilities for fixing your approach. I think the best to learn is EXISTS:
SELECT COUNT(*)
FROM ACTOR a
WHERE EXISTS (SELECT 1
FROM ACTOR a2
WHERE a2.FIRST_NAME = a.FIRST_NAME AND a2.LAST_NAME = a.LAST_NAME AND
a2.id <> a.id
);

SELECT using GROUP BY and HAVING not returning records

I'm trying to select all records that have a duplicate value in the LASTNAME column. This is my code so far
If EXISTS( SELECT name FROM sysobjects WHERE name = 'USER_DUPLICATES' AND type = 'U' )
DROP TABLE USER_DUPLICATES
GO
CREATE TABLE USER_DUPLICATES
(
FIRSTNAME VARCHAR(MAX),
LASTNAME VARCHAR(MAX),
PHONE VARCHAR(MAX),
EMAIL VARCHAR(MAX),
TITLE VARCHAR(MAX),
LMU VARCHAR(MAX)
)
GO
INSERT INTO USER_DUPLICATES
(
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
)
SELECT
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
FROM TM_USER
GROUP BY
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
HAVING COUNT(LASTNAME) > 1
It does not return any records. I changed the
HAVING COUNT(LASTNAME) > 1
to
HAVING COUNT(LASTNAME) > 0
and it returns all the records. I am also certain there are records with the same LASTNAME value. It is written using T-SQL on SQL Server
Try this:
SELECT
a.FIRSTNAME,
a.LASTNAME,
a.PHONE,
a.EMAIL,
a.TITLE,
a.LMU
FROM TM_USER a
INNER JOIN
(
SELECT LASTNAME
FROM TM_USER
GROUP BY LASTNAME
HAVING COUNT(1) > 1
) b ON a.LASTNAME = b.LASTNAME
Your Group By clause will Group By all the comuns in the list. Those columns probably define a discreet record of count = 1
You will need to do something like:
Select LAST_NAME from TM_USER GROUP BY LAST_NAME HAVING COUNT(LAST_NAME) > 1
COUNT function is computed over all grouping expression, not over LASTNAME.
To get unique last names use
SELECT LASTNAME FROM TM_USER GROUP BY LASTNAME HAVING COUNT(LASTNAME) > 1
If you group by few columns, you will get count of their unique combination even if computing COUNT over single column value.

MySQL, return only rows where there are duplicates among two columns

I have a table in MySQL of contact information ;
first name, last name, address, etc.
I would like to run a query on this table that will return only rows with first and last name combinations which appear in the table more than once.
I do not want to group the "duplicates" (which may only be duplicates of the first and last name, but not other information like address or birthdate) -
I want to return all the "duplicate" rows so I can look over the results and determine if they are dupes or not. This seemed like it would be a simple thing to do, but it has not been.
Every solution I can find either groups the dupes and gives me a count only (which is not useful for what I need to do with the results) or doesn't work at all.
Is this kind of logic even possible in a query ? Should I try and do this in Python or something?
You should be able doing this with the GROUP BY approach in a sub-query.
SELECT t.first_name, t.last_name, t.address
FROM your_table t
JOIN ( SELECT first_name, last_name
FROM your_table
GROUP BY first_name, last_name
HAVING COUNT(*) > 1
) t2
ON ( t.first_name = t2.first_name, t.last_name = t2.last_name )
The sub-query returns all names (first_name and last_name) that exist more than once, and the JOIN returns all records that match these names.
You could do it with a GROUP BY / HAVING and A SUB SELECT. Something like
SELECT t.*
FROM Table t INNER JOIN
(
SELECT FirstName, LastName
FROM Table
GROUP BY FirstName, LastName
HAVING COUNT(*) > 1
) Dups ON t.FirstName = Dups.FirstName
AND t.LastName = Dups.LastName
select * from people
join (select firstName, lastName
from people
group by firstName, lastName
having count(*) > 1
) dupe
using (firstName, lastName)