I want to List All Rows Containing Duplicates by a name in the table for instance, name = TestName or TestName1 or Test name 2..
but also want to know what else is attached to them..
I have done this query.
select firstname,lastname from table
group by firstname,lastname
having count(*) > 1
I have found duplicated records but you're interested in getting all the information attached to them, however this is all on one table..
I have tried to do this..
select * from my_table a join ( select firstname, lastname from my_table group by firstname, lastname having count(*) > 1 ) b on a.firstname = b.firstname and a.lastname = b.lastname
but this is if the data was on 2 separate tables?
You can use window functions:
select t.*
from (select t.*, count(*) over (partition by firstname, lastname) as cnt
from table t
) t
where cnt > 1
Related
I am trying to find in my database records which has duplicated fields like name, surname and type.
Example:
SELECT name, surname, type, COUNT(*)
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1
Query results:
Robb|Stark|1|2
Tyrion|Lannister|1|3
So we have duplicated customer with name and surname "Robb Stark" 2 times and "Tyrion Lannister" 3 times
Now, I want to know the id of these records.
I found similar problem described here:
Finding duplicate values in a SQL table
there is answer but no example.
Use COUNT as an analytic function:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, surname) cnt
FROM customers
)
SELECT * -- return all columns
FROM cte
WHERE cnt > 1
ORDER BY name, surname;
The simplest way will be to use the EXISTS as follows:
SELECT t.*
FROM customers t
where exists
(select 1 from customers tt
where tt.name = t.name
and tt.surname = t.surname
and tt.id <> t.id)
Or use your original query in IN clause as follows:
select * from customers where (name, surname) in
(SELECT name, surname
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1)
If you want one row per group of duplicate, with the list of id in a comma separated string, you can just use string aggration with your existing query:
SELECT name, surname, COUNT(*) as cnt,
STRING_AGG(id, ',') WITHIN GROUP (ORDER BY id) as all_ids
FROM customers
GROUP BY name, surname
HAVING COUNT(*) > 1
Suppose I have a table Students with just 2 columns LastName and FirstName. I know I can get all the LastNames that only have a 1 FirstName with:
select LastName from Students group by LastName having count(*) = 1
But what if I also want to show the FirstName for those rows?
You could filter with a correlated subquery:
select s.*
from students s
where (select count(*) from students s1 where s1.LastName = s.LastName) = 1
Or, if you have a primary column, you can use not exists:
select s.*
from student s
where not exists (
select 1 from students s1 where s1.LastName = s.LastName and s1.id <> s.id
)
This query would take advantage of an index on (id, LastName).
Finally, another common option is to do a window count:
select *
from (select s.*, count(*) over(partition by LastName) cnt from students s) t
where cnt = 1
You could also do
Select * from Students where lastname in (Select lastname from Students group by lastname having count(*) =1)
Just add it to the select:
select s.LastName, min(s.firstname)
from Students s
group by s.LastName
having count(*) = 1;
If only one row matches, then min() returns the value on that row.
You need to get it using any aggregate function that can return the desired value. Min, Max, first_value(), ir string_agg() you may want all names with the lastname in other use case
select LastName,
first_value(firstname) over (order by firstname) firstname,
Min(firstname)
from Students
group by LastName having count(*) = 1
I found another one myself:
with LastNames as (
select LastName from Students group by LastName having count(*) = 1
)
select LastName, FirstName from Students
where LastName in (select LastName from LastNames)
so i had a table with 3 columns:
id \ first_name \ last_name
and i need to find how many of people share the same full name.
i had something like this:
SELECT COUNT(*)
FROM ACTOR
WHERE FIRST_NAME IN (SELECT FIRST_NAME,LAST_NAME
FROM ACTOR
HAVING COUNT(FIRST_NAME,LAST_NAME) >1);
Use GROUP BY
SELECT FIRST_NAME, LAST_NAME, Count(*) AS CNT
FROM ACTOR
GROUP BY FIRST_NAME, LAST_NAME
HAVING COUNT(*) > 1
This returns the first- and lastname and how often they appear for all which have duplicates. If you only want to know how many that are you can use:
In SQL-Server:
SELECT TOP 1 COUNT(*) OVER () AS RecordCount -- TOP 1 because the total-count is repeated for every row
FROM ACTOR
GROUP BY FIRST_NAME, LAST_NAME
HAVING COUNT(*) > 1
all others:
Select COUNT(*) AS RecordCount
From
(
SELECT FIRST_NAME, LAST_NAME
FROM ACTOR
GROUP BY FIRST_NAME, LAST_NAME
HAVING COUNT(*) > 1
) As X
Use concatenate and group by
select id,FIRST_NAME,LAST_NAME,count(*)
from
(
select id,FIRST_NAME,LAST_NAME,FIRST_NAME||LAST_NAME as full_name
from
actor)x
group by id,FIRST_NAME,LAST_NAME
having count(*) > 1;
Try this:
SELECT COUNT(*) as Totals, NAME
FROM
(SELECT FIRST_NAME+LAST_NAME AS NAME
FROM ACTOR)A
GROUP BY NAME
There are several possibilities for fixing your approach. I think the best to learn is EXISTS:
SELECT COUNT(*)
FROM ACTOR a
WHERE EXISTS (SELECT 1
FROM ACTOR a2
WHERE a2.FIRST_NAME = a.FIRST_NAME AND a2.LAST_NAME = a.LAST_NAME AND
a2.id <> a.id
);
I'm trying to select all records that have a duplicate value in the LASTNAME column. This is my code so far
If EXISTS( SELECT name FROM sysobjects WHERE name = 'USER_DUPLICATES' AND type = 'U' )
DROP TABLE USER_DUPLICATES
GO
CREATE TABLE USER_DUPLICATES
(
FIRSTNAME VARCHAR(MAX),
LASTNAME VARCHAR(MAX),
PHONE VARCHAR(MAX),
EMAIL VARCHAR(MAX),
TITLE VARCHAR(MAX),
LMU VARCHAR(MAX)
)
GO
INSERT INTO USER_DUPLICATES
(
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
)
SELECT
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
FROM TM_USER
GROUP BY
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
HAVING COUNT(LASTNAME) > 1
It does not return any records. I changed the
HAVING COUNT(LASTNAME) > 1
to
HAVING COUNT(LASTNAME) > 0
and it returns all the records. I am also certain there are records with the same LASTNAME value. It is written using T-SQL on SQL Server
Try this:
SELECT
a.FIRSTNAME,
a.LASTNAME,
a.PHONE,
a.EMAIL,
a.TITLE,
a.LMU
FROM TM_USER a
INNER JOIN
(
SELECT LASTNAME
FROM TM_USER
GROUP BY LASTNAME
HAVING COUNT(1) > 1
) b ON a.LASTNAME = b.LASTNAME
Your Group By clause will Group By all the comuns in the list. Those columns probably define a discreet record of count = 1
You will need to do something like:
Select LAST_NAME from TM_USER GROUP BY LAST_NAME HAVING COUNT(LAST_NAME) > 1
COUNT function is computed over all grouping expression, not over LASTNAME.
To get unique last names use
SELECT LASTNAME FROM TM_USER GROUP BY LASTNAME HAVING COUNT(LASTNAME) > 1
If you group by few columns, you will get count of their unique combination even if computing COUNT over single column value.
I have a table in MySQL of contact information ;
first name, last name, address, etc.
I would like to run a query on this table that will return only rows with first and last name combinations which appear in the table more than once.
I do not want to group the "duplicates" (which may only be duplicates of the first and last name, but not other information like address or birthdate) -
I want to return all the "duplicate" rows so I can look over the results and determine if they are dupes or not. This seemed like it would be a simple thing to do, but it has not been.
Every solution I can find either groups the dupes and gives me a count only (which is not useful for what I need to do with the results) or doesn't work at all.
Is this kind of logic even possible in a query ? Should I try and do this in Python or something?
You should be able doing this with the GROUP BY approach in a sub-query.
SELECT t.first_name, t.last_name, t.address
FROM your_table t
JOIN ( SELECT first_name, last_name
FROM your_table
GROUP BY first_name, last_name
HAVING COUNT(*) > 1
) t2
ON ( t.first_name = t2.first_name, t.last_name = t2.last_name )
The sub-query returns all names (first_name and last_name) that exist more than once, and the JOIN returns all records that match these names.
You could do it with a GROUP BY / HAVING and A SUB SELECT. Something like
SELECT t.*
FROM Table t INNER JOIN
(
SELECT FirstName, LastName
FROM Table
GROUP BY FirstName, LastName
HAVING COUNT(*) > 1
) Dups ON t.FirstName = Dups.FirstName
AND t.LastName = Dups.LastName
select * from people
join (select firstName, lastName
from people
group by firstName, lastName
having count(*) > 1
) dupe
using (firstName, lastName)