Remove only duplicates from specific Postgresql table

I have a table that contains duplicates, and I would like to keep only one row for each set of duplicates.
I can select one row per (email, first_name, last_name) combination with this SQL command:
SELECT DISTINCT ON (email, first_name, last_name) * from customer;
But I would like to turn the previous command into a DELETE.
This command should work, right?
DELETE FROM customer WHERE customer.id NOT IN
(SELECT id FROM
(SELECT DISTINCT ON (email, first_name, last_name) * from customer));
Is that correct?

I assume you have an id column.
delete from customer
where id not in (
select min(id)
from customer
group by email, first_name, last_name
)
The subquery finds the ids of the rows you want to keep (the lowest id in each group of duplicates).
The DELETE then removes all the other rows.
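To see the effect of this DELETE on a small data set, here is a minimal sketch that runs it from Python against an in-memory SQLite database. SQLite stands in for PostgreSQL purely for illustration, and the sample rows are made up; the customer table and its columns are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, email TEXT, first_name TEXT, last_name TEXT)"
)
# Hypothetical sample data: ids 1, 2 and 4 describe the same customer.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "ann@example.com", "Ann", "Lee"),
        (2, "ann@example.com", "Ann", "Lee"),
        (3, "bob@example.com", "Bob", "Ray"),
        (4, "ann@example.com", "Ann", "Lee"),
    ],
)

# Keep the lowest id of every (email, first_name, last_name) group
# and delete every other row.
conn.execute(
    """
    DELETE FROM customer
    WHERE id NOT IN (
        SELECT MIN(id)
        FROM customer
        GROUP BY email, first_name, last_name
    )
    """
)

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM customer ORDER BY id")]
print(remaining_ids)
```

On a real table it is probably worth running the subquery on its own first, or wrapping the DELETE in a transaction, so the rows that will be kept can be checked before anything is removed.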

I can't tell which id you would get from (SELECT DISTINCT ON (email, first_name, last_name) * from customer).
Without an ORDER BY, DISTINCT ON returns only the first row of each group of duplicates, and which row that is is unpredictable.

Related

How to delete all duplicate records from sql

I have a table in SQL and I want to remove duplicate records, but I want to remove every record that has a duplicate rather than keep one copy.
first_name last_name
a b
a b
a c
After running the query, the table should contain:
first_name last_name
a c

You can use group by:
select first_name, last_name
from t
group by first_name, last_name
having count(*) = 1;
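As a rough illustration, the sketch below runs that GROUP BY query on the question's three rows using Python's sqlite3 module (SQLite is only a stand-in here). The DELETE that follows is an assumed extension, not part of the answer above: it removes every row whose name pair occurs more than once, and it relies on row-value comparison, which SQLite and PostgreSQL accept but some other databases do not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", "b"), ("a", "b"), ("a", "c")],
)

# Rows whose (first_name, last_name) pair occurs exactly once.
unique_rows = conn.execute(
    """
    SELECT first_name, last_name
    FROM t
    GROUP BY first_name, last_name
    HAVING COUNT(*) = 1
    """
).fetchall()

# Assumed extension: physically delete every row that has a duplicate.
conn.execute(
    """
    DELETE FROM t
    WHERE (first_name, last_name) IN (
        SELECT first_name, last_name
        FROM t
        GROUP BY first_name, last_name
        HAVING COUNT(*) > 1
    )
    """
)
rows_after_delete = conn.execute("SELECT first_name, last_name FROM t").fetchall()
print(unique_rows, rows_after_delete)
```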

SQL: select a column that is not used in the select section of a subquery that finds duplicates

I am trying to find records in my database that have duplicated fields such as name, surname and type.
Example:
SELECT name, surname, type, COUNT(*)
FROM customers
GROUP BY name, surname, type
HAVING COUNT(*)>1
Query results:
Robb|Stark|1|2
Tyrion|Lannister|1|3
So the customer with name and surname "Robb Stark" appears 2 times and "Tyrion Lannister" appears 3 times.
Now I want to know the ids of these records.
I found a similar problem described in "Finding duplicate values in a SQL table".
There is an answer there, but no example.

Use COUNT as an analytic function:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, surname) cnt
FROM customers
)
SELECT * -- return all columns
FROM cte
WHERE cnt > 1
ORDER BY name, surname;
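A small runnable sketch of the same idea, using Python's sqlite3 module as a stand-in database (window functions need SQLite 3.25 or later). The customers rows are invented to match the counts in the question, and id is added to the ORDER BY only to make the output order stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, surname TEXT, type INTEGER)"
)
# Hypothetical rows: Robb Stark twice, Tyrion Lannister three times, Jon Snow once.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Robb", "Stark", 1),
        (2, "Tyrion", "Lannister", 1),
        (3, "Robb", "Stark", 1),
        (4, "Jon", "Snow", 1),
        (5, "Tyrion", "Lannister", 1),
        (6, "Tyrion", "Lannister", 1),
    ],
)

# COUNT(*) OVER (...) attaches the group size to every row without collapsing
# the group, so the id of each duplicate is still available.
dup_rows = conn.execute(
    """
    WITH cte AS (
        SELECT id, name, surname,
               COUNT(*) OVER (PARTITION BY name, surname) AS cnt
        FROM customers
    )
    SELECT id, name, surname, cnt
    FROM cte
    WHERE cnt > 1
    ORDER BY name, surname, id
    """
).fetchall()
print(dup_rows)
```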

The simplest way is to use EXISTS as follows:
SELECT t.*
FROM customers t
where exists
(select 1 from customers tt
where tt.name = t.name
and tt.surname = t.surname
and tt.id <> t.id)
Or use your original query in an IN clause as follows:
select * from customers where (name, surname) in
(SELECT name, surname
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1)

If you want one row per group of duplicates, with the list of ids in a comma-separated string, you can use string aggregation with your existing query:
SELECT name, surname, COUNT(*) as cnt,
STRING_AGG(id, ',') WITHIN GROUP (ORDER BY id) as all_ids
FROM customers
GROUP BY name, surname
HAVING COUNT(*) > 1

Retrieving data with single occurrence of repeated data

SELECT *
FROM employee
GROUP BY first_name
HAVING count(first_name) >= 1;
How can I retrieve all rows and columns with a single occurrence of each duplicate? I want to retrieve all the table contents, with repeated data appearing only once. In the table, first_name and last_name are repeated twice, but the other columns differ.
Please help.

Try this SQL query, which returns the rows whose FIRST_NAME occurs only once:
SELECT * FROM EMPLOYEE WHERE FIRST_NAME NOT IN
(
SELECT FIRST_NAME FROM
(
SELECT ROW_NUMBER() OVER(PARTITION BY FIRST_NAME ORDER BY FIRST_NAME) RNK,FIRST_NAME FROM EMPLOYEE
)A WHERE A.RNK=2
)

How to remove multiple records for the same zipcode, keeping at least one record for that zipcode in a database table

How can I remove multiple records for the same zipcode while keeping at least one record for that zipcode in a database table?
id zipcode
1 38000
2 38000
3 38000
4 38005
5 38005
I want a table with two columns, id and zipcode.
My final result should be the following:
id zipcode
1 38000
4 38005

How about
delete from myTable
where id not in (
select Min( id )
from myTable
group by zipcode )
That lets you keep your lowest IDs, which is what you seemed to want.

To just select that result set, take the lowest id for each zipcode with GROUP BY (filtering on SELECT DISTINCT zipcode alone would return every row):
SELECT MIN(id) AS id, zipcode
FROM table
GROUP BY zipcode
To delete the other records and keep only one per zipcode, use a subquery like so:
DELETE FROM table
WHERE id NOT IN
(SELECT MIN(id)
FROM table
GROUP BY zipcode
)
You can also accomplish this using a join if you prefer.

with cte as (
select row_number() over (partition by zipcode order by id) as rn
from table)
delete from cte
where rn > 1;
This has the advantage of correctly handling duplicates and offers tight control over what gets deleted and what gets kept.
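Deleting directly from a CTE is not accepted by every database, so the sketch below shows an assumed variant of the same ROW_NUMBER() idea that filters on id instead. It runs on Python's sqlite3 module (SQLite 3.25 or later for window functions) with the rows from the question; the table name zip_rows is hypothetical. ORDER BY id keeps the lowest id per zipcode, as in the question's expected result, while ORDER BY id DESC would keep the highest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zip_rows (id INTEGER PRIMARY KEY, zipcode INTEGER)")
conn.executemany(
    "INSERT INTO zip_rows VALUES (?, ?)",
    [(1, 38000), (2, 38000), (3, 38000), (4, 38005), (5, 38005)],
)

# Number the rows inside each zipcode; rn = 1 is the row to keep,
# every row with rn > 1 is a surplus duplicate.
conn.execute(
    """
    DELETE FROM zip_rows
    WHERE id IN (
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY zipcode ORDER BY id) AS rn
            FROM zip_rows
        ) AS numbered
        WHERE rn > 1
    )
    """
)

kept_rows = conn.execute("SELECT id, zipcode FROM zip_rows ORDER BY id").fetchall()
print(kept_rows)
```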

Create a temporary table with the desired result:
select min(id) as id, zipcode
into tmp_sometable
from sometable
group by zipcode
Remove the original table:
drop table sometable
Rename temporary table:
sp_rename 'tmp_sometable', 'sometable';
Or something like:
delete from sometable
where id not in
(
select min(id)
from sometable
group by zipcode
)

delete from table where id not in (select min(id) from table group by zipcode);
select min(id) from table group by zipcode - gives the lowest id for each distinct zipcode in the table
delete from table where id not in (...) - deletes all the records in the table whose id is not in the result of that query

There's an easier way if you want the lowest ID number. I just tested this:
SELECT
min(ID),
zipcode
FROM #zip
GROUP BY zipcode

MySQL, return only rows where there are duplicates among two columns

I have a MySQL table of contact information: first name, last name, address, etc.
I would like to run a query on this table that will return only rows with first and last name combinations which appear in the table more than once.
I do not want to group the "duplicates" (which may only be duplicates of the first and last name, but not other information like address or birthdate) -
I want to return all the "duplicate" rows so I can look over the results and determine if they are dupes or not. This seemed like it would be a simple thing to do, but it has not been.
Every solution I can find either groups the dupes and gives me a count only (which is not useful for what I need to do with the results) or doesn't work at all.
Is this kind of logic even possible in a query? Should I try to do this in Python or something?

You should be able to do this with the GROUP BY approach in a subquery.
SELECT t.first_name, t.last_name, t.address
FROM your_table t
JOIN ( SELECT first_name, last_name
FROM your_table
GROUP BY first_name, last_name
HAVING COUNT(*) > 1
) t2
ON ( t.first_name = t2.first_name AND t.last_name = t2.last_name )
The sub-query returns all names (first_name and last_name) that exist more than once, and the JOIN returns all records that match these names.

You could do it with a GROUP BY / HAVING and a sub-select. Something like:
SELECT t.*
FROM Table t INNER JOIN
(
SELECT FirstName, LastName
FROM Table
GROUP BY FirstName, LastName
HAVING COUNT(*) > 1
) Dups ON t.FirstName = Dups.FirstName
AND t.LastName = Dups.LastName

select * from people
join (select firstName, lastName
from people
group by firstName, lastName
having count(*) > 1
) dupe
using (firstName, lastName)
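The join-back pattern can be tried out with the sketch below, which uses Python's sqlite3 module in place of MySQL. The id and address columns and the sample rows are assumptions made for the demonstration; only the people table and the firstName and lastName columns come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT, address TEXT)"
)
# Hypothetical contacts: two rows share a name but differ in address.
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Lee", "1 Oak St"),
        (2, "Ann", "Lee", "9 Elm St"),
        (3, "Bob", "Ray", "2 Pine St"),
    ],
)

# The derived table lists name pairs that occur more than once; joining it
# back to people returns every full row carrying one of those pairs.
possible_dupes = conn.execute(
    """
    SELECT people.id, people.firstName, people.lastName, people.address
    FROM people
    JOIN (
        SELECT firstName, lastName
        FROM people
        GROUP BY firstName, lastName
        HAVING COUNT(*) > 1
    ) AS dupe USING (firstName, lastName)
    ORDER BY people.lastName, people.firstName, people.id
    """
).fetchall()
print(possible_dupes)
```

Sorting by name keeps the candidate duplicates next to each other, which makes the manual review described in the question easier.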
