Delete duplicate records on SQL Server

I have a table with duplicate records. I've already created a script to consolidate the duplicate records into the original ones, but I'm not able to delete the duplicates.
I'm trying it this way:
DELETE FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
WHERE COD_PLANO_PAGAMENTO IN (SELECT MAX(COD_PLANO_PAGAMENTO) COD_PLANO_PAGAMENTO
FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
GROUP BY COD_PLANO_PAGAMENTO)
The idea was to take the last record of each COD_PLANO_PAGAMENTO and delete it, but this way all the records are being deleted. What am I doing wrong?
The table is structured as follows:
I need to delete, for example, the second record of COD_MOVIMENTO = 405 with COD_PLANO_PAGAMENTO = 9; there should be only one record per COD_PLANO_PAGAMENTO within each COD_MOVIMENTO.

You can use an updatable CTE with row numbering to determine which rows to delete.
You may need to adjust the partitioning and ordering clauses; it's not entirely clear what you need.
WITH cte AS (
    SELECT *,
           rn = ROW_NUMBER() OVER (PARTITION BY COD_MOVIMENTO, COD_PLANO_PAGAMENTO ORDER BY (SELECT 1))
    FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO mp
)
DELETE FROM cte
WHERE rn > 1;
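If you want to check which rows would be removed before running the delete, you can run the same CTE as a plain SELECT first (a sketch using the same assumed columns):
WITH cte AS (
    SELECT *,
           rn = ROW_NUMBER() OVER (PARTITION BY COD_MOVIMENTO, COD_PLANO_PAGAMENTO ORDER BY (SELECT 1))
    FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
)
SELECT *          -- these are the rows the DELETE above would remove
FROM cte
WHERE rn > 1;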

Your delete statement takes MAX(), but even if there is only one record it will still return a value.
Also note that your GROUP BY should be on COD_MOVIMENTO.
As a fix, make sure there are at least two rows per value:
DELETE FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
WHERE COD_PLANO_PAGAMENTO IN
    (SELECT MAX(COD_PLANO_PAGAMENTO) COD_PLANO_PAGAMENTO
     FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
     WHERE COD_PLANO_PAGAMENTO IN
         (SELECT COD_PLANO_PAGAMENTO
          FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
          GROUP BY COD_PLANO_PAGAMENTO
          HAVING COUNT(*) > 1)
     GROUP BY COD_MOVIMENTO)
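To see which COD_PLANO_PAGAMENTO values the inner HAVING filter picks out, you can run that part on its own (same assumed table):
SELECT COD_PLANO_PAGAMENTO, COUNT(*) AS qty
FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
GROUP BY COD_PLANO_PAGAMENTO
HAVING COUNT(*) > 1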

In your comment you say you want to remove duplicate rows with the same COD_MOVIMENTO, COD_PLANO_PAGAMENTO and VAL_TOTAL_APURADO; try this:
delete f1 from
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY COD_MOVIMENTO, COD_PLANO_PAGAMENTO, VAL_TOTAL_APURADO ORDER BY COD_MOVIMENTO) rang
FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
) f1
where f1.rang>1
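To confirm afterwards that no duplicates remain, you can count the remaining groups (a sketch over the same assumed columns):
SELECT COD_MOVIMENTO, COD_PLANO_PAGAMENTO, VAL_TOTAL_APURADO, COUNT(*) AS qty
FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO
GROUP BY COD_MOVIMENTO, COD_PLANO_PAGAMENTO, VAL_TOTAL_APURADO
HAVING COUNT(*) > 1          -- should return no rows once the duplicates are gone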

Related

SQL: Deleting Duplicates using Not in and Group by

I have the following SQL syntax to delete duplicate rows, but no rows are ever affected.
DELETE FROM content_stacks WHERE id NOT IN (
SELECT id
FROM content_stacks
GROUP BY user_id, content_id
);
The subquery itself is returning the id list of first entries correctly.
SELECT id
FROM content_stacks
GROUP BY user_id, content_id
When I insert the result list as a literal string, it works, too:
DELETE FROM content_stacks WHERE id NOT IN (239,231,217,218,219,232,233,220,230,226,234,235,224,225,221,223,222,227,228,229,236,237,238,216,208,209,210,204,211,212,242,203,240,201,241,205,206,207,213,214,215);
I checked many similar examples and this should be working in my opinion. What am I missing?
First find the first row of each group using ROW_NUMBER, then delete the records with a row number greater than 1:
WITH CTE AS (
    SELECT id, ROW_NUMBER() OVER (PARTITION BY user_id, content_id ORDER BY id) rn
    FROM content_stacks
)
DELETE cs
FROM content_stacks cs
INNER JOIN CTE ON CTE.id = cs.id
WHERE rn > 1
Sorry to ask, but if you're deleting, why would you need to group the records?
Aren't you just increasing the runtime?
The code from Meyssam Toluie is not working as-is, but I made a similar solution based on the same row-number idea:
DELETE FROM content_stacks WHERE id IN
    (SELECT id FROM (
        SELECT id, ROW_NUMBER() OVER (PARTITION BY user_id, content_id ORDER BY id) row_num
        FROM content_stacks
    ) sub
    WHERE row_num > 1)
This is working for me now.
My first command did not work because the GROUP BY does not show all ids in the output, but they are still there, so in fact all ids were returned in the NOT IN list. ROW_NUMBER seems to be the easiest way to solve this problem.

How to find duplicates from a unique code column and delete the rows they're attached to, while still keeping the original row?

I have a table in my Azure SQL Server database named dbo.SQL_Transactional, and there are columns with headers code, saledate, and branchcode.
code is my primary key, so if there are ever 2 or more rows with the same code, they are duplicates and need to be deleted. How can I do so?
I don't need to worry about whether saledate or branchcode are duplicated; if code is duplicated, that's all I need in order to delete the entire duplicate row.
If you just want to delete the duplicate rows, then try an updateable CTE:
with todelete as (
    select t.*, row_number() over (partition by code order by code) as seqnum
    from dbo.SQL_Transactional t
)
delete from todelete
where seqnum > 1;
If you just wanted to select one row, then you would use where seqnum = 1.
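For example, to see one row per code without deleting anything (a sketch against the same assumed table):
with todelete as (
    select t.*, row_number() over (partition by code order by code) as seqnum
    from dbo.SQL_Transactional t
)
select *
from todelete
where seqnum = 1;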

Delete specific record from multiple duplicates in the table

How do I delete a specific record from multiple duplicates?
Below is the table, for example:
This is just one example, and we have many cases like this. From this table I need to delete rank 2 and 3.
Kindly suggest the best way to identify duplicate records and delete the specific rows.
This should work:
delete t
from <your table> t
where t.rank != (select top (1) tt.rank
                 from <your table> tt
                 where tt.emp_id = t.emp_id
                 order by tt.rank desc -- use asc if you want to keep the lowest rank
                )
I do not encourage deleting records, but this solution can help with expiring records or deleting them:
The table should have a unique ID and a field that lets you mark a record as expired. If it does not, I recommend adding them to the table. You can create a composite ID in your query, but down the road you will wish you had these attributes.
Create a query that identifies every record where RANK <> 1; this will be your subquery.
Then write your UPDATE query:
UPDATE A
SET [EXPIRE_DTTM] = GETDATE()
FROM *TableNameWithTheRecords* A
INNER JOIN (*SubQuery*) B ON A.UniqueID = B.UniqueID
If you truly want to delete the records, use this:
DELETE FROM *TableNameWithTheRecords*
WHERE *UniqueID* IN (SELECT *UniqueID* FROM *TableNameWithTheRecords* WHERE RANK <> 1)
WITH tbl_alias AS
(
SELECT emp_ID,
RN = ROW_NUMBER() OVER(PARTITION BY emp_ID ORDER BY emp_ID)
FROM tblName
)
DELETE FROM tbl_alias WHERE RN > 1

Remove multiple postings but keep first

I have a table which has had numerous postings over the course of the week that I need to remove. The timestamp is different on each, so I need to keep the first entry but remove all the others that came after it.
What techniques would be advised?
SQL Server 2008
Many thanks
J
You can use a CTE with delete. The result is something like this:
with todelete as (
select p.*,
row_number() over (partition by post_id order by datetimecol asc) as seqnum
from posts p
)
delete from todelete
where seqnum > 1;
You can just run the subquery to see what is happening.
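For example, to preview the rows that would be deleted (same assumed post_id and timestamp column names as above):
with todelete as (
    select p.*,
           row_number() over (partition by post_id order by datetimecol asc) as seqnum
    from posts p
)
select *
from todelete
where seqnum > 1;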
Delete all the posts except the oldest
DELETE FROM tbl
WHERE ID NOT IN
(
select top 1 id
from tbl
order by TimeStampColumn
)

Select last duplicate row with different id Oracle 11g

I have a table that looks like this:
The problem is that I need to get the last record among the duplicates in the column "NRODENUNCIA".
You can use MAX(DENUNCIAID), along with GROUP BY... HAVING to find the duplicates and select the row with the largest DENUNCIAID:
SELECT MAX(DENUNCIAID), NRODENUNCIA, FECHAEMISION, ADUANA, MES, NOMBREESTADO
FROM YourTable
GROUP BY NRODENUNCIA, FECHAEMISION, ADUANA, MES, NOMBREESTADO
HAVING COUNT(1) > 1
This will only show rows that have at least one duplicate. If you want to see non-duplicate rows too, just remove the HAVING COUNT(1) > 1
There are a number of solutions to your problem. One is to use ROW_NUMBER.
Note that I've ordered by DENUNCIAID in the OVER clause. This defines the "last record" as the one that has the largest DENUNCIAID. If you want to define it differently, you'd need to change the column being ordered.
with dupes as (
    SELECT
        ROW_NUMBER() OVER (PARTITION BY NRODENUNCIA ORDER BY DENUNCIAID DESC) RN,
        d.*
    FROM
        YourTable d
)
SELECT * FROM dupes WHERE RN = 1
This only gets the last record per group.
If you want to include only records that actually have duplicates, change the WHERE clause to:
WHERE RN = 1
AND NRODENUNCIA IN (SELECT NRODENUNCIA FROM dupes WHERE RN > 1)
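Putting that together (a sketch against the same assumed table and columns):
with dupes as (
    SELECT
        ROW_NUMBER() OVER (PARTITION BY NRODENUNCIA ORDER BY DENUNCIAID DESC) RN,
        d.*
    FROM
        YourTable d
)
SELECT *
FROM dupes
WHERE RN = 1
  AND NRODENUNCIA IN (SELECT NRODENUNCIA FROM dupes WHERE RN > 1)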