How to delete a record by its OID? - sql

I have a weird case which I have no idea how it happened.
This is my table:
id
date
amount
where id cannot be NULL and is auto-incrementing.
Sometime last year the system created the following situation:
OID id date amount
710604512 197 2015-03-11 10657.61
710604513 197 2015-03-11 10657.61
This causes huge problems as id should be unique.
I can't fix this from regular SQL because any action I'll do will be done on both rows.
One of them needs to be deleted.
Deleting both and inserting one is unacceptable, as I cannot tamper with the dates (the row records its date of creation, and the logs will show it).
How can I delete the row by its OID?

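If you want to delete the record with the smaller OID should duplicates occur, then you can try this (deleting through a CTE works in SQL Server; PostgreSQL does not accept a CTE as the target of a DELETE):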
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY id, date, amount ORDER BY OID DESC) AS rn
FROM yourTable
)
DELETE FROM cte WHERE rn=2; -- or rn >=2 to delete all duplicates
To delete the record with the greater OID, just change the ORDER BY clause to this:
ORDER BY OID

Assuming you want to keep the row with the max OID for each id, you can use this:
delete
from your_table t1
using (
select id, max(OID) as OID
from your_table
group by id
) t2
where t1.id = t2.id and t1.OID <> t2.OID;
Or:
delete
from your_table t1
where exists (
select 1
from your_table t2
where t1.id = t2.id
and t1.OID < t2.OID
);
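As a rough, runnable illustration of the EXISTS variant, here is a small sketch using Python's built-in sqlite3 module. It assumes SQLite's implicit rowid as a stand-in for OID (that substitution is mine, not something from the question), and the function name is made up for the example.

```python
import sqlite3

def delete_duplicate_by_rowid():
    # In-memory SQLite database; rowid stands in for OID here (an assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (id INTEGER, date TEXT, amount REAL)")
    # Two identical rows, mirroring the situation in the question.
    conn.executemany(
        "INSERT INTO your_table (id, date, amount) VALUES (?, ?, ?)",
        [(197, "2015-03-11", 10657.61), (197, "2015-03-11", 10657.61)],
    )
    # Delete every row for which another row with the same id
    # and a larger rowid exists, so the largest rowid per id survives.
    conn.execute(
        """
        DELETE FROM your_table
        WHERE EXISTS (
            SELECT 1 FROM your_table t2
            WHERE t2.id = your_table.id
              AND your_table.rowid < t2.rowid
        )
        """
    )
    rows = conn.execute(
        "SELECT rowid, id, date, amount FROM your_table"
    ).fetchall()
    conn.close()
    return rows
```

Calling delete_duplicate_by_rowid() should leave a single row for id 197, the one with the larger rowid, with its date and amount untouched.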

If id, date, amount are the business key in your case, you can remove all records beyond the first in each group by partitioning on these columns. Something like this:
DELETE FROM theTable
WHERE OID IN (
SELECT OID
FROM ( SELECT ROW_NUMBER() OVER (PARTITION BY id, date, amount) AS RowNo, OID
FROM theTable) x
WHERE x.RowNo > 1);
Note: this should work regardless of the number of duplicates.
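To see the "regardless of the number of duplicates" part in action, here is a small sketch with Python's sqlite3 module. It again assumes SQLite's rowid in place of OID, uses invented sample rows, and needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

def remove_duplicates_by_business_key():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE theTable (id INTEGER, date TEXT, amount REAL)")
    # Three copies of one row plus one unrelated row (sample data only).
    conn.executemany(
        "INSERT INTO theTable VALUES (?, ?, ?)",
        [
            (197, "2015-03-11", 10657.61),
            (197, "2015-03-11", 10657.61),
            (197, "2015-03-11", 10657.61),
            (198, "2015-03-12", 50.0),
        ],
    )
    # Number the rows inside each (id, date, amount) group and delete
    # everything numbered above 1; rowid stands in for OID (an assumption).
    conn.execute(
        """
        DELETE FROM theTable
        WHERE rowid IN (
            SELECT rid
            FROM (SELECT rowid AS rid,
                         ROW_NUMBER() OVER (PARTITION BY id, date, amount) AS RowNo
                  FROM theTable) x
            WHERE x.RowNo > 1)
        """
    )
    rows = conn.execute(
        "SELECT id, date, amount FROM theTable ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

Without an ORDER BY in the window, which of the identical copies survives is not specified, which is fine when the rows really are identical.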

Related

Avoid duplicate records from a particular column of a table

I have a table as shown in the image. In the Number column, some values appear more than once (for example, 63 appears twice). I would like to keep only one row per value. Please see my code:
delete from t1 where
(SELECT *,row_number() OVER (
PARTITION BY
Number
ORDER BY
Date) as rn from t1 where rn > 1)
It shows an error. Can anyone please assist?
The column created by row_number() is not visible to the WHERE clause of the same query. To filter on it, wrap the query in a subquery and apply the filter in the outer query:
SELECT *
FROM
(
SELECT *,
row_number() OVER (PARTITION BY Number ORDER BY Date) as rn
FROM t1 ) T
where rn = 1;
The partition by determines how row numbers repeat. The row numbers are assigned per group of partition by keys. So, you can get duplicates.
If you want a unique row number over all rows, just leave out the partition by:
select t1.*
from (select t1.*,
row_number() over (order by date) as rn
from t1
) t1
where rn > 1
If you want to keep only one value, use rn = 1 instead of rn > 1.
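The difference between numbering per partition and numbering over all rows is easier to see side by side. This is only a sketch with made-up sample rows, run through Python's sqlite3 module (window functions need SQLite 3.25 or later); the column aliases rn_per_number and rn_overall are my own names.

```python
import sqlite3

def row_numbers():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (Number INTEGER, Date TEXT)")
    # 63 appears twice, as in the question; the dates are invented.
    conn.executemany(
        "INSERT INTO t1 VALUES (?, ?)",
        [(63, "2020-01-01"), (63, "2020-01-02"), (64, "2020-01-03")],
    )
    rows = conn.execute(
        """
        SELECT Number, Date,
               ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Date) AS rn_per_number,
               ROW_NUMBER() OVER (ORDER BY Date) AS rn_overall
        FROM t1
        ORDER BY Date
        """
    ).fetchall()
    conn.close()
    return rows
```

With the partition, the numbering restarts at 1 for each Number, so rn > 1 marks the extra copies; without it, every row gets its own number.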

Deleting specific rows - SQLite

I'm trying to delete duplicate rows from my table 'exchange_transactions' associated with the surgeon name 'Lucille Torres' using a cte. The transaction_id column should be unique but is duplicated in this case hence the attempt to delete them. I tried this code but it doesn't seem to work. Replacing 'DELETE' with 'SELECT *' shows me all the rows I want to delete. What am I doing wrong?
WITH cte AS (
SELECT
transaction_id,
surgeon,
ROW_NUMBER() OVER (
PARTITION BY
transaction_id
) row_num
FROM exchange_transactions)
DELETE FROM cte
WHERE surgeon = 'Lucille Torres' AND row_num > 1
Use the column ROWID to get the minimum value for each transaction_id that you will not delete:
delete from exchange_transactions
where surgeon = 'Lucille Torres'
and exists (
select 1 from exchange_transactions t
where t.surgeon = exchange_transactions.surgeon
and t.transaction_id = exchange_transactions.transaction_id
and t.rowid < exchange_transactions.rowid
)
Deleting directly from a CTE won't work in SQLite.
But if the table has a primary key (e.g. id),
then the result of the CTE can be used in the delete.
For example:
WITH CTE_DUPS AS
(
SELECT id,
ROW_NUMBER() OVER (
PARTITION BY surgeon, transaction_id
ORDER BY id) AS rn
FROM exchange_transactions
WHERE surgeon = 'Lucille Torres'
)
DELETE
FROM exchange_transactions
WHERE id IN (select id from CTE_DUPS where rn > 1)
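For anyone who wants to try the CTE-plus-primary-key approach locally, here is a minimal sketch using Python's sqlite3 module. The id column, the sample rows and the second surgeon name are assumptions made up for the demo, and it needs SQLite 3.25 or later for ROW_NUMBER().

```python
import sqlite3

def delete_with_cte():
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: id is an assumed integer primary key.
    conn.execute(
        "CREATE TABLE exchange_transactions ("
        "id INTEGER PRIMARY KEY, transaction_id INTEGER, surgeon TEXT)"
    )
    conn.executemany(
        "INSERT INTO exchange_transactions (transaction_id, surgeon) VALUES (?, ?)",
        [
            (1, "Lucille Torres"),
            (1, "Lucille Torres"),  # duplicate that should be removed
            (2, "Lucille Torres"),
            (3, "Someone Else"),
            (3, "Someone Else"),    # duplicate of another surgeon, left alone
        ],
    )
    # The CTE only numbers the rows; the DELETE targets the real table.
    conn.execute(
        """
        WITH CTE_DUPS AS (
            SELECT id,
                   ROW_NUMBER() OVER (
                       PARTITION BY surgeon, transaction_id
                       ORDER BY id) AS rn
            FROM exchange_transactions
            WHERE surgeon = 'Lucille Torres'
        )
        DELETE FROM exchange_transactions
        WHERE id IN (SELECT id FROM CTE_DUPS WHERE rn > 1)
        """
    )
    rows = conn.execute(
        "SELECT id, transaction_id, surgeon FROM exchange_transactions ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

Only the second copy of Lucille Torres' transaction 1 goes away; the other surgeon's duplicates are untouched because of the WHERE inside the CTE.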

Delete Distinct column and latest date of other column

I have a table where the primary key is a composite key of ID and date. Is there a way that I can delete a single row where ID matches and the date is the latest date?
I am new to SQL, so I have tried a few things, but I either don't get the results I am looking for or can't get the syntax correct.
DELETE FROM Master
WHERE ((Identifier = 'SomeID')
AND (EffectiveDate = MAX(EffectiveDate));
There are multiple rows with the same ID but different dates, i.e.
ID EffectiveDate
-------------------------
A '2019-09-18'
A '2019-09-17'
A '2019-09-16'
Is there a way I can delete only the row with A | '2019-09-18'?
You can use window functions and an updatable CTE:
with todelete as (
select t.*, row_number() over (partition by id order by effective_date desc) as seqnum
from t
)
delete from todelete
where seqnum = 1;
Note: If you want to limit this to a single id, then be sure to include a where id = 'a' in either the subquery or outer query.
Use row_number():
delete from (select *, row_number() over(partition by id order by effectivedate desc) rn from table_name
) a where a.rn=1
A correlated subquery might get the job done:
DELETE FROM Master
WHERE
Identifier = 'SomeID'
AND EffectiveDate = (
SELECT MAX(EffectiveDate) FROM Master WHERE Identifier = 'SomeID'
)
;
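A quick way to check the subquery approach for a single identifier is a throwaway SQLite database driven from Python. This is just a sketch: the sample rows extend the ones in the question with an extra identifier B that I added, and the function name and its parameter are made up.

```python
import sqlite3

def delete_latest_for(identifier):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Master ("
        "Identifier TEXT, EffectiveDate TEXT, "
        "PRIMARY KEY (Identifier, EffectiveDate))"
    )
    conn.executemany(
        "INSERT INTO Master VALUES (?, ?)",
        [
            ("A", "2019-09-18"),
            ("A", "2019-09-17"),
            ("A", "2019-09-16"),
            ("B", "2019-09-18"),  # extra identifier, added for the demo
        ],
    )
    # The subquery finds the latest date for this identifier;
    # the outer DELETE removes exactly that row.
    conn.execute(
        """
        DELETE FROM Master
        WHERE Identifier = ?
          AND EffectiveDate = (
              SELECT MAX(EffectiveDate) FROM Master WHERE Identifier = ?
          )
        """,
        (identifier, identifier),
    )
    rows = conn.execute(
        "SELECT Identifier, EffectiveDate FROM Master "
        "ORDER BY Identifier, EffectiveDate"
    ).fetchall()
    conn.close()
    return rows
```

The dates are stored as ISO-8601 text here, so MAX() compares them correctly as strings; with a real date type that caveat does not apply.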
Use a CTE to delete the row. The query below will not delete the max-date record for IDs that have only a single record:
with todelete as (
select t.*, row_number() over (partition by id order by effective_date desc) as seqnum
from t
)
delete from todelete
where seqnum = 1 and id in(select distinct id from todelete where seqnum<>1)
With correlated subquery for all IDs:
delete t1
from table1 t1
where t1.EffectiveDate =
(
select max(t2.EffectiveDate)
from table1 t2
where t2.ID = t1.ID
)
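Here is a small sketch of the all-IDs variant, written for SQLite through Python's sqlite3 module, so the DELETE syntax differs slightly from the statement above. The COUNT(*) guard that skips IDs with a single row is my own addition, and the sample rows are invented.

```python
import sqlite3

def delete_latest_per_id():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 ("
        "ID TEXT, EffectiveDate TEXT, PRIMARY KEY (ID, EffectiveDate))"
    )
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?)",
        [
            ("A", "2019-09-16"), ("A", "2019-09-17"), ("A", "2019-09-18"),
            ("B", "2019-09-18"),                       # only one row for B
            ("C", "2019-09-01"), ("C", "2019-09-02"),
        ],
    )
    # For each row, compare its date with the latest date of the same ID.
    # The COUNT(*) guard (an addition for this sketch) keeps IDs that
    # have only one row from being emptied.
    conn.execute(
        """
        DELETE FROM table1
        WHERE EffectiveDate = (
                  SELECT MAX(t2.EffectiveDate)
                  FROM table1 t2
                  WHERE t2.ID = table1.ID)
          AND (SELECT COUNT(*) FROM table1 t3 WHERE t3.ID = table1.ID) > 1
        """
    )
    rows = conn.execute(
        "SELECT ID, EffectiveDate FROM table1 ORDER BY ID, EffectiveDate"
    ).fetchall()
    conn.close()
    return rows
```

Dropping the COUNT(*) condition gives the plain "delete the latest row of every ID" behaviour, which would also remove B's only row.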

Delete duplicates but keep 1 with multiple column key

I have the following SQL select. How can I convert it to a delete statement so it keeps 1 of the rows but deletes the duplicate?
select s.ForsNr, t.*
from [testDeleteDublicates] s
join (
select ForsNr, period, count(*) as qty
from [testDeleteDublicates]
group by ForsNr, period
having count(*) > 1
) t on s.ForsNr = t.ForsNr and s.Period = t.Period
Try using following:
Method 1:
DELETE FROM Mytable WHERE RowID NOT IN (SELECT MIN(RowID) FROM Mytable GROUP BY Col1,Col2,Col3)
Method 2:
;WITH cte
AS (SELECT ROW_NUMBER() OVER (PARTITION BY ForsNr, period
ORDER BY ( SELECT 0)) RN
FROM testDeleteDublicates)
DELETE FROM cte
WHERE RN > 1
Hope this helps!
NOTE:
Please change the table & column names according to your need!
This is easy as long as you have a generated primary key column (which is a good idea). You can simply select the min(id) of each duplicate group and delete everything else - Note that I have removed the having clause so that the ids of non-duplicate rows are also excluded from the delete.
delete from [testDeleteDublicates]
where id not in (
select Min(Id) as Id
from [testDeleteDublicates]
group by ForsNr, period
)
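To illustrate why removing the having clause matters, here is a rough sketch run against SQLite from Python. The id column and the sample values are assumptions for the demo; note that the non-duplicated group (100, '2020-02') has to appear in the min(id) list, otherwise it would be deleted too.

```python
import sqlite3

def keep_min_id_per_group():
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema with a generated primary key column.
    conn.execute(
        "CREATE TABLE testDeleteDublicates ("
        "id INTEGER PRIMARY KEY, ForsNr INTEGER, period TEXT)"
    )
    conn.executemany(
        "INSERT INTO testDeleteDublicates (ForsNr, period) VALUES (?, ?)",
        [
            (100, "2020-01"), (100, "2020-01"),  # duplicate group
            (100, "2020-02"),                    # unique row
            (200, "2020-01"), (200, "2020-01"),  # duplicate group
        ],
    )
    # Keep the smallest id of every (ForsNr, period) group, delete the rest.
    conn.execute(
        """
        DELETE FROM testDeleteDublicates
        WHERE id NOT IN (
            SELECT MIN(id)
            FROM testDeleteDublicates
            GROUP BY ForsNr, period
        )
        """
    )
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM testDeleteDublicates ORDER BY id"
    )]
    conn.close()
    return ids
```

One thing to watch with NOT IN in general: if the subquery can return NULL, nothing gets deleted, so this pattern is safest on a NOT NULL key like the one assumed here.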
If you don't have an artificial primary key you may have to achieve the same effect using row numbers, which will be a bit more fiddly as their implementation varies from vendor to vendor.
You can do this in two ways:
1. Add a primary key and delete accordingly:
http://www.mssqltips.com/sqlservertip/1103/delete-duplicate-rows-with-no-primary-key-on-a-sql-server-table/
2. Use row_number() with the partition option to number the rows within each group at run time, then delete the duplicate rows (see "Removing duplicates using partition by SQL Server"):
-- list the grouping fields in the partition
;with cte as (
select ROW_NUMBER() over (partition by ForsNr, period order by ForsNr, period) as RowNo, *
from [testDeleteDublicates]
)
delete from cte
where RowNo > 1

SQL to delete the duplicates in a table

I have a table, transaction, which has duplicates. I want to keep the record with the minimum id and delete all the duplicates based on four fields: DATE, AMOUNT, REFNUMBER, PARENTFOLDERID. I wrote this query, but I am not sure whether it can be written more efficiently. Is there a better way? I am asking because I am worried about the run time.
DELETE FROM TRANSACTION
WHERE ID IN
(SELECT FIT2.ID
FROM
(SELECT MIN(ID) AS ID, FIT.DATE, FIT.AMOUNT, FIT.REFNUMBER, FIT.PARENTFOLDERID
FROM EWORK.TRANSACTION FIT
GROUP BY FIT.DATE, FIT.AMOUNT , FIT.REFNUMBER, FIT.PARENTFOLDERID
HAVING COUNT(1)>1 and FIT.AMOUNT >0) FIT1,
EWORK.TRANSACTION FIT2
WHERE FIT1.DATE=FIT2.DATE AND
FIT1.AMOUNT=FIT2.AMOUNT AND
FIT1.REFNUMBER=FIT2.REFNUMBER AND
FIT1.PARENTFOLDERID=FIT2.PARENTFOLDERID AND
FIT1.ID<>FIT2.ID)
It would probably be more efficient to do something like
DELETE FROM transaction t1
WHERE EXISTS( SELECT 1
FROM transaction t2
WHERE t1.date = t2.date
AND t1.amount = t2.amount
AND t1.refnumber = t2.refnumber
AND t1.parentFolderId = t2.parentFolderId
AND t2.id < t1.id )
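If the goal is to keep the minimum id per group across all four fields, as the question asks, the EXISTS pattern can be tried out on a toy table first. This sketch uses Python's sqlite3 module; the table is called transactions here (a hypothetical name, chosen because TRANSACTION is a keyword in SQLite) and the rows are invented.

```python
import sqlite3

def keep_min_id_exists():
    conn = sqlite3.connect(":memory:")
    # "transactions" is a stand-in name for the asker's table.
    conn.execute(
        "CREATE TABLE transactions ("
        "id INTEGER PRIMARY KEY, date TEXT, amount REAL, "
        "refnumber TEXT, parentfolderid INTEGER)"
    )
    conn.executemany(
        "INSERT INTO transactions (date, amount, refnumber, parentfolderid) "
        "VALUES (?, ?, ?, ?)",
        [
            ("2020-01-01", 10.0, "R1", 7),
            ("2020-01-01", 10.0, "R1", 7),  # duplicate of id 1
            ("2020-01-01", 10.0, "R1", 7),  # duplicate of id 1
            ("2020-01-01", 20.0, "R1", 7),  # differs only in amount
        ],
    )
    # A row is deleted when a row with the same four fields and a
    # smaller id exists, so the minimum id of each group is kept.
    conn.execute(
        """
        DELETE FROM transactions
        WHERE EXISTS (
            SELECT 1 FROM transactions t2
            WHERE t2.date = transactions.date
              AND t2.amount = transactions.amount
              AND t2.refnumber = transactions.refnumber
              AND t2.parentfolderid = transactions.parentfolderid
              AND t2.id < transactions.id
        )
        """
    )
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM transactions ORDER BY id"
    )]
    conn.close()
    return ids
```

On a large table, an index on the four grouping columns would probably matter more for run time than the choice between EXISTS and ROW_NUMBER(), but that is something to measure on the real data.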
DELETE FROM transaction
WHERE ID IN (
SELECT ID
FROM (SELECT ID,
ROW_NUMBER () OVER (PARTITION BY date
,amount
,refnumber
,parentfolderid
ORDER BY ID) rn
FROM transaction)
WHERE rn <> 1);
I would try something like this:
DELETE transaction
FROM transaction
LEFT OUTER JOIN
(
SELECT MIN(id) as id, date, amount, refnumber, parentfolderid
FROM transaction
GROUP BY date, amount, refnumber, parentfolderid
) as validRows
ON transaction.id = validRows.id
WHERE validRows.id IS NULL