Delete duplicate rows based on a condition - sql

I have a table with ID and EventDate columns. It contains duplicate rows because I built it from a UNION of two tables. Now I need to keep the row with the minimum EventDate for each ID and remove the other duplicates.
For example, the table looks like this:
ID | EventDate
--- | ---
1 | 10/27/1993
1 | 10/27/1994
2 | 10/17/1993
2 | 08/15/1993

You can use ROW_NUMBER:
;WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY EventDate)
FROM dbo.YourTable
)
DELETE FROM CTE
WHERE RN > 1;
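As a rough, runnable illustration of the same ROW_NUMBER idea, here is a sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer (for window functions), dates stored as ISO-8601 text, and the sample rows from the question. SQLite does not let DELETE target a CTE, so this variant matches the numbered rows back to the table through rowid, which is a substitute for the SQL Server form above and not the same statement.

```python
import sqlite3

# Hypothetical in-memory copy of the question's table; dates are ISO-8601
# text so that string order matches chronological order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER, EventDate TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [(1, "1993-10-27"), (1, "1994-10-27"), (2, "1993-10-17"), (2, "1993-08-15")],
)

# Number the rows per ID by EventDate, then delete everything except row 1.
conn.execute(
    """
    DELETE FROM YourTable
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY ID ORDER BY EventDate) AS RN
            FROM YourTable
        ) AS numbered
        WHERE RN > 1
    )
    """
)

remaining = conn.execute(
    "SELECT ID, EventDate FROM YourTable ORDER BY ID"
).fetchall()
print(remaining)
```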

Use this:
delete A
from (
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY [COLUMN] ORDER BY EventDate ASC)
FROM dbo.Your_Table
) AS A
where RN > 1

In Firebird, a correlated subquery is enough:
DELETE FROM table1 t1_1
WHERE EXISTS(
SELECT t1_2.id FROM table1 t1_2 WHERE t1_2.id = t1_1.id AND t1_1.EventDate>t1_2.EventDate
);
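For an EXISTS-based delete to keep one row per ID, the subquery has to compare rows of the same ID. The following is a minimal sketch of that per-ID form, run against SQLite through Python's sqlite3 module rather than Firebird; the table contents are the question's sample rows with dates rewritten as ISO-8601 text, and the alias `earlier` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, EventDate TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "1993-10-27"), (1, "1994-10-27"), (2, "1993-10-17"), (2, "1993-08-15")],
)

# Delete a row when another row with the same id has an earlier EventDate.
# Without the id comparison, only the globally earliest row would survive.
conn.execute(
    """
    DELETE FROM table1
    WHERE EXISTS (
        SELECT 1
        FROM table1 AS earlier
        WHERE earlier.id = table1.id
          AND earlier.EventDate < table1.EventDate
    )
    """
)

kept = conn.execute("SELECT id, EventDate FROM table1 ORDER BY id").fetchall()
print(kept)
```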

According to the documentation, in MySQL you cannot "delete from a table and select from the same table in a subquery".
So copy the table first:
CREATE TABLE table2 LIKE table1;
INSERT INTO table2 SELECT * FROM table1;
DELETE FROM table1
WHERE EXISTS(
SELECT t2.id FROM table2 t2 WHERE t2.id = table1.id AND table1.EventDate>t2.EventDate
);
DROP TABLE table2;
Here table1 is your original table.

Related

SQL Joining two tables and removing the duplicates from the two tables but without losing any duplicates from the tables themselves

I want to join two tables and remove duplicates from both the tables but keeping any duplicate value found in the first table.
T1
Name
-----
A
A
B
C
T2
Name
----
A
D
E
Expected result
A - > FROM T1
A - > FROM T1
B
C
D
E
I tried UNION, but it removes all duplicates of 'A' from both tables.
How can I achieve this?
Filter T2 before UNION ALL
select col
from T1
union all
select col
from T2
where not exists (select 1 from T1 where T1.col = T2.col)
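To check this against the sample data, here is a small sketch using Python's sqlite3 module. The table and column names (T1, T2, Name) follow the question rather than the `col` placeholder used above, and the rows are sorted in Python because UNION ALL by itself guarantees no order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (Name TEXT)")
conn.execute("CREATE TABLE T2 (Name TEXT)")
conn.executemany("INSERT INTO T1 VALUES (?)", [("A",), ("A",), ("B",), ("C",)])
conn.executemany("INSERT INTO T2 VALUES (?)", [("A",), ("D",), ("E",)])

# Keep every row of T1, and only those rows of T2 whose Name is absent from T1.
rows = conn.execute(
    """
    SELECT Name FROM T1
    UNION ALL
    SELECT Name FROM T2
    WHERE NOT EXISTS (SELECT 1 FROM T1 WHERE T1.Name = T2.Name)
    """
).fetchall()

names = sorted(name for (name,) in rows)
print(names)
```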
Assuming you want, for each value, the number of duplicates from whichever table has the most repetitions, you can do it with the ROW_NUMBER() window function, which eliminates duplicates by their sequence number within the set of repetitions in each table.
SELECT Name FROM (
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
UNION
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
) x
ORDER BY Name
To see how this works out, we add two B rows to T2 then do this:
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
Name Row
A 1
A 2
B 1
C 1
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
Name Row
A 1
B 1
B 2
D 1
E 1
Now UNION them without ALL to combine and eliminate duplicates:
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
UNION
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
Name Row
A 1
A 2
B 1
B 2
C 1
D 1
E 1
The final query up top is then just eliminating the Row column and sorting the result, to ensure ascending order.
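The walkthrough above can be reproduced with a short sketch in Python's sqlite3 module, assuming SQLite 3.25 or newer for window functions. T2 here includes the two extra B rows from the walkthrough, and the alias `rn` replaces `Row` only to avoid any keyword clash in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (Name TEXT)")
conn.execute("CREATE TABLE T2 (Name TEXT)")
conn.executemany("INSERT INTO T1 VALUES (?)", [("A",), ("A",), ("B",), ("C",)])
conn.executemany(
    "INSERT INTO T2 VALUES (?)", [("A",), ("B",), ("B",), ("D",), ("E",)]
)

# Each (Name, rn) pair is unique within a table, so UNION (without ALL)
# keeps, per Name, as many rows as the table with the most repetitions.
rows = conn.execute(
    """
    SELECT Name FROM (
        SELECT Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
        FROM T1
        UNION
        SELECT Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
        FROM T2
    ) AS x
    ORDER BY Name
    """
).fetchall()

names = [name for (name,) in rows]
print(names)
```

Note that B appears twice in the output because T2 has two B rows, so this approach answers a slightly different question than "keep duplicates from T1 only".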
See SQL Fiddle for demo.
select * from T1
union all
select * from T2 where name not in (select distinct name from T1)
Sql Fiddle Demo
Use "UNION ALL" instead of "UNION".
"UNION" removes duplicate records, while "UNION ALL" returns all of them.
For your result, the WHERE clause filters the values that also occur in T1 out of T2, so "UNION ALL" keeps the duplicates from T1 without bringing back any from T2:
select col1 from t1
union all
select col1 from t2 where t2.col1 not in(select t1.col1 from t1)
I don't know whether the following is good practice, but it works. Note that it needs UNION ALL, because a plain UNION would also collapse the duplicate 'A' rows from T1.
select name from T1
UNION ALL
select name from T2 Where name not in (select name from T1)
The query above filters T2 against the values in T1, then combines the rows of both tables and shows the result.
Note: NOT IN over a subquery can affect performance on large tables, so this may not be the best way to get the result.
You want all names from T1 and all names from T2 except the names that are in T1.
So you can use UNION ALL for the 2 cases and the operator EXCEPT to filter the rows of T2:
SELECT Name FROM T1
UNION ALL
(
SELECT Name FROM T2
EXCEPT
SELECT Name FROM T1
)
See the demo.
Results:
> | Name |
> | :--- |
> | A |
> | A |
> | B |
> | C |
> | D |
> | E |
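A quick way to try the EXCEPT approach locally is the sketch below, which uses Python's sqlite3 module. SQLite does not accept a parenthesized compound SELECT as an operand, so the EXCEPT part is wrapped in a subquery here; that wrapping is an adaptation for SQLite, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (Name TEXT)")
conn.execute("CREATE TABLE T2 (Name TEXT)")
conn.executemany("INSERT INTO T1 VALUES (?)", [("A",), ("A",), ("B",), ("C",)])
conn.executemany("INSERT INTO T2 VALUES (?)", [("A",), ("D",), ("E",)])

# All rows of T1, plus the names of T2 that do not occur in T1.
rows = conn.execute(
    """
    SELECT Name FROM T1
    UNION ALL
    SELECT Name FROM (
        SELECT Name FROM T2
        EXCEPT
        SELECT Name FROM T1
    ) AS only_in_t2
    """
).fetchall()

names = sorted(name for (name,) in rows)
print(names)
```

Keep in mind that EXCEPT also removes duplicates within T2 itself, so a name that occurs twice in T2 and never in T1 would appear once.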

SQL remove duplicate row depend on certain value

I spent a day trying to figure out how to write this query.
I have the following table:
ID Name Pregnancy Gender
1 Raghad Yes Female
1 Raghad No Female
2 Ohoud no Male
What I need is to remove the duplicate (in this case ID 1 appears twice) and keep the row that has a pregnancy status of yes.
To clarify, I can't use delete since it's a restricted database. I can only retrieve data.
Using an exists clause:
DELETE
FROM yourTable t1
WHERE
pregnancy = 'no' AND
EXISTS (SELECT 1 FROM yourTable t2 WHERE t2.ID = t1.ID AND t2.pregnancy = 'yes');
There are other ways to go about doing this, e.g. using ROW_NUMBER, but as you did not tag your database, I offer the above solution which should work on basically any database.
If you want to just view your data with the "duplicates" removed, then use:
SELECT *
FROM yourTable t1
WHERE
pregnancy = 'yes' OR
NOT EXISTS (SELECT 1 FROM yourTable t2 WHERE t2.ID = t1.ID AND t2.pregnancy = 'yes');
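Since the asker can only read data, the SELECT form is the relevant one. Below is a minimal sketch of it using Python's sqlite3 module; it assumes the Pregnancy values are stored with consistent casing ('Yes' / 'No'), because string comparison in SQLite is case-sensitive by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER, Name TEXT, Pregnancy TEXT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (1, "Raghad", "Yes", "Female"),
        (1, "Raghad", "No", "Female"),
        (2, "Ohoud", "No", "Male"),
    ],
)

# Keep 'Yes' rows, and keep other rows only when the ID has no 'Yes' row.
result = conn.execute(
    """
    SELECT ID, Name, Pregnancy, Gender
    FROM yourTable AS t1
    WHERE t1.Pregnancy = 'Yes'
       OR NOT EXISTS (
            SELECT 1 FROM yourTable AS t2
            WHERE t2.ID = t1.ID AND t2.Pregnancy = 'Yes'
       )
    ORDER BY ID
    """
).fetchall()
print(result)
```

If an ID has several non-'Yes' rows and no 'Yes' row, all of them are returned, so this filter alone does not guarantee one row per ID.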
If the Pregnancy column has just two values, "Yes" and "No", you can also use ROW_NUMBER() to get the result.
;WITH CTE
AS (
SELECT *,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Pregnancy DESC) RN
FROM TABLE_NAME
)
SELECT *
FROM CTE
WHERE RN = 1
If there are more than two values and you want to give the highest priority to "Yes", you can write the query as follows:
;WITH CTE
AS (
SELECT *,ROW_NUMBER() OVER
(PARTITION BY id ORDER BY CASE WHEN Pregnancy = 'Yes' then 0 else 1 end) RN
FROM TABLE_NAME
)
SELECT *
FROM CTE
WHERE RN= 1
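The CASE-based priority can be tried out with the sketch below, again using Python's sqlite3 module and assuming SQLite 3.25 or newer. The table name `patients` and the extra 'Unknown' row are invented for the example, to show a third value that should rank below 'Yes'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (id INTEGER, Name TEXT, Pregnancy TEXT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?, ?)",
    [
        (1, "Raghad", "Unknown", "Female"),  # hypothetical extra value
        (1, "Raghad", "No", "Female"),
        (1, "Raghad", "Yes", "Female"),
        (2, "Ohoud", "No", "Male"),
    ],
)

# 'Yes' sorts first inside each id; every other value shares second place.
result = conn.execute(
    """
    WITH CTE AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY id
                   ORDER BY CASE WHEN Pregnancy = 'Yes' THEN 0 ELSE 1 END
               ) AS RN
        FROM patients
    )
    SELECT id, Name, Pregnancy, Gender
    FROM CTE
    WHERE RN = 1
    ORDER BY id
    """
).fetchall()
print(result)
```

When an id has no 'Yes' row, the row that gets RN = 1 is arbitrary among the remaining ones unless a further ORDER BY term is added.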
For this sample data you can group by ID, Name, Gender and return the maximum value of the column Pregnancy for each group since Yes is greater compared to No:
SELECT ID, Name, MAX(Pregnancy) Pregnancy, Gender
FROM tablename
GROUP BY ID, Name, Gender
See the demo.
Results:
> ID | Name | Pregnancy | Gender
> -: | :----- | :-------- | :-----
> 1 | Raghad | Yes | Female
> 2 | Ohoud | No | Male
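The GROUP BY approach relies on string ordering, so it is worth trying on real values. Here is a small sketch with Python's sqlite3 module; it assumes consistent casing, since under SQLite's default binary collation a lowercase 'no' would sort above 'Yes' and win the MAX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (ID INTEGER, Name TEXT, Pregnancy TEXT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        (1, "Raghad", "Yes", "Female"),
        (1, "Raghad", "No", "Female"),
        (2, "Ohoud", "No", "Male"),
    ],
)

# 'Yes' > 'No' in string comparison, so MAX picks 'Yes' when both exist.
result = conn.execute(
    """
    SELECT ID, Name, MAX(Pregnancy) AS Pregnancy, Gender
    FROM tablename
    GROUP BY ID, Name, Gender
    ORDER BY ID
    """
).fetchall()
print(result)
```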
Here is how you could do it in MySQL 8.
Similar common table expressions exist in SQL Server and Oracle.
In SQL Server, the statement that precedes WITH must be terminated with a semicolon.
with dups as (
Select id from test
group by id
Having count(1) > 1
)
select * from test
where id in (select id from dups)
and Pregnancy = 'Yes'
union all
select * from test where id not in (select id from dups);
You can see it in action, by running it here
Note this does it without deleting the original.
But it gives you a result set to work with that has what you want.
If you wanted to delete, then you could use this instead, after the dups CTE definition:
delete from test
where id in (select id from dups) and Pregnancy = 'No'
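Or distill this into a single statement (the extra derived table is there because MySQL does not allow the target table of a DELETE to be selected from directly in a subquery):
delete from test
where id in (select id from (Select id from test
group by id
Having count(1) > 1) as dups) and Pregnancy = 'No'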
1) First of all, revise the design of your table. ID should be the primary key, which automatically prevents duplicate rows with the same ID.
2) You can use GROUP BY with a HAVING clause to remove duplicates:
delete from table where pregnancy='no' and id in (SELECT
id
FROM table
GROUP BY id
HAVING count(id)>1)

Delete except max records on duplicate

I have a table in the format below:
Item_Txn Item_Name
101 Mouse
102 Mouse
103 Mouse
104 Keyboard
105 CPU
106 Monitor
107 Monitor
I want to delete the duplicate items, keeping only the row with the maximum Item_Txn. For example, Mouse appears three times; I want to delete the Mouse records except (103, Mouse).
For SQL Server 2008 and newer:
;WITH cte AS
(
SELECT Item_Txn, Item_Name,
ROW_NUMBER() OVER (PARTITION BY Item_Name ORDER BY Item_Txn DESC) AS RowNumber
FROM my_table
)
DELETE FROM cte
WHERE RowNumber > 1
DELETE a
FROM my_table a
WHERE EXISTS (SELECT *
FROM my_table b
WHERE a.Item_Name = b.Item_Name
AND b.Item_Txn > a.Item_Txn);
Try this:
DELETE FROM MyTable
WHERE Item_Txn IN (
SELECT K.Item_Txn
FROM ( SELECT Item_Txn ,
ROW_NUMBER() OVER ( PARTITION BY Item_Name ORDER BY Item_Txn DESC ) AS RN
FROM MyTable
) AS K
WHERE K.RN > 1 );
Try this,
delete from table
where Item_Txn not in
(select max(Item_Txn) from table group by Item_Name)
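The NOT IN form can be checked against the question's data with a short sketch in Python's sqlite3 module. The table name `my_table` is borrowed from the earlier answer, since `table` itself is a reserved word; note also that MySQL rejects this exact shape because the subquery reads the table being deleted from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (Item_Txn INTEGER, Item_Name TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (101, "Mouse"), (102, "Mouse"), (103, "Mouse"),
        (104, "Keyboard"), (105, "CPU"),
        (106, "Monitor"), (107, "Monitor"),
    ],
)

# Keep only the highest Item_Txn of each Item_Name.
conn.execute(
    """
    DELETE FROM my_table
    WHERE Item_Txn NOT IN (
        SELECT MAX(Item_Txn) FROM my_table GROUP BY Item_Name
    )
    """
)

remaining = conn.execute(
    "SELECT Item_Txn, Item_Name FROM my_table ORDER BY Item_Txn"
).fetchall()
print(remaining)
```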
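You can do it using an intermediate subquery. Note that this keeps the most recent N rows of the whole table (by id), not one row per Item_Name: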
DELETE FROM `table`
WHERE id NOT IN (
SELECT id
FROM (
SELECT id
FROM `table`
ORDER BY id DESC
LIMIT 1 -- keep this many records
) foo
);
The above is for MySQL.
This is for SQL Server:
DELETE FROM chat WHERE id NOT IN
(SELECT TOP 50 id FROM chat ORDER BY id DESC)
DELETE t
FROM YourTable t
OUTER APPLY (
SELECT MAX(Item_Txn) as Item_Txn
FROM YourTable t1
WHERE t1.Item_Name = t.Item_Name
) as p
WHERE p.Item_Txn != t.Item_Txn
That query will leave only:
103 Mouse
104 Keyboard
105 CPU
107 Monitor
;WITH CTE AS
(
SELECT MAX(Item_Txn)Item_Txn, Item_Name FROM ITEM GROUP BY Item_Name
)
DELETE t
FROM ITEM t
WHERE EXISTS
(
SELECT 1 FROM CTE WHERE t.Item_Name = CTE.Item_Name AND t.Item_Txn <> CTE.Item_Txn
)

Send faulty rows to other table

I have a table with many columns in which I have to find duplicates based on one column.
That is, if I find a duplicate customer_name in the Customer_name column, then
I have to remove all the repeated rows from the source table
and send all those rows to another table with the same structure.
If you have two tables like this:
CREATE TABLE t1 (ID int, customerName varchar(64))
CREATE TABLE t2 (ID int, customerName varchar(64))
You can do something like this (the ID column is just there as a basis for deciding what to keep; you can change it as needed):
--First Copy
WITH CTE_T1
AS
(
SELECT
ID,
customerName,
ROW_NUMBER() OVER(PARTITION BY customerName ORDER BY ID) as OrderOfCustomer
FROM
t1
)
INSERT INTO t2
SELECT ID, customerName FROM cte_T1
WHERE OrderOfCustomer > 1;
--Then Delete
WITH CTE_T1
AS
(
SELECT
ID,
customerName,
ROW_NUMBER() OVER(PARTITION BY customerName ORDER BY ID) as OrderOfCustomer
FROM
t1
)
DELETE FROM CTE_T1
WHERE OrderOfCustomer > 1
Here is an SQLFiddle to show how it works.
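The copy-then-delete sequence can be sketched end to end with Python's sqlite3 module, assuming SQLite 3.25 or newer. The customer rows are made up, and because SQLite cannot delete through a CTE, the delete step matches on ID instead; that only works if ID is unique in t1 and t2 starts out empty, which is an assumption of this example. Both steps run in one transaction so a failure does not leave rows copied but not removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (ID INTEGER, customerName TEXT)")
conn.execute("CREATE TABLE t2 (ID INTEGER, customerName TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "Alice"), (4, "Alice"), (5, "Bob"), (6, "Carol")],
)

with conn:  # commits on success, rolls back if either statement fails
    # Copy every repeat (second and later row per customerName) into t2.
    conn.execute(
        """
        INSERT INTO t2 (ID, customerName)
        SELECT ID, customerName
        FROM (
            SELECT ID, customerName,
                   ROW_NUMBER() OVER (
                       PARTITION BY customerName ORDER BY ID
                   ) AS OrderOfCustomer
            FROM t1
        ) AS numbered
        WHERE OrderOfCustomer > 1
        """
    )
    # Remove the copied rows from the source table.
    conn.execute("DELETE FROM t1 WHERE ID IN (SELECT ID FROM t2)")

source_rows = conn.execute("SELECT ID, customerName FROM t1 ORDER BY ID").fetchall()
moved_rows = conn.execute("SELECT ID, customerName FROM t2 ORDER BY ID").fetchall()
print(source_rows, moved_rows)
```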
I assume each row has a unique id primary key.
This inserts the duplicated rows into your duplicates table:
Insert into duplicatesTable
select * from myTable t1
where (select count(*) from myTable t2 where t1.customerId = t2.customerId) > 1
Then you delete the good rows from duplicatesTable:
delete from duplicatesTable
where --this is not the faulty row for each customerId
Finally, you delete from your first table:
delete from myTable
where id IN (select id from duplicatesTable)
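Try this:
For moving duplicates (if DuplicatesTable has exactly the same columns as SourceTable, list those columns in the outer SELECT instead of *, since * also returns RowID):
INSERT Into DuplicatesTable
SELECT *
FROM
(SELECT *, ROW_NUMBER() OVER(PARTITION BY Customer_name ORDER BY Customer_name) As RowID
FROM SourceTable) as temp
WHERE RowID > 1
For deleting: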
WITH TableCTE
AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Customer_name ORDER BY Customer_name) AS RowID
FROM SourceTable
)
DELETE
FROM TableCTE
WHERE RowID> 1

get id value when max is on other column

Is this the best way to get the id value of the most recent date?
table1
id,entrydate
1,8/23/2012
2,8/24/2012
3,8/23/2012
select id from table1 where entrydate = ( select MAX(entrydate) from table1 )
Assuming you're using SQL-Server, you can use ORDER BY and then take one row:
SELECT TOP 1 id
FROM table
ORDER BY entrydate DESC
In MySql it is LIMIT:
SELECT id
FROM table
ORDER BY entrydate DESC
LIMIT 1
In Oracle:
SELECT id
FROM (SELECT id FROM table ORDER BY entrydate DESC)
WHERE ROWNUM = 1
You already have a good way there. I'd watch out for ties, though; to get a single row:
select top 1 id from table1 where entrydate = ( select MAX(entrydate) from table1 )
This, of course, assumes you are using SQL Server.
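To make the tie issue concrete, here is a small sketch using Python's sqlite3 module, so it uses LIMIT rather than TOP. The fourth row is a hypothetical addition that shares the latest date with id 2; dates are stored as ISO-8601 text so that MAX and ORDER BY compare them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, entrydate TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (1, "2012-08-23"),
        (2, "2012-08-24"),
        (3, "2012-08-23"),
        (4, "2012-08-24"),  # hypothetical row that ties with id 2
    ],
)

# The MAX subquery returns every id that shares the latest date.
all_latest = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM table1
        WHERE entrydate = (SELECT MAX(entrydate) FROM table1)
        ORDER BY id
        """
    )
]

# ORDER BY ... LIMIT 1 returns one row; the extra "id" sort key makes the
# choice among tied rows deterministic.
one_latest = conn.execute(
    "SELECT id FROM table1 ORDER BY entrydate DESC, id LIMIT 1"
).fetchone()[0]

print(all_latest, one_latest)
```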
SELECT id FROM table1 ORDER BY entrydate DESC LIMIT 1
You should be able to do SELECT id FROM table1 ORDER BY entrydate DESC LIMIT 1
Not exactly, you want to do this:
For SQL Server:
SELECT TOP 1 id, MAX(entrydate) FROM table1 GROUP BY id ORDER BY MAX(entrydate) DESC
For MySQL:
SELECT id, MAX(entrydate) FROM table1 GROUP BY id ORDER BY MAX(entrydate) DESC LIMIT 1