This question already has answers here:
Delete duplicate records in SQL Server?
(10 answers)
Closed 4 years ago.
I'm trying to remove all rows from a database that duplicate an int (named user_id) in order to keep just the first occurrence. Not sure why my attempts didn't work, and would like an explanation of how you solved the problem even more than a solution.
My attempt (and sample data) http://sqlfiddle.com/#!18/9f6fc/5
End goal:
user_id  PAios_AccountId
123      a
223      b
The easiest way is to use ROW_NUMBER:
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY PAios_AccountId) AS rn
FROM [User]
)
DELETE FROM cte
WHERE rn <> 1;
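Since the question asks for an explanation more than a solution, here is a rough sketch of what ROW_NUMBER actually assigns. It uses Python's built-in sqlite3 module (SQLite 3.25 or later is assumed for window functions) rather than SQL Server, and the table name users and the four sample rows are made up for illustration, because the fiddle data is not reproduced here. The numbering restarts at 1 for every user_id, so rn = 1 is the row that survives and everything else is what the DELETE removes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, PAios_AccountId TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(123, "a"), (123, "b"), (223, "b"), (223, "c")],  # hypothetical sample rows
)

# Number the rows inside each user_id group; rn = 1 marks the row that is kept.
numbered = conn.execute(
    """
    SELECT user_id, PAios_AccountId,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY PAios_AccountId) AS rn
    FROM users
    ORDER BY user_id, rn
    """
).fetchall()

for row in numbered:
    print(row)

# Rows with rn = 1 are the ones "DELETE ... WHERE rn <> 1" would leave behind.
kept = [(user_id, account) for user_id, account, rn in numbered if rn == 1]
```

With this sample data, kept matches the end goal shown in the question.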
This question already has answers here:
Delete duplicate records from a SQL table without a primary key
(20 answers)
Delete duplicate entries keeping one entry of each if id column not available
(2 answers)
Closed 2 years ago.
I loaded some data into a SQL Server table from a .CSV file for test purposes. The table has no primary key, unique key, or auto-generated ID.
Below is an example of the situation:
select *
from people
where name in (select name
               from people
               group by name
               having count(name) > 1)
When I run this query, I get these results:
The goal is to keep one row and remove other duplicate rows.
Is there any way to do this other than saving the content somewhere else, deleting all the duplicate rows, and inserting one row back?
Thanks for helping!
You could use an updatable CTE for this.
If you want to delete rows that are exact duplicates on the three columns (as shown in your sample data and explained in the question):
with cte as (
select row_number() over(partition by name, age, gender order by (select null)) rn
from people
)
delete from cte where rn > 1
If you want to delete duplicates on name only (as shown in your existing query):
with cte as (
select row_number() over(partition by name order by (select null)) rn
from people
)
delete from cte where rn > 1
How are you defining "duplicate"? Based on your code example, it appears to be by name.
For the deletion, you can use an updatable CTE with row_number():
with todelete as (
select p.*,
row_number() over (partition by name order by (select null)) as seqnum
from people p
)
delete from todelete
where seqnum > 1;
If more columns define the duplicate, then adjust the partition by clause.
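For anyone who wants to try the idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or later assumed). It is not the same statement as above: SQLite cannot delete through a CTE, so this version matches the numbered rows back through SQLite's hidden rowid, which the SQL Server table in the question does not have. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, gender TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ann", 30, "F"), ("Ann", 30, "F"), ("Bob", 41, "M"), ("Ann", 30, "F")],
)

# Number the rows in each (name, age, gender) group, then delete every row
# whose number is greater than 1. rowid is SQLite-specific (an assumption of
# this sketch), used only because SQLite has no updatable CTEs.
conn.execute(
    """
    DELETE FROM people
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY name, age, gender ORDER BY rowid
                   ) AS rn
            FROM people
        )
        WHERE rn > 1
    )
    """
)

remaining = conn.execute(
    "SELECT name, age, gender FROM people ORDER BY name"
).fetchall()
print(remaining)
```

To treat rows as duplicates on name only, the PARTITION BY list would shrink to name, just as in the answers above.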
This question already has answers here:
Selecting first row per group
(2 answers)
Closed 6 years ago.
I have the table named DealOffers :
I want to select only one record from each group of DealId values: the one where Price is the minimum.
That is, the expected output should look like this:
You can do something like this. However, you should consider performance if you end up having to do this on a massive scale.
select *
from (
    select *,
           SeqNum = row_number() over(
               partition by DealId
               order by Price)
    from DealOffers) do
where do.SeqNum = 1;
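A small runnable sketch of the same pattern, using Python's built-in sqlite3 module (SQLite 3.25 or later assumed) instead of SQL Server. The Merchant column and the sample prices are hypothetical, since the original table was only shown as an image; the alias syntax is also changed to AS because SeqNum = ... is specific to T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DealOffers (DealId INTEGER, Merchant TEXT, Price INTEGER)")
conn.executemany(
    "INSERT INTO DealOffers VALUES (?, ?, ?)",
    [(1, "A", 50), (1, "B", 40), (2, "C", 30), (2, "D", 35)],  # made-up rows
)

# Within each DealId, the cheapest offer gets SeqNum = 1; keep only that row.
cheapest = conn.execute(
    """
    SELECT DealId, Merchant, Price
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY DealId ORDER BY Price) AS SeqNum
        FROM DealOffers
    ) AS d
    WHERE d.SeqNum = 1
    ORDER BY DealId
    """
).fetchall()
print(cheapest)
```

If two offers tie on the minimum price, ROW_NUMBER picks one of them arbitrarily unless a tie-breaking column is added to the ORDER BY.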
This question already has answers here:
How to delete duplicate rows in SQL Server?
(26 answers)
Closed 6 years ago.
I have a table with 82,535 rows, where 65,087 rows are unique by ID. When I pull the entire result set of 82,535 and copy to Excel and remove duplicates, it shows that there are 17,448 duplicates. But when I'm using the query below I'm getting different results:
SELECT
BLD_ID, COUNT(BLD_ID) AS [BLD_ID COUNT]
FROM
Project.BreakageAnalysisOutcome_SentToAIM
GROUP BY
BLD_ID
HAVING
COUNT(BLD_ID) >= 2
This query returns 17,364 rows.
I know for sure that the number of unique BLD_ID values is 65,087.
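The most likely reason is that some BLD_ID values occur more than twice. Your query counts each duplicated BLD_ID once, no matter how many extra rows it has, while Excel counts every extra row.
To find the number of duplicate rows:
SELECT COUNT(BLD_ID) - COUNT(DISTINCT BLD_ID)
FROM Project.BreakageAnalysisOutcome_SentToAIM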
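The difference between the two counts can be seen on a toy table. This is a sketch in Python's built-in sqlite3 module with seven invented BLD_ID values and a hypothetical table name outcome, not the real data. With the numbers from the question, 82,535 - 65,087 = 17,448 extra rows, and the gap to 17,364 (84 rows) would be the occurrences beyond the second one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE outcome (BLD_ID INTEGER)")
# 7 rows, 4 distinct values: 2 appears twice, 3 appears three times.
conn.executemany(
    "INSERT INTO outcome VALUES (?)",
    [(1,), (2,), (2,), (3,), (3,), (3,), (4,)],
)

# Counts duplicated IDs: one per group, however many extra rows the group has.
duplicated_ids = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT BLD_ID FROM outcome GROUP BY BLD_ID HAVING COUNT(BLD_ID) >= 2)
    """
).fetchone()[0]

# Counts extra rows: total rows minus distinct IDs, which is what Excel reports.
extra_rows = conn.execute(
    "SELECT COUNT(BLD_ID) - COUNT(DISTINCT BLD_ID) FROM outcome"
).fetchone()[0]

print(duplicated_ids, extra_rows)
```

The value 3 contributes one duplicated ID but two extra rows, which is why the two numbers drift apart.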
To delete the duplicates, use a CTE with the ROW_NUMBER function instead of COUNT with a GROUP BY clause, and filter by Rn > 1.
;WITH cte
AS
(
SELECT ID,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID) AS Rn
FROM [Table1]
)
DELETE cte WHERE Rn > 1
This question already has answers here:
How to delete duplicate rows in SQL Server?
(26 answers)
Closed 6 years ago.
I have a table and need to delete every row where a CustomerID occurs for the second or a subsequent time, but keep the first occurrence of each CustomerID. By the way, my table has ID, which is the primary key, and CustomerID, which is duplicated. So I need to remove all rows with a duplicated CustomerID.
Delete From Table1 where ID IN (select ID From Table1 where count(distinct CutomerID) >=2 group by CustomerID)
The code above will delete all id including the first occurrence of each of the IDs, but I need to keep their first occurrence. Please advise.
This code should give you what you need.
There may be better ways to do it if you can provide the full table schema for Table1.
Number the rows within each CustomerID and then delete everything except the first one:
;WITH cte
AS
(
    -- ID is the primary key, so number rows per CustomerID, lowest ID first
    SELECT ID,
           ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY ID) AS Rn
    FROM [Table1]
)
DELETE cte WHERE Rn > 1
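delete a from (
    -- partition by CustomerID; ID is unique, so dense_rank() numbers rows like row_number()
    select dense_rank() over(partition by CustomerID order by ID) as Rn, *
    from Table1) a
where a.Rn > 1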
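One thing worth checking in this situation is the PARTITION BY column. Because ID is the primary key, numbering rows per ID can never produce a value above 1, so nothing would be deleted; the numbering has to restart per CustomerID. A quick sketch with Python's built-in sqlite3 module (SQLite 3.25 or later assumed, sample rows invented) shows the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, 10), (5, 20)],  # hypothetical rows
)

# Partitioning by the primary key: every group has exactly one row.
max_rn_by_id = conn.execute(
    """
    SELECT MAX(rn)
    FROM (SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS rn FROM Table1)
    """
).fetchone()[0]

# Partitioning by CustomerID: repeats get rn = 2, 3, ... in ID order.
ids_to_delete = [
    row[0]
    for row in conn.execute(
        """
        SELECT ID
        FROM (
            SELECT ID,
                   ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY ID) AS rn
            FROM Table1
        )
        WHERE rn > 1
        ORDER BY ID
        """
    )
]
print(max_rn_by_id, ids_to_delete)
```

Here the first row of each customer (IDs 1 and 3) is kept and the later ones are the deletion candidates.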
This question already has answers here:
Selecting specific row number in sql [duplicate]
(2 answers)
Closed 7 years ago.
I wonder, is there any option in SQL like "Skip" (from LINQ) to select particular rows from a table?
For example, a table named "abcd" has 300 rows, and from those 300 rows I want to select rows 233 to 300, or 233 to 258.
How can I do this? Any help is appreciated.
You can use a cte or derived table for this:
With cte as (
Select col1, col2, row_number() over(order by sortColumn) as rn
From table
)
Select *
from cte
Where rn >= 233
And rn <= 258
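Here is a rough, runnable version of that query using Python's built-in sqlite3 module (SQLite 3.25 or later assumed). The table abcd comes from the question; the single column col1 holding the values 1 to 300 is an assumption made so the result is easy to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abcd (col1 INTEGER)")
conn.executemany("INSERT INTO abcd VALUES (?)", [(n,) for n in range(1, 301)])

# Number all rows in a defined order, then keep positions 233 through 258.
page = conn.execute(
    """
    SELECT col1
    FROM (
        SELECT col1, ROW_NUMBER() OVER (ORDER BY col1) AS rn
        FROM abcd
    )
    WHERE rn >= 233
      AND rn <= 258
    ORDER BY rn
    """
).fetchall()
print(len(page), page[0], page[-1])
```

The ORDER BY inside OVER is what gives "row 233" a meaning; without a deterministic sort column the same query can return different rows from run to run.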