How can I delete duplicated row - sql

I am using SQL Server 2008 R2.
I found duplicate rows with this script:
SELECT CLDest, CdClient,
COUNT(CLDest) AS NumOccurrences
FROM DEST
GROUP BY CLDest,CdClient
HAVING ( COUNT(CLDest) > 1 )
It returns 48 entries.
Before I delete anything, I want to make sure that I am deleting only the duplicates:
SELECT DEST.CdClient
,DEST.CLDest
FROM [Soft8Exp_Client_WEB].[dbo].[DEST]
WHERE DEST.CdClient IN (SELECT CdClient
FROM DEST
GROUP BY CdClient
HAVING (COUNT(CLDest) > 1) )
AND DEST.CLDest IN (SELECT CLDest
FROM DEST
GROUP BY CLDest
HAVING (COUNT(CLDest) > 1) )
This query returns 64628 entries.
So I suppose my SELECT is wrong.

SQL Server has the nice property of updatable CTEs. When combined with the function row_number(), this does what you want:
with todelete as (
select d.*,
row_number() over (partition by CLDest, CdClient order by newid()) as seqnum
from dest d
)
delete from todelete
where seqnum > 1;
This version keeps one randomly chosen row from each group of duplicates and deletes the rest. What it does is assign a sequential number to the rows with the same values and delete all but the first one in each group. If you want to keep a specific row, for example the most recent by date, then use a different expression in the order by.
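To check the keep-one-per-group logic on a small scale, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for SQL Server. SQLite cannot delete through a CTE, so (an assumption of this sketch, not part of the answer above) the row numbers are matched back to the table through SQLite's built-in rowid, and the ordering uses rowid instead of newid(). The sample rows are made up.

```python
import sqlite3

def delete_duplicates(conn):
    """Keep one row per (CLDest, CdClient) pair and delete the others."""
    conn.execute(
        """
        DELETE FROM dest
        WHERE rowid IN (
            SELECT rid
            FROM (
                -- number the rows inside each duplicate group
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY CLDest, CdClient ORDER BY rowid
                       ) AS seqnum
                FROM dest
            )
            WHERE seqnum > 1
        )
        """
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest (CdClient TEXT, CLDest TEXT)")
conn.executemany(
    "INSERT INTO dest VALUES (?, ?)",
    [("C1", "D1"), ("C1", "D1"), ("C1", "D2"),
     ("C2", "D1"), ("C2", "D1"), ("C2", "D1")],  # hypothetical sample data
)
delete_duplicates(conn)
remaining = conn.execute(
    "SELECT CdClient, CLDest FROM dest ORDER BY CdClient, CLDest"
).fetchall()
print(remaining)
```

Each (CdClient, CLDest) pair should appear exactly once afterwards, whatever the number of copies it started with.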

;WITH Duplicates
AS
(
SELECT CLDest
, CdClient
, ROW_NUMBER() OVER (PARTITION BY CLDest, CdClient ORDER BY CdClient) AS Rn
FROM DEST
)
DELETE FROM Duplicates
WHERE RN > 1

SELECT DEST.CdClient,DEST.CLDest
FROM [Soft8Exp_Client_WEB].[dbo].[DEST]
WHERE DEST.CdClient+DEST.CLDest
IN (
SELECT CdClient+CLDest FROM DEST GROUP BY CdClient, CLDest HAVING ( COUNT(*) > 1 )
)

Related

Delete all rows per group except one using CTE

I'm using MariaDB, and I am trying to do two things, both of which are failing.
(1) I'm trying to delete all duplicated items while keeping one record.
WITH CTE AS (
SELECT asin, ROW_NUMBER() OVER (PARTITION BY asin ORDER BY created_at) AS n
FROM asin_list
)
DELETE
FROM CTE
WHERE n > 1
This returns the following error:
You have an error in your SQL syntax; check the manual that
corresponds to your MariaDB.
(2) As a workaround for the query above, I tried to insert all duplicated ASINs into a table, with the goal of selecting max(asin) later on and deleting it.
WITH CTE AS (
SELECT asin, ROW_NUMBER() OVER (PARTITION BY asin ORDER BY created_at) AS n
FROM asin_list
)
INSERT INTO temp1 *
FROM FROM CTE
WHERE n > 1
But this returns the same error. Can you please help me fix this?
You could write the statement as:
select * -- delete
from asin_list as newer
where exists (
select *
from asin_list as older
where older.asin = newer.asin and (
older.created_at < newer.created_at or
older.created_at = newer.created_at and older.pri_key < newer.pri_key
)
)
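As a rough illustration of which rows that EXISTS condition flags, here is a small Python sketch that runs the same query shape against an in-memory SQLite database (a stand-in for MariaDB). The table layout, including the pri_key column, follows the statement above; the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asin_list (pri_key INTEGER PRIMARY KEY, asin TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO asin_list VALUES (?, ?, ?)",
    [
        (1, "A", "2020-01-01"),
        (2, "A", "2020-01-02"),  # newer duplicate of row 1
        (3, "B", "2020-01-01"),
        (4, "B", "2020-01-01"),  # same timestamp as row 3, higher key
        (5, "C", "2020-01-03"),  # no duplicate
    ],
)

# A row is flagged when an "older" row with the same asin exists;
# pri_key breaks ties between identical created_at values.
flagged = [
    row[0]
    for row in conn.execute(
        """
        SELECT newer.pri_key
        FROM asin_list AS newer
        WHERE EXISTS (
            SELECT 1
            FROM asin_list AS older
            WHERE older.asin = newer.asin
              AND (older.created_at < newer.created_at
                   OR (older.created_at = newer.created_at
                       AND older.pri_key < newer.pri_key))
        )
        ORDER BY newer.pri_key
        """
    )
]
print(flagged)
```

Running the SELECT first, as the comment in the statement suggests, lets you inspect the flagged rows before turning it into a DELETE.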
Try adding ";" before "WITH". Something like:
;WITH CTE AS ( SELECT asin , row_number() OVER(PARTITION BY asin ORDER BY asin_list.created_at) AS n FROM asin_list ) delete from CTE WHERE n > 1
Let me know

Return second from the last oracle sql

SELECT * FROM
(
SELECT DISTINCT(TRUNC(receipt_dstamp))
FROM inventory
WHERE substr(location_id,1,3) = 'GI-'
ORDER BY 1 ASC
)
WHERE ROWNUM <= 5
Output:
Hi all, I've got this subquery, and in this case my oldest date is in row 1. I want to retrieve only the second from the last (from the top in this case), which is going to be 01-SEP-21.
I tried playing with ROWNUM and OVER but without any results; I'm getting blank output.
Thank you.
Full query:
SELECT TRUNC(receipt_dstamp) as old_putaway_date, COUNT(tag_id) as tag_old_putaway
FROM inventory
WHERE substr(location_id,1,3) = 'GI-'
AND TRUNC(receipt_dstamp) IN (
SELECT * FROM
(
SELECT DISTINCT(TRUNC(receipt_dstamp))
FROM inventory
WHERE substr(location_id,1,3) = 'GI-'
ORDER BY 1 ASC
)
WHERE ROWNUM = 1
)
GROUP BY TRUNC(receipt_dstamp);
You should be able to simplify the entire query to:
SELECT old_putaway_date,
COUNT(tag_id) as tag_old_putaway
FROM (
SELECT TRUNC(receipt_dstamp) as old_putaway_date,
tag_id,
DENSE_RANK() OVER (ORDER BY TRUNC(receipt_dstamp)) AS rnk
FROM inventory
WHERE substr(location_id,1,3) = 'GI-'
)
WHERE rnk = 3
GROUP BY
old_putaway_date;
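For anyone who wants to try the DENSE_RANK approach without an Oracle instance, here is a minimal sketch in Python with sqlite3. It assumes SQLite's date() as a replacement for TRUNC() on a timestamp, and the inventory rows are invented sample data; the rank to pick is passed in as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (receipt_dstamp TEXT, location_id TEXT, tag_id TEXT)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [
        ("2021-08-30 08:00:00", "GI-01", "T1"),
        ("2021-08-30 09:30:00", "GI-02", "T2"),
        ("2021-08-31 10:00:00", "GI-01", "T3"),
        ("2021-09-01 07:15:00", "GI-03", "T4"),
        ("2021-09-01 11:45:00", "GI-01", "T5"),
        ("2021-09-01 12:00:00", "XX-01", "T6"),  # filtered out: not a GI- location
        ("2021-09-02 12:00:00", "GI-01", "T7"),
    ],
)

def tags_on_nth_oldest_date(conn, n):
    """Count tags on the n-th oldest distinct receipt date (1 = oldest)."""
    return conn.execute(
        """
        SELECT old_putaway_date, COUNT(tag_id) AS tag_old_putaway
        FROM (
            SELECT date(receipt_dstamp) AS old_putaway_date,
                   tag_id,
                   DENSE_RANK() OVER (ORDER BY date(receipt_dstamp)) AS rnk
            FROM inventory
            WHERE substr(location_id, 1, 3) = 'GI-'
        )
        WHERE rnk = ?
        GROUP BY old_putaway_date
        """,
        (n,),
    ).fetchall()

third = tags_on_nth_oldest_date(conn, 3)
print(third)
```

Because DENSE_RANK gives every row on the same date the same rank, all tags for that date survive the rnk filter and can then be counted.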
You can use dense_rank():
SELECT * FROM (
SELECT L.*,DENSE_RANK()
OVER (PARTITION BY L.TAG_OLD_PUTAWAY ORDER BY L.OLD_PUTAWAY_DATE DESC) RNK
FROM
(
SELECT TRUNC(receipt_dstamp) as old_putaway_date, COUNT(tag_id) as tag_old_putaway
FROM inventory
WHERE substr(location_id,1,3) = 'GI-'
AND TRUNC(receipt_dstamp) IN (
SELECT * FROM
(
SELECT DISTINCT(TRUNC(receipt_dstamp))
FROM inventory
WHERE substr(location_id,1,3) = 'GI-'
ORDER BY 1 ASC
)
WHERE ROWNUM = 1
)
GROUP BY TRUNC(receipt_dstamp)
) L
) WHERE RNK = 2
You are using an old Oracle syntax that is not standard compliant in the regard that it relies on a subquery result order. (Sub)query results are unordered data sets by definition, but Oracle lets this pass in order to make their ROWNUM work with it.
Oracle now supports the standard SQL FETCH clause, which you should use instead.
SELECT DISTINCT TRUNC(receipt_dstamp) AS receipt_date
FROM inventory
WHERE SUBSTR(location_id, 1, 3) = 'GI-'
ORDER BY receipt_date
OFFSET 2 ROWS
FETCH NEXT 1 ROW ONLY;
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6

SQL Query Pattern Selecting

Table part
I need to select all from the rows where the first 7 characters of the Assoc.Ref column are the same on a specific day.
Result example
You need aggregation:
SELECT t.col
FROM table t
GROUP BY t.col
HAVING COUNT(*) > 1;
If you want exactly two rows for each, then use COUNT(*) = 2 instead.
If you want all rows, then you can use a window function:
SELECT t.*
FROM (SELECT t.*,
COUNT(*) OVER(PARTITION BY col) AS cnt
FROM table t
) t
WHERE t.cnt > 1;
EDIT: After the update to the question, you might need LEFT():
SELECT t.*
FROM (SELECT t.*,
COUNT(*) OVER(PARTITION BY CAST(Date_created AS date), LEFT(associated_ref, 7)) AS cnt
FROM table t
) t
WHERE t.cnt > 1 AND CAST(t.Date_created AS date) = '2019-02-08';
If Date_created has no time component, then no conversion is needed. Just use Date_created instead.
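Here is a small sketch of the same windowed count in Python with sqlite3, in case it helps to see it run. SQLite has no LEFT(), so substr(associated_ref, 1, 7) is used instead, and date() stands in for the CAST; the table name parts and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts (id INTEGER PRIMARY KEY, associated_ref TEXT, Date_created TEXT)"
)
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [
        (1, "ABC1234-01", "2019-02-08 09:00:00"),
        (2, "ABC1234-02", "2019-02-08 15:30:00"),
        (3, "XYZ9999-01", "2019-02-08 10:00:00"),  # unique prefix that day
        (4, "ABC1234-03", "2019-02-09 08:00:00"),  # same prefix, other day
        (5, "DEF5555-01", "2019-02-09 08:00:00"),
        (6, "DEF5555-02", "2019-02-09 09:00:00"),
    ],
)

def shared_prefix_rows(conn, day):
    """Ids of rows whose 7-character prefix occurs more than once on `day`."""
    return [
        row[0]
        for row in conn.execute(
            """
            SELECT t.id
            FROM (
                SELECT p.*,
                       COUNT(*) OVER (
                           PARTITION BY date(Date_created),
                                        substr(associated_ref, 1, 7)
                       ) AS cnt
                FROM parts AS p
            ) AS t
            WHERE t.cnt > 1 AND date(t.Date_created) = ?
            ORDER BY t.id
            """,
            (day,),
        )
    ]

matches = shared_prefix_rows(conn, "2019-02-08")
print(matches)
```

Partitioning by both the day and the prefix keeps a prefix that repeats across different days from being counted as a match.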

Delete duplicates but keep 1 with multiple column key

I have the following SQL select. How can I convert it to a delete statement so it keeps 1 of the rows but deletes the duplicate?
select s.ForsNr, t.*
from [testDeleteDublicates] s
join (
select ForsNr, period, count(*) as qty
from [testDeleteDublicates]
group by ForsNr, period
having count(*) > 1
) t on s.ForsNr = t.ForsNr and s.Period = t.Period
Try using the following:
Method 1:
DELETE FROM Mytable WHERE RowID NOT IN (SELECT MIN(RowID) FROM Mytable GROUP BY Col1,Col2,Col3)
Method 2:
;WITH cte
AS (SELECT ROW_NUMBER() OVER (PARTITION BY ForsNr, period
ORDER BY ( SELECT 0)) RN
FROM testDeleteDublicates)
DELETE FROM cte
WHERE RN > 1
Hope this helps!
NOTE:
Please change the table & column names according to your need!
This is easy as long as you have a generated primary key column (which is a good idea). You can simply select the min(id) of each duplicate group and delete everything else - Note that I have removed the having clause so that the ids of non-duplicate rows are also excluded from the delete.
delete from [testDeleteDublicates]
where id not in (
select Min(Id) as Id
from [testDeleteDublicates]
group by ForsNr, period
)
If you don't have an artificial primary key you may have to achieve the same effect using row numbers, which will be a bit more fiddly as their implementation varies from vendor to vendor.
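To see why dropping the having clause matters, here is a minimal sketch in Python with sqlite3 (used here only as a convenient stand-in for SQL Server). The id column and the sample rows are assumptions for the example; the delete is run once without and once with the having clause so the results can be compared.

```python
import sqlite3

# hypothetical sample data: (id, ForsNr, period); id 3 has no duplicate
ROWS = [
    (1, 100, "2020-01"), (2, 100, "2020-01"),
    (3, 100, "2020-02"),
    (4, 200, "2020-01"), (5, 200, "2020-01"), (6, 200, "2020-01"),
]

def surviving_ids(with_having):
    """Run the MIN(id) delete and return the ids that are left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE testDeleteDublicates "
        "(id INTEGER PRIMARY KEY, ForsNr INTEGER, period TEXT)"
    )
    conn.executemany("INSERT INTO testDeleteDublicates VALUES (?, ?, ?)", ROWS)
    having = "HAVING COUNT(*) > 1" if with_having else ""
    conn.execute(
        f"""
        DELETE FROM testDeleteDublicates
        WHERE id NOT IN (
            SELECT MIN(id)
            FROM testDeleteDublicates
            GROUP BY ForsNr, period
            {having}
        )
        """
    )
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM testDeleteDublicates ORDER BY id")]
    conn.close()
    return ids

kept_without_having = surviving_ids(with_having=False)
kept_with_having = surviving_ids(with_having=True)
print(kept_without_having, kept_with_having)
```

With the having clause left in, the subquery no longer lists the ids of non-duplicated rows, so the NOT IN deletes them as well.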
You can do this with two options.
1. Add a primary key and delete accordingly.
http://www.mssqltips.com/sqlservertip/1103/delete-duplicate-rows-with-no-primary-key-on-a-sql-server-table/
2. Use row_number() with the partition option to number the rows within each group at runtime, then delete the duplicate rows.
Removing duplicates using partition by SQL Server
-- list the grouping fields in the partition by clause
;with cte as (
select ROW_NUMBER() over (partition by ForsNr, period order by ForsNr, period) as RowNo, *
from [testDeleteDublicates]
)
delete from cte
where RowNo > 1

Total Row Count in sql query---sql server 2008

My query is as follows
BEGIN
WITH MyCTE
AS (
SELECT T.MusicAlbumTitle
,D.musicTitle
,D.mVideoID
,D.musicFileName
,T.ReleaseDate AS ReleasedDate
,D.MusicLength
,D.musicSinger
,D.MusicVideoID
,D.ExternalLink
,D.CoverImg
,ROW_NUMBER() OVER (
PARTITION BY D.MusicVideoID ORDER BY D.mVideoID
) AS row_num
FROM dbo.Music_Video T
JOIN dbo.Music_Video_Details D ON T.MusicVideoID = D.MusicVideoID
WHERE T.PortalID = @PortalID
AND T.CultureCode = @CultureCode
AND T.ComingSoon <> 1
GROUP BY T.MusicAlbumTitle
,D.musicTitle
,D.mVideoID
,T.ReleaseDate
,D.musicFileName
,D.MusicLength
,D.musicSinger
,D.MusicVideoID
,D.ExternalLink
,D.CoverImg
)
SELECT a.mVideoID
,a.MusicVideoID
,a.musicFileName
,a.MusicAlbumTitle
,a.ReleasedDate
,a.row_num
,a.CoverImg
,a.ExternalLink
,a.musicTitle
,a.MusicLength
FROM MyCTE a
WHERE row_num = 1
ORDER BY MusicVideoID DESC
END
I need to get the total row count from the last select statement,
that is, the total number of rows being selected.
Any other idea that might be useful in this situation is welcome.
How can I do this?
Please add COUNT(*) OVER() in your select, which returns total rows selected as a new column.
Ex:
SELECT
*,
COUNT(*) OVER() AS [Total_Rows]
FROM YourTable
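Where the COUNT(*) OVER () is placed determines what gets counted, because window functions are evaluated after the WHERE clause of the query level they appear in. The following is a minimal sketch in Python with sqlite3 (a stand-in for SQL Server); the cut-down details table and the column aliases cte_count and selected_count are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE details (mVideoID INTEGER, MusicVideoID INTEGER)")
conn.executemany(
    "INSERT INTO details VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, 30), (5, 30)],  # hypothetical sample data
)

rows = conn.execute(
    """
    WITH MyCTE AS (
        SELECT MusicVideoID,
               mVideoID,
               ROW_NUMBER() OVER (
                   PARTITION BY MusicVideoID ORDER BY mVideoID
               ) AS row_num,
               COUNT(*) OVER () AS cte_count       -- all rows, before filtering
        FROM details
    )
    SELECT MusicVideoID,
           cte_count,
           COUNT(*) OVER () AS selected_count      -- rows left after WHERE
    FROM MyCTE
    WHERE row_num = 1
    ORDER BY MusicVideoID DESC
    """
).fetchall()
print(rows)
```

Here cte_count reflects every joined row, while selected_count reflects only the rows with row_num = 1 that the final select returns.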
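Just to be clear, if you want the total before the row_num = 1 filter is applied, you need to add the count to the CTE, not the outer query. A COUNT(*) OVER () in the outer select is evaluated after its WHERE clause, so it counts only the rows that remain.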
The CTE should start:
WITH MyCTE
AS (
SELECT T.MusicAlbumTitle
,D.musicTitle
,D.mVideoID
,D.musicFileName
,T.ReleaseDate AS ReleasedDate
,D.MusicLength
,D.musicSinger
,D.MusicVideoID
,D.ExternalLink
,D.CoverImg
,ROW_NUMBER() OVER (
PARTITION BY D.MusicVideoID ORDER BY D.mVideoID
) AS row_num,
COUNT(*) over () as total_count