What is the proper syntax for delete query with BigQuery - sql

I have the following query where the ID is not UNIQUE:
delete
( SELECT ROW_NUMBER() OVER (PARTITION BY createdOn, id order by updatedOn) as rn , id FROM `a.tab` ) as t
WHERE t.rn> 1;
The inner select return the result but the delete fails with:
Error: Syntax error: Unexpected "(" at [2:7]
What is the syntax problem here?

Unlike SQL Server, and a few other databases, Big Query does not allow deleting directly from a CTE. But, we can specify your target table, and then use the row number in the WHERE clause.
DELETE
FROM yourTable AS t1
WHERE (SELECT ROW_NUMBER() OVER (PARTITION BY createdOn, id ORDER BY updatedOn)
FROM yourTable AS t2
WHERE t1.id = t2.id) > 1;
The idea here is to correlate the row number value to each row in the delete statement using the id, which is presumably a primary key.

Use query as ::
delete from t
From ( SELECT ROW_NUMBER() OVER (PARTITION BY createdOn, id order by updatedOn) as rn , id FROM a.tab ) as t
WHERE t.rn> 1;
Hope this works

Related

Avoid duplicate records from a particular column of a table

I have a table as shown in the image.In Number column, the values are appeared more than once (for example 63 appeared twice). I would like to keep only one value. Please see my code:
delete from t1 where
(SELECT *,row_number() OVER (
PARTITION BY
Number
ORDER BY
Date) as rn from t1 where rn > 1)
It shows error. Can anyone please assist.
enter image description here
The column created by row_number() was not accessed by your main query, in order to enable that, you can create a quick sub query and use the desired filter
SELECT *
FROM
(
SELECT *,
row_number() OVER (PARTITION BY Number ORDER BY Date) as rn
FROM t1 ) T
where rn = 1;
The partition by determines how row numbers repeat. The row numbers are assigned per group of partition by keys. So, you can get duplicates.
If you want a unique row number over all rows, just leave out the partition by:
select t1.*
from (select t1.*,
row_number() over (order by date) as rn
from t1
) t1
where rn > 1
if you want to keep only one value, rn = 1 instead of "> 1"

Delete Distinct column and latest date of other column

I have a table where the primary key is a composite key of ID and date. Is there a way that I can delete a single row where ID matches and the date is the latest date?
I am new to SQL, so I have tried a few things, but I either don't get the results I am looking for or cant get the syntax correct
DELETE FROM Master
WHERE ((Identifier = 'SomeID')
AND (EffectiveDate = MAX(EffectiveDate));
There are multiple columns with the same ID, but different dates, ie.
ID EffectiveDate
-------------------------
A '2019-09-18'
A '2019-09-17'
A '2019-09-16'
Is there a way I can delete only the row with A | '2019-09-18'?
You can use window functions and an updatable CTE:
with todelete as (
select t.*, row_number() over (partition by id order by effective_date desc) as seqnum
from t
)
delete from todelete
where seqnum = 1;
Note: If you want to limit this to a single id, then be sure to include a where id = 'a' in either the subquery or outer query.
use row_number()
delete from (select *, row_number() over(partition by id order by effectivedate desc) rn from table_name
) a where a.rn=1
A correlated subquery might get the job done:
DELETE FROM Master
WHERE
Identifier = 'SomeID'
AND EffectiveDate = (
SELECT MAX(EffectiveDate) FROM Master WHERE Identifier = 'SomeID'
)
;
Use the CTE Function to Delete the Row but the below Query will not delete the Record of Max Date of those ID's where Single Record exist against that.
with todelete as (
select t.*, row_number() over (partition by id order by effective_date desc) as seqnum
from t
)
delete from todelete
where seqnum = 1 and id in(select distinct id from todelete where seqnum<>1)
With correlated subquery for all IDs:
delete table1
from table1 t1
where t1.EffectiveDate =
(
select max(t2.EffectiveDate)
from table1 t2
where t2.ID = t1.ID
)

How to delete a record by its OID?

I have a weird case which I have no idea how it happened.
This is my table:
id
date
amount
where id can not be NULL and is auto increasing.
Someone last year the system created the following situation:
OID id date amount
710604512 197 2015-03-11 10657.61
710604513 197 2015-03-11 10657.61
This causes huge problems as id should be unique.
I can't fix this from regular SQL because any action I'll do will be done on both rows.
One of them needs to be deleted.
The solution of deleting both and inserting one is unacceptable as I can not play with the dates (it records the date of creation and the logs will show it)
How can I delete the row by its OID?
If you want to delete the record with the smaller OID should duplicates occur, then you can try this:
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY id, date, amount ORDER BY OID DESC) AS rn
FROM yourTable
)
DELETE FROM cte WHERE rn=2; -- or rn >=2 to delete all duplicates
To delete the record with the greater OID, just change the ORDER BY clause to this:
ORDER BY OID
Assuming you have want to keep the row with max OID for each ID, you can use this:
delete
from your_table t1
using (
select id, max(OID)
from your_table
group by id
) t2
where t1.id = t2.id and t1.OID <> t2.OID;
Or:
delete
from your_table t1
where exists (
select 1
from your_table t2
where t1.id = t2.id
and t1.OID < t2.OID
);
If id, date, amount are the business key in your case, you can remove all records beyond the second by grouping by these columns. Something like this:
DELETE FROM theTable
WHERE OID IN (
SELECT OID
FROM ( SELECT ROW_NUMBER() OVER (PARTITION BY id, date, amount) AS RowNo, OID
FROM tab) x
WHERE x.RowNo > 1);
Note: this should work regardless of the number of duplicates.

Delete field Duplicates from the Same table

I am writing this query to display a bunch of Names from a table filled automatically from an outside source:
select MAX(UN_ID) as [ID] , MAX(UN_Name) from UnavailableNames group by (UN_Name)
I have a lot of name duplicates, so I used "Group by"
I want to delete all the duplicates right after I do this select query..
(Delete where the field UN_Name is available twice, leave it once)
Any way to do this?
Something likes this should work:
WITH CTE AS
(
SELECT rn = ROW_NUMBER()
OVER(
PARTITION BY UN_Name
ORDER BY UN_ID ASC), *
FROM dbo.UnavailableNames
)
DELETE FROM cte
WHERE rn > 1
You basically assign an increasing "row number" within each group that shares the same "un_name".
Then you just delete all rows which have a "row number" higher than 1 and keep all the ones that appeared first.
With CTE As
(
Select uid,ROW_NUMBER() OVER( PARTITION BY uname order by uid) as rownum
From yourTable
)
Delete
From yourTable
where uid in (select uid from CTE where rownum> 1 )

How do I delete duplicate rows in SQL Server using the OVER clause?

Here are the columns in my table:
Id
EmployeeId
IncidentRecordedById
DateOfIncident
Comments
TypeId
Description
IsAttenIncident
I would like to delete duplicate rows where EmployeeId, DateOfIncident, TypeId and Description are the same - just to clarify - I do want to keep one of them. I think I should be using the OVER clause with PARTITION, but I am not sure.
Thanks
If you want to keep one row of the duplicate-groups you can use ROW_NUMBER. In this example i keep the row with the lowest Id:
WITH CTE AS
(
SELECT rn = ROW_NUMBER()
OVER(
PARTITION BY employeeid, dateofincident, typeid, description
ORDER BY Id ASC), *
FROM dbo.TableName
)
DELETE FROM cte
WHERE rn > 1
use this query without using CTE....
delete a from
(select id,name,place, ROW_NUMBER() over (partition by id,name,place order by id) row_Count
from dup_table) a
where a.row_Count >1
You can use the following query. This has an assumption that you want to keep the latest row and delete the other duplicates.
DELETE [YourTable]
FROM [YourTable]
LEFT OUTER JOIN (
SELECT MAX(ID) as RowId
FROM [YourTable]
GROUP BY EmployeeId, DateOfIncident, TypeId, Description
) as KeepRows ON
[YourTable].ID = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL