I have the following SQL query that returns all duplicates (except the most recent):
select *
from CodeData
inner join
(select max(cdID) as lastId, cdName
from CodeData
where cdName in (select cdName from CodeData
group by cdName
having count(*) > 1)
group by cdName) duplic on duplic.cdName = CodeData.cdName
where CodeData.cdID < duplic.lastId;
How would I update a column for each row returned by the above query? Say I have an arbitrary column A; I was thinking of something along these lines at the end of the query:
update CodeData
set A = 0
where CodeData.cdID < CodeData.duplic.lastId;
But this ends up updating the entire table, and not just the returned results above.
Any tips?
Use an updatable CTE and window functions:
with toupdate as (
select cd.*,
row_number() over (partition by cdName order by cdId desc) as seqnum
from codedata cd
)
update toupdate
set a = 0
where seqnum > 1;
Note: You might find that row_number() is sufficient to handle the "duplication" issue and you don't need to store an additional column at all.
And you might really want to set the flag on every row, in which case the statement after the CTE becomes:
update toupdate
set a = (case when seqnum > 1 then 0 else 1 end);
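To see which rows the seqnum > 1 filter selects, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite is only a stand-in here: it has window functions (3.25 or later) but cannot update through a CTE, so the CTE is used to pick the ids instead. The sample rows and the default of 1 for column A are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CodeData (cdID INTEGER PRIMARY KEY, cdName TEXT, A INTEGER DEFAULT 1);
    -- Hypothetical sample data: 'x' appears three times, 'z' twice, 'y' once.
    INSERT INTO CodeData (cdID, cdName) VALUES
        (1, 'x'), (2, 'x'), (3, 'x'), (4, 'y'), (5, 'z'), (6, 'z');
""")

# Number the rows of each cdName from newest (highest cdID) to oldest,
# then flag everything except the newest row of each name.
conn.execute("""
    WITH ranked AS (
        SELECT cdID,
               ROW_NUMBER() OVER (PARTITION BY cdName ORDER BY cdID DESC) AS seqnum
        FROM CodeData
    )
    UPDATE CodeData
    SET A = 0
    WHERE cdID IN (SELECT cdID FROM ranked WHERE seqnum > 1)
""")

flags = conn.execute("SELECT cdID, A FROM CodeData ORDER BY cdID").fetchall()
print(flags)
```

Only the older copies (cdID 1, 2 and 5) end up with A = 0; the newest row of each name and the non-duplicated row keep A = 1.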
I am using an update query in which I reference an outer table in a nested subquery, but I get an error:
SQL Error: ORA-00904: "FS"."GR_NUMBER": invalid identifier
update table fs set fs.branch_id=
(select branch_id
from
(select branch_id,row_number() over(PARTITION by gr_number order by updated_ts desc) as Sno
from admission_log
where gr_number=fs.gr_number )
where sno=1) ;
The issue with your query is that you're referencing the fs.gr_number column in a subquery that is nested two levels deep. A correlated reference to the outer table is only visible in a subquery one level down.
Your statement should be:
UPDATE fs
SET fs.branch_id = (SELECT branch_id
FROM (SELECT branch_id,
gr_number,
row_number() OVER (PARTITION BY gr_number ORDER BY updated_ts DESC) AS sno
FROM admission_log) x
WHERE x.gr_number = fs.gr_number
AND sno = 1);
That moves the correlation to one level down. Note how I've also aliased the nested subquery.
The performance shouldn't be terrible, since the x.gr_number = fs.gr_number predicate involves the same column in the subquery which the analytic function is being partitioned on. That should allow Oracle to filter the subquery appropriately.
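As a rough check of the corrected statement's logic, the sketch below runs an equivalent update in an in-memory SQLite database from Python. SQLite is only a stand-in (it does not reproduce the ORA-00904 restriction), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fs (gr_number INTEGER PRIMARY KEY, branch_id INTEGER);
    CREATE TABLE admission_log (gr_number INTEGER, branch_id INTEGER, updated_ts TEXT);
    -- Hypothetical data: gr_number 3 has no log entries at all.
    INSERT INTO fs VALUES (1, NULL), (2, NULL), (3, NULL);
    INSERT INTO admission_log VALUES
        (1, 10, '2020-01-01'), (1, 11, '2020-02-01'),
        (2, 20, '2020-03-01'), (2, 21, '2020-01-15');
""")

# The correlation (x.gr_number = fs.gr_number) sits one level below the UPDATE;
# the innermost subquery only ranks the log rows and exposes gr_number.
conn.execute("""
    UPDATE fs
    SET branch_id = (SELECT x.branch_id
                     FROM (SELECT branch_id,
                                  gr_number,
                                  ROW_NUMBER() OVER (PARTITION BY gr_number
                                                     ORDER BY updated_ts DESC) AS sno
                           FROM admission_log) x
                     WHERE x.gr_number = fs.gr_number
                       AND x.sno = 1)
""")

result = conn.execute("SELECT gr_number, branch_id FROM fs ORDER BY gr_number").fetchall()
print(result)
```

Each row picks up the branch_id of its most recent log entry. Note that a row with no log entries (gr_number 3 here) is set to NULL, because the UPDATE has no WHERE clause restricting it to matching rows.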
ETA: you could also use a MERGE statement instead:
MERGE INTO fs tgt
USING (SELECT gr_number,
branch_id,
row_number() OVER (PARTITION BY gr_number ORDER BY updated_ts DESC) AS sno
FROM admission_log) src
ON (tgt.gr_number = src.gr_number AND src.sno = 1)
WHEN MATCHED THEN
UPDATE SET tgt.branch_id = src.branch_id;
Since it's Oracle, you can use a MERGE statement with the min() aggregate instead of the row_number() function. Note that min() picks the smallest branch_id for each gr_number, which is not necessarily the most recently updated one.
merge into fs using
(select gr_number, min(branch_id) as branch_id from admission_log group by gr_number) qry
on
(fs.gr_number = qry.gr_number)
when matched then
update
set
fs.branch_id = qry.branch_id;
If you still want to use the row_number() function, below is the query. The WHERE clause that joins to the outer table should be outside the innermost subquery, which therefore has to expose gr_number.
update fs set fs.branch_id=
(select branch_id
from
(select branch_id, gr_number, row_number() over(PARTITION by gr_number order by updated_ts desc)
as Sno
from admission_log)
where sno=1
and gr_number=fs.gr_number) ;
Use this code:
update fs a
set a.branch_id =
(select c.branch_id
from (select b.branch_id,
b.gr_number,
row_number() over(PARTITION by b.gr_number order by b.updated_ts desc) as Sno
from admission_log b) c
where c.gr_number = a.gr_number
and c.sno = 1);
So I have a query that returns data and a row number using ROW_NUMBER() OVER(PARTITION BY), and I place the result into a temp table. The initial output looks like the screenshot (image not included).
From here I need to, in the bt_newlabel column, replace the nulls respectively. So Rownumber 1-4 would be in progress, 5-9 would be underwriting, 10-13 would be implementation, and so forth.
I am hitting a wall trying to determine how to do this. Thanks for any help or input of how I would go about this.
One method is to assign groups and then fill in the value within each group, such as:
select t.*, max(bt_newlabel) over (partition by grp) as new_newlabel
from (select t.*, count(bt_newlabel) over (order by bt_stamp) as grp
from t
) t;
The group is simply the number of known (non-NULL) values seen so far in the data, counting the current row, so each labelled row starts a new group.
You can update the field with:
with toupdate as (
select t.*, max(bt_newlabel) over (partition by grp) as new_newlabel
from (select t.*, count(bt_newlabel) over (order by bt_stamp) as grp
from t
) t
)
update toupdate
set bt_newlabel = new_newlabel
where bt_newlabel is null;
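The grouping trick above can be checked on a small example. This is a sketch run through Python's sqlite3 module, with SQLite as a stand-in for the original database; the table contents are invented, and it assumes the known label sits on the first row of each block, so the fill runs forward.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (bt_stamp INTEGER PRIMARY KEY, bt_newlabel TEXT);
    -- Hypothetical data: one labelled row followed by NULLs to be filled.
    INSERT INTO t VALUES
        (1, 'In Progress'), (2, NULL), (3, NULL),
        (4, 'Underwriting'), (5, NULL),
        (6, 'Implementation'), (7, NULL);
""")

# COUNT(bt_newlabel) ignores NULLs, so the running count only increases on a
# labelled row; MAX over each group then spreads that label across the group.
rows = conn.execute("""
    SELECT bt_stamp,
           grp,
           MAX(bt_newlabel) OVER (PARTITION BY grp) AS new_newlabel
    FROM (SELECT t.*, COUNT(bt_newlabel) OVER (ORDER BY bt_stamp) AS grp
          FROM t) AS g
    ORDER BY bt_stamp
""").fetchall()

for row in rows:
    print(row)
```

If the label instead sits on the last row of each block, ordering the running count by bt_stamp descending should give the corresponding backward fill.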
If I understood what you are trying to do, this is the type of update you need to do on your temp table:
--This will update rows 1-4 to 'Pre-Underwritting'
UPDATE temp_table SET bt_newlabel = 'Pre-Underwritting'
WHERE rownumber between
1 AND (SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Pre-Underwritting');
--This will update rows 5-9 to 'Underwritting'
UPDATE temp_table SET bt_newlabel = 'Underwritting'
WHERE rownumber between
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Pre-Underwritting')
AND
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Underwritting');
--This will update rows 10-13 to 'Implementation'
UPDATE temp_table SET bt_newlabel = 'Implementation'
WHERE rownumber between
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Underwritting')
AND
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Implementation');
I made a working Fiddle to check out the results: http://sqlfiddle.com/#!18/1cae2/1/3
I am working on an update query with the help of CTEs. I want to update records in a table based on duplicates: among each set of duplicate rows, I want to update just one row. My code is below:
with toupdate as (
select c.*,
count(*) over (partition by c.ConsumerReferenceNumber) as cnt,
max(c.ID) over (partition by c.ID) as onhand_value
from [dbo].[tbl_NADRA_CPS] c
)
update [dbo].[tbl_NADRA_CPS]
set StatusID = 38
where cnt > 1;
I am unable to use 'cnt' in my update where clause.
Thanks in advance.
That is because cnt is a column of your CTE, not of [dbo].[tbl_NADRA_CPS]. Update the CTE instead:
with toupdate as (
select c.*,
count(*) over (partition by c.ConsumerReferenceNumber) as cnt,
max(c.ID) over (partition by c.ID) as onhand_value
from [dbo].[tbl_NADRA_CPS] c
)
update toupdate
set StatusID = 38
where cnt > 1;
If you want to update one row among the duplicates, then your query will not do that. Instead:
with toupdate as (
select c.*,
count(*) over (partition by c.ConsumerReferenceNumber) as cnt,
row_number() over (partition by c.ConsumerReferenceNumber order by c.ID) as seqnum
from [dbo].[tbl_NADRA_CPS] c
)
update toupdate
set StatusID = 38
where cnt > 1 and seqnum = 1;
The cnt > 1 gets reference numbers with more than one row. The seqnum = 1 ensures that just one is updated.
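To illustrate how the two conditions combine, here is a small sketch using Python's sqlite3 module. SQLite stands in for SQL Server and cannot update through a CTE, so the CTE only selects the ids; the table name cps and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cps (ID INTEGER PRIMARY KEY, ConsumerReferenceNumber TEXT, StatusID INTEGER);
    -- Hypothetical data: 'A' has two rows, 'B' one, 'C' three.
    INSERT INTO cps VALUES (1, 'A', 0), (2, 'A', 0), (3, 'B', 0),
                           (4, 'C', 0), (5, 'C', 0), (6, 'C', 0);
""")

# cnt > 1 keeps only reference numbers that occur more than once;
# seqnum = 1 then picks the lowest ID within each of those groups.
conn.execute("""
    WITH ranked AS (
        SELECT ID,
               COUNT(*) OVER (PARTITION BY ConsumerReferenceNumber) AS cnt,
               ROW_NUMBER() OVER (PARTITION BY ConsumerReferenceNumber ORDER BY ID) AS seqnum
        FROM cps
    )
    UPDATE cps
    SET StatusID = 38
    WHERE ID IN (SELECT ID FROM ranked WHERE cnt > 1 AND seqnum = 1)
""")

updated = [r[0] for r in conn.execute("SELECT ID FROM cps WHERE StatusID = 38 ORDER BY ID")]
print(updated)
```

Exactly one row per duplicated reference number is updated (IDs 1 and 4), and the unique 'B' row is left alone.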
I have some duplicates in a table that I wish to delete. However, SQL doesn't like my query below.
delete from tblS
where Field = 'spread'
group by FundCode, Region, DateEntered
having count(*) > 1
So I tried the query below, but SQL doesn't like this either. How should my query look?
delete s
from tblS s
join
(
select FundCode, Region, DateEntered, count(*)
from tblS
where Field = 'spread'
group by FundCode, Region, DateEntered
having count(*) > 1
) as d on s.FundCode = d.FundCode and s.DateEntered = d.DateEntered and s.Region= d.Region
Normally when you want to delete duplicates, you want to keep one of them. The right function to use is row_number() and SQL Server supports updatable CTEs. So:
with toupdate as (
select t.*,
row_number() over (partition by FundCode, Region, DateEntered order by FundCode) as seqnum
from tblS t
where Field = 'spread'
)
delete toupdate
where seqnum > 1;
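If you actually want to delete every row that has a duplicate (keeping none of them), then compute count(*) over (partition by FundCode, Region, DateEntered) as cnt instead of row_number(), and delete where cnt > 1.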
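The difference between the two variants can be seen in a small sketch run through Python's sqlite3 module. SQLite is a stand-in and cannot delete through a CTE, so a surrogate id column (an assumption, not part of the original table) is used to target rows; the helper remaining() and the sample data are hypothetical.

```python
import sqlite3

def remaining(delete_all):
    """Return the ids left in tblS after removing duplicates of the 'spread' rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tblS (id INTEGER PRIMARY KEY, FundCode TEXT, Region TEXT,
                           DateEntered TEXT, Field TEXT);
        -- ids 1-3 are duplicates of each other; 4 is unique; 5 is not a 'spread' row.
        INSERT INTO tblS (FundCode, Region, DateEntered, Field) VALUES
            ('F1', 'EU', '2020-01-01', 'spread'),
            ('F1', 'EU', '2020-01-01', 'spread'),
            ('F1', 'EU', '2020-01-01', 'spread'),
            ('F2', 'US', '2020-01-01', 'spread'),
            ('F1', 'EU', '2020-01-01', 'yield');
    """)
    # seqnum > 1 keeps one row per group; cnt > 1 removes the whole group.
    predicate = "cnt > 1" if delete_all else "seqnum > 1"
    conn.execute(f"""
        WITH ranked AS (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY FundCode, Region, DateEntered
                                      ORDER BY id) AS seqnum,
                   COUNT(*) OVER (PARTITION BY FundCode, Region, DateEntered) AS cnt
            FROM tblS
            WHERE Field = 'spread'
        )
        DELETE FROM tblS
        WHERE id IN (SELECT id FROM ranked WHERE {predicate})
    """)
    return [r[0] for r in conn.execute("SELECT id FROM tblS ORDER BY id")]

print(remaining(delete_all=False))  # keeps one row of each duplicate group
print(remaining(delete_all=True))   # removes every row that has a duplicate
```

In both cases the 'yield' row is untouched, because the Field = 'spread' filter is applied before the window functions are evaluated.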
I have the following SQL statement. I'm trying to get the highest version. My query is returning multiple results, although sorted by version desc. How can I just get the highest one?
SELECT pv.version, pv.vin, pv.policyid, pv.segeffdate, pv.segexpdate, pv.changenum from
nsa_al.polvehicle pv
WHERE pv.vin = '2GTEC19T011201788'
AND pv.changenum > 0
AND pv.VERSION = (SELECT MAX(PV.VERSION) FROM NSA_AL.POLVERSION)
ORDER BY pv.version DESC
I tried using rownum = 1, but I kept getting a missing ")" error.
Thanks
There are a couple of ways to do this. Here is one with a subquery and ROWNUM:
SELECT *
FROM (
SELECT pv.version, pv.vin, pv.policyid, pv.segeffdate, pv.segexpdate, pv.changenum
FROM nsa_al.polvehicle pv
WHERE pv.vin = '2GTEC19T011201788'
AND pv.changenum > 0
ORDER BY pv.version DESC
) t
WHERE ROWNUM = 1
This will only return a single record. If you need ties, you can use the analytic function RANK() instead.
SELECT *
FROM (
SELECT RANK() OVER (ORDER BY version DESC) rnk, pv.version, pv.vin, pv.changenum
FROM pv
WHERE pv.vin = '2GTEC19T011201788'
AND pv.changenum > 0
ORDER BY pv.version DESC
) t
WHERE rnk = 1
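The practical difference between taking one row and using RANK() shows up when two rows share the highest version. The sketch below uses Python's sqlite3 module with invented data; SQLite has no ROWNUM, so LIMIT 1 is used as an approximate stand-in for the ROWNUM = 1 filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pv (version INTEGER, vin TEXT, changenum INTEGER);
    -- Hypothetical data: two rows tie for the highest version (3).
    INSERT INTO pv VALUES (1, 'V1', 1), (3, 'V1', 1), (3, 'V1', 2), (2, 'V1', 3);
""")

# Single-row approach: sort, then keep only the first row.
top_one = conn.execute("""
    SELECT version, changenum
    FROM pv
    WHERE vin = 'V1' AND changenum > 0
    ORDER BY version DESC
    LIMIT 1
""").fetchall()

# RANK() gives every row with the highest version rank 1, so ties survive.
tied = conn.execute("""
    SELECT version, changenum
    FROM (SELECT RANK() OVER (ORDER BY version DESC) AS rnk, version, changenum
          FROM pv
          WHERE vin = 'V1' AND changenum > 0) t
    WHERE rnk = 1
    ORDER BY changenum
""").fetchall()

print(top_one)
print(tied)
```

The first query returns a single version-3 row (which of the tied rows is not guaranteed), while the RANK() query returns both.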
If you prefer to use a MAX aggregate, then it's easiest to do that with a common table expression and JOIN the table back on itself. This could yield multiple results if there are multiple records with the same version:
WITH CTE AS (
SELECT pv.version, pv.vin, pv.changenum
FROM pv
WHERE pv.vin = '2GTEC19T011201788'
AND pv.changenum > 0
)
SELECT *
FROM CTE C
JOIN (
SELECT MAX(version) maxVersion
FROM CTE) C2 ON C.version = C2.maxVersion