Update a table where there are duplicate records using SQL


I am trying to update a table, but the target table has duplicate records, so my update fails with this error: attempt to update a target row with values from multiple join rows. I know that an update should join on unique keys, but I cannot delete the duplicates from the table, so I am looking for a workaround. CUSTOMERTABLE is the table that has the duplicates. Here is my query:
UPDATE CUSTOMERTABLE
SET SERVICE = 'BILLING'
FROM
(SELECT distinct(CUSTOMER_ID)AS ACCT_ID
,ED.CUSTOMER_NAME
, ED.CUSTOMER_ADDRESS
FROM CUSTOMER_RELATION ED, STG_CUSTOMER_REV TXN
WHERE ED.CUSTOMER_ID = TXN.CUS_ID
)AS X
WHERE X.ACCT_ID = CUSTOMERTABLE.ACCOUNT_NUMBER;

Try writing it with an IN clause:
UPDATE CUSTOMERTABLE
SET SERVICE = 'BILLING'
WHERE CUSTOMERTABLE.ACCOUNT_NUMBER IN
(SELECT CUSTOMER_ID
FROM CUSTOMER_RELATION ED
JOIN STG_CUSTOMER_REV TXN ON ED.CUSTOMER_ID = TXN.CUS_ID)

Here is another option, which will probably perform better than the IN solution if CUSTOMER_RELATION or STG_CUSTOMER_REV are large tables.
UPDATE C
SET SERVICE = 'BILLING'
FROM CUSTOMERTABLE C
WHERE EXISTS (SELECT 1
FROM CUSTOMER_RELATION ED, STG_CUSTOMER_REV TXN
WHERE ED.CUSTOMER_ID = TXN.CUS_ID AND ED.CUSTOMER_ID = C.ACCOUNT_NUMBER);
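
To see why the IN and EXISTS forms avoid the "multiple join rows" error, here is a minimal sketch that runs the EXISTS variant against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for the asker's database here, and the column types and sample rows are made up for illustration. The subquery only answers "is there a match?", so duplicates on either side cannot multiply the rows being updated.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CUSTOMERTABLE (ACCOUNT_NUMBER INTEGER, SERVICE TEXT);
CREATE TABLE CUSTOMER_RELATION (CUSTOMER_ID INTEGER, CUSTOMER_NAME TEXT);
CREATE TABLE STG_CUSTOMER_REV (CUS_ID INTEGER);
-- Hypothetical data: account 1 is duplicated in the target, account 3 has no match.
INSERT INTO CUSTOMERTABLE VALUES (1, 'OLD'), (1, 'OLD'), (2, 'OLD'), (3, 'OLD');
-- The source side contains duplicates as well.
INSERT INTO CUSTOMER_RELATION VALUES (1, 'Ann'), (1, 'Ann'), (2, 'Bob');
INSERT INTO STG_CUSTOMER_REV VALUES (1), (2), (2);
""")

# EXISTS is a yes/no test per target row, so no join multiplicity is involved.
con.execute("""
UPDATE CUSTOMERTABLE
SET SERVICE = 'BILLING'
WHERE EXISTS (SELECT 1
              FROM CUSTOMER_RELATION ED
              JOIN STG_CUSTOMER_REV TXN ON ED.CUSTOMER_ID = TXN.CUS_ID
              WHERE ED.CUSTOMER_ID = CUSTOMERTABLE.ACCOUNT_NUMBER)
""")

rows = con.execute(
    "SELECT ACCOUNT_NUMBER, SERVICE FROM CUSTOMERTABLE ORDER BY ACCOUNT_NUMBER"
).fetchall()
print(rows)
```

Both copies of account 1 are updated, and account 3 is left alone because it has no matching customer.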

Try grouping on the customer ID so that the derived table returns one row per customer:
UPDATE CUSTOMERTABLE
SET SERVICE = 'BILLING'
FROM
(SELECT ED.CUSTOMER_ID AS ACCT_ID
FROM CUSTOMER_RELATION ED, STG_CUSTOMER_REV TXN
WHERE ED.CUSTOMER_ID = TXN.CUS_ID
GROUP BY ED.CUSTOMER_ID
)AS X
WHERE X.ACCT_ID = CUSTOMERTABLE.ACCOUNT_NUMBER;
You need to make sure that your select returns no duplicates. Try running that select without the update statement and check whether it contains the duplicates you want to get rid of.

Related

SQL Server - Code for Updating values using inner query self join

This may be quite basic for most, but I'd like to get some help.
Using SQL Server, I have the following Orders table (Excel excerpt to simplify):
Please note there are multiple orders (OrderID). Some have a "PrimaryOrder" value, meaning they are related to an existing prior order. Related orders receive the "PrimaryOrder" of the first related order, and an "OrderIndex" noting the order in which they came in.
Only the first order in each set has a value. If an order's "PrimaryOrder" is NULL, it is a single order and I should simply ignore it.
What I need is a SQL Server UPDATE command that gives all related orders the same "Value" as their first related order.
That is, for each order that has "OrderIndex" > 1, update its Value field from NULL to the Value of its "PrimaryOrder".
If "PrimaryOrder" = 1 or is NULL, ignore the row and don't update it.
Tried some simple INNER JOIN but got lost.
I don't think it should be too complicated, but I might be overthinking it.
Thank you!

You can use a correlated subquery with the update statement:
update o
set o.value = (select top (1) o1.value
from Orders o1
where o1.primaryorder = o.primaryorder and
o1.value is not null and
o1.orderindex <= o.orderindex
order by o1.orderindex desc
)
from Orders o
where o.value is null;

Perhaps something like this:
UPDATE a
SET a.Value=b.Value
FROM #table a INNER JOIN #table b
on a.PrimaryOrder=b.PrimaryOrder and b.OrderIndex=1 and a.OrderIndex>1 and a.Value is NULL

UPDATE o
SET Value=MaxVals.MaxValue
FROM Orders o
INNER JOIN (
SELECT MAX(Value) AS MaxValue, PrimaryOrder
FROM Orders
WHERE PrimaryOrder IS NOT NULL
GROUP BY PrimaryOrder
) AS MaxVals ON MaxVals.PrimaryOrder=o.PrimaryOrder
WHERE o.Value IS NULL

Thank you all.
Managed to take something from all the above and this solved it:
UPDATE O
SET O.[Value] = B.[Value]
FROM Orders O INNER JOIN Orders B
ON O.PrimaryOrder = B.[PrimaryOrder] and O.OrderIndex > 1 and O.[Value] is NULL
AND B.[OrderIndex] = 1
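
As a rough check of the accepted approach, here is a minimal sketch of the same idea on an in-memory SQLite database, driven from Python's sqlite3 module. SQLite is used only so the example can be run anywhere, and the sample rows are invented, since the original Excel excerpt is not available. The self-join is written as a correlated subquery, which SQLite accepts in every version.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, PrimaryOrder INTEGER,
                     OrderIndex INTEGER, Value REAL);
-- Hypothetical data: two sets of related orders and one single order.
INSERT INTO Orders VALUES
  (10, 10,   1,    100.0),
  (11, 10,   2,    NULL),
  (12, 10,   3,    NULL),
  (20, NULL, NULL, 55.0),
  (30, 30,   1,    70.0),
  (31, 30,   2,    NULL);
""")

# Copy the Value of the first order in each set (OrderIndex = 1) to the later ones.
con.execute("""
UPDATE Orders
SET Value = (SELECT b.Value
             FROM Orders b
             WHERE b.PrimaryOrder = Orders.PrimaryOrder
               AND b.OrderIndex = 1)
WHERE OrderIndex > 1 AND Value IS NULL
""")

values = dict(con.execute("SELECT OrderID, Value FROM Orders"))
print(values)
```

The single order (OrderID 20) is skipped because its OrderIndex is NULL, so the WHERE clause never selects it.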

Update only first record from duplicate entries in SQL Server

I need help trying to update a table that has multiple duplicate records, but I am stuck.
I have this table, and I need to update im_cust9 with the alt_item_id1 value.
The query I am using to get this result from the table is the following:
SELECT
o.item_id, o.alt_item_id1, o.im_cust9, o.owner_id, o.if_updatestamp
FROM
item_master o
INNER JOIN
(SELECT
alt_item_id1, COUNT(*) AS dupeCount
FROM
item_master
WHERE
owner_id = 'GIII' AND alt_item_id1 <> ''
GROUP BY
alt_item_id1
HAVING
COUNT(*) > 1) oc ON o.alt_item_id1 = oc.alt_item_id1
WHERE
owner_id = 'GIII' AND o.alt_item_id1 <> ''
ORDER BY
alt_item_id1, if_updatestamp ASC
I am not sure how to update the oldest record of every set of duplicate alt_item_id1 values.
I am using SQL Server 2012.
Any help is greatly appreciated!

To get the newest row to update, use the max of if_updatestamp; for the oldest, use the min. Then join it to your table for the update, like so:
update IM
Set IM.im_cust9 = NewDupeRow.alt_item_id1
From item_master IM
JOIN (
SELECT alt_item_id1,Max(if_updatestamp) MaxUpdateValue
FROM item_master WHERE owner_id='GIII' AND alt_item_id1<>''
GROUP BY alt_item_id1 ) NewDupeRow
On IM.alt_item_id1 = NewDupeRow.alt_item_id1
AND IM.if_updatestamp = NewDupeRow.MaxUpdateValue

You can do this using an updatable CTE and row_number():
with toupdate as (
select i.*,
row_number() over (partition by alt_item_id1 order by if_updatestamp) as seqnum
from item_master i
)
update toupdate
set im_cust9 = alt_item_id1
where seqnum = 1;
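
The row_number() idea can be tried out with the sketch below, which uses an in-memory SQLite database from Python's sqlite3 module instead of SQL Server. SQLite cannot update through a CTE, so the sketch ranks the rows in a CTE and then updates by key. It assumes that item_id is unique, that the bundled SQLite is 3.25 or newer (needed for window functions), and the sample rows are invented. A count per partition is added so that only real duplicate sets are touched, as the question asks.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE item_master (item_id INTEGER PRIMARY KEY, alt_item_id1 TEXT,
                          im_cust9 TEXT, owner_id TEXT, if_updatestamp TEXT);
-- Hypothetical data: 'A' and 'B' are duplicated, 'C' is not.
INSERT INTO item_master VALUES
  (1, 'A', NULL, 'GIII', '2020-01-01'),
  (2, 'A', NULL, 'GIII', '2021-01-01'),
  (3, 'B', NULL, 'GIII', '2019-06-01'),
  (4, 'B', NULL, 'GIII', '2018-06-01'),
  (5, 'C', NULL, 'GIII', '2020-03-01');
""")

# Rank rows inside each alt_item_id1 group, oldest first, then update rank 1 only.
con.execute("""
WITH ranked AS (
  SELECT item_id,
         ROW_NUMBER() OVER (PARTITION BY alt_item_id1
                            ORDER BY if_updatestamp) AS seqnum,
         COUNT(*) OVER (PARTITION BY alt_item_id1) AS dupe_count
  FROM item_master
  WHERE owner_id = 'GIII' AND alt_item_id1 <> ''
)
UPDATE item_master
SET im_cust9 = alt_item_id1
WHERE item_id IN (SELECT item_id FROM ranked
                  WHERE seqnum = 1 AND dupe_count > 1)
""")

updated = [r[0] for r in con.execute(
    "SELECT item_id FROM item_master WHERE im_cust9 IS NOT NULL ORDER BY item_id")]
print(updated)
```

Only the oldest row of each duplicated alt_item_id1 (items 1 and 4 in this data) receives a value.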

need to replace subquery with JOIN

I need to use a JOIN in the statements below instead of a subquery.
Can anybody help me rewrite them with a JOIN?
update Table1
set status = 'Edited'
where val_74 ='1' and status ='Valid'
and val_35 is not null
and (val_35,network_id) in
(select val_35,network_id from
Table2 where val_35 is not null
and status='Correct_1');
update Table1 b SET (Val_12,Val_13,Val_14)=
(select Val_12,Val_13,Val_14 from
(select Val_35,network_id, Val_12, Val_13, Val_14
from Table2
where Val_34 is not null
and (REGEXP_LIKE(Val_13,'^[0-9]+$'))
and (Val_14 is null or (REGEXP_LIKE(Val_14,'^[0-9]+$')))
group by Val_35,network_id,Val_12,Val_13,Val_14
)
where Val_35 = b.Val_35 and network_id = b.network_id and rownum=1
)
where status = 'PCStep2' and (regexp_like(Val_13,'[MSS]+') or regexp_like(Val_14,'[MSS]+'));
I tried a lot with my limited knowledge of SQL joins, but I keep getting multiple errors.
Can anybody help me with these queries as soon as possible?
Hearty thanks in advance.

In Oracle, you cannot simply list several joined tables in an UPDATE statement. An UPDATE statement expects exactly one table reference after the UPDATE keyword.
-- ORA-00971: missing SET keyword
update orders o, customers c
set o.order_value = c.account_value
where o.cust_id = c.cust_id
-- works fine
update orders o
set o.order_value = (select c.account_value
from customers c
where c.cust_id = o.cust_id)
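
One thing to watch with the correlated-subquery form is that an order without a matching customer gets its value set to NULL, because the subquery returns no row. The sketch below shows a guard for that, using an in-memory SQLite database from Python's sqlite3 module rather than Oracle. The key column names (order_id, cust_id) and the sample rows are assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, account_value REAL);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER,
                     order_value REAL);
-- Hypothetical data: order 101 points at a customer that does not exist.
INSERT INTO customers VALUES (1, 500.0);
INSERT INTO orders VALUES (100, 1, 10.0), (101, 2, 20.0);
""")

# The EXISTS guard restricts the update to orders that have a matching customer,
# so unmatched orders keep their current value instead of becoming NULL.
con.execute("""
UPDATE orders
SET order_value = (SELECT c.account_value
                   FROM customers c
                   WHERE c.cust_id = orders.cust_id)
WHERE EXISTS (SELECT 1 FROM customers c WHERE c.cust_id = orders.cust_id)
""")

result = con.execute(
    "SELECT order_id, order_value FROM orders ORDER BY order_id").fetchall()
print(result)
```

Without the WHERE EXISTS clause, order 101 would end up with a NULL order_value.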

How to speed up update query in SQL Server 2008?

update orders
set tname = (select top 1 t.task
from task t
where prod_typ='2' and sorder_nbr = t.ORDER_NBR
order by t.strt_dt desc)
where Prod_type='2'
update orders
set tname= (select top 1 t.task
from task t
where prod_typ='1' and sorder_nbr=t.ORDER_NBR
order by t.strt_dt desc)
where Prod_type='1'
I am trying to update the tname column of the orders table with the latest task from the task table.
The condition is that prod_typ of the orders table is 1 and that sorder_nbr of the orders table equals order_nbr of the task table.
My first update statement, which covers 900k rows, works well. The second update statement covers 400k rows but ran for more than an hour, and in the end I cancelled the query.

1) Your query and my query:
update orders
set tname = (select top 1 t.task
from task t
where prod_type='2' and order_nbr = t.ORDER_NBR
order by t.strt_dt desc)
where Prod_type='2';
go
update o
set tname = (select top 1 t.task
from task t
where prod_type='2' and o.order_nbr = t.ORDER_NBR
order by t.strt_dt desc)
from dbo.orders o
where Prod_type='2';
go
The actual execution plans:
As you can see, if the default collation for the current DB is CI (case insensitive), then the predicate order_nbr=t.ORDER_NBR forces SQL Server to compare the values of t.ORDER_NBR with the values of the order_nbr column from the same table, task t. Look at the first execution plan, which corresponds to the first query.
To solve just this problem, I've used another alias, dbo.orders o, and I've rewritten the predicate as o.order_nbr = t.ORDER_NBR. You can also see this in the second execution plan.
Depending on how many tasks there are for every order_nbr & prod_type, you could test S#1 if there are many tasks, or S#2 if there are only a few tasks per order_nbr & prod_type. Again, you need to test with your data to see which solution is better.
2) Solution #1:
UPDATE o
SET tname =
COALESCE(
(SELECT TOP(1) t.task
FROM dbo.task t
WHERE t.prod_type=o.Prod_type
AND o.order_nbr = t.ORDER_NBR
ORDER BY t.strt_dt DESC), tname
)
FROM dbo.orders o
WHERE o.Prod_type IN ('1', '2');
3) Solution #2:
UPDATE o
SET tname = lt.task
FROM dbo.orders o
INNER JOIN
(
SELECT src.order_nbr, src.prod_type, src.task
FROM (
SELECT t.ORDER_NBR, t.prod_type, t.task,
ROW_NUMBER() OVER(PARTITION BY t.ORDER_NBR, t.prod_type ORDER BY t.strt_dt DESC) RowNum
FROM dbo.task t
) src
WHERE src.RowNum = 1
) lt -- last task
ON o.order_nbr = lt.ORDER_NBR AND o.prod_type = lt.prod_type
WHERE o.Prod_type IN ('1', '2');
If you have questions then feel free to ask.
4) An index on dbo.task(order_nbr, prod_type, strt_dt) include (task) should help both solutions.
5) Also you should publish the actual execution plans.
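
For readers who want to experiment with the shape of Solution #1, here is a small sketch on an in-memory SQLite database using Python's sqlite3 module. It is not SQL Server: TOP(1) becomes LIMIT 1, the index has no INCLUDE clause because SQLite does not support one, and the sample rows are invented. It shows the fully qualified correlation and the COALESCE fallback for orders that have no task.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (order_nbr INTEGER, prod_type TEXT, tname TEXT);
CREATE TABLE task (order_nbr INTEGER, prod_type TEXT, task TEXT, strt_dt TEXT);
-- Hypothetical data: order 3 has no tasks at all.
INSERT INTO orders VALUES (1, '1', NULL), (2, '2', NULL), (3, '2', 'keep');
INSERT INTO task VALUES
  (1, '1', 'pick',  '2020-01-01'),
  (1, '1', 'ship',  '2020-02-01'),
  (2, '2', 'build', '2020-01-15'),
  (2, '2', 'test',  '2020-03-01');
-- Same key order as the index suggested in point 4.
CREATE INDEX ix_task_latest ON task (order_nbr, prod_type, strt_dt);
""")

# Every column in the subquery is qualified, so order_nbr cannot silently
# resolve to the task table itself. COALESCE keeps tname when no task exists.
con.execute("""
UPDATE orders
SET tname = COALESCE(
    (SELECT t.task
     FROM task t
     WHERE t.order_nbr = orders.order_nbr
       AND t.prod_type = orders.prod_type
     ORDER BY t.strt_dt DESC
     LIMIT 1),
    tname)
WHERE prod_type IN ('1', '2')
""")

names = con.execute(
    "SELECT order_nbr, tname FROM orders ORDER BY order_nbr").fetchall()
print(names)
```

Each order receives its most recent task, and order 3 keeps its existing tname.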

If the data size is large, then I suggest using variables for updating the table, or using a CTE to do the update:
Update a table using CTE and NEWID()
Updating record using CTE?
I hope this will help.
with latest_task as
(select t.order_nbr, t.task,
row_number() over (partition by t.order_nbr order by t.strt_dt desc) as rn
from task t)
update o
set tname = lt.task
from orders o
join latest_task lt on lt.order_nbr = o.order_nbr and lt.rn = 1
where o.prod_type = '2';

Try something like this. This will update prod_type of 1 and 2 at the same time.
UPDATE orders
SET tname = t1.task
FROM orders o
CROSS APPLY (
SELECT order_nbr, prod_type, t.task, row_number() OVER (PARTITION BY order_nbr, prod_type ORDER BY strt_dt DESC) rownumber
FROM task t
WHERE o.prod_type = t.prod_type
AND o.order_nbr = t.order_nbr) t1
WHERE t1.rownumber = 1
AND o.prod_type in (1,2)

Using a CTE may speed this up, because the latest task date per order is computed in a single grouped pass instead of in a subquery that runs for every row.
;with cteLatest as
(
select t.order_nbr, max(t.strt_dt) as last_dt
from task t
group by t.order_nbr
)
update o
set tname = t.task
from orders o
inner join cteLatest c on c.order_nbr = o.order_nbr
inner join task t on t.order_nbr = c.order_nbr and t.strt_dt = c.last_dt
where o.Prod_type='2'
go
Also,
1) Is "prod_type" an integer field or a string field?
2) Because the CTE groups by order and is joined to orders, you can run all updates at once, instead of one query per product type, by widening the Prod_type filter.

Update in child table, only one value got updated

Below I am trying to update a value in a child table from its parent table by counting matching values. Tables in my DB:
issue_dimension, with id issue_id and a column accno.
star_schema, with id star_id. This child table has the FK issue_id and a column book_frequency.
The book_frequency needs to match the count of each accno in the parent table. I tried this:
update [test1] .[dbo] .star_schema
set [book_frequency] = (
select top 1 COUNT([issue_dimension].ACCNO)as book_frequency
from issue_dimension
group by ACCNO having (COUNT(*)>1) and
issue_dimension.ACCNO = star_schema .ACCNO
)
It only updates rows with the count of the first accno in issue_dimension. I need to count every accno in issue_dimension and write that count to the matching accno in star_schema.
I have never done an update that joins two or more tables. Can anyone help me do this with joins?

UPDATE s
SET [book_frequency] = i.CNT
FROM [test1].[dbo].star_schema s
INNER JOIN
(
SELECT ACCNO, COUNT(*) as CNT
FROM issue_dimension
GROUP BY ACCNO
HAVING COUNT(*)>1
) i on (s.ACCNO = i.ACCNO)
I didn't check it, but it should work.

Try in this way, without grouping, just with the WHERE clause:
UPDATE [test1].[dbo].star_schema SET
[book_frequency] =
(
SELECT COUNT([issue_dimension].ACCNO)
FROM issue_dimension
WHERE issue_dimension .ACCNO = star_schema.ACCNO
HAVING COUNT(*)>1
)

It's not fully clear to me so the answer is a bit of guessing:
update s set
book_frequency = t.qty
from star_schema s
join issue_dimension i on s.issue_id = i.issue_id
join (select count(*) as qty, accno
from issue_dimension
group by accno
) t on i.accno = t.accno

Here's the example from SQL Server Books Online (BOL) that does the kind of thing you're looking for, using the AdventureWorks sample database:
USE AdventureWorks2008R2;
GO
UPDATE Sales.SalesPerson
SET SalesYTD = SalesYTD +
(SELECT SUM(so.SubTotal)
FROM Sales.SalesOrderHeader AS so
WHERE so.OrderDate = (SELECT MAX(OrderDate)
FROM Sales.SalesOrderHeader AS so2
WHERE so2.SalesPersonID = so.SalesPersonID)
AND Sales.SalesPerson.BusinessEntityID = so.SalesPersonID
GROUP BY so.SalesPersonID);
