Is there any way to run an update statement in SQL Server that skips rows that already exist in the target?
For instance, I have a view vw_BranchCaseCurrent which contains a CaseID and a BranchID, in addition to an auto-incremented ID. I want to do this:
update a
set
    a.CaseID = @NewCaseID
from vw_BranchCaseCurrent a
where
    a.CaseID = @OldCaseID;
But the problem is, if there is already an existing row in vw_BranchCaseCurrent for the new CaseID and the existing BranchID then this SQL will crash because it is violating the unique constraint on the backing table. So I'd need to skip that row when performing the update.
I was thinking maybe I could use a merge statement but I'm not entirely familiar with how those work...
There are about a dozen other views that need to be updated so I'm looking for something simple, if possible...
edit: let me clarify with an example:
| CaseID | BranchID |
|--------|----------|
| 42 | 8008 |
| 42 | 9001 |
| 86 | 9001 |
So I want to merge case 42 into case 86 by updating the CaseID field in this view. I want to change the first CaseID from 42 to 86. But I can't do anything with the second row, because the BranchID of 9001 already exists for CaseID 86, so I leave that one alone.
This is a simple example; some of the other views I need to merge have multiple ID fields in addition to the CaseID...
You can express this using not exists, not in, or a left join . . . or even using window functions.
with toupdate as (
    select bcc.*,
           sum(case when CaseID = @NewCaseID then 1 else 0 end)
               over (partition by BranchID) as num_new  -- 1 if the new CaseID already exists for this BranchID
    from vw_BranchCaseCurrent bcc
)
update toupdate
set CaseID = @NewCaseID
where CaseID = @OldCaseID and num_new = 0;
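For comparison, a minimal sketch of the not exists form, assuming the same view and variables; it updates only rows whose new CaseID / BranchID combination is not already present:

update a
set a.CaseID = @NewCaseID
from vw_BranchCaseCurrent a
where a.CaseID = @OldCaseID
  and not exists (
        select 1
        from vw_BranchCaseCurrent b
        where b.CaseID = @NewCaseID
          and b.BranchID = a.BranchID  -- skip rows that would violate the unique constraint
  );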
I've been working on a SQL query which needs to pull a value with a two-column key, where one of the columns may be null. And if it's null, I want to pick that value only if there is no row with the specific key.
So.
CUSTOM | PLAN | COST
:----- | :--- | ---:
VENDCO | LMNK | 50
VENDCO | null | 25
BALLCO | null | 10
I'm trying to run a query that will pull this into one field, i.e., the value of VENDCO at 50 and the value of BALLCO at 10, ignoring the VENDCO row with 25. This would be part of a joined subquery, so I can't use the actual keys of VENDCO/BALLCO, etc. Essentially, pick the cost value with the plan if it exists, but the one where it's null if the plan is not there.
It might also be worthwhile to point out that if I "select * from table where PLAN is null" I don't get results -- I have to select where PLAN=''. I'm not sure if that indicates anything weird about the data.
Hope I'm making myself clear.
I think that not exists should do what you want:
select t.*
from mytable t
where
plan is not null
or not exists (
select 1 from mytable t1 where t1.custom = t.custom and t1.plan is not null
)
Basically this gives priority to rows where plan is not null in groups sharing the same custom.
Demo on DB Fiddle:
CUSTOM | PLAN | COST
:----- | :--- | ---:
VENDCO | LMNK | 50
BALLCO | null | 10
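If the "missing" plans are actually stored as empty strings rather than NULL (as the PLAN = '' remark suggests), a sketch of the same idea using nullif to treat '' as NULL:

select t.*
from mytable t
where
    nullif(t.plan, '') is not null
    or not exists (
        select 1
        from mytable t1
        where t1.custom = t.custom
          and nullif(t1.plan, '') is not null
    );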
I have a table with two columns, Prime_GuestID and Dup_GuestID, to indicate the links between a GuestID and the ID(s) it is replacing (duplicate records).
Now I want to go through a number of other relationship tables of the format shown in the sample data below and update any occurrence of a Dup_GuestID to its Prime_GuestID.
However, if the Prime_GuestID already has an entry for a given ThingID, then instead I need to delete that row from the relationship table.
Currently I'm using the following script. It works for most cases, but it fails if two Dup_GuestIDs update to the same Prime_GuestID for a given ThingID. It seems that the merge statement queues up all the changes before making them, so my clash check doesn't detect them.
MERGE Thing_Relation AS t
USING Guest_Relation AS g
    ON t.GuestID = g.Dup_GuestID
WHEN MATCHED AND EXISTS ( -- Clash Check here
    select *
    from Thing_Relation AS t2
    where t2.ThingID = t.ThingID
      and t2.GuestID = g.Prime_GuestID
)
    THEN DELETE
WHEN MATCHED
    THEN UPDATE SET t.GuestID = g.Prime_GuestID;
Is there a better way to be doing the check at 'When matched and exists' to check for clashes that would result from this merge? Or is there a better way to do this whole thing?
EDIT: Here is some sample data for the table
Thing_Relation            Guest_Relation
ThingID | GuestID         Prime_GuestID | Dup_GuestID
-----------------         ---------------------------
1       | 101             101           | 102
1       | 102             107           | 104
2       | 103             107           | 105
3       | 104
3       | 105
Thing_Relation after merge
ThingID | GuestID
------------------
1 | 101
2 | 103
3 | 107
The 1|102 row gets changed to 1|101, which already exists, so the row is deleted.
2|103 isn't affected.
3|104 is changed to 3|107; 3|105 should also change to 3|107, but because the previous update hasn't happened yet, the clash isn't picked up by the EXISTS clause.
You are right: MERGE does not run your checks against the changes made by the MERGE statement itself; it evaluates them against the table as it was when the statement started. A few options:
Create an UPDATE trigger that checks for clashes on every update and deletes the duplicate rows, or
simply write two separate statements, one UPDATE and one DELETE for the duplicate entries.
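A minimal sketch of the two-statement route, doing the DELETE first so the UPDATE cannot create duplicates (table and column names as in the question; keeping the duplicate with the smaller GuestID is an assumption about which row should survive):

-- 1) Delete duplicates whose target (ThingID, Prime_GuestID) row already exists,
--    or that lose the tie-break to another duplicate mapping to the same target.
DELETE t
FROM Thing_Relation AS t
JOIN Guest_Relation AS g
  ON t.GuestID = g.Dup_GuestID
WHERE EXISTS (SELECT 1
              FROM Thing_Relation AS t2
              WHERE t2.ThingID = t.ThingID
                AND t2.GuestID = g.Prime_GuestID)
   OR EXISTS (SELECT 1
              FROM Thing_Relation AS t3
              JOIN Guest_Relation AS g3
                ON t3.GuestID = g3.Dup_GuestID
              WHERE t3.ThingID = t.ThingID
                AND g3.Prime_GuestID = g.Prime_GuestID
                AND t3.GuestID < t.GuestID);

-- 2) Update the surviving duplicates to their Prime_GuestID.
UPDATE t
SET t.GuestID = g.Prime_GuestID
FROM Thing_Relation AS t
JOIN Guest_Relation AS g
  ON t.GuestID = g.Dup_GuestID;

On the sample data this leaves 1|101, 2|103 and 3|107, matching the expected result.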
I'm trying to update 500,000 rows at once. I have a table with products like this:
+------------+----------------+--------------+-------+
| PRODUCT_ID | SUB_PRODUCT_ID | DESCRIPTION | CLASS |
+------------+----------------+--------------+-------+
| A001 | ACC1 | coffeemaker | A |
| A002 | ACC1 | toaster | A |
| A003 | ACC2 | coffee table | A |
| A004 | ACC5 | couch | A |
+------------+----------------+--------------+-------+
I have sets of individual update statements, for example:
update products set class = 'A' where product_id = 'A001';
update products set class = 'B' where product_id = 'A005';
update products set class = 'Z' where product_id = 'A150';
I'm building one script with the update statements one after another, and a commit every 1,000 rows.
It works fine (slowly, but fine), but I want to do it better if at all possible.
Is there a better, more efficient and faster way to do this?
One approach would be to create a temporary table holding your update information:
new_product_class:
product_id   class
==========   =====
A001         A
A005         B
A150         Z
product_id should be an indexed primary key on this new table. Then you can do an UPDATE or a MERGE on the old table joined to this temporary table:
UPDATE (SELECT p.product_id, p.class AS old_class, n.class AS new_class
        FROM product p
        JOIN new_product_class n ON (p.product_id = n.product_id))
SET old_class = new_class;
or
MERGE INTO product p
USING new_product_class n
ON (p.product_id = n.product_id)
WHEN MATCHED THEN
UPDATE SET p.class = n.class
Merge should be fast. Other things you could look into, depending on your environment: creating a new table based on the old one with nologging followed by some renaming (back up before and after), or bulk updates.
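A rough sketch of the nologging-and-rename route (Oracle syntax assumed; products_new and the swap steps are placeholders, and indexes, constraints and grants would need to be recreated before the rename):

CREATE TABLE products_new NOLOGGING AS
SELECT p.product_id,
       p.sub_product_id,
       p.description,
       COALESCE(n.class, p.class) AS class  -- take the new class where one is supplied
FROM products p
LEFT JOIN new_product_class n ON n.product_id = p.product_id;

-- after verifying the data and recreating indexes/constraints:
ALTER TABLE products RENAME TO products_old;
ALTER TABLE products_new RENAME TO products;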
Unless you have an index, each of your update statements scans the entire table. Even if you do have an index, there is a cost associated with the compilation and execution of each statement.
If you have a lot of conditions, and those conditions can vary, then I think Glenn's solution is clearly the way to go. This does everything in a single transaction, and there is no reason to run batches of 1,000 rows -- just do everything all at once.
If the number of conditions is relatively finite (as your example), and they don't change very often, then you can also do this as a simple case:
update products
set class =
    case product_id
        when 'A001' then 'A'
        when 'A005' then 'B'
        when 'A150' then 'Z'
    end
where
    product_id in ('A001', 'A005', 'A150');
If it's possible your class field is already set to the correct value, then there is also great value in adding a condition to make sure you are not updating something to the same value. For example if this:
update products set class = 'A' where product_id = 'A001';
Updates 5,000 records, 4,000 of which are already set to 'A', then this would be significantly more efficient:
update products
set class = 'A'
where
product_id = 'A001' and
(class is null or class != 'A')
I have two tables like:
ID | TRAFFIC
fd56756 | 4398
645effa | 567899
894fac6 | 611900
894fac6 | 567899
and
USER | ID | TRAFFIC
andrew | fd56756 | 0
peter | 645effa | 0
john | 894fac6 | 0
I need to get sum("TRAFFIC") from the first table and set the traffic column in the second table where the first table's ID equals the second table's ID. IDs in the first table are not unique and can be duplicated.
How can I do this?
Table names are taken from your later comment. Chances are you are reporting table and column names incorrectly.
UPDATE users u
SET "TRAFFIC" = sub.sum_traffic
FROM (
SELECT "ID", sum("TRAFFIC") AS sum_traffic
FROM stats.traffic
GROUP BY 1
) sub
WHERE u."ID" = sub."ID";
Aside: It's unwise to use mixed-case identifiers in Postgres. Use legal, lower-case identifiers, which do not need to be double-quoted, to make your life easier. Start by reading the manual here.
Something like this?
UPDATE users t2
SET traffic = t1.sum_traffic
FROM (SELECT id, sum(traffic) AS sum_traffic
      FROM stats.traffic
      GROUP BY id) t1
WHERE t1.id = t2.id;
This is somewhat related to this question:
I have a table with a primary key, and I have several tables that reference that primary key (using foreign keys). I need to remove rows from that table, where the primary key isn't being referenced in any of those other tables (as well as a few other constraints).
For example:
Group
groupid | groupname
1 | 'group 1'
2 | 'group 3'
3 | 'group 2'
... | '...'
Table1
tableid | groupid | data
1 | 3 | ...
... | ... | ...
Table2
tableid | groupid | data
1 | 2 | ...
... | ... | ...
and so on. Some of the rows in Group aren't referenced in any of the tables, and I need to remove those rows. In addition to this, I need to know how to find all of the tables/rows that reference a given row in Group.
I know that I can just query every table and check the groupid's, but since they are foreign keys, I imagine that there is a better way of doing it.
This is using Postgresql 8.3 by the way.
DELETE
FROM group g
WHERE NOT EXISTS
(
SELECT NULL
FROM table1 t1
WHERE t1.groupid = g.groupid
UNION ALL
SELECT NULL
FROM table2 t2
WHERE t2.groupid = g.groupid
UNION ALL
…
)
At the heart of it, SQL databases don't maintain reverse-lookup information for foreign-key constraints, so your only option is to do what the server would do internally if you were to delete the row: check every other table.
If (and be damn sure first) your constraints are simple checks and don't carry any "on delete cascade" type clauses, you can attempt to delete everything from your group table. Any row that does delete thus has nothing referencing it. Otherwise, you're stuck with Quassnoi's answer.
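As for finding which tables reference Group, a sketch against the Postgres system catalogs (available in 8.3); the lower-case table name 'group' is an assumption about how the table was actually created:

-- list foreign-key constraints that point at the "group" table
SELECT con.conname AS constraint_name,
       src.relname AS referencing_table
FROM pg_constraint con
JOIN pg_class src ON src.oid = con.conrelid   -- the referencing table
JOIN pg_class tgt ON tgt.oid = con.confrelid  -- the referenced table
WHERE con.contype = 'f'
  AND tgt.relname = 'group';

Rows that reference a particular groupid still have to be fetched from each of those tables individually.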