I've found a lot of near misses on this question. So many similar, but not quite right, scenarios. No doubt my ignorance will shine here.
Using DB2 and a shred of knowledge, my scenario is as follows:
On a table, insert a row of data if a given value is not present in a given column, or update the corresponding row if the value is present.
I have a table with two columns:
id, bigint, not nullable
ref, varchar, nullable
I am not sure if a MERGE is the correct path here as most examples and thorough discussions all seem to revolve around merging one table into another. I'm simply gathering user input and either adding it or updating it. It seems like it should be really simple.
I'm using JDBC and prepared statements to get this done.
Is MERGE the correct way to do this?
When testing my query in DB2 Control Center, I run up against
"No row was found for FETCH, UPDATE or DELETE; or the result of a
query is an empty table"
or a variety of other errors depending on how I structure my MERGE. Here's what I have presently.
merge into table1 as t1
using (select id from table1 group by id) as t2
on t1.id = t2.id
when matched then update set t1.ref = 'abc'
when not matched then insert (t1.id, t1.ref) values (123, 'abc');
If I instead run an update followed by an insert, then for new data the update finds no row and the insert succeeds, while for existing data both succeed, leaving bad data in the table, e.g. two identical rows.
The desired result is if on initial use with the values:
id = 1
ref = a
a new row is added. On subsequent use if the values change to:
id = 1
ref = b
the row with id = 1 is updated. Subsequent uses would follow the same rules.
Please let me know how I can phrase this question better.
Update: id is not an auto-incrementing key. It's an external key that will be unique, but not everything we reference will need a related row in the table I'm attempting to update. This table is rather unstructured on its own but is part of a larger data model.
I'm a bit puzzled by your query. Reading the text makes me suspect that you want something like this:
merge into table1 as t1
using ( values (123, 'abc') ) as t2 (id, ref)
on t1.id = t2.id
when matched then update
set t1.ref = t2.ref
when not matched then
insert (id, ref) values (t2.id, t2.ref);
Is that correct?
I'm having trouble getting this compound insert to work in my MERGE statement between two tables (ignore the WHEN MATCHED clause, I know it's bad practice). The problem is getting the ServerId field in the target table to fill. The Team field fills fine, but every row has a null value for ServerId. I can't find an example of this online, so I'm hoping someone can help. I don't seem to have any syntax errors, and I know the column 'ServerName' in the Source table is filled for all rows.
MERGE ApplicationTeams AS Target
USING TempApplicationTeams AS Source
ON (Target.ServerId = (SELECT ID from Servers WHERE Name='Source.ServerName') AND Target.Team = Source.Team)
WHEN MATCHED THEN
UPDATE SET Target.Team = Target.Team
WHEN NOT MATCHED BY TARGET THEN
INSERT (ServerId, Team) VALUES((SELECT ID from Servers WHERE Name='Source.ServerName'), Source.Team)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
;
Thanks.
I think you should remove the single quotes in the WHERE clause.
You wrote:
(SELECT ID from Servers WHERE Name='Source.ServerName')
But I think you should write this:
(SELECT ID from Servers WHERE Name=Source.ServerName)
Also make sure the SELECT ID subquery returns only one row, otherwise the statement will fail.
I hope this helps.
I am attempting to optimise a bulk UPDATE statement in Postgres using the UPDATE..FROM syntax to update from a list of values. It works except when the same row might be updated more than once in the same query.
For example say I have a table "test" with columns "key" and "value".
update test as t set value = v.value from (values
('key1', 'update1'),
('key1', 'update2') )
as v (key, value)
where t.key = v.key;
My desired behavior is for the row with key 'key1' to be updated twice, finishing with value set to 'update2'. In practice sometimes the value is updated to update1 and sometimes to update2. Also an update trigger function on the table is only invoked once.
The documentation (http://www.postgresql.org/docs/9.1/static/sql-update.html) explains why:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
Because of this indeterminacy, referencing other tables only within sub-selects is safer, though often harder to read and slower than using a join.
Is there any way to reformulate this query to achieve the behavior I'm looking for? Does the reference to sub-selects in the documentation give a hint?
Example (assuming id is a PK in the target table, and {id, date_modified} is a PK in the source table)
UPDATE target dst
Set a = src.a , b = src.b
FROM source src
WHERE src.id = dst.id
AND NOT EXISTS (
SELECT *
FROM source nx
WHERE nx.id = src.id
-- use an extra key field AS tie-breaker
AND nx.date_modified > src.date_modified
);
(in fact, this is deduplication of the source table -> forcing the source table to the same PK as the target table)
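When the list of values is built by the client, the same deduplication can also be done before the statement is sent. The following is only a sketch in Python, under the assumption that the updates arrive as an ordered list of (key, value) pairs and that the last pair for a key should win; the function name last_write_wins is made up for illustration. Note that this yields one update per row, so an update trigger still fires once per row, not once per original pair.

```python
def last_write_wins(pairs):
    """Collapse (key, value) pairs so each key keeps only its final value."""
    latest = {}
    for key, value in pairs:
        latest[key] = value  # a later pair overwrites an earlier one
    return list(latest.items())


# Hypothetical input mirroring the question: 'key1' appears twice.
updates = [("key1", "update1"), ("key2", "x"), ("key1", "update2")]
deduped = last_write_wins(updates)
```

The deduplicated list can then be used to build the VALUES list, so the join produces at most one row per target row.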
I am trying to figure out which pseudo-table I need to use here: deleted, inserted or updated.
Basically, I need to write some data to the history table when the main table is updated, and only if the status changes from something to either pending or active.
This is what I have now:
ALTER TRIGGER [dbo].[trg_SourceHistory] ON [dbo].[tblSource]
FOR UPDATE AS
DECLARE @statusOldValue char(1)
DECLARE @statusNewValue char(1)
SELECT @statusOldValue = statusCode FROM deleted
SELECT @statusNewValue= statusCode FROM updated
IF (@statusOldValue <> @statusNewValue) AND
(@statusOldValue = 'P' or @statusOldValue = 'A')
BEGIN TRY
INSERT * INTO tblHistoryTable)
select * from [DELETED]
So I want the new data to stay in the main table, and the history table to be updated with what is being overwritten. Right now it just copies the same info over, so after the update both my tables have the same data.
There are only the Inserted and Deleted pseudo tables - there's no Updated.
For an UPDATE, Inserted contains the new values (after the update) while Deleted contains the old values before the update.
Also be aware that the trigger is fired once per statement, not once for each row. So both pseudo tables can contain multiple rows! Don't just assume a single row and assign it to a variable. This
SELECT @statusOldValue = statusCode FROM deleted
SELECT @statusNewValue= statusCode FROM updated
will not work correctly if you have multiple rows (each variable ends up holding the value from just one of them)! You need to write your triggers so that they work with multiple rows in Inserted and Deleted!
Update: yes - there IS a much better way to write this:
ALTER TRIGGER [dbo].[trg_SourceHistory] ON [dbo].[tblSource]
FOR UPDATE
AS
INSERT INTO dbo.tblHistoryTable(Col1, Col2, Col3, ...., ColN)
SELECT d.Col1, d.Col2, d.Col3, ....., d.ColN
FROM Deleted d
INNER JOIN Inserted i ON i.PrimaryKey = d.PrimaryKey
WHERE i.statusCode <> d.statusCode
AND d.statusCode IN ('A', 'P')
Basically:
- Explicitly specify the columns you want to insert, both in the INSERT statement and in the SELECT statement retrieving the data to insert, to avoid any nasty surprises.
- Create an INNER JOIN between the Inserted and Deleted pseudo-tables to get all rows that were updated.
- Specify all other conditions (different status codes etc.) in the WHERE clause of the SELECT.
This solution works for batches of rows being updated - it won't fail on a multi-row update....
You need to use both the inserted and deleted tables together to check for records that:
1. Already existed (to check it's not an insert)
2. Still exist (to check it's not a delete)
3. Had their Status field changed
You also need to do that with a set-based approach, as in marc_s's answer, because triggers are not single-record processes.
INSERT INTO
tblHistoryTable
SELECT
deleted.*
FROM
inserted
INNER JOIN
deleted
ON inserted.PrimaryKey = deleted.PrimaryKey
WHERE
inserted.StatusCode <> deleted.StatusCode
AND (inserted.StatusCode = 'P' OR inserted.StatusCode = 'A')
inserted = the new values
deleted = the old values
There is no updated table, you are looking for inserted.
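To make the set-based logic concrete, here is a small model in Python rather than T-SQL. It is only a sketch: the two lists stand in for the deleted and inserted pseudo-tables of one multi-row UPDATE, the key column name id and the function name history_rows are made up, and the filter follows the reading that the new status must be 'P' or 'A'.

```python
def history_rows(deleted, inserted, key="id"):
    """Return the old versions of rows whose status changed to 'P' or 'A'."""
    old_by_key = {row[key]: row for row in deleted}
    archived = []
    for new in inserted:
        old = old_by_key.get(new[key])
        if old is None:
            continue  # no matching old row, so this was not an update
        changed = new["statusCode"] != old["statusCode"]
        if changed and new["statusCode"] in ("P", "A"):
            archived.append(old)  # keep the values being overwritten
    return archived


# One UPDATE statement touching three rows at once.
deleted = [
    {"id": 1, "statusCode": "X"},
    {"id": 2, "statusCode": "P"},
    {"id": 3, "statusCode": "A"},
]
inserted = [
    {"id": 1, "statusCode": "P"},  # X -> P: goes to history
    {"id": 2, "statusCode": "P"},  # unchanged: skipped
    {"id": 3, "statusCode": "X"},  # A -> X: new status is not P or A, skipped
]
archived = history_rows(deleted, inserted)
```

Every row in the batch is evaluated on its own, which is exactly what the single-variable version of the trigger cannot do.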
We have a status table. When the status changes we currently delete the old record and insert a new.
We are wondering if it would be faster to do a select to check if it exists followed by an insert or update.
Although similar to the following question, it is not the same, since we are changing individual records and the other question was doing a total table refresh.
DELETE, INSERT vs UPDATE || INSERT
Since you're talking SQL Server 2008, have you considered MERGE? It's a single statement that allows you to do an update or insert:
create table T1 (
ID int not null,
Val1 varchar(10) not null
)
go
insert into T1 (ID,Val1)
select 1,'abc'
go
merge into T1
using (select 1 as ID,'def' as Val1) upd on T1.ID = upd.ID --<-- These identify the row you want to update/insert and the new value you want to set. They could be @parameters
when matched then update set Val1 = upd.Val1
when not matched then insert (ID,Val1) values (upd.ID,upd.Val1);
What about INSERT ... ON DUPLICATE KEY UPDATE (note that this is MySQL syntax)? Doing a SELECT first to check whether a record exists, and then acting on the result in your program, creates a race condition. That might not matter in your case if there is only a single instance of the program, however.
INSERT INTO users (username, email) VALUES ('Jo', 'jo@email.com')
ON DUPLICATE KEY UPDATE email = 'jo@email.com'
You can perform the UPDATE first and then check @@ROWCOUNT. If 0 rows were affected, perform the INSERT; otherwise do nothing more.
Your suggestion would mean always two instructions for each status change. The usual way is to do an UPDATE and then check if the operation changed any rows (Most databases have a variable like ROWCOUNT which should be greater than 0 if something changed). If it didn't, do an INSERT.
Search for UPSERT to find patterns for your specific DBMS.
Personally, I think the UPDATE method is the best. Instead of doing a SELECT first to check whether a record already exists, you can attempt an UPDATE first, and if no rows are affected (check @@ROWCOUNT) you do an INSERT.
The reason for this is that sooner or later you might want to track status changes, and the best way to do this would be to keep an audit trail of all changes using a trigger on the status table.
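The update-first pattern from the answers above can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module instead of SQL Server, the cursor's rowcount plays the role of @@ROWCOUNT, and the table item_status, its columns, and the function upsert_status are all made-up names. Like the original pattern, it does not by itself protect against two writers racing on the same id.

```python
import sqlite3


def upsert_status(conn, item_id, code):
    """Try an UPDATE first; INSERT only if no row was affected."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "UPDATE item_status SET code = ? WHERE id = ?", (code, item_id)
        )
        if cur.rowcount == 0:  # nothing updated, so the row does not exist yet
            conn.execute(
                "INSERT INTO item_status (id, code) VALUES (?, ?)",
                (item_id, code),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_status (id INTEGER PRIMARY KEY, code TEXT)")
upsert_status(conn, 1, "a")  # first use: inserts the row
upsert_status(conn, 1, "b")  # later use: updates the same row
rows = conn.execute("SELECT id, code FROM item_status").fetchall()
```

For an existing row this costs a single statement, compared with the two statements that delete-then-insert always needs.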
If only one row with a distinct field is to be updated, I can use:
insert into tab(..) value(..) on duplicate key update ...
But that's not the case here. I need to update 4 rows in the same table, all of which have their "accountId" field equal to $_SESSION['accountId'].
The only thing I can come up with at the moment is:
delete from tab where accountId = $_SESSION['accountId'],
then insert the new rows.
That is obviously not the best solution.
Does anyone have a better idea?
Use the update just like that!
update tab set col1 = 'value' where accountId = $_SESSION['accountId']
Moreover, MySQL allows you to do an update with a join, if that makes your life a bit easier:
update
tab t
inner join accounts a on
t.accountid = a.accountid
set
t.col1 = 'value'
where
a.accountname = 'Tom'
Based on your question, it seems like you should review the Update Statement.
Insert is used to put new rows in - not update them. Delete is used to remove. And Update is used to modify existing rows. Using "Insert On Duplicate Key Update" is a hackish way to modify rows, and is poor form to use when you know the row is already there.
1. Load all of the values into a temporary table.
2. UPDATE all of the values using a JOIN.
3. INSERT all of the values from the temp table that don't exist in the target table.
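The three steps above can be sketched as follows. This is an illustration only, not MySQL: it uses Python's built-in sqlite3 module, the tables target and staging and their columns are made-up names, and the join-based UPDATE is replaced by a correlated subquery because that form works in every SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, ref TEXT)")
conn.execute("INSERT INTO target VALUES (1, 'a')")

incoming = [(1, "b"), (2, "c")]  # id 1 already exists, id 2 is new

with conn:  # run all three steps in one transaction
    # Step 1: load the incoming values into a temporary table.
    conn.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, ref TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", incoming)
    # Step 2: update the rows that already exist in the target.
    conn.execute(
        """
        UPDATE target
        SET ref = (SELECT s.ref FROM staging s WHERE s.id = target.id)
        WHERE EXISTS (SELECT 1 FROM staging s WHERE s.id = target.id)
        """
    )
    # Step 3: insert the staged rows that are missing from the target.
    conn.execute(
        """
        INSERT INTO target (id, ref)
        SELECT s.id, s.ref FROM staging s
        WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id)
        """
    )
    conn.execute("DROP TABLE staging")

rows = conn.execute("SELECT id, ref FROM target ORDER BY id").fetchall()
```

Existing rows are modified in place and are never deleted, which is the main difference from the delete-then-insert approach.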
You can use the REPLACE statement. It works like a DELETE followed by an INSERT.