MERGE - to replace ON Duplicate with sql server - sql

I've predominantly used MySQL, so on moving over to Azure and SQL Server I've realised that ON DUPLICATE KEY UPDATE does not work.
I'm trying to do this:
INSERT INTO records (jid, pair, interval, entry) VALUES (1, 'alpha', 3, 'unlimited') ON DUPLICATE KEY UPDATE entry = "limited";
But of course ON DUPLICATE KEY isn't allowed here, so MERGE seems to be the right form.
I've looked at:
https://technet.microsoft.com/en-gb/library/bb522522(v=sql.105).aspx
But honestly the example is a bit excessive and eye watering. Could someone dumb it down for me to fit my example so I can understand it better?

To do the merge you need some form of source table or table variable for the MERGE statement. Then you can do the merging. So something along the lines of this, maybe (note: not completely syntax checked, apologies in advance):
WITH src AS (
-- This should be your source
SELECT 1 AS Id, 2 AS Val
)
-- The above is not necessary if you have a source table
MERGE Target -- the destination table, so in your case records
USING src -- as defined above
ON (Target.Id = src.Id) -- how we join the tables
WHEN NOT MATCHED BY TARGET
-- if we don't match, what to do to the destination table; in this case, insert
THEN INSERT(Id, Val) VALUES(src.Id, src.Val)
WHEN MATCHED
-- what we do if we match; in this case, update Val
THEN UPDATE SET Target.Val = src.Val;
Don't forget to read the proper syntax page: https://msdn.microsoft.com/en-us/library/bb510625.aspx
I think this translates to your example (tm):
WITH src AS (
-- This should be your source
SELECT 1 AS jid, 'alpha' AS pair, 3 AS [interval]
)
MERGE records -- the destination table, so in your case records
USING src -- as defined above
ON (records.jid = src.jid) -- how we join the tables (assuming jid is the unique key)
WHEN NOT MATCHED BY TARGET
-- if we don't match, what to do to the destination table; in this case, insert
THEN INSERT(jid, pair, interval, entry) VALUES(src.jid, src.pair, src.interval, 'unlimited')
WHEN MATCHED
-- what we do if we match; in this case, update entry
THEN UPDATE SET records.entry = 'limited';
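For a single row, the CTE can also be swapped for a VALUES table constructor, which keeps the statement a little closer to the shape of the original MySQL INSERT. This is only a sketch: it assumes jid is the key that the MySQL ON DUPLICATE KEY would have fired on, and the HOLDLOCK hint is an optional addition that is often suggested to avoid two sessions inserting the same key at once.

```sql
-- Sketch only: assumes jid is the unique key of records.
MERGE records WITH (HOLDLOCK) AS t
USING (VALUES (1, 'alpha', 3)) AS src (jid, pair, [interval])
    ON t.jid = src.jid
WHEN MATCHED THEN
    -- the key already exists: same effect as ON DUPLICATE KEY UPDATE
    UPDATE SET entry = 'limited'
WHEN NOT MATCHED BY TARGET THEN
    -- the key is new: plain insert
    INSERT (jid, pair, [interval], entry)
    VALUES (src.jid, src.pair, src.[interval], 'unlimited');
```

Note that a MERGE statement in SQL Server must end with a semicolon.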

Related

Improving insert query for SCD2 solution

I have two insert statements. The first query inserts a new row if the id doesn't exist in the target table. The second query inserts into the target table only if the joined id hash value is different (which indicates that the row has been updated in the source table) and the id in the source table is not null. These are meant to be used for my SCD2 solution, which will handle inserts of hundreds of thousands of rows. I'm trying not to use the MERGE statement, for practice.
In the "Current" column, the value 1 indicates that the row is new and 0 indicates that the row has expired. I use this information later to expire rows in the target table with my update queries.
Besides indexing, is there a more efficient way to write my insert queries so that they behave like an SCD2 merge statement for inserting new/updated rows?
Query:
Query 1:
INSERT INTO TARGET
SELECT Name,Middlename,Age, 1 as current,Row_HashValue,id
from Source s
Where s.id not in (select id from TARGET) and s.id is not null
Query 2:
INSERT INTO TARGET
SELECT s.Name,s.Middlename,s.Age,1 as current ,s.Row_HashValue,s.id
FROM SOURCE s
LEFT JOIN TARGET t ON s.id = t.id
AND s.Row_HashValue = t.Row_HashValue
WHERE t.Row_HashValue IS NULL and s.ID IS NOT NULL
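You can use WHERE NOT EXISTS, and have just one INSERT statement. A source row with a brand-new id and a source row whose hash has changed both fail to find a target row with the same id and Row_HashValue, so one predicate covers both cases: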
INSERT INTO TARGET
SELECT Name,Middlename,Age,1 as current ,Row_HashValue,id
FROM SOURCE s
WHERE NOT EXISTS (
SELECT 1
FROM TARGET t
WHERE s.id = t.id
AND s.Row_HashValue = t.Row_HashValue)
AND s.ID IS NOT NULL;

Merging deltas with duplicate keys

I'm trying to perform a merge into a target table in our Snowflake instance where the source data contains change data with a field denoting the DML operation performed at the source (I = insert, U = update, D = delete).
The problem is that the log (deltas) source might contain multiple updates for the same record. The merge I've constructed bombs out complaining about duplicate keys.
I'm struggling to think of a solution without resorting to GROUP BY and MAX on the updates. I've done a similar setup with Oracle, and the AND clause on the MATCH was enough.
MERGE INTO "DB"."SCHEMA"."TABLE" t
USING (
SELECT * FROM "DB"."SCHEMA"."TABLE_LOG"
ORDER BY RECORD_TIMESTAMP ASC
) s ON t.RECORD_KEY = s.RECORD_KEY
WHEN MATCHED AND s.RECORD_OPERATION = 'D' THEN DELETE
WHEN MATCHED AND s.RECORD_OPERATION = 'U' THEN UPDATE
SET t.ID=COALESCE(s.ID,t.ID),
t.CREATED_AT=COALESCE(s.CREATED_AT,t.CREATED_AT),
t.PRODUCT=COALESCE(s.PRODUCT,t.PRODUCT),
t.SHOP_ID=COALESCE(s.SHOP_ID,t.SHOP_ID),
t.UPDATED_AT=COALESCE(s.UPDATED_AT,t.UPDATED_AT)
WHEN NOT MATCHED AND s.RECORD_OPERATION = 'I' THEN
INSERT (RECORD_KEY, ID, CREATED_AT, PRODUCT,
SHOP_ID, UPDATED_AT)
VALUES (s.RECORD_KEY, s.ID, s.CREATED_AT, s.PRODUCT,
s.SHOP_ID, s.UPDATED_AT);
Is there a way to rewrite the above merge so that it works as is?
The Snowflake docs show that an AND case predicate is allowed in the match clause. It sounds like you tried this and it's not working because of the duplicates, right?
https://docs.snowflake.net/manuals/sql-reference/sql/merge.html#matchedclause-for-updates-or-deletes
There is even an example there which uses the AND predicate:
merge into t1 using t2 on t1.t1key = t2.t2key
when matched and t2.marked = 1 then delete
when matched and t2.isnewstatus = 1 then update set val = t2.newval, status = t2.newstatus
when matched then update set val = t2.newval
when not matched then insert (val, status) values (t2.newval, t2.newstatus);
I think you are going to have to get the "last record" per key and use that as your update, or process these serially which will be pretty slow...
Another thing to look at would be to try to see if you can apply the last_value( ) function to each column, where you order by your timestamp and partition over your key. If you do that in your inline view, that might work.
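As a rough sketch of the "last record per key" idea, the inline view could be reduced to one row per RECORD_KEY with Snowflake's QUALIFY clause and ROW_NUMBER(). This assumes RECORD_TIMESTAMP orders the changes reliably, and that the latest log row alone is enough to apply. That second assumption breaks if update rows carry NULLs for unchanged columns, or if a key is inserted and then updated within the same batch, so treat it as a starting point only.

```sql
-- Keep only the most recent log row for each key.
SELECT *
FROM "DB"."SCHEMA"."TABLE_LOG"
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY RECORD_KEY
    ORDER BY RECORD_TIMESTAMP DESC
) = 1;
```

Used in place of the SELECT inside the USING clause (without the trailing semicolon), this should give the merge at most one source row per target row.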
I hope this helps, I have a feeling it won't help much...Rich
UPDATE:
I found the following: https://docs.snowflake.net/manuals/sql-reference/parameters.html#error-on-nondeterministic-merge
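If you run the following command before your merge, I think you'll be OK (testing required of course). Bear in mind that with this parameter set to false, Snowflake picks one of the duplicate source rows for each target row, and which one it picks is not defined, so this alone does not guarantee that the latest change wins: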
ALTER SESSION SET ERROR_ON_NONDETERMINISTIC_MERGE=false;

What am I doing wrong with this MERGE / INSERT query?

I have a database table that I want to update with SQL. Basically, it carries a set of description information for parts of a timetable booklet, but that isn't important. Some of the data is entered already via an application, but it is time consuming and blocks of it are "boiler plate" that want to be updated each week. In some cases, I may have missed entering the data via the application, and so also want to create data automatically if it doesn't exist.
That leads me to use a MERGE query, as follows:
MERGE INTO TTP_LINE_DESCRIPTION o
USING
(SELECT DISTINCT lv.lv_nv_id TLDE_NV_ID,
lv.lv_id TLDE_LV_ID,
dir.dir_id TLDE_DIR_ID,
8 TLDE_MED_FLAG,
1 TLDE_TYPE,
0 TLDE_SORT_NO,
'Timetable valid from ' || to_char(lv.lv_valid_from,'DD/MM/YYYY') || ' until ' || nvl2(lv.lv_valid_until,to_char(lv.lv_valid_until,'DD/MM/YYYY'),'further notice') TLDE_TEXT,
0 TLDE_ALIGNMENT,
null TLDE_FONT_SIZE,
null TLDE_FONT_STYLE
FROM LINE_VERSION lv
JOIN line_point_sequence lps ON (lv.lv_id = lps.lps_lv_id)
JOIN direction dir ON (dir.dir_id = lps.lps_dir_id)
where lv.lv_nv_id=3799 and lv.lv_id=10455244) n
ON (o.TLDE_NV_ID=n.TLDE_NV_ID
and o.TLDE_LV_ID=n.TLDE_LV_ID
and o.TLDE_DIR_ID=n.TLDE_DIR_ID
and o.TLDE_TYPE=n.TLDE_TYPE
and o.TLDE_SORT_NO=n.TLDE_SORT_NO)
WHEN MATCHED THEN
UPDATE SET o.TLDE_TEXT=n.TLDE_TEXT,
o.TLDE_ALIGNMENT=n.TLDE_ALIGNMENT,
o.TLDE_FONT_SIZE=n.TLDE_FONT_SIZE,
o.TLDE_FONT_STYLE=n.TLDE_FONT_STYLE
WHEN NOT MATCHED THEN
INSERT (o.tlde_id, o.tlde_nv_id, o.tlde_lv_id, o.tlde_dir_id,
o.tlde_med_flag, o.tlde_type, o.tlde_sort_no, o.tlde_text,
o.tlde_alignment, o.tlde_font_size, o.tlde_font_style,
o.updated_by, o.updated_on, o.updated_prog)
VALUES ((select max(tld.tlde_id)+1 from TTP_LINE_DESCRIPTION tld),
n.tlde_nv_id, n.tlde_lv_id, n.tlde_dir_id, n.tlde_med_flag,
n.tlde_type, n.tlde_sort_no, n.tlde_text, n.tlde_alignment,
n.tlde_font_size, n.tlde_font_style, 'STUARTR',
SYSDATE, 'PL/SQL Developer');
There is one thing to note here, which is the WHERE clause in the SELECT DISTINCT. The lv.lv_id=10455244 is a reference that I know will force the select to return only a single pair of rows in that SELECT, so that I can limit my testing. For this purpose, 10455244 is a valid value that is NOT currently in the TTP_LINE_DESCRIPTION table.
When I use a value that is in the table, the WHEN MATCHED code executes correctly, and updates a pair of rows in 0.016s.
Running the SELECT statement on its own using the value I have shown above returns the two rows that need adding in 0.109s.
Getting the max id and adding one to it (this is the primary key) as per the first item in the VALUES line at the end takes 0s.
Finally, if I write an INSERT INTO and explicitly write all of the values that I want to write for one of the rows, I can do the INSERT of one row in 0.016s.
But put it all together and ... nothing. The execution just sits there, executing, and doesn't appear to end. Or I get nervous waiting to see if it will end. I've left it a reasonable time, and nothing seems to go in.
So what's going on, and why won't it do what I think that it should?
You are doing this:
create table t (id, nv_id, val) as (select 1, 101, 'A' from dual);
merge into t o
using (
select 102 nv_id, 'P' val from dual union all
select 103 nv_id, 'Q' val from dual ) n
on (o.nv_id = n.nv_id)
when matched then update set val = n.val
when not matched then
insert (o.id, o.nv_id, o.val)
values ((select max(id) + 1 from t), n.nv_id, n.val);
In my case it worked, but the id for both inserted rows was the same: 2. The subquery sees the table as it was when the statement started, so every inserted row gets the same max(id) + 1. When I rolled back and added a primary key constraint:
alter table t add constraint t_pk primary key(id);
the merge raised an error: unique constraint violated. I suspect that your problem is connected to tlde_id in a similar way, maybe not through a constraint but something else. Generating values this way is a big no-no. If your Oracle version is 12c or higher you can make this column auto-generated with an identity clause:
generated by default on null as identity
or in older versions use a sequence and a trigger (find the max id, add 1 and set that as the starting value for the sequence):
create sequence seq_t_id start with 2;
create or replace trigger t_on_insert
before insert on t for each row
begin
select seq_t_id.nextval into :new.id from dual;
end;
Then modify your merge, either by removing id from the insert clause or by inserting null, and the trigger will set the proper values:
merge into t o
using (
select 102 nv_id, 'P' val from dual union all
select 103 nv_id, 'Q' val from dual ) n
on (o.nv_id = n.nv_id)
when matched then update set val = n.val
when not matched then
insert (o.nv_id, o.val)
values (n.nv_id, n.val);
Even if this does not solve your problem, you should not generate ids as max() + 1; it causes problems with concurrent sessions, commits, multi-row inserts etc.
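If you would rather not add a trigger, the sequence can also be referenced directly in the insert branch of the merge. This is a sketch under the assumption that table t from the example exists and has no insert trigger; seq_t_id is the same illustrative sequence name as above.

```sql
-- Assumes table t from the example above, with no insert trigger on it.
create sequence seq_t_id start with 2;

merge into t o
using (
  select 102 nv_id, 'P' val from dual union all
  select 103 nv_id, 'Q' val from dual ) n
on (o.nv_id = n.nv_id)
when matched then update set val = n.val
when not matched then
  -- each inserted row draws its own value from the sequence
  insert (o.id, o.nv_id, o.val)
  values (seq_t_id.nextval, n.nv_id, n.val);
```

Gaps in the generated ids are to be expected with a sequence, so do not rely on them being consecutive.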

Sync two tables with single SQL

I need to sync two tables in Oracle. I used MERGE to do this job, but I need help to get a working SQL statement.
My target table has a PK and some other columns. Some of these columns have a NOT NULL constraint.
My source table has a different layout and different data than my target table, so I need to query my source table and convert the data to the target layout.
My actual code is (simplified):
MERGE INTO TARGET t USING(
WITH SRC AS ( --do the transformation
SELECT ID, DECODE(VAL,'THIS','THAT','OTHER') VAL1, REGEXP_SUBSTR(VAL,'\d+') VAL2 FROM SOURCE
)
SELECT t.ROWID ROW_ID, s.* FROM SRC s
FULL OUTER JOIN TARGET t ON s.ID=t.ID
) s ON (t.ROWID=s.ROW_ID)
WHEN MATCHED THEN UPDATE SET t.VAL1=s.VAL1, t.VAL2=s.VAL2
DELETE WHERE s.ID IS NULL
WHEN NOT MATCHED THEN INSERT(ID, VAL1, VAL2) VALUES (s.ID, s.VAL1, s.VAL2);
The problem is that the rows that match the DELETE condition throw ORA-01407: cannot update (string) to NULL. It seems that Oracle first tries to do the update and does the delete later. This causes the error.
MERGE is really awkward for syncing tables with deletion, but I'd like to use a single query, because my transformation SQL is really heavy.
Is there any alternative to MERGE, or any suggestion for how to get this working?
Thanks
That's my solution. Maybe this can help someone.
MERGE INTO TARGET t USING(
WITH SRC AS ( --do the transformation
SELECT ID, DECODE(VAL,'THIS','THAT','OTHER') VAL1, REGEXP_SUBSTR(VAL,'\d+') VAL2 FROM SOURCE
)
SELECT t.ROWID ROW_ID, NVL2(s.ID,null,1) delFlag, s.*
FROM SRC s
FULL OUTER JOIN TARGET t ON s.ID=t.ID
) s ON (t.ROWID=s.ROW_ID)
WHEN MATCHED THEN UPDATE SET t.VAL1=NVL2(s.delFlag,t.VAL1,s.VAL1), t.VAL2=NVL2(s.delFlag,t.VAL2,s.VAL2)
DELETE WHERE s.delFlag IS NOT NULL
WHEN NOT MATCHED THEN INSERT(ID, VAL1, VAL2) VALUES (s.ID, s.VAL1, s.VAL2);
But it's really strange that rows that will be deleted must pass constraint checks.
In order to delete rows as part of the merge statement, you need to first update the row.
Therefore, you need to take account of null values in the update statement (either directly in the set clause, or in the source query), e.g.:
MERGE INTO target t
USING (WITH src AS ( --do the transformation
SELECT id,
DECODE(val, 'THIS', 'THAT', 'OTHER') val1,
regexp_substr(val, '\d+') val2
FROM SOURCE)
SELECT t1.rowid row_id,
s1.id,
NVL(s1.val1, t1.val1) val1,
NVL(s1.val2, t1.val2) val2
FROM src s1
FULL OUTER JOIN target t1
ON s1.id = t1.id) s ON (t.rowid = s.row_id)
WHEN MATCHED THEN
UPDATE SET t.val1 = s.val1, t.val2 = s.val2
DELETE WHERE s.id IS NULL
WHEN NOT MATCHED THEN
INSERT (id, val1, val2)
VALUES (s.id, s.val1, s.val2);
N.B. I am assuming that the val1 and val2 in the source and target tables are not nullable. Amend as appropriate for your tables.
As weird as it seems, rather than either updating or deleting a row, Oracle's MERGE only deletes rows that it has already updated. It doesn't seem to make sense, but that's the way it is. Please see Boneist's answer for a solution.
Update: I've looked this up in some standard SQL drafts. From what I gather, MERGE in standard SQL 2003 only supported update and insert, and both WHEN MATCHED THEN and WHEN NOT MATCHED THEN were not allowed to appear more than once.
Oracle obviously wanted to provide a delete option and decided on the weird approach of extending the update clause and applying deletes to updated rows only.
In a newer draft (supposedly the current one, found here: https://www.wiscorp.com/SQLStandards.html) a delete option was added, the WHEN clauses were extended to WHEN MATCHED [ AND <search condition> ] THEN and WHEN NOT MATCHED [ AND <search condition> ] THEN, and these clauses can appear more than once. This looks way better. Let's hope Oracle adopts this new syntax some time soon :-)
"The problem is that the rows that match the DELETE condition throw ORA-01407: cannot update (string) to NULL. It seems that Oracle first tries to do the update and does the delete later. This causes the error."
Then give the UPDATE part a condition too:
WHEN MATCHED THEN
UPDATE SET t.VAL1=s.VAL1, t.VAL2=s.VAL2 WHERE s.ID IS NOT NULL
DELETE WHERE s.ID IS NULL
Be aware, though, that Oracle's DELETE clause only acts on rows that the UPDATE clause has just updated, so rows filtered out by the UPDATE's WHERE condition will not be deleted either. Test this against your data before relying on it.

SQL With... Update

Is there any way to do some kind of "WITH...UPDATE" action on SQL?
For example:
WITH changes AS
(...)
UPDATE table
SET id = changes.target
FROM table INNER JOIN changes ON table.id = changes.base
WHERE table.id = changes.base;
Some context information: What I'm trying to do is to generate a base/target list from a table and then use it to change values in another table (changing values equal to base into target)
Thanks!
You can use merge, with the equivalent of your with clause as the using clause, but because you're updating the field you're joining on, you need to do a bit more work. This:
merge into t42
using (
select 1 as base, 10 as target
from dual
) changes
on (t42.id = changes.base)
when matched then
update set t42.id = changes.target;
.. gives error:
ORA-38104: Columns referenced in the ON Clause cannot be updated: "T42"."ID"
Of course, it depends a bit on what you're doing in the CTE, but as long as you can join to your table within that to get the rowid, you can use that for the on clause instead:
merge into t42
using (
select t42.id as base, t42.id * 10 as target, t42.rowid as r_id
from t42
where id in (1, 2)
) changes
on (t42.rowid = changes.r_id)
when matched then
update set t42.id = changes.target;
If I create my t42 table with an id column and have rows with values 1, 2 and 3, this will update the first two to 10 and 20, and leave the third one alone.
SQL Fiddle demo.
It doesn't have to be rowid, it can be a real column if it uniquely identifies the row; normally that would be an id, which would normally never change (as a primary key), you just can't use it and update it at the same time.
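If merge feels heavy for this, a plain update with a correlated subquery is another option in Oracle. The sketch below assumes the same t42 table with ids 1, 2 and 3, and inlines a hypothetical change list (1 to 10, 2 to 20) where your CTE query would go.

```sql
-- Hypothetical change list: 1 -> 10 and 2 -> 20, inlined as a subquery.
update t42
set id = (
  select changes.target
  from (
    select 1 as base, 10 as target from dual union all
    select 2 as base, 20 as target from dual
  ) changes
  where changes.base = t42.id
)
-- restrict to the base values listed above; without this filter,
-- rows with no matching change would have id set to null
where id in (1, 2);
```

The drawback is that the set of base values has to be stated twice, which is one reason the merge form is often easier to maintain.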