Sync two tables with single SQL - sql

I need to sync two table in Oracle. I used MERGE to do this job, but I need help to get a working SQL to do this.
My target table has a PK and some other columns. Some of these columns has not null constraint.
My source table has a different layout and data then my target table, so I need to query my source table and convert data to target layout.
My actual code is (simplified):
MERGE INTO TARGET t USING(
WITH SRC AS ( --do the transformation
SELECT ID, DECODE(VAL,'THIS','THAT','OTHER') VAL1, REGEXP_SUBSTR(VAL,'\d+') VAL2 FROM SOURCE
)
SELECT t.ROWID ROW_ID, s.* FROM SRC s
FULL OUTER JOIN TARGET t ON s.ID=t.ID
) s ON (t.ROWID=s.ROW_ID)
WHEN MATCHED THEN UPDATE SET t.VAL1=s.VAL1 AND t.VAL2=s.VAL2
DELETE WHERE s.ID IS NULL
WHEN NOT MATCHED THEN INSERT(ID, VAL1, VAL2) VALUES (s.ID, s.VAL1, s.VAL2);
The problem is that these rows, that match the DELETE condition, throws ORA-01407: cannot update (string) to NULL. It seems, that Oracle first tries to update and do the delete later. This causes the ERROR.
The MERGE keyword is really horrible for sync Tables with deletion, but I'd like to use a single Query, because my transformation SQL is really heavy.
Is there any alternative to MERGE or any suggestion what to do, to get this working?
Thanks
Thats my solution. Maybe this can help someone.
MERGE INTO TARGET t USING(
WITH SRC AS ( --do the transformation
SELECT ID, DECODE(VAL,'THIS','THAT','OTHER') VAL1, REGEXP_SUBSTR(VAL,'\d+') VAL2 FROM SOURCE
)
SELECT t.ROWID ROW_ID, NVL2(s.ID,null,1) delFlag, s.*
FROM SRC s
FULL OUTER JOIN TARGET t ON s.ID=t.ID
) s ON (t.ROWID=s.ROW_ID)
WHEN MATCHED THEN UPDATE SET t.VAL1=NVL2(s.delFlag,t.VAL1,s.VAL1) AND t.VAL2=NVL2(s.delFlag,t.VAL1,s.VAL2)
DELETE WHERE s.delFlag IS NOT NULL
WHEN NOT MATCHED THEN INSERT(ID, VAL1, VAL2) VALUES (s.ID, s.VAL1, s.VAL2);
But it's really strange that rows that will be deleted must pass constraint checks.

In order to delete rows as part of the merge statement, you need to first update the row.
Therefore, you need to take account of null values in the update statement (either directly in the set clause, or in the source query), e.g.:
MERGE INTO target t
USING (WITH src AS ( --do the transformation
SELECT id,
DECODE(val, 'THIS', 'THAT', 'OTHER') val1,
regexp_substr(val, '\d+') val2
FROM SOURCE)
SELECT t1.rowid row_id,
s1.id,
NVL(s1.val1, t1.val1) val1,
NVL(s1.val2, t1.val2) val2
FROM src s1
FULL OUTER JOIN target t1
ON s1.id = t1.id) s ON (t.rowid = s.row_id)
WHEN MATCHED THEN
UPDATE SET t.val1 = s.val1 AND t.val2 = s.val2
DELETE WHERE s.id IS NULL
WHEN NOT MATCHED THEN
INSERT (id, val1, val2)
VALUES (s.id, s.val1, s.val2);
N.B. I am assuming that the val1 and val2 in the source and target tables are not nullable. Amend as appropriate for your tables.

As weird as it seems, rather than either updating or deleting a row, Oracle's MERGE only deletes rows that it already updated. Doesn't seem to make sense, but that's the way it is. Please see Boneist's answer for a solution.
Update: I've looked this up in some standard SQL drafts. From what I gather, MERGE in standard SQL 2003 only supported insert and delete, and both WHEN MATCHED THEN and WHEN NOT MATCHED THEN were not allowed to appear more than once.
Oracle obviously wanted to provide a delete option and decided for the weird approach to extend the update clause and apply deletes on updated rows only.
In a newer draft (supposedly the current one, found here: https://www.wiscorp.com/SQLStandards.html) a delete option was added, the WHEN clauses extended to WHEN MATCHED [ AND <search condition> ] THEN and WHEN NOT MATCHED [ AND <search condition> ] THEN, and these clauses can appear more than once. This looks way better. Let's hope Oracle adopts this new syntax some time soon :-)
The problem is that these rows, that match the DELETE condition, throws ORA-01407: cannot update (string) to NULL. It seems, that Oracle first tries to update and do the delete later. This causes the ERROR.
Then give the UPDATE part a condition too:
WHEN MATCHED THEN
UPDATE SET t.VAL1=s.VAL1 AND t.VAL2=s.VAL2 WHERE s.ID IS NOT NULL
DELETE WHERE s.ID IS NULL

Related

Need to filter target table in SQL merge query

I am using BigQuery SQL to execute a merge query. Here is the query
MERGE `dataset.target_table` AS Target
USING
(
select
*
from
`dataset.source_table` s_data
WHERE
trans_id is not null and user_id is not null
)
AS Source
ON Source.trans_id = Target.trans_id and Target.start_date IN
(
select distinct start_date from `dataset.source_table`
)
WHEN NOT MATCHED BY Target THEN
INSERT (...)
VALUES (...)
WHEN MATCHED and Target.user_id is null THEN
UPDATE SET ...
I am getting an issue with the using a subquery in the ON statement. In Subquery not supported by join predicate
The reason I have this subquery is because I want to filter the Target table before the Merge happens or bigquery throws a OOM exception. The target table is 10billion rows while the source is 200m rows. I don't need the subquery in the ON statement, but it's a hacky way to esentially filter the Target table before the merge happens. Is there some other approach I can take?
I tried the approach here - https://dba.stackexchange.com/questions/30633/merge-a-subset-of-the-target-table utilizing
WITH TARGET AS
(
SELECT *
FROM `dataset.target_table`
WHERE <filter target_table here>
)
MERGE INTO TARGET
...
but it seems this isn't supported by BigQuery and gave a syntax error. How can I filter my Target table before the Merge happens so it doesn't need to load the entire table in memory?
I'm a little confused. Can you just take the start_date from the matching row in source?
MERGE `dataset.target_table` AS Target USING
(select *
from `dataset.source_table` s_data
WHERE trans_id is not null and user_id is not null
) AS Source
ON Source.trans_id = Target.trans_id and
Target.start_date = source.start_date

Merging deltas with duplicate keys

I'm trying to perform a merge into a target table in our Snowflake instance where the source data contains change data with a field denoting the at source DML operation i.e I=Insert,U=Update,D=Delete.
The problem is dealing with the fact the log (deltas) source might contain multiple updates for the same record. The merge I've constructed bombs out complaining about duplicate keys.
I'm struggling to think of a solution without going the likes of GROUP BY and MAX on the updates. I've done a similar setup with Oracle and the AND clause on the MATCH was enough.
MERGE INTO "DB"."SCHEMA"."TABLE" t
USING (
SELECT * FROM "DB"."SCHEMA"."TABLE_LOG"
ORDER BY RECORD_TIMESTAMP ASC
) s ON t.RECORD_KEY = s.RECORD_KEY
WHEN MATCHED AND s.RECORD_OPERATION = 'D' THEN DELETE
WHEN MATCHED AND s.RECORD_OPERATION = 'U' THEN UPDATE
SET t.ID=COALESCE(s.ID,t.ID),
t.CREATED_AT=COALESCE(s.CREATED_AT,t.CREATED_AT),
t.PRODUCT=COALESCE(s.PRODUCT,t.PRODUCT),
t.SHOP_ID=COALESCE(s.SHOP_ID,t.SHOP_ID),
t.UPDATED_AT=COALESCE(s.UPDATED_AT,t.UPDATED_AT)
WHEN NOT MATCHED AND s.RECORD_OPERATION = 'I' THEN
INSERT (RECORD_KEY, ID, CREATED_AT, PRODUCT,
SHOP_ID, UPDATED_AT)
VALUES (s.RECORD_KEY, s.ID, s.CREATED_AT, s.PRODUCT,
s.SHOP_ID, s.UPDATED_AT);
Is there a way to rewrite the above merge so that it works as is?
The Snowflake docs show the ability for the AND case predicate during the match clause, it sounds like you tried this and it's not working because of the duplicates, right?
https://docs.snowflake.net/manuals/sql-reference/sql/merge.html#matchedclause-for-updates-or-deletes
There is even an example there which is using the AND command:
merge into t1 using t2 on t1.t1key = t2.t2key
when matched and t2.marked = 1 then delete
when matched and t2.isnewstatus = 1 then update set val = t2.newval, status = t2.newstatus
when matched then update set val = t2.newval
when not matched then insert (val, status) values (t2.newval, t2.newstatus);
I think you are going to have to get the "last record" per key and use that as your update, or process these serially which will be pretty slow...
Another thing to look at would be to try to see if you can apply the last_value( ) function to each column, where you order by your timestamp and partition over your key. If you do that in your inline view, that might work.
I hope this helps, I have a feeling it won't help much...Rich
UPDATE:
I found the following: https://docs.snowflake.net/manuals/sql-reference/parameters.html#error-on-nondeterministic-merge
If you run the following command before your merge, I think you'll be OK (testing required of course):
ALTER SESSION SET ERROR_ON_NONDETERMINISTIC_MERGE=false;

Oracle Merge statement not inserting data

I have came up with below sql statement on oracle 11g for merge the data.
MERGE INTO myTable tgt
USING ( SELECT myTable.ROWID AS rid
FROM myTable
WHERE myTable.myRef = 'uuuu' or
myTable.name = 'sam'
--union
--select rowId,null from dual
) src
ON (tgt.rowid = src.rid)
WHEN MATCHED THEN
update set tgt.myRef = 'uuuu',tgt.name='name_worked'
when not matched then
insert (
tgt.myRef,tgt.name) values ('RRRR','HHH');
I need to insert data except ID column here which will manage from the trigger for primary key insertion. due to error ORA-38104: Columns referenced in the ON Clause cannot be updated i used RowId approach here.
now my issue is merge statement works fine when it comes to update but fail in inserting data in this situation.
I went through this post and added the union statement. but still it fails when goes to insert duplicate record due to the constraint in my table instead of running it smoothly.
Can anyone please help me out here? Appreciate it a lot. thank you in advanced.
==========Edited===========
Please find the complete code sample and the error messages below.
MERGE INTO myTable tgt
USING ( SELECT myTable.ROWID AS rid
FROM myTable
WHERE myTable.myRef = 'RRRR' or
myTable.mytablename = 'sam'
union
select rowId from dual
) src
ON (tgt.rowid = src.rid)
WHEN MATCHED THEN
update set tgt.myRef = 'uuuu',tgt.mytablename='myt_name', tgt.name='nameA'
when not matched then
insert (
tgt.mytableid,tgt.mytablename,tgt.name,tgt.myRef) values (11,'RRRR','HHH','myref1');
and my table is -
CREATE TABLE "sa"."MYTABLE"
(
"MYTABLEID" NUMBER NOT NULL ENABLE,
"MYTABLENAME" VARCHAR2(20 BYTE) NOT NULL ENABLE,
"NAME" VARCHAR2(20 BYTE),
"MYREF" VARCHAR2(20 BYTE),
CONSTRAINT "MYTABLE_PK" PRIMARY KEY ("MYTABLEID", "MYTABLENAME")
)
if i run this first time it will insert the record as expected.
when i run it the second time my expectation is it should match the myRef = 'RRRR' and do the update. i intentionally put 'or' between condition because if i find any value exist in the table it should go and update the existing record.
but instead of doing that update it will throw this error because merge statement try to insert again.
SQL Error: ORA-00001: unique constraint (sa.MYTABLE_PK) violated
00001. 00000 - "unique constraint (%s.%s) violated"
my requirement is when it run the first time it should insert and when i run the same again it should pick that record and update it. Please let me know what to adjust in my merge statement in order to work as expected here. Thanks in advance.
The below MERGE query is what you are looking for.
MERGE INTO myTable tgt
USING ( select x.rid from
(SELECT myTable.ROWID AS rid
FROM myTable
WHERE myTable.myRef IN ('myref1','uuuu')) x
right outer join dual
on x.rowid <> dual.rowid
) src
ON (tgt.rowid = src.rid)
WHEN MATCHED THEN
update set tgt.myRef = 'uuuu',tgt.mytablename='myt_name', tgt.name='nameA'
when not matched then
insert (tgt.mytableid,tgt.mytablename,tgt.name,tgt.myRef) values (mytable_seq.nextval,'RRRR','HHH','myref1');
There have been 3 changes made from the query that you provided.
Inside the 'using' subquery 'RRRR' was being checked in myTable.myRef column, but while inserting 'myref1' was being inserted in the same. Hence this has been changed in the using subquery to check 'myref1'.
'uuuu' has been introduced in the same check.
DUAL tas been introduced in right outer join.
The query will process as below -
1.During first run, there would be no row in mytable. Hence the right outer join with dual will prevail. This will enable the insert to happen.
During 2nd run, there will be a row with myRef column of 'myref1'. Its Rowid will be picked up and the Update will happen. After update, myRef column will be updated to 'uuuu'.
3.During all subsequent runs, 1 row will always be returned in the inside using subquery, this time because of the column value of 'uuuu'. This will enable the update to happen , which will again update the columns with the same existing values.
Thus in effect, 1st time INSERT will happen and in all subsequent times UPDATE will take place.
It appears you want something like the following:
MERGE INTO MYTABLE t
USING (SELECT newID, newTable_name from DUAL) d
ON (t.MYTABLEID = d.newID AND
t.MYTABLENAME = d.newTable_name)
WHEN MATCHED THEN
UPDATE SET NAME = newName,
MYREF = newRef
WHEN NOT MATCHED THEN
INSERT (MYTABLEID, MYTABLENAME, NAME, MYREF)
VALUES (newID, newTable_name, newName, newRef)
where the variables newID, newTable_name, newName, and newRef are populated from whatever source of data you're using.
Also - do you really want the primary key to be (MYTABLEID, MYTABLENAME)? Can you actually have multiple rows with the same MYTABLEID value? And do you really want to allow multiple rows having the same MYTABLENAME value? My thought is that the primary key should be MYTABLEID, with a UNIQUE constraint on MYTABLENAME. If this is the case then MYTABLEID is redundant, and MYTABLENAME could be used as the primary key. ????
Best of luck.
I was able to make this work on below query.
MERGE INTO myTable tgt
USING ( SELECT (SELECT myTable.ROWID AS rid
FROM myTable
WHERE myTable.myRef = '2' or
myTable.mytablename = 'sam'
) as rid from dual) src
ON (tgt.rowid = src.rid)
WHEN MATCHED THEN
update set tgt.myRef = 'uuuu',tgt.mytablename='test1', tgt.name='nameA'
when not matched then
insert (
tgt.mytableid,tgt.mytablename,tgt.name,tgt.myRef) values (11,'RRRR1','1','2');
first time it will insert the row and for subsequent runs it will update the row.

MERGE - to replace ON Duplicate with sql server

I've predominantly used mySQL so moving over to azure and sql server I realise that on duplicate does not work.
I'm trying to do this:
INSERT INTO records (jid, pair, interval, entry) VALUES (1, 'alpha', 3, 'unlimited') ON DUPLICATE KEY UPDATE entry = "limited";
But of course on duplicate key isn't allowed here. So MERGE is the right form.
I've looked at:
https://technet.microsoft.com/en-gb/library/bb522522(v=sql.105).aspx
But honestly the example is a bit excessive and eye watering. Could someone dumb it down for me to fit my example so I can understand it better?
In order to do the merge you need some form of source table/table var for the merge statement. Then you can do the merging. So something along the lines of this maybe (note: not completely syntax checked, apologies in advance):
WITH src AS (
-- This should be your source
SELECT 1 AS Id, 2 AS Val
)
-- The above is not neccessary if you have a source table
MERGE Target -- the detination table, so in your case records
USING src -- as defined above
ON (Target.Id = src.Id) -- how do we join the tables
WHEN NOT MATCHED BY TARGET
-- if we dont match, what do to the destination table. This case insert it.
THEN INSERT(Id, Val) VALUES(src.Id, src.Val)
WHEN MATCHED
-- what do we do if we match. This case update Val
THEN UPDATE SET Target.Val = src.Val;
Don't forget to read the proper syntax page: https://msdn.microsoft.com/en-us/library/bb510625.aspx
I think this translates to your example (tm):
WITH src AS (
-- This should be your source
SELECT 1 AS jid, 'alpha' AS pair, 3 as 'interval'
)
MERGE records -- the detination table, so in your case records
USING src -- as defined above
ON (records.Id = src.Id) -- how do we join the tables
WHEN NOT MATCHED BY TARGET
-- if we dont match, what do to the destination table. This case insert it.
THEN INSERT(jid, pair, interval, entry) VALUES(src.jid, src.pair, src.interval, 'unlimited')
WHEN MATCHED
-- what do we do if we match. This case update Val
THEN UPDATE SET records.entry = 'limited';

Prevent the insertion of duplicate rows using SQL Server 2008

I am trying to insert some data from one table into another but I would like to prevent the insertion of duplicate rows. I have currently the following query:
INSERT INTO Table1
(
Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5
)
SELECT
Table2Col1,
Table2Col2 = constant1,
Table2Col3 = constant2,
Table2Col4 = constant3,
Table2Col5 = constant4
FROM Table2
WHERE
Condition1 = constant5
AND
Condition2 = constant6
AND
Condition3 = constant7
AND
Condition4 LIKE '%constant8%'
What I do not know is that the row I am trying to insert from Table2 into Table1 might already exist and I would like to prevent this possible duplication from happening and skip the insertion and just move onto inserting the next unique row.
I have seen that I can use a WHERE NOT EXISTS clause and use of the INTERSECT keyword but I did not fully understand how to apply it to my particular query as I only want to use some of the selected data from Table2 and then some constant values to insert into Table1.
EDIT:
I should add that the columns TableCol2 through to TableCol5 don't actually exist in the result set and I am just populating these columns alongside Table2Col1 that is returned.
Since you are on SQL Server 2008, you can use a merge statement.
You can easily check if a row exists base on a key
something like this:
merge TableMain AS target
using TableA as source
ON <join tables here>
WHEN MATCHED THEN <update>
WHEN NOT MATCHED BY TARGET <Insert>
WHEN NOT MATCHED BY SOURCE <delete>
Intersect (minus in Sql Server's terms) is out of question because it compares whole row. Other two options are not in/not exists/left join and merge. Not In is for single-column prinary key only, so it is out of question in this instance. In/Exists/Left join should have the same performance in Sql Server, so I'll just use exists:
INSERT INTO Table1
(
Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5
)
SELECT
Table2Col1,
Table2Col2 = constant1,
Table2Col3 = constant2,
Table2Col4 = constant3,
Table2Col5 = constant4
FROM Table2
WHERE
Condition1 = constant5
AND
Condition2 = constant6
AND
Condition3 = constant7
AND
Condition4 LIKE '%constant8%'
AND NOT EXISTS
(
SELECT *
FROM Table1 target
WHERE target.Table1Col1 = Table2.Table2Col1
AND target.Table1Col2 = Table2.Table2Col2
AND target.Table1Col3 = Table2.Table2Col3
)
Merge is used to sync two tables; it has ability to insert, update and delete records from target table.
merge into table1 as target
using table2 as source
on target.Table1Col1 = source.Table2Col1
AND target.Table1Col2 = source.Table2Col2
AND target.Table1Col3 = source.Table2Col3
when not matched by target then
insert (Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5)
values (Table2Col1,
Table2Col2,
Table2Col3,
Table2Col4,
Table2Col5);
If columns from table2 are computed during transfer, in not exists() case you might use derived table in place of table2, and the same applies to merge example - just place your query in place of reference to table2.
we have check the whether the data is already exist or not in table. For this we have to use If condition to avoid the duplicate insertion