BigQuery MERGE Query: Merge tables based on condition - sql

I was playing with the MERGE query in BigQuery and noticed that I can't update the specific rows based on the condition.
For example, I have 5 records already present in the table. I want to update only two records whose values are changed. When I execute the below query all the rows get updated. That means out of 5 for 3 records I don't want to change the values. AS soon as I'll receive new values that should change the existing records.
MERGE `test.organization_user` T
USING `test.user_details` S
ON T.user_id = S.user_id
WHEN MATCHED AND
(
T.organization <> S.organization OR
T.contact_number <> S.contact_number
)
THEN
UPDATE
SET
T.organization = S.organization,
T.contact_number = S.contact_number
WHEN NOT MATCHED THEN
INSERT ROW
Is there any solution for this kind of scenario or using merge it will update all the matching records and if we don't want to update the existing one then should we have values for all the fields for those records in the source table(from which values will get updated to the target table)?
An example is:

You should handle null in condition:-
MERGE `test.organization_user` T
USING `test.user_details` S
ON T.user_id = S.user_id
WHEN MATCHED AND
(
IFNULL(T.organization,'') <> IFNULL(S.organization,'') OR
IFNULL(T.contact_number,0) <> IFNULL(S.contact_number,0)
)
THEN
UPDATE
SET
T.organization = S.organization,
T.contact_number = S.contact_number
WHEN NOT MATCHED THEN
INSERT ROW

Related

Improving insert query for SCD2 solution

I have two insert statements. The first query is to inserta new row if the id doesn't exist in the target table. The second query inserts to the target table only if the joined id hash value is different (indicates that the row has been updated in the source table) and the id in the source table is not null. These solutions are meant to be used for my SCD2 solution, which will be used for inserts of hundreds thousands of rows. I'm trying not to use the MERGE statement for practices.
The columns "Current" value 1 indicates that the row is new and 0 indicates that the row has expired. I use this information later to expire my rows in the target table with my update queries.
Besides indexing is there a more competent and effective way to improve my insert queries in a way that resembles the like of the SCD2 merge statement for inserting new/updated rows?
Query:
Query 1:
INSERT INTO TARGET
SELECT Name,Middlename,Age, 1 as current,Row_HashValue,id
from Source s
Where s.id not in (select id from TARGET) and s.id is not null
Query 2:
INSERT INTO TARGET
SELECT Name,Middlename,Age,1 as current ,Row_HashValue,id
FROM SOURCE s
LEFT JOIN TARGET t ON s.id = t.id
AND s.Row_HashValue = t.Row_HashValue
WHERE t.Row_HashValue IS NULL and s.ID IS NOT NULL
You can use WHERE NOT EXISTS, and have just one INSERT statement:
INSERT INTO TARGET
SELECT Name,Middlename,Age,1 as current ,Row_HashValue,id
FROM SOURCE s
WHERE NOT EXISTS (
SELECT 1
FROM TARGET t
WHERE s.id = t.id
AND s.Row_HashValue = t.Row_HashValue)
AND s.ID IS NOT NULL;

Why does this bigquery merge sql not work when matched

This query properly updates all records with a matched name and null target key -
MERGE
`MY_TAGET_TABLE` AS Target
USING
(
select
distinct name, key
from
`MY_SOURCE_TABLE`
)
AS Source
ON Source.name = Target.name
WHEN MATCHED and (Target.key is null) THEN UPDATE SET
Target.key = Source.key,
Target.load_time = CURRENT_TIMESTAMP()
This query never updates any records -
MERGE
`MY_TAGET_TABLE` AS Target
USING
(
select
distinct name, key
from
`MY_SOURCE_TABLE`
)
AS Source
ON Source.name = Target.name
WHEN MATCHED and (Target.key <> Source.key) THEN UPDATE SET
Target.key = Source.key,
Target.load_time = CURRENT_TIMESTAMP()
My source has all non-null key values. The first query is just a POC to show that the rest of the query is working and for some reason the (Target.key <> Source.key) part of the matched query causes no records to get updated although there records with matching name and non-matching key values. If I instead use Target.key is null it will update the records, although this doesn't account for all of my use-cases (when the Target key is out of sync with the Source key)

Update table column with count from rows in two tables

I'm updating a count column in one table with the count of corresponding rows in another table like so, which seems to work fine.
UPDATE #Circuits
SET CMLTotal = (SELECT COUNT(#UTCMLs.Drawing)
FROM #UTCMLs
WHERE #UTCMLs.DRAWING = #Circuits.DRAWING)
Now I need to include the rows from another table, #XRAYCMLs to get the total count of corresponding rows from both tables. I'm thinking UNION, but do not know how to make that happen.
How can I update my existing statement to accomplish my goal?
You might just want + and two subqueries:
Update #Circuits
SET CMLTotal = (select COUNT(u.Drawing)
from #UTCMLs u
where u.DRAWING = #Circuits.DRAWING
) +
(select COUNT(#UTCMLs.Drawing)
from #XRAYCMLs x
where x.DRAWING = #Circuits.DRAWING
);

Insert first and then update the table using SQL statement

I am trying to use insert and update sql statements.
My table is as follows:
|c1|c2|c3|c4|c5
|1 2 a b c
|1 3 e f g
c3,c4,c5 can have different values. The row can be unique with the combination of C1 and C2 column. I need to be able to check if first row doesn't exists with values c1,c2 then insert the data. If c1,c2 already have the values for eg (1,2) and if the data comes back with the same values for c1,c2 then update c3,c4,c5 with the latest values.
I tried using the following query
INSERT INTO t1 (c1,c2,c3,c4,c5)
VALUES ('1','2','a','b','c')
ON DUPLICATE KEY
UPDATE c3='e',c4 = 'f',c5='g';
I am getting a ORA error as follows
SQL Command not ended properly (ORA-00933)
Update after response from sagi
MERGE INTO table1 t
USING(select '000004' as SENDER,'Receiver' as RECEIVER ,'1030' as IDENTIFIER,'2016' as CREATIONDATEANDTIME,'2' as ACKCODE,'Test' as ACKDESCRIPTION from table1 ) s
ON(t.SENDER = s.SENDER and t.IDENTIFIER = s.IDENTIFIER)
WHEN MATCHED THEN UPDATE SET t.CREATIONDATEANDTIME = '1213',t.RECEIVER = 'hello'
WHEN NOT MATCHED THEN INSERT (t.SENDER,t.RECEIVER,t.IDENTIFIER,t.CREATIONDATEANDTIME,t.ACKCODE,t.ACKDESCRIPTION)
VALUES (s.SENDER,s.RECEIVER,s.IDENTIFIER,s.CREATIONDATEANDTIME,s.ACKCODE,s.ACKDESCRIPTION)
Output of query:
scenario 1: When there is no data matching the condition(t.SENDER = s.SENDER and t.IDENTIFIER = s.IDENTIFIER), I get an error as follows
ORA-30926: Unable to get stable set of rows in the source tables. Cause: A stable set of rows could not be got because of large dml activity or a non-deterministic activity where clause.
Action: Remove any non-deterministic where clause and reissue dml
Scenario 2: When there is data matching the condition (t.SENDER = s.SENDER and t.IDENTIFIER = s.IDENTIFIER) then in the table, I can see 5 new entries.
Can you please help.
You can use MERGE STATEMENT like this:
MERGE INTO t1 t
USING(select '1' as c1,'2' c2 ,'a' as c3,'b' as c4,'c' as c5 from dual) s
ON(t.c1 = s.c1 and t.c2 = s.c2)
WHEN MATCHED THEN UPDATE SET t.c3 = '1213',t.c4 = 'test'
WHEN NOT MATCHED THEN INSERT (t.c1,t.c2,t.c3,t.c4,t.c5)
VALUES (S.c1,s.c2,s.c3,s.c4,s.c5)
This basically perform an UPSERT, update else insert. It checks if the values exist, if so - update/deletes them(adjust to code to do what you want) and if not, insert them.
MERGE INTO table1 t USING(select distinct '000004' as SENDER,'Receiver' as RECEIVER ,'1030' as IDENTIFIER,'2016' as CREATIONDATEANDTIME,'2' as ACKCODE,'Test' as ACKDESCRIPTION from table1 ) s ON(t.SENDER = s.SENDER and t.IDENTIFIER = s.IDENTIFIER) WHEN MATCHED THEN UPDATE SET t.CREATIONDATEANDTIME = '1213',t.RECEIVER = 'hello' WHEN NOT MATCHED THEN INSERT (t.SENDER,t.RECEIVER,t.IDENTIFIER,t.CREATIONDATEANDTIME,t.ACKCODE,t.ACKDESCRIPTION) VALUES (s.SENDER,s.RECEIVER,s.IDENTIFIER,s.CREATIONDATEANDTIME,s.ACKCODE,s.ACKDESCRIPTION)
Added distinct clause in my query and it is working fine. Thanks all for replying to my post and guiding me.

update multiple fields SQL

Hi my problem is I want to update a field in 1 table using another field from several tables dependant upon where the item originates my only problem is the table which im trying to update has several of the same values in so am getting 'single row sub-query returns more than 1 row'. I dont mind all of the updated fields with the same value being the same. Heres my SQL:
update URL_SET_TAB u
Set U.ITEM_NAME = (select a.PROGRAMME_NAME
from (SELECT (nvl(nvl(b.prog_name,c.movie_name), A.URL_1)) as programme_name, a.ID, a.URL_1
FROM URL_SET_TAB a, prog_name_lookup b, movie_name_lookup c
where a.url_1 = b.url_1(+) and a.url_1 = C.MOVIE_URL(+)
) a
where u.ID = a.ID and U.URL_1 = a.URL_1
)
You need to identify a key column which when matched for URL_SET_TAB and inline view a so that the subquery returns only a single record. This is a limitaion of an UPDATE clause.
Thanks,
Aditya