How to solve this SQL specific merge problem - sql

How to solve the error:
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
merge CARD_ALERTS as t
using #tblAlerts as s
on (t.Id = s.AlertId and t.CardId = s.CardId)
when not matched by target
then insert(Id, ExternalCodeHolder, CardId, IsCardOwner, IBAN, PAN, MinAmount, Currency, ByEmail, BySMS, IssueDate, IsActive)
values(s.AlertId, s.ExternalCodeHolder, s.CardId, s.IsCardOwner, s.IBAN, s.PAN, s.MinAmount, s.Currency, s.ByEmail, s.BySMS, getdate(), 1)
when matched
then update set t.ByEmail = s.ByEmail, t.BySMS = s.BySMS, IsActive = 1, t.MinAmount = s.MinAmount
when not matched by source and t.Id=#AlertId
then update set t.IsActive = 3

As explained in the error message and in the comments, multiple rows of the source table correspond to the same row in the target one.
This will show you which rows of the target table have multiple rows from the source table that try to update them:
select t.Id,t.CardId,count(*) as [count]
from CARD_ALERTS as t
inner join #tblAlerts as s
on (t.Id = s.AlertId and t.CardId = s.CardId)
group by t.Id,t.CardId
having count(*)>1
This will also show you the multiple rows of the source table:
select t.*,s.*
from CARD_ALERTS as t
inner join #tblAlerts as s on (t.Id = s.AlertId and t.CardId = s.CardId)
inner join
(
select ca.Id,ca.CardId
from CARD_ALERTS as ca
inner join #tblAlerts as s
on (ca.Id = s.AlertId and ca.CardId = s.CardId)
group by ca.Id,ca.CardId
having count(*)>1
)tkey on t.Id=tkey.Id and t.CardId=tkey.CardId

Related

BigQuery - How can i make Merge Statement Handle Duplicate rows?

I currently have a table which uses compound columns to uniquely identify each row of data.. So they are times we get duplicate rows. I have an Merge Statement that copies data from the source table to the destination. Anytime we have duplicate rows on the Source Table. The Merge statement copies the Duplicate along with the other rows of data to the Destination. The merge statement is
MERGE `project.dataset.destination` T
USING `project.dataset.source` S
ON (T.department = S.department OR T.department IS NULL and S.department IS NULL) AND
(T.category = S.category OR T.category IS NULL AND S.category IS NULL) AND
(T.subcategory = S.subcategory OR T.subcategory IS NULL AND S.subcategory IS NULL) AND
(T.subset = S.subset OR T.subset IS NULL AND S.subset IS NULL) AND
(T.country = S.country OR T.country IS NULL AND S.country IS NULL) AND
(T.state = S.state OR T.state IS NULL AND S.state IS NULL) AND
(T.county = S.county OR T.county IS NULL AND S.country IS NULL) AND
(T.date = S.date OR T.date IS NULL AND S.date IS NULL)
WHEN NOT MATCHED AND ((department = "SPORT" AND subcategory IN ("FOOTBALL", "PONG")) AND
(country IN("USA", "CANADA") )) THEN
INSERT ROW
WHEN NOT MATCHED BY SOURCE THEN
DELETE
My question is; is there any way i can handle the Inserting of Duplicates to the Destination Table?
Or if the Duplicates rows is Inserting into the Destination. when this Merge statement runs the next day; is there anyway i can modify this Merge Statement to Delete any duplicate found on the Destination table?
Thanks in Advance
General guideline for a good SQL question: provide sample source table and dest table, then the expected results.
That said, based on the comments, 'duplicate' here means:
having the same row of data appear more than one on the source table
Deduping the source table could happen in a subquery like
MERGE `project.dataset.destination` T
USING (
SELECT ...
FROM `project.dataset.source`
GROUP BY <your_unique_key>
) S
ON ...

If the record exists in target but doesn't exist in source, i need to delete it

So, i have two tables, the target table and the source one. I need to delete the rows that exists in the target table, but doesn't exists in the source table.
And the code:
MERGE INTO (SELECT id_car_bk, car_brand_bk, car_type_bk, new_car
FROM car_catalog_backup) CB
USING (SELECT id_car, car_brand, car_type FROM car_catalog) C
ON (CB.id_car_bk = b.id_car)
WHEN NOT MATCHED THEN
INSERT
(CB.id_car_bk, CB.car_brand_bk, CB.car_type_bk)
VALUES
(C.id_car, C.car_brand, C.car_type)
WHEN MATCHED THEN
UPDATE SET CB.car_brand_bk = C.car_brand;
You can use
DELETE car_catalog_backup b
WHERE not exists
( SELECT 0
FROM car_catalog c
WHERE b.id_car_bk = c.id_car );
or
DELETE car_catalog_backup b
WHERE b.id_car_bk not in
( SELECT c.id_car
FROM car_catalog c );
assuming car_catalog is the source, and car_catalog_backup is the target. The First one is preferable, since it's more performant.
If your aim is to find out with a MERGE statement similar to your case, then use the following
MERGE INTO car_catalog_backup a
USING (SELECT id_car, car_brand, car_type, car_brand_bk
FROM car_catalog
JOIN car_catalog_backup
ON id_car_bk = id_car
) b
ON (a.id_car_bk = b.id_car)
WHEN MATCHED THEN
UPDATE SET a.new_car = 1
DELETE
WHERE a.car_brand_bk != b.car_brand
WHEN NOT MATCHED THEN
INSERT
(id_car_bk, car_brand_bk, car_type_bk)
VALUES
(b.id_car, b.car_brand, b.car_type)
to delete the records matched for id columns ( a.id_car_bk = b.id_car ) but not matched for brand code columns ( a.car_brand_bk != car_brand ) as an example.
Demo
Delete from target
Where not exists
(
Select 1
From source
Where join of source and target
)
With a left join:
DELETE target
FROM target LEFT JOIN source
ON target.someid = source.otherid
WHERE source.otherid IS NULL;

PL/SQL Update Query

Attempting to update a table I have created with null values from another table I have created in PL/SQL:
I was able to utilize the below update in SQL Server but am running into issues within PL/SQL (dont ask why I am running it in both)
Overall_Inventory = Table created with some populated values and some null valuse; this is the table requiring updates to those null values
task_table = Table also created, but contains value needing to be updated into T1
update dbh.overall_inventory
set dbh.overall_inventory.case_due_date = tsk.TASK_ACTION_TIMESTAMP
from dbh.overall_inventory,
(SELECT tsk.INQ_KEY,
min(tsk.TASK_ACTION_TIMESTAMP) as TASK_ACTION_TIMESTAMP
FROM dbh.task_table tsk
inner join dbh.overall_inventory Inv
on tsk.INQ_KEY = inv.inq_key
where tsk.ACTION_CD = '324'
group by tsk.INQ_KEY
) tsk
where tsk.INQ_KEY = dbh.overall_inventory.inq_key`
Oracle doesn't support from clause in the update statement. In this situation merge statement can be used.
merge into overall_inventory oi
using (select tsk.inq_key,
min(tsk.task_action_timestamp) as task_action_timestamp
from task_table tsk
join overall_inventory Inv
on tsk.inq_key = inv.inq_key
where tsk.action_cd = '324'
group by tsk.inq_key) tsk
on (tsk.inq_key = oi.inq_key )
when matched then
update
set case_due_date = tsk.task_action_timestamp
where case_due_date is null -- as I understood only NULL values
-- need to be updated
Note: Not tested because no sample data and desired result were provided.
I think you're looking for update over a select:
UPDATE (
  SELECT product_id, category_id
  FROM product) st
SET st.category_id = 5
WHERE st.category_id = 4;

Ms Access query to SQL Server - DistinctRow

What would the syntax be to convert this MS Access query to run in SQL Server as it doesn't have a DistinctRow keyword
UPDATE DISTINCTROW [MyTable]
INNER JOIN [AnotherTable] ON ([MyTable].J5BINB = [AnotherTable].GKBINB)
AND ([MyTable].J5BHNB = [AnotherTable].GKBHNB)
AND ([MyTable].J5BDCD = [AnotherTable].GKBDCD)
SET [AnotherTable].TessereCorso = [MyTable].[J5F7NR];
DISTINCTROW [MyTable] removes duplicate MyTable entries from the results. Example:
select distinctrow items
items.item_number, items.name
from items
join orders on orders.item_id = items.id;
In spite of the join getting you the same item_number and name multiple times when there is more than one order for it, DISTINCTROW reduces this to one row per item. So the whole join is merely for assuring that you only select items for which exist at least one order. You don't find DISTINCTROW in any other DBMS as far as I know. Probably because it is not needed. When checking for existence, we use EXISTS of course (or IN for that matter).
You are joining MyTable and AnotherTable and expect for some reason to get the same MyTable record multifold for one AnotherTable record, so you use DISTINCTROW to only get it once. Your query would (hopefully) fail if you got two different MyTable records for one AnotherTable record.
What the update does is:
update anothertable
set tesserecorso = (select top 1 j5f7nr from mytable where mytable.j5binb = anothertable.gkbinb and ...)
where exists (select * from mytable where mytable.j5binb = anothertable.gkbinb and ...)
But this uses about the same subquery twice. So we'd want to update from a query instead.
The easiest way to get one result record per <some columns> in a standard SQL query is to aggregate data:
select *
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
How to write an updateble query is different from one DBMS to another. Here is the final update statement for SQL-Server:
update a
set a.tesserecorso = m.j5f7nr
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
The DISTINCTROW predicate in MS Access SQL removes duplicates across all fields of a table in join statements and not just the selected fields of query (which DISTINCT in practically all SQL dialects do). So consider selecting all fields in a derived table with DISTINCT predicate:
UPDATE [AnotherTable]
SET [AnotherTable].TessereCorso = main.[J5F7NR]
FROM
(SELECT DISTINCT m.* FROM [MyTable] m) As main
INNER JOIN [AnotherTable]
ON (main.J5BINB = [AnotherTable].GKBINB)
AND (main.J5BHNB = [AnotherTable].GKBHNB)
AND (main.J5BDCD = [AnotherTable].GKBDCD)
Another variant of the query.. (Too lazy to get the original tables).
But like the query above updates 35 rows =, so does this one
UPDATE [Albi-Anagrafe-Associati]
SET
[Albi-Anagrafe-Associati].CRegDitte = [055- Registri ditte].[CRegDitte],
[Albi-Anagrafe-Associati].NIscrTribunale = [055- Registri ditte].[NIscrTribunale],
[Albi-Anagrafe-Associati].NRegImprese = [055- Registri ditte].[NRegImprese]
FROM [055- Registri ditte]
WHERE EXISTS(
SELECT *
FROM [055- Registri ditte]-- [Albi-Anagrafe-Associati]
WHERE ([055- Registri ditte].GIBINB = [Albi-Anagrafe-Associati].GKBINB)
AND ([055- Registri ditte].GIBHNB = [Albi-Anagrafe-Associati].GKBHNB)
AND ([055- Registri ditte].GIBDCD = [Albi-Anagrafe-Associati].GKBDCD))
Update [AnotherTable]
Set [AnotherTable].TessereCorso = MyTable.[J5F7NR]
From [AnotherTable]
Inner Join
(
Select Distinct [J5BINB],[5BHNB],[J5BDCD]
,(Select Top 1 [J5F7NR] From MyTable) as [J5F7NR]
,[J5BHNB]
From MyTable
)as MyTable
On (MyTable.J5BINB = [AnotherTable].GKBINB)
AND (MyTable.J5BHNB = [AnotherTable].GKBHNB)
AND (MyTable.J5BDCD = [AnotherTable].GKBDCD)

Doing an Update Ignore in SQL Server 2005

I have a table where I wish to update some of the rows. All the fields are not null. I'm doing a sub-query, and I wish to update the table with the non-Null results.
See Below for my final answer:
In MySQL, I solve this problem by doing an UPDATE IGNORE. How do I make this work in SQL Server 2005? The sub-query uses a four-table Join to find the data to insert if it exists. The Update is being run against a table that could have 90,000+ records, so I need a solution that uses SQL, rather than having the Java program that's querying the database retrieve the results and then update those fields where we've got non-Null values.
Update: My query:
UPDATE #SearchResults SET geneSymbol = (
SELECT TOP 1 symbol.name FROM
GeneSymbol AS symbol JOIN GeneConnector AS geneJoin
ON symbol.id = geneJoin.geneSymbolID
JOIN Result AS sSeq ON geneJoin.sSeqID = sSeq.id
JOIN IndelConnector AS joiner ON joiner.sSeqID = sSeq.id
WHERE joiner.indelID = #SearchResults.id ORDER BY symbol.id ASC)
WHERE isSNV = 0
If I add "AND symbol.name IS NOT NULL" to either WHERE I get a SQL error. If I run it as is I get "adding null to a non-null column" errors. :-(
Thank you all, I ended up finding this:
UPDATE #SearchResults SET geneSymbol =
ISNULL ((SELECT TOP 1 symbol.name FROM
GeneSymbol AS symbol JOIN GeneConnector AS geneJoin
ON symbol.id = geneJoin.geneSymbolID
JOIN Result AS sSeq ON geneJoin.sSeqID = sSeq.id
JOIN IndelConnector AS joiner ON joiner.sSeqID = sSeq.id
WHERE joiner.indelID = #SearchResults.id ORDER BY symbol.id ASC), ' ')
WHERE isSNV = 0
While it would be better not to do anything in the null case (so I'm going to try to understand the other answers, and see if they're faster) setting the null cases to a blank answer also works, and that's what this does.
Note: Wrapping the ISNULL (...) with () leads to really obscure (and wrong) errors.
with UpdatedGenesDS (
select joiner.indelID, name, row_number() over (order by symbol.id asc) seq
from
GeneSymbol AS symbol JOIN GeneConnector AS geneJoin
ON symbol.id = geneJoin.geneSymbolID
JOIN Result AS sSeq ON geneJoin.sSeqID = sSeq.id
JOIN IndelConnector AS joiner ON joiner.sSeqID = sSeq.id
WHERE name is not null ORDER BY symbol.id ASC
)
update Genes
set geneSymbol = upd.name
from #SearchResults a
inner join UpdateGenesDs upd on a.id = b.intelID
where upd.seq =1 and isSNV = 0
this handles the null completely as all are filtered out by the where predicate (can also be filtered by join predicate if You wish. Is it what You are looking for?
Here's another option, where only those rows in #SearchResults that are succesfully joined will be udpated. If there are no null values in the underlying data, then the inner joins will pull in no null values, and you won't have to worry about filtering them out.
UPDATE #SearchResults
set geneSymbol = symbol.name
from #SearchResults sr
inner join IndelConnector AS joiner
on joiner.indelID = sr.id
inner join Result AS sSeq
on sSeq.id = joiner.sSeqID
inner join GeneConnector AS geneJoin
on geneJoin.sSeqID = sSeq.id
-- Get "lowest" (i.e. first if listed alphabetically) value of name for each id
inner join (select id, min(name) name
from GeneSymbol
group by id) symbol
on symbol.id = geneJoin.geneSymbolID
where isSNV = 0 -- Which table is this value from?
(There might be some syntax problems, without tables I can't debug it)