This is what I am trying to do:
Let's say I have two tables dbo.source & dbo.destination
I want to copy all records from source to destination IF a certain natural key (unique non clustered) does not exist in the destination. If the insert is successful, then output some values to a temporary buffer table.
Next I want to list all the records from the source which DID have a match in the destination, and copy these as well to the buffer table.
Is there any way I can achieve this so that the buffer table does not hold redundant data?
This is my current logic:
Step1: Get records from the source table where the natural key does not match the destination and insert into destination
Insert these into buffer table with flag
MERGE INTO dbo.Destination dest USING dbo.Source AS src
ON dest.Name = src.Name --Natural Key
WHEN NOT MATCHED THEN
INSERT (xxx) VALUES (xxx)
OUTPUT src.ID, Inserted.ID, 'flagA'
INTO dbo.Buffer;
Step2:
Get records from the source table where the natural key matched the destination
Insert these into buffer with a flag
Insert INTO dbo.Buffer
Select src.ID, src.Name, 'flagB'
FROM dbo.Source src
inner join dbo.Destination dest
on src.Name = dest.Name
With this logic, I am getting redundant rows in my buffer, which do not exactly track the inserts as intended. Can anyone critique my SQL based on what I am trying to do?
You can try something like this. The drawback of this technique is that you always update one field. You may need to adapt my example to your needs.
DECLARE @Source table (id int identity , myValue varchar(5))
DECLARE @Destination table (id int identity , myValue varchar(5))
DECLARE @Buffer table (sourceId int , InsertId varchar(5),flag varchar(5))
insert @Source (myValue) values ( 'a') ,( 'e'),( 'i'),( 'o'),( 'u')
insert @Destination (myValue) values ('a') ,('b'),('c')
;merge @Destination t
using @Source S
on
t.myValue = s.myValue
when not matched then insert (myValue) values (s.myValue)
when matched then update set myValue = t.myValue
output s.id, inserted.id, case $action when 'INSERT' then 'flagA' else 'flagB' end into @Buffer;
select * from @Destination
select * from @Buffer
Result
Destination table
id myValue
----------- -------
1 a
2 b
3 c
4 e
5 i
6 o
7 u
Buffer table
sourceId InsertId flag
----------- -------- -----
2 4 flagA
3 5 flagA
4 6 flagA
5 7 flagA
1 1 flagB
Use this OUTPUT clause instead:
output s.id, case $action when 'INSERT' then Cast(inserted.id as varchar(5)) else inserted.myValue end , case $action when 'INSERT' then 'flagA' else 'flagB' end into @Buffer;
to get:
sourceId InsertId flag
----------- -------- -----
2 4 flagA
3 5 flagA
4 6 flagA
5 7 flagA
1 a flagB
When you run your second query, it also matches the just inserted rows. You should do something like this:
Insert INTO dbo.Buffer
Select src.ID, src.name, 'flagB'
FROM dbo.Source src
inner join dbo.Destination dest on src.Name = dest.Name
where not exists (
select * from dbo.Buffer b where b.xxx = 'flagA' and b.yyy = src.name
)
Or just use when matched by target, as LONG suggested.
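The ordering problem can also be avoided by classifying rows before anything is inserted. Here is a minimal sketch of that idea in Python against an in-memory SQLite database, not SQL Server. SQLite has no MERGE, so the insert is a plain INSERT ... SELECT, and the lowercase table names and the buffer columns source_id and dest_id are assumptions made for the illustration. Because the pre-existing matches are recorded first, every source row ends up in the buffer exactly once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE destination (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE buffer (source_id INTEGER, dest_id INTEGER, flag TEXT);
INSERT INTO source (name) VALUES ('a'), ('e'), ('i'), ('o'), ('u');
INSERT INTO destination (name) VALUES ('a'), ('b'), ('c');
""")

with conn:  # one transaction: commit on success, roll back on error
    # 1. Record the rows that already match BEFORE inserting anything.
    conn.execute("""
        INSERT INTO buffer
        SELECT s.id, d.id, 'flagB'
        FROM source s JOIN destination d ON d.name = s.name
    """)
    # 2. Insert the source rows whose natural key is missing.
    conn.execute("""
        INSERT INTO destination (name)
        SELECT s.name FROM source s
        WHERE NOT EXISTS (SELECT 1 FROM destination d WHERE d.name = s.name)
        ORDER BY s.id
    """)
    # 3. Every source row not yet in the buffer must have been inserted just now.
    conn.execute("""
        INSERT INTO buffer
        SELECT s.id, d.id, 'flagA'
        FROM source s JOIN destination d ON d.name = s.name
        WHERE NOT EXISTS (SELECT 1 FROM buffer b WHERE b.source_id = s.id)
    """)

rows = conn.execute(
    "SELECT source_id, dest_id, flag FROM buffer ORDER BY source_id"
).fetchall()
for row in rows:
    print(row)
```

In SQL Server the single MERGE with `$action` shown earlier does the same classification in one statement; this sketch only shows why the order of the two steps matters.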
I need to update a table if an existing record has changes. If the record exists (APK is found), then update it if ID_NUMBER has changed. The problem is when the existing TARGET value is NOT NULL and the SOURCE value IS NULL. How can I detect that condition as unequal?
Using ISNULL() can work, but the second parameter must -not- ever occur in the data. That requires profiling all of the NUMERIC data. Can it be done without that? In this case, zero (0) can work only if it never occurs in the data.
UPDATE T SET
T.f1 = S.f1
FROM TARGET_TABLE T
INNER JOIN SOURCE_TABLE S
ON T.APK = S.APK
WHERE
ISNULL(T.ID_NUMBER,0) <> ISNULL(S.ID_NUMBER,0)
;
Here are the possible combinations of ID_NUMBER values in the TARGET and SOURCE tables.
Target Source
====== ======
NULL NULL
NULL 7 -- should identify as unequal
7 NULL -- should identify as unequal, BUT DOES NOT
7 7
The following script shows the results. Only the third statement, comparing an existing TARGET value of 7 with the incoming SOURCE value of NULL will fail. Why is that? What code will work?
SET NOCOUNT ON;
SELECT 3 WHERE ISNULL(NULL, NULL+1) <> ISNULL(NULL,NULL+1);
GO
SELECT 3 WHERE ISNULL(NULL, 7+1) <> ISNULL(7,7+1);
GO
SELECT 3 WHERE ISNULL(7, NULL+1) <> ISNULL(NULL,NULL+1); -- WHY DOES THIS NOT SEE THE INEQUALITY?
GO
SELECT 3 WHERE ISNULL(7, 7+1) <> ISNULL(7,7+1);
GO
Example execution:
1> SET NOCOUNT ON;
2> SELECT 3 WHERE ISNULL(NULL, NULL+1) <> ISNULL(NULL,NULL+1);
3> GO
-----------
1> SELECT 3 WHERE ISNULL(NULL, 7+1) <> ISNULL(7,7+1);
2> GO
-----------
3
1> SELECT 3 WHERE ISNULL(7, NULL+1) <> ISNULL(NULL,NULL+1); -- WHY DOES THIS NOT SEE THE INEQUALITY?
2> GO
-----------
1> SELECT 3 WHERE ISNULL(7, 7+1) <> ISNULL(7,7+1);
2> GO
-----------
1>
I highly recommend doing some reading on NULL because it represents an unknown value and as such cannot be compared or added to another value. Therefore you have to treat it as a separate case using traditional AND/OR logic.
DECLARE @Table1 TABLE (APK int, ID_NUMBER int);
DECLARE @Table2 TABLE (APK int, ID_NUMBER int);
INSERT INTO @Table1 (APK, ID_NUMBER)
VALUES (1, null), (1, null), (1, 7), (1, 7), (1, 5);
INSERT INTO @Table2 (APK, ID_NUMBER)
VALUES (1, null), (1, 7), (1, null), (1, 7), (1, 4);
SELECT T.APK, T.ID_NUMBER, S.ID_NUMBER
FROM @Table1 T
INNER JOIN @Table2 S ON T.APK = S.APK
WHERE T.ID_Number <> S.ID_NUMBER
OR (T.ID_Number IS NULL AND S.ID_NUMBER IS NOT NULL)
OR (T.ID_Number IS NOT NULL AND S.ID_NUMBER IS NULL);
Given I suspect you have simplified your actual use-case you might find that EXCEPT can be used in your situation as EXCEPT (and INTERSECT) perform a different type of compare when it comes to NULLs. See here for more.
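To see the three-valued logic in action, here is a small sketch in Python using an in-memory SQLite database (not SQL Server). The `pairs` table and the `differing` helper are made up for the illustration. SQLite's `IFNULL` stands in for `ISNULL`, and its `IS NOT` operator is a NULL-aware inequality. The last sample pair, (0, NULL), is there to show why a sentinel value only works if it never occurs in the data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (target INTEGER, source INTEGER)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(None, None), (None, 7), (7, None), (7, 7), (0, None)],
)

def differing(predicate):
    """Return the (target, source) pairs that the predicate reports as unequal."""
    return conn.execute(
        f"SELECT target, source FROM pairs WHERE {predicate} ORDER BY rowid"
    ).fetchall()

# A comparison with NULL is unknown rather than true, so no row qualifies.
plain = differing("target <> source")

# Sentinel approach: misses (0, NULL) because the sentinel 0 occurs in the data.
sentinel = differing("IFNULL(target, 0) <> IFNULL(source, 0)")

# Explicit AND/OR logic, as in the answer above.
explicit = differing(
    "target <> source"
    " OR (target IS NULL AND source IS NOT NULL)"
    " OR (target IS NOT NULL AND source IS NULL)"
)

# SQLite's NULL-aware inequality gives the same rows as the explicit form.
null_safe = differing("target IS NOT source")

print(plain, sentinel, explicit, null_safe, sep="\n")
```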
Please try the following solution.
It is based on computing a checksum in CTEs via the HASHBYTES() function.
This method works with NULL values and with multiple columns in the tables.
I added an UpdatedOn column to show which rows were updated.
SQL
-- DDL and sample data population, start
DECLARE @Table1 TABLE (APK INT IDENTITY PRIMARY KEY, ID_NUMBER INT, UpdatedOn DATETIMEOFFSET(3));
DECLARE @Table2 TABLE (APK INT IDENTITY PRIMARY KEY, ID_NUMBER INT, UpdatedOn DATETIMEOFFSET(3));
INSERT INTO @Table1 (ID_NUMBER)
VALUES (null), (null), (7), (7), (5);
INSERT INTO @Table2 (ID_NUMBER)
VALUES (null), (7), (null), (7), (4);
-- DDL and sample data population, end
WITH source AS
(
SELECT sp.*, HASHBYTES('sha2_256', xmlcol) as [Checksum]
FROM @Table1 sp
CROSS APPLY (SELECT sp.* FOR XML RAW) x(xmlcol)
), target AS
(
SELECT sp.*, HASHBYTES('sha2_256', xmlcol) as [Checksum]
FROM @Table2 sp
CROSS APPLY (SELECT sp.* FOR XML RAW) x(xmlcol)
)
UPDATE T
SET T.ID_NUMBER = S.ID_NUMBER
, T.UpdatedOn = SYSDATETIMEOFFSET()
FROM TARGET AS T
INNER JOIN SOURCE AS S
ON T.APK = S.APK
WHERE T.[Checksum] <> S.[Checksum];
-- test
SELECT * FROM @Table2;
Output
+-----+-----------+--------------------------------+
| APK | ID_NUMBER | UpdatedOn |
+-----+-----------+--------------------------------+
| 1 | NULL | NULL |
| 2 | NULL | 2022-02-09 18:58:10.336 -05:00 |
| 3 | 7 | 2022-02-09 18:58:10.336 -05:00 |
| 4 | 7 | NULL |
| 5 | 5 | 2022-02-09 18:58:10.336 -05:00 |
+-----+-----------+--------------------------------+
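The same checksum idea can be sketched outside the database. This is only an illustration in Python, not the HASHBYTES/FOR XML code above: the `row_checksum` helper and the use of JSON as the serialization are assumptions. The point is that the serialized form of a row must represent NULL differently from every real value, so that hashes differ whenever exactly one side is NULL:

```python
import hashlib
import json

def row_checksum(row):
    """Hash a row dict; None serializes as JSON null, distinct from any number."""
    payload = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# ID_NUMBER per APK, mirroring the sample data in the answer above.
source = {1: None, 2: None, 3: 7, 4: 7, 5: 5}
target = {1: None, 2: 7, 3: None, 4: 7, 5: 4}

changed = [
    apk
    for apk in sorted(target)
    if row_checksum({"ID_NUMBER": target[apk]})
    != row_checksum({"ID_NUMBER": source[apk]})
]
print(changed)  # APKs whose target row would be updated
```

With this sample data the changed keys are 2, 3 and 5, which are the rows that carry an UpdatedOn value in the output above.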
I am trying to prevent multiple inserts at the exact same time so I can prevent duplicate inserts. I have two tables:
Table B, this table has 4 columns id, timeToken, tokenOrder and taken.
Table A which I will be inserting into and that has id, createDate and timeToken.
What I am trying to do is prevent timeToken in Table A from getting duplicate values when multiple inserts happen at the exact same time. I have the following code:
DECLARE @ReturnValue nvarchar
SELECT Top 1 @ReturnValue=timeToken FROM TableB WHERE taken = 0 Order By tokenOrder
Update TableB SET taken = 1 WHERE timeToken = @ReturnValue
INSERT INTO TableA Values(@ReturnValue, GETDATE())
Now that I think about it, is it possible to have the timeToken column in TableA auto-increment with the timeToken from TableB?
Table B sample data:
id timeToken tokenOrder taken
1 1:00am 1 0
2 2:00am 2 0
3 3:00am 3 1
4 4:00am 4 0
5 5:00am 5 0
This is what I am expecting Table A to look like after 4 calls all at the exact same time that would cause duplicates (id starting at 5 - this could be because I have deleted old records).
Table A sample data:
id createDate timeToken
5 2014-11-22 12:45:34.243 1:00am
6 2014-11-22 12:45:34.243 2:00am
7 2014-11-22 12:45:34.243 4:00am
8 2014-11-22 12:45:34.243 5:00am
Try rewriting it like this. It should ensure that a TableB row with taken = 0 does not get updated twice.
BEGIN TRANSACTION
DECLARE @taken table(
id int NOT NULL,
timeToken nvarchar(max));
Update TOP (1) TableB
SET taken = 1
-- T-SQL exposes the updated row as INSERTED; there is no UPDATED pseudo-table
OUTPUT INSERTED.id, INSERTED.timeToken
INTO @taken
WHERE timeToken =
(SELECT Top 1 timeToken FROM TableB WHERE taken = 0 Order By tokenOrder)
INSERT INTO TableA
SELECT id, GETDATE(), timeToken
FROM @taken
COMMIT TRANSACTION
See SQL Server isolation levels - read committed. READ COMMITTED is the default isolation level for the Microsoft SQL Server Database Engine.
In the example I copy id from TableB to TableA, but that is probably not required.
I think you can solve this problem in two steps:
Step1: Buffer all requests as soon as they arrive.
Step2: Periodically assign free tokens to the buffered requests.
Preparation
A sequence object will help resolve any order ambiguity:
CREATE SEQUENCE dbo.Taken_Seq
START WITH 1
INCREMENT BY 1 ;
GO
An auxiliary table will play the role of the buffer:
CREATE TABLE buffer (
requester uniqueidentifier, createdate datetime, seq_value bigint, id int);
I will also use a GUID to refer to the different processes asking for a token (requesters):
ALTER TABLE TableA add Requester uniqueidentifier;
Solution outline
As soon as a request arrives (identified by a GUID), buffer it and consume the next sequence value, like this. Here I use newid() to get a GUID; your application should already have assigned one to the request:
declare @seq bigint;
SELECT @seq = NEXT VALUE FOR dbo.Taken_Seq;
insert buffer values (newid(), getdate(), @seq, null);
Suppose now that three such requests arrive simultaneously, as in:
declare @seq bigint;
SELECT @seq = NEXT VALUE FOR dbo.Taken_Seq;
insert buffer values (newid(), getdate(), @seq, null);
SELECT @seq = NEXT VALUE FOR dbo.Taken_Seq;
insert buffer values (newid(), getdate(), @seq, null);
SELECT @seq = NEXT VALUE FOR dbo.Taken_Seq;
insert buffer values (newid(), getdate(), @seq, null);
The contents of the buffer table will then look like this:
requester createdate seq_value id
------------------------------------ ----------------------- -------------------- -----------
109B560C-155C-40BD-A13A-59D21EBEB1F8 2017-04-05 11:17:35.127 31 NULL
FAC00C2E-14AA-4502-AB5C-DDD756914653 2017-04-05 11:17:35.127 32 NULL
E95889C3-E291-4A1C-A7E8-0B8CC53D4D7B 2017-04-05 11:17:35.127 33 NULL
Next we can match each buffered request to a token. This is done by assigning an id value to each request in our buffer table:
; with a as
(select rn =row_number() over (order by seq_value), *
from buffer
where id is null),
b as
(
select rn=row_number() over (order by tokenOrder), *
from TableB
where taken = 0
)
update buffer set buffer.id = b.id
from buffer
join a on buffer.requester = a.requester
join b on a.rn = b.rn
This is how our buffer table now looks:
requester createdate seq_value id
------------------------------------ ----------------------- -------------------- -----------
109B560C-155C-40BD-A13A-59D21EBEB1F8 2017-04-05 11:17:35.127 31 1
FAC00C2E-14AA-4502-AB5C-DDD756914653 2017-04-05 11:17:35.127 32 2
E95889C3-E291-4A1C-A7E8-0B8CC53D4D7B 2017-04-05 11:17:35.127 33 3
Join the buffer table with TableB to find the tokens:
select buffer.requester, tableB.* from buffer join tableB on buffer.id= tableB.id
Mark the tokens as taken:
update TableB set taken = 1 from buffer where buffer.id = TableB.id
Finally, insert into TableA:
insert TableA (requester, createdate, timeToken)
select buffer.requester, buffer.createdate, TableB.timeToken
from buffer join TableB on buffer.id = TableB.id
Note:
Obviously some of these steps must be contained within a single transaction
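The matching step (two row_number() rankings joined on rn) boils down to pairing the n-th pending request with the n-th free token. Here is a tiny Python sketch of just that pairing. The requester names and sequence values are made up, and the free tokens are the untaken rows from the question's Table B sample:

```python
# (requester, seq_value): arrival order is given by the sequence value
requests = [("req-c", 33), ("req-a", 31), ("req-b", 32)]

# (id, timeToken, tokenOrder): only rows with taken = 0
free_tokens = [(4, "4:00am", 4), (1, "1:00am", 1), (5, "5:00am", 5), (2, "2:00am", 2)]

pending = sorted(requests, key=lambda r: r[1])       # row_number() over seq_value
tokens = sorted(free_tokens, key=lambda t: t[2])     # row_number() over tokenOrder

# zip pairs rank 1 with rank 1, rank 2 with rank 2, and so on.
# It stops at the shorter list, so surplus requests simply stay unassigned.
assignments = {
    requester: time_token
    for (requester, _seq), (_id, time_token, _order) in zip(pending, tokens)
}
print(assignments)
```

Requests that find no free token stay pending, which matches the `where id is null` filter in the first CTE on the next run.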
I am trying to understand the merge search condition and have come across the following problem.
Table1
id groupid description
-------------------------
1 10 Good
2 20 Better
Table2
id groupid description
-------------------------
1 10 Very Good
1 20 Much Better
I intend to merge the source (table1) into the target (table2) on the id present in both, but only for rows with groupid = 20 in the target table.
Here is what I am writing
Merge table1 source
Using table2 target ON (target.id = source.id AND target.groupid = 20)
When Matched
Then update
set target.description = source.description
The output I am expecting is
Table2
id groupid description
-------------------------
1 10 Very Good
1 20 Good
But I am not 100% sure of the ON clause (merge search condition) with multiple conditions of checking target.id = source.id and target.groupid = 20. Is the result always predictable and matching the expectation above in these multiple conditions ? Or is predictability a question here and should I be adding target.groupId = 20 in the "when matched AND" condition ?
It looks like your join is wrong. You are either needing to join on the GROUPID or your data is incorrect.
JOINING ON GROUP
create table #table1 (id int, groupid int, description varchar(64))
create table #table2 (id int, groupid int, description varchar(64))
insert into #table1 values
(1,10,'Good'),
(2,20,'Better')
insert into #table2 values
(1,10,'Very Good'),
(1,20,'Much Better')
Merge #table2 t
Using #table1 s
ON (t.groupid = s.groupid AND t.groupid = 20)
When Matched
Then update
set t.description = s.description;
select * from #table2
drop table #table2
drop table #table1
Otherwise, there isn't any way to correlate "better" from ID = 2 to a row where ID = 1. This goes against your original join condition on the ID column.
BASED OFF EDITED EXPECTED OUTPUT
create table #table1 (id int, groupid int, description varchar(64))
create table #table2 (id int, groupid int, description varchar(64))
insert into #table1 values
(1,10,'Good'),
(2,20,'Better')
insert into #table2 values
(1,10,'Very Good'),
(1,20,'Much Better')
Merge #table2 t
Using #table1 s
ON (t.id = s.id) --you could also put the and t.groupid = 20 here...
When Matched and t.groupid = 20
Then update
set t.description = s.description;
select * from #table2
drop table #table2
drop table #table1
What's the CTE syntax to delete from a table, then insert to the same table and return the values of the insert?
Operating on 2 hours of sleep and something doesn't look right (besides the fact that this won't execute):
WITH delete_rows AS (
DELETE FROM <some_table> WHERE id = <id_value>
RETURNING *
)
SELECT * FROM delete_rows
UNION
(
INSERT INTO <some_table> ( id, text_field )
VALUES ( <id_value>, '<text_field_value>' )
RETURNING *
)
The expected behavior is to first clear all the records for an ID, then insert records for the same ID (intentionally not an upsert) and return those inserted records (not the deletions).
Your question update made clear that you cannot do this in a single statement.
Packed into CTEs of the same statement, both operations (INSERT and DELETE) would see the same snapshot of the table and execute virtually at the same time. I.e., the INSERT would still see all rows that you thought to be deleted already. The manual:
All the statements are executed with the same snapshot (see Chapter 13), so they cannot "see" one another's effects on the target tables.
You can wrap them as two independent statements into the same transaction - which doesn't seem strictly necessary either, but it would allow the whole operation to succeed / fail atomically:
BEGIN;
DELETE FROM <some_table> WHERE id = <id_value>;
INSERT INTO <some_table> (id, text_field)
VALUES ( <id_value>, '<text_field_value>')
RETURNING *;
COMMIT;
Now, the INSERT can see the results of the DELETE.
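For comparison, here is a minimal sketch of the same two-statement pattern driven from application code. It uses Python's sqlite3 module with an in-memory SQLite database instead of PostgreSQL, and the `replace_rows` helper and the NOT NULL constraint are assumptions added for the demo. The `with conn:` block plays the role of BEGIN ... COMMIT: if the insert fails, the delete is rolled back too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
with conn:  # commit the setup so a later rollback cannot undo it
    conn.execute("CREATE TABLE some_table (id INTEGER, text_field TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO some_table (id, text_field) VALUES (?, ?)",
        [(1, "old a"), (1, "old b"), (2, "keep")],
    )

def replace_rows(conn, id_value, texts):
    """Delete every row for id_value, insert new ones, return the rows now stored."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("DELETE FROM some_table WHERE id = ?", (id_value,))
        conn.executemany(
            "INSERT INTO some_table (id, text_field) VALUES (?, ?)",
            [(id_value, text) for text in texts],
        )
        return conn.execute(
            "SELECT id, text_field FROM some_table WHERE id = ? ORDER BY rowid",
            (id_value,),
        ).fetchall()

inserted = replace_rows(conn, 1, ["new text"])
print(inserted)

# A failing insert (NULL violates NOT NULL) undoes the delete as well.
try:
    replace_rows(conn, 2, [None])
except sqlite3.IntegrityError:
    pass
survivors = conn.execute(
    "SELECT text_field FROM some_table WHERE id = 2"
).fetchall()
print(survivors)
```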
CREATE TABLE test_table (value TEXT UNIQUE);
INSERT INTO test_table SELECT 'value 1';
INSERT INTO test_table SELECT 'value 2';
WITH delete_row AS (DELETE FROM test_table WHERE value='value 2' RETURNING 0)
INSERT INTO test_table
SELECT DISTINCT 'value 2'
FROM (SELECT 'dummy') dummy
LEFT OUTER JOIN delete_row ON TRUE
RETURNING *;
The query above handles the situations when DELETE deletes 0/1/some rows.
Elaborating on skif1979's "DelSert" CTE method, the "Logged DelSert:"
-- setups
DROP TABLE IF EXISTS _zx_t1 ;
CREATE TEMP TABLE
IF NOT EXISTS
_zx_t1
( id bigint
, fld2 bigint
, UNIQUE (id)
);
-- unique records
INSERT INTO _zx_t1 SELECT 1, 99;
INSERT INTO _zx_t1 SELECT 2, 98;
WITH
_cte_del_row AS
( DELETE
FROM _zx_t1
WHERE id = 2
RETURNING id as _b4_id, fld2 as _b4_fld2 -- returns complete deleted row
)
, _cte_delsert AS
( INSERT
INTO _zx_t1
SELECT DISTINCT
_cte_del_row._b4_id
, _cte_del_row._b4_fld2 + 1
from (SELECT null::integer AS _zunk) _zunk -- skif1979's trick here
LEFT OUTER JOIN _cte_del_row -- clever LOJ magic
ON TRUE -- LOJ cartesian product
RETURNING id as _aft_id , fld2 as _aft_fld2 -- return newly "delserted" rows
)
SELECT * -- returns before & after snapshots from CTE's
FROM
_cte_del_row
, _cte_delsert ;
RESULT:
_b4_id | _b4_fld2 | _aft_id | _aft_fld2
--------+----------+---------+-----------
2 | 209 | 2 | 210
AFAICT these all occur linearly w/in a unit of work, akin to a journaled or logged update.
Workable for
Child records
OR Schema w/ no FK
OR FK w/ cascading deletes
Not workable for
Parent records w/ FK & no cascading deletes
A related (& IMO better) answer, akin to the "Logged DelSert" is this, a logged "SelUp" :
-- setups
DROP TABLE IF EXISTS _zx_t1 ;
CREATE TEMP TABLE
IF NOT EXISTS
_zx_t1
( id bigint
, fld2 bigint
, UNIQUE (id)
);
-- unique records
INSERT INTO _zx_t1 SELECT 1, 99;
INSERT INTO _zx_t1 SELECT 2, 98;
WITH
_cte_sel_row AS
( SELECT -- start unit of work with read
id as _b4_id -- fields need to be aliased
,fld2 as _b4_fld2 -- to prevent ambiguous column errors
FROM _zx_t1
WHERE id = 2
FOR UPDATE
)
, _cte_sel_up_ret AS -- we're in the same UOW
( UPDATE _zx_t1 -- actual table
SET fld2 = _b4_fld2 + 1 -- some actual work
FROM _cte_sel_row
WHERE id = _b4_id
AND fld2 < _b4_fld2 + 1 -- gratuitous but illustrates the point
RETURNING id as _aft_id, fld2 as _aft_fld2
)
SELECT
_cte_sel_row._b4_id
,_cte_sel_row._b4_fld2 -- before
,_cte_sel_up_ret._aft_id
,_cte_sel_up_ret._aft_fld2 -- after
FROM _cte_sel_up_ret
INNER JOIN _cte_sel_row
ON TRUE AND _cte_sel_row._b4_id = _cte_sel_up_ret._aft_id
;
RESULT:
_b4_id | _b4_fld2 | _aft_id | _aft_fld2
--------+----------+---------+-----------
2 | 209 | 2 | 210
See also:
https://rob.conery.io/2018/08/13/transactional-data-operations-in-postgresql-using-common-table-expressions/
How do I delete duplicated rows in one table and update the references in another table to point to the remaining row? The duplication only occurs in the name. The Id columns are identity columns.
Example:
Assume we have two tables Doubles and Data.
Doubles table (
Id int,
Name varchar(50)
)
Data Table (
Id int,
DoublesId int
)
Now I have two entries in the Doubles table:
Id Name
1 Foo
2 Foo
And two entries in the Data Table:
ID DoublesId
1 1
2 2
At the end there should be only one entry in the Doubles Table:
Id Name
1 Foo
And two entries in the Data Table:
Id DoublesId
1 1
2 1
In the doubles Table there can be any number of duplicated rows per name (up to 30) and also regular 'single' rows.
I've not run this, but hopefully it should be correct, and close enough to the final soln to get you there. Let me know any mistakes if you like and I'll update the answer.
--updates the data table to the min ids for each name
update Data
set DoublesId = final_id
from
Data
join
Doubles
on Doubles.id = Data.DoublesId
join
(
select
name,
min(id) as final_id
from Doubles
group by name
) min_ids
on min_ids.name = Doubles.name
--deletes redundant ids from the Doubles table
delete
from Doubles
where id not in
(
select
min(id) as final_id
from Doubles
group by name
)
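Since the SQL above was posted untested, here is a small runnable sketch of the same two steps. It uses Python with an in-memory SQLite database, not SQL Server, and a correlated subquery instead of UPDATE ... FROM. The extra 'Bar' row is an assumption added to show that single rows are left alone:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Doubles (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Data (Id INTEGER PRIMARY KEY, DoublesId INTEGER);
INSERT INTO Doubles VALUES (1, 'Foo'), (2, 'Foo'), (3, 'Bar');
INSERT INTO Data VALUES (1, 1), (2, 2), (3, 3);
""")

with conn:  # both steps in one transaction
    # Step 1: repoint every reference to the lowest Id that shares its name.
    conn.execute("""
        UPDATE Data
        SET DoublesId = (
            SELECT MIN(keep.Id)
            FROM Doubles AS cur
            JOIN Doubles AS keep ON keep.Name = cur.Name
            WHERE cur.Id = Data.DoublesId
        )
        WHERE EXISTS (SELECT 1 FROM Doubles WHERE Doubles.Id = Data.DoublesId)
    """)
    # Step 2: delete every row that is not the lowest Id for its name.
    conn.execute("""
        DELETE FROM Doubles
        WHERE Id NOT IN (SELECT MIN(Id) FROM Doubles GROUP BY Name)
    """)

doubles = conn.execute("SELECT Id, Name FROM Doubles ORDER BY Id").fetchall()
data = conn.execute("SELECT Id, DoublesId FROM Data ORDER BY Id").fetchall()
print(doubles)
print(data)
```

The update has to run before the delete. Otherwise the rows needed to look up each reference's name are already gone.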
Note: I have taken the liberty of renaming your Ids to DoubleID and DataID respectively. I find that easier to work with.
DECLARE @Doubles TABLE (DoubleID INT, Name VARCHAR(50))
DECLARE @Data TABLE (DataID INT, DoubleID INT)
INSERT INTO @Doubles VALUES (1, 'Foo')
INSERT INTO @Doubles VALUES (2, 'Foo')
INSERT INTO @Doubles VALUES (3, 'Bar')
INSERT INTO @Doubles VALUES (4, 'Bar')
INSERT INTO @Data VALUES (1, 1)
INSERT INTO @Data VALUES (1, 2)
INSERT INTO @Data VALUES (1, 3)
INSERT INTO @Data VALUES (1, 4)
SELECT * FROM @Doubles
SELECT * FROM @Data
-- Point every Data row at the lowest DoubleID for its name
UPDATE dt
SET dt.DoubleID = dbmin.MinDoubleID
FROM @Data dt
INNER JOIN @Doubles db ON db.DoubleID = dt.DoubleID
INNER JOIN (
SELECT db.Name, MinDoubleID = MIN(db.DoubleID)
FROM @Doubles db
GROUP BY db.Name
) dbmin ON dbmin.Name = db.Name
/* Kudos to quassnoi */
-- Order by DoubleID so the surviving row (rn = 1) is the one Data now references
;WITH q AS (
SELECT Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY DoubleID) AS rn
FROM @Doubles
)
DELETE
FROM q
WHERE rn > 1
SELECT * FROM @Doubles
SELECT * FROM @Data
Take a look at this one. I have tried it and it works fine.
--create table Doubles ( Id int, Name varchar(50))
--create table Data( Id int, DoublesId int)
--select * from doubles
--select * from data
Declare @NonDuplicateID int
Declare @NonDuplicateName varchar(max)
DECLARE @sqlQuery nvarchar(max)
DECLARE DeleteDuplicate CURSOR FOR
SELECT Max(id),name AS SingleID FROM Doubles
GROUP BY [NAME]
OPEN DeleteDuplicate
FETCH NEXT FROM DeleteDuplicate INTO @NonDuplicateID, @NonDuplicateName
--Fetch next record
WHILE @@FETCH_STATUS = 0
BEGIN
--select b.ID , b.DoublesID, a.[name],a.id asdasd
--from doubles a inner join data b
--on
--a.ID=b.DoublesID
--where b.DoublesID<>@NonDuplicateID
--and a.[name]=@NonDuplicateName
print '---------------------------------------------';
select
@sqlQuery =
'update b
set b.DoublesID=' + cast(@NonDuplicateID as varchar(50)) + '
from
doubles a
inner join
data b
on
a.ID=b.DoublesID
where b.DoublesID<>' + cast(@NonDuplicateID as varchar(50)) +
' and a.[name]=''' + cast(@NonDuplicateName as varchar(max)) +'''';
print @sqlQuery
exec sp_executeSQL @sqlQuery
print '---------------------------------------------';
-- now move the cursor
FETCH NEXT FROM DeleteDuplicate INTO @NonDuplicateID ,@NonDuplicateName
END
CLOSE DeleteDuplicate --Close cursor
DEALLOCATE DeleteDuplicate --Deallocate cursor
---- Delete duplicate rows from original table
DELETE
FROM doubles
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM doubles
GROUP BY [NAME]
)
Please try and let me know if this helped you
Thanks
~ Aamod
If you are using MySQL, the following worked for me. I did it in two steps:
Step 1 -> Update all Data rows to one Double table reference (with lowest id)
Step 2 -> Delete all duplicates with keeping lowest id
Step 1 ->
update Data
join
Doubles
on Data.DoublesId = Doubles.id
join
(
select name, min(id) as final_id
from Doubles
group by name
) min_ids
on min_ids.name = Doubles.name
set DoublesId = min_ids.final_id;
Step 2 ->
DELETE c1 FROM Doubles c1
INNER JOIN Doubles c2
WHERE
c1.id > c2.id AND
c1.name = c2.name;