Prevent the insertion of duplicate rows using SQL Server 2008

I am trying to insert some data from one table into another, but I would like to prevent the insertion of duplicate rows. I currently have the following query:
INSERT INTO Table1
(
Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5
)
SELECT
Table2Col1,
Table2Col2 = constant1,
Table2Col3 = constant2,
Table2Col4 = constant3,
Table2Col5 = constant4
FROM Table2
WHERE
Condition1 = constant5
AND
Condition2 = constant6
AND
Condition3 = constant7
AND
Condition4 LIKE '%constant8%'
What I do not know in advance is whether a row I am trying to insert from Table2 already exists in Table1. I would like to prevent that duplication: skip the insertion and move on to the next unique row.
I have seen that I can use a WHERE NOT EXISTS clause or the INTERSECT keyword, but I did not fully understand how to apply either to my particular query, as I only want to use some of the selected data from Table2 plus some constant values to insert into Table1.
EDIT:
I should add that the columns Table2Col2 through Table2Col5 don't actually exist in the result set; I am just populating these columns with constants alongside Table2Col1, which is returned.

Since you are on SQL Server 2008, you can use a MERGE statement.
You can easily check whether a row exists based on a key,
something like this:
MERGE TableMain AS target
USING TableA AS source
ON <join tables here>
WHEN MATCHED THEN <update>
WHEN NOT MATCHED BY TARGET THEN <insert>
WHEN NOT MATCHED BY SOURCE THEN <delete>;

INTERSECT (or rather EXCEPT, SQL Server's counterpart of MINUS) is out of the question because it compares whole rows. The other options are NOT IN, NOT EXISTS, LEFT JOIN, and MERGE. NOT IN works only with a single-column key, so it is out of the question in this instance. NOT EXISTS and LEFT JOIN should perform about the same in SQL Server, so I'll just use NOT EXISTS:
INSERT INTO Table1
(
Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5
)
SELECT
Table2Col1,
Table2Col2 = constant1,
Table2Col3 = constant2,
Table2Col4 = constant3,
Table2Col5 = constant4
FROM Table2
WHERE
Condition1 = constant5
AND
Condition2 = constant6
AND
Condition3 = constant7
AND
Condition4 LIKE '%constant8%'
AND NOT EXISTS
(
SELECT *
FROM Table1 target
WHERE target.Table1Col1 = Table2.Table2Col1
AND target.Table1Col2 = Table2.Table2Col2
AND target.Table1Col3 = Table2.Table2Col3
)
MERGE is used to sync two tables; it can insert, update, and delete records in the target table.
merge into table1 as target
using table2 as source
on target.Table1Col1 = source.Table2Col1
AND target.Table1Col2 = source.Table2Col2
AND target.Table1Col3 = source.Table2Col3
when not matched by target then
insert (Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5)
values (Table2Col1,
Table2Col2,
Table2Col3,
Table2Col4,
Table2Col5);
If the columns from Table2 are computed during the transfer, then in the NOT EXISTS case you can use a derived table in place of Table2. The same applies to the MERGE example: just put your query where Table2 is referenced.
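A rough sketch of that derived-table variant, under some assumptions: the table variables, the sample rows, and the literal values standing in for constant1, constant2, and constant5 are invented for illustration, and only three of the five columns are shown. The point is that the constants get column names inside the derived table, so NOT EXISTS has something to compare against Table1.

```sql
-- Hypothetical stand-ins for the question's tables (not the real schema).
DECLARE @Table1 TABLE (Table1Col1 int, Table1Col2 varchar(20), Table1Col3 varchar(20));
DECLARE @Table2 TABLE (Table2Col1 int, Condition1 int);

INSERT INTO @Table1 VALUES (1, 'constant1', 'constant2');   -- row that already exists
INSERT INTO @Table2 VALUES (1, 5), (2, 5), (3, 9);

INSERT INTO @Table1 (Table1Col1, Table1Col2, Table1Col3)
SELECT src.Col1, src.Col2, src.Col3
FROM (
    -- The derived table computes the constants once and names them.
    SELECT Table2Col1  AS Col1,
           'constant1' AS Col2,
           'constant2' AS Col3
    FROM @Table2
    WHERE Condition1 = 5          -- stands in for the question's filter conditions
) AS src
WHERE NOT EXISTS (
    SELECT 1
    FROM @Table1 AS target
    WHERE target.Table1Col1 = src.Col1
      AND target.Table1Col2 = src.Col2
      AND target.Table1Col3 = src.Col3
);

SELECT Table1Col1, Table1Col2, Table1Col3 FROM @Table1 ORDER BY Table1Col1;
```

With this sample data only the row with Table2Col1 = 2 is added: the row for 1 already exists in the target, and the row for 3 is filtered out by the WHERE condition.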

We have to check whether the data already exists in the table. For this, we can use an IF condition to avoid the duplicate insertion.
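A minimal sketch of the IF approach for a single row, assuming a hypothetical two-column table and made-up variable names. Note that it checks one candidate row at a time, so it fits row-by-row inserts rather than the set-based INSERT ... SELECT in the question, and under concurrent writers a check-then-insert like this can still race unless it is backed by a unique constraint or suitable locking.

```sql
-- Hypothetical target table and candidate row, for illustration only.
DECLARE @Table1 TABLE (Table1Col1 int, Table1Col2 varchar(20));
INSERT INTO @Table1 VALUES (1, 'constant1');

DECLARE @NewCol1 int = 2,
        @NewCol2 varchar(20) = 'constant1';

-- Insert the candidate row only when no identical row is present.
IF NOT EXISTS (SELECT 1
               FROM @Table1
               WHERE Table1Col1 = @NewCol1
                 AND Table1Col2 = @NewCol2)
BEGIN
    INSERT INTO @Table1 (Table1Col1, Table1Col2)
    VALUES (@NewCol1, @NewCol2);
END;

SELECT Table1Col1, Table1Col2 FROM @Table1 ORDER BY Table1Col1;
```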

Related

MERGE statement to update or insert rows into a table

My task is to insert or update rows in a table2. Table1 contains id's of all employees. That id matches the ID in the table2. Some of the employees in table2 already have the rows I need but some don't. Table2 doesn't contain the ID's of the employees that don't have those rows.
My task is to update the rows for the existing ID's and insert for the ones that don't have those rows.
I have tried the following statement:
MERGE INTO dbo.table2 AS TGT
USING (SELECT table1ID FROM dbo.table1) AS SRC
ON SRC.table1ID = TGT.table2ID
WHEN MATCHED
AND table2Code = 'ValueToInsertOrUpdateCode'
THEN
UPDATE
SET table2Value= 'ValueToInsertOrUpdateValue'
WHEN NOT MATCHED BY TARGET
THEN
INSERT (table2Code, table2ID, table2Value)
VALUES ('ValueToInsertOrUpdateCode', src.table1ID, 'ValueToInsertOrUpdateValue');
This currently only updates the rows that exist, but doesn't insert the rows for ID's that don't have existing rows.
Based on your comments, it sounds like you want the following, so that the WHEN NOT MATCHED BY TARGET branch is executed:
MERGE INTO dbo.table2 AS TGT
USING (SELECT table1ID FROM dbo.table1) AS SRC
ON (SRC.table1ID = TGT.table2ID AND table2Code = 'ValueToInsertOrUpdateCode') -- This is the difference
WHEN MATCHED
AND table2Code = 'ValueToInsertOrUpdateCode'
THEN
UPDATE
SET table2Value= 'ValueToInsertOrUpdateValue'
WHEN NOT MATCHED BY TARGET
THEN
INSERT (table2Code, table2ID, table2Value)
VALUES ('ValueToInsertOrUpdateCode', src.table1ID, 'ValueToInsertOrUpdateValue');
WHEN NOT MATCHED BY TARGET would not execute when SRC.table1ID = TGT.table2ID (i.e. they match).
Updating the ON clause to ON (SRC.table1ID = TGT.table2ID AND table2Code = 'ValueToInsertOrUpdateCode') will give you the inserts you are expecting.
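To make the effect of the ON clause concrete, here is a small sketch with temporary tables. The table names mirror the question, but the sample rows (including the 'SomeOtherCode' row) are assumptions made up for the demonstration.

```sql
-- Hypothetical sample data: employee 1 has a row for a different code, employee 2 has none.
CREATE TABLE #table1 (table1ID int);
CREATE TABLE #table2 (table2ID int, table2Code varchar(30), table2Value varchar(30));

INSERT INTO #table1 VALUES (1), (2);
INSERT INTO #table2 VALUES (1, 'SomeOtherCode', 'old');

MERGE INTO #table2 AS TGT
USING (SELECT table1ID FROM #table1) AS SRC
    ON  SRC.table1ID = TGT.table2ID
    AND TGT.table2Code = 'ValueToInsertOrUpdateCode'   -- code is part of the match
WHEN MATCHED THEN
    UPDATE SET table2Value = 'ValueToInsertOrUpdateValue'
WHEN NOT MATCHED BY TARGET THEN
    INSERT (table2Code, table2ID, table2Value)
    VALUES ('ValueToInsertOrUpdateCode', SRC.table1ID, 'ValueToInsertOrUpdateValue');

SELECT table2ID, table2Code, table2Value
FROM #table2
ORDER BY table2ID, table2Code;

DROP TABLE #table1;
DROP TABLE #table2;
```

Here both employees get a new 'ValueToInsertOrUpdateCode' row and the 'SomeOtherCode' row is left alone. With an ON clause that compares only the IDs, employee 1 would count as matched because of the 'SomeOtherCode' row, so neither branch would produce the row that is wanted.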
However you should probably not do this:
ON <merge_search_condition> Caution
It's important to specify only the columns from the target table to use for matching purposes. That is, specify columns from the target table that are compared to the corresponding column of the source table. Don't attempt to improve query performance by filtering out rows in the target table in the ON clause; for example, such as specifying AND NOT target_table.column_x = value. Doing so may return unexpected and incorrect results.
For this reason, and given what others have suggested, it would be safer to use separate UPDATE and INSERT statements.
I would, honestly, suggest avoiding the MERGE operator and doing an Upsert here instead. For your scenario, what you need is most likely the following:
SET XACT_ABORT ON;
BEGIN TRANSACTION;
UPDATE T2 WITH (UPDLOCK, SERIALIZABLE)
SET table2Value = 'ValueToInsertOrUpdateValue'
FROM dbo.Table2 T2
JOIN dbo.Table1 T1 ON T1.table1ID = T2.table2ID;
-- You could honestly use an EXISTS here, considering that you're updating the table
-- with a literal, rather than a value from the table Table1.
INSERT INTO dbo.Table2 (table2Code , table2ID, table2Value)
SELECT 'ValueToInsertOrUpdateCode',
T1.table1ID,
'ValueToInsertOrUpdateValue'
FROM dbo.Table1 T1
WHERE NOT EXISTS (SELECT 1
FROM dbo.Table2 T2
WHERE T2.table2ID = T1.table1ID);
COMMIT;

Improving insert query for SCD2 solution

I have two insert statements. The first query inserts a new row if the id doesn't exist in the target table. The second query inserts into the target table only if the hash value for the joined id is different (which indicates that the row has been updated in the source table) and the id in the source table is not null. These queries are meant for my SCD2 solution, which will be used to insert hundreds of thousands of rows. I'm trying not to use the MERGE statement, as a matter of practice.
In the column "Current", the value 1 indicates that the row is new and 0 indicates that the row has expired. I use this information later to expire rows in the target table with my update queries.
Besides indexing, is there a more efficient way to write my insert queries so that they resemble the SCD2 MERGE statement for inserting new/updated rows?
Query:
Query 1:
INSERT INTO TARGET
SELECT Name,Middlename,Age, 1 as current,Row_HashValue,id
from Source s
Where s.id not in (select id from TARGET) and s.id is not null
Query 2:
INSERT INTO TARGET
SELECT s.Name,s.Middlename,s.Age,1 as current ,s.Row_HashValue,s.id
FROM SOURCE s
LEFT JOIN TARGET t ON s.id = t.id
AND s.Row_HashValue = t.Row_HashValue
WHERE t.Row_HashValue IS NULL and s.ID IS NOT NULL
You can use WHERE NOT EXISTS, and have just one INSERT statement:
INSERT INTO TARGET
SELECT Name,Middlename,Age,1 as current ,Row_HashValue,id
FROM SOURCE s
WHERE NOT EXISTS (
SELECT 1
FROM TARGET t
WHERE s.id = t.id
AND s.Row_HashValue = t.Row_HashValue)
AND s.ID IS NOT NULL;
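A small sketch of why the single statement covers both original queries: a brand-new id has no (id, hash) pair in the target, and neither does a changed row. The table variables, sample rows, and the column name IsCurrent (used instead of "current" to sidestep the reserved word in T-SQL) are assumptions for illustration, and Middlename and Age are left out for brevity.

```sql
-- Hypothetical, trimmed-down versions of SOURCE and TARGET.
DECLARE @Source TABLE (id int, Name varchar(50), Row_HashValue char(4));
DECLARE @Target TABLE (id int, Name varchar(50), IsCurrent bit, Row_HashValue char(4));

INSERT INTO @Target VALUES (1, 'Ann', 1, 'aaaa'), (2, 'Bob', 1, 'bbbb');
INSERT INTO @Source VALUES
    (1,    'Ann',    'aaaa'),   -- unchanged: same id and hash, skipped
    (2,    'Bobby',  'bbb2'),   -- changed hash: inserted as a new version
    (3,    'Cid',    'cccc'),   -- new id: inserted
    (NULL, 'Nobody', 'nnnn');   -- NULL id: skipped

INSERT INTO @Target (id, Name, IsCurrent, Row_HashValue)
SELECT s.id, s.Name, 1, s.Row_HashValue
FROM @Source AS s
WHERE s.id IS NOT NULL
  AND NOT EXISTS (SELECT 1
                  FROM @Target AS t
                  WHERE t.id = s.id
                    AND t.Row_HashValue = s.Row_HashValue);

SELECT id, Name, IsCurrent, Row_HashValue
FROM @Target
ORDER BY id, Row_HashValue;
```

After the insert the target holds four rows: the two original ones plus the new version of id 2 and the new id 3. Expiring the old version of id 2 is still a separate UPDATE, as in the question.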

How to get data from joined tables when performing a merge?

I am trying to use a merge to a table.
What I am having trouble with is getting the matching name from the code that exists in the original table. I will put my code and explain further:
MERGE INTO ResultTable R
USING InitialTable IT
ON (false)
WHEN MATCHED THEN -- do some stuff
WHEN NOT MATCHED THEN
INSERT (PrimaryKey,..., ThingFromJoinedTable)
VALUES (Seq.NEXTVAL, ..., ??? );
So the Initial table has a foreign key and I want to get the matching value in the Joined table.
Does anyone have any idea how to do this? I have tried a nested select with a join, but it gives me a "single-row subquery returns more than one row" error.
Something like this:
MERGE INTO ResultTable R
USING ( SELECT it.this, it.that, third.this, third.that
FROM InitialTable it
JOIN ThirdTable third ON <your join criteria> ) SRC
/* depending on which columns you want for the join */
ON (r.col1 = src.col1 and r.col2 = src.col2)
WHEN MATCHED THEN -- do some stuff
/* depending on which columns you need to merge */
UPDATE SET
r.col4 = src.col4,
r.col5 = src.col5,
etc.
WHEN NOT MATCHED THEN
INSERT (PrimaryKey,..., colThis, colThat, ....)
VALUES (Seq.NEXTVAL, ..., src.colThis, src.colThat );

SQL: Add multiple rows from one table to another when no data for that date in new table

I'm trying to update medical data from one table to another after switching from one system to another. We have two tables, for simplicity I'll make this a simple example. There are many columns in these tables in reality (not just 5).
Table1:
name, date, var1, var2, var3
Table2:
name, date, var1a, var2a, var3a
I want to transfer data from Table 1 to Table 2 for any rows where there isn't previous data for that date, where var1 = var1a, etc (same columns with different names).
I was trying to do something with a loop, but realized that may not be necessary.
I had gotten this far but wasn't sure if this was OK:
UPDATE Table2 VALUES (date, var1a, var2a, var3a)
SELECT date, var1, var2, var3 FROM Table1
Is that correct syntax so far? Or do I need to map the variables to translate var1 into var1a, etc?
How do I add a check to make sure I don't overwrite any data already in Table2? I don't want to add data if there is already data for that date/name combination.
Thanks!
You can INSERT into TABLE2 all values from TABLE1 that do not already exist in Table2:
INSERT INTO Table2 (date, var1a, var2a, var3a)
SELECT date, var1, var2, var3
FROM Table1 t1
WHERE NOT EXISTS (SELECT 1 FROM Table2 t2 WHERE t2.date = t1.date)
Already existing values are specified by comparing the date column. You can add any other predicates in the SELECT subquery of the NOT EXISTS expression to suit your needs.
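For instance, since the question asks about the date/name combination, the check could compare both columns. A sketch with made-up sample rows and table variables standing in for the real tables (the real ones have many more columns):

```sql
-- Hypothetical stand-ins for Table1 and Table2.
DECLARE @Table1 TABLE (name varchar(50), [date] date, var1 int, var2 int, var3 int);
DECLARE @Table2 TABLE (name varchar(50), [date] date, var1a int, var2a int, var3a int);

INSERT INTO @Table1 VALUES
    ('Ann', '20200101', 1, 2, 3),
    ('Ann', '20200102', 4, 5, 6),
    ('Bob', '20200101', 7, 8, 9);
INSERT INTO @Table2 VALUES ('Ann', '20200101', 10, 20, 30);  -- already has data for this name/date

INSERT INTO @Table2 (name, [date], var1a, var2a, var3a)
SELECT t1.name, t1.[date], t1.var1, t1.var2, t1.var3
FROM @Table1 AS t1
WHERE NOT EXISTS (SELECT 1
                  FROM @Table2 AS t2
                  WHERE t2.name = t1.name        -- extra predicate: match on name as well
                    AND t2.[date] = t1.[date]);

SELECT name, [date], var1a, var2a, var3a
FROM @Table2
ORDER BY name, [date];
```

With this data the existing row for Ann on 2020-01-01 keeps its values, and the other two rows are copied across. Matching on date alone would have skipped Bob's row too, because some row already exists for that date.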
You could use an update with a join. You don't need to update the date column, since that's what you are using to find the matches between the two tables.
Either you generate a dynamic query based on the empty/null-valued columns, or you could do something like the below, which keeps the existing value in the column if there is one in table2 and otherwise puts in the corresponding value from table1.
The approach below requires less logic and is easier to implement, but it will produce IO equivalent to updating the entire table.
update tbl2
set var1a=isnull(var1a,var1)
, var2a=isnull(var2a,var2)
, var3a=isnull(var3a,var3)
from table1 tbl1
inner join table2 tbl2
on tbl1.name=tbl2.name
and tbl1.date=tbl2.date
Considerations:
Because this approach produces IO equivalent to updating all of table2, I would go with it only if the table is smallish.
If it's a big table, you should look into building specific query sets to reduce IO.
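This code is tested in Access. It will not run unchanged on SQL Server 2012, which does not accept the UPDATE ... JOIN ... SET form and cannot add rows through an UPDATE; there, an INSERT ... SELECT with NOT EXISTS does the same job.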
UPDATE Table2 RIGHT JOIN Table1 ON Table2.date = Table1.date
SET Table2.name = Table1.name, Table2.date = Table1.date, Table2.var1a = Table1.var1, Table2.var2a = Table1.var2, Table2.var3a = Table1.var3
WHERE (Table2.date Is Null);
Explanation: this uses a right join so that within the query all the data from Table1 is present and where there is a matched date for Table2 that data is present too. We then ignore all cases where there is any data for Table2 and update the query in all other cases - the update in fact inserts new data into Table2.

How to copy column values from one database to empty column in other database?

I have two databases: Alarm and TMP.
I have a table in Alarm with one empty column that contains only NULL values.
I also have a single-column table in TMP.
I want to copy the values of this single column into my table in the Alarm database.
What I have tried so far is:
update [Alarm].[dbo].[AlarmDetails] set Alarm_Message = (select * from [TMP].[dbo].[AlarmDetails$])
where 1=1
The error is
Subquery returned more than 1 value.
Please note this,
NOTE: There is no id column in source table. Only one table & one column, Alarm Message.
I know the cause of the error, but how should I modify my SQL?
Thank You.
Here's an example of copying a column:
update dst
set Alarm_Message = src.AlarmMessage
from Alarm.dbo.AlarmDetails dst
join TMP.dbo.AlarmDetails src
on dst.id = src.id
You did not specify how the tables are related, so I assumed they both have an id column.
You need something like this.
update t1
set
t1.<something1> = t2.<something2>
from
[Alarm].[dbo].[AlarmDetails] t1
join [TMP].[dbo].[AlarmDetails] t2 on (t1.<cols1> = t2.<cols2>)
UPDATE results SET results.platform_to_insert = (
SELECT correct_platform
FROM build
WHERE results.BuildID=build.BuildID LIMIT 1
);