I have a SQL Server 2008 many-to-many relationship table (Assets) with two columns:
AssetId (PK, FK, uniqueidentifier, not null)
AssetCategoryId (PK, FK, int, not null)
In my project, I need to take rows from this table, and insert them into a replicated database periodically. So, I have two databases that are exactly the same (constraints included).
In order to "copy" from one database to the other, I use a MERGE statement with a temp table. I insert up to 50 records into the temp table, then merge the temp table with the Assets table I am copying into as follows:
CREATE TABLE #Assets (AssetId UniqueIdentifier, AssetCategoryId Int);
INSERT INTO #Assets (AssetId, AssetCategoryId) VALUES ('ed05bac3-7a92-46aa-8822-2d882b137597', 44), ('dc5e3082-e2eb-4bdf-a640-94e0f59411ed', 22) ... ;
MERGE INTO Assets WITH (HOLDLOCK) AS Target
USING #Assets AS Source
ON Target.AssetId = Source.AssetId AND Target.AssetCategoryId = Source.AssetCategoryId
WHEN MATCHED THEN
UPDATE SET ...
WHEN NOT MATCHED BY Target THEN
INSERT (AssetId,AssetCategoryId) VALUES (Source.AssetId,Source.AssetCategoryId);
This works great, for the most part. However, once in a while, I get the error:
Violation of PRIMARY KEY constraint 'PK_Assets'. Cannot insert
duplicate key in object 'dbo.Assets'. The duplicate key value is
(dc5e3082-e2eb-4bdf-a640-94e0f59411ed, 22). The statement has been
terminated.
When I check in the Assets table, no such record exists... so I am confused how I would be inserting a duplicate key.
Any idea what is going on here?
UPDATE
When testing, it runs successfully 6 times, inserting 300 rows. On the 7th try, it always gives the same error shown above. Furthermore, when I INSERT (dc5e3082-e2eb-4bdf-a640-94e0f59411ed, 22) by itself, it works fine. My test is then able to continue and insert the remaining rows with no errors.
You need to add a HOLDLOCK on your MERGE statement. Try the following:
MERGE INTO Assets WITH (HOLDLOCK) AS Target
...
This avoids the race condition that you are running into. See more info here
EDIT
Based on your update, the only other thing I can think of is that your temp table might have a duplicate record in it. Can you double check?
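One way to do that check is a quick grouping query against the temp table from the question. This is only a sketch; the `Occurrences` alias is made up for illustration. MERGE decides which source rows are "not matched" against the target as it stood when the statement started, so two identical source rows would both be inserted and the second would violate the primary key.

```sql
-- Any row returned here is a key pair that appears more than once in #Assets
-- and would make the MERGE attempt a duplicate insert.
SELECT AssetId, AssetCategoryId, COUNT(*) AS Occurrences
FROM #Assets
GROUP BY AssetId, AssetCategoryId
HAVING COUNT(*) > 1;
```

If this returns rows, de-duplicating the batch before the MERGE (for example with SELECT DISTINCT when filling the temp table) should make the error go away.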
I have imported some data into a temp SQL table from an Excel file, and then tried to insert all rows into two related tables. My database has Events and Actors tables with a many-to-many relationship. The actors are already added. I want to add all events to the Events table and then add the relation (ActorId) for each event to the EventActors table.
(dbo.TempTable has Title, ActorId columns)
insert into dbo.Event (Title)
Select Title
From dbo.TempTable
insert into dbo.EventActor (EventId, ActorId)
Select SCOPE_IDENTITY(), ActorId --SCOPE_IDENTITY() is for EventId
From dbo.TempTable
When this code ran, all events were inserted into Events, but the relations were not inserted into EventActors because of a foreign key error.
I think a loop is needed, but I am confused. I don't want to write C# code for this. I suspect there is a simple but advanced trick for this in SQL Server. Thanks for your help.
Use the output clause to capture the new IDs, with a merge statement to allow capture from both source and destination tables.
Having captured this information, join it back to the temp table for the second insert.
Note you need a unique id per row, and this assumes 1 row in the temp table creates 1 row in both the Event and the EventActor tables.
-- Ensure every row has a unique id - could be part of the table create
ALTER TABLE dbo.TempTable ADD id INT IDENTITY(1,1);
-- Run the ALTER in its own batch so the new column is visible to the statements below
GO
-- Create table variable for storing the new IDs in
DECLARE @NewId TABLE (id INT, EventId INT);
-- Use MERGE to insert with OUTPUT so that all tables involved are accessible,
-- as INSERT with OUTPUT only allows access to columns in the destination table
MERGE INTO dbo.[Event] AS Target
USING dbo.TempTable AS Source
ON 1 = 0 -- Force an insert regardless
WHEN NOT MATCHED THEN
INSERT (Title)
VALUES (Source.Title)
OUTPUT Source.id, Inserted.EventId
INTO @NewId (id, EventId);
-- Insert using the new IDs just created, matching each temp row to its captured EventId
INSERT INTO dbo.EventActor (EventId, ActorId)
SELECT I.EventId, T.ActorId
FROM dbo.TempTable T
INNER JOIN @NewId I ON T.id = I.id;
I have two identically-shaped tables: an "active" table, and an "archive" table. As part of a recurring process, I want to move rows from the active to the archive, and I'd like to do it using MERGE so that my process doesn't fail with a Violation of PRIMARY KEY constraint '...'. Cannot insert duplicate key....
I had thought to do it like so:
MERGE INTO Archive
USING (DELETE FROM Active OUTPUT DELETED.* WHERE Active.Id IN (...)) AS Active
ON (Active.Id = Archive.Id)
WHEN NOT MATCHED THEN
INSERT (Id, ...)
VALUES (Active.Id, ...);
but then learned that A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed in the USING clause of a MERGE statement.
The only other thing that occurs to me is to use a temp table:
SELECT TOP 0 * INTO #temp FROM Active
DELETE FROM Active
OUTPUT DELETED.* INTO #temp
WHERE Id IN (...)
MERGE INTO Archive
USING #temp AS Active
ON (Active.Id = Archive.Id)
WHEN NOT MATCHED THEN
INSERT (Id, ...)
VALUES (Active.Id, ...);
and it works, but it feels... unsatisfying.
Is there a more concise or direct way to achieve this "safe row movement"?
We're using Oracle 11g at the moment without Enterprise (not an option unfortunately).
Let's say I have a table with a constant number of rows (say, 2000). Let's call it data_source.
I want to insert some columns of this table into another table, data_dest. I'm using all the records from the source table.
In other words, I would like to insert this set
select data_source.col1, data_source.col2, ... data_source.colN
from data_source
Which would be faster in this case:
insert into data_dest
select data_source.col1, data_source.col2, ... data_source.colN
from data_source
OR
merge into data_dest dd
using data_source ds
on (dd.col1 = ds.col1) --Let's assume that these are the matching columns
when not matched then
insert (col1,col2...)
values(ds.col1,ds.col2...)
EDIT 1:
We can assume there are no primary keys violations from the insert.
In other words we can assume that insert will successfully insert all of the rows and so will merge.
The insert is very likely faster because it does not require a join on the two tables.
That said, the two queries are not equivalent. Assuming that col1 is defined as the primary key, the insert will throw an error if data_source contains a value in col1 that is already in data_dest. Because the merge compares the data in the two tables and then inserts only the rows that don't already exist, it won't throw a primary key violation for rows that are already in data_dest.
An insert that would be equivalent to the merge would be:
INSERT INTO data_dest
SELECT data_source.col1, data_source.col2, ... data_source.colN
FROM data_source
WHERE NOT EXISTS
(SELECT *
FROM data_dest
WHERE data_source.col1 = data_dest.col1)
It's likely that the plan for this insert will be very similar (if not identical) to the plan for the merge and the performance would be indistinguishable.
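To check this rather than guess, you could compare the two plans with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. This is a rough sketch: the two-column list (col1, col2) is a stand-in for the elided col1 ... colN in the question, so adjust it to the real columns.

```sql
-- Plan for the INSERT ... WHERE NOT EXISTS variant
EXPLAIN PLAN FOR
INSERT INTO data_dest (col1, col2)
SELECT ds.col1, ds.col2
FROM data_source ds
WHERE NOT EXISTS
  (SELECT 1
   FROM data_dest dd
   WHERE dd.col1 = ds.col1);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the MERGE variant
EXPLAIN PLAN FOR
MERGE INTO data_dest dd
USING data_source ds
ON (dd.col1 = ds.col1)
WHEN NOT MATCHED THEN
  INSERT (col1, col2)
  VALUES (ds.col1, ds.col2);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

EXPLAIN PLAN only records the optimizer's plan and does not execute the statement, so it is safe to run both against real tables.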
I'm trying to insert data from one table into another table,
but I'm getting the following error:
Msg 2627, Level 14, State 1, Line 4
Violation of PRIMARY KEY constraint 'PK___4__10'. Cannot insert duplicate key in object 'dbo.tbl_Diagnosis_Table'.
Appears to be a duplicate primary key between both tables. Both tables have the same fields and data types, different data. What query can resolve this issue?
INSERT INTO tbl_Diagnosis_Table
SELECT *
FROM tbl_Holding_Diagnosis_Table
INSERT INTO tbl_Diagnosis_Table(Code, [Description], Comments, Discontinued)
(SELECT
Code, [Description], Comments, Discontinued
FROM
tbl_Holding_Diagnosis_Table);
Assuming Code is the primary key this should eliminate the duplicate rows from the insert:
INSERT INTO tbl_Diagnosis_Table (Code, [Description], Comments, Discontinued)
SELECT Code, [Description], Comments, Discontinued
FROM tbl_Holding_Diagnosis_Table
WHERE tbl_Holding_Diagnosis_Table.Code NOT IN
(SELECT Code FROM tbl_Diagnosis_Table)
If the primary key is some other column, or a composite key, you might need to use a join instead.
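As a sketch of that variant, here is the same insert written with NOT EXISTS instead of NOT IN. The commented-out `OtherKeyColumn` is purely hypothetical; it only marks where additional columns of a composite key would go.

```sql
INSERT INTO tbl_Diagnosis_Table (Code, [Description], Comments, Discontinued)
SELECT h.Code, h.[Description], h.Comments, h.Discontinued
FROM tbl_Holding_Diagnosis_Table AS h
WHERE NOT EXISTS
    (SELECT 1
     FROM tbl_Diagnosis_Table AS d
     WHERE d.Code = h.Code
       -- AND d.OtherKeyColumn = h.OtherKeyColumn  -- hypothetical: one line per extra key column
    );
```

NOT EXISTS also behaves predictably when the compared column can contain NULLs, which NOT IN does not. Note that this only filters rows already present in the target; duplicates inside the holding table itself would still collide with each other.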
You might want to look at the MERGE statement if you want to update existing rows and only insert new.
You need a WHERE with an IN clause to filter the records to insert, but first you need to know which fields form the primary key.
If what you're saying is correct, i.e. all the values are unique, that leaves only one option. If there is an identity column in tbl_Diagnosis_Table, make sure you set IDENTITY_INSERT to ON for this table and provide the values explicitly in the SELECT. It is possible that the seed and increment were reset in the past. If the values turn out not to be unique, you have to use a WHERE clause as suggested by others.
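To find out whether the table has an identity column at all, and where its counter currently stands, something along these lines should work. It is only a sketch and assumes the table lives in the dbo schema, as the error message suggests.

```sql
-- List any identity column on the target table with its seed, increment and last value.
SELECT name, seed_value, increment_value, last_value
FROM sys.identity_columns
WHERE object_id = OBJECT_ID(N'dbo.tbl_Diagnosis_Table');

-- Report the current identity value and the current maximum value in the column.
-- NORESEED means nothing is changed.
DBCC CHECKIDENT ('dbo.tbl_Diagnosis_Table', NORESEED);
```

If the first query returns no rows, the table has no identity column and IDENTITY_INSERT is not relevant.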
I was going to suggest using a Merge query to do insert or update until I noticed the two inserts in the sample code both do the same insert. The error also says the error is on line 4 which is where the second insert occurs. If the two inserts aren't two examples of the problematic code, then the resolution may be as simple as removing one of the inserts.
Otherwise the other answers are correct, the duplicate rows need to be filtered and the IDENTITY_INSERT has to be turned on for the table.
SET IDENTITY_INSERT tbl_Diagnosis_Table ON -- if it is necessary to have the same primary key
MERGE tbl_Diagnosis_Table AS target
USING (SELECT Code, Description, Comments, Discontinued FROM tbl_Holding_Diagnosis_Table) AS source (Code, Description, Comments, Discontinued)
ON (target.Code = source.Code)
WHEN MATCHED THEN
UPDATE SET Description = source.Description,
Comments = source.Comments,
Discontinued = source.Discontinued
WHEN NOT MATCHED THEN
INSERT (Code, Description, Comments, Discontinued)
VALUES (source.Code, source.Description, source.Comments, source.Discontinued)
; -- MERGE has no END keyword; the statement must be terminated with a semicolon
SET IDENTITY_INSERT tbl_Diagnosis_Table OFF
Do your homework though. There are some very good reasons not to use Merge.
Use Caution with SQL Server's MERGE Statement
Indexed views and Merge
Optimizing Merge Statement Performance
I ran 12 lakh (1.2 million) INSERT commands against a single table,
but after some time the query terminated. How can I find the last
inserted record?
a) The table doesn't have a created-date column.
b) I cannot use an ORDER BY clause because the primary key values are manually generated.
c) Last() is not a built-in function in MS SQL.
Or is there any way to find the last executed query?
There must be some way, but I am not able to figure it out.
The table contains only a primary key constraint, no other constraints.
As requested in the comments, here is a quick and dirty manual solution, assuming you've got the list of INSERT statements (or the corresponding data) in the same sequence as the issued INSERTs. For this example I assume 1 million records.
INSERT ... VALUES (1, ...)
...
INSERT ... VALUES (250000, ...)
...
INSERT ... VALUES (500000, ...)
...
INSERT ... VALUES (750000, ...)
...
INSERT ... VALUES (1000000, ...)
You just have to find the last PK that has been inserted. Luckily, in this case there is one. So you start a manual binary search in the table by issuing
SELECT pk FROM myTable WHERE pk = 500000
If you get a row back, you know it got that far. Continue checking with pk = 750000. If that one is there, check again with pk = 875000. If 750000 is not there, then the INSERTs must have stopped earlier, so check pk = 625000. In this case the process stops after about 20 steps.
It's just plain manual divide and conquer.
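The same search can be automated with a small loop. This is a sketch under strong assumptions: the inserts were issued in ascending pk order, the pk values are the integers 1 to 1000000 as in the example above, and `myTable` and `pk` are the placeholder names already used there. If your keys are not sequential integers, you would search over the positions in your list of INSERT statements instead.

```sql
DECLARE @lo INT = 1, @hi INT = 1000000, @mid INT;

-- Invariant: every pk below @lo is present, every pk above @hi is absent.
WHILE @lo <= @hi
BEGIN
    SET @mid = (@lo + @hi) / 2;

    IF EXISTS (SELECT 1 FROM myTable WHERE pk = @mid)
        SET @lo = @mid + 1;   -- @mid was inserted, so the last row is at @mid or later
    ELSE
        SET @hi = @mid - 1;   -- @mid is missing, so the inserts stopped before it
END

-- @hi now holds the last pk that was inserted (0 if nothing was inserted).
SELECT @hi AS LastInsertedPk;
```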
There is a way.
Unfortunately, you have to set this up in advance for it to help you.
So if, by any chance, you still have the primary keys you inserted at hand, go ahead and delete all rows that have those keys:
DELETE FROM tableName WHERE ID IN (id1, id2, ...., idn)
Then you enable Change Data Capture for your database (have the db already selected):
EXEC sys.sp_cdc_enable_db;
Now you also need to enable Change Data Capture for that table. In an example that I tried, I could just run:
EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'tableName', @role_name = NULL;
Now you are almost set up! Check your system services and verify that SQL Server Agent is running for your DBMS; if it is not, capturing will not happen.
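Before re-running the inserts, it may be worth confirming that both enable steps took effect. A minimal check, using the same placeholder table name `tableName` as above:

```sql
-- 1 in is_cdc_enabled means Change Data Capture is on for the current database.
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = DB_NAME();

-- 1 in is_tracked_by_cdc means the table has a capture instance.
SELECT s.name AS schema_name, t.name AS table_name, t.is_tracked_by_cdc
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE t.name = N'tableName';
```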
Now when you insert something into your table you can select data changes from a new table called [cdc].[dbo_tableName_CT]:
SELECT [__$start_lsn]
,[__$end_lsn]
,[__$seqval]
,[__$operation]
,[__$update_mask]
,[ID]
,[Value]
FROM [cdc].[dbo_tableName_CT]
GO
You can order by __$seqval, which should give you the order in which the rows were inserted.
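For the original goal of finding the last inserted record, a query along these lines should do it. It is a sketch that assumes the capture table and the ID and Value columns from the example above.

```sql
-- Most recent captured insert: newest transaction first, then newest change within it.
SELECT TOP (1) [ID], [Value], [__$start_lsn], [__$seqval]
FROM [cdc].[dbo_tableName_CT]
WHERE [__$operation] = 2      -- 2 = insert (1 = delete, 3/4 = update before/after)
ORDER BY [__$start_lsn] DESC, [__$seqval] DESC;
```

Sorting by __$start_lsn first orders the changes across transactions; __$seqval then orders the changes within a single transaction.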
NOTE: this feature does not appear to be available in SQL Server Express.