SQL Server overwrite unique constraint violation

I have two files which I am importing via Node.js into SQL Server. The table has a unique key on the equity instrument identifier (ISIN).
data1.csv and data2.csv
I first import data1.csv and each row is inserted into the database. After this I import data2.csv (the values are again inserted into the database), which may contain the same ISINs, but its values take priority over those from the first file (there are not many of these overlapping ISINs, roughly 5 out of 1,000).
What can I do in SQL Server to overwrite the values when the unique constraint is violated? I understand that one option is to upload data2.csv first, but there are external constraints that do not allow me to do that.
Please tell me if additional information is required.

I would recommend a staging process for this:
1. Create a staging table with the same schema as your target table.
2. Before each load, delete all rows from the staging table (TRUNCATE TABLE is fine).
3. Upload the file into the staging table.
4. Load the data from staging into the final table. Here you can apply logic to insert only new rows and update existing ones; the MERGE statement is useful in a scenario like this (see the sketch below).
Repeat steps 2 to 4 for each source file.
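A minimal MERGE sketch for step 4, assuming hypothetical table names dbo.Instruments (the target, with the unique key on ISIN) and dbo.Instruments_Staging, and a single Price column to carry over:

-- Upsert: staging rows win over existing rows with the same ISIN.
MERGE dbo.Instruments AS tgt
USING dbo.Instruments_Staging AS src
    ON tgt.ISIN = src.ISIN
WHEN MATCHED THEN
    UPDATE SET tgt.Price = src.Price
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ISIN, Price)
    VALUES (src.ISIN, src.Price);

Running this after loading data1.csv and again after data2.csv gives the overwrite behaviour you described, since the second run updates the overlapping ISINs in place.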

Related

SQL Server constraint enforcement being violated for split seconds

I have an on-premises table of about 21 million rows with a primary key constraint, and when I search that table there are no duplicates. This table is in an OLTP application database that is constantly moving.
I have the exact same table in Azure with the same primary key constraint. This table is not an application table; it's just a copy of the on-premises one (the goal is to use it for ad hoc queries, as a source for other systems, etc.).
When I use Azure Data Factory to copy all columns from the on-premises table to the table in Azure, it returns a violation of the primary key constraint. No matter how many times I run this Data Factory pipeline, it comes back with a primary key violation for duplicate keys (though the offending keys are different each time).
So I dropped the primary key constraint in Azure and ran the pipeline again, and sure enough, duplicates exist.
Upon investigation, it appears that the on-premises application inserts a new record and then updates the old record to inactivate it. So for a fraction of a second there are two active rows, which ADF grabs and then tries to insert into the table in Azure, which of course fails because of duplicate primary keys.
To the best of my knowledge this shouldn't be possible: you can't insert a new row that violates the primary key constraint. But ADF seems to be grabbing all the data while some of those rows are mid-flight, where the insert has happened but the update to inactivate the old row hasn't happened yet.
For those who are curious, the insert and the update of the old row happen within less than a second of each other, typically 10-20 microseconds apart. I don't know how this is possible and I don't know how to fix it (I can't modify the application code). The on-premises database is SQL Server 2000 and the cloud side is an Azure SQL database.
Try the READPAST hint. It skips rows that are currently locked, so it should not select rows in a locking state:
SELECT * FROM yourtable WITH (READPAST);
Since you have created_date and updated_date columns, you can also select only rows older than 5 seconds to avoid the mid-flight duplicates:
SELECT *
FROM yourtable
WHERE created_date <= DATEADD(SECOND, -5, GETDATE())
  AND updated_date <= DATEADD(SECOND, -5, GETDATE());
Alternatively, enable fault tolerance in the Azure Data Factory pipeline.
When copying data from a source SQL database to a sink SQL database, a primary key may be defined in the sink but not in the source, so duplicated rows in the source cannot all be copied to the sink. With fault tolerance enabled, the copy activity copies only the first row of the source data into the sink; subsequent source rows that contain the duplicated primary key value are detected as incompatible and are skipped.
To configure this, set "enableSkipIncompatibleRow": true in the copy activity's JSON definition.
Please refer to: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance
If it is possible to modify your application, check for the primary key before insert or update using EXISTS().
Example:
IF EXISTS (SELECT 1 FROM Table_Name WHERE <primary key condition>)
BEGIN
    UPDATE Table_Name
    SET Col_Name = @value
    WHERE <primary key condition>;
END
ELSE
BEGIN
    INSERT INTO Table_Name (Col_Name1, Col_Name2)
    VALUES (@value1, @value2);
END
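Note that a separate EXISTS check followed by an INSERT leaves a small race window of its own: two concurrent sessions can both pass the check before either inserts. A minimal race-safe sketch, assuming a hypothetical table dbo.Table_Name keyed on an Id column:

BEGIN TRANSACTION;

-- UPDLOCK + HOLDLOCK keep the key range locked between the check and the write.
IF EXISTS (SELECT 1 FROM dbo.Table_Name WITH (UPDLOCK, HOLDLOCK) WHERE Id = @Id)
    UPDATE dbo.Table_Name SET Col_Name = @Value WHERE Id = @Id;
ELSE
    INSERT INTO dbo.Table_Name (Id, Col_Name) VALUES (@Id, @Value);

COMMIT TRANSACTION;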

Data Agent - SELECT from one table and insert into another

Is there any type of product where I can write a SQL statement to select from one table and then insert into another database (the other database is out in the cloud)? It also needs to check whether a record already exists and update the row if anything has changed. Then it will need to run every 10-30 minutes to pick up changes and new records.
The source database and the destination database have different schemas (if that matters). I've been looking, but it seems the only products out there are ones that just copy one table into another table with the same schema.

SSIS package check if record exist then update else insert

I am creating an SSIS import package for about 10 tables. I am still new to this, so I really appreciate any help I can get.
I need to compare my Excel source to these ~10 tables to see whether each record exists; if it exists, update it, otherwise insert it. I am struggling with how to check across the various tables, which all have auto-incremented primary keys. If a record doesn't exist, how can I insert it and make sure the other tables have their foreign keys (the auto-incremented primary keys of the parent tables) updated as well, so that the relationships of each record that has been split across so many tables stay intact?
My plan for the package:
Excel Source
Lookup transform
Data Conversion transform
Derived Column transform
Multicast
OLE DB Destination
Please advise on how I should go about this, and the order in which my transforms should run.
Hmm, OK. Firstly, get the Excel source into a SQL staging table (truncate it before loading), then consider using the SQL MERGE statement via an Execute SQL Task to merge the data into the end tables. This will let you insert where the record doesn't exist and update where it does. You may need to look up the foreign keys before running the merge; see the sketch below for capturing newly generated keys. Are you able to post details of the 10 tables and the import CSV?
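One way to keep the relationships intact is to capture the identity values generated for newly inserted parent rows and reuse them when loading the child tables. A minimal sketch, assuming hypothetical tables dbo.Parent (identity column ParentId, natural key BusinessKey) and dbo.Child, loaded from a staging table dbo.Staging:

-- Collect the identity values assigned to newly inserted parents.
DECLARE @NewParents TABLE (ParentId INT, BusinessKey NVARCHAR(50));

MERGE dbo.Parent AS tgt
USING dbo.Staging AS src
    ON tgt.BusinessKey = src.BusinessKey
WHEN NOT MATCHED BY TARGET THEN
    INSERT (BusinessKey, Name)
    VALUES (src.BusinessKey, src.Name)
OUTPUT inserted.ParentId, inserted.BusinessKey
    INTO @NewParents (ParentId, BusinessKey);

-- Child rows then join on the natural key to pick up the generated ParentId.
-- (Children of parents that already existed would be looked up in dbo.Parent directly.)
INSERT INTO dbo.Child (ParentId, Detail)
SELECT np.ParentId, src.Detail
FROM dbo.Staging AS src
JOIN @NewParents AS np
    ON np.BusinessKey = src.BusinessKey;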

How to update a table copy in another database

I have two identical databases, one for development (DEV) and one for production (PROD), both SQL Server 2008.
I have updated the contents of a certain table in DEV and now I want to sync the corresponding table in PROD.
I have not changed the table schema, only some of the data inside the table (I have both changed existing rows and added some new rows).
How can I easily transfer the changes in DEV to the corresponding table in PROD?
Note that the values in the automatic identity column might not match exactly between the two tables. However, I know that I have only made changes to rows having the same value in another column.
Martin
If you don't want to use replication, you can create UPDATE, INSERT, and DELETE triggers on the DEV database table and update PROD from those triggers.
Alternatively, you can create a view of the DEV database table in the PROD database. A one-off sync keyed on your non-identity column also works; see the sketch below.
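Since the identity values may differ, the sync can join on the other column instead. A minimal sketch, assuming both databases live on the same server, a hypothetical table dbo.Items in each, and a business key column Code that identifies matching rows:

-- Update rows that already exist in PROD with the DEV values.
UPDATE prod
SET    prod.Name  = dev.Name,
       prod.Price = dev.Price
FROM   PROD.dbo.Items AS prod
JOIN   DEV.dbo.Items  AS dev
    ON dev.Code = prod.Code;

-- Insert DEV rows that PROD does not have yet (PROD assigns its own identity).
INSERT INTO PROD.dbo.Items (Code, Name, Price)
SELECT dev.Code, dev.Name, dev.Price
FROM   DEV.dbo.Items AS dev
WHERE  NOT EXISTS (SELECT 1 FROM PROD.dbo.Items AS p WHERE p.Code = dev.Code);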

Import Export Wizard keep Identity, No Alter Table permissions

Basically, I am transferring data from one database table (A) to another database table (B).
Both databases have data, except A gets updated daily and B needs to be updated with the current data in A. I would like to keep the identity column the same in both.
I tried to run the wizard with "delete previous data" and "keep identity" checked, but I get an error saying I don't have permission to alter the table, so I am thinking that "delete previous data" truncates the table, correct?
I then tried to use "append table", but that complains about overwriting rows with the same identity value. Is there a way to ignore previous entries and only insert the new ones?
Sure: in the SELECT statement where you get data from table A, just select the rows whose A.ID does not already exist in B.
This functionality might work better with a trigger on table A or with transactional replication.
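A minimal sketch of that filtered copy, assuming hypothetical database names DatabaseA and DatabaseB on the same server and a shared ID identity column:

-- Allow explicit values in B's identity column so the IDs stay aligned with A.
SET IDENTITY_INSERT DatabaseB.dbo.TableB ON;

INSERT INTO DatabaseB.dbo.TableB (ID, Col1, Col2)
SELECT a.ID, a.Col1, a.Col2
FROM DatabaseA.dbo.TableA AS a
WHERE NOT EXISTS (SELECT 1 FROM DatabaseB.dbo.TableB AS b WHERE b.ID = a.ID);

SET IDENTITY_INSERT DatabaseB.dbo.TableB OFF;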