SQL error - unique constraint

I have a data migration script, data_migration.sql. Its contents are:
insert into table1 select * from old_schema.table1;
commit;
insert into table2 select * from old_schema.table2;
commit;
And table1 has the pk_productname constraint. When I execute the script
SQL> @"data_migration.sql"
I get a unique constraint (pk_productname) violation. But when I execute the individual SQL statements I don't get any error. What is the reason behind this, and how can I resolve it?

The failure of the unique constraint means you are attempting to insert one or more records whose primary key columns collide, either with rows already in the table or with each other.
If it happens when you run a script but not when you run the individual statements then there must be a bug in your script. Without seeing the script it is impossible for us to be sure what that bug is, but the most likely thing is you are somehow running the same statement twice.
Another possible cause is that the constraint is deferred. This means it is not enforced until the end of the transaction. So the INSERT statement would appear to succeed if you run it without issuing the subsequent COMMIT.
It is common to run data migrations with constraints disabled, then re-enable them afterwards using an EXCEPTIONS table. This makes it easier to investigate problems.
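For Oracle (which the SQL> prompt suggests), that pattern might look like this sketch, reusing table1 and pk_productname from the question; the EXCEPTIONS table is created by the Oracle-supplied script utlexcpt.sql:

```sql
-- Disable the constraint before the bulk load.
ALTER TABLE table1 DISABLE CONSTRAINT pk_productname;

-- ... run the migration inserts here ...

-- Re-enabling still raises an error if duplicates exist, but it also
-- logs the ROWID of every violating row into EXCEPTIONS instead of
-- stopping silently at the first bad row.
ALTER TABLE table1 ENABLE CONSTRAINT pk_productname
  EXCEPTIONS INTO exceptions;

-- Inspect the offending rows:
SELECT t.*
FROM table1 t
WHERE t.ROWID IN (SELECT row_id FROM exceptions);
```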

Related

Get all constraint errors when inserting data from another table

I have a staging table without any constraints in my Azure SQL database (Azure SQL database 12.0.2000.8). I want to insert the data from the Staging table into the "real" table on which multiple constraints are set. When inserting the data, I use a statement of the kind
INSERT INTO <someTable> SELECT <columns> FROM StagingTable;
Now I only get the first error when violating some constraints. However, for my use case, it is important to get all violations, so they can be resolved altogether.
I have tried using TRY...CATCH mechanisms; however, this throws on the first error and runs the catch clause, but it does not continue with the rest of the data. Note that the correct data without violations should not be inserted either, so the whole insert statement can be rolled back on one error; however, I want to see all violations so I can correct them all without having to run the insert statement multiple times.
EDIT:
The types of constraints that need to be checked are foreign key constraints, NOT NULL constraints, duplicate keys. No casting is done, so no need to check for conversions.
There are a couple of options:
If you want row-level information, you have to use a cursor or WHILE loop, attempt to insert each row inside a TRY...CATCH block, and log any error you get.
Alternatively, create another table similar to the main table (say, MainCheckTable) with all the constraints, disable those constraints, and load the data.
Now you can leverage DBCC CHECKCONSTRAINTS to see all the constraint violations at once.
USE DBName;
DBCC CHECKCONSTRAINTS(MainCheckTable) WITH ALL_CONSTRAINTS;
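The first, row-by-row option might be sketched like this; StagingTable, MainTable, and the LoadErrors logging table are placeholder names, not from the question:

```sql
DECLARE @Id INT;
DECLARE staging_cursor CURSOR FOR SELECT Id FROM StagingTable;

OPEN staging_cursor;
FETCH NEXT FROM staging_cursor INTO @Id;
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        INSERT INTO MainTable (Id, Name)
        SELECT Id, Name FROM StagingTable WHERE Id = @Id;
    END TRY
    BEGIN CATCH
        -- log the violation instead of aborting the whole load
        INSERT INTO LoadErrors (Id, ErrorMessage)
        VALUES (@Id, ERROR_MESSAGE());
    END CATCH;
    FETCH NEXT FROM staging_cursor INTO @Id;
END;
CLOSE staging_cursor;
DEALLOCATE staging_cursor;
```

Since the question requires that no good rows survive when any row fails, you could wrap the loop in a transaction and roll it back at the end if LoadErrors received any entries, keeping only the error log.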
First, don't look at your primary table(s). Look at the related tables, e.g. lookups, and populate these first. Once you have populated the related tables (i.e. satisfied all related constraints), then add the data.
You need to work backwards from the least constrained tables to the most constrained if that makes sense.
You should check that your related tables have the required reference values/fields that you intend to insert. This is easy to do, since you already have a staging table.

postgresql: \copy method enter valid entries and discard exceptions

When entering the following command:
\copy mmcompany from '<path>/mmcompany.txt' delimiter ',' csv;
I get the following error:
ERROR: duplicate key value violates unique constraint "mmcompany_phonenumber_key"
I understand why it's happening, but how do I execute the command in a way that valid entries will be inserted and ones that create an error will be discarded?
The reason PostgreSQL doesn't do this is related to how it implements constraints and validation. When a constraint fails it causes a transaction abort. The transaction is in an unclean state and cannot be resumed.
It is possible to create a new subtransaction for each row but this is very slow and defeats the purpose of using COPY in the first place, so it isn't supported by PostgreSQL in COPY at this time. You can do it yourself in PL/PgSQL with a BEGIN ... EXCEPTION block inside a LOOP over a select from the data copied into a temporary table. This works fairly well but can be slow.
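A minimal sketch of that PL/pgSQL approach, assuming the file was first copied into a staging table named staging_mmcompany with the same row type as mmcompany (both names illustrative):

```sql
DO $$
DECLARE
    r mmcompany%ROWTYPE;
BEGIN
    FOR r IN SELECT * FROM staging_mmcompany LOOP
        BEGIN
            -- each BEGIN...EXCEPTION block is its own subtransaction,
            -- so one duplicate does not abort the whole load
            INSERT INTO mmcompany VALUES (r.*);
        EXCEPTION WHEN unique_violation THEN
            RAISE NOTICE 'skipping duplicate row';
        END;
    END LOOP;
END;
$$;
```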
It's better, if possible, to use SQL to check the constraints before doing any insert that violates them. That way you can just:
CREATE TEMPORARY TABLE stagingtable(...);
\copy stagingtable FROM 'somefile.csv'
INSERT INTO realtable
SELECT * FROM stagingtable
WHERE check_constraints_here;
Do keep concurrency issues in mind though. If you're trying to do a merge/upsert via COPY you must LOCK TABLE realtable; at the start of your transaction or you will still have the potential for errors. It looks like that's what you're trying to do - a copy if not exists. If so, skipping errors is absolutely the wrong approach. See:
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
Insert, on duplicate update in PostgreSQL?
Postgresql - Clean way to insert records if they don't exist, update if they do
Can COPY be used with a function?
Postgresql csv importation that skips rows
... this is a much-discussed issue.
One way to handle the constraint violations is to define triggers on the target table to handle the errors. This is not ideal as there can still be race conditions (if concurrently loading), and triggers have pretty high overhead.
Another method: COPY into a staging table and load the data into the target table using SQL with some handling to skip existing entries.
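On PostgreSQL 9.5 or later (newer than much of this discussion), that "skip existing entries" step can be a single statement; staging_mmcompany is an assumed staging-table name:

```sql
-- Insert rows from the staging table, silently skipping any row whose
-- unique key already exists in the target table.
INSERT INTO mmcompany
SELECT * FROM staging_mmcompany
ON CONFLICT DO NOTHING;
```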
Another useful method is to use pgloader.

sql server, composite keys - ignoring duplicate

Is there a way to prevent SQL Server from throwing an error when I try to save a record that already exists? I've got a composite-key table for a many-to-many relationship that has only the two values. When I update a model from my application, it tries to save all records, and the records that already exist throw an error: Cannot insert duplicate key. Is there a way of having the database ignore these, or do I have to handle it in the application?
You are calling an INSERT and trying to add duplicate keys. This error is by design, and essential: the DB is throwing an exception for an exceptional and erroneous condition.
If you are, instead, trying to perform an "upsert" you may need to use a stored procedure or use the MERGE syntax.
If, instead, you don't want to UPDATE but just to ignore rows already in the table, then you simply need to add an exclusion to your INSERT statement, such as
....
WHERE
table.Key <> inserting.Key
Try something like this with your insert statement.
insert into foo (x,y)
select @x, @y
except
select x,y from foo
This will add a record to foo, ONLY if it is not already in the table.
You could try creating your index with the IGNORE_DUP_KEY option so that you only get a warning when you have duplicate keys rather than a true error.
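A hypothetical example of that option; the table and column names are made up:

```sql
-- The composite primary key is built with IGNORE_DUP_KEY, so duplicate
-- inserts are silently discarded with only a warning.
CREATE TABLE ModelWidget (
    ModelId  INT NOT NULL,
    WidgetId INT NOT NULL,
    PRIMARY KEY (ModelId, WidgetId)
        WITH (IGNORE_DUP_KEY = ON)
);

INSERT INTO ModelWidget (ModelId, WidgetId) VALUES (1, 2);
-- The repeat insert reports "Duplicate key was ignored." and affects 0 rows:
INSERT INTO ModelWidget (ModelId, WidgetId) VALUES (1, 2);
```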
The other option, and possibly the better one, is to use the MERGE statement rather than INSERT. The MERGE statement lets you do inserts, updates, and deletes all in one statement, and it sounds like it should work out well for what you are trying to do.
Last but not least, as you said fix it in your app and only insert the rows that need to be added.

INSERT INTO .. SELECT .. unique constraint violation

I'm running a stored procedure that selects values from my temp table and inserts them into the database like so:
INSERT INTO emails (EmailAddress)
SELECT DISTINCT eit.EmailAddress
FROM #EmailInfoTemp eit
LEFT JOIN emails ea
    ON eit.EmailAddress = ea.EmailAddress
WHERE ea.EmailAddressID IS NULL
On rare occasions (roughly once every couple of hours on a server that handles thousands of requests a minute), I then receive a unique constraint error, "Violation of UNIQUE KEY constraint...", on an index on the EmailAddress column.
I can confirm that I am not passing in duplicate values. Even if I was, it should be caught by the DISTINCT.
- SQL Server 2008
- Stored proc, not using transactions, called via a JDBC CallableStatement
Could it happen that between the SELECT and the ensuing INSERT, there was another call to the same or a different stored proc that completed an INSERT with similar data? If so, what would be the best way to prevent that?
Some ideas: We have many duplicate instances of "clients" who communicate with this one SQL Server at once in production, so my first reaction was a concurrency issue, but I can't seem to replicate it myself. That's the best guess I had, but it's gone nowhere so far. This does not happen on our staging environment where the load is insignificant compared to the production environment. That was the main reason I started looking into concurrency issues.
The error is probably caused by two sessions executing an insert at the same time.
You can make your SQL code safer by using MERGE. As Aaron Bertrand's comment says (thanks!), you have to include a with (holdlock) hint to make merge really safe.
; merge emails e with (holdlock)
using #EmailInfoTemp eit
on e.EmailAddress = eit.EmailAddress
when not matched then insert
(EmailAddress) values (eit.EmailAddress)
The merge statement will take appropriate locks to ensure that no other session can sneak in between its "not matched" check and the "insert".
If you can't use merge, you could solve the problem client-side. Make sure that no two inserts are running at the same time. This is typically easy to do with a mutex or other synchronization construct.
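If serialising inside the database is acceptable, one alternative to a client-side mutex (not mentioned in the answer above) is an application lock via sp_getapplock, which guarantees only one session runs the insert at a time; the lock name is arbitrary:

```sql
BEGIN TRANSACTION;
-- Blocks until any other session holding the same named lock commits.
EXEC sp_getapplock @Resource = 'emails_insert', @LockMode = 'Exclusive';

INSERT INTO emails (EmailAddress)
SELECT DISTINCT eit.EmailAddress
FROM #EmailInfoTemp eit
LEFT JOIN emails ea ON eit.EmailAddress = ea.EmailAddress
WHERE ea.EmailAddressID IS NULL;

COMMIT;  -- the application lock is released when the transaction ends
```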

What kinds of errors in SQL queries cause a ROLLBACK?

For example:
insert into table( a, b ) values ('a','b') could generate the following error:
**a-b duplicate entry**
BUT here I can ignore this error by selecting the ID for these values, then using that ID:
select ID from table where a = 'a' and b = 'b'
insert into brother( table ) values (ID)
Finally I could COMMIT the procedure. Note that this error isn't relevant for a rollback if I just need the ID.
The question is: what kinds of errors will force me to ROLLBACK the procedure?
I hope you understand.
I think you're asking, "What kind of errors can an INSERT statement cause that will make MySQL rollback a transaction?"
An INSERT that violates any constraint will cause a rollback. It could be foreign key constraint like you've outlined, but it could also be a UNIQUE constraint, or a CHECK constraint. (A CHECK constraint would probably be implemented as a trigger in MySQL.)
Trying to insert values that aren't valid (NULL in nonnullable columns, numbers that are out of range, invalid dates) might cause a rollback. But they might not, depending on the server configuration. (See link below.)
An INSERT can also fail because the user lacks permissions. That will also cause a rollback.
Some conditions that would cause a rollback on other platforms don't cause a rollback on MySQL.
The options MySQL has when an error occurs are to stop the statement in the middle or to recover as well as possible from the problem and continue. By default, the server follows the latter course. This means, for example, that the server may coerce illegal values to the closest legal values.
That quote is from How MySQL Deals with Constraints.
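Whether MySQL coerces or rejects such values is controlled by sql_mode. A hypothetical illustration, where t is a table with a TINYINT column named tiny_col:

```sql
-- Permissive mode: the out-of-range value is coerced, with a warning.
SET SESSION sql_mode = '';
INSERT INTO t (tiny_col) VALUES (999);  -- stored as 127, the TINYINT maximum

-- Strict mode: the same statement raises an error, and on an InnoDB
-- table inside a transaction its changes are rolled back.
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
INSERT INTO t (tiny_col) VALUES (999);  -- ERROR 1264: Out of range value
```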
One of my favorite quotes from the MySQL documentation is from 1.8.6.2, Constraints on Invalid Data:
MySQL enables you to store certain incorrect date values into DATE and DATETIME columns (such as '2000-02-31' or '2000-02-00'). The idea is that it is not the job of the SQL server to validate dates.
Isn't that cute?