SQL Server, composite keys - ignoring duplicates - SQL

Is there a way to prevent SQL Server from throwing an error when I try to save a record that already exists? I've got a composite-key table for a many-to-many relationship that holds only the two key values. When I update a model from my application, it tries to save all the records, and the ones that already exist throw a "Cannot insert duplicate key" error. Is there a way of having the database ignore these, or do I have to handle it in the application?

You are calling an INSERT and trying to add duplicate keys. This error is by design, and essential: the database is throwing an exception for an exceptional and erroneous condition.
If you are instead trying to perform an "upsert", you may need to use a stored procedure or the MERGE syntax.
If, on the other hand, you don't want to UPDATE but simply want to ignore rows already in the table, then you need to add an exclusion to your INSERT statement, such as
...
WHERE NOT EXISTS (SELECT 1 FROM TargetTable t
                  WHERE t.Key1 = source.Key1 AND t.Key2 = source.Key2)

Try something like this with your insert statement.
insert into foo (x,y)
select @x, @y
except
select x,y from foo
This will add a record to foo, ONLY if it is not already in the table.

You could try creating your index with the IGNORE_DUP_KEY option so that you only get a warning when you have duplicate keys rather than a true error.
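A minimal sketch of that option for a two-column junction table (the table and column names here are invented, not from the question):
CREATE TABLE dbo.BookAuthor (BookId INT NOT NULL, AuthorId INT NOT NULL);

-- Duplicate inserts against this index raise only the informational
-- message "Duplicate key was ignored." and the offending rows are skipped.
CREATE UNIQUE INDEX UX_BookAuthor
    ON dbo.BookAuthor (BookId, AuthorId)
    WITH (IGNORE_DUP_KEY = ON);

INSERT INTO dbo.BookAuthor (BookId, AuthorId) VALUES (1, 10);
INSERT INTO dbo.BookAuthor (BookId, AuthorId) VALUES (1, 10); -- ignored, not an error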
The other option, and possibly the better one, is to use the MERGE statement rather than INSERT. The MERGE statement lets you do inserts, updates and deletes all in one statement, and it sounds like it should work out well for what you are trying to do.
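A rough MERGE sketch for the same hypothetical link table, inserting only the pairs that are not there yet:
MERGE dbo.BookAuthor AS target
USING (VALUES (@BookId, @AuthorId)) AS source (BookId, AuthorId)
    ON  target.BookId = source.BookId
    AND target.AuthorId = source.AuthorId
WHEN NOT MATCHED BY TARGET THEN
    INSERT (BookId, AuthorId) VALUES (source.BookId, source.AuthorId);
-- Existing pairs simply match and are left untouched, so no duplicate-key error occurs.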
Last but not least, as you said, fix it in your app and only insert the rows that need to be added.

Related

Get all constraint errors when inserting data from another table

I have a staging table without any constraints in my Azure SQL database (Azure SQL database 12.0.2000.8). I want to insert the data from the Staging table into the "real" table on which multiple constraints are set. When inserting the data, I use a statement of the kind
INSERT INTO <someTable> SELECT <columns> FROM StagingTable;
Right now I only get the first error when constraints are violated. However, for my use case it is important to get all the violations, so they can be resolved altogether.
I have tried using TRY...CATCH, but that throws on the first error and runs the catch clause without continuing with the rest of the data. Note that the correct data that has no violations should not be inserted either, so the whole insert statement can be rolled back on any error; however, I want to see all the violations so I can correct them without having to run the insert statement multiple times.
EDIT:
The types of constraints that need to be checked are foreign key constraints, NOT NULL constraints, and duplicate keys. No casting is done, so there is no need to check for conversions.
There are a couple of options:
If you want to catch row-level information, you have to go for a cursor or a WHILE loop, try to insert each row inside a TRY...CATCH block, and log any error you hit.
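A rough sketch of that row-by-row approach; the Id column on StagingTable, the target column list, and the ErrorLog table are all assumptions made for illustration:
DECLARE @Id INT = (SELECT MIN(Id) FROM StagingTable);

WHILE @Id IS NOT NULL
BEGIN
    BEGIN TRY
        INSERT INTO someTable (Col1, Col2)
        SELECT Col1, Col2 FROM StagingTable WHERE Id = @Id;
    END TRY
    BEGIN CATCH
        -- Record which staging row failed and why, then carry on with the next row
        INSERT INTO ErrorLog (StagingId, ErrorMessage)
        VALUES (@Id, ERROR_MESSAGE());
    END CATCH;

    SET @Id = (SELECT MIN(Id) FROM StagingTable WHERE Id > @Id);
END;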
Create another table similar to the main table (say, MainCheckTable) with all the constraints, disable the constraints, and load the data.
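Roughly, once MainCheckTable exists with the same constraints as the real table (the column placeholder follows the question's own notation):
-- NOCHECK disables the FOREIGN KEY and CHECK constraints so every row loads
ALTER TABLE MainCheckTable NOCHECK CONSTRAINT ALL;

INSERT INTO MainCheckTable
SELECT <columns> FROM StagingTable;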
Now you can leverage DBCC CHECKCONSTRAINTS to see all the constraint violations:
USE DBName;
DBCC CHECKCONSTRAINTS(MainCheckTable) WITH ALL_CONSTRAINTS;
First, don't look at your primary table(s). Look at the related tables, e.g. lookups, and populate these first. Once you have populated the related tables, i.e. satisfied all the related constraints, then add the data.
You need to work backwards from the least constrained tables to the most constrained, if that makes sense.
You should check that your related tables contain the required reference values/fields that you intend to insert. This is easy to do, since you already have a staging table.
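For instance, a quick existence check against a hypothetical Countries lookup (the tables and the CountryCode column are invented for illustration):
-- Staging rows whose lookup value is missing would violate the foreign key on insert
SELECT s.*
FROM StagingTable s
LEFT JOIN Countries c ON c.CountryCode = s.CountryCode
WHERE c.CountryCode IS NULL;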

postgresql: \copy method enter valid entries and discard exceptions

When entering the following command:
\copy mmcompany from '<path>/mmcompany.txt' delimiter ',' csv;
I get the following error:
ERROR: duplicate key value violates unique constraint "mmcompany_phonenumber_key"
I understand why it's happening, but how do I execute the command in a way that valid entries will be inserted and ones that create an error will be discarded?
The reason PostgreSQL doesn't do this is related to how it implements constraints and validation. When a constraint fails it causes a transaction abort. The transaction is in an unclean state and cannot be resumed.
It is possible to create a new subtransaction for each row, but this is very slow and defeats the purpose of using COPY in the first place, so it isn't supported by PostgreSQL's COPY at this time. You can do it yourself in PL/pgSQL with a BEGIN ... EXCEPTION block inside a LOOP over a SELECT from the data copied into a temporary table. This works fairly well but can be slow.
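A rough sketch of that PL/pgSQL loop, assuming the file has first been \copy'd into a temporary table named staging_mmcompany; the companyname column is invented, only phonenumber comes from the question:
DO $$
DECLARE
    rec staging_mmcompany%ROWTYPE;
BEGIN
    FOR rec IN SELECT * FROM staging_mmcompany LOOP
        BEGIN
            INSERT INTO mmcompany (companyname, phonenumber)
            VALUES (rec.companyname, rec.phonenumber);
        EXCEPTION WHEN unique_violation THEN
            -- Skip rows that would violate the unique constraint on phonenumber
            RAISE NOTICE 'skipping duplicate phone number %', rec.phonenumber;
        END;
    END LOOP;
END;
$$;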
It's better, if possible, to use SQL to check the constraints before doing any insert that violates them. That way you can just:
CREATE TEMPORARY TABLE stagingtable(...);
\copy stagingtable FROM 'somefile.csv'
INSERT INTO realtable
SELECT * FROM stagingtable s
WHERE NOT EXISTS (
    SELECT 1 FROM realtable r WHERE r.phonenumber = s.phonenumber
);
Do keep concurrency issues in mind though. If you're trying to do a merge/upsert via COPY you must LOCK TABLE realtable; at the start of your transaction or you will still have the potential for errors. It looks like that's what you're trying to do - a copy if not exists. If so, skipping errors is absolutely the wrong approach. See:
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
Insert, on duplicate update in PostgreSQL?
Postgresql - Clean way to insert records if they don't exist, update if they do
Can COPY be used with a function?
Postgresql csv importation that skips rows
... this is a much-discussed issue.
One way to handle the constraint violations is to define triggers on the target table to handle the errors. This is not ideal, as there can still be race conditions (when loading concurrently), and triggers have pretty high overhead.
Another method: COPY into a staging table and then load the data into the target table using SQL, with some handling that skips existing entries.
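For example, on PostgreSQL 9.5 or later, where ON CONFLICT is available (the staging table name is invented; mmcompany and phonenumber come from the question):
CREATE TEMPORARY TABLE staging_mmcompany (LIKE mmcompany INCLUDING DEFAULTS);
\copy staging_mmcompany from '<path>/mmcompany.txt' delimiter ',' csv

-- Rows whose phone number already exists are skipped instead of aborting the load
INSERT INTO mmcompany
SELECT * FROM staging_mmcompany
ON CONFLICT (phonenumber) DO NOTHING;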
Additionally, another useful method is to use pgloader.

Cannot insert duplicate key row in object 'dbo.TitleClient' with unique index 'XAK1TitleClient'

Ever since I cleaned the data in the SQL database I've been getting this issue, whereas on the unclean database the issue does not happen. When I run my stored procedure (a huge procedure) it returns:
General SQL error. Cannot insert duplicate key row in object 'dbo.TitleClient' with unique index 'XAK1TitleClient'. Cannot insert the value NULL into column 'id_title', table 'Database.dbo.TitleCom'; column does not allow null, insert fails.
Is it possible that I deleted data from a table that causes this? Or is that impossible?
Does dbo.TitleClient have an identity column? You might need to run
DBCC CHECKIDENT('dbo.TitleClient')
I'm guessing that the first message, "Cannot insert duplicate key row in object 'dbo.TitleClient' with unique index 'XAK1TitleClient'", is because the seed value is out of sync with the existing table values, and the second message, "Cannot insert the value NULL into column 'id_title', table 'Database.dbo.TitleCom'; column does not allow null, insert fails.", comes from a failed attempt at inserting the result of SCOPE_IDENTITY() from the first statement.
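A quick way to confirm that theory, assuming TitleClient's identity column is also called id_title (a guess; adjust to the real column name):
-- Report the current identity value and the actual maximum in the column, without changing anything
DBCC CHECKIDENT ('dbo.TitleClient', NORESEED);
SELECT MAX(id_title) AS MaxValue FROM dbo.TitleClient;

-- If the identity value has fallen behind, this corrects it to the current maximum
DBCC CHECKIDENT ('dbo.TitleClient', RESEED);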
How cleanly did you "clean" the data?
If some tables still have data, that might be causing a problem, especially if you have triggers resulting in further inserts.
To investigate further, take the body of your stored proc and run it bit by bit.
Eventually, you'll get to the actual statement producing the error.
Of course if you aren't inserting into dbo.TitleClient at this point, then it's certainly a trigger causing problems.
Either way: Now you can easily check the data inserted earlier in your proc to figure out the root cause.

update table without using update statement

Can anyone tell me how to update some records of a table without using an UPDATE statement? Is it possible using a SELECT statement?
I don't think you can update the table without an update statement.
It is not possible with a select statement.
You can delete a row and insert the same row + your changes which is in many ways like an update, but will cause lots of trouble with foreign keys.
Oh, and your DBA might kill you.
You can use
REPLACE INTO tablename(primary key, ...{rest of the columns in the table})
VALUES(the same primary key, new values );
This will delete the previous row and insert a new row with the same primary key and updated column values. Not particularly worthwhile, but maybe there is some other way.
It depends what tools you are using and what you actually want to achieve.
There are libraries which allow you to update data you got from a select statement (e.g. ORMs like NHibernate; I think ADO.NET can as well). These libraries write the update statements for you.
You can use functions or triggers which change data when you merely perform a select statement, but inside those functions or triggers you still have an update statement.
For security reasons, you have to make sure that nobody injects an update statement into your select statement, so it is not automatically safe just because only a select statement is being run.
how to update some records of a table without using update statement.
Use a MERGE statement.
it is possible using select statement.
Logically, an update is a delete and an insert: INSERT INTO..SELECT into a staging table, modifying the data as appropriate, then DELETE, then INSERT INTO..SELECT back from the staging table.
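A hedged sketch of that sequence; the Customers table and its columns are invented purely for illustration:
-- Assumed table: Customers(CustomerId, City, Region)
-- Copy the rows to be changed into a staging table, applying the "update" on the way
SELECT CustomerId, UPPER(City) AS City, Region
INTO #Staging
FROM Customers
WHERE Region = 'West';

-- Remove the originals, then put the modified rows back
DELETE FROM Customers WHERE Region = 'West';

INSERT INTO Customers (CustomerId, City, Region)
SELECT CustomerId, City, Region FROM #Staging;

DROP TABLE #Staging;
As the earlier answer notes, this breaks down quickly once foreign keys reference the deleted rows.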
On the off chance you were asking how this happened when a module ran a select statement it created, then you need to read up on SQL injection. You cannot do an update without an update statement of some kind (meaning not only UPDATE but also a delete followed by an insert, or a MERGE), and the user must have update permission on the table, but an update can be injected into a dynamically created select statement if you haven't correctly parameterized it to guard against SQL injection.

SQL Import skip duplicates

I am trying to do a bulk upload into a SQL Server DB. The source file has duplicates which I want to remove, so I was hoping the operation would automatically upload the first one and then discard the rest (I've set a unique key constraint). The problem is, the moment a duplicate upload is attempted the whole thing fails and gets rolled back. Is there any way I can just tell SQL Server to keep going?
Try to bulk insert the data into a temporary table and then SELECT DISTINCT as #madcolor suggested, or
INSERT INTO yourTable
SELECT * FROM #tempTable tt
WHERE NOT EXISTS (SELECT 1 FROM yourTable yt WHERE yt.id = tt.id)
using id or some other field in the WHERE clause.
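Put together, the flow might look roughly like this; the file path, column list, and terminators are assumptions:
CREATE TABLE #tempTable (id INT, name VARCHAR(100));

-- Load everything, duplicates and all, into the staging table first
BULK INSERT #tempTable
FROM 'C:\import\source.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Only distinct rows whose key is not already in the target reach the real table
INSERT INTO yourTable (id, name)
SELECT DISTINCT tt.id, tt.name
FROM #tempTable tt
WHERE NOT EXISTS (SELECT 1 FROM yourTable yt WHERE yt.id = tt.id);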
If you're doing this through some SQL tool like SQL Plus or DBVis or Toad, then I suspect not. If you're doing this programmatically in a language, then you need to divide and conquer. Presumably executing an update line by line and catching each exception would be too lengthy a process, so instead you could do a batch operation first on the whole SQL block; if it fails, do it on the first half, and if that fails, do it on the first half of the first half. Iterate this way until you have a block that succeeds. Discard that block and repeat the procedure on the rest of the SQL. Anything that violates a constraint will eventually end up as a single SQL statement, which you know to log and discard. This should import with as much bulk processing as possible while still throwing out the invalid lines.
Use SSIS for this. You can tell it to skip the duplicates. But first make sure they are true duplicates. What if the data in some of the columns is different, how do you know which is the better record to keep?