Automatically drop records on insert if they violate constraint - sql-server-2012

I want to set up a table with a constraint on it, but when I insert records, I don't want to get any constraint violation errors. I would like SQL to quietly drop any records that aren't unique, but carry on inserting those that can be inserted.
For example...
create table table1
(value1 int,
value2 int,
constraint uc_tab1 Unique (value1,value2)
)
create table table2
(value1 int,
value2 int
)
insert into table2 (value1,value2)
select 1,1
union all
select 2,1
union all
select 3,1
union all
select 1,1
insert into table1
select value1,value2 from table2
At the moment, this will fall over on a constraint violation. I want to suppress that error, so that table1 contains...
1,1
2,1
3,1
(in this example, I could just do a group by on table2, but in my actual application that isn't really viable)
I vaguely remember reading something about this years ago, but I might have imagined it. Is this possible?
Many thanks in advance

Please don't do this; you will lose data very easily.
Instead, try to change your application so it only inserts valid data instead of silently dropping incorrect data.

You can use the IGNORE_DUP_KEY index option, although personally I think it is better to find another way of solving your problem.
You can set it to ON to only generate warnings for inserted rows that violate the unique constraint instead of generating errors.
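For reference, a minimal sketch of the index-option form mentioned above, using the table from the question (the index name is made up). Duplicate rows in an INSERT are skipped with the "Duplicate key was ignored." warning instead of failing the statement:
-- Sketch: unique index with IGNORE_DUP_KEY, so duplicates in an INSERT
-- are quietly skipped (with a warning) rather than raising an error.
CREATE UNIQUE INDEX ux_tab1 ON table1 (value1, value2)
    WITH (IGNORE_DUP_KEY = ON);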

Look into the MERGE statement. It's complex, but can be made to do what you are describing.
(There is or was something that could cause an INSERT statement to continue to insert data even if some rows could not be inserted, but for the life of me I can't find it in BOL or recall what it was called. I'm pretty sure it raised errors anyway, and it always sounded like a horrible idea to me.)
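For what it's worth, a rough sketch of the MERGE approach against the tables in the question; the DISTINCT is needed because table2 itself contains a duplicate pair, and this assumes the values are non-NULL:
MERGE table1 AS t
USING (SELECT DISTINCT value1, value2 FROM table2) AS s
    ON t.value1 = s.value1 AND t.value2 = s.value2
WHEN NOT MATCHED THEN
    INSERT (value1, value2) VALUES (s.value1, s.value2);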

Specifying Ignore_Dup_Key when I created my constraint did the trick. In the above example, I changed the table1 definition to....
create table table1
(value1 int,
value2 int,
constraint uc_tab1 Unique (value1,value2) WITH (IGNORE_DUP_KEY = ON)
)
And it worked perfectly

Related

Update/Create table from select with changes in the column values in Oracle 11g (speed up)

At the job we have an update script for some Oracle 11g database that takes around 20 hours, and some of the most demanding queries are updates where we change some values, something like:
UPDATE table1 SET
column1 = DECODE(table1.column1,null,null,'no info','no info','default value'),
column2 = DECODE(table1.column2,null,null,'no info','no info','another default value'),
column3 = 'default value';
And like this, we have many updates. The problem is that the tables have around 10 million rows. We also have some updates where some columns are going to get a default value but are nullable (I know that if they had NOT NULL and DEFAULT constraints, adding such columns would be almost immediate because the values are stored in the catalog), and so the update or addition of such columns is costing a lot of time.
My approach is to recreate the table (as Tom suggested in https://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:6407993912330 ). But I have no idea how to retrieve some columns from the original table that are going to remain the same, and also others that are going to change to a default value (where before the update the column held sensitive info), because we need to keep some info private.
So, my approach is something like this:
CREATE TABLE table1_tmp PARALLEL NOLOGGING
AS (SELECT col1, col2, col3, col4 FROM table1);
ALTER TABLE table1_tmp ADD (col5 VARCHAR(10) DEFAULT 'some info' NOT NULL);
ALTER TABLE table1_tmp ADD (col6 VARCHAR(10) DEFAULT 'some info' NOT NULL);
ALTER TABLE table1_tmp ADD (col7 VARCHAR(10));
ALTER TABLE table1_tmp ADD (col8 VARCHAR(10));
MERGE INTO table1_tmp tt
USING table1 t
ON (t.col1 = tt.col1)
WHEN MATCHED THEN
UPDATE SET
    tt.col7 = 'some default value that may be null',
    tt.col8 = 'some value that may be null';
I have also tried creating the nullable columns as NOT NULL to make the operation fast, and that worked, but the problem is that when I set the columns back to nullable, that operation takes too much time. The code above also ended up consuming a great amount of time (more than one hour in the MERGE).
I hope someone has an idea on how to improve performance for things like this.
Thanks in advance!
Maybe you can try using NVL in the MERGE join condition:
MERGE INTO table1_tmp tt
USING table1 t
ON (nvl(t.col1,'-3') = nvl(tt.col1,'-3'))
WHEN MATCHED THEN ....
If you don't want to update NULL values, you can also do it like this (the mismatched defaults mean a NULL on either side can never match):
MERGE INTO table1_tmp tt
USING table1 t
ON (nvl(t.col1,'-3') = nvl(tt.col1,'-2'))
WHEN MATCHED THEN .....
In the end, I finished up creating a new table with the data from the original table, and while doing the CREATE I applied the default values, the DECODEs and any other transformation; for instance, if I wanted to set something to NULL, I did it right in the SELECT. Something like:
CREATE TABLE table1_tmp (
    column1 DEFAULT 'default message',
    column2,  -- this column with no change at all
    column3   -- this will take the value from the DECODE below
) AS SELECT
    'default message' AS column1,
    column2,  -- this column with no change at all
    DECODE(column3, 'Something', NULL, 'A', 'B') AS column3
FROM table1;
That is how I solved the problem. The time for copying a 23-million-row table was about 3 to 5 minutes, while updating it used to take hours. Now I just need to set privileges, constraints, indexes and comments, and that's it, but that stuff only takes seconds.
Thanks for the answer @thehazal; I could not check your approach, but it sounds interesting.

Updating SQL Server table with composite key

I have a SQL Server table with three columns, where the first two columns are the primary key. I'm writing a stored procedure that updates the last two columns in bulk, and it works fine as long as there are no primary key violations, but when there is a primary key violation it throws an error and stops executing.
How can I make it ignore the offending row and continue updating the remaining records, as long as they cause no primary key violation?
Is there a better way to approach this problem? I'm only doing a simple UPDATE with a WHERE clause like column2 = somevalue AND column3 = somevalue.
In SQL Server you'd use MERGE to upsert (i.e. insert or update):
MERGE mytable
USING (SELECT 1 as key1, 2 as key2, 3 as col1, 4 as col2) AS src
ON (mytable.key1 = src.key1 AND mytable.key2 = src.key2)
WHEN MATCHED THEN
UPDATE SET col1 = src.col1, col2 = src.col2
WHEN NOT MATCHED THEN
INSERT (key1, key2, col1, col2) VALUES (src.key1, src.key2, src.col1, src.col2);
There is nothing inherently wrong with your question, despite the rather loud protestations. Your question is confusing, though, especially when you refer to columns by position. That is a big no-no. A script that reproduces your problem is generally the best way to both demonstrate it and get useful suggestions.
The short answer to your question is - you can't. A statement either succeeds or fails as a whole. If you want to update each row individually and ignore certain errors, then you need to write your tsql to do that.
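As a hedged, hypothetical sketch of that row-by-row idea (table and column names here are invented, not taken from the question):
-- Sketch: update one row at a time and swallow PK violations so the
-- remaining rows still get updated. Names are illustrative only.
DECLARE @k1 int, @k2 int;
DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT key1, key2 FROM dbo.mytable WHERE col3 = 'old value';
OPEN cur;
FETCH NEXT FROM cur INTO @k1, @k2;
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        UPDATE dbo.mytable
        SET key2 = @k2 + 1, col3 = 'new value'
        WHERE key1 = @k1 AND key2 = @k2;
    END TRY
    BEGIN CATCH
        PRINT 'Skipped a row that would have violated the primary key';
    END CATCH;
    FETCH NEXT FROM cur INTO @k1, @k2;
END;
CLOSE cur;
DEALLOCATE cur;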
And despite the protests (again), there are situations where it is necessary to update columns that are part of the primary key. It is unusual - very unusual - but you should also be wary of any absolute statement about tsql. When you find yourself doing unusual things, you should review your schema (and your approach) because it is quite possible that there are better ways to accomplish your goal.
And in this case, I suggest that you SHOULD really think about what you are trying to accomplish. If you want to update a set of rows in a particular way and the statement fails, that means there is a flaw somewhere. Typically, this error implies that your update logic is not correct. Perhaps you assume something about your data that is not accurate? It is impossible to know from a distance. The error message will tell you which set of values caused the conflict, so that should give you sufficient information to investigate. As another tool, write a SELECT statement that demonstrates your proposed update and look for the values in the error message. E.g.
set nocount on;
create table #x (a smallint not null, b smallint not null, c varchar(10) not null, constraint xx primary key(a, b));
insert #x (a, b, c) values (1, 1, 'test'), (1, 2, 'zork');
select * from #x;
update #x set b = 2, c = 'dork';
select a, b, c, cast(2 as smallint) as new_b, 'dork' as new_c
from #x
order by a, new_b;
drop table #x;

How to temporarily delete some rows in a SQL table?

In a stored procedure I want to delete some rows from a table and, after some code, re-insert the deleted rows into the same table.
How can I do it?
Thanks all.
Update:
I have a Table:
SampleTable(Col1, Col2, Col3, Col4)
I want to do that:
DELETE FROM SampleTable
WHERE Col1 = 'foo'
-- SOME CODE...
INSERT INTO SampleTable
[DELETED VALUES...]
UPDATE:
Sorry but now I can't see the DB.
The problem is that in the SOME CODE... part, written by others, there is a delete that gives me an error, but after the delete there is an insert, using the SP input, that replaces the deleted row with the same key.
I know that an UPDATE would apparently solve my problem, but there is a lot of logic and I don't want to change the SOME CODE... part, so I'm looking for a workaround: I want to temporarily ignore the foreign key.
select * into #ttable FROM SampleTable
WHERE Col1 = 'foo'
DELETE FROM SampleTable
WHERE Col1 = 'foo'
-- SOME CODE...
INSERT INTO SampleTable
select * from #ttable
Deleting and re-inserting the rows can introduce all sorts of problems. For instance, identity() values will change (as well as automatically assigned creation times). In addition, you might have constraints. In theory, anything could happen to the database between the deletion and re-insertion, so constraints that once worked might fail.
How about creating a view?
create view v_SampleTable as
select *
from SampleTable
where col1 <> 'foo' or col1 is null;
Then change the code to use v_SampleTable instead of SampleTable. This is an updatable view, so it will even permit modifications to the data inside the table.
You could go even one step further and rename the table first and then create a view with the same name.
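A minimal sketch of that last suggestion, assuming the table from the question is called SampleTable (the new base-table name is made up):
-- Sketch: rename the real table, then put a filtered, updatable view
-- under the original name so the SOME CODE... part does not need changes.
EXEC sp_rename 'SampleTable', 'SampleTable_base';
GO
CREATE VIEW SampleTable AS
    SELECT *
    FROM SampleTable_base
    WHERE col1 <> 'foo' OR col1 IS NULL;
GO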

How do I delete one record matching some criteria in SQL? (Netezza)

I've got some duplicate records in a table because as it turns out Netezza does not support constraint checks on primary keys. That being said, I have some records where the information is the exact same and I want to delete just ONE of them. I've tried doing
delete from table_name where test_id=2025 limit 1
and also
delete from table_name where test_id=2025 rowsetlimit 1
However neither option works. I get an error saying
found 'limit'. Expecting a keyword
Is there any way to limit the records deleted by this query? I know I could just delete the record and reinsert it but that is a little tedious since I will have to do this multiple times.
Please note that this is not SQL Server or MySQL. This is for Netezza.
If it doesn't support either "DELETE TOP 1" or the "LIMIT" keyword, you may end up having to do one of the following:
1) add some sort of an auto-incrementing column (like IDs), making each row unique. I don't know if you can do that in Netezza after the table has been created, though.
2) Programmatically read the entire table with some programming language, eliminate duplicates programmatically, then delete all the rows and insert them again. This might not be possible if they are referenced by other tables, in which case you might have to temporarily remove the constraint.
I hope that helps. Please let us know.
And for future reference; this is why I personally always create an auto-incrementing ID field, even if I don't think I'll ever use it. :)
The below query works for deleting duplicates from a table.
DELETE FROM YOURTABLENAME
WHERE COLNAME1 = 'XYZ'
  AND (COLNAME1, ROWID) NOT IN
      (
        SELECT COLNAME1, MAX(ROWID)
        FROM YOURTABLENAME
        WHERE COLNAME1 = 'XYZ'
        GROUP BY COLNAME1
      )
If the records are identical then you could do something like
CREATE TABLE DUPES as
SELECT col1, col2, col3, ... coln from source_table where test_id = 2025
group by 1, 2, 3, ... n
DELETE FROM source_table where test_id = 2025
INSERT INTO source_table select * from DUPES
DROP TABLE DUPES
You could even create a sub-query to select all the test_ids HAVING COUNT(*) > 1 to automatically find the dupes in steps 1 and 3
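For instance, a sketch of such a sub-query (assuming the same source_table; add further columns to the GROUP BY as needed):
-- Sketch: list every test_id that appears more than once.
SELECT test_id
FROM source_table
GROUP BY test_id
HAVING COUNT(*) > 1;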
-- remove duplicates from the <<TableName>> table
delete from <<TableName>>
where rowid not in
(
    select min(rowid) from <<TableName>>
    group by col1, col2, col3
);
The GROUP BY 1,2,3,....,n will eliminate the dupes on the insert to the temp table
Is the use of ROWID even allowed in Netezza? As far as I know, I don't think this query will execute in Netezza...

SQL Constraint IGNORE_DUP_KEY on Update

I have a Constraint on a table with IGNORE_DUP_KEY. This allows bulk inserts to partially work where some records are dupes and some are not (only inserting the non-dupes). However, it does not allow updates to partially work, where I only want those records updated where dupes will not be created.
Does anyone know how I can support IGNORE_DUP_KEY when applying updates?
I am using MS SQL 2005
If I understand correctly, you want to do UPDATEs without specifying the necessary WHERE logic to avoid creating duplicates?
create table #t (col1 int not null, col2 int not null, primary key (col1, col2))
insert into #t
select 1, 1 union all
select 1, 2 union all
select 2, 3
-- you want to do just this...
update #t set col2 = 1
-- ... but you really need to do this
update #t set col2 = 1
where not exists (
select * from #t t2
where #t.col1 = t2.col1 and col2 = 1
)
The main options that come to mind are:
Use a complete UPDATE statement to avoid creating duplicates
Use an INSTEAD OF UPDATE trigger to 'intercept' the UPDATE and only do the UPDATEs that won't create a duplicate (a sketch follows below)
Use a row-by-row processing technique such as cursors and wrap each UPDATE in TRY...CATCH... or whatever the language's equivalent is
I don't think anyone can tell you which one is best, because it depends on what you're trying to do and what environment you're working in. But because row-by-row processing could potentially produce some false positives, I would try to stick with a set-based approach.
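As a hedged sketch of the trigger option above: it assumes the table has a stable surrogate key alongside the unique constraint so old and new row versions can be paired, all names are illustrative, and it does not handle the corner case where two updated rows collide with each other.
-- Sketch only: an INSTEAD OF UPDATE trigger that applies just the updates
-- that will not create a duplicate in the unique column.
CREATE TABLE dbo.demo
(
    id  int IDENTITY(1,1) PRIMARY KEY,
    val int NOT NULL,
    CONSTRAINT uq_demo_val UNIQUE (val)
);
GO
CREATE TRIGGER trg_demo_upd ON dbo.demo
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE d
    SET val = i.val
    FROM dbo.demo AS d
    JOIN inserted AS i ON i.id = d.id
    WHERE NOT EXISTS (SELECT 1 FROM dbo.demo AS x
                      WHERE x.val = i.val AND x.id <> i.id);  -- skip colliding rows
END;
GO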
I'm not sure what is really going on, but if you are inserting duplicates and updating Primary Keys as part of a bulk load process, then a staging table might be the solution for you. You create a table that you make sure is empty prior to the bulk load, then load it with the 100% raw data from the file, then process that data into your real tables (set based is best). You can do things like this to insert all rows that don't already exist:
INSERT INTO RealTable
(pk, col1, col2, col3)
SELECT
pk, col1, col2, col3
FROM StageTable s
WHERE NOT EXISTS (SELECT
1
FROM RealTable r
WHERE s.pk=r.pk
)
Preventing the duplicates in the first place is best. You could also do UPDATEs on your real table by joining in the staging table, etc. This avoids the need to "work around" the constraints. When you work around the constraints, you usually create difficult-to-find bugs.
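A hedged sketch of that update-by-joining idea, reusing the RealTable/StageTable names above and only touching non-key columns:
-- Sketch: refresh existing rows from the staging table, keyed on pk.
UPDATE r
SET r.col1 = s.col1,
    r.col2 = s.col2,
    r.col3 = s.col3
FROM RealTable AS r
JOIN StageTable AS s ON s.pk = r.pk;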
I have the feeling you should use the MERGE statement, and in the UPDATE part you should really not update the key you want to keep unique. That also means you have to define in your table that the key is unique (by creating a unique index or defining it as the primary key). Then any update or insert with a duplicate key will fail.
Edit: I think this link will help on that:
http://msdn.microsoft.com/en-us/library/bb522522.aspx
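A minimal sketch of that shape (table and column names are assumptions; note that MERGE requires SQL Server 2008 or later): the unique key appears only in the ON clause and the insert list, never in the UPDATE SET list, so the update can never manufacture a duplicate key.
MERGE dbo.target AS t
USING dbo.source AS s
    ON t.keycol = s.keycol
WHEN MATCHED THEN
    UPDATE SET t.payload = s.payload        -- the key is not touched here
WHEN NOT MATCHED THEN
    INSERT (keycol, payload) VALUES (s.keycol, s.payload);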