When to use "ON UPDATE CASCADE" - sql

I use ON DELETE CASCADE regularly but I never use ON UPDATE CASCADE as I am not so sure in what situation it will be useful.
For the sake of discussion, let's see some code.
CREATE TABLE parent (
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY (id)
);
CREATE TABLE child (
id INT NOT NULL AUTO_INCREMENT,
parent_id INT,
PRIMARY KEY (id),
INDEX par_ind (parent_id),
FOREIGN KEY (parent_id)
REFERENCES parent(id)
ON DELETE CASCADE
);
With ON DELETE CASCADE, if a parent with a given id is deleted, any record in child with parent_id = parent.id is automatically deleted. This should be no problem.
1) Does this mean that ON UPDATE CASCADE will do the same thing when the id of the parent is updated?
2) If (1) is true, does it mean that there is no need to use ON UPDATE CASCADE if parent.id is not updatable (or will never be updated), as when it is AUTO_INCREMENT or always set to a TIMESTAMP? Is that right?
3) If (2) is not true, in what other kind of situation should we use ON UPDATE CASCADE?
4) What if I (for some reason) update child.parent_id to a value that does not exist in parent? Will the child row then be automatically deleted?
Well, I know, some of the questions above can be tested programmatically, but I also want to know whether any of this is database-vendor dependent.
Please shed some light.

It's true that if your primary key is just an identity value auto incremented, you would have no real use for ON UPDATE CASCADE.
However, let's say that your primary key is a 10 digit UPC bar code and because of expansion, you need to change it to a 13-digit UPC bar code. In that case, ON UPDATE CASCADE would allow you to change the primary key value and any tables that have foreign key references to the value will be changed accordingly.
In reference to #4, if you change the child's parent_id to something that doesn't exist in the parent table (and you have referential integrity), you should get a foreign key error.

1) Yes. For example, if you run UPDATE parent SET id = 20 WHERE id = 10, every child row with parent_id = 10 will also be updated to 20.
2) If you never update the field the foreign key refers to, this setting is not needed.
3) I can't think of any other use.
4) You can't do that, as the foreign key constraint would fail.
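Points 1 and 4 can be checked with a small experiment. The following is only a sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the MySQL schema in the question; other vendors may differ in details, and the sample ids are made up.

```python
import sqlite3

# Assumption: SQLite stands in for the MySQL schema from the question.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER,
        FOREIGN KEY (parent_id) REFERENCES parent(id)
            ON UPDATE CASCADE
            ON DELETE CASCADE
    );
    INSERT INTO parent (id) VALUES (10);
    INSERT INTO child (id, parent_id) VALUES (1, 10), (2, 10);
""")

# Point 1: changing the parent key rewrites the matching child rows.
conn.execute("UPDATE parent SET id = 20 WHERE id = 10")
child_parent_ids = [
    row[0] for row in conn.execute("SELECT parent_id FROM child ORDER BY id")
]

# Point 4: pointing a child at a missing parent is rejected, not deleted.
try:
    conn.execute("UPDATE child SET parent_id = 999 WHERE id = 1")
    orphan_update_rejected = False
except sqlite3.IntegrityError:
    orphan_update_rejected = True

print(child_parent_ids, orphan_update_rejected)
```

In this sketch both child rows follow the parent to the new key, and the attempt to orphan a child raises an integrity error instead of removing the row.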

I think you've pretty much nailed the points!
If you follow database design best practices and your primary key is never updatable (which I think should always be the case anyway), then you never really need the ON UPDATE CASCADE clause.
Zed made a good point: if you use a natural key (e.g. a regular field from your database table) as your primary key, then there might be certain situations where you need to update your primary keys. Another recent example would be the ISBN (International Standard Book Number), which changed from 10 to 13 characters not too long ago.
This is not the case if you choose to use surrogate (e.g. artificial, system-generated) keys as your primary key (which would be my preferred choice in all but the rarest occasions).
So in the end: if your primary key never changes, then you never need the ON UPDATE CASCADE clause.
Marc

A few days ago I had an issue with triggers, and I figured out that ON UPDATE CASCADE can be useful. Take a look at this example (PostgreSQL):
CREATE TABLE club
(
key SERIAL PRIMARY KEY,
name TEXT UNIQUE
);
CREATE TABLE band
(
key SERIAL PRIMARY KEY,
name TEXT UNIQUE
);
CREATE TABLE concert
(
key SERIAL PRIMARY KEY,
club_name TEXT REFERENCES club(name) ON UPDATE CASCADE,
band_name TEXT REFERENCES band(name) ON UPDATE CASCADE,
concert_date DATE
);
In my case, I had to define some additional operations (a trigger) that run when the concert table is updated. Those operations had to modify club_name and band_name. I was unable to do it because of the foreign key references: I couldn't modify concert first and then deal with the club and band tables, and I couldn't do it the other way around either. ON UPDATE CASCADE was the key to solving the problem.
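The effect of the two ON UPDATE CASCADE clauses can be sketched as follows. This is not the trigger from the answer; it is a minimal illustration, assuming SQLite instead of PostgreSQL, with the surrogate column renamed to id, concert_date stored as text, and invented sample names.

```python
import sqlite3

# Assumption: SQLite replaces PostgreSQL here; SERIAL becomes INTEGER PRIMARY KEY.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE club (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE band (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE concert (
        id INTEGER PRIMARY KEY,
        club_name TEXT REFERENCES club(name) ON UPDATE CASCADE,
        band_name TEXT REFERENCES band(name) ON UPDATE CASCADE,
        concert_date TEXT
    );
    -- Hypothetical sample data.
    INSERT INTO club (name) VALUES ('Blue Note');
    INSERT INTO band (name) VALUES ('The Examples');
    INSERT INTO concert (club_name, band_name, concert_date)
    VALUES ('Blue Note', 'The Examples', '2020-01-01');
""")

# Renaming the club is enough; the referencing concert row follows.
conn.execute("UPDATE club SET name = 'Red Note' WHERE name = 'Blue Note'")
concert_row = conn.execute(
    "SELECT club_name, band_name FROM concert"
).fetchone()
print(concert_row)
```

Because the foreign keys reference the natural key name rather than the surrogate key, a rename in club or band is a key update, which is exactly the situation ON UPDATE CASCADE covers.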

The ON UPDATE and ON DELETE clauses specify which action is executed when a row in the parent table is updated or deleted. The permitted actions are NO ACTION, CASCADE, SET NULL, and SET DEFAULT.
Delete actions of rows in the parent table
If you delete one or more rows in the parent table, you can set one of the following actions:
ON DELETE NO ACTION: SQL Server raises an error and rolls back the delete action on the row in the parent table.
ON DELETE CASCADE: SQL Server deletes the rows in the child table that correspond to the row deleted from the parent table.
ON DELETE SET NULL: SQL Server sets the rows in the child table to NULL if the corresponding rows in the parent table are deleted. To execute this action, the foreign key columns must be nullable.
ON DELETE SET DEFAULT: SQL Server sets the rows in the child table to their default values if the corresponding rows in the parent table are deleted. To execute this action, the foreign key columns must have default definitions. Note that a nullable column has a default value of NULL if no default value is specified.
By default, SQL Server applies ON DELETE NO ACTION if you don't explicitly specify any action.
Update action of rows in the parent table
If you update one or more rows in the parent table, you can set one of the following actions:
ON UPDATE NO ACTION: SQL Server raises an error and rolls back the update action on the row in the parent table.
ON UPDATE CASCADE: SQL Server updates the corresponding rows in the child table when the rows in the parent table are updated.
ON UPDATE SET NULL: SQL Server sets the rows in the child table to NULL when the corresponding row in the parent table is updated. Note that the foreign key columns must be nullable for this action to execute.
ON UPDATE SET DEFAULT: SQL Server sets the default values for the rows in the child table that have the corresponding rows in the parent table updated.
FOREIGN KEY (foreign_key_columns)
REFERENCES parent_table(parent_key_columns)
ON UPDATE <action>
ON DELETE <action>;
See the reference tutorial.
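To make the update actions concrete, here is a rough sketch of NO ACTION, SET NULL and SET DEFAULT side by side. It assumes SQLite rather than SQL Server (the clause syntax is the same, but error handling and defaults may differ between vendors), and the child table names are hypothetical.

```python
import sqlite3

# Assumption: SQLite is a stand-in for SQL Server; table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE parent_table (parent_key INTEGER PRIMARY KEY);
    CREATE TABLE child_no_action (
        parent_key INTEGER REFERENCES parent_table(parent_key)
    );
    CREATE TABLE child_set_null (
        parent_key INTEGER REFERENCES parent_table(parent_key)
            ON UPDATE SET NULL
    );
    CREATE TABLE child_set_default (
        parent_key INTEGER DEFAULT 0 REFERENCES parent_table(parent_key)
            ON UPDATE SET DEFAULT
    );
    INSERT INTO parent_table VALUES (0), (1), (5);
    INSERT INTO child_no_action VALUES (5);
    INSERT INTO child_set_null VALUES (1);
    INSERT INTO child_set_default VALUES (1);
""")

# Key 1 is referenced by the SET NULL and SET DEFAULT children.
conn.execute("UPDATE parent_table SET parent_key = 2 WHERE parent_key = 1")
set_null_value = conn.execute(
    "SELECT parent_key FROM child_set_null"
).fetchone()[0]
set_default_value = conn.execute(
    "SELECT parent_key FROM child_set_default"
).fetchone()[0]

# Key 5 is referenced by a child with no action specified (NO ACTION).
try:
    conn.execute("UPDATE parent_table SET parent_key = 6 WHERE parent_key = 5")
    no_action_rejected = False
except sqlite3.IntegrityError:
    no_action_rejected = True

print(set_null_value, set_default_value, no_action_rejected)
```

Note that the default value (0 here) must itself exist in the parent table, otherwise the SET DEFAULT action would produce a foreign key violation of its own.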

It's an excellent question; I had the same question yesterday. I thought about this problem, specifically searched for whether something like "ON UPDATE CASCADE" existed, and fortunately the designers of SQL had also thought about that. I agree with Ted.strauss, and I also commented on Noran's case.
When did I use it? As Ted pointed out, when you are working with several databases at once, and a row modified in one of them has a copy in what Ted calls a "satellite database" that cannot keep the original ID, so you have to create a new one. This can happen when you can't update the data under the old ID (for example, due to permissions), or when you need a fast fix for something so short-lived that it doesn't deserve strict respect for the rules of normalization.
So, I agree on two points:
(A.) Yes, in many cases a better design can avoid it; BUT
(B.) In cases of migrations, replicating databases, or solving emergencies, it's a GREAT TOOL that fortunately was there when I went looking for it.

My comment is mainly in reference to point #3: under what circumstances is ON UPDATE CASCADE applicable if we're assuming that the parent key is not updateable? Here is one case.
I am dealing with a replication scenario in which multiple satellite databases need to be merged with a master. Each satellite is generating data on the same tables, so merging of the tables to the master leads to violations of the uniqueness constraint. I'm trying to use ON UPDATE CASCADE as part of a solution in which I re-increment the keys during each merge. ON UPDATE CASCADE should simplify this process by automating part of the process.
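A rough sketch of the re-keying idea, under several assumptions: SQLite instead of the poster's database, the satellite rows already copied into staging tables (sat_parent, sat_child) inside the master database, and an offset scheme chosen here only for illustration. All table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE master_parent (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE master_child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES master_parent(id)
    );
    -- Staging copies of one satellite; keys collide with the master's.
    CREATE TABLE sat_parent (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE sat_child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES sat_parent(id) ON UPDATE CASCADE
    );
    INSERT INTO master_parent VALUES (1, 'm1'), (2, 'm2');
    INSERT INTO master_child VALUES (1, 1), (2, 1);
    INSERT INTO sat_parent VALUES (1, 's1'), (2, 's2');
    INSERT INTO sat_child VALUES (1, 2), (2, 2);
""")


def max_id(table):
    """Largest id in a table, or 0 when it is empty (table names are trusted)."""
    return conn.execute(f"SELECT COALESCE(MAX(id), 0) FROM {table}").fetchone()[0]


# Shift past every id in use on either side so no intermediate value collides.
parent_offset = max(max_id("master_parent"), max_id("sat_parent"))
child_offset = max(max_id("master_child"), max_id("sat_child"))

with conn:  # one transaction: commit on success, roll back on error
    # The cascade rewrites sat_child.parent_id along with sat_parent.id.
    conn.execute("UPDATE sat_parent SET id = id + ?", (parent_offset,))
    conn.execute("UPDATE sat_child SET id = id + ?", (child_offset,))
    conn.execute("INSERT INTO master_parent SELECT id, label FROM sat_parent")
    conn.execute("INSERT INTO master_child SELECT id, parent_id FROM sat_child")

merged = conn.execute("""
    SELECT c.id, p.label
    FROM master_child AS c JOIN master_parent AS p ON p.id = c.parent_id
    ORDER BY c.id
""").fetchall()
print(merged)
```

The only part ON UPDATE CASCADE automates is keeping sat_child.parent_id in step with the re-keyed parents; choosing non-colliding offsets is still up to the merge logic.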

To add to the other great answers here: it is important to use ON UPDATE CASCADE (or ON DELETE CASCADE) cautiously. Operations on tables with this specification require an exclusive lock on the underlying relations.
If you have multiple CASCADE definitions in one table (as in another answer), and especially multiple tables using the same definitions with multiple users updating them, this can create a deadlock: one process acquires an exclusive lock on the first underlying table, another acquires an exclusive lock on the second, and they block each other because neither can get both (all) of the exclusive locks it needs to perform its operation.

Related

SQL Trigger: On update of primary key, how to determine which "deleted" record corresponds to which "inserted" record?

Assume that I know that updating a primary key is bad.
There are other questions which imply that the inserted and updated table records match by position (the first of one matches the first of the other.) Is this a fact or coincidence?
Is there anything that could join the two tables together when the primary key changes on an update?
There is no match of inserted+deleted virtual table row positions.
And no, you can't match rows
Some options:
there is another unique unchanging (for that update) key to link rows
limit to single row actions.
use a stored procedure with the OUTPUT clause to capture before and after keys
INSTEAD OF trigger with OUTPUT clause (TBH not sure if you can do this)
disallow primary key updates (added after comment)
Each table is allowed to have one identity column. Identity columns are not updateable; they are assigned a value when the records are inserted (or when the column is added), and they can never change. If the primary key is updateable, it must not be an identity column. So, either the table has another column which is an identity column, or you can add one to it. There is no rule that says the identity column has to be the primary key. Then in the trigger, rows in inserted and deleted that have the same identity value are the same row, and you can support updating the primary key on multiple rows at a time.
Yes -- create an "old_primary_key" field in the table you're updating, and populate it first.
There is nothing you can do to match up the inserted and deleted pseudo-table record keys, even if you store their data in a log table somewhere.
I guess alternatively, you could create a separate log table that tracked changes to primary keys (old and new). This might be more useful than adding a field to the table you're updating as I suggested right at first, as it would allow you to track more than one change for a given record. Just depends on your situation, I guess.
But that said -- before you do anything, please go find a chalk board and write this 100 times:
I know that updating a primary key is bad.
I know that updating a primary key is bad.
I know that updating a primary key is bad.
I know that updating a primary key is bad.
I know that updating a primary key is bad.
...
:-) (just kidding)

How to change value of primary key and update foreign key in the same time

I have a record in a table with the wrong primary key. I want to change it to the correct value, but this value is used in many other tables.
Is there any simple way to update the primary key and the foreign keys at the same time?
If the foreign keys are set to cascade changes then the value should change automatically.
Make sure that your foreign key relationships have ON UPDATE CASCADE specified, and the foreign key will automatically update to match the primary key.
From Books Online:
http://msdn.microsoft.com/en-us/library/ms174123%28v=SQL.90%29.aspx
ON UPDATE {CASCADE | NO ACTION | SET DEFAULT | SET NULL}
Specifies what action happens to a row in the table that is created when that row has a referential relationship, and the referenced row is updated in the parent table. The default is NO ACTION. See the "Remarks" section later in this topic for more information.
Updating a primary key does not update the related foreign keys; it only deletes the related records in other tables, because SQL Server treats an update as a delete followed by an insert. This is on SQL Server 2000; I am not sure about later versions. With both ON UPDATE CASCADE and ON DELETE CASCADE set, the cascading effect of that "delete and insert" (that is, the update) deletes the related records in other tables.

Actual working of primary key, foreign key and unique constraints: order/steps of their working

How do primary key, foreign key and unique constraints work? I mean, in what sequence?
For example, when a child table has a FK and a record is inserted into it that doesn't exist in the parent table, is the record first inserted into the child table, after which the constraint checks the parent table for a matching record and, if it doesn't find one, rolls back and removes the record from the child table? Is this the order of working?
Or does SQL first take the value (on which the FK is defined) from the insert query, match it against the parent table records, and stop the insert when no matching record is found, so that the row is never inserted into the child table?
Similarly, for the primary key, if a duplicate record is inserted into a table, is it first inserted and then checked, or is it matched against the existing records before insertion, with the query stopped if it is a duplicate?
Logically speaking, all constraints are supposed to be checked simultaneously against the entire result of an UPDATE, INSERT or DELETE statement. The constraints are evaluated as if the modification to all rows had already happened and if any constraint would be violated then the modification is not permitted.
You need a basic RDBMS reference. Here is a free resource: http://msdn.microsoft.com/en-us/library/aa933098%28v=SQL.80%29.aspx
Consider the logical (conceptual) tables deleted and inserted that are accessible to a TRIGGER. Even these are only concepts. Who knows what's going on under the covers? ...well, someone is bound to know... but do you care what's going on under the covers? At the conceptual level, it either succeeds or fails or you can manipulate the outcome in a trigger. What more do you need to know? ;)

How do I rename primary key values in Oracle?

Our application uses an Oracle 10g database where several primary keys are exposed to the end user: product codes and such. Unfortunately it's too late to do anything about this, as there are tons of reports and custom scripts out there that we do not have control over. We can't redefine the primary keys or mess up the database structure.
Now some customers want to change some of the primary key values. What they initially wanted to call P23A1 should now be called CAT23MOD1 (not a real example, but you get my meaning).
Is there an easy way to do this? I would prefer a script of some sort, that could be parametrized to fit other tables and keys, but external tools would be acceptable if no other way exists.
The problem is presumably with the foreign keys that reference the PK. You must define the foreign keys as "deferrable initially immediate", as described in this Tom Kyte article: http://www.oracle.com/technology/oramag/oracle/03-nov/o63asktom.html
That lets you ...
Defer the constraints
Modify the parent value
Modify the child values
Commit the change
Simple.
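The four steps above can be sketched as follows. This is an illustration only, assuming SQLite instead of Oracle: the foreign key is declared DEFERRABLE INITIALLY DEFERRED, so there is no separate "defer" statement as there would be with Oracle's SET CONSTRAINTS. The table names product and order_line are hypothetical; the key values are the ones from the question.

```python
import sqlite3

# isolation_level=None: autocommit mode, so BEGIN/COMMIT are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE product (code TEXT PRIMARY KEY);
    CREATE TABLE order_line (
        id INTEGER PRIMARY KEY,
        product_code TEXT NOT NULL
            REFERENCES product(code) DEFERRABLE INITIALLY DEFERRED
    );
    INSERT INTO product VALUES ('P23A1');
    INSERT INTO order_line VALUES (1, 'P23A1');
""")

conn.execute("BEGIN")  # the deferred FK is only checked at COMMIT
# Modify the parent value; the child is temporarily orphaned.
conn.execute("UPDATE product SET code = 'CAT23MOD1' WHERE code = 'P23A1'")
# Modify the child values so they match again.
conn.execute(
    "UPDATE order_line SET product_code = 'CAT23MOD1' "
    "WHERE product_code = 'P23A1'"
)
conn.execute("COMMIT")  # succeeds because no violation remains

renamed = conn.execute("SELECT product_code FROM order_line").fetchone()
print(renamed)
```

If the child update were skipped, the COMMIT would fail with a foreign key error and the transaction would have to be rolled back.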
Oops. A little googling makes it appear that, inexplicably, Oracle does not implement ON UPDATE CASCADE, only ON DELETE CASCADE. To find workarounds google ORACLE ON UPDATE CASCADE. Here's a link on Creating A Cascade Update Set of Tables in Oracle.
Original answer:
If I understand correctly, you want to change the values of data in primary key columns, not the actual constraint names of the keys themselves.
If this is true, it can most easily be accomplished by redefining ALL the foreign keys that reference the affected primary key constraint as ON UPDATE CASCADE. This means that when you make a change to the primary key value, the engine will automatically update all related values in foreign key tables.
Be aware that if this results in a lot of changes it could be prohibitively expensive in a production system.
If you have to do this on a live system with no DDL changes to the tables involved, then I think your only option is to (for each value of the PK that needs to be changed):
Insert into the parent table a copy of the row with the PK value replaced
For each child table, update the FK value to the new PK value
Delete the parent table row with the old PK value
If you have a list of parent tables and the PK values to be renamed, it shouldn't be too hard to write a procedure that does this - the information in USER_CONSTRAINTS can be used to get the FK-related tables for a given parent table.
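The insert/update/delete sequence described above might look like this for a single parent and a single child table. It is a sketch, assuming SQLite rather than Oracle; rename_key, product and order_line are hypothetical names, and a real procedure would loop over the child tables found in USER_CONSTRAINTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE product (code TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE order_line (
        id INTEGER PRIMARY KEY,
        product_code TEXT NOT NULL REFERENCES product(code)
    );
    INSERT INTO product VALUES ('P23A1', 'Widget');
    INSERT INTO order_line VALUES (1, 'P23A1');
""")


def rename_key(conn, old_code, new_code):
    """Hypothetical helper: rename a PK value without touching the DDL."""
    with conn:  # one transaction: commit on success, roll back on error
        # 1. Insert a copy of the parent row under the new key.
        conn.execute(
            "INSERT INTO product (code, description) "
            "SELECT ?, description FROM product WHERE code = ?",
            (new_code, old_code),
        )
        # 2. Repoint each child table (only one here) at the new key.
        conn.execute(
            "UPDATE order_line SET product_code = ? WHERE product_code = ?",
            (new_code, old_code),
        )
        # 3. Delete the old parent row, which is no longer referenced.
        conn.execute("DELETE FROM product WHERE code = ?", (old_code,))


rename_key(conn, "P23A1", "CAT23MOD1")
products = conn.execute("SELECT code, description FROM product").fetchall()
children = [r[0] for r in conn.execute("SELECT product_code FROM order_line")]
print(products, children)
```

Because the new parent row exists before the children are repointed, the foreign key is satisfied after every statement and no deferred constraints are needed.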

Constraint for one-to-many relationship

We have two tables with a one-to-many relationship. We would like to enforce a constraint that at least one child record exists for a given parent record.
Is this possible?
If not, would you make the schema a bit more complex to support such a constraint? If so, how would you do it?
Edit: I'm using SQL Server 2005
Such a constraint isn't possible from a schema perspective, because you run into a "chicken or the egg" type of scenario. Under this sort of scenario, when I insert into the parent table I have to have a row in the child table, but I can't have a row in the child table until there's a row in the parent table.
This is something better enforced client-side.
It's possible if your back-end supports deferrable constraints, as does PostgreSQL.
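One way this could look, as a sketch only: it assumes SQLite (which also supports deferrable foreign keys) instead of PostgreSQL, and a hypothetical first_child_id column on the parent that must point at an existing child. It does not check that the referenced child actually belongs to that parent; that would need a compound key or a trigger.

```python
import sqlite3

# isolation_level=None: autocommit mode, so BEGIN/COMMIT are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE parent (
        id INTEGER PRIMARY KEY,
        first_child_id INTEGER NOT NULL,   -- hypothetical "proof of child"
        FOREIGN KEY (first_child_id) REFERENCES child(id)
            DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL
            REFERENCES parent(id) DEFERRABLE INITIALLY DEFERRED
    );
""")

# Parent and its first child in one transaction: both checks pass at COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO parent (id, first_child_id) VALUES (1, 100)")
conn.execute("INSERT INTO child (id, parent_id) VALUES (100, 1)")
conn.execute("COMMIT")

# A parent without any child is rejected when the transaction commits.
conn.execute("BEGIN")
conn.execute("INSERT INTO parent (id, first_child_id) VALUES (2, 200)")
try:
    conn.execute("COMMIT")
    childless_rejected = False
except sqlite3.IntegrityError:
    childless_rejected = True
    conn.execute("ROLLBACK")  # the failed COMMIT leaves the transaction open

parent_ids = [row[0] for row in conn.execute("SELECT id FROM parent")]
print(parent_ids, childless_rejected)
```

Deferring the checks to commit time is what removes the ordering problem: either row can be inserted first, as long as both exist by the end of the transaction.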
How about a simple non nullable column?
Create Table ParentTable
(
ParentID int not null, -- column types are assumed; the original omitted them
ChildID int not null,
Primary Key (ParentID),
Foreign Key (ChildID) references ChildTable (ChildID)
);
If your business logic allows and you have default values you can query from the database for each new parent record, you can then use a before insert trigger on the parent table to populate the non nullable child column.
CREATE or REPLACE TRIGGER trigger_name
BEFORE INSERT
ON ParentTable
FOR EACH ROW
BEGIN
-- ( insert new row into ChildTable )
-- update childID column in ParentTable
END;
This isn't really something 'better enforced on the client side' so much as it is something that is impractical to enforce within certain database implementations. Realistically the job DOES belong in the database and at least one of the workarounds below should work.
Ultimately what you want is to constrain the parent to a child. This guarantees that a child exists. Unfortunately this causes a chicken-egg problem because the children must point to the same parent causing a constraint conflict.
Getting around the problem without visible side-effects in the rest of your system requires one of two abilities (items 1 and 2 below), neither of which is found in SQL Server; item 3 is the fallback.
1) Deferred constraint validation - This causes constraints to be validated at the end of the transaction. Normally they are validated at the end of each statement, which is the root of the chicken-egg problem, since it prevents you from inserting either the first child or the parent row for lack of the other. Deferring validation resolves it.
2) You can use a CTE to insert the first child, where the CTE hangs off of the statement that inserts the parent (or vice versa). This inserts both rows in the same statement, causing an effect similar to deferred constraint validation.
3) Without either, you have no choice but to allow nulls in one of the references so you can insert that row without the dependency check. Then you must go back and update the null with the reference to the second row. If you use this technique you need to be careful to make the rest of the system refer to the parent table through a view that hides all rows with null in the child reference column.
In any case your deletes of children are just as complicated because you cannot delete the child that proves at least one exists unless you update the parent first to point to a child that won't be deleted.
When you are about to delete the last child either you must throw an error or delete the parent at the same time. The error will occur automatically if you don't set the parent pointer to null first (or defer validation). If you do defer (or set the child pointer to null) your delete of the child will be possible and the parent can then be deleted as well.
I literally researched this for years and I watch every version of SQL Server for relief from this problem since it's so common.
PLEASE As soon as anyone has a practical solution please post!
P.S. You need to either use a compound key when referring to your proof-of-child row from the parent, or a trigger, to ensure that the child providing proof actually considers that row to be its parent.
P.P.S. Although it's true that null should never be visible to the rest of your system if you do both inserts and the update in the same transaction, this relies on behavior that could fail. The point of the constraint is to ensure that a logic failure won't leave your database in an invalid state. By protecting the table with a view that hides nulls, any illegal row will not be visible. Clearly your insert logic must account for the possibility that such a row can exist, but it needs inside knowledge anyway and nothing else needs to know.
I am encountering this issue, and have a solution implemented in Oracle rel.11.2.4.
To ensure that every child has a parent, I applied a typical foreign-key constraint from the FK of the child to the PK of the parent. -- no wizardry here
To ensure that every parent has at least one child, I did as follows:
Create a function which accepts a parent PK, and returns a COUNT of children for that PK. -- I ensure that NO_DATA_FOUND exceptions return 0.
Create a virtual column CHILD_COUNT on the parent table and calculate it to the function result.
Create a deferrable CHECK constraint on the CHILD_COUNT virtual column with the criteria of CHILD_COUNT > 0
It works as follows:
If a parent row is inserted and no children exist yet, then when that row is committed the CHILD_COUNT > 0 CHECK constraint fails and the transaction rolls back.
If a parent row is inserted and a corresponding child row is inserted in the same transaction; then all integrity constraints are satisfied when a COMMIT is issued.
If a child row is inserted corresponding to an existing parent row, then the CHILD_COUNT virtual column is recalculated on COMMIT and no integrity violation occurs.
Any deletes of parent rows must cascade to the children otherwise orphaned child rows will violate the FOREIGN KEY constraint when the delete transaction is committed. (as usual)
Any deletes of child rows must leave at least one child for each parent otherwise the CHILD_COUNT check constraint will violate when the transaction commits.
NOTE: I would not need a virtual column if Oracle allowed user-function-based CHECK constraints at rel.11.2.4.