Should I let JPA or the database cascade deletions?

Let's say we have two entities, A and B. B has a many-to-one relationship to A, as follows:
@Entity
public class A {
    @OneToMany(mappedBy = "a")
    private List<B> children;
}
@Entity
public class B {
    @ManyToOne
    @JoinColumn(name = "a_id")
    private A a; // mappedBy above names this field, not the a_id column
    private String data;
}
Now, I want to delete the A object and cascade the deletion to all its children B. There are two ways to do this:
1. Add cascade=CascadeType.ALL, orphanRemoval=true to the @OneToMany annotation, letting JPA remove all children before removing the A object from the database.
2. Leave the classes as they are and simply let the database cascade the deletion.
Is there any problem with using the latter option? Will it cause the EntityManager to keep references to already deleted objects? My reason for choosing option two over option one is that option one generates n+1 SQL statements for a removal, which can take a long time when object A has a lot of children, while option two generates only a single SQL statement. Is there any "best practice" regarding this?

I'd prefer the database. Why?
The database is probably a lot faster at doing this.
The database should be the primary place to hold integrity and relationship information. JPA is just reflecting that information.
If you connect from a different application or platform (e.g. one without JPA), deletes still cascade, which helps preserve data integrity.
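To make the "single statement" argument concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than JPA. The tables a and b mirror the entities in the question; the schema itself is an assumption, and SQLite stands in for whatever database you actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (
        id   INTEGER PRIMARY KEY,
        data TEXT,
        a_id INTEGER NOT NULL REFERENCES a(id) ON DELETE CASCADE
    );
""")
conn.execute("INSERT INTO a (id) VALUES (1)")
conn.executemany(
    "INSERT INTO b (data, a_id) VALUES (?, 1)",
    [(f"child {i}",) for i in range(1000)],
)

# One statement from the application; the database removes the 1000 children itself.
conn.execute("DELETE FROM a WHERE id = 1")
children_left = conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]
print(children_left)
```

Any client that talks to this schema gets the same behavior, whether or not it goes through an ORM.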

In EclipseLink you can use both if you use the @CascadeOnDelete annotation. EclipseLink will also generate the cascade DDL for you.
See:
http://wiki.eclipse.org/EclipseLink/Examples/JPA/DeleteCascade
This optimizes the deletion by letting the database do it, while still maintaining the cache and the persistence unit by removing the objects.
Note that orphanRemoval=true will also delete objects removed from the collection, which the database cascade constraint will not do for you, so having the rules in JPA is still necessary. There are also some relationships for which the database cannot handle deletion, because the database can only cascade in the inverse direction of the constraint: from the referenced row to the rows that reference it. A OneToOne with a foreign key, or a OneToMany with a join table, therefore cannot be cascaded by the database.
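The direction limitation can be seen in a small sketch with sqlite3. The table names source and target are hypothetical: source holds the foreign key, as the owning side of a OneToOne would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE target (id INTEGER PRIMARY KEY);
    CREATE TABLE source (
        id        INTEGER PRIMARY KEY,
        target_id INTEGER REFERENCES target(id) ON DELETE CASCADE
    );
    INSERT INTO target (id) VALUES (10), (20);
    INSERT INTO source (id, target_id) VALUES (1, 10), (2, 20);
""")

def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# Deleting the row that HOLDS the foreign key cascades nowhere: target 10 survives.
conn.execute("DELETE FROM source WHERE id = 1")
after_source_delete = (count("source"), count("target"))

# Deleting the REFERENCED row cascades back to the row that points at it.
conn.execute("DELETE FROM target WHERE id = 20")
after_target_delete = (count("source"), count("target"))
```

So if the goal is "delete the source and take its target with it", the constraint points the wrong way and the ORM (or your own code) has to do the work.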

This answer raises some really strong arguments about why it should be JPA that handles the cascade, not the database.
Here's the relevant quote:
...if you would make cascades on database, and not declare them in
Hibernate (for performance issues) you could in some circumstances get
errors. This is because Hibernate stores entities in its session
cache, so it would not know about database deleting something in
cascade.
When you use second-level cache, your situation is even worse, because
this cache lives longer than session and such changes on db-side will
be invisible to other sessions as long old values are stored in this
cache.
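The stale-cache problem from the quote can be imitated without Hibernate. The sketch below uses sqlite3 and a plain dict as a hypothetical stand-in for a session or second-level cache; it is a toy model of the failure mode, not how Hibernate's caches are implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, data TEXT,
                    a_id INTEGER REFERENCES a(id) ON DELETE CASCADE);
    INSERT INTO a (id) VALUES (1);
    INSERT INTO b (id, data, a_id) VALUES (7, 'child', 1);
""")

cache = {}  # hypothetical stand-in for a session or second-level cache

def find_b(b_id):
    """Return a cached row if present, otherwise load it and cache it."""
    if b_id not in cache:
        cache[b_id] = conn.execute(
            "SELECT id, data FROM b WHERE id = ?", (b_id,)
        ).fetchone()
    return cache[b_id]

find_b(7)                                   # warms the cache
conn.execute("DELETE FROM a WHERE id = 1")  # database cascades to b behind the cache's back

stale = find_b(7)                           # still served from the cache
in_database = conn.execute("SELECT COUNT(*) FROM b WHERE id = 7").fetchone()[0]
```

The cache keeps handing out a row that no longer exists, which is the situation the quoted answer warns about unless the ORM is told about the cascade or the cache is evicted.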

Related

SQLAlchemy: how to disable ORM-level foreign key handling on delete?

I've learned that SQLAlchemy implements some foreign key handling itself, such as setting foreign keys to null when the parent is deleted, separately from the database. This means the ORM and the database can be configured with different behaviors, and which one you get depends on how the delete is issued. Example:
I have Comment and Subscription, where Subscription has a foreign key relationship to Comment. I started by setting backref = backref('subs', cascade = 'delete, delete-orphan') on the Subscription's relationship, and the result was that session.delete(comment) would delete the subscriptions on the comment properly, but session.query(Comment).filter_by(id = id).delete() would fail with a foreign key violation, because the cascade was set at the ORM level instead of the Postgres level.
Needless to say I find this very confusing, and I want to disable it so that all foreign key handling on deletes is done by Postgres. I'm not finding an obvious way to do it looking at the docs. I read about passive deletes, which sounds like it does what I want except that it doesn't apply to objects already loaded into the session.
Is there a way to disable all ORM-level foreign key handling on deletes? And is there a good reason not to?
Huh, I tried out passive deletes and it seems to do what I need after all. The words "The cascade="all, delete-orphan" will take effect for instances of MyOtherClass which are currently present in the session" made me think that it would still manually issue deletes for dependent objects if they were in the session. But if I fetch the parent object and then the dependent one, modify the dependent (verifying that it's in session.dirty), then delete the parent and commit, SQLAlchemy doesn't manually issue a delete for the dependent object.
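For reference, a minimal sketch of the setup described above, assuming SQLAlchemy 1.4 or newer is installed and using in-memory SQLite in place of Postgres. The model and column names are guesses at the asker's schema. The database owns the cascade through ondelete="CASCADE", and passive_deletes=True tells the ORM not to load children just to delete them.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Comment(Base):
    __tablename__ = "comment"
    id = Column(Integer, primary_key=True)
    subs = relationship(
        "Subscription",
        back_populates="comment",
        cascade="all, delete-orphan",
        passive_deletes=True,  # rely on the database's ON DELETE CASCADE
    )

class Subscription(Base):
    __tablename__ = "subscription"
    id = Column(Integer, primary_key=True)
    comment_id = Column(
        Integer, ForeignKey("comment.id", ondelete="CASCADE"), nullable=False
    )
    comment = relationship("Comment", back_populates="subs")

engine = create_engine("sqlite://")

@event.listens_for(engine, "connect")
def enable_foreign_keys(dbapi_connection, connection_record):
    # SQLite only: foreign keys are not enforced unless switched on per connection.
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()

Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Comment(id=1, subs=[Subscription(), Subscription()]),
        Comment(id=2, subs=[Subscription()]),
    ])
    session.commit()

    # Unit-of-work delete through the ORM.
    session.delete(session.get(Comment, 1))
    session.commit()

    # Bulk delete through a query; no ORM cascade runs here at all.
    session.query(Comment).filter_by(id=2).delete(synchronize_session=False)
    session.commit()

    remaining = session.query(Subscription).count()
```

With the cascade declared on the foreign key, both delete paths leave no subscriptions behind, so the two styles no longer disagree.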

SQL practices to persist the reference and remove referenced object

I'm designing a SQL database (with Postgres) and I have a question about the common practice for creating a relationship or reference that persists even when the referenced object is deleted.
For example, there are UserORM, ActivityORM, and UserActivityRelation. ActivityORM holds user.id as a foreign key to record who created the activity, and the relation table records which users should know about the activity.
Now, if I want to remove the actor from the database, I still want ActivityORM and the relation table to persist so that other users can still see the activities. I want to know the most common or best practice for designing such a system. Simple answers might be to not declare them as foreign keys, or to add an inactive state, but I wonder if there are better ways. Thank you.

As you mentioned, removing the problematic foreign key constraints would solve the immediate problem of deleting either users or activities on either side of the bridge table. But this would also allow broken relationships to end up in your database.
I propose using soft deletion here. Add an active bit column to both the UserORM and ActivityORM tables. Then, when you need to delete a user or activity, just mark that record as inactive. This would guarantee that the key relationships do not get broken during the deletion. And this approach would also let you view relationships which used to exist, prior to any deletions.
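A minimal sketch of the soft-deletion idea, using sqlite3 so it runs anywhere. The table and column names (users, activities, user_activity, creator_id, active) are assumptions standing in for UserORM, ActivityORM and the relation table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        active INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE activities (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        creator_id INTEGER NOT NULL REFERENCES users(id),
        active     INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE user_activity (
        user_id     INTEGER NOT NULL REFERENCES users(id),
        activity_id INTEGER NOT NULL REFERENCES activities(id),
        PRIMARY KEY (user_id, activity_id)
    );
    INSERT INTO users (id, name) VALUES (1, 'actor'), (2, 'watcher');
    INSERT INTO activities (id, title, creator_id) VALUES (100, 'posted a photo', 1);
    INSERT INTO user_activity (user_id, activity_id) VALUES (2, 100);
""")

# "Delete" the actor by flipping the flag; no row is removed, so no key breaks.
conn.execute("UPDATE users SET active = 0 WHERE id = 1")

active_users = [r[0] for r in conn.execute(
    "SELECT name FROM users WHERE active = 1 ORDER BY id")]

# The watcher still sees the activity, including who created it.
feed = conn.execute("""
    SELECT a.title, u.name
    FROM user_activity ua
    JOIN activities a ON a.id = ua.activity_id
    JOIN users u      ON u.id = a.creator_id
    WHERE ua.user_id = 2 AND a.active = 1
""").fetchall()
```

The cost is that every query listing "live" users or activities has to filter on the active column.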

Unable to cascade delete a @OneToOne member

@Entity
public class Organization {
    @OneToOne(fetch = FetchType.EAGER)
    @OnDelete(action = OnDeleteAction.CASCADE)
    @Cascade(value = DELETE_ORPHAN)
    private Days days;
}
I have the entity definition above, and it generates SQL to do a cascade delete on the @OneToOne entry when the parent object gets deleted. But the "days" entry is not removed when I delete an Organization.
This happens with both H2 and MySQL. What could be the problem here?
My query looks like this: "delete from Organization where some_key_id = ?" (I am not deleting by primary key ID).
A bulk delete (you should mention that in your question) doesn't cascade to anything. Quoting the JPA 1.0 specification:
4.10 Bulk Update and Delete Operations
...
A delete operation only applies to entities of the specified class and its subclasses. It does not cascade to related entities.
This is a very annoying limitation and there are many RFEs to improve things (HHH-695, HHH-1917, HHH-3337, HHH-5529, etc).
For now, possible solutions include:
1. clean up the child table yourself
2. use cascading foreign keys in the schema.
Now the weird part... My understanding of @OnDelete(action = OnDeleteAction.CASCADE) is that this annotation is supposed to ensure that the foreign key is created with the appropriate ON DELETE CASCADE clause (solution #2). In other words, I'd expect things to work.
Did Hibernate generate the Organization table? Can you check the DDL? Do you see the expected ON DELETE CASCADE? If not, add it.
Well, I guess you should add a
@Cascade(value = {DELETE, DELETE_ORPHAN})
Note that in JPA 2.0 you don't have to use the Hibernate-specific @Cascade annotation: @OneToOne and @OneToMany have an orphanRemoval option.
Update: when using queries, cascades are not handled. You have to handle them manually. This is expected and documented behaviour.
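"Handle them manually" for a bulk delete could look roughly like the sketch below, written with sqlite3 rather than Hibernate. The schema is an assumption: organization holds a days_id foreign key, as the owning side of the OneToOne in the question would, and delete_organizations is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE days (id INTEGER PRIMARY KEY);
    CREATE TABLE organization (
        id          INTEGER PRIMARY KEY,
        some_key_id INTEGER NOT NULL,
        days_id     INTEGER REFERENCES days(id)
    );
    INSERT INTO days (id) VALUES (1), (2), (3);
    INSERT INTO organization (id, some_key_id, days_id)
    VALUES (10, 5, 1), (11, 5, 2), (12, 6, 3);
""")

def delete_organizations(some_key_id):
    """Bulk-delete organizations and their days rows in one transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        days_ids = [row[0] for row in conn.execute(
            "SELECT days_id FROM organization "
            "WHERE some_key_id = ? AND days_id IS NOT NULL",
            (some_key_id,))]
        # Organizations go first because they hold the foreign key to days.
        conn.execute("DELETE FROM organization WHERE some_key_id = ?", (some_key_id,))
        conn.executemany("DELETE FROM days WHERE id = ?", [(d,) for d in days_ids])

delete_organizations(5)
orgs_left = [r[0] for r in conn.execute("SELECT id FROM organization ORDER BY id")]
days_left = [r[0] for r in conn.execute("SELECT id FROM days ORDER BY id")]
```

Collecting the days ids before deleting the organizations matters: once the organization rows are gone, there is nothing left to say which days rows belonged to them.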

Why doesn't NHibernate delete orphans first?

I'm trying to figure out why NHibernate handles one-to-many cascading (using cascade=all-delete-orphan) the way it does. I ran into the same issue as this guy:
Forcing NHibernate to cascade delete before inserts
As far as I can tell NHibernate always performs inserts first, then updates, then deletes. There may be a very good reason for this, but I can't for the life of me figure out what that reason is. I'm hoping that a better understanding of this will help me come up with a solution that I don't hate :)
Are there any good theories on this behavior? In what scenario would deleting orphans first not work? Do all ORMs work this way?
EDIT: After saying there is no reason, here is a reason.
Let's say you have the following scenario:
public class Dog {
    public DogLeg StrongestLeg { get; set; }
    public IList<DogLeg> Legs { get; set; }
}
If you were to delete first, and let's say you delete all of Dog.Legs, then you may delete the StrongestLeg, which would cause a reference violation. Hence you cannot DELETE before you UPDATE.
Let's say you add a new leg, and that new leg is also the StrongestLeg. Then you must INSERT before you UPDATE so that the leg has an Id that can be written into Dog.StrongestLegId.
So you must INSERT, UPDATE, then DELETE.
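The ordering argument can be checked against a real foreign key. This sketch uses sqlite3 with an assumed dog/dog_leg schema and replays the same three statements in different orders; the flush helper is hypothetical and only imitates what an ORM does at flush time.

```python
import sqlite3

def fresh_db():
    """Dog 1 has legs 1 and 2, and leg 1 is currently the strongest."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE dog (
            id               INTEGER PRIMARY KEY,
            strongest_leg_id INTEGER REFERENCES dog_leg(id)
        );
        CREATE TABLE dog_leg (
            id     INTEGER PRIMARY KEY,
            dog_id INTEGER NOT NULL REFERENCES dog(id)
        );
        INSERT INTO dog (id) VALUES (1);
        INSERT INTO dog_leg (id, dog_id) VALUES (1, 1), (2, 1);
        UPDATE dog SET strongest_leg_id = 1 WHERE id = 1;
    """)
    return conn

# The pending change: replace both legs with a new leg 3, which is also the strongest.
INSERT = "INSERT INTO dog_leg (id, dog_id) VALUES (3, 1)"
UPDATE = "UPDATE dog SET strongest_leg_id = 3 WHERE id = 1"
DELETE = "DELETE FROM dog_leg WHERE id IN (1, 2)"

def flush(statements):
    """Run the statements in the given order against a fresh database."""
    conn = fresh_db()
    try:
        for sql in statements:
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "foreign key violation"

delete_first = flush([DELETE, INSERT, UPDATE])      # leg 1 is still referenced by the dog
update_first = flush([UPDATE, INSERT, DELETE])      # leg 3 does not exist yet
insert_update_delete = flush([INSERT, UPDATE, DELETE])
```

Only the INSERT, UPDATE, DELETE order gets through, which matches the reasoning above for this particular shape of data.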
Also, as NHibernate is based on Hibernate, I had a look into Hibernate and found several people talking about the same issue:
Support one-to-many list associations with constraints on both (owner_id, position) and (child_id)
Non lazy loaded List updates done in wrong order, cause exception
wrong insert/delete order when updating record-set
Why does Hibernate perform Inserts before Deletes?
Unidirection OneToMany causes duplicate key entry violation when removing from list
And here is the best answer from them:
Gail Badner added a comment - 21/Feb/08 2:30 PM:
The problem arises when a new association entity with a generated ID is added to the collection. The first step, when merging an entity containing this collection, is to cascade save the new association entity. The cascade must occur before other changes to the collection. Because the unique key for this new association entity is the same as an entity that is already persisted, a ConstraintViolationException is thrown. This is expected behavior.

Restricting deletion with NHibernate

I'm using NHibernate (fluent) to access an old third-party database with a bunch of tables that are not related in any explicit way. That is, the child tables do have parentID columns that contain the primary key of the parent table, but there are no foreign key constraints enforcing these relations. Ideally I would like to add some foreign keys, but I cannot touch the database schema.
My application works fine, but I would really like to impose a referential integrity rule that prohibits deletion of parent objects if they have children, i.e. something similar to 'ON DELETE RESTRICT' but maintained by NHibernate.
Any ideas on how to approach this would be appreciated. Should I look into the OnDelete() method on the IInterceptor interface, or are there other ways to solve this?
Of course any solution will come with a performance penalty, but I can live with that.
I can't think of a way to do this in NHibernate because it would require NHibernate to have some knowledge of the relationships. I would handle this in code using the specification pattern. For example (using a Company object with links to Employee objects):
public class CanDeleteCompanySpecification
{
    public bool IsSatisfiedBy(Company candidate)
    {
        // Check for related Employee records by loading the collection
        // or using COUNT(*).
        // Return true if there are no related records and the Company can be deleted.
        // Note the race: a linked Employee could still be created before the delete commits.
        // Assumption: Company exposes an Employees collection.
        return candidate.Employees.Count == 0;
    }
}
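The COUNT(*) variant of that check might look like the following sketch. It uses Python and sqlite3 instead of NHibernate; the company and employee tables and the delete_company helper are hypothetical, and the schema deliberately has no foreign keys, as in the legacy database described in the question.

```python
import sqlite3

class CanDeleteCompanySpecification:
    """Application-level stand-in for ON DELETE RESTRICT."""

    def __init__(self, conn):
        self._conn = conn

    def is_satisfied_by(self, company_id):
        (employees,) = self._conn.execute(
            "SELECT COUNT(*) FROM employee WHERE company_id = ?", (company_id,)
        ).fetchone()
        return employees == 0

def delete_company(conn, company_id):
    """Delete the company only if the specification allows it."""
    with conn:  # commit on success, roll back on error
        if not CanDeleteCompanySpecification(conn).is_satisfied_by(company_id):
            return False
        conn.execute("DELETE FROM company WHERE id = ?", (company_id,))
        return True

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No foreign keys, as in the legacy schema.
    CREATE TABLE company  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, company_id INTEGER);
    INSERT INTO company  (id, name) VALUES (1, 'has staff'), (2, 'empty');
    INSERT INTO employee (id, company_id) VALUES (100, 1);
""")

blocked = delete_company(conn, 1)
allowed = delete_company(conn, 2)
remaining = [r[0] for r in conn.execute("SELECT id FROM company ORDER BY id")]
```

As the comment in the original snippet hints, this check is not airtight: without a real constraint, another connection can still insert an employee between the check and the delete unless the transaction isolation level or explicit locking prevents it.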