Delete big entity efficiently with NHibernate 3 - nhibernate

I have an entity which contains several sets/bags.
Is there any recommendation from NHibernate on how to delete it?
Of course I can do a foreach over each list and delete each child, but that will create many DELETE statements. Is it better to write an HQL delete for each child table, or is there another approach?
I've also seen a suggestion in another thread to use IStatelessSession. Is that wise here?

Personally I think HQL works well in this instance.
Or, if you can stomach it, use cascading deletion at the database level; then you delete the parent and the children are deleted automatically.

Related

How do you map a HasOne relationship with nHibernate Fluent mapping and avoid N+1?

I have 2 tables, ATable and AATable, which share a primary key (ATable.aKey and AATable.aKey) to represent a one-to-one relationship. For my Fluent mapping I have a HasOne relationship defined within my Fluent ATableMapping, all of which works fine. However, I have noticed that querying for ATable generates a 2nd query (N+1) for the child table AATable. My understanding is that HasOne eager loads by default, and I had assumed this would be part of the query for ATable, but I may well have this wrong?
I have researched various solutions including using .Not.LazyLoad().Fetch.Join(), PropertyRef, ForeignKey but I cannot seem to resolve the n+1 so that either it is Eager loaded with 1 query, or Lazy loaded and I can fetch the child with my queries.
Has anyone had any issues with this or have an example they know to work with no n+1? Grateful for any advice.
You have two options:
1. Use Not.LazyLoad(), which disables lazy loading of the related entity and forces NHibernate to include the corresponding subselect within the original query.
2. Use a component mapping so that both entities point to the same table. This is the better approach: once you have decided to fetch both entities together, the generated queries hit only one table rather than two as with the first option, which is better for performance.

How to effectively refresh many to many relationship

Let's say I have an entity A which has a many-to-many relationship with other entities of type A, so entity A holds a collection of A. And let's say I have to "update" these relationships according to some external service: from time to time I receive a notification that the relations for a certain entity have changed, along with an array of IDs of the currently related entities. Some relations can be new, some already exist, and some existing ones are no longer there. How can I efficiently update my database with EF?
Some ideas:
1. Eager load the entity with its related entities, loop over the collection of IDs from the external service, and remove/add as needed. But this is not very efficient: it may need to load hundreds of related entities.
2. Clear the current relations and insert the new ones. But how? Maybe perform the delete with a stored procedure, and then insert using "fake" objects:
a.Related.Add(new A { Id = idFromArray })
But can this be done in a transaction (a call to the stored procedure followed by inserts done by SaveChanges)?
Or is there a third way?
Thanks.
Well, "from time to time" does not sound like a situation to think much about performance improvement (unless you mean "from millisecond to millisecond") :)
Anyway, the first approach is the correct idea for doing this update without a stored procedure. And yes, you must load all the old related entities, because updating a many-to-many relationship goes only through EF's change detection. There is no exposed foreign key you could leverage to update the relations without having loaded the navigation properties.
An example how this might look in detail is here (fresh question from yesterday):
Selecting & Updating Many-To-Many in Entity Framework 4
(Only the last code snippet before the "Edit" section is relevant to your question and the Edit section itself.)
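The add/remove step of the first approach boils down to two set differences between the IDs that are currently related and the IDs reported by the external service. The following is a minimal, language-neutral sketch of that reconciliation, written in Java rather than against the EF API; the class name RelationDiff and its members are hypothetical names chosen for illustration. With EF you would still have to load the related entities and apply the result to the navigation collection.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: computes which relations to add and which to remove. */
final class RelationDiff {

    /** Result holder: IDs to link and IDs to unlink. */
    static final class Changes {
        final Set<Long> toAdd;
        final Set<Long> toRemove;

        Changes(Set<Long> toAdd, Set<Long> toRemove) {
            this.toAdd = toAdd;
            this.toRemove = toRemove;
        }
    }

    /**
     * currentIds: IDs related right now (loaded from the database).
     * desiredIds: IDs received from the external service.
     */
    static Changes diff(Set<Long> currentIds, Set<Long> desiredIds) {
        // New relations: desired but not yet present.
        Set<Long> toAdd = new HashSet<>(desiredIds);
        toAdd.removeAll(currentIds);

        // Obsolete relations: present but no longer desired.
        Set<Long> toRemove = new HashSet<>(currentIds);
        toRemove.removeAll(desiredIds);

        return new Changes(toAdd, toRemove);
    }
}
```

Relations that appear in both sets are left untouched, so only the real changes produce inserts or deletes in the link table.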
For your second solution you can wrap the whole operation into a manually created transaction:
using (var scope = new TransactionScope())
{
    using (var context = new MyContext())
    {
        // ... Call Stored Procedure to delete relationships in link table
        // ... Insert fake objects for new relationships
        context.SaveChanges();
    }
    scope.Complete();
}
OK, solution found. Of course, the pure EF solution is the first one proposed in the original question.
But if performance matters, there IS a third way, the best one, although it is SQL Server specific (afaik): one stored procedure with a table-valued parameter. All the new related IDs go in, and the stored procedure performs the deletes and inserts in a transaction.
Look for the examples and performance comparison here (great article, I based my solution on it):
http://www.sommarskog.se/arrays-in-sql-2008.html

Should I let JPA or the database cascade deletions?

Let's say we have two entities, A and B. B has a many-to-one relationship to A as follows:
@Entity
public class A {
    @OneToMany(mappedBy="a_id")
    private List<B> children;
}
@Entity
public class B {
    private String data;
}
Now, I want to delete the A object and cascade the deletions to all its children B. There are two ways to do this:
1. Add cascade=CascadeType.ALL, orphanRemoval=true to the OneToMany annotation, letting JPA remove all children before removing the A object from the database.
2. Leave the classes as they are and simply let the database cascade the deletion.
Is there any problem with using the latter option? Will it cause the Entity Manager to keep references to already deleted objects? My reason for choosing option two over option one is that option one generates n+1 SQL queries for a removal, which can take a long time when object A contains a lot of children, while option two generates only a single SQL query and then moves on happily. Is there any "best practice" regarding this?
I'd prefer the database. Why?
The database is probably a lot faster doing this
The database should be the primary place to hold integrity and relationship information. JPA is just reflecting that information
If you're connecting with a different application / platform (i.e. without JPA), you can still cascadingly delete your records, which helps increase data integrity
In EclipseLink you can use both if you use the @CascadeOnDelete annotation. EclipseLink will also generate the cascade DDL for you.
See,
http://wiki.eclipse.org/EclipseLink/Examples/JPA/DeleteCascade
This optimizes the deletion by letting the database do it, but also maintains the cache and the persistence unit by removing the objects.
Note that orphanRemoval=true will also delete objects removed from the collection, which the database cascade constraint will not do for you, so having the rules in JPA is still necessary. There are also some relationships for which the database cannot handle deletion, because the database can only cascade in the inverse direction of the constraint: a OneToOne with a foreign key, or a OneToMany with a join table, cannot be cascaded on the database.
This answer raises some really strong arguments about why it should be JPA that handles the cascade, not the database.
Here's the relevant quote:
...if you would make cascades on database, and not declare them in
Hibernate (for performance issues) you could in some circumstances get
errors. This is because Hibernate stores entities in its session
cache, so it would not know about database deleting something in
cascade.
When you use second-level cache, your situation is even worse, because
this cache lives longer than session and such changes on db-side will
be invisible to other sessions as long old values are stored in this
cache.
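The stale-cache problem described in the quote can be illustrated with a toy model. This is only a sketch under simplifying assumptions: two in-memory maps stand in for the database table and the session cache, and none of the names (StaleCacheDemo, load, deleteParentWithDbCascade) belong to the Hibernate or JPA API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of a session cache in front of a database; not the Hibernate API. */
final class StaleCacheDemo {
    // "Database": child id -> parent id, deleted with ON DELETE CASCADE semantics.
    private final Map<Integer, Integer> childRows = new HashMap<>();
    // "Session cache": child id -> entity that was already loaded.
    private final Map<Integer, String> sessionCache = new HashMap<>();

    void insertChild(int childId, int parentId) {
        childRows.put(childId, parentId);
    }

    /** Loads through the cache first, as an ORM session would. */
    Optional<String> load(int childId) {
        String cached = sessionCache.get(childId);
        if (cached != null) {
            return Optional.of(cached);
        }
        if (!childRows.containsKey(childId)) {
            return Optional.empty();
        }
        String entity = "B#" + childId;
        sessionCache.put(childId, entity);
        return Optional.of(entity);
    }

    /** Database-side cascade: the child rows vanish, but the cache is never told. */
    void deleteParentWithDbCascade(int parentId) {
        childRows.values().removeIf(p -> p == parentId);
    }

    boolean existsInDatabase(int childId) {
        return childRows.containsKey(childId);
    }
}
```

In this model, a child that was loaded before the parent was deleted is still returned by load afterwards, even though its row is gone. That is the kind of inconsistency the quote warns about, and a second-level cache would extend it across sessions.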

Why doesn't NHibernate delete orphans first?

I'm trying to figure out why NHibernate handles one-to-many cascading (using cascade=all-delete-orphan) the way it does. I ran into the same issue as this guy:
Forcing NHibernate to cascade delete before inserts
As far as I can tell NHibernate always performs inserts first, then updates, then deletes. There may be a very good reason for this, but I can't for the life of me figure out what that reason is. I'm hoping that a better understanding of this will help me come up with a solution that I don't hate :)
Are there any good theories on this behavior? In what scenario would deleting orphans first not work? Do all ORMs work this way?
EDIT: After saying there is no reason, here is a reason.
Let's say you have the following scenario:
public class Dog {
    public DogLeg StrongestLeg { get; set; }
    public IList<DogLeg> Legs { get; set; }
}
If you were to delete first, and let's say you delete all of Dog.Legs, then you may delete the StrongestLeg, which would cause a reference violation. Hence you cannot DELETE before you UPDATE.
Let's say you add a new leg, and that new leg is also the StrongestLeg. Then you must INSERT before you UPDATE so that the leg has an Id that can be written into Dog.StrongestLegId.
So you must INSERT, UPDATE, then DELETE.
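The ordering argument above can be checked with a toy in-memory model of the two tables and the foreign key from Dog.StrongestLegId to DogLeg. This is a sketch written in Java, not NHibernate code; the class FlushOrderDemo, its methods, and the ids 1, 10 and 11 are all made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the Dog/DogLeg tables with one foreign key; not NHibernate code. */
final class FlushOrderDemo {
    // Leg ids present in the DogLeg table.
    private final Set<Integer> legs = new HashSet<>();
    // Dog id -> StrongestLegId, a foreign key into the DogLeg table.
    private final Map<Integer, Integer> strongestLegOfDog = new HashMap<>();

    void insertLeg(int legId) {
        legs.add(legId);
    }

    void updateStrongestLeg(int dogId, int legId) {
        if (!legs.contains(legId)) {
            throw new IllegalStateException("FK violation: leg " + legId + " does not exist");
        }
        strongestLegOfDog.put(dogId, legId);
    }

    void deleteLeg(int legId) {
        if (strongestLegOfDog.containsValue(legId)) {
            throw new IllegalStateException("FK violation: leg " + legId + " is still referenced");
        }
        legs.remove(legId);
    }

    /** Dog 1 starts with leg 10 as its strongest leg. */
    static FlushOrderDemo initialState() {
        FlushOrderDemo db = new FlushOrderDemo();
        db.insertLeg(10);
        db.updateStrongestLeg(1, 10);
        return db;
    }

    /** Replaces leg 10 with new leg 11 in INSERT, UPDATE, DELETE order. */
    static boolean insertUpdateDeleteSucceeds() {
        FlushOrderDemo db = initialState();
        try {
            db.insertLeg(11);
            db.updateStrongestLeg(1, 11);
            db.deleteLeg(10);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Same change, but the DELETE is attempted first. */
    static boolean deleteFirstSucceeds() {
        FlushOrderDemo db = initialState();
        try {
            db.deleteLeg(10); // leg 10 is still referenced by dog 1
            db.insertLeg(11);
            db.updateStrongestLeg(1, 11);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

In this model the INSERT, UPDATE, DELETE sequence completes, while attempting the DELETE first fails on the foreign key check, which matches the reasoning above.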
Also, as NHibernate is based on Hibernate, I had a look into Hibernate and found several people talking about the same issue:
Support one-to-many list associations with constraints on both (owner_id, position) and (child_id)
Non lazy loaded List updates done in wrong order, cause exception
wrong insert/delete order when updating record-set
Why does Hibernate perform Inserts before Deletes?
Unidirection OneToMany causes duplicate key entry violation when removing from list
And here is the best answer from them:
Gail Badner added a comment - 21/Feb/08 2:30 PM: The problem arises when a new association entity with a generated ID is added to the collection. The first step, when merging an entity containing this collection, is to cascade save the new association entity. The cascade must occur before other changes to the collection. Because the unique key for this new association entity is the same as an entity that is already persisted, a ConstraintViolationException is thrown. This is expected behavior.

Override delete behaviour in NHibernate

In my application users cannot truly delete records. Rather, the record's Deleted field gets set to 1, which hides it from selects.
I need to maintain this behaviour and I'm looking into whether NHibernate is appropriate for my app. Can I override NHibernate's delete behaviour so that instead of issuing DELETE statements, it issues UPDATE statements, as described above?
I would obviously also need to override its SELECT behaviour to include the 'AND Deleted = 0' clause. Or read from a view instead. I'm not sure.
To implement a soft delete, just bypass the Hibernate delete mechanism. Instead, map your table's Deleted field to a .NET boolean property of the same name. To delete an item, set item.Deleted = true. Then add a where attribute to your class mapping to filter out the deleted items. If you like, create another mapping for deleted items. Otherwise they will become invisible to your application, but maybe that's what you want.
Edit: Here is perhaps a better approach: use the <sql-delete> tag to write a custom delete operation for your mapping. See http://docs.jboss.org/hibernate/core/3.3/reference/en/html/querysql.html#querysql-cud. I think this in combo with the where attribute would be just the ticket. For example:
<class name="MyClass" table="my_table" where="deleted=0">
    ...
    <sql-delete>UPDATE my_table SET deleted=1 WHERE id=?</sql-delete>
</class>
I implemented this using INSTEAD OF DELETE triggers on SQL Server to set a delete flag bit when a SQL DELETE is issued. This has two benefits: 1) I can just issue deletes from my app. without worry and 2) It enforces soft deletes for all database access (i.e. the trigger must be temporarily disabled to hard delete). I then set up views that select the active records only and mapped NHibernate to those. This solution has worked very well for me.
I think the best way to get such behaviour would be to implement the IInterceptor interface, which allows you to run your own code, as shown in the NHibernate documentation.
Otherwise, you could simply create a trigger on delete that performs an update. This solution is simpler, but is it suitable for your needs?
As for the SELECT, you only need to write a method that uses a Criterion with a Where clause to specify the Deleted=0 condition.