We use NHibernate as our ORM on the server side.
Sometimes, there is a need to delete an object from the database, given only the type and ID of that object. In addition, the deleted object was not fetched before (so it is not in the session cache or whatever).
Anyway, I am using the ISession.Delete(query) overload, where the query is as trivial as from Dummy where Id=5.
My question is why does NHibernate fetch the object before deleting it? As far as I can see I am paying for the double round-trip to the server for an operation, which intuitively should take just one round-trip.
Is there a way to delete an object from the database by its type and ID with NHibernate, so that it takes just one round-trip?
You can also use HQL to delete it yourself.
session.CreateQuery("delete from Table where id = :id")
.SetParameter("id", id)
.ExecuteUpdate();
But this will not cascade deletes and I assume that is one reason it needs to load first.
Do it really hit the db if you call ISession.Delete(Session.Load(type, id)); since Load() should return a proxy and not hit the db?
Related
Lets say I have entity A, which have many to many relationship with another entities of type A. So on entity A, I have collection of A. And lets say I have to "update" this relationships according to some external service - from time to time I receive notification that relations for certain entity has changed, and array of IDs of current related entities - some relations can be new, some existing, some of existing no longer there... How can I effectively update my database with EF ?
Some ideas:
eager load entity with its related entities, do foreach on collection of IDs from external service, and remove/add as needed. But this is not very effective - need to load possibly hundreds of related entities
clear current relations and insert new. But how ? Maybe perform delete by stored procedure, and then insert by "fake" objects
a.Related.Add(new A { Id = idFromArray })
but can this be done in transaction ? (call to stored procedure and then inserts done by SaveChanges)
or is there any 3rd way ?
Thanx.
Well, "from time to time" does not sound like a situation to think much about performance improvement (unless you mean "from millisecond to millisecond") :)
Anyway, the first approach is the correct idea to do this update without a stored procedure. And yes, you must load all old related entities because updating a many-to-many relationship goes only though EFs change detection. There is no exposed foreign key you could leverage to update the relations without having loaded the navigation properties.
An example how this might look in detail is here (fresh question from yesterday):
Selecting & Updating Many-To-Many in Entity Framework 4
(Only the last code snippet before the "Edit" section is relevant to your question and the Edit section itself.)
For your second solution you can wrap the whole operation into a manually created transaction:
using (var scope = new TransactionScope())
{
using (var context = new MyContext())
{
// ... Call Stored Procedure to delete relationships in link table
// ... Insert fake objects for new relationships
context.SaveChanges();
}
scope.Complete();
}
Ok, solution found. Of course, pure EF solution is the first one proposed in original question.
But, if performance matters, there IS a third way, the best one, although it is SQL server specific (afaik) - one procedure with table-valued parameter. All new related IDs goes in, and the stored procedure performs delete and inserts in transaction.
Look for the examples and performance comparison here (great article, i based my solution on it):
http://www.sommarskog.se/arrays-in-sql-2008.html
Let's say we have two entities, A and B. B has a many-to-one relationship to A like follows:
#Entity
public class A {
#OneToMany(mappedBy="a_id")
private List<B> children;
}
#Entity
public class B {
private String data;
}
Now, I want to delete the A object and cascade the deletions to all its children B. There are two ways to do this:
Add cascade=CascadeType.ALL, orphanRemoval=true to the OneToMany annotation, letting JPA remove all children before removing the A-object from the database.
Leave the classes as they are and simply let the database cascade the deletion.
Is there any problem with using the later option? Will it cause the Entity Manager to keep references to already deleted objects? My reason for choosing option two over one is that option one generates n+1 SQL queries for a removal, which can take a prolonged time when object A contains a lot of children, while option two only generates a single SQL query and then moves on happily. Is there any "best practice" regarding this?
I'd prefer the database. Why?
The database is probably a lot faster doing this
The database should be the primary place to hold integrity and relationship information. JPA is just reflecting that information
If you're connecting with a different application / platform (i.e. without JPA), you can still cascadingly delete your records, which helps increase data integrity
In EclipseLink you can use both if you use the #CascadeOnDelete annotation. EclipseLink will also generate the cascade DDL for you.
See,
http://wiki.eclipse.org/EclipseLink/Examples/JPA/DeleteCascade
This optimizes the deletion by letting the database do it, but also maintains the cache and the persistence unit by removing the objects.
Note that orphanRemoval=true will also delete objects removed from the collection, which the database cascade constraint will not do for you, so having the rules in JPA is still necessary. There are also some relationships that the database cannot handle deletion for, as the database can only cascade in the inverse direction of the constraint, a OneToOne with a foreign key, or a OneToMany with a join table cannot be cascaded on the database.
This answer raises some really strong arguments about why it should be JPA that handles the cascade, not the database.
Here's the relevant quote:
...if you would make cascades on database, and not declare them in
Hibernate (for performance issues) you could in some circumstances get
errors. This is because Hibernate stores entities in its session
cache, so it would not know about database deleting something in
cascade.
When you use second-level cache, your situation is even worse, because
this cache lives longer than session and such changes on db-side will
be invisible to other sessions as long old values are stored in this
cache.
I'm trying to use 'adonet.batch_size' property in NHibernate. Now, I'm creating entities across multiple sessions at a large rate (hence batch inserting). So what I'm doing is creating a buffer where I keep these entities and them flush them out all at once periodically.
However I need the ID's as soon as I create the entities. So I want to create an entity (in any session) and then have its ID generated (I'm using HiLo generator). And then at a later time (and other session) I want to flush that buffer and ensure that those IDs do not change.
Is there anyway to do this?
Thanks
Guido
I find it odd that you need many sessions to do a single job. Normally a single session is enough to do all work.
That said, the Hilo generator sets the id property on the entity when calling nhSession.Save(object) without necessarily requiring a round-trip to the database and a
nhSession.Flush() will flush the inserts to the database
UPDATE ===========================================================================
This is a method i used on a specific case that made pure-sql inserts while maintaining NHibernate compatibility.
//this will get the value and update the hi-lo value repository in the datastore
public static void GenerateIdentifier(object target)
{
var targetType = target.GetType();
var classMapping = NHibernateSessionManager.Instance.Configuration.GetClassMapping(targetType);
var impl = NHibernateSessionManager.Instance.GetSession().GetSessionImplementation();
var newId = classMapping.Identifier.CreateIdentifierGenerator(impl.Factory.Dialect, classMapping.Table.Catalog, classMapping.Table.Schema,
classMapping.RootClazz).Generate(impl, target);
classMapping.IdentifierProperty.GetSetter(targetType).Set(target, newId);
}
So, this method takes your newly constructed entity like
var myEnt = new MyEnt(); //has default identifier
GenerateIdentifier(myEnt); //now has identifier injected based on nhibernate's mapping
note that this call does not place the entity in any kind of nhibernate managed space. So you still have to make a place to place your objects and make the save on each one. Also note that i used this one with pure sql inserts and unless you specify generator="assigned" (which will then require some custom hi-lo generator) in your entity mapping nhibernate may require a different mechanism to persist it.
All in all, what you want is to generate an Id for an object that will be persisted at some time in the future. This brings up some problems such as handling non-existent entries due to rollbacks and failed commits. Additionally imo nhibernate is not the tool for this particular job, you don't need nhibernate to do your bulk insert unless there is some complex entity logic that is too costly (in dev time) to implement on your own.
Also note that you are implying that you need transient detached entities which however cannot be used unless you call .nhSes.Save(obj) on the first session and flush its contents so the 2nd session when it calls Load on the transient object there will be an existing row in the database which contradicts what you want to achieve.
Imo don't be afraid of storming the database, just optimise the procedure top-to-bottom to be able to handle the volume. Using nhibernate just to do an insert seems counter-productive when you can achieve the same result with 4 times the performance using ado.net or even an isqlquery wrapped-query (and use the method i provided above)
I am wondering how can one delete an entity having just its ID and type (as in mapping) using NHibernate 2.1?
If you are using lazy loading, Load only creates a proxy.
session.Delete(session.Load(type, id));
With NH 2.1 you can use HQL. Not sure how it actually looks like, but something like this: note that this is subject to SQL injection - if possible use parametrized queries instead with SetParameter()
session.Delete(string.Format("from {0} where id = {1}", type, id));
Edit:
For Load, you don't need to know the name of the Id column.
If you need to know it, you can get it by the NH metadata:
sessionFactory.GetClassMetadata(type).IdentifierPropertyName
Another edit.
session.Delete() is instantiating the entity
When using session.Delete(), NH loads the entity anyway. At the beginning I didn't like it. Then I realized the advantages. If the entity is part of a complex structure using inheritance, collections or "any"-references, it is actually more efficient.
For instance, if class A and B both inherit from Base, it doesn't try to delete data in table B when the actual entity is of type A. This wouldn't be possible without loading the actual object. This is particularly important when there are many inherited types which also consist of many additional tables each.
The same situation is given when you have a collection of Bases, which happen to be all instances of A. When loading the collection in memory, NH knows that it doesn't need to remove any B-stuff.
If the entity A has a collection of Bs, which contains Cs (and so on), it doesn't try to delete any Cs when the collection of Bs is empty. This is only possible when reading the collection. This is particularly important when C is complex of its own, aggregating even more tables and so on.
The more complex and dynamic the structure is, the more efficient is it to load actual data instead of "blindly" deleting it.
HQL Deletes have pitfalls
HQL deletes to not load data to memory. But HQL-deletes aren't that smart. They basically translate the entity name to the corresponding table name and remove that from the database. Additionally, it deletes some aggregated collection data.
In simple structures, this may work well and efficient. In complex structures, not everything is deleted, leading to constraint violations or "database memory leaks".
Conclusion
I also tried to optimize deletion with NH. I gave up in most of the cases, because NH is still smarter, it "just works" and is usually fast enough. One of the most complex deletion algorithms I wrote is analyzing NH mapping definitions and building delete statements from that. And - no surprise - it is not possible without reading data from the database before deleting. (I just reduced it to only load primary keys.)
When I am adding a new row in the database, I noticed the Add method returns void.
Is it possible to get the newly created row's ID returned in the add operation?
When you call session.Save() on your entity, your entity should have the Id modified.
You probably want to check your nhibernate configuration if your save is occurring immediately or if your saving is batching. If it's batching, then you'll need to commit (or flush, someone will correct this... hopefully) your session.