EF: How to do effective lazy-loading (not 1+N selects)?

Starting with a List of entities and needing all dependent entities through an association, is there a way to use the corresponding navigation property to load all child entities with one db round-trip? I.e., generate a single WHERE fkId IN (...) statement via the navigation property?
More details
I've found these ways to load the children:
1. Keep the set of parent entities as IQueryable<T>.
Not good, since the db will have to find the main set every time and join to get the requested data.
2. Put the parent objects into an array or list, then get related data through navigation properties.
var children = parentArray.Select(p => p.Children).Distinct()
This is slow since it will generate a select for every parent entity.
It also creates duplicate objects since each set of children is created independently.
3. Put the foreign keys from the main entities into an array, then filter the entire dependent ObjectSet.
var foreignKeyIds = parentArray.Select(p => p.Id).ToArray();
var children = Children.Where(d => foreignKeyIds.Contains(d.ParentId)); // assuming the child's FK property is named ParentId
LINQ then generates the desired "WHERE foreignKeyId IN (...)" clause.
This is fast, but only possible for 1:* relations, since linking tables are mapped away.
It removes the readability advantage of EF by using Ids after all.
The navigation properties of type EntityCollection<T> are not populated.
4. Eager loading through the .Include() methods, included for completeness (I'm asking about lazy loading).
Allegedly joins everything included together and returns one giant flat result.
You have to decide up front which data to use.
Is there some way to get the simplicity of 2 with the performance of 3?

You could attach the parent object to your context and get the children when needed.
foreach (T parent in parents)
{
    _context.Attach(parent);
}
var children = parents.Select(p => p.Children);
Edit: for attaching multiple, just iterate.
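One possible way to get the simplicity of 2 with the performance of 3, building on the attach idea (a hedged sketch, not part of the original answer): with lazy loading disabled, load all children with a single Contains query into the same context and let relationship fix-up attach them to the tracked parents. The Children set and the ParentId property are hypothetical names; verify the fix-up behaviour against your own model.
// Hedged sketch: assumes lazy loading is off, so reading parent.Children
// does not trigger an extra load per parent.
_context.ContextOptions.LazyLoadingEnabled = false;

var parentIds = parents.Select(p => p.Id).ToArray();

// One round-trip: WHERE ParentId IN (...). Because parents and children are
// tracked by the same context, relationship fix-up should populate
// parent.Children as the rows are materialized.
var allChildren = _context.Children
    .Where(c => parentIds.Contains(c.ParentId))
    .ToList();

foreach (var parent in parents)
{
    Console.WriteLine("{0}: {1} children", parent.Id, parent.Children.Count);
}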

I think finding a good answer is not possible, or at least not worth the trouble. Instead, a micro-ORM like Dapper gives the big benefit of removing the need to map between SQL columns and object properties, and does so without the need to create a model first. Also, one simply writes the desired SQL instead of figuring out what LINQ to write to have it generated. IQueryable<T> will be missed, though.
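For reference, a minimal Dapper sketch of the same "children by foreign key" query; the connection string, table, column, and Child type are assumptions. Dapper expands the enumerable parameter into an IN (...) list and maps columns onto the object's properties by name.
// Hedged sketch: Dapper turns "IN @ids" into "IN (@ids1, @ids2, ...)".
using (var connection = new SqlConnection(connectionString))
{
    var parentIds = parentArray.Select(p => p.Id).ToArray();

    IEnumerable<Child> children = connection.Query<Child>(
        "SELECT * FROM Children WHERE ParentId IN @ids",
        new { ids = parentIds });
}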


NHibernate QueryOver doesn't get latest database changes

I am trying to get a record that was updated in the database with QueryOver.
My code initially creates an entity and saves it to the database; then the same record is updated externally (from another program, manually, or by the same program running on another machine). When I call QueryOver filtering by the changed field, the query returns the record, but without the latest changes.
This is my code:
//create the entity and save in database
MyEntity myEntity = CreateDummyEntity();
myEntity.Name = "new_name";
MyService.SaveEntity(myEntity);
// now the entity is updated externally, changing the Name property to the
// "modified_name" value (for example manually in TOAD, SQL Server, etc.)
//get the entity with QueryOver
var result = NhibernateHelper.Session
    .QueryOver<MyEntity>()
    .Where(param => param.Name == "modified_name")
    .List<MyEntity>();
The previous statement gets a collection with only one record (good), BUT with the Name property set to the old value instead of "modified_name".
How can I fix this behaviour? Is the first-level cache getting in my way? The same problem occurs with
CreateCriteria<T>();
The session in my NhibernateHelper is never closed, due to application framework requirements; only transactions are created for each commit associated with a session.Save().
If I open a new session to execute the query I obviously get the latest changes from the database, but this approach is not allowed by the design requirements.
I have also checked in the NHibernate SQL output that a select with a WHERE clause is executed (so NHibernate does hit the database), but it doesn't update the returned object!
UPDATE
Here's the code in SaveEntity; after the call to session.Save, a call to the Commit method is made:
public virtual void Commit()
{
    try
    {
        this.session.Flush();
        this.transaction.Commit();
    }
    catch
    {
        this.transaction.Rollback();
        throw;
    }
    finally
    {
        this.transaction = this.session.BeginTransaction();
    }
}
The SQL generated by NHibernate for SaveEntity:
NHibernate: INSERT INTO MYCOMPANY.MYENTITY (NAME) VALUES (:p0);:p0 = 'new_name'.
The SQL generated by NHibernate for QueryOver:
NHibernate: SELECT this_.NAME as NAME26_0_
FROM MYCOMPANY.MYENTITY this_
WHERE this_.NAME = :p0;:p0 = 'modified_name' [Type: String (0)].
The queries have been modified due to company confidentiality policies.
Any help is very much appreciated.
As far as I know, you have several options:
have your Session as an IStatelessSession, by calling sessionFactory.OpenStatelessSession() instead of sessionFactory.OpenSession() (see the sketch after this list)
perform Session.Evict(myEntity) after persisting an entity in the DB
perform Session.Clear() before your QueryOver
set the CacheMode of your Session to Ignore, Put or Refresh before your QueryOver (never tested that)
I guess the choice will depend on how you use your long-running sessions (which, IMHO, seem to bring more problems than solutions).
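A minimal sketch of the stateless-session option, assuming the ISessionFactory behind NhibernateHelper is reachable; a stateless session has no first-level cache, so the result reflects the current database state:
// Hedged sketch: IStatelessSession bypasses the session cache entirely.
using (IStatelessSession statelessSession = sessionFactory.OpenStatelessSession())
{
    var result = statelessSession.QueryOver<MyEntity>()
        .Where(x => x.Name == "modified_name")
        .List<MyEntity>();
}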
Calling session.Save(myEntity) does not cause the changes to be persisted to the DB immediately*. These changes are persisted when session.Flush() is called, either by the framework itself or by yourself. More information about flushing and when it is invoked can be found in this question and the NHibernate documentation about flushing.
Also, performing a query will not cause the first-level cache to be hit. This is because the first-level cache only works with Get and Load; i.e., session.Get<MyEntity>(1) would hit the first-level cache if MyEntity with an id of 1 had already been loaded previously, whereas session.QueryOver<MyEntity>().Where(x => x.Id == 1) would not.
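A small illustration of that difference (a hedged sketch; the Id property and the value 1 are hypothetical):
// Get/Load can be answered from the first-level cache:
var byId = session.Get<MyEntity>(1);     // SQL only if the entity is not cached yet
var cached = session.Get<MyEntity>(1);   // no SQL, served from the session cache

// QueryOver always issues SQL, regardless of what the session has cached:
var byQuery = session.QueryOver<MyEntity>()
    .Where(x => x.Id == 1)
    .SingleOrDefault();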
Further information about NHibernate's caching functionality can be found in this post by Ayende Rahien.
In summary you have two options:
Use a transaction within the SaveEntity method, i.e.
using (var transaction = Helper.Session.BeginTransaction())
{
    Helper.Session.Save(myEntity);
    transaction.Commit();
}
Call session.Flush() within the SaveEntity method, i.e.
Helper.Session.Save(myEntity);
Helper.Session.Flush();
The first option is the best in pretty much all scenarios.
*The only exception I know to this rule is when using Identity as the id generator type.
Try changing your last query to:
var result = NhibernateHelper.Session
    .QueryOver<MyEntity>()
    .CacheMode(CacheMode.Refresh)
    .Where(param => param.Name == "modified_name")
    .List<MyEntity>();
If that still doesn't work, try refreshing each returned entity after the query:
foreach (var entity in result)
    NhibernateHelper.Session.Refresh(entity);
After much searching and thinking... I've found the solution.
The fix: open a new session and call QueryOver<T>() in that session, and the data is successfully refreshed. If you get child collections that are not initialized, you can call NHibernateUtil.Initialize(entity) or set lazy="false" in your mappings. Take special care with lazy="false" on large collections, because you can get poor performance. To fix that performance problem when loading large collections, set lazy="true" in your collection mappings and call NHibernateUtil.Initialize(collection) on the affected collection to get the child records from the database; for example, you can get all records from a table, and if you need access to all child records of a specific entity, call NHibernateUtil.Initialize(collection) only for the objects you are interested in.
Note: as #martin ernst says, the update problem may be a bug in NHibernate and my solution is only a temporary fix; it should be solved in NHibernate itself.
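A minimal sketch of that workaround, assuming access to the ISessionFactory behind NhibernateHelper and a hypothetical lazy Children collection on MyEntity:
// Hedged sketch: query in a fresh session so the long-running session's
// first-level cache cannot hand back stale state.
using (var freshSession = sessionFactory.OpenSession())
{
    var result = freshSession.QueryOver<MyEntity>()
        .Where(x => x.Name == "modified_name")
        .List<MyEntity>();

    foreach (var entity in result)
        NHibernateUtil.Initialize(entity.Children); // hypothetical lazy collection
}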
People here do not want to call Session.Clear() since it is too strong.
On the other hand, Session.Evict() may seem inapplicable when the objects are not known beforehand.
Actually it is still usable.
You first retrieve the cached objects using the query, then call Evict() on them, and then retrieve fresh objects by running the same query again.
This approach is slightly inefficient if the object was not cached to begin with, since there are then two "fresh" queries, but there does not seem to be much to do about that shortcoming...
By the way, Evict() also accepts a null argument without throwing; this is useful in case the queried object is not actually present in the DB.
var cachedObjects = NhibernateHelper.Session
    .QueryOver<MyEntity>()
    .Where(param => param.Name == "modified_name")
    .List<MyEntity>();

foreach (var obj in cachedObjects)
    NhibernateHelper.Session.Evict(obj);

var freshObjects = NhibernateHelper.Session
    .QueryOver<MyEntity>()
    .Where(param => param.Name == "modified_name")
    .List<MyEntity>();
I'm getting something very similar, and have tried debugging NHibernate.
In my scenario, the session creates an object with a couple of children in a related collection (cascade: all), and then calls ISession.Flush().
The records are written into the DB, and the session needs to continue without closing. Meanwhile, another two child records are written into the DB and committed.
When the original session then attempts to re-load the graph using QueryOver with JoinAlias, the generated SQL statement looks perfectly fine and the rows are returned correctly. However, the collection that should receive these new children is found to have already been initialized within the session (as it should be), and based on that NH decides for some reason to completely ignore the respective rows.
I think NH makes an invalid assumption here that if the collection is already marked "Initialized" it does not need to be re-loaded from the query.
It would be great if someone more familiar with NHibernate internals could chime in on this.
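I have not verified this against that exact scenario, but in line with the Evict() suggestions above, one workaround sketch is to detach the stale parent before re-querying, so NHibernate rebuilds the collection from the returned rows (Parent, Child and the property names are placeholders):
// Hedged sketch: evict the tracked parent so its already-initialized
// collection is discarded, then re-run the QueryOver with the join.
Child childAlias = null;

session.Evict(parent);

var reloaded = session.QueryOver<Parent>()
    .JoinAlias(p => p.Children, () => childAlias)
    .Where(p => p.Id == parentId)
    .List();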

How can I speed my Entity Framework code?

My SQL and Entity Framework knowledge is somewhat limited. In one Entity Framework (4) application, I notice it takes forever (about 2 minutes) to complete one of my method calls. The first queries do not take much time, but when I loop through the Entity Framework objects returned by the queries, even though I am only reading (not modifying) the data I supposedly already got, it takes forever to complete the nested loops, even though there are only dozens of entries in each list and a few levels of looping.
I expect the example below could be re-written with a fancier query that could probably include all of the filtering I am doing in my loops with some SQL words I don't really know how to use, so if someone could show me what the equivalent SQL expression would be, that would be extremely educational to me and probably solve my current performance problem.
Moreover, since other parts of this and other applications I develop often want to do more complex computations on SQL data, I would also like to know a good way to retrieve data from Entity Framework to local memory objects that do not have huge delays in reading them. In my LINQ-to-SQL project there was a similar performance problem, and I solved it by refactoring the whole application to load all SQL data into parallel objects in RAM, which I had to write myself, and I wonder if there isn't a better way to either tell Entity Framework to not keep doing whatever high-latency communication it is doing, or to load into local RAM objects.
In the example below, the code gets a list of food menu items for a member (i.e. a person) on a certain date via a SQL query, and then I use other queries and loops to filter out the menu items on two criteria: 1) If the member has a rating of zero for any group id which the recipe is a member of (a many-to-many relationship) and 2) If the member has a rating of zero for the recipe itself.
Example:
List<PFW_Member_MenuItem> MemberMenuForCookDate =
    (from item in _myPfwEntities.PFW_Member_MenuItem
     where item.MemberID == forMemberId
     where item.CookDate == onCookDate
     select item).ToList();

// Now filter out recipes in recipe groups rated zero by the member:
List<PFW_Member_Rating_RecipeGroup> ExcludedGroups =
    (from grpRating in _myPfwEntities.PFW_Member_Rating_RecipeGroup
     where grpRating.MemberID == forMemberId
     where grpRating.Rating == 0
     select grpRating).ToList();

foreach (PFW_Member_Rating_RecipeGroup grpToExclude in ExcludedGroups)
{
    List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
    foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
    {
        PFW_Recipe rcp = GetRecipeById(rcpOnMenu.RecipeID);
        foreach (PFW_RecipeGroup group in rcp.PFW_RecipeGroup)
        {
            if (group.RecipeGroupID == grpToExclude.RecipeGroupID)
            {
                rcpsToRemove.Add(rcpOnMenu);
                break;
            }
        }
    }
    foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
        MemberMenuForCookDate.Remove(rcpToRemove);
}

// Now filter out recipes rated zero by the member:
List<PFW_Member_Rating_Recipe> ExcludedRecipes =
    (from rcpRating in _myPfwEntities.PFW_Member_Rating_Recipe
     where rcpRating.MemberID == forMemberId
     where rcpRating.Rating == 0
     select rcpRating).ToList();

foreach (PFW_Member_Rating_Recipe rcpToExclude in ExcludedRecipes)
{
    List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
    foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
    {
        if (rcpOnMenu.RecipeID == rcpToExclude.RecipeID)
            rcpsToRemove.Add(rcpOnMenu);
    }
    foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
        MemberMenuForCookDate.Remove(rcpToRemove);
}
You can use EFProf (http://www.hibernatingrhinos.com/products/EFProf) to see exactly what EF is sending to SQL. It can also show you how many queries you are sending and how many of them are unique, and it provides some analysis of each query (e.g. whether it is unbounded, etc.). With Entity Framework's navigation properties, it is quite easy not to realize you are making a db request. When you are in a loop and touch a navigation property, you get into the N + 1 problem.
You could use the virtual keyword on the collection properties of your model (if you are using Code First) to enable proxying; that way you will not have to get all the data back at once, only as you need it.
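For example, a minimal Code First sketch (entity and property names are placeholders, not taken from the question):
public class Recipe
{
    public int RecipeID { get; set; }

    // "virtual" lets EF create a proxy that loads the groups lazily, on first access.
    public virtual ICollection<RecipeGroup> RecipeGroups { get; set; }
}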
Also consider NoTracking for read-only data:
context.bigTable.MergeOption = MergeOption.NoTracking;
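To the question of what the equivalent set-based query might look like: a hedged LINQ-to-Entities sketch that folds both zero-rating filters into a single query, so the database does the filtering in one round-trip. The navigation property names (PFW_Recipe, PFW_RecipeGroup) are assumptions; adjust them to your model.
// Hedged sketch: one query; the exclusions become NOT EXISTS subqueries in SQL.
List<PFW_Member_MenuItem> menu =
    (from item in _myPfwEntities.PFW_Member_MenuItem
     where item.MemberID == forMemberId
           && item.CookDate == onCookDate
           // exclude recipes the member has rated zero
           && !_myPfwEntities.PFW_Member_Rating_Recipe.Any(r =>
                  r.MemberID == forMemberId
                  && r.Rating == 0
                  && r.RecipeID == item.RecipeID)
           // exclude recipes that belong to any group the member has rated zero
           && !item.PFW_Recipe.PFW_RecipeGroup.Any(g =>
                  _myPfwEntities.PFW_Member_Rating_RecipeGroup.Any(gr =>
                      gr.MemberID == forMemberId
                      && gr.Rating == 0
                      && gr.RecipeGroupID == g.RecipeGroupID))
     select item).ToList();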

Native Query Mapping on the fly openJPA

I am wondering if it is possible to map a named native query on the fly instead of getting back a list of Object[] and then looping through and setting up the object that way. I have a call which I know will return a massive data set and I want to be able to map it right to my entity. Can I do that, or will I have to continue looping through the result set?
Here is what I am doing now...
List<Provider> ObjList = (List<Provider>) emf.createNativeQuery(assembleQuery(organizationIDs, 5)).getResultList();
That returns a List of my entity (Provider). Normally I would just return a List<Object[]>
and then loop through that to get back all the objects, set them up as new Providers, and add them to a list...
//List<Provider> provList = new ArrayList<Provider>();
/*for(Object[] obj: ObjList)
{
provList.add(this.GetProviderFromObj(obj));
}*/
As you can see, I commented that section of the code out to try this. I know you can map named native queries if you put your native query in the entity itself and then call it via createNamedQuery. I would do it that way, but I need to use the Oracle IN keyword because I have a list of IDs that I want to check against; it is not just one that is needed. And as we all know, native queries don't handle the IN keyword too well. Any advice?
Sigh, if only the IN keyword were supported well for NamedNativeQueries.
Assuming that Provider is configured as a JPA entity, you should be able to specify the class as the second parameter to your createNativeQuery call. For example:
List<Provider> ObjList = (List<Provider>) emf.createNativeQuery(assembleQuery(organizationIDs, 5), Provider.class).getResultList();
According to the documentation, "At a minimum, your SQL must select the class' primary key columns, discriminator column (if mapped), and version column (also if mapped)."
See the OpenJPA documentation for more details.

Fluent nHibernate Selective loading for collections

I was just wondering, when loading an entity which contains a collection (e.g. a Post which may contain 0..n Comments), whether you can define how many comments to return.
At the moment I have this:
public IList<Post> GetNPostsWithNCommentsAndCreator(int numOfPosts, int numOfComments)
{
    var posts = Session.Query<Post>()
        .OrderByDescending(x => x.CreationDateTime)
        .Take(numOfPosts)
        .Fetch(z => z.Comments)
        .Fetch(z => z.Creator)
        .ToList();

    ReleaseCurrentSession();
    return posts;
}
Is there a way of adding a Skip and Take to the Comments to allow a kind of paging functionality on the collection, so you don't end up loading lots of things you don't need?
I'm aware of lazy loading, but I don't really want to use it; I'm using the MVC pattern and want my objects to come back from the repositories fully loaded so I can then cache them. I don't really want my views causing select statements.
Is the only real way around this to not perform a fetch on Comments, but to perform a separate select on Comments, order by created date time, take the top 5 (for example), and then place the returned result into the Post object?
Any thoughts / links on this would be appreciated.
Thanks,
Jon
A fetch simply does a left outer join on the associated table so that it can hydrate the collection entities with data. What you are looking to do will require a separate query on the specific entities. From there you can use any number of constructs to limit your result set (Skip/Take, SetMaxResults, etc.).
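A hedged sketch of that separate query, assuming Comment has Post and CreationDateTime properties (names are placeholders) and that NHibernate's LINQ provider is used as in the question:
// Hedged sketch: page the comments for one post in a separate query and
// attach the result to the already-loaded Post yourself.
var comments = Session.Query<Comment>()
    .Where(c => c.Post.Id == postId)
    .OrderByDescending(c => c.CreationDateTime)
    .Skip(page * pageSize)   // e.g. page = 0, pageSize = 5 for the top five
    .Take(pageSize)
    .ToList();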

NHibernate CreateCriteria and CreateQuery generates different sql?

I'm new to NHibernate and can't figure out why these two statements generate different SQL.
The first one only gets the ClientInformation (with Information and Client being proxies), which is what I want.
return repository
    .CreateQuery("from ClientInformation ci where ci.Information.IsMandatory = true and ci.Client.Id = :clientId")
    .SetParameter("clientId", clientId)
    .List<ClientInformation>();
The second one fetches everything. All data is returned for the 3 entities, which is not what I want:
return repository.CreateCriteria()
    .CreateAlias("Information", "inf")
    .CreateAlias("Client", "cli")
    .Add(Expression.Eq("cli.Id", clientId))
    .Add(Expression.Eq("inf.IsMandatory", true))
    .List<ClientInformation>();
What am I doing wrong?
Thanks
Actually, it all boils down to what you want to do.
First of all, Criteria queries honor the mapping definitions (lazy/eager joins etc.), whereas in HQL queries, unless defined otherwise, everything is lazy (excluding value properties, of course).
Secondly, the CreateAlias method defines which entities to join, and the default behaviour is to also select them.
Note that you are calling
repository.CreateCriteria()
and if that wraps nhSession.CreateCriteria() directly, then you haven't defined exactly what you want to select.
So, try to make this
nhSession.CreateCriteria(typeof(ClientInformation));
which will be translated as 'select only ClientInformation'...
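For completeness, a hedged sketch of the answer's suggestion applied to the original criteria, assuming repository.CreateCriteria() wraps ISession.CreateCriteria directly; whether the joined entities are still selected depends on the mappings mentioned above:
// Hedged sketch: specify the root entity so the criteria selects ClientInformation,
// while the aliases remain available for the filter expressions.
return nhSession.CreateCriteria(typeof(ClientInformation))
    .CreateAlias("Information", "inf")
    .CreateAlias("Client", "cli")
    .Add(Expression.Eq("cli.Id", clientId))
    .Add(Expression.Eq("inf.IsMandatory", true))
    .List<ClientInformation>();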