NHibernate Fetch/FetchMany duplication in resultset, how to fix with ToFuture() - nhibernate

I'm relatively new to using NHibernate and I'm running into a shortcoming I can't seem to work myself around. I have an object tree that I wish to retrieve from the database in a single roundtrip but end up with a cartesian product.
The objects I'm trying to retrieve are called 'AccountGroup', 'Concern', 'Advertiser' and 'Product' and I only wish to get those objects where the active user has permissions for.
My initial query looked like this:
using (var session = OpenSession())
{
return session.Query<AccountGroupEntity>()
.FetchMany(a => a.Planners)
.Where(a => a.Planners.Any(p => p.Id == userId))
.FetchMany(a => a.Concerns)
.ThenFetchMany(c => c.Advertisers)
.ThenFetch(a => a.Products)
.ToList();
}
This won't work as it will return a cartesian product and the resulting entities will contain many duplicates.
However, I have NO idea how to fix this. I've seen the ToFuture() method that will allow me to execute more than one query in the same roundtrip, but I have no clue how to configure my ToFuture() query in such a way that it populates all the child collections properly.
Could anyone shine some light on how I can use ToFuture to fetch the entire tree in a single query without duplicates?

I do have an answer to this topic, solution which I do use. But it at the end means "do not use Fetch" - do it differently. So, please, take it at least as a suggestion.
Check this Q & A:
How to Eager Load Associations without duplication in NHibernate?
Small cite:
Fetching Collections is a difficult operation. It has many side effects (as you realized, when there are fetched more collections). But even with fetching one collection, we are loading many duplicated rows.
Other words, Fetching is a fragil feature, and should be used wisely in very few scenarios, I'd say. So what to use? How to solve that?
Profit from a built in NHibernate feature:
19.1.5. Using batch fetching
NHibernate can make efficient use of batch fetching, that is, NHibernate can load several uninitialized proxies if one proxy is accessed (or collections. Batch fetching is an optimization of the lazy select fetching strategy. There are two ways you can tune batch fetching: on the class and the collection level.
Batch fetching for classes/entities is easier to understand. Imagine you have the following situation at runtime: You have 25 Cat instances loaded in an ISession, each Cat has a reference to its Owner, a Person. The Person class is mapped with a proxy, lazy="true". If you now iterate through all cats and call cat.Owner on each, NHibernate will by default execute 25 SELECT statements, to retrieve the proxied owners. You can tune this behavior by specifying a batch-size in the mapping of Person:
<class name="Person" batch-size="10">...</class>
NHibernate will now execute only three queries, the pattern is 10, 10, 5.
You may also enable batch fetching of collections. For example, if each Person has a lazy collection of Cats, and 10 persons are currently loaded in the ISesssion, iterating through all persons will generate 10 SELECTs, one for every call to person.Cats. If you enable batch fetching for the Cats collection in the mapping of Person, NHibernate can pre-fetch collections:
<class name="Person">
<set name="Cats" batch-size="3">
...
</set>
My experience, this approach is pricless. The setting working for us is batch-size="25".
If you ask for any kind of Entity (via session.Get() or .QueryOver()...) - until session is open, the first time we touch related reference or collection - it is loaded in few batches... No 1 + N SELECT Issue...
Summary: Mark all your classes, and all collection with batch-size="x" (x could be 25). That will support clean queries over root Entities - until session is open, all related stuff is loaded in few SELECTS. The x could be adjusted, for some could be much more higher...

Related

NHibernate iStatelessSession returns duplicate parent instances on eager fetch

I'm trying to get a root entity and eager fetch it's child entities. But because I'm using the IStatelessSession of NHibernate, it returns duplicates of the root entity for each child. Using an ISession, it would be solved with
.TransformUsing(new DistinctRootEntityResultTransformer())
But for an IStatelessSession it's not.
Basically it's about the code below, where there's just one instance of Parent, holding 3 Childs.
var result = session.QueryOver<Parent>()
.Fetch(i => i.Childs).Eager();
This will return 3 duplicate instances of Parent, instead of just one. Does anyone have a solution for this?
I would say: Do not use StatelessSession. It does not suite to this use case.
13.2. The StatelessSession interface
Alternatively, NHibernate provides a command-oriented API that may be used for streaming data to and from the database in the form of detached objects. A IStatelessSession has no persistence context associated with it and does not provide many of the higher-level life cycle semantics. In particular, a stateless session does not implement a first-level cache nor interact with any second-level or query cache. It does not implement transactional write-behind or automatic dirty checking. Operations performed using a stateless session do not ever cascade to associated instances. Collections are ignored by a stateless session. Operations performed via a stateless session bypass NHibernate's event model and interceptors...
I just tried explain that here: NHibernate: Select one to Many Left Join - Take X latest from Parent, The problem here is, that your JOIN is resulting in this SQL result, which is not suitable for paging (which you will need sooner or later)
PARENT1 CHILD1
PARENT1 CHILD2
PARENT1 CHILD3
PARENT2 CHILD4
PARENT2 CHILD5 // if we would take 5 records, the parent2 won't get child6
PARENT2 CHILD6
So this resultset is not the way to go. I would strongly suggest: use
standard session, isnide of using (to let it dispose immediately)
load the list of root entity (Parent) and
let NHibernate load their children lazily - in separated SQL query.
The query could/should be like this:
ISessionFactory factory = ...;
using (var session = factory.OpenSession())
{
var list = session.QueryOver<Parent>()
.Skip(100)
.Take(25)
.List<Parent>();
list.Last() // this will load all occupations at once
.Childs // if batch-size is higher than page size
.Any(); // otherwise touch more items
} // session is closed and disposed
As the above code snippet shows, to avoid 1 + N issue, we have to use one of the smart mapping features:
19.1.5. Using batch fetching
NHibernate can make efficient use of batch fetching, that is, NHibernate can load several uninitialized proxies if one proxy is accessed (or collections. Batch fetching is an optimization of the lazy select fetching strategy. There are two ways you can tune batch fetching: on the class and the collection level.
Batch fetching for classes/entities is easier to understand. Imagine you have the following situation at runtime: You have 25 Cat instances loaded in an ISession, each Cat has a reference to its Owner, a Person. The Person class is mapped with a proxy, lazy="true". If you now iterate through all cats and call cat.Owner on each, NHibernate will by default execute 25 SELECT statements, to retrieve the proxied owners...
And the parent mapping should be like:
HasMany(x => x.Childs)
...
.BatchSize(100) // should be at least 25
Please, check also these:
How to Eager Load Associations without duplication in NHibernate?
NHibernate QueryOver with Fetch resulting multiple sql queries and db hits
Is this the right way to eager load child collections in NHibernate
NOTE: Some people could suggest to you to use Result Transforemer, as you've tried. This solution could work, but is done in C#, in memory, so all data are loaded (multi lines) and then narrowed. I would never use that. Check: Criteria.DISTINCT_ROOT_ENTITY vs Projections.distinct

What does Include() do in LINQ?

I tried to do a lot of research but I'm more of a db guy - so even the explanation in the MSDN doesn't make any sense to me. Can anyone please explain, and provide some examples on what Include() statement does in the term of SQL query?
Let's say for instance you want to get a list of all your customers:
var customers = context.Customers.ToList();
And let's assume that each Customer object has a reference to its set of Orders, and that each Order has references to LineItems which may also reference a Product.
As you can see, selecting a top-level object with many related entities could result in a query that needs to pull in data from many sources. As a performance measure, Include() allows you to indicate which related entities should be read from the database as part of the same query.
Using the same example, this might bring in all of the related order headers, but none of the other records:
var customersWithOrderDetail = context.Customers.Include("Orders").ToList();
As a final point since you asked for SQL, the first statement without Include() could generate a simple statement:
SELECT * FROM Customers;
The final statement which calls Include("Orders") may look like this:
SELECT *
FROM Customers JOIN Orders ON Customers.Id = Orders.CustomerId;
I just wanted to add that "Include" is part of eager loading. It is described in Entity Framework 6 tutorial by Microsoft. Here is the link:
https://learn.microsoft.com/en-us/aspnet/mvc/overview/getting-started/getting-started-with-ef-using-mvc/reading-related-data-with-the-entity-framework-in-an-asp-net-mvc-application
Excerpt from the linked page:
Here are several ways that the Entity Framework can load related data into the navigation properties of an entity:
Lazy loading. When the entity is first read, related data isn't retrieved. However, the first time you attempt to access a navigation property, the data required for that navigation property is automatically retrieved. This results in multiple queries sent to the database — one for the entity itself and one each time that related data for the entity must be retrieved. The DbContext class enables lazy loading by default.
Eager loading. When the entity is read, related data is retrieved along with it. This typically results in a single join query that retrieves all of the data that's needed. You specify eager loading by using the Include method.
Explicit loading. This is similar to lazy loading, except that you explicitly retrieve the related data in code; it doesn't happen automatically when you access a navigation property. You load related data manually by getting the object state manager entry for an entity and calling the Collection.Load method for collections or the Reference.Load method for properties that hold a single entity. (In the following example, if you wanted to load the Administrator navigation property, you'd replace Collection(x => x.Courses) with Reference(x => x.Administrator).) Typically you'd use explicit loading only when you've turned lazy loading off.
Because they don't immediately retrieve the property values, lazy loading and explicit loading are also both known as deferred loading.
Think of it as enforcing Eager-Loading in a scenario where your sub-items would otherwise be lazy-loading.
The Query EF is sending to the database will yield a larger result at first, but on access no follow-up queries will be made when accessing the included items.
On the other hand, without it, EF would execute separte queries later, when you first access the sub-items.
include() method just to include the related entities.
but what happened on sql is based on the relationship between those entities which you are going to include what the data you going to fetch.
your LINQ query decides what type of joins have to use, there could be left outer joins there could be inner join there could be right joins etc...
#Corey Adler
Remember that you should use .Include() and .ThenInclude() only when returning the object (NOT THE QUERYABLE) with the "other table property".
As a result, it should only be used when returning APIs' objects, not in your intra-application.

Fluent Mapping Many-Many with sorting

I'm trying to get a many to many relationship to work using Fluent Nhibernate.
I have a Product and a RelatedProduct Table
Product {Id, Name...}
ProductRelated{Id, ProductId, RelatedProductId, relOrder}
and a Product class
The mapping looks like
HasManyToMany(x => x.RelatedProducts)
.Table("ProductRelated")
.ReadOnly()
.ChildOrderBy("relOrder asc")
.ParentKeyColumn("ProductId")
.ChildKeyColumn("RelatedProductId");
When a query is done for Product and the RelatedProducts are lazy loaded I can see that the sorting is applied correctly using the relOrder on the join table.
Session.Query<Product>()
.FetchMany(p => p.Categories)
.FetchMany(p => p.Departments)
Once I add in eager loading of the related products NHibernate tries to sort by a relOrder column on the product itself instead of on the join table.
Session.Query<Product>()
.FetchMany(p => p.Categories)
.FetchMany(p => p.Departments)
.FetchMany(p => p.RelatedProducts)
Any ideas of whats going on here?
Well to answer your question what's going on here?, I would say, you are using: "not-together fitting features" of NHibernate.
A snippet from documentation 6.6. Sorted Collections:
Setting the order-by attribute tells NHibernate ...
Note: that lookup operations on these collections are very slow if they contain more than a few elements.
Note: that the value of the order-by attribute is an SQL ordering, not a HQL ordering!
So, this could be applied only for "standard" lazy loading, becuase this kind of a feature is applied only on a DB side. It is not managing order in the memory.
And the eager fetching, as the counter-part, is a different way how to generate and issue the SQL Statement to DB.
So, eager and order-by will never work together.
*
My NOTE: I simply have to append this. I can't help myself
I.
The Eager fetching is the feature which should be avoided (I never use it, but it's me). There is a better solution and it is setting the BatchSize(), which will reduce the 1+N into 1+(a few) and will keep all the (lazy) featrues, including order-by. Check these if interested:
NHibernate QueryOver with Fetch resulting multiple sql queries and db hits
Is this the right way to eager load child collections in NHibernate
https://stackoverflow.com/questions/18419988/
BatchSize() is supported for HasManyToMany as well: ToManyBase:
/// <summary>Specify the select batch size </summary>
/// <param name="size">Batch size</param>
public T BatchSize(int size) { ...
II.
The many-to-many mapping, while fancy at first look, is not the way I'd suggest. Try to rethink your model and introduce the first-level-citizen: PairingEntity - for the pairing object. It will then use many-to-one and one-to-many mapping which could give us more... e.g. improved querying like Subqueries... try to check these:
How to create NHibernate HasManyToMany relation
many-to-many with extra columns nhibernate
Nhibernate: How to represent Many-To-Many relationships with One-to-Many relationships?

Nhibernate Query with multiple one to many mappings

I'm a beginner in NHibernate. I have to write a complex query on say an "Employee" to populate all the associations for Employee based on the where clause. What I'm looking for is similar to this - when you do a Employee.FindById(10) should fill up OwnedDepartment, SubscribedGroups etc.
The Employee model I need to populate is really heavy (many associations with other objects)
but I need to populate only few associations. How do I achieve it using a query over? or any other approaches?
Updated
I was reading about eager loading just now, has it something to do with the loading ? In my map I have not mentioned any loading techniques, so by default all of my employee's child element are getting loaded already. There is a bunch of queries getting triggered underneath.
All the associations are lazy loaded by default. That means that the load is triggered when you access it - that's why so many queries are issued. If you want to eagerly load the data (which means either joining the tables or - rarely - doing additional select queries at once), you have to specify it in your mapping or query, depending how you fetch your data. The concept is generally called "eager fetching".
If you want to get a single Employee by ID, the standard way to do it is using session.Get<Employee>(10) - but that approach means that eager loads need to be specified in the mapping. For mapping by code it will be c.Lazy(CollectionLazy.NoLazy); for collections or c.Lazy(LazyRelation.NoProxy) for many-to-one - see here or here for details.
I prefer specifying that kind of things in the query - just where it is used, not globally for the whole entity, regardless who is fetching and what for. In LINQ provider you have FetchMany(x => x.SubscribedGroups) for collections and Fetch(x => x.OwnedDepartment) for many-to-one relations. You can find similiar options in QueryOver, if that's your choice.

NHibernate Future Object Graph Many Queries

Given a multi level object graph being called using Future as:
var Dads = db.Session.Query<Parent>().Where(P => P.EntityKey == Id)
.ToFuture<Parent>();
var Kids = db.Session.Query<Kid>().Where(K => K.Parent.EntityKey == Id)
.ToFuture<Kid>();
when I call var Dad = dads.ToList() I see the batch go across the wire and show in profiler.
Problem is when enumerating the collection it is still sending one off queries to the db
Eg.
for each (Kid kid in Dad.Kids) // This seems to hit the database
{
Teach(kid);
}
Sends a SQL query and hits the database to get each kid. Why is the object graph not populated? or is this expected behavior?
That behaviour is to be expected. You are simply telling NHibernate to get two collections from the database in a batch, which it is doing as told. However, you are not telling it that they are related. NH Queries with Futures do not put entities together after executing them unless they are told to do so with a join.
If you executed the separate queries without Futures you would not expect the Parent entity to suddenly have the child collection filled. Basically, Futures allow you to run things in one roundtrip. If the queries happen to have a common root with several child collections (e.g. to avoid a cartesian product), then NH is able to "combine" several collections into one entity.
Unfortunately joins with the NH LINQ Api and the ToFuture() method seem to pose a problem in the current (NH 3.0 or 3.1) implementation. You may need to use the QueryOver Api in that case.
On a side note, I think the method name is not appropriate.
Edit: After Edit of the question the method name is now ok.