Prevent Fluent NHibernate select n+1

I have a fairly deep object graph (5-6 nodes), and as I traverse portions of it NHProf is telling me I've got a "Select N+1" problem (which I do).
The two solutions I'm aware of are
1) Eager load children
2) Break apart my object graph (and eager load)
I don't really want to do either of these (although I may break the graph apart later, as I foresee it growing).
For now....
Is it possible to tell NHibernate (with FluentNHibernate) that whenever I try to access children, to load them all in one go, instead of select-n+1-ing as I iterate over them?
I'm also getting "unbounded result set" warnings, which are presumably the same problem (or rather, would be solved by the above solution, if it is possible).
Each child collection (throughout the graph) will only ever have about 20 members, but 20^5 is a lot, so I don't want to eager load everything when I get the root, but simply get all of a child collection whenever I go near it.
Edit: an afterthought.... what if I want to introduce paging when I want to render children? Do I HAVE to break my object graph here, or is there some sneakiness I can employ to solve all these issues?

It sounds to me like you want to keep using your domain model rather than writing a specific NHibernate query to handle this scenario. Given this, I would suggest you take a look at the batch-size attribute, which you can apply to your collections. The Fluent NHibernate fluent interface does not yet support this attribute, but as a workaround you can use:
HasMany(x => x.Children).AsSet().SetAttribute("batch-size", "20");
Given the general lack of information about your exact scenario, I cannot say for sure whether batch-size is the ideal solution, but I certainly recommend you give it a go. If you haven't already, I suggest you read these:
http://www.nhforge.org/wikis/howtonh/lazy-loading-eager-loading.aspx
http://nhibernate.info/doc/nhibernate-reference/performance.html
The NHibernate performance documentation will explain how batch-size works.
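To make the effect of batch-size a bit more concrete, here is a minimal, library-free Java sketch that only counts simulated SELECT statements. It is not NHibernate code: the class and method names (BatchFetchSketch, childrenOf, selectsNeeded) are hypothetical, and the model assumes that touching one uninitialized collection also initializes up to batch-size minus one other pending collections in the same statement. The count leaves out the query that loaded the parents in the first place.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of lazy collection loading; it counts SELECTs instead of running them. */
final class BatchFetchSketch {

    /** Stand-in for the database: parent id -> child names. */
    private final Map<Integer, List<String>> table = new LinkedHashMap<>();
    /** Collections that have already been initialized in the simulated session. */
    private final Map<Integer, List<String>> loaded = new LinkedHashMap<>();
    private final int batchSize;
    private int selectCount = 0;

    BatchFetchSketch(int parents, int childrenPerParent, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        for (int p = 0; p < parents; p++) {
            List<String> children = new ArrayList<>();
            for (int c = 0; c < childrenPerParent; c++) {
                children.add("child-" + p + "-" + c);
            }
            table.put(p, children);
        }
    }

    /** Touching one collection initializes it plus up to batchSize - 1 other pending ones. */
    List<String> childrenOf(int parentId) {
        if (!loaded.containsKey(parentId)) {
            selectCount++; // one simulated "SELECT ... WHERE parent_id IN (...)"
            loaded.put(parentId, table.get(parentId));
            int extra = batchSize - 1;
            for (Integer other : table.keySet()) {
                if (extra == 0) {
                    break;
                }
                if (!loaded.containsKey(other)) {
                    loaded.put(other, table.get(other));
                    extra--;
                }
            }
        }
        return loaded.get(parentId);
    }

    /** Walks every parent's children and reports how many simulated SELECTs that took. */
    static int selectsNeeded(int parents, int childrenPerParent, int batchSize) {
        BatchFetchSketch session = new BatchFetchSketch(parents, childrenPerParent, batchSize);
        for (int p = 0; p < parents; p++) {
            session.childrenOf(p).size();
        }
        return session.selectCount;
    }

    public static void main(String[] args) {
        System.out.println("batch-size 1:  " + selectsNeeded(20, 20, 1));
        System.out.println("batch-size 20: " + selectsNeeded(20, 20, 20));
    }
}
```

With 20 parents, the sketch counts 20 simulated SELECTs at a batch size of 1 and a single SELECT at a batch size of 20, which is the reduction the attribute is meant to give you.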
Edit: I am not aware of any way to page from your domain model. I recommend you write NH queries for scenarios where paging is required.

"Edit: an afterthought.... what if I want to introduce paging when I want to render children? Do I HAVE to break my object graph here, or is there some sneakiness I can employ to solve all these issues?"
Well, if you only load the children, then you can page them :). But if you want something like LoadParent AND PageChildren, then I don't think you can do that.

Related

use_reflection_optimizer with nHibernate

I'm dealing with nHibernate performance problem during hydration of a collection. The nHibernate-generated query finishes in 1-3 seconds, but the hydration takes another 7-9 seconds for just 50-90 objects. There are multiple layers of joined objects being hydrated from one resultset and data doesn't overlap much to benefit from caching. Taking out nHibernate is out of the question and I'm simply figuring out how to make hydration faster.
One idea that I'm looking into is to split the nHibernate code to use named queries and to control how objects in different layers are hydrated. I also found the use_reflection_optimizer parameter for nHibernate. While it seems great, there is barely any info on its usage.
1) Is reflection optimizer enabled by default? I'm using .NET 4.0, so it seems that it should be turned on by default, but I can't find a clear answer to this.
2) If use_reflection_optimizer is not "true" by default, then how do I enable it? It doesn't work for me through web.config, and I've read that it should be done through code or through a section in the config. Could someone provide an example?
3) Are there any other suggestions for speeding up hydration of my objects?

use_reflection_optimizer is enabled by default. It generates much of the reflection functionality needed at runtime during startup, instead of when NHibernate first comes into contact with each object. It makes startup longer, but it is better in the long run.
I think NHibernate translates named HQL queries to SQL even without reflection-optimizer.
Now, about the performance of that query: could you share some code? Could an eager fetch / join help?

Hibernate Search, Entities, and SQL VIEWs

I have a table that maintains rows of products that are for sale (tbl_products) using PostgreSQL 9.1. There are also several other tables that maintain ratings on the items, comments, etc. We're using JPA/Hibernate for ORM in a Seam application, and have the appropriate entities wired up properly. In an effort to provide better listings of these items, I've created a SQL VIEW (v_product_summary) that aggregates some of the basic product data (name, description, price, etc.) with data from the other tables (number of comments, average rating, etc.). This provides a nice concise view of the data, and I've created a corresponding JPA entity object that provides read-only access to the view data.
Everything is working fine with respect to running JPQL queries on either the Product object (tbl_products) or the ProductSummary (v_product_summary) objects. However, we'd like to provide a richer search experience using Hibernate Search and Lucene. The issue we're running into, though, is how do we query the ProductSummary objects using Hibernate Search? They're not indexed upon creation, because they're never really "created". They're obtained as read-only objects from the v_product_summary VIEW. An index entry is only created on Product when it's persisted to the database, and not for ProductSummary since it's never persisted.
Our thought is that we should be able to:
1) Persist our Product object to the database
2) Immediately query the corresponding ProductSummary object using the product's ID
3) Manually update the Hibernate Search index for the ProductSummary object
Is this possible? Is this even a good idea? I can see there will be a performance impact since we're executing a query for the ProductSummary object every time a new Product is persisted. However, products are not added to the database at a high volume, so I don't think this will be a huge issue.
We'd really like to find a better, more efficient way of accomplishing this. Can anyone provide any tips or recommendations? If we do go the route of updating the search index manually, is that even doable? Can anyone provide a resource explaining how we can add a single ProductSummary to the index?
Any help you can provide is GREATLY appreciated.

If I understand the question correctly, you're trying to do the normal thing of persisting an object and indexing it at that point, but you're dealing with 2 separate objects.
I find myself doing kludgey things in Hibernate all the time; it feels like it almost demands it of you. Yes, there'd be a performance impact, and as you say, it is probably not a big deal, but it might be worth profiling.
A part of me remembers there's a way you can refresh the object upon write, and wonders if there's a way you can wrap the Product and the ProductSummary and tweak the mapping so that you read part and write part of it (waves hands on syntax and mapping). Or create a Hibernate-facing object with readonly fields that can be split and merged into your two objects. I don't know if your design allows Hibernate-only objects, it's a common idiom in my system.
Either way could be useful if you had a lot of objects in this situation. If this is the only object you're searching in this way, your 3 steps look much clearer.
As for the syntax for adding an object manually, I think you're looking for something like this, after your fetch:
FullTextSession textSession = Search.getFullTextSession(session);
textSession.index(myProductSummary);
Was that all you wanted?

Since you are using PostgreSQL, you could insert into the view and use a rule to redirect the insert to the appropriate table.
A PostgreSQL rule is a way to rewrite a query just before it gets executed. I used it in an application which needed a change in schema but required the old queries to still work for a little while.
You can check out the documentation about rules on insert queries on the PostgreSQL site.
Since you'll be inserting into and updating the view, Hibernate Search will work as usual.
EDIT
An easier strategy. You could insert and update ProductSummary when doing so on Product and tell PostgreSQL to ignore the inserts, updates and deletes on the view.
On the database side:
CREATE RULE dontinsert AS ON INSERT TO v_product_summary DO INSTEAD NOTHING;
CREATE RULE dontupdate AS ON UPDATE TO v_product_summary DO INSTEAD NOTHING;
CREATE RULE dontdelete AS ON DELETE TO v_product_summary DO INSTEAD NOTHING;
But I guess you will need to hack a little, since the JDBC call executeUpdate will return 0, and Hibernate will probably freak out.

Technically I think this would be possible, but I think your entire efficiency dilemma might be better solved using something like memcached, therefore making performance less of an issue, and perhaps increasing code maintainability depending on how you currently have it implemented at statement level. By updating the search index manually, do you mean the database index? That is not recommended, and I'm not sure if it's even doable. Why not index them on creation?

What is the .Fetch.Select() in Fluent nHibernate?

While developing with Fluent nHibernate, I notice that on relationships I can specify a Fetch property, with possible options of Select(), Join(), and Subselect().
I did some searches for these, which yielded very little information. I did find them in the nHibernate documentation and the Fluent nHibernate documentation, but these do little more than give their signatures, which doesn't help me much.
I was wondering if there is any real explanation for what these are, and what they really do. I've been rather perplexed myself. From my own evaluation they seem to change the way that referenced entities are pulled into the object graph, but I've yet to entirely discern how they change it, and which one is optimal for what situation...
I did find this blog post (http://www.mkyong.com/hibernate/hibernate-fetching-strategies-examples/) that has a little bit of detail, but I'm still pretty perplexed about the entire situation. I've also seen other examples that state that using Select() is more optimal, but not the reasoning behind it. Additionally, I found a post at (http://community.jboss.org/wiki/AShortPrimerOnFetchingStrategies) that is geared towards the original Java Hibernate platform, but I presume the concept is the same. In this one, my theory seems to be blown a bit, as it focuses more on the lazy loading aspect of what they do, but I've still not seen any really flat examples.

Join fetching - NHibernate retrieves the associated instance or collection in the same SELECT, using an OUTER JOIN.
Select fetching - a second SELECT is used to retrieve the associated entity or collection. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
Subselect fetching - a second SELECT is used to retrieve the associated collections for all entities retrieved in a previous query or fetch. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
Check out the fetching strategy section of the NHibernate documentation.
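One practical way to compare the three strategies is to count how many SQL statements are issued when you load N parents and then touch each parent's collection. The Java sketch below is only that arithmetic. It is not NHibernate API, the names (FetchStrategySketch, Strategy, statements) are hypothetical, and it assumes that every parent's collection is accessed and that no batch-size is configured.

```java
/** Rough statement counts for loading N parents and touching each one's collection. */
final class FetchStrategySketch {

    enum Strategy {
        JOIN,      // parents and children arrive together through an OUTER JOIN
        SELECT,    // one extra SELECT per parent whose collection is touched
        SUBSELECT  // one extra SELECT covering every parent from the first query
    }

    /** Total SQL statements, including the one that loads the parents. */
    static int statements(Strategy strategy, int parents) {
        if (parents < 0) {
            throw new IllegalArgumentException("parents must be >= 0");
        }
        switch (strategy) {
            case JOIN:
                return 1;
            case SELECT:
                return 1 + parents;
            case SUBSELECT:
                return parents == 0 ? 1 : 2;
            default:
                throw new AssertionError(strategy);
        }
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            System.out.println(s + ": " + statements(s, 20));
        }
    }
}
```

For 20 parents this model gives 1 statement for join, 21 for select (the classic N+1), and 2 for subselect. Statement count is not the whole story, since a join returns wider, duplicated rows, so which one is best still depends on the shape of your data.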

I'm not really familiar with nHibernate (I work with Hibernate and Java), but by analogy, this lets you specify an association/collection property that you want to load eagerly with the given entity. This is useful when you don't have full control over (n)Hibernate sessions (i.e. if some other framework, like Spring in Java, is taking care of sessions/transactions).
So your assumption is basically correct.
Select, Join, and Subselect are the ways to obtain the related property, and they determine what kind of query will be performed in the database. Which one is optimal really depends on your situation.
Hope this helps a little,
Cheers.

NHibernate: How to perform eager subselect fetching of many children & grandchildren (object graph) in a single round-trip to the database?

First, please don't try to argue me out of doing the eager load - traversing the object graph and causing (by lazy loading) even more than ONE round-trip to the database is just not an option.
I have a big object graph. I want to fetch the root object, plus a subset of its children, grandchildren, great-grandchildren, etc. Currently I do this by creating multiple Future objects (with Criteria) and in each one, I do SetFetchMode("...", FetchMode.Eager) - see Ayende's post and Sam's 3rd comment here.
There are two problems:
1) NHibernate performs multiple select queries in the same round-trip, one for each path from root to a leaf (A.B.C.D), which is great, but it uses join rather than subselect, which is what I really want it to do. Using join means a ton of data needs to be sent from the database and parsed, and NHibernate needs to do a lot more work than necessary.
2) As a result of problem 1, objects nested more than one level deep are duplicated in some cases.
The second problem I "solved" by setting my collections to be Set, but then I lose the ordering ability - since I must specify ISet as the interface, there's no way for my code to know if the set is really an OrderedSet.
Does anyone know how to perform, in a single round-trip, eager loading of an object plus several deeply nested collections, but not using join?
I'd be extremely grateful! I've scoured the web for answers, apparently I'm not the first to hit this wall.
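The overhead of join fetching along a single path can be estimated with some simple arithmetic. The Java sketch below is a back-of-the-envelope model, not a measurement: it assumes the same fan-out at every level, the fan-out of 20 and depth of 3 are arbitrary illustrative figures, and the names (JoinFetchCostSketch, joinedRows, entityCopiesViaJoin, distinctEntities) are hypothetical.

```java
/** Estimates how much duplicated data a join fetch down one path sends back. */
final class JoinFetchCostSketch {

    /** Rows returned when a root is joined down a path of `depth` collections. */
    static long joinedRows(int fanOut, int depth) {
        long rows = 1;
        for (int level = 0; level < depth; level++) {
            rows = Math.multiplyExact(rows, (long) fanOut);
        }
        return rows;
    }

    /** Every joined row repeats the root and each ancestor: depth + 1 entities per row. */
    static long entityCopiesViaJoin(int fanOut, int depth) {
        return Math.multiplyExact(joinedRows(fanOut, depth), (long) (depth + 1));
    }

    /** Entities that actually exist: 1 root + fanOut children + fanOut^2 grandchildren, etc. */
    static long distinctEntities(int fanOut, int depth) {
        long total = 0;
        for (int level = 0; level <= depth; level++) {
            total = Math.addExact(total, joinedRows(fanOut, level));
        }
        return total;
    }

    public static void main(String[] args) {
        int fanOut = 20; // assumed number of items per collection
        int depth = 3;   // e.g. the path A.B.C.D
        System.out.println("joined rows:       " + joinedRows(fanOut, depth));
        System.out.println("entities via join: " + entityCopiesViaJoin(fanOut, depth));
        System.out.println("distinct entities: " + distinctEntities(fanOut, depth));
    }
}
```

Under those assumptions the join returns 8000 rows carrying 32000 entity copies for only 8421 distinct entities, while a fetch that reads each level separately would transfer roughly the distinct count.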

You can create separate queries, each with only one call to SetFetchMode, and run them in one go using MultiCriteria (or Futures, or whatever you want to use). After that, only the result of the first query is relevant to you.
This will give you a single result in a single round-trip.

When should one avoid using NHibernate's lazy-loading feature?

Most of what I hear about NHibernate's lazy-loading is that it's better to use it than not to use it. It seems like it just makes sense to minimize database access, in an effort to reduce bottlenecks. But few things come without trade-offs; it certainly limits design slightly by forcing you to have virtual properties. But I've also noticed that some developers turn lazy-loading off on certain often-used objects.
This makes me wonder if there are some definite situations where data-access performance is hurt by using lazy-loading.
So I wonder, when and in what situations should I avoid lazy-loading one of my NHibernate-persisted objects?
Is the downside to lazy-loading merely in additional processing time, or can nhibernate lazy-loading also increase the data-access time (for instance, by making additional round-trips to the database)?
Thanks!

There are clear performance tradeoffs between eager and lazy loading objects from a database.
If you use eager loading, you suck a ton of data in a single query, which you can then cache. This is most common on application startup. You are trading memory consumption for database round trips.
If you use lazy loading, you suck a minimal amount of data in a single query, but any time you need more information related to that initial data, it requires more queries to the database, and database performance hits are very often the major bottleneck in most applications.
So, in general, you always want to retrieve exactly the data you will need for the entire "unit of work", no more, no less. In some cases, you may not know exactly what you need (because the user is working through a wizard or something similar) and in that case it probably makes sense to lazy load as you go.
If you are using an ORM and focused on adding features quickly and will come back and optimize performance later (which is extremely common and a good way to do things), having lazy loading being the default is the correct way to go. If you later find (through performance profiling/analysis) that you have one query to get an object and then N queries to get the N objects related to that original object, you can change that piece of code to use eager loading to only hit the database once instead of N+1 times (the N+1 problem is a well known downside of using lazy loading).

The usual tradeoff for lazy loading is that you make a smaller hit on the database up front, but you end up making more hits on it long-term.
Without lazy loading, you'll grab an entire object graph up front, sucking down a large chunk of data at once. This could, potentially, cause lag in your UI, and so it is often discouraged. However, if you have a common object graph (not just single object - otherwise it wouldn't matter!) that you know will be accessed frequently, and top to bottom, then it makes sense to pull it down at once.
As an example, if you're doing an order management system, you probably won't pull down all the lines of every order, or all the customer information, on a summary screen. Lazy loading prevents this from happening.
I can't think of a good example for not using it offhand, but I'm sure there are cases where you'd want to do a big load of an object graph, say, on application initialization, in order to avoid lags in processing further down the line.

The short version is this:
Development is simpler if you use lazy loading. You just traverse object relationships in a natural OO way, and you get what you need when you ask for it.
Performance is generally better if you figure out what you need before you ask for it, and ask for it in one trip to the database.
For the past few years we've been focusing on quick development times. Now that we have a solid app and userbase, we're optimizing our data access.

If you are using a web service between the client and the server that handles database access with NHibernate, lazy loading can be problematic: the object is serialized and sent over the web service, and any later use of objects further down the object graph needs a new trip to the database server through additional web service calls. In such a case, lazy loading might not be a good choice. A word of caution: be careful what you fetch if you turn lazy loading off. It's way too easy not to think this through and end up fetching almost the whole database...

I have seen many performance problems arising from wrongly configured loading behaviour in Hibernate. The situation is much the same with NHibernate, I think. My recommendation is to always use lazy relations and then use eager fetching statements, such as fetch joins, in your queries. This ensures you are not loading too much data, and you can avoid too many SQL queries.
It is easy to make a lazy relation eager in a query. It is nearly impossible the other way round.
