What is the .Fetch.Select() in Fluent nHibernate?

While developing with Fluent nHibernate, I notice that on relationships I can specify a Fetch property, with possible options of Select(), Join(), and Subselect().
I searched for these and found very little information. I did find them in the nHibernate documentation and the Fluent nHibernate documentation, but those do little more than give their signatures, which doesn't help me much.
I was wondering if there is any real explanation for what these are, and what they really do. I've been rather perplexed myself. From my own evaluation they seem to change the way that referenced entities are pulled into the object graph, but I've yet to entirely discern how they change it, and which one is optimal for what situation...
I did find this blog post (http://www.mkyong.com/hibernate/hibernate-fetching-strategies-examples/) that has a little bit of detail, but I'm still pretty perplexed about the entire situation. I've also seen other examples that state using Select() is more optimal, but without the reasoning behind it. Additionally, I found a post at (http://community.jboss.org/wiki/AShortPrimerOnFetchingStrategies) that is geared towards the original Java Hibernate platform, but I presume the concept is the same. In this one, my theory seems to be blown a bit, as it focuses more on the lazy loading aspect of what they do, but I've still not seen any really straightforward examples.

Join fetching - NHibernate retrieves the associated instance or collection in the same SELECT, using an OUTER JOIN.
Select fetching - a second SELECT is used to retrieve the associated entity or collection. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
Subselect fetching - a second SELECT is used to retrieve the associated collections for all entities retrieved in a previous query or fetch. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
Check out the fetching strategies section of the NHibernate documentation.
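To make the difference between the three strategies concrete, here is a minimal sketch in plain Java that only counts how many SQL statements each strategy would issue when a list of parents is loaded and some of their collections are then accessed. It does not use NHibernate or Hibernate at all; the class and method names (FetchStrategyCount, selectFetch, joinFetch, subselectFetch) are hypothetical, and it assumes lazy loading is left on for Select and Subselect.

```java
// Hypothetical model: counts SQL statements, does not talk to a database.
final class FetchStrategyCount {
    // Select fetching: 1 query for the parents, then 1 more for each parent
    // whose collection is actually accessed (the "N+1" pattern).
    static int selectFetch(int parents, int collectionsAccessed) {
        return 1 + Math.min(parents, collectionsAccessed);
    }

    // Join fetching: parents and children arrive together in one outer join.
    static int joinFetch(int parents, int collectionsAccessed) {
        return 1;
    }

    // Subselect fetching: 1 query for the parents, plus 1 query that loads the
    // collections of all those parents the first time any of them is accessed.
    static int subselectFetch(int parents, int collectionsAccessed) {
        return (parents > 0 && collectionsAccessed > 0) ? 2 : 1;
    }

    public static void main(String[] args) {
        int parents = 50;
        System.out.println("select:    " + selectFetch(parents, parents));
        System.out.println("join:      " + joinFetch(parents, parents));
        System.out.println("subselect: " + subselectFetch(parents, parents));
    }
}
```

The join variant saves round trips at the price of a wider result set, since each parent row is repeated once per child, so the lowest statement count is not automatically the fastest option.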

I'm not really familiar with nHibernate (I work with Hibernate and Java), but by analogy, this lets you specify which association or collection property you want to load eagerly along with the given entity. This is useful when you don't have full control over (n)Hibernate sessions (e.g. if some other framework, like Spring in Java, is taking care of sessions/transactions).
So your assumption is basically correct.
Select, Join, and Subselect are the ways to obtain the related property, and determine what kind of query will be performed in database. Which one is optimal, really depends on the situation you have.
Hope this helps a little,
Cheers.

Related

Hibernate findById or sql query

I often have scenarios where I only need to access one, two, or a few columns of an entity. We are using Hibernate, so I want to know which is better for performance:
1) Calling Hibernate's findById method, which is very convenient because I just have to call it, but which I suspect performs poorly because it fetches all columns when I need only some.
2) Writing my own query each time, which is tedious but should perform better.
Which one should I use?
To answer more specifically, it would be helpful if you included a code snippet. In general though, findById is a convenience method that will result in a query very similar to what you would write yourself. So writing the query yourself and returning only the columns you need (constructor expressions are useful) would be better in terms of performance. The question I would ask is, is that improved performance worth the more complicated code? You can always optimize your queries later.
It depends entirely on the entity being loaded. If the entity has lots of relationships and all you need is a couple of fields from the root entity, it is definitely worth writing your own query, because Hibernate generates queries with JOINs to load the entity, which can be very expensive. The other thing to consider is that you can always control what gets loaded using LAZY or EAGER loading, but these settings are static and apply permanently to your entity.
On the other hand, if the entity doesn't have many relationships, I believe the most expensive part is the round trip between the DB and your application, so the cost of loading a few extra fields can be ignored.

Getting/building the SQL (with parameters) from NHibernate 3.2

I am working on a project that requires the following:
Extract the full SQL query from a specific NHibernate 3.2 session.
Perform specific actions on the query (i.e. not necessarily log it)
Do not affect NHibernate in the whole system to avoid introducing performance issues
I checked several approaches, many of them already appearing on StackOverflow. Here are my options as I see now:
Manual
In the most naive and annoying solution, I can just follow the business logic and build the query myself, e.g. if the BL builds a criteria with a restriction on ID=5, I'd build a query with SELECT ... WHERE ID = 5. Since we have quite a complex BL, I'd really like to avoid that.
NHibernate interception
Originally, using the OnPrepareStatement seemed like the best bet. However, I soon discovered that the parameters of the query aren't logged which renders it quite useless.
Introspecting NHibernate’s ICriteria
When performing a query with NHibernate, we do it with an ICriteria object that contains the restrictions, the sorting and the aggregation definitions. It seems I could introspect it with a CriteriaWalker, as described here. However, it seems that it gets confused on complex queries. Also, in some cases we use NHibernate 3's new "QueryOver" syntax, for which this solution doesn't help me.
Using ILoggerFactory
Since NHibernate 3, you can write custom log factories (sample). This gets the full SQL, however, it also affects the whole NHibernate system and it seems it is impossible to have a factory to apply for a specific ISession, or even ISessionFactory.
Custom NHibernate driver
I've considered writing a proxy NHibernate driver and assigning it to a specific SessionFactory (as described here). However, a friendly comment warns that it no longer works in NHibernate 3.2.
Using dynamic proxies
This code uses Castle's Dynamic proxies to inject itself inside ISession. I haven't tried running it yet with my server, but I am a bit wary of using such drastic measures. If nothing else works, however, I guess it is something to consider.
Suggestions?
Right now I am a bit stuck on choosing the best way to go, since nothing seems to do the job quite right. If there are other suggestions, I'd love to hear them.
I'd use a standard or custom logging framework, and apply a custom filter to retrieve a flag from thread data (for example) in order to determine whether the session should be logged.
This way, you don't mess with NH internals at all, and as long as you don't set the flag, nothing gets logged.
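The idea of a filter driven by a per-thread flag can be sketched as follows. This is only an illustration of the pattern in Java with java.util.logging, not NHibernate or log4net code; the class FlaggedSqlCapture, its methods, and the logger name "demo.sql" are hypothetical. It also assumes the logging call happens synchronously on the thread that runs the query, otherwise the thread-local flag would not be visible to the filter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical helper: captures log messages only while the current thread opts in.
final class FlaggedSqlCapture {
    // Per-thread flag; stays false unless the current thread switches it on.
    private static final ThreadLocal<Boolean> ENABLED =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    // Handler that keeps messages in memory so they can be processed, not just logged.
    static final class CollectingHandler extends Handler {
        final List<String> messages = Collections.synchronizedList(new ArrayList<>());

        @Override
        public void publish(LogRecord record) {
            // isLoggable applies the handler's level and its filter.
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Attaches a collecting handler whose filter consults the per-thread flag.
    static CollectingHandler attach(Logger logger) {
        CollectingHandler handler = new CollectingHandler();
        handler.setFilter(record -> ENABLED.get());
        logger.addHandler(handler);
        return handler;
    }

    // Runs the action with capturing enabled for the current thread only.
    static void withCapture(Runnable action) {
        ENABLED.set(Boolean.TRUE);
        try {
            action.run();
        } finally {
            ENABLED.set(Boolean.FALSE);
        }
    }

    public static void main(String[] args) {
        Logger sqlLog = Logger.getLogger("demo.sql"); // hypothetical logger name
        sqlLog.setUseParentHandlers(false);           // keep the console quiet
        CollectingHandler handler = attach(sqlLog);

        sqlLog.info("SELECT 1");                      // ignored: flag is off
        withCapture(() -> sqlLog.info("SELECT 2"));   // captured
        sqlLog.info("SELECT 3");                      // ignored again

        System.out.println(handler.messages);
    }
}
```

Because the flag lives in thread data, sessions running on other threads are unaffected, which matches the requirement of not touching the rest of the system.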

Where is the api reference for nhibernate?

I may be going mental, but I can not find any api reference material for nhibernate. I've found plenty of manuals, tutorials, ebooks etc but no api reference. I saw the chm file on the nhibernate sourceforge page, but it doesn't seem to work on any of my PCs (different OSes)
Can someone please point me in the right direction?
I just found this one:
http://web.archive.org/web/20141001063046/http://elliottjorgensen.com/nhibernate-api-ref/index.html
It doesn't seem to be official, but at least it looks like an API reference... unlike the official reference, which mostly describes concepts and mappings without any information about classes and members.
If you're on Windows, get ILSpy and point it at NHibernate.dll. It's not quite the same as real API documentation, but it's not half bad.
There is no class reference publicly available on the Internet as far as I know. You may build it from the source. Clone the sources, build the NHibernate.sln solution, then go into the doc folder, ensure you have the prerequisites indicated in the reference\readme.txt file, and run nant doc. This will generate the class reference in the build folder.
Otherwise, the most commonly used APIs are not extensive, and most of them are XML documented, with IntelliSense working in Visual Studio. The reference documentation has the advantage of giving more context, which probably helps avoid pitfalls like believing ISession.Update is to be used for updating entities (this is wrong; you do not need it unless you use detached entities, or entities coming from another session).
Official documentation reference is on https://nhibernate.info.
Sub-links:
Global documentation list
Reference (What I mostly use, especially following sub parts.)
Configuration
Mapping - basic / entities. (Add the mapping xsd definition file to any of your solution folders so that VS knows it and gives you IntelliSense in your hbm mappings.)
Mapping - collections
Querying - general. Do not miss the named queries feature in The IQuery interface.
Querying APIs:
HQL. I mostly use HQL with named queries, in mappings, for queries not dynamically built. They get parsed and validated when building the session factory, which normally occurs at application startup, so it is almost as good as compile-time validation. Check the log4net logs to get detailed reasons for named query parsing failures.
Criteria API. I view it as the historical way of dynamically building queries in code, to be preferred over constructing HQL strings.
QueryOver API. Based on the Criteria API, with lambda expression support for compile-time validation of queried entity names. Should be preferred over the Criteria API in my opinion.
Linq API. Great for dynamically built queries. Bear in mind that its implementation translates your queries to HQL. With complex queries, it may generate unsupported HQL constructs. Knowing HQL's capabilities gives a better understanding of how to write a supported Linq query for complex cases. (For example, for a complex order by, it is better to use an explicit linq sub-query in the OrderBy rather than a collection mapped on your queried entity.)
Native SQL. Well, quite self-explanatory. To be used, for example, when you need some special SQL feature not available through the other querying APIs (SQL Server full-text, select for xml, ...) and you do not wish to extend those other APIs. You may also call stored procedures. When using native SQL, I favor SQL named queries.
Modifying data, from Updating objects to Flush, and Exception handling.
Performances.
Batch fetching. About this, you may read my post here for a detailed explanation of why lazy loading can be very efficient with NHibernate, thanks to batch fetching. This single feature will always cause me to prefer NHibernate over Entity Framework, until EF stops lacking it.
Second level cache. Another great NHibernate feature, lacking native support in EF. Beware, you must use transactions for leveraging this. It allows NHibernate to automatically evict cached entries for you as you change data through your application process. Without transactions, NHibernate will disable the second level cache as soon as you start changing data, for avoiding letting the cache yield you stale data.
Interceptors. This is one of many ways to customize NHibernate's inner workings. NHibernate is very strong at allowing you to extend it. You may also add your own HQL extensions as here, or your own linq2NH extension as here (all are answers from me). And there are other ways, see this list for linq2NH extensibility solutions.
Moreover, a class reference would very likely be close to the Hibernate one. There are so many internal APIs supporting its implementation that it is not very usable.
Why are such APIs not hidden (internal, private, ...)? Not hiding them is required to allow the great extensibility capabilities of NHibernate. Those capabilities are a must-have in my opinion. In contrast, it is so hard to fix some other .Net projects' shortcomings, due to their lack of extensibility. (MVC FileResult and the TweakDispositionAsInline I had to use instead of just being able to override some method, or trying to extend linq-to-entities, see this.)
There is a good book that covers a lot, and there is the HTML documentation on the site (which also comes as a book).
(The book would be Manning's NHibernate in Action: a little outdated, but a good start.)
Here is the link to the online reference

Coldfusion ORM Large Tables

Say I have a large dataset: the table has well over a million records and the database is normalized, so there are foreign keys and so on. I've set up the relations properly and I get a list of the first object with applications = EntityLoad("entityName"), but because of the relations the page takes about 24 seconds to load. Even when I limit the number of records shown to about 5, it takes an awfully long time to load.
My solution to this was to create another object that just gets the list, and then, when the user wants to, use the object with all the relations and show it to the user. Is this the right way to approach it, or am I missing a big ORM concept?
Are you counting just the time to get the data, or are you perhaps doing a CFDUMP on it or something else visual that could be slow? In other words, have you wrapped the EntityLoad by itself in a cftimer tag to be sure that it is the culprit?
The first thing I would do is enable SQL logging in your Application.cfc. Add logSQL=true to This.ormSettings.
That should allow you to grab the SQL that ORM generates. Run it in an analyzer. See if the ORM SQL is doing something crazy. See if it is an index that you missed or something.
Also, are you doing paging as Ray talks about here: http://www.coldfusionjedi.com/index.cfm/2009/8/14/Simple-ColdFusion-9-ORM-Paging-Demo?
If not, have you tried using ORMExecuteQuery and HQL to enable paging?
Those are my thoughts.
When defining complex domain models with Hibernate - you will sometimes need to tweak the mapping to improve performance. This is especially true if you are dealing with inheritance (not sure how much inheritance is in your model). The ultimate goal is to have your query pulling from as few tables as possible while still preserving your domain model. This might require using the advanced inheritance mappings (more on that in a sec).
LOGGING SQL
As Terry mentioned, you will want to be sure you can log the actual SQL that is being passed to your database (yeah, you don't totally get away from SQL with ORM). Here is a great article on setting up logging for Hibernate in CF9 from Rupesh:
http://www.rupeshk.org/blog/index.php/2009/07/coldfusion-orm-how-to-log-sql/
HIBERNATE MAPPING FILES
Anytime you want to do something beyond the basic, you want to be sure that you are looking at the actual Hibernate mapping files that are generated for your CFC's. Be sure to set the following with all of your hibernate options in Application.cfc:
savemapping = true
While the cfproperty properties allow you to define many aspects of the mapping, there are actually some things that can only be done in the Hibernate mapping files (and there are tons of community resources on this).
INHERITANCE MAPPING
As I mentioned earlier, Hibernate provides different inheritance strategies for mapping. They are Table per Hierarchy, Table per subclass, Table per concrete class, and implicit polymorphism. You can read more about these types in the CF9 docs under Advanced Mapping > Inheritance Mapping or in the Hibernate documentation (as it would take forever to explain each of these).
Knowing how your tables are mapped is very important with inheritance (and it is also where Hibernate can generate some HUGE queries if you don't tweak your setup).
Those are the things I can think of - if you can give some additional information about your domain model - we can look to see what other things might be done to tweak it.
There is a good chance Hibernate is doing its caching thing. A fair comparison in my mind (everyone please feel free to add) is that doing an:
EntityLoad("entity_name") is the same as doing a select * from TABLE
So, in this case, what Hibernate might be doing is allocating memory and caching the results in a certain way; your database server might do something similar when you send it such a broad SQL instruction.
I have been extremely interested in ORM the past few weeks and it looks to be a very rewarding undertaking.
For this reason, is there a time you would ever load all 500,000 records as a result? I assume not.
I have one large logging table that I will be attacking, I am finding that the SQL good stuff must be there. For example, mark the fields that are indexes as such, this will speed it up incredibly when searching. I am sure the ORM can handle this.
Beyond this:
Find some excellent Hibernate forums, resources, and tutorials so you can learn Hibernate. This isn't really as much a Coldfusion --> ORM issue as what Hibernate might do on its own. I have ordered a few Hibernate books that I'm waiting on to see how they are.
Likewise there seems to be an incredible amount of Hibernate resources out there where you can bring the Performance enhancement solutions of Hibernate into the Coldfusion sphere. I might be making it too simple, but I see the CF-ORM implementation as a wrapper with some code generation to save us time.
Take a look at implementing filters to cut down your data in the EntityLoad() call.
As recommended in other threads, turn on sql logging and see what sql is being generated. Chances are it might not be what you need. Check out HQL to see if you can form a better statement.
Most importantly, share what you find. I'll volunteer to do the same on this as you've tempted me to go try this out in my spare time a bit sooner than planned.
Faisal, we ran into this with Linq (c# orm).
Our solution was to create simple objects not holding the relational data. For instance, along with Users we had a SimpleUsers object which held little or no relation to any other object and had a limited set of columns.
There could be other ways of handling this but this approach helped tremendously with the query speed.

Prevent Fluent NHibernate select n+1

I have a fairly deep object graph (5-6 nodes), and as I traverse portions of it NHProf is telling me I've got a "Select N+1" problem (which I do).
The two solutions I'm aware of are
Eager load children
Break apart my object graph (and eager load)
I don't really want to do either of these (although I may break the graph apart later, as I foresee it growing).
For now....
Is it possible to tell NHibernate (with FluentNHibernate) that whenever I try to access children, to load them all in one go, instead of select-n+1-ing as I iterate over them?
I'm also getting "unbounded results set"s, which is presumably the same problem (or rather, will be solved by the above solution if possible).
Each child collection (throughout the graph) will only ever have about 20 members, but 20^5 is a lot, so I don't want to eager load everything when I get the root, but simply get all of a child collection whenever I go near it.
Edit: an afterthought.... what if I want to introduce paging when I want to render children? Do I HAVE to break my object graph here, or is there some sneakiness I can employ to solve all these issues?
It sounds to me that you want to pursue the approach of using your domain model rather than creating a specific NHibernate query to handle this scenario. Given this, I would suggest you take a look at the batch-size attribute, which you can apply to your collections. The Fluent NHibernate fluent interface does not yet support this attribute, but as a workaround you can use:
HasMany(x => x.Children).AsSet().SetAttribute("batch-size", "20")
Given the general lack of information about your exact scenario, I cannot say for sure whether batch-size is the ideal solution, but I certainly recommend you give it a go. If you haven't already, I suggest you read these:
http://www.nhforge.org/wikis/howtonh/lazy-loading-eager-loading.aspx
http://nhibernate.info/doc/nhibernate-reference/performance.html
The NHibernate performance documentation will explain how batch-size works.
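As a rough illustration of why batch-size helps with Select N+1, the sketch below counts statements in plain Java. It is a simplified model, not NHibernate code: BatchFetchCount and queries are hypothetical names, and it assumes every parent's lazy collection ends up being accessed and that each batch is filled to the configured size (the real loader may pick smaller batches).

```java
// Hypothetical model of batch fetching: counts SQL statements only.
final class BatchFetchCount {
    // Statements needed to load `parents` rows and then touch every parent's
    // lazy collection, with collections fetched `batchSize` at a time.
    static int queries(int parents, int batchSize) {
        if (parents < 0 || batchSize < 1) {
            throw new IllegalArgumentException("parents >= 0 and batchSize >= 1 required");
        }
        int batches = (parents + batchSize - 1) / batchSize; // ceiling division
        return 1 + batches;
    }

    public static void main(String[] args) {
        System.out.println(queries(100, 1));   // no batching: 101 statements
        System.out.println(queries(100, 20));  // batch-size 20: 6 statements
    }
}
```

The collections stay lazy, so nothing is loaded for parts of the graph that are never visited; batching only reduces the number of round trips once a collection is touched.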
Edit: I am not aware of any way to page from your domain model. I recommend you write NH queries for scenarios where paging is required.
Edit: an afterthought.... what if I want to introduce paging when I want to render children? Do I HAVE to break my object graph here, or is there some sneakiness I can employ to solve all these issues?
Well, if you only load the children then you can page them :). But if you want something like LoadParent AND PageChildren, then I don't think you can do that.