nhibernate cross table query optimization - nhibernate

I have a query I've written with NHibernate's Criteria functionality and I want to optimize it. The query joins 4 tables. The query works, but the generated SQL is returning all the columns for the 4 tables as opposed to just the information I want to return. I'm using SetResultTransformer on the query which shapes the returned data to an Individual, but not until after the larger sql is returned from the server.
Here's the NHibernate Criteria
return session.CreateCriteria(typeof(Individual))
.CreateAlias("ExternalIdentifiers", "ExternalIdentifier")
.CreateAlias("ExternalIdentifier.ExternalIdentifierType", "ExternalIdentifierType")
.CreateAlias("ExternalIdentifierType.DataSource", "Datasource")
.Add(Restrictions.Eq("ExternalIdentifier.Text1", ExternalId))
.Add(Restrictions.Eq("ExternalIdentifierType.Code", ExternalIdType))
.Add(Restrictions.Eq("Datasource.Code", DataSourceCode))
.SetResultTransformer(new NHibernate.Transform.RootEntityResultTransformer());
And the generated sql (from NHProfiler) is
SELECT (all columns from all joined tables)
FROM INDIVIDUAL this_
inner join EXTERNAL_ID externalid1_
on this_.INDIVIDUAL_GUID = externalid1_.GENERIC_GUID
inner join EXTERNAL_ID_TYPE externalid2_
on externalid1_.EXTERNAL_ID_TYPE_GUID = externalid2_.EXTERNAL_ID_TYPE_GUID
inner join SYSTEM_SRC datasource3_
on externalid2_.SYSTEM_SRC_GUID = datasource3_.SYSTEM_SRC_GUID
WHERE externalid1_.EXTERNAL_ID_TEXT_1 = 96800 /* @p0 */
and externalid2_.EXTERNAL_ID_TYPE_CODE = 'PATIENT' /* @p1 */
and datasource3_.SYSTEM_SRC_CODE = 'TOUCHPOINT' /* @p2 */
I only want the columns back from the Individual table. I could set a projection, but then I lose the Individual type.
I could also rewrite this with DetachedCriteria.
Are these my only options?

I had exactly the same question and asked one of NHibernate development team members directly, here is the answer I got:
With the Criteria API, the only thing you could do is to use a projection list and include all of the properties of the root entity... then you would need to use the AliasToBeanResultTransformer. Far from an optimal solution, obviously.
If you don't mind rewriting the query with HQL, then you can do it very easily. An HQL query that says "from MyEntity e join e.Association" will select both the entity's columns and the association's columns (just like the problem you're having with Criteria). But an HQL query that says "select e from MyEntity e join e.Association" will only select the columns of e.

Related

How to model and query objects in relational databases?

I have a complex database scheme for a dictionary. Each object (essentially a translation) is similar to this:
Entry {
keyword;
examples;
tags;
Translations;
}
with
Translation {
text;
tags;
examples;
}
and
Example {
text;
translation;
phonetic_script;
}
That is, tags (i.e. grammar) can belong either to the keyword itself or to the translation (grammar of the foreign language), and similarly examples can belong either to the translation itself (i.e. explaining the foreign word) or to the text in the entry. I ended up with this relational design:
entries(id,keyword,)
tags(tag)
examples(id,text,...)
entrytags(entry_id,tag)
entryexamples(entry_id,example_id)
translations(id,belongs_to_entry,...)
translationtags(transl_id, tag)
translationexamples(transl_id,example_id)
My main task is querying this database. Say I search for "foo", my current way of handling is:
query all entries with foo, get ids A
foreach id in A
query all examples belonging to id
query all tags belonging to id
query all translations belonging to A, store their ids in B
foreach tr_id in B
query all tags belonging to tr_id
query all examples belonging to tr_id
to rebuild my objects. This looks cumbersome to me, and it is slow. I do not see how I could significantly improve this by using joins or otherwise. I have a hard time mapping these objects to relations in the database. Is this a proper design?
How can I make this more efficient to improve query time?
Each query being called in the loop takes at a minimum a certain base duration of time to execute, even for trivial queries. Many environment factors contribute to what this duration is but for now let's assume it's 10 milliseconds. If the first query matches 100 entries then there are at a minimum 301 total queries being called, each taking 10 ms, for a total of 3 seconds. The number of loop iterations varies which can contribute to a substantial variation in the performance.
Restructuring the queries with joins will create more complex queries but the total number of queries being called can be reduced down to a fixed number, 4 in the queries below. Suppose now that each query takes 50 ms to execute now that it is more complex and the total duration becomes 200 ms, a substantial decrease from 3000 ms.
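The arithmetic above can be checked with a short sketch. It is written in Python purely for illustration; the function names are hypothetical, and the 10 ms and 50 ms figures are the assumed costs from the text, not measurements.

```python
def looped_query_count(entries: int) -> int:
    """Lower bound on queries issued by the nested-loop approach.

    One query finds the entries; each entry then needs at least three more
    (examples, tags, translations). The per-translation queries are ignored,
    so the real count is higher.
    """
    return 1 + 3 * entries


def total_ms(queries: int, ms_per_query: int) -> int:
    """Total time if every query costs the same fixed amount."""
    return queries * ms_per_query


looped = total_ms(looped_query_count(100), 10)  # 301 queries at an assumed 10 ms
joined = total_ms(4, 50)                        # 4 join queries at an assumed 50 ms
print(looped, joined)
```

The looped cost grows with the number of matching entries, while the joined cost stays at four queries regardless of how many entries match.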
The 4 queries shown below should come close to achieving the desired result. There are other ways to write the queries, such as using subqueries or listing the tables in the FROM clause, but these show how to do it with JOINs. The condition entries.keyword = 'foo' stands in for the condition the original query uses to select the entries.
It is worth noting that if the foo condition on entries is very expensive to compute, then other optimizations may be needed to further improve performance. In these examples the condition is a simple comparison which is quick to look up in an index, but a condition using LIKE, which may require a full table scan, may not work well with these queries.
The following query selects all examples matching the original query. The condition from the original query is expressed as a WHERE clause on the entries.keyword column.
SELECT entries.id, examples.text
FROM entries
INNER JOIN entryexamples
ON (entries.id = entryexamples.entry_id)
INNER JOIN examples
ON (entryexamples.example_id = examples.id)
WHERE entries.keyword = 'foo';
This query selects tags matching the original query. Only two tables are joined in this case because the entrytags.tag column is all that is needed; joining with tags would only provide the same value.
SELECT entries.id, entrytags.tag
FROM entries
INNER JOIN entrytags
ON (entries.id = entrytags.entry_id)
WHERE entries.keyword = 'foo';
This query selects the translation tags for the original query. This is similar to the previous query to select the entrytags but another layer of joins is used here for the translations.
SELECT entries.id, translationtags.tag
FROM entries
INNER JOIN translations
ON (entries.id = translations.belongs_to_entry)
INNER JOIN translationtags
ON (translations.id = translationtags.transl_id)
WHERE entries.keyword = 'foo';
The final query does for translation examples what the first query did for entry examples, with an additional join through translations. It's getting to be a lot of joins, but in general it should perform significantly better than looping through and executing individual queries.
SELECT entries.id, examples.text
FROM entries
INNER JOIN translations
ON (entries.id = translations.belongs_to_entry)
INNER JOIN translationexamples
ON (translations.id = translationexamples.transl_id)
INNER JOIN examples
ON (translationexamples.example_id = examples.id)
WHERE entries.keyword = 'foo';
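As a rough illustration of how the four result sets might be stitched back together, here is a minimal sketch using Python's built-in sqlite3 module. The column types, the translations.text column, the sample rows, and the load helper are all assumptions made for the example; the four SELECT statements are the ones shown above, with the keyword passed as a parameter.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Schema from the question; column types and sample data are assumptions.
conn.executescript("""
CREATE TABLE entries (id INTEGER PRIMARY KEY, keyword TEXT);
CREATE TABLE examples (id INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE entrytags (entry_id INTEGER, tag TEXT);
CREATE TABLE entryexamples (entry_id INTEGER, example_id INTEGER);
CREATE TABLE translations (id INTEGER PRIMARY KEY, belongs_to_entry INTEGER, text TEXT);
CREATE TABLE translationtags (transl_id INTEGER, tag TEXT);
CREATE TABLE translationexamples (transl_id INTEGER, example_id INTEGER);
INSERT INTO entries VALUES (1, 'foo'), (2, 'bar');
INSERT INTO examples VALUES (10, 'foo in a sentence'), (11, 'Foo im Satz');
INSERT INTO entrytags VALUES (1, 'noun');
INSERT INTO entryexamples VALUES (1, 10);
INSERT INTO translations VALUES (100, 1, 'Foo');
INSERT INTO translationtags VALUES (100, 'masculine');
INSERT INTO translationexamples VALUES (100, 11);
""")

# The four join queries, keyed by the attribute they fill in.
QUERIES = {
    "examples": """SELECT entries.id, examples.text FROM entries
        INNER JOIN entryexamples ON entries.id = entryexamples.entry_id
        INNER JOIN examples ON entryexamples.example_id = examples.id
        WHERE entries.keyword = ?""",
    "tags": """SELECT entries.id, entrytags.tag FROM entries
        INNER JOIN entrytags ON entries.id = entrytags.entry_id
        WHERE entries.keyword = ?""",
    "translation_tags": """SELECT entries.id, translationtags.tag FROM entries
        INNER JOIN translations ON entries.id = translations.belongs_to_entry
        INNER JOIN translationtags ON translations.id = translationtags.transl_id
        WHERE entries.keyword = ?""",
    "translation_examples": """SELECT entries.id, examples.text FROM entries
        INNER JOIN translations ON entries.id = translations.belongs_to_entry
        INNER JOIN translationexamples ON translations.id = translationexamples.transl_id
        INNER JOIN examples ON translationexamples.example_id = examples.id
        WHERE entries.keyword = ?""",
}


def load(keyword):
    """Run a fixed number of queries and group every row by entry id."""
    result = defaultdict(lambda: {name: [] for name in QUERIES})
    for name, sql in QUERIES.items():
        for entry_id, value in conn.execute(sql, (keyword,)):
            result[entry_id][name].append(value)
    return dict(result)


entries = load("foo")
print(entries)
```

Because every query uses inner joins, an entry with no examples, tags, or translations does not appear in the result at all, and the translation rows are grouped per entry rather than per translation. Selecting translations.id as well would be needed to attach tags and examples to individual translations.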

How to select one table using NHibernate CreateCriteria

How can I create the following SQL statement in Nhibernate using CreateCriteria:
SELECT distinct top 20 a.* from ActivityLog a
left join WallPost w on a.ActivityLogId = w.ActivityLogId left join ItemStatus i on i.StatusId = w.ItemStatus
I always tend to get all columns from all tables returned in the sql statement producing duplicates even though I map it to the ActivityLog table. I am also doing paging as the code below shows:
ICriteria crit = nhelper.NHibernateSession.CreateCriteria(typeof(Model.ActivityLog), "a").CreateAlias("a.WallPosts", "w",CriteriaSpecification.LeftJoin)
.CreateAlias("w.ItemStatus", "i", CriteriaSpecification.LeftJoin)
.SetMaxResults(pageSize).SetFirstResult(startRow).AddOrder(Order.Desc("a.Date"));
Thanks
H
Sounds like you have set lazy loading to false in your mapping files meaning that all the associations and child collections are being loaded up too. Can you verify that?
You ask "How to select one table using NHibernate CreateQuery" (HQL). In that case you can choose what to fetch using select.
In your text you're using criteria. AFAIK, you cannot directly control what columns you want to read. However, if you create a DetachedCriteria for the join, this won't be fetched.

Enormous SQL generated for Linq-to-Sql Grouping

I have one Linq to SQL query that simply joins three tables and then performs grouping. Here is the query:
(from s in mktActualSales
join p in sysPeriods on s.PeriodID equals p.PeriodID
join d in setupDesignations on s.PositionID equals d.DesignationID
group new { s, p } by new{d.Title,d.DesignationID} into temping
select
new
{
SPOPosition = temping.Key.Title,
SalesPeriods = temping.Select(x=>new {PeriodID = x.p.PeriodID, StartDate = x.p.StartDate, EndDate = x.p.EndDate, SaleTargetID = x.s.ActualSaleID, IsApproved = x.s.IsApproved}),
PositionID = temping.Key.DesignationID
}).Take(5)
When I inspect SQL generated (executed) by this query in LinqPad, there are 6 SQL statements; first one performs join and grouping and rest of queries are the same just calling same query over and over again with different parameters. Obviously parameters are the values included in Key of the group.
This makes me believe that for a result with 130 groups, Linq will hit the database 131 times. How can I avoid so many hits to the database? Should I perform the grouping after loading the data into memory, i.e. after calling ToList on the joined query?
(I am assuming mktActualSales is your table and you missed putting the Datacontext in the question.)
If it is your table you should have a look at LoadOptions on your DataContext. With this you can eager load some of the related data and this might prevent the repetitive queries.
http://msdn.microsoft.com/en-us/library/system.data.linq.dataloadoptions.aspx
If it is not a table, have a look on how you populate mktActualSales but judging from your question I assume it is a table.
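The alternative raised in the question, grouping in memory after a single joined query, can be sketched in a language-neutral way. The Python below is only an illustration of the idea, not Linq to SQL code; the flat rows and their values are hypothetical stand-ins for what one joined query would return.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical flat rows from a single joined query:
# (designation_id, title, period_id, actual_sale_id)
rows = [
    (1, "Manager", 201, 9001),
    (2, "Officer", 201, 9002),
    (1, "Manager", 202, 9003),
]

# Group on (designation_id, title) in memory. groupby only merges adjacent
# rows, so the rows are sorted on the same key first.
key = itemgetter(0, 1)
grouped = {
    k: [(period_id, sale_id) for _, _, period_id, sale_id in group]
    for k, group in groupby(sorted(rows, key=key), key=key)
}
print(grouped)
```

This trades the per-group round trips for one larger result set, which is usually the better deal when the number of groups is large; whether it is acceptable depends on how many joined rows have to be pulled into memory.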

NHibernate HQL subselect left join?

I have the following issue and I would really appreciate your help:
I have all my filtering done through the HQL and one of the filters is a left outer join which must be a sub-select. So my thinking here is that I can create a dummy business object which I would like to populate (with an SP if possible) with data and use it in my left join.
Now how can I do that and still have all of the logic in 1 HQL query?
I guess the biggest issue for me is understanding how can I have this BO (not actually mapped to a table etc.) be used in a query where all the other BOs are mapped to tables etc.
I am trying to avoid doing any filtering in the actual C# code.
Thanks!
Example:
Entity A - mapped to table A
Entity B - mapped to table B
Entity C - mapped to table C
Entity D - not mapped to table - coming from SQL query or SP
HQL:
from A pbo
inner join B.EntityType etype
inner join C.EntityAddressList eadr
left join D.Level lvl
You cannot join to arbitrary SQL in HQL. So I don't think what you're trying to do is possible. You'll likely either have to map whatever D is, or write your query in SQL using Session.CreateSQLQuery(). You can still specify any entities that come back. See this article for reference. http://www.nhforge.org/doc/nh/en/index.html#d0e10274

How to implement paging in NHibernate with a left join query

I have an NHibernate query that looks like this:
var query = Session.CreateQuery(@"
select o
from Order o
left join o.Products p
where
(o.CompanyId = :companyId) AND
(p.Status = :processing)
order by o.UpdatedOn desc")
.SetParameter("companyId", companyId)
.SetParameter("processing", Status.Processing)
.SetResultTransformer(Transformers.DistinctRootEntity);
var data = query.List<Order>();
I want to implement paging for this query, so I only return x rows instead of the entire result set.
I know about SetMaxResults() and SetFirstResult(), but because of the left join and DistinctRootEntity, that could return fewer than x Orders.
I tried "select distinct o" as well, but the SQL that is generated for that (using the SQL Server 2008 dialect) seems to ignore the distinct for pages after the first one (I think this is the problem).
What is the best way to accomplish this?
In these cases, it's best to do it in two queries instead of one:
Load a page of orders, without joins
Load those orders with their products, using the in operator
There's a slightly more complex example at http://ayende.com/Blog/archive/2010/01/16/eagerly-loading-entity-associations-efficiently-with-nhibernate.aspx
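The two-query approach can be sketched at the SQL level with Python's built-in sqlite3 module. This is not NHibernate code: the table layout, the sample rows, and the load_page helper are assumptions for illustration, and the status filter from the question is left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data standing in for Order and its Products.
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, company_id INTEGER, updated_on TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
INSERT INTO orders VALUES (1, 7, '2010-01-01'), (2, 7, '2010-01-02'), (3, 7, '2010-01-03');
INSERT INTO products VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c'), (4, 3, 'd'), (5, 3, 'e');
""")


def load_page(company_id, page_size, first_result):
    # Query 1: one row per order, so LIMIT/OFFSET counts orders, not joined rows.
    orders = conn.execute(
        "SELECT id, updated_on FROM orders WHERE company_id = ? "
        "ORDER BY updated_on DESC LIMIT ? OFFSET ?",
        (company_id, page_size, first_result),
    ).fetchall()
    ids = [order_id for order_id, _ in orders]
    if not ids:
        return []
    # Query 2: fetch the products of just those orders with the IN operator.
    placeholders = ",".join("?" * len(ids))
    products = {}
    for order_id, name in conn.execute(
        f"SELECT order_id, name FROM products "
        f"WHERE order_id IN ({placeholders}) ORDER BY id",
        ids,
    ):
        products.setdefault(order_id, []).append(name)
    # Keep the page order from query 1.
    return [(order_id, products.get(order_id, [])) for order_id in ids]


page = load_page(7, 2, 0)
print(page)
```

Paging happens entirely in the first query, where each row is exactly one order, so a page of size x always holds x orders while enough remain.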
Use SetResultTransformer(Transformers.AliasToBean()) and get the data that is not the entity.
The other solution is to change the query.
As I see it, you're returning Orders that have products which are processing.
So you could use an exists subquery. Check the NHibernate manual at 13.11. Subqueries.
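To show why an exists subquery pages cleanly where a join does not, here is a minimal SQL-level sketch using Python's built-in sqlite3 module. The schema, the sample rows, and the status values are assumptions for illustration; the HQL equivalent would need to be written against the real mappings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data: orders 1 and 3 each have two processing products.
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, company_id INTEGER, updated_on TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, order_id INTEGER, status TEXT);
INSERT INTO orders VALUES (1, 7, '2010-01-01'), (2, 7, '2010-01-02'), (3, 7, '2010-01-03');
INSERT INTO products VALUES
    (1, 1, 'processing'), (2, 1, 'processing'),
    (3, 2, 'shipped'),
    (4, 3, 'processing'), (5, 3, 'processing');
""")

# Paging over the join counts product rows, so one order can fill the page.
JOINED = """SELECT o.id FROM orders o
            JOIN products p ON p.order_id = o.id
            WHERE o.company_id = ? AND p.status = ?
            ORDER BY o.updated_on DESC LIMIT ? OFFSET ?"""

# Paging over EXISTS yields at most one row per order.
WITH_EXISTS = """SELECT o.id FROM orders o
            WHERE o.company_id = ?
              AND EXISTS (SELECT 1 FROM products p
                          WHERE p.order_id = o.id AND p.status = ?)
            ORDER BY o.updated_on DESC LIMIT ? OFFSET ?"""

args = (7, "processing", 2, 0)  # company, status, page size, first result
joined_ids = [row[0] for row in conn.execute(JOINED, args)]
exists_ids = [row[0] for row in conn.execute(WITH_EXISTS, args)]
print(joined_ids, exists_ids)
```

With this data the joined page contains order 3 twice, which collapses to a single order after de-duplication, while the exists version returns two distinct orders for the same page size.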