NHibernate - Linq query using COUNT(DISTINCT) - nhibernate

I'm trying to get a paged query to work properly using LINQ and NHibernate. On a single table this works perfectly, but as soon as more than one table is joined it falls apart. Here is what I have so far.
public virtual PagedList<Provider> GetPagedProviders(int startIndex, int count, System.Linq.Expressions.Expression<Func<Record, bool>> predicate) {
var firstResult = startIndex == 1 ? 0 : (startIndex - 1) * count;
var query = (from p in session.Query<Record>()
.Where(predicate)
select p.Provider);
var rowCount = query.Select(x => x.Id).Distinct().Count();
var pageOfItems = query.Distinct().Skip(firstResult).Take(count).ToList<Provider>();
return new PagedList<Provider>(pageOfItems, startIndex, count, rowCount);
}
I'm having a couple of issues with this. First, the rowCount variable is pulling back a COUNT(*) on a join that is one-to-many, so my count is way off from what it should be: I'm expecting 129 but getting back over 1K.
The second issue I'm having is with the results. If I'm on the first page (firstResult = 0), I get correct results. However, if firstResult is anything other than 0, it produces completely different SQL. Below is the SQL generated in both scenarios; I've trimmed out some of the fat to make it a little more readable.
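To see why a one-to-many join inflates the count, here is a minimal in-memory sketch. It does not use NHibernate; the RecordRow type and both method names are made up for illustration, and the only assumption is that several Record rows can point at the same Provider.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class JoinCountDemo
{
    // Hypothetical stand-in for a Record row: only the foreign key matters here.
    public sealed class RecordRow
    {
        public int ProviderId { get; set; }
    }

    // Counts one per joined row, which is what COUNT(*) over the join returns.
    public static int CountJoinedRows(IEnumerable<RecordRow> records)
    {
        return records.Select(r => r.ProviderId).Count();
    }

    // Counts each provider once, which is what COUNT(DISTINCT provider.Id) returns.
    public static int CountDistinctProviders(IEnumerable<RecordRow> records)
    {
        return records.Select(r => r.ProviderId).Distinct().Count();
    }
}
```

With three records where two share a provider, the first method reports 3 and the second reports 2, which is the same kind of gap as the 1K versus 129 described above.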
--firstResult = 0
select distinct TOP (30) provider1_.Id as Id12_, provider1_.TaxID as TaxID12_,
provider1_.Facility_Name as Facility5_12_, provider1_.Last_name as Last6_12_,
provider1_.First_name as First7_12_, provider1_.MI as MI12_
from PPORecords record0_ left outer join PPOProviders provider1_ on record0_.Provider_id=provider1_.Id,
PPOProviders provider2_ where record0_.Provider_id=provider2_.Id
and provider2_.TaxID='000000000'
--firstResult = 30
SELECT TOP (30) Id12_, TaxID12_, Facility5_12_, Last6_12_, First7_12_, MI12_
FROM (select distinct provider1_.Id as Id12_, provider1_.TaxID as TaxID12_,
provider1_.Facility_Name as Facility5_12_,
provider1_.Last_name as Last6_12_,
provider1_.First_name as First7_12_,
provider1_.MI as MI12_,
ROW_NUMBER() OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row
from PPORecords record0_ left outer join PPOProviders provider1_ on record0_.Provider_id=provider1_.Id,
PPOProviders provider2_
where record0_.Provider_id=provider2_.Id and provider2_.TaxID='000000000') as query
WHERE query.__hibernate_sort_row > 30
ORDER BY query.__hibernate_sort_row
The problem with the second query is the "distinct" keyword is not present on the outer query, only the inner query. Any ideas how to correct this?
Thanks for any suggestions!
[UPDATE]
This is the code that builds the Linq query predicate.
private Expression<Func<Record, bool>> ParseQueryExpression(string Query) {
Expression<Func<Record, bool>> mExpression = x => true;
string[] splitQuery = Query.Split('|');
foreach (string query in splitQuery) {
if (string.IsNullOrEmpty(query))
continue;
int valStartIndex = query.IndexOf('(');
string variable = query.Substring(0, valStartIndex);
string value = query.Substring(valStartIndex + 1, query.IndexOf(')') - valStartIndex - 1);
switch (variable) {
case "tax":
mExpression = x => x.Provider.TaxID == value;
break;
case "net":
mExpression = Combine<Record>(mExpression, x => x.Network.Id == int.Parse(value));
break;
case "con":
mExpression = Combine<Record>(mExpression, x => x.Contract.Id == int.Parse(value));
break;
case "eff":
mExpression = Combine<Record>(mExpression, x => x.Effective_Date >= DateTime.Parse(value));
break;
case "trm":
mExpression = Combine<Record>(mExpression, x => x.Term_Date <= DateTime.Parse(value));
break;
case "pid":
mExpression = Combine<Record>(mExpression, x => x.Provider.Id == long.Parse(value));
break;
case "rid":
mExpression = Combine<Record>(mExpression, x => x.Rate.Id == int.Parse(value));
break;
}
}
return mExpression;
}
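The Combine<Record>() helper used above is not shown in the question. A minimal sketch of what such a helper might look like, assuming it is meant to AND two predicates together; the class name PredicateCombiner and the ParameterReplacer visitor are hypothetical, and the real helper may differ:

```csharp
using System;
using System.Linq.Expressions;

public static class PredicateCombiner
{
    // Rewrites every use of one lambda parameter into another,
    // so that both predicate bodies refer to the same parameter.
    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterReplacer(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _from ? _to : base.VisitParameter(node);
        }
    }

    // Assumption: "combine" means logical AND of the two predicates.
    public static Expression<Func<T, bool>> Combine<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        ParameterExpression parameter = left.Parameters[0];
        Expression rightBody =
            new ParameterReplacer(right.Parameters[0], parameter).Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, rightBody), parameter);
    }
}
```

Rebinding the parameter matters because each lambda has its own parameter instance; simply wrapping both bodies in AndAlso would leave the right-hand body referring to a parameter the combined lambda does not declare.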

It seems as if the Linq provider has some "unexpected" behaviour there. I could reproduce the issue with the rowCount that you are describing and also encountered some issues when trying to fix it with a subquery.
If I understand your intentions correctly you basically want to have a paged list of Providers whose Records match certain criteria.
In that case I would suggest using a subquery. However, I tried implementing the subquery with the Query() method but it did not work. So I tried the QueryOver() method which worked flawlessly. In your case the desired queries would look like this:
Provider pAlias = null;
var query = session.QueryOver<Provider>(() => pAlias).WithSubquery
.WhereExists(QueryOver.Of<Record>()
.Where(predicate)
.And(r => r.Provider.Id == pAlias.Id)
.Select(r => r.Provider));
var rowCount = query.RowCount();
var pageOfItems = query.Skip(firstResult).Take(count).List<Provider>();
That way you do not have to struggle with the Distinct(). And as much as I personally like and use NHibernate.Linq, I suppose, if it does not work in a given situation, one should use something else that works.
Edit: Splitting the query into smaller units.
// the method ParseQueryExpression() does not need to be modified, the Expression should work like that
System.Linq.Expressions.Expression<Func<Record, bool>> predicate = x => x.Provider.TaxID == "000000000";
// Query to evaluate predicate
IQueryable<Provider> queryId = session.Query<Record>().Where(predicate).Select(r => r.Provider);
// extracting the IDs to use in a subquery
List<int> idList = queryId.Select(p => p.Id).Distinct().ToList();
// total count of distinct Providers
int rowCount = idList.Count;
// new Query on Provider, using the distinct Ids
var query = session.Query<Provider>().Where(p => idList.Contains(p.Id));
// the List<Provider> to display on the page
var pageOfItems = query.Skip(firstResult).Take(count).ToList();
The problem here is that you have multiple queries and therefore multiple round trips to the database. Another problem appears when you examine the generated SQL: the idList.Contains(p.Id) condition generates an IN clause with one parameter per id, which can become very long.
As you can see, these are NHibernate.Linq queries. That is because the new QueryOver feature is not a true Linq provider and does not work well with dynamic Linq expressions.
You could get around that limitation by using detached QueryOvers. The drawback is that you would have to modify your ParseQueryExpression() method, e.g. like this:
private QueryOver<Record> ParseQueryExpression(string Query)
{
Record rAlias = null;
var detachedQueryOver = QueryOver.Of<Record>(() => rAlias)
.JoinQueryOver(r => r.Provider)
.Where(() => rAlias.Provider.TaxID == "000000000")
.Select(x => x.Id);
// modify your method to match the return value
return detachedQueryOver;
}
// in your main method
var detachedQueryOver = QueryOver.Of<Record>()
.WithSubquery
.WhereProperty(r => r.Id)
.In(this.ParseQueryExpression(...));
var queryOverList = session.QueryOver<Provider>()
.WithSubquery
.WhereProperty(x => x.Id)
.In(detachedQueryOver.Select(r => r.Provider.Id));
int rowCount = queryOverList.RowCount();
var pageOfItems = queryOverList.Skip(firstResult).Take(count).List();
Well, I hope you are not too confused by that.

Related

NHibernate Linq Query with Projection and Count error

I have the following query:
var list = repository.Query<MyClass>().Select(domain => new MyDto()
{
Id = domain.Id,
StringComma = string.Join(",", domain.MyList.Select(y => y.Name))
});
That works great:
list.ToList();
But if I try to get the Count I got an exception:
list.Count();
Exception
NHibernate.Hql.Ast.ANTLR.QuerySyntaxException
A recognition error occurred. [.Count[MyDto](.Select[MyClass,MyDto](NHibernate.Linq.NhQueryable`1[MyClass], Quote((domain, ) => (new MyDto()domain.Iddomain.Name.Join(p1, .Select[MyListClass,System.String](domain.MyList, (y, ) => (y.Name), ), ))), ), )]
Any idea how to fix that without using ToList ?
The point is that we should NOT call Count() on the projection; call it on the underlying query instead. So this will work:
var query = repository.Query<MyClass>();
var list = query.Select(domain => new MyDto()
{
Id = domain.Id,
StringComma = string.Join(",", domain.MyList.Select(y => y.Name))
});
var count = query.Count();
When we use ICriteria query, the proper syntax would be
var criteria = ... // criteria, with WHERE, SELECT, ORDER BY...
// HERE cleaned up, just to contain WHERE clause
var totalCountCriteria = CriteriaTransformer.TransformToRowCount(criteria);
So, for Count, use the simplest possible query, i.e. one containing only the same JOINs and WHERE part.
If you really don't need the results, but only the count, then you shouldn't even bother writing the .Select() clause. Radim's answer as posted is a good way to both get the results and the count, but if your database supports it, use future queries to execute both in the same roundtrip to the database:
var query = repository.Query<MyClass>();
var list = query.Select(domain => new MyDto()
{
Id = domain.Id,
StringComma = string.Join(",", domain.MyList.Select(y => y.Name))
}).ToFuture();
var countFuture = query.ToFutureValue(q => q.Count()); // Count() alone would execute immediately
int actualCount = countFuture.Value; //queries are actually executed here
Note that in NH prior to 3.3.3, this would still execute two round trips (see https://nhibernate.jira.com/browse/NH-3184), but it would work, and if you ever upgrade NH, you get a (minor) performance boost.

NHibernate check for same join in QueryOver

Using QueryOver I am creating a query like this
Student studentAlias = null;
BulkActionItem bulkActionItemAlias1 = null;
BulkActionItem bulkActionItemAlias2 = null;
var query = GetSession().QueryOver<Student>(() => studentAlias)
.JoinAlias(() => studentAlias.BulkNotifications, () => bulkActionItemAlias1, NHibernate.SqlCommand.JoinType.LeftOuterJoin);
if (query.UnderlyingCriteria.GetCriteriaByAlias("bulkActionItemAlias2") == null)
query = query.JoinAlias(() => studentAlias.BulkNotifications, () => bulkActionItemAlias2, NHibernate.SqlCommand.JoinType.LeftOuterJoin);
This will crash because I have the same join twice with different aliases. Is it possible to check if a join already exists on a query, even with a different alias?
I haven't found a built-in way to accomplish this. Typically I use an out parameter with extension methods to keep track of what tables are part of the query. For example:
bool joinedOnBulkNotifications;
BulkNotification notificationAlias = null;
Expression<Func<object>> aliasExpr = () => notificationAlias;
var query = GetSession().QueryOver<Student>(() => studentAlias)
.FilterByBulkNotificationStatus(
someCondition, aliasExpr, out joinedOnBulkNotifications);
public static class QueryExtensions
{
public static IQueryOver<Student, Student> FilterByBulkNotificationStatus(
this IQueryOver<Student, Student> query,
bool someCondition,
Expression<Func<object>> aliasExpr,
out bool joinedOnBulkNotifications)
{
joinedOnBulkNotifications = false;
if (someCondition)
{
joinedOnBulkNotifications = true;
query.JoinAlias(s => s.BulkNotifications, aliasExpr);
}
return query;
}
}
The issue is that you might need to reuse the alias you created later. You might be tempted to pass in a BulkNotification and use that, but this only works if the parameter name matches the name of the variable you pass to the extension method. NHibernate uses the name of the variable to create an alias name, so if these two do not match, you'll get an error. Because of this, you need to wrap the alias in an Expression and use that instead.
This isn't a very clean option, so I hope someone has a better solution.

Querying Raven with Where() only filters against the first 128 documents?

We're using Raven to validate logins so people can get into our site.
What we've found is that if you do this:
// Context is an IDocumentSession
Context.Query<UserModels>()
.SingleOrDefault(u => u.Email.ToLower() == email.ToLower());
The query only filters on the first 128 documents in Raven. There are several thousand in our database, so unless your email happens to be in that first 128 returned, you're out of luck.
None of the Raven sample code, or any sample code I've come across on the net, performs any looping using Skip() and Take() to iterate through the set.
Is this the desired behavior of Raven?
Is it the same behavior even if you use an advanced Lucene query, i.e. do advanced queries behave any differently?
Is the solution below appropriate? Looks a little ugly. :P
My solution is to loop through the set of all documents until I encounter a non-null result, then break and return it.
public T SingleWithIndex(string indexName, Func<T, bool> where)
{
var pageIndex = 1;
const int pageSize = 1024;
RavenQueryStatistics stats;
var queryResults = Context.Query<T>(indexName)
.Statistics(out stats)
.Customize(x => x.WaitForNonStaleResults())
.Take(pageSize)
.Where(where).SingleOrDefault();
if (queryResults == null && stats.TotalResults > pageSize)
{
for (var i = 0; i < (stats.TotalResults / (pageIndex * pageSize)); i++)
{
queryResults = Context.Query<T>(indexName)
.Statistics(out stats)
.Customize(x => x.WaitForNonStaleResults())
.Skip(pageIndex * pageSize)
.Take(pageSize)
.Where(where).SingleOrDefault();
if (queryResults != null) break;
pageIndex++;
}
}
return queryResults;
}
EDIT:
Using the fix below is not passing query params to my RavenDB instance. Not sure why yet.
Context.Query<UserModels>()
.Where(u => u.Email == email)
.SingleOrDefault();
In the end I am using Advanced Lucene Syntax instead of linq queries and things are working as expected.
RavenDB does not understand the predicate passed to SingleOrDefault, so it performs a query without the filter. Your condition is then executed on the result set, but by default Raven only returns the first 128 documents.
Instead, you have to call
Context.Query<UserModels>()
.Where(u => u.Email == email)
.SingleOrDefault();
so the filtering is done by RavenDB/Lucene.
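The difference between the two orderings can be sketched with plain in-memory collections. This is not RavenDB API; the class and method names are hypothetical, and the page size of 128 is taken from the answer above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PageThenFilterDemo
{
    // Default page size as stated in the answer above.
    public const int DefaultPageSize = 128;

    // Mimics filtering on the client after the server has already
    // returned only its first page of unfiltered documents.
    public static string FilterAfterPaging(IEnumerable<string> emails, string wanted)
    {
        return emails.Take(DefaultPageSize).SingleOrDefault(e => e == wanted);
    }

    // Mimics filtering on the server first, so a match is found
    // regardless of where it sits in the full set.
    public static string FilterBeforePaging(IEnumerable<string> emails, string wanted)
    {
        return emails.Where(e => e == wanted).Take(DefaultPageSize).SingleOrDefault();
    }
}
```

Given 1000 addresses with the wanted one at position 500, FilterAfterPaging returns null while FilterBeforePaging returns the address, which mirrors the login failures described in the question.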

LINQ & Lambda Expressions equivalent of SQL In

Is there a lambda equivalent of IN? I would like to select all the funds with ids 5, 6 or 7. One way of writing it is:
List<FundHistoricalPrice> fundHistoricalPrices = lionContext.FundHistoricalPrices.Where(fhp => fhp.Fund.FundId == 5 || fhp.Fund.FundId == 6 || fhp.Fund.FundId == 7).ToList();
However, that quickly becomes unmanageable if I need it to match say 100 different fundIds. Can I do something like:
List<FundHistoricalPrice> fundHistoricalPrices = lionContext.FundHistoricalPrices.Where(fhp => fhp.Fund.FundId in(5,6,7)).ToList();
It's somewhere along these lines, but I can't quite agree with the approach you have taken. But this will do if you really want to do this:
.Where(fhp => new List<int>{5,6,7}.Contains( fhp.Fund.FundId )).ToList();
You may want to construct the List of ids before your LINQ query...
You can use the Contains() method on a collection to get the equivalent to in.
var fundIds = new [] { 5, 6, 7 };
var fundHistoricalPrices = lionContext.FundHistoricalPrices.Where(fhp => fundIds.Contains(fhp.Fund.FundId)).ToList();
You could write an extension method like this :
public static bool In<T>(this T source, params T[] list)
{
if(null==source) throw new ArgumentNullException("source");
return list.Contains(source);
}
Then :
List<FundHistoricalPrice> fundHistoricalPrices = lionContext.FundHistoricalPrices.Where(fhp => fhp.Fund.FundId.In(5,6,7)).ToList();
No, the only similar operator I'm aware of is the Contains() function.
Another way is to construct your query dynamically by using the predicate builder from LINQKit: http://www.albahari.com/nutshell/predicatebuilder.aspx
Example
int[] fundIds = new int[] { 5,6,7};
var predicate = PredicateBuilder.False<FundHistoricalPrice>();
foreach (int id in fundIds)
{
int tmp = id;
predicate = predicate.Or (fhp => fhp.Fund.FundId == tmp);
}
var query = lionContext.FundHistoricalPrices.Where (predicate);

LINQ: Count number of true booleans in multiple columns

I'm using LINQ to SQL to speed up delivery of a project, which it's really helping with. However I'm struggling with a few things I'm used to doing with manual SQL.
I have a LINQ collection containing three columns, each containing a boolean value representing whether an e-mail, mobile or address is available.
I want to write a LINQ query to give me a count of trues for each column, i.e. how many rows in the e-mail column are set to true (and the same for the other two columns).
If you need a single object containing the results:
var result = new {
HasEmailCount = list.Count(x => x.HasEmail),
HasMobileCount = list.Count(x => x.HasMobile),
HasAddressCount = list.Count(x => x.HasAddress)
};
Or using the aggregate function:
class Result
{
public int HasEmail;
public int HasAddress;
public int HasMobile;
}
var x = data.Aggregate(
new Result(),
(res, next) => {
res.HasEmail += (next.HasEmail ? 1 : 0); // add 1 only when the flag is true
res.HasAddress += (next.HasAddress ? 1 : 0);
res.HasMobile += (next.HasMobile ? 1 : 0);
return res;
}
);
x is of type Result and contains the aggregated information. This approach can also be used for more complex aggregations.
var mobileCount = myTable.Count(user => user.MobileAvailable);
And so on for the other counts.
You can do it like so:
var emailCount = yourDataContext.YourTable.Count(r => r.HasEmail);
etc.
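Each separate Count() call against a data context is its own query. A possible way to get all three counts from a single query is to group on a constant and count inside the group. The sketch below runs against any IQueryable; the ContactFlags and FlagCounts types are hypothetical, and whether a given provider translates this into one SQL statement is an assumption worth checking against the generated SQL.

```csharp
using System.Linq;

public static class FlagCounter
{
    // Hypothetical row shape with the three boolean columns from the question.
    public sealed class ContactFlags
    {
        public bool HasEmail { get; set; }
        public bool HasMobile { get; set; }
        public bool HasAddress { get; set; }
    }

    // Hypothetical result holder for the three counts.
    public sealed class FlagCounts
    {
        public int HasEmailCount { get; set; }
        public int HasMobileCount { get; set; }
        public int HasAddressCount { get; set; }
    }

    // Groups every row under the same constant key, then counts each flag
    // within that single group. An empty source yields no group at all,
    // so fall back to a result with all counts at zero.
    public static FlagCounts CountAll(IQueryable<ContactFlags> source)
    {
        return source
            .GroupBy(r => 1)
            .Select(g => new FlagCounts
            {
                HasEmailCount = g.Count(r => r.HasEmail),
                HasMobileCount = g.Count(r => r.HasMobile),
                HasAddressCount = g.Count(r => r.HasAddress)
            })
            .SingleOrDefault() ?? new FlagCounts();
    }
}
```

For an in-memory list, call it as FlagCounter.CountAll(list.AsQueryable()).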