How to simplify this LINQ to Entities Query to make a less horrible SQL statement from it? (contains Distinct,GroupBy and Count) - sql

I have this SQL expression:
SELECT Musclegroups.Name, COUNT(DISTINCT Workouts.WorkoutID) AS Expr1
FROM Workouts INNER JOIN
Series ON Workouts.WorkoutID = Series.WorkoutID INNER JOIN
Exercises ON Series.ExerciseID = Exercises.ExerciseID INNER JOIN
Musclegroups ON Musclegroups.MusclegroupID = Exercises.MusclegroupID
GROUP BY Musclegroups.Name
Since Im working on a project which uses EF in a WCF Ria LinqToEntitiesDomainService, I have to query this with LINQ (If this isn't a must then pls inform me).
I made this expression:
var WorkoutCountPerMusclegroup = (from s in ObjectContext.Series1
join w in ObjectContext.Workouts on s.WorkoutID equals w.WorkoutID
where w.UserID.Equals(userid) && w.Type.Equals("WeightLifting")
group s by s.Exercise.Musclegroup into g
select new StringKeyIntValuePair
{
TestID = g.Select(n => n.Exercise.MusclegroupID).FirstOrDefault(),
Key = g.Select(n => n.Exercise.Musclegroup.Name).FirstOrDefault(),
Value = g.Select(n => n.WorkoutID).Distinct().Count()
});
The StringKeyIntValuePair is just a custom Entity type I made so I can send down the info to the Silverlight client. Also this is why I need to set an "TestID" for it, as it is an entity and it needs one.
And the problem is, that this linq query produces this horrible SQL statement:
http://pastebay.com/144532
I suppose there is a better way to query this information, a better linq expression maybe. Or is it possible to just query with plain SQL somehow?
EDIT:
I realized that the TestID is unnecessary because the other property named "Key" (the one on which Im grouping) becomes the key of the group, so it will be a key also. And after this, my query looks like this:
var WorkoutCountPerMusclegroup = (from s in ObjectContext.Series1
join w in ObjectContext.Workouts on s.WorkoutID equals w.WorkoutID
where w.UserID.Equals(userid) && w.Type.Equals("WeightLifting")
group w.WorkoutID by s.Exercise.Musclegroup.Name into g
select new StringKeyIntValuePair
{
Key = g.Key,
Value = g.Select(n => n).Distinct().Count()
});
This produces the following SQL: http://pastebay.com/144545
This seems far better then the previous sql statement of the half-baked linq query.
But is this good enough? Or this is the boundary of LinqToEntities capabilities, and if I want even more clear sql, I should make another DomainService which operates with LinqToSQL or something else?
Or the best way would be using a stored procedure, that returns Rowsets? If so, is there a best practice to do this asynchronously, like a simple WCF Ria DomainService query?

I would like to know best practices as well.
Compiling of lambda expression linq can take a lot of time (3–30s), especially using group by and then FirstOrDefault (for left inner joins meaning only taking values from the first row in the group).
The generated sql excecution might not be that bad but the compilation using DbContext which cannot be precompiled with .NET 4.0.
As an example 1 something like:
var q = from a in DbContext.A
join b ... into bb from b in bb.DefaultIfEmtpy()
group new { a, b } by new { ... } into g
select new
{
g.Key.Name1
g.Sum(p => p.b.Salary)
g.FirstOrDefault().b.SomeDate
};
Each FirstOrDefault we added in one case caused +2s compile time which added up 3 times = 6s only to compile not load data (which takes less than 500ms). This basically destroys your application's usability. The user will be waiting many times for no reason.
The only way we found so far to speed up the compilation is to mix lambda expression with object expression (might not be the correct notation).
Example 2: refactoring of previous example 1.
var q = (from a in DbContext.A
join b ... into bb from b in bb.DefaultIfEmtpy()
select new { a, b })
.GroupBy(p => new { ... })
.Select(g => new
{
g.Key.Name1
g.Sum(p => p.b.Salary)
g.FirstOrDefault().b.SomeDate
});
The above example did compile a lot faster than example 1 in our case but still not fast enough so the only solution for us in response-critical areas is to revert to native SQL (to Entities) or using views or stored procedures (in our case Oracle PL/SQL).
Once we have time we are going to test if precompilation works in .NET 4.5 and/or .NET 5.0 for DbContext.
Hope this helps and we can get other solutions.

Related

Many to many query joins in aqueduct

I have A -> AB <- B many to many relationship between 2 ManagedObjects (A and B), where AB is the junction table.
When querying A from db, how do i join B values to AB joint objects?
Query<A> query = await Query<A>(context)
..join(set: (a) => a.ab);
It gives me a list of A objects which contains AB joint objects, but AB objects doesn't include full B objects, but only b.id (not other fields from class B).
Cheers
When you call join, a new Query<T> is created and returned from that method, where T is the joined type. So if a.ab is of type AB, Query<A>.join returns a Query<AB> (it is linked to the original query internally).
Since you have a new Query<AB>, you can configure it like any other query, including initiating another join, adding sorting descriptors and where clauses.
There are some stylistic syntax choices to be made. You can condense this query into a one-liner:
final query = Query<A>(context)
..join(set: (a) => a.ab).join(object: (ab) => ab.b);
final results = await query.fetch();
This is OK if the query remains as-is, but as you add more criteria to a query, the difference between the dot operator and the cascade operator becomes harder to track. I often pull the join query into its own variable. (Note that you don't call any execution methods on the join query):
final query = Query<A>(context);
final join = query.join(set: (a) => a.ab)
..join(object: (ab) => ab.b);
final results = await query.fetch();

Has anyone noticed that EF Core 1.0 2015.1 has made the queries very inefficient

After upgrading to Asp.Net Core 2015.1 I have noticed that a lot of EF queries have become a lot slower to run.
I have done some investigation and found that a lot of the queries with where filters now get evaluated in Code, rather than passing the filters to SQL as part of a where clause to run with the query.
We ended up having to re-write a number of our queries as stored procedures to get back performance. Note these used to be efficient prior to the 2015.1 release. Something obviously was changed, and a lot of queries are doing select all queries on a table and then filtering the data in code. This approach is terrible for performance, e.g. reading a table with lots of rows, to filter everything but maybe 2 rows.
I have to ask what changed, and whether anyone else is seeing the same thing?
For example: I have a ForeignExchange table along with a ForeignExchangeRate table which are linked via ForeignExchangeid = ForeignExchangeRate.ForeignExchangeId
await _context.ForeignExchanges
.Include(x => x.ForeignExchangeRates)
.Select(x => new ForeignExchangeViewModel
{
Id = x.Id,
Code = x.Code,
Name = x.Name,
Symbol = x.Symbol,
CountryId = x.CountryId,
CurrentExchangeRate = x.ForeignExchangeRates
.FirstOrDefault(y => (DateTime.Today >= y.ValidFrom)
&& (y.ValidTo == null || y.ValidTo >= DateTime.Today)).ExchangeRate.ToFxRate(),
HistoricalExchangeRates = x.ForeignExchangeRates
.OrderByDescending(y => y.ValidFrom)
.Select(y => new FxRate
{
ValidFrom = y.ValidFrom,
ValidTo = y.ValidTo,
ExchangeRate = y.ExchangeRate.ToFxRate(),
}).ToList()
})
.FirstOrDefaultAsync(x => x.Id == id);
And I use this to get the data for editing a foreign exchange rate
So the SQL generated is not as expected. It generates the following 2 SQL statements to get the data
SELECT TOP(1) [x].[ForeignExchangeId], [x].[ForeignCurrencyCode], [x].[CurrencyName], [x].[CurrencySymbol], [x].[CountryId], (
SELECT TOP(1) [y].[ExchangeRate]
FROM [ForeignExchangeRate] AS [y]
WHERE ((#__Today_0 >= [y].[ValidFrom]) AND ([y].[ValidTo] IS NULL OR ([y]. [ValidTo] >= #__Today_1))) AND ([x].[ForeignExchangeId] = [y].[ForeignExchangeId])
)FROM [ForeignExchange] AS [x]
WHERE [x].[ForeignExchangeId] = #__id_2
and
SELECT [y0].[ForeignExchangeId], [y0].[ValidFrom], [y0].[ValidTo], [y0].[ExchangeRate]
FROM [ForeignExchangeRate] AS [y0]
ORDER BY [y0].[ValidFrom] DESC
The second query is the one that causes the slowness. If the table has many rows, then it essentially gets the whole table and filters the data in code
This has changed in the latest release as this used to work in the RC versions of EF
One other query I used to have was the following
return await _context.CatchPlans
.Where(x => x.FishReceiverId == fishReceiverId
&& x.FisherId == fisherId
&& x.StockId == stockId
&& x.SeasonStartDate == seasonStartDate
&& x.EffectiveDate >= asAtDate
&& x.BudgetType < BudgetType.NonQuotaed)
.OrderBy(x => x.Priority)
.ThenBy(x => x.BudgetType)
.ToListAsync();
and this query ended up doing a Table read (the entire table which was in the tens of thousands of rows) to get a filter subset of between 2 and 10 records. Very inefficient. This was one query I had to replace with a stored procedure. Reduced from approx 1.5-3.0 seconds down to milliseconds. And note this used to run efficiently before the upgrade
This is a known issue on EF core 1.0.The solution right now is to convert all your critical queries to sync one.The problem is on Async queries right now.They'll sort out this issue on EF core 1.1.0 version.But it has not been released yet.
Here is the Test done by the EF core dev team member :
You can find out more info here :EF Core 1.0 RC2: async queries so much slower than sync
Another suggestion I would like to do.That is try your queries with .AsNoTracking().That too will improve the query performance.
.AsNoTracking()
Sometimes you may want to get entities back from a query but not have
those entities be tracked by the context. This may result in better
performance when querying for large numbers of entities in read-only
scenarios.

How can I convert the following SQL query to run in entity framework?

I am new to entity framework and learning making queries. Can anyone please help me how can I convert the following SQL query to run in entity framework?
select max(isnull(TInvoice.InvoiceNr, 0)) + 1
from TInvoice inner join TOrders
on TInvoice.OrderId = TOrders.OrderId
where TOrders.ClientFirmId = 1
As comments have said, without the data model it is hard to be exact.
Would really need to see how you have defined your relations in your data model.
I guess from first read my first impression is something along the lines of:
int max = context.TInvoice.Where(x => x.TOrders.ClientFirmId == 1).Max(x => x.InvoiceNr);

Creating a Rails 3 scope that joins to a subquery

First off, I'm a Ruby/Rails newbie, so I apologize if this question is basic.
I've got a DB that (among other things) looks like this:
organizations { id, name, current_survey_id }
surveys { id, organization_id }
responses { id, survey_id, question_response_integer }
I'm trying to create a scope method that adds the average of the current survey answers to a passed-in Organization relation. In other words, the scope that's getting passed into the method would generate SQL that looks like more-or-less like this:
select * from organizations
And I'd like the scope, after it gets processed by my lambda, to generate SQL that looks like this:
select o.id, o.name, cs.average_responses
from organizations o join
(select r.id, avg(r.question_response_integer) as average_responses
from responses r
group by r.id) cs on cs.id = o.current_survey_id
The best I've got is something like this:
current_survey_average: lambda do |scope, sort_direction|
average_answers = Responses.
select("survey_id, avg(question_response_integer) as average_responses").
group("survey_id")
scope.joins(average_answers).order("average_responses #{sort_direction}")
end
That's mostly just a stab in the dark - among other things, it doesn't specify how the scope could be expected to join to average_answers - but I haven't been able to find any documentation about how to do that sort of join, and I'm running out of things to try.
Any suggestions?
EDIT: Thanks to Sean Hill for the answer. Just to have it on record, here's the code I ended up going with:
current_survey_average: lambda do |scope, sort_direction|
scope_table = scope.arel.froms.first.name
query = <<-QUERY
inner join (
select r.survey_id, avg(r.question_response_integer) as average_responses
from responses r
group by r.survey_id
) cs
on cs.survey_id = #{scope_table}.current_survey_id
QUERY
scope.
joins(query).
order("cs.average_responses #{sort_direction}")
end
That said, I can see the benefit of putting the averaged_answers scope directly onto the Responses class - so I may end up doing that.
I have not been able to test this, but I think the following would work, either as-is or with a little tweaking.
class Response < ActiveRecord::Base
scope :averaged, -> { select('r.id, avg(r.question_response_integer) as average_responses').group('r.id') }
scope :current_survey_average, ->(incoming_scope, sort_direction) do
scope_table = incoming_scope.arel.froms.first.name
query = <<-QUERY
INNER JOIN ( #{Arel.sql(averaged.to_sql)} ) cs
ON cs.id = #{scope_table}.current_survey_id
QUERY
incoming_scope.joins(query).order("average_responses #{sort_direction}")
end
end
So what I've done here is that I have split out the inner query into another scope called averaged. Since you do not know which table the incoming scope in current_survey_average is coming from, I got the scope table name via scope.arel.froms.first.name. Then I created a query string that uses the averaged scope and joined it using the scope_table variable. The rest is pretty self-explanatory.
If you do know that the incoming scope will always be from the organizations table, then you don't need the extra scope_table variable. You can just hardcode it into the join query string.
I would make one suggestion. If you do not have control over sort_direction, then I would not directly input that into the order string.

How can I recreate this complex SQL Query using NHibernate QueryOver?

Imagine the following (simplified) database layout:
We have many "holiday" records that relate to going to a particular Accommodation on a certain date etc.
I would like to pull from the database the "best" holiday going to each accommodation (i.e. lowest price), given a set of search criteria (e.g. duration, departure airport etc).
There will be multiple records with the same price, so then we need to choose by offer saving (descending), then by departure date ascending.
I can write SQL to do this that looks like this (I'm not saying this is necessarily the most optimal way):
SELECT *
FROM Holiday h1 INNER JOIN (
SELECT h2.HolidayID,
h2.AccommodationID,
ROW_NUMBER() OVER (
PARTITION BY h2.AccommodationID
ORDER BY OfferSaving DESC
) AS RowNum
FROM Holiday h2 INNER JOIN (
SELECT AccommodationID,
MIN(price) as MinPrice
FROM Holiday
WHERE TradeNameID = 58001
/*** Other Criteria Here ***/
GROUP BY AccommodationID
) mp
ON mp.AccommodationID = h2.AccommodationID
AND mp.MinPrice = h2.price
WHERE TradeNameID = 58001
/*** Other Criteria Here ***/
) x on h1.HolidayID = x.HolidayID and x.RowNum = 1
As you can see, this uses a subquery within another subquery.
However, for several reasons my preference would be to achieve this same result in NHibernate.
Ideally, this would be done with QueryOver - the reason being that I build up the search criteria dynamically and this is much easier with QueryOver's fluent interface. (I had started out hoping to use NHibernate Linq, but unfortunately it's not mature enough).
After a lot of effort (being a relative newbie to NHibernate) I was able to re-create the very inner query that fetches all accommodations and their min price.
public IEnumerable<HolidaySearchDataDto> CriteriaFindAccommodationFromPricesForOffers(IEnumerable<IHolidayFilter<PackageHoliday>> filters, int skip, int take, out bool hasMore)
{
IQueryOver<PackageHoliday, PackageHoliday> queryable = NHibernateSession.CurrentFor(NHibernateSession.DefaultFactoryKey).QueryOver<PackageHoliday>();
queryable = queryable.Where(h => h.TradeNameId == website.TradeNameID);
var accommodation = Null<Accommodation>();
var accommodationUnit = Null<AccommodationUnit>();
var dto = Null<HolidaySearchDataDto>();
// Apply search criteria
foreach (var filter in filters)
queryable = filter.ApplyFilter(queryable, accommodationUnit, accommodation);
var query1 = queryable
.JoinQueryOver(h => h.AccommodationUnit, () => accommodationUnit)
.JoinQueryOver(h => h.Accommodation, () => accommodation)
.SelectList(hols => hols
.SelectGroup(() => accommodation.Id).WithAlias(() => dto.AccommodationId)
.SelectMin(h => h.Price).WithAlias(() => dto.Price)
);
var list = query1.OrderByAlias(() => dto.Price).Asc
.Skip(skip).Take(take+1)
.Cacheable().CacheMode(CacheMode.Normal).List<object[]>();
// Cacheing doesn't work this way...
/*.TransformUsing(Transformers.AliasToBean<HolidaySearchDataDto>())
.Cacheable().CacheMode(CacheMode.Normal).List<HolidaySearchDataDto>();*/
hasMore = list.Count() == take;
var dtos = list.Take(take).Select(h => new HolidaySearchDataDto
{
AccommodationId = (string)h[0],
Price = (decimal)h[1],
});
return dtos;
}
So my question is...
Any ideas on how to achieve what I want using QueryOver, or if necessary Criteria API?
I'd prefer not to use HQL but if it is necessary than I'm willing to see how it can be done with that too (it makes it harder (or more messy) to build up the search criteria though).
If this just isn't doable using NHibernate, then I could use a SQL query. In which case, my question is can the SQL be improved/optimised?
I have manage to achieve such dynamic search criterion by using Criteria API's. Problem I ran into was duplicates with inner and outer joins and especially related to sorting and pagination, and I had to resort to using 2 queries, 1st query for restriction and using the result of 1st query as 'in' clause in 2nd creteria.