How do I flatten a hierarchy in LINQ to Entities? - sql

I have a class Org, which has ParentId (which points to a Consumer) and Orgs properties, to enable a hierarchy of Org instances. I also have a class Customer, which has a OrgId property. Given any Org instance, named Owner, how can I retrieve all Customer instances for that org? That is, before LINQ I would do a 'manual' traversal of the Org tree with Owner as its root. I'm sure something simpler exists though.
Example: If I have a root level Org called 'Film', with Id '1', and sub-Org called 'Horror' with ParentId of '1', and Id of 23, I want to query for all Customers under Film, so I must get all customers with OrgId's of both 1 and 23.

Linq won't help you with this but SQL Server will.
Create a CTE to generate a flattened list of Org Ids, something like:
CREATE PROCEDURE [dbo].[OrganizationIds]
#rootId int
AS
WITH OrgCte AS
(
SELECT OrganizationId FROM Organizations where OrganizationId = #rootId
UNION ALL
SELECT parent.OrganizationId FROM Organizations parent
INNER JOIN OrgCte child ON parent.Parent_OrganizationId = Child.OrganizationId
)
SELECT * FROM OrgCte
RETURN 0
Now add a function import to your context mapped to this stored procedure. This results in a method on your context (the returned values are nullable int since the original Parent_OrganizationId is declared as INT NULL):
public partial class TestEntities : ObjectContext
{
public ObjectResult<int?> OrganizationIds(int? rootId)
{
...
Now you can use a query like this:
// get all org ids for specific root. This needs to be a separate
// query or LtoE throws an exception regarding nullable int.
var ids = OrganizationIds(2);
// now find all customers
Customers.Where (c => ids.Contains(c.Organization.OrganizationId)).Dump();

Unfortunately, not natively in Entity Framework. You need to build your own solution. Probably you need to iterate up to the root. You can optimize this algorithm by asking EF to get a certain number of parents in one go like this:
...
select new { x.Customer, x.Parent.Customer, x.Parent.Parent.Customer }
You are limited to a statically fixed number of parent with this approach (here: 3), but it will save you 2/3 of the database roundtrips.
Edit: I think I did not get your data model right but I hope the idea is clear.
Edit 2: In response to your comment and edit I have adapted the approach like this:
var rootOrg = ...;
var orgLevels = new [] {
select o from db.Orgs where o == rootOrg, //level 0
select o from db.Orgs where o.ParentOrg == rootOrg, //level 1
select o from db.Orgs where o.ParentOrg.ParentOrg == rootOrg, //level 2
select o from db.Orgs where o.ParentOrg.ParentOrg.ParentOrg == rootOrg, //level 3
};
var setOfAllOrgsInSubtree = orgLevels.Aggregate((a, b) => a.Union(b)); //query for all org levels
var customers = from c in db.Customers where setOfAllOrgsInSubtree.Contains(c.Org) select c;
Notice that this only works for a bounded maximum tree depth. In practice, this is usually the case (like 10 or 20).
Performance will not be great but it is a LINQ-to-Entities-only solution.

Related

Union between optional and non-optional tables

I have two queries that select records where a union needs to be taken, one of which is a left join and one of which is a regular (i.e. inner) join.
Here's the left join case:
def regularAccountRecords = for {
(customer, account) <- customers joinLeft accounts on (_.accountId === _.accountId) // + some other special conditions
} yield (customer, account)
Here's the regular join case:
def specialAccountRecords = for {
(customer, account) <- customers join accounts on (_.accountId === _.accountId) // + some other special conditions
} yield (customer, account)
Now I want to take a union of the two record sets:
regularAccountRecords ++ specialAccountRecords
Obviously this doesn't work because in the regular join case it returns Query[(Customer, Account),...] and in the left join case it returns Query[(Customer, Rep[Option[Account]]),...] and this results in a Type Mismatch error.
Now, If this were a regular column type (e.g. Rep[String]) I could convert it to an optional via the ? operator (i.e. record.?) and get Rep[Option[String]] but using it on a table (i.e. the accounts table) causes:
Error:(62, 85) value ? is not a member of com.test.Account
How do I work around this issue and do the union properly?
Okay, looks like this is what the '?' projection is for but I didn't realize it because I disabled the optionEnabled option in the Codegen. Here's what your codegen extension is supposed to look like:
class MyCodegen extends SourceCodeGenerator(inputModel) {
override def TableClass = new TableClassDef {
override def optionEnabled = true
}
}
Alternatively, you can use implicit classes to tack this thing onto the generated TableClass yourself. Here is how that would look:
implicit class AccountExtensions(account:Account) {
def ? = (Rep.Some(account.id), account.name).shaped.<>({r=>r._1.map(_=> Account.tupled((r._2, r._1.get)))}, (_:Any) => throw new Exception("Inserting into ? projection not supported."))
}
NOTE: be sure to check the field ordering, depending on how this
projection is done, the union query might put the ID field in the wrong
place in the output, use
println(query.result.statements.headOption) to debug the output
SQL to be sure.
Once you do that, you will be able to use account.? in the yield statement:
def specialAccountRecords = for {
(customer, account) <- customers join accounts on (_.accountId === _.accountId)
} yield (customer, account.?)
...and then you will be able to unionize the tables correctly
regularAccountRecords ++ specialAccountRecords
I really wish the Slick people would put a note on how the '?' projection is useful in the documentation beyond the vague statement 'useful for outer joins'.

NHibernate: Why does Linq First() force only one item in all child and grandchild collections with FetchMany()

Domain Model
I've got a canonical Domain of a Customer with many Orders, with each Order having many OrderItems:
Customer
public class Customer
{
public Customer()
{
Orders = new HashSet<Order>();
}
public virtual int Id {get;set;}
public virtual ICollection<Order> Orders {get;set;}
}
Order
public class Order
{
public Order()
{
Items = new HashSet<OrderItem>();
}
public virtual int Id {get;set;}
public virtual Customer Customer {get;set;}
}
OrderItems
public class OrderItem
{
public virtual int Id {get;set;}
public virtual Order Order {get;set;}
}
Problem
Whether mapped with FluentNHibernate or hbm files, I run two separate queries, that are identical in their Fetch() syntax, with the exception of one including the .First() extension method.
Returns expected results:
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items).ToList()[0];
Returns only a single item in each collection:
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items).First();
I think I understand what's going on here, which is that the .First() method is being applied to each of the preceding statements, rather than just to the initial .Where() clause. This seems incorrect behavior to me, given the fact that First() is returning a Customer.
Edit 2011-06-17
After further research and thinking, I believe that depending on my mapping, there are two outcomes to this Method Chain:
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items);
NOTE: I don't think I can get subselect behavior, since I'm not using HQL.
When the mapping is fetch="join" I should get a cartesian product between the Customer, Order and OrderItem tables.
When the mapping is fetch="select" I should get a query for Customer, and then multiple queries each for Orders and OrderItems.
How this plays out with adding the First() method to the chain is where I lose track of what should be happening.
The SQL Query that get's issued is the traditional left-outer-join query, with select top (#p0) in front.
The First() method is translated into SQL (T-SQL at least) as SELECT TOP 1 .... Combined with your join fetching, this will return a single row, containing one customer, one order for that customer and one item for the order. You might consider this a bug in Linq2NHibernate, but as join fetching is rare (and I think you're actually hurting your performance pulling the same Customer and Order field values across the network as part of the row for each Item) I doubt the team will fix it.
What you want is a single Customer, then all Orders for that customer and all Items for all those Orders. That happens by letting NHibernate run SQL that will pull one full Customer record (which will be a row for each Order Line) and construct the Customer object graph. Turning the Enumerable into a List and then getting the first element works, but the following will be slightly faster:
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items)
.AsEnumerable().First();
the AsEnumerable() function forces evaluation of the IQueryable created by Query and modified with the other methods, spitting out an in-memory Enumerable, without slurping it into a concrete List (NHibernate can, if it wishes, simply pull enough info out of the DataReader to create one full top-level instance). Now, the First() method is no longer applied to the IQueryable to be translated to SQL, but it is instead applied to an in-memory Enumerable of the object graphs, which after NHibernate has done its thing, and given your Where clause, should be zero or one Customer record with a hydrated Orders collection.
Like I said, I think you're hurting yourself using join fetching. Each row contains the data for the Customer and the data for the Order, joined to each distinct Line. That is a LOT of redundant data, which I think will cost you more than even an N+1 query strategy.
The best way I can think of to handle this is one query per object to retrieve that object's children. It would look like this:
var session = this.generator.Session;
var customer = session.Query<Customer>()
.Where(c => c.CustomerID == id).First();
customer.Orders = session.Query<Order>().Where(o=>o.CustomerID = id).ToList();
foreach(var order in customer.Orders)
order.Items = session.Query<Item>().Where(i=>i.OrderID = order.OrderID).ToList();
This requires a query for each Order, plus two at the Customer level, and will return no duplicate data. This will perform far better than a single query returning a row containing every field of the Customer and Order along with each Item, and also better than sending a query per Item plus a query per Order plus a query for the Customer.
I'd like to update the answer with my found so that could help anybody else with the same problem.
Since you are querying the entity base on their ID, you can use .Single instead of .First or .AsEnumerable().First():
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items).Single();
This will generate a normal SQL query with where clause and without the TOP 1.
In other situation, if the result has more than one Customer, exception will be thrown so it won't help if you really need the first item of a series based on condition. You have to use 2 queries, one for the first Customer and let the lazy load do the second one.

NHibernate Nested query in FROM clause

I’m having a problem with translating T-SQL query into Nhibernate query – or writing query that will return same results but will be properly written in NH query language (HQL, Criteria, QueryOver, LINQ – I don’t truly care).
I would like to execute similar query from NHibernate:
SELECT
lic.RegNo,
lic.ReviewStatus
FROM
Licence lic
WHERE
lic.RegNo IN
(
SELECT
grouped.RegNo
FROM
(
SELECT
g.[Type],
g.Number,
MAX(g.Iteration) AS [Iteration],
MAX(g.RegNo) AS [RegNo]
FROM
Licence g
GROUP BY
g.[Type],
g.Number
) as grouped
)
ORDER BY
lic.RegNo desc
It returns the top most licenses and fetch their review status if exists. RegNo is created from Type, Number and Iteration (pattern: {0}{1:0000}-{2:00}). Each license can have multiply iterations and some of them can contains ReviewStatus, for instance:
W0004-01 NULL
W0001-03 1
P0004-02 3
P0001-02 4
If iteration part is greater than 1 it means that there are multiply iterations (n) for specific licence.
I’ve manage to create NH query by going twice to database:
LicenceInfoViewModel c = null;
var grouped = session.QueryOver<Licence>()
.SelectList(l => l
.SelectGroup(x => x.Type)
.SelectGroup(x => x.Number)
.SelectMax(x => x.Iteration)
.SelectMax(x => x.RegNo).WithAlias(() => c.RegNo)
).TransformUsing(Transformers.AliasToBean<LicenceInfoViewModel>())
.Future<LicenceInfoViewModel>();
var proper = session.QueryOver<Licence>()
.Select(x => x.RegNo, x => x.ReviewStatus)
.WhereRestrictionOn(x => x.RegNo)
.IsIn(grouped.Select(x => x.RegNo).ToArray())
.TransformUsing(Transformers.AliasToBean<LicenceInfoViewModel>())
.List<LicenceInfoViewModel>();
// ...
public class LicenceInfoViewModel
{
public string RegNo { get; set; }
public LicReviewStatus? ReviewStatus { get; set; }
}
public enum LicReviewStatus
{
InProgress,
Submitted,
Validated,
RequestForInformation,
DecissionIssued
}
However this solution is not good as it require to download all grouped licences from database, and there could be a thousands of them.
Is there a better way to write this query or is there a way to translate provided above T-SQL query into NHibernate?
adding nhibernate and hibernate tags as IMO if this can be done in hibernate it should be easily translated into nh
I don't think that SQL does what you think it does. Iteration is not used for anything.
In any case, it seems unnecessary. You can change the WHERE to the following, and you'll have both valid SQL and HQL:
lic.RegNo IN
(
SELECT
MAX(g.RegNo)
FROM
Licence g
GROUP BY
g.Type,
g.Number
)

How can I recreate this complex SQL Query using NHibernate QueryOver?

Imagine the following (simplified) database layout:
We have many "holiday" records that relate to going to a particular Accommodation on a certain date etc.
I would like to pull from the database the "best" holiday going to each accommodation (i.e. lowest price), given a set of search criteria (e.g. duration, departure airport etc).
There will be multiple records with the same price, so then we need to choose by offer saving (descending), then by departure date ascending.
I can write SQL to do this that looks like this (I'm not saying this is necessarily the most optimal way):
SELECT *
FROM Holiday h1 INNER JOIN (
SELECT h2.HolidayID,
h2.AccommodationID,
ROW_NUMBER() OVER (
PARTITION BY h2.AccommodationID
ORDER BY OfferSaving DESC
) AS RowNum
FROM Holiday h2 INNER JOIN (
SELECT AccommodationID,
MIN(price) as MinPrice
FROM Holiday
WHERE TradeNameID = 58001
/*** Other Criteria Here ***/
GROUP BY AccommodationID
) mp
ON mp.AccommodationID = h2.AccommodationID
AND mp.MinPrice = h2.price
WHERE TradeNameID = 58001
/*** Other Criteria Here ***/
) x on h1.HolidayID = x.HolidayID and x.RowNum = 1
As you can see, this uses a subquery within another subquery.
However, for several reasons my preference would be to achieve this same result in NHibernate.
Ideally, this would be done with QueryOver - the reason being that I build up the search criteria dynamically and this is much easier with QueryOver's fluent interface. (I had started out hoping to use NHibernate Linq, but unfortunately it's not mature enough).
After a lot of effort (being a relative newbie to NHibernate) I was able to re-create the very inner query that fetches all accommodations and their min price.
public IEnumerable<HolidaySearchDataDto> CriteriaFindAccommodationFromPricesForOffers(IEnumerable<IHolidayFilter<PackageHoliday>> filters, int skip, int take, out bool hasMore)
{
IQueryOver<PackageHoliday, PackageHoliday> queryable = NHibernateSession.CurrentFor(NHibernateSession.DefaultFactoryKey).QueryOver<PackageHoliday>();
queryable = queryable.Where(h => h.TradeNameId == website.TradeNameID);
var accommodation = Null<Accommodation>();
var accommodationUnit = Null<AccommodationUnit>();
var dto = Null<HolidaySearchDataDto>();
// Apply search criteria
foreach (var filter in filters)
queryable = filter.ApplyFilter(queryable, accommodationUnit, accommodation);
var query1 = queryable
.JoinQueryOver(h => h.AccommodationUnit, () => accommodationUnit)
.JoinQueryOver(h => h.Accommodation, () => accommodation)
.SelectList(hols => hols
.SelectGroup(() => accommodation.Id).WithAlias(() => dto.AccommodationId)
.SelectMin(h => h.Price).WithAlias(() => dto.Price)
);
var list = query1.OrderByAlias(() => dto.Price).Asc
.Skip(skip).Take(take+1)
.Cacheable().CacheMode(CacheMode.Normal).List<object[]>();
// Cacheing doesn't work this way...
/*.TransformUsing(Transformers.AliasToBean<HolidaySearchDataDto>())
.Cacheable().CacheMode(CacheMode.Normal).List<HolidaySearchDataDto>();*/
hasMore = list.Count() == take;
var dtos = list.Take(take).Select(h => new HolidaySearchDataDto
{
AccommodationId = (string)h[0],
Price = (decimal)h[1],
});
return dtos;
}
So my question is...
Any ideas on how to achieve what I want using QueryOver, or if necessary Criteria API?
I'd prefer not to use HQL but if it is necessary than I'm willing to see how it can be done with that too (it makes it harder (or more messy) to build up the search criteria though).
If this just isn't doable using NHibernate, then I could use a SQL query. In which case, my question is can the SQL be improved/optimised?
I have manage to achieve such dynamic search criterion by using Criteria API's. Problem I ran into was duplicates with inner and outer joins and especially related to sorting and pagination, and I had to resort to using 2 queries, 1st query for restriction and using the result of 1st query as 'in' clause in 2nd creteria.

NHibernate: how to retrieve an entity that "has" all entities with a certain predicate in Criteria

I have an Article with a Set of Category.
How can I query, using the criteria interface, for all Articles that contain all Categories with a certain Id?
This is not an "in", I need exclusively those who have all necessary categories - and others. Partial matches should not come in there.
Currently my code is failing with this desperate attempt:
var c = session.CreateCriteria<Article>("a");
if (categoryKeys.HasItems())
{
c.CreateAlias("a.Categories", "c");
foreach (var key in categoryKeys)
c.Add(Restrictions.Eq("c", key)); //bogus, I know!
}
Use the "IN" restriction, but supplement to ensure that the number of category matches is equal to the count of all the categories you're looking for to make sure that all the categories are matched and not just a subset.
For an example of what I mean, you might want to take a look at this page, especially the "Intersection" query under the "Toxi solution" heading. Replace "bookmarks" with "articles" and "tags" with "categories" to map that back to your specific problem. Here's the SQL that they show there:
SELECT b.*
FROM tagmap bt, bookmark b, tag t
WHERE bt.tag_id = t.tag_id
AND (t.name IN ('bookmark', 'webservice', 'semweb'))
AND b.id = bt.bookmark_id
GROUP BY b.id
HAVING COUNT( b.id )=3
I believe you can also represent this using a subquery that may be easier to represent with the Criteria API
SELECT Article.Id
FROM Article
INNER JOIN (
SELECT ArticleId, count(*) AS MatchingCategories
FROM ArticleCategoryMap
WHERE CategoryId IN (<list of category ids>)
GROUP BY ArticleId
) subquery ON subquery.ArticleId = EntityTable.Id
WHERE subquery.MatchingCategories = <number of category ids in list>
I'm not 100% sure, but I think query by example may be what you want.
Assuming that Article to Category is a one-to-many relationship and that the Category has a many-to-one property called Article here is a VERY dirty way of doing this (I am really not proud of this but it works)
List<long> catkeys = new List<long>() { 4, 5, 6, 7 };
if (catkeys.Count == 0)
return;
var cr = Session.CreateCriteria<Article>("article")
.CreateCriteria("Categories", "cat0")
.Add(Restrictions.Eq("cat0.Id", catkeys[0]));
if (catkeys.Count > 1)
{
for (int i = 1; i < catkeys.Count; i++)
{
cr = cr.CreateCriteria("Article", "a" + i)
.CreateCriteria("Categories", "cat" + i)
.Add(Restrictions.Eq("cat" + i + ".Id", catkeys[i]));
}
}
var results = cr.List<Article>();
What it does is to re-join the relationship over and over again guaranteeing you the AND between category Ids. It should be very slow query especially if the list of Ids gets big.
I am offering this solution as NOT a recommended way but at least you can have something working while looking for a proper one.