I'm seeing some strange behaviour when using Take() with a join. Consider the following example:
Comment comment = null;
var persons = _repository
    .QueryOver()
    .Left.JoinAlias(x => x.Comments, () => comment)
    .Where(x => x.Age > 20)
    .Take(5)
    .Future()
    .ToList();
Well, I'd expect 5 persons to be present in the list, each of them with a list of N comments.
But the result gives 5 persons, with a maximum of 5 comments.
Why is .Take(5) also restricting the number of comments?
How to achieve the desired result?
The point here is the difference between the paging we need and the way it is implemented. While we would expect 5 root elements to be returned, the Take(5) is applied to the SQL result set, i.e. to 5 selected rows (root rows multiplied by their joined comments).
Some clue could be found in this Q & A: NHibernate QueryOver with Fetch resulting multiple sql queries and db hits
In the case of paging with the SQL Server 2012 dialect, we would see SQL like this:
SELECT ...
FROM root
JOIN collection
WHERE....
OFFSET 0 ROWS -- skip 0 or 10 or 20 rows
FETCH NEXT 5 ROWS ONLY; -- take 5 rows
So in the end it could result in returning just ONE root entity, if the number of children (comments) is 5 or more.
Solution? I would suggest doing it like this:
select just root entity
use batch fetching to load children
Selecting just the root could even mean also selecting any many-to-one/references. This ends up in a star-schema-like structure, which even with left joins will still page correctly over the root entity.
Batch fetching is described in the documentation here:
19.1.5. Using batch fetching
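A minimal sketch of that approach (assuming a Person entity whose Comments collection is mapped with batch-size="25"; _repository.QueryOver() is the same repository call as in the question):
// query and page over the root entity only - no collection join,
// so Take(5) limits Person rows, not Person x Comment rows
var persons = _repository
    .QueryOver()
    .Where(x => x.Age > 20)
    .Take(5)
    .List();

// touching the collection later triggers batch fetching: NHibernate loads
// the comments for up to 25 persons in one extra SELECT ... WHERE ... IN (...)
var firstPersonsComments = persons[0].Comments;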
Some similar issues:
NHibernate: Select one to Many Left Join - Take X latest from Parent
How to Eager Load Associations without duplication in NHibernate?
I have the following Cypher queries and their execution plans, respectively.
Before optimization:
match (o:Order {statusId:74}) <- [:HAS_ORDERS] - (m:Member)
with m,o
match (m:Member) - [:HAS_WALLET] -> (w:Wallet) where w.currentBalance < 250
return m as Members,collect(o) as Orders,w as Wallets order by m.createdAt desc limit 10
After optimization (db hits reduced by 40-50%):
match (m:Member) - [:HAS_ORDERS]->(o:Order {statusId:74})
with m, collect(o) as Orders
match (m) - [:HAS_WALLET] - (w:Wallet) where w.currentBalance < 250
return m as Members, Orders, w as Wallets
order by m.createdAt desc limit 10
There are 3 types of nodes: Member, Order and Wallet. The relationships between them go like this:
Member - [:HAS_ORDERS] -> Order,
Member - [:HAS_WALLET] -> Wallet
I have around 100k Member nodes (and 100k wallets) and almost 570k orders for those members.
I want to fetch all the members who have an order with status 74 and a wallet balance of less than 250. The above query gives the desired result, but it takes on average 1.5 seconds to respond.
I suspect there is still scope for optimization here, but I'm not able to figure it out. I've added indexes on the fields I'm filtering on.
I've just started exploring Neo4j and am not sure how I can optimize this.
We can leverage index-backed ordering to try a different approach here. By providing a type hint (something to indicate the property value is a string) along with ordering by the indexed property, we can have the planner use the index to walk the :Member nodes in the order you want (by m.createdAt DESC) for free, meaning we don't need to touch every :Member node and sort them. We then check each node in that order until we find the 10 that meet the desired criteria.
From some back-and-forth on the Neo4j users slack, you mentioned that of your 100k :Member nodes, about 52k of them fit the criteria you're looking for, so this is a good indicator that we may not have to look very far down the ordered :Member nodes before finding the 10 that meet the criteria.
Here's the query:
MATCH (m:Member)
WHERE m.createdAt > '' // type hint
WITH m
ORDER BY m.createdAt DESC
MATCH (m)-[:HAS_WALLET]->(w)
WHERE w.currentBalance < 250 AND EXISTS {
MATCH (m)-[:HAS_ORDERS]->(:Order {statusId:74})
}
WITH m, w
LIMIT 10
RETURN m as member, w as wallet, [(m)-[:HAS_ORDERS]->(o:Order {statusId:74}) | o] as orders
Note that by using an existential subquery, we just have to find one order that satisfies the condition. We wait until after the limit of 10 members is reached before using a pattern comprehension to grab all the orders for the 10 members.
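For index-backed ordering to kick in, :Member(createdAt) itself has to be indexed. If it isn't yet, something like this would create it (Neo4j 4.x syntax; the index name is arbitrary):
CREATE INDEX member_createdAt IF NOT EXISTS FOR (m:Member) ON (m.createdAt)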
Have you tried subqueries? You can use a subquery to shrink down the number of nodes before passing them along to subsequent parts of the query. (It would seem that an omniscient query planner could do this itself, but Cypher isn't there yet.) You may have to experiment with which subquery filters out the most nodes.
An example of using a subquery is here:
https://community.neo4j.com/t/slow-query-with-very-limited-data-and-boolean-false/31555
Another one is here:
https://community.neo4j.com/t/why-is-this-geospatial-search-so-slow/31952/24
(Of course, I assume you already have the appropriate properties indexed.)
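As a rough sketch of that idea (Neo4j 4.1+ CALL subquery syntax, reusing the labels and properties from the question; whether it actually helps depends on your data), the wallet filter could be pushed into a subquery first:
MATCH (m:Member)
CALL {
  WITH m
  MATCH (m)-[:HAS_WALLET]->(w:Wallet)
  WHERE w.currentBalance < 250
  RETURN w
}
MATCH (m)-[:HAS_ORDERS]->(o:Order {statusId: 74})
RETURN m AS Members, collect(o) AS Orders, w AS Wallets
ORDER BY m.createdAt DESC LIMIT 10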
I have two queries: the first query returns the top 10/20 records, the second query returns the total record count for the first query. Both queries need to use the same filter condition.
How can I write the filter condition and the parameters used in it in one place and use them in both queries?
I can store the condition in a string variable and use it in both queries, but how do I share the parameters?
I am using HQL
Check this similar Q & A: Nhibernate migrate ICriteria to QueryOver
There is native support in NHibernate for getting a row count. Let's take some query:
// the QueryOver
var query = session.QueryOver<MyEntity>();
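For example, the shared filter would be applied once to this query object (IsActive is just a placeholder property here):
// the shared WHERE part is written only once, on the original query
query.Where(x => x.IsActive);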
The query could have any number of where parts, projections... Now we just take its underlying criteria and use a transformer to create a brand new criteria, ready out of the box to get the total row count:
// GET A ROW COUNT query (ICriteria)
var rowCount = CriteriaTransformer.TransformToRowCount(query.UnderlyingCriteria);
The next step is to use Future to get both queries in one round trip to the DB:
// ask for a list, but with a Future, to combine both in one SQL round trip
var futureList = query
    .Future<MyEntity>();

// ask for the row count as a future value as well
var futureCount = rowCount
    .FutureValue<int>();

// accessing the value executes the main and the count query at once
var count = futureCount
    .Value;

// list is now in memory, ready to be used
var list = futureList
    .ToList();
Summary:
I have a list of work items that I am attempting to assign to a list of workers. Each worker is allowed to have a max of 100 work items assigned to them. Each work item specifies the user that should work it (associated as an owner).
For example:
Jim works a total of 5 accounts, each with multiple work items. In total Jim already has 50 items assigned to him, so I am allowed to assign only 50 more.
My plight/goal:
I am using a temp table and a select statement to get the number of items each owner currently has assigned to them; I calculate the available slots for new items and store the values in a new column. I need to be able to select from the items table where the owner matches my list of owners and their available slots (in the temp table), retrieving for each user only a number of rows equal to that user's available slots. The query would return only 50 rows for Jim, even though there may be 200 matching the criteria, while Sam may get 0 rows because he has no available slots, even though there are 30 items for him in the items table.
I realize I may be approaching this problem wrong. I want to avoid using a cursor.
Edit: Adding some example code
SELECT
nUserID_Owner
, CASE
WHEN COUNT(c.nWorkID) >= 100 THEN 0
ELSE 100 - COUNT(c.nWorkID)
END
,COUNT(c.nWorkID)
FROM tblAccounts cic
LEFT JOIN tblWorkItems c
ON c.sAccountNumber = cic.sAccountNumber
AND c.nUserID_WorkAssignedTo = cic.nUserID_Owner
AND c.nTeamID_WorkAssignedTo = cic.nTeamID_Owner
WHERE cic.nUserID_Collector IS NOT NULL
AND nUserID_CurrentOwner = 5288
AND c.bCompleted = 0
GROUP BY nUserID_Owner
This produces output values of 5288, 50, 50 (in Jim's scenario).
It took longer than I wanted, but I found a solution.
I did use a sub-query, as suggested above, to produce the work items with a per-user row count.
I used PARTITION BY to produce a row number for each worker's items and required in my HAVING clause that the row number must be < the count of available slots. I'd post the code, but it's beyond the character limit and I'd also have to change a lot of things to anonymise the system properly.
Originally I was approaching the problem incorrectly, focusing on limiting the results rather than thinking about creating the data necessary to relate the result sets.
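The shape of that solution looks roughly like this (a sketch only, filtering on the row number in an outer select rather than in a HAVING clause: #AvailableSlots stands for the temp table with the computed free-slot counts, and nUserID_Owner on the work item stands for whatever column ties an unassigned item to its intended owner):
WITH RankedItems AS (
    SELECT w.*,
           ROW_NUMBER() OVER (PARTITION BY w.nUserID_Owner
                              ORDER BY w.nWorkID) AS nRowNumber
    FROM tblWorkItems w
    WHERE w.bCompleted = 0
)
SELECT r.*
FROM RankedItems r
JOIN #AvailableSlots a
  ON a.nUserID_Owner = r.nUserID_Owner
WHERE r.nRowNumber <= a.nAvailableSlots;  -- at most "available slots" rows per owner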
I have a featured section on my website that contains featured posts of three types: normal, big and small. Currently I am fetching the three types in three separate queries, like so:
@featured_big_first = Post.visible.where(pinged: 1).where('overlay_type =?', :big).limit(1)
@featured_big_first = Post.visible.where(pinged: 1).where('overlay_type =?', :small).limit(1)
@featured_big_first = Post.visible.where(pinged: 1).where('overlay_type =?', :normal).limit(5)
Basically I am looking for a query that will combine those three in to one and fetch 1 big, 1 small, 5 normal posts.
I'd be surprised if you don't want an order. As you have it, it is supposed to find a random small post, a random big post, and 5 random normal posts.
Yes, you can use a UNION. However, you will have to execute raw SQL. Look at the log for the SQL of each of your three queries, and execute a SQL string consisting of those three queries with UNION in between. It might work, or it might have problems with the limits.
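For example, the raw SQL (run via something like Post.find_by_sql) could look roughly like this; the conditions behind the visible scope are omitted, and whether LIMIT is allowed inside each parenthesised branch depends on your database:
(SELECT * FROM posts WHERE pinged = 1 AND overlay_type = 'big' LIMIT 1)
UNION ALL
(SELECT * FROM posts WHERE pinged = 1 AND overlay_type = 'small' LIMIT 1)
UNION ALL
(SELECT * FROM posts WHERE pinged = 1 AND overlay_type = 'normal' LIMIT 5);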
It is possible in SQL by joining the table to itself: do a group by on one of the aliases of the table, add a where clause comparing the other aliased table to the grouped one on the ordering column, and add a having clause requiring the count over the other alias to be under the limit.
So, if you had a simple query of the posts table (without the visible and pinged conditions) and wanted the records with the latest created_at dates, then the query for the normal posts would be:
SELECT posts1.*
FROM posts posts1, posts posts2
WHERE posts2.created_at >= posts1.created_at
AND posts1.overlay_type = 'normal'
AND posts2.overlay_type = 'normal'
GROUP BY posts1.id
HAVING count(posts2.id) <= 5
Take this SQL, and add your conditions for visible and pinged, remembering to use the condition for both posts1 and posts2.
Then write the big and small versions and UNION it all together.
I'd stick with the three database calls.
I don't think this is possible in a single query, but you can use scopes, which are a more Rails-like way to write the code.
Also, it may be just a typo, but you are reassigning @featured_big_first, so it will contain the data of the last query only.
in post.rb
scope :overlay_type_record, lambda { |type| visible.where(pinged: 1).where('overlay_type = ?', type) }
and in controller
@featured_big_first = Post.overlay_type_record(:big).limit(1)
@featured_small_first = Post.overlay_type_record(:small).limit(1)
@featured_normal_first = Post.overlay_type_record(:normal).limit(5)
How can HQL be used to select specific objects that meet a certain criteria?
We've tried the following to generate a list of top ten subscribed RSS feeds (where SubscriptionCount is a derived property):
var topTen = UoW.Session.CreateQuery(@"SELECT distinct rss
FROM RssFeedSubscription rss
group by rss.FeedUrl
order by rss.SubscriptionCount DESC
")
.SetMaxResults(10)
.List<RssFeedSubscription>();
Where the intention is only to select the two unique feed URLs in the database, rather than instantiating the ten rows in the database as objects. The result of the above is:
Column 'RssSubscriptions.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
It's possible just to thin out the results so that we take out the two unique feed URLs after we get the data back from the database, but there must be a way to do this at the DB level using HQL?
EDIT: We realise it's possible to do a Scalar query and then manually pull out values, but is there not a way of simply specifying a match criteria for objects pulled back?
If you change your HQL a bit to look like this:
var topTen = UoW.Session.CreateQuery(@"SELECT distinct rss.FeedUrl
FROM RssFeedSubscription rss
group by rss.FeedUrl
order by rss.SubscriptionCount DESC
")
.SetMaxResults(10)
.List();
the topTen variable will be a list with 2 elements, the 2 feed URLs.
You can have this returned as a strongly typed collection if you use the SetResultTransformer() method of the IQuery interface.
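A rough sketch of that (FeedSummary is a hypothetical DTO with FeedUrl and SubscriptionCount properties; the aliases in the HQL must match those property names, and Transformers lives in the NHibernate.Transform namespace):
var topFeeds = UoW.Session.CreateQuery(@"select rss.FeedUrl as FeedUrl,
                                                max(rss.SubscriptionCount) as SubscriptionCount
                                         from RssFeedSubscription rss
                                         group by rss.FeedUrl
                                         order by max(rss.SubscriptionCount) desc")
    .SetMaxResults(10)
    .SetResultTransformer(Transformers.AliasToBean<FeedSummary>())
    .List<FeedSummary>();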
You need to perform a scalar query. Here is an example from the NHibernate docs:
IEnumerable results = sess.Enumerable(
"select cat.Color, min(cat.Birthdate), count(cat) from Cat cat " +
"group by cat.Color"
);
foreach ( object[] row in results )
{
Color type = (Color) row[0];
DateTime oldest = (DateTime) row[1];
int count = (int) row[2];
.....
}
It's the group by rss.FeedUrl that's causing you the problem. It doesn't look like you need it since you're selecting the entities themselves. Remove that and I think you'll be good.
EDIT - My apologies, I didn't notice the part about the "derived property". By that I assume you mean it's not an NHibernate-mapped property and thus doesn't actually have a column in the table? That would explain the second error message you received. You may need to remove the "order by" clause as well and do your sorting in application code if that's the case.