I'm sorting a large database by the quantity of outgoing relationships. I have a working Cypher query as follows:
optional match (n)-[r]->(m)
return n, Count(r) as c
order by c DESC limit 1;
The Cypher query is working as anticipated. However, as I am debugging and stepping through the Cypher -> Neo4jClient conversion, I cannot seem to find the root of the issue.
public ReturnPayload[] getByConnections()
{
var query = client.Cypher
.OptionalMatch("(p)-[r]->(m)")
.Return((p, r, m, c) => new
{
p = p.As<Person>(),
pid = (int)p.Id(),
e = r.As<RelationshipInstance<Object>>(),
m = m.As<Metadata>(),
c = r.Count()
}).OrderByDescending("c").Limit(1);
var res = query.Results;
var payload = new List<ReturnPayload>();
foreach (var el in res)
{
var t = new ReturnPayload();
t.e = el.e;
t.m = el.m;
t.pid = el.pid;
t.p = el.p;
payload.Add(t);
}
return payload.ToArray<ReturnPayload>();
}
I suspect that a part of the problem may be that I am not utilizing CollectAs<T>() and thus it is referring a count of '1' per each person. Unfortunately, I have attempted using CollectAs<T>() and CollectAsDisctinct<T>() and my resultant JSON architecture is only wrapping each individual element in an array, as opposed to aggregating identical elements into an array proper.
The FOREACH loop is there to assist in converting from the anonymous type into my relatively standard <ReturnPayload> which does not utilize a c object within its parameters.
Your time is appreciated, thank you.
Debugged Query Test:
OPTIONAL MATCH (p)-[r]->(m)
RETURN p AS p, id(p) AS pid, r AS e, m AS m, count(r) AS c
ORDER BY c DESC
LIMIT {p0}
And my 'functional' Cypher query:
optional match (n)-[r]->(m)
return n, Count(r) as c
order by c DESC limit 1;
So, you've identified yourself that you're running two different queries. This is the issue: your C# does not match the Cypher you're expecting.
To make the C# match the working Cypher, change it to this:
var query = client.Cypher
.OptionalMatch("(n)-[r]->(m)")
.Return((n, r) => new
{
n = n.As<Person>(),
c = r.Count()
})
.OrderByDescending("c")
.Limit(1);
That should now produce the same query, and thus the same output as you're expecting from the working Cypher.
On a purely stylistic note, you can also simplify your foreach query to a more functional equivalent:
var payload = query
.Results
.Select(r => new ReturnPayload {
n = r.n,
c = r.c
})
.ToArray();
return payload;
However, as I look closer as your query, it looks like you just want the count for the sake of getting the top 1, then you're throwing it away.
Consider using the WITH clause:
optional match (n)-[r]->(m)
with n as n, count(r) as c
order by c desc
limit 1
return n
You can map that back to C# using similar syntax:
var query = client.Cypher
.OptionalMatch("(n)-[r]->(m)")
.With((n, r) => new
{
n = n.As<Person>(),
c = r.Count()
})
.OrderByDescending("c")
.Limit(1)
.Return(n => new ReturnPayload // <-- Introduce the type here too
{
n = n.As<Person>()
});
Then, you don't need to query for data and just throw it away with another foreach loop. (You'll notice I introduce the DTO type in the Return call as well, so you don't have to translate it out of the anonymous type either.)
(Disclaimer: I'm just typing all of this C# straight into the answer; I haven't double checked the compilation, so my signatures might be slightly off.)
Hope that helps!
Related
I am trying to convert Anthyme Caillard's INotifyDataErrorInfo implementation into VB.NET. All goes well until I reach the Validate method, which has the following LINQ query, in which q is of type IEnumerable<IGrouping<string, ValidationResult>>:
var q = from r in validationResults
from m in r.MemberNames
group r by m into g
select g;
I have tried the following translation:
Dim q = From r In valResults
From m In r.MemberNames
Group r By r.MemberNames Into Group
Select Group
Or even the lambda expression version, (assuming I'm using John Skeet's syntax correctly):
Dim q = valResults.GroupBy(Function(r) r, Function(r) r.MemberNames)
but in both of these, q is of type IEnumerable(Of IEnumerable(Of ValidationResult)).
I have looked at VB and IGrouping for LINQ Query and VB.Net - Linq Group By returns IEnumerable(Of Anonymous Type), but I don't think they apply since I will be working with the group directly, and not a specialized class.
Why aren't these implementations returning the same thing, and what can I do to make q the required IEnumerable(Of IGrouping(Of String, ValidationResult))?
It seems that the VB compiler treats LINQ query syntax different than the C# compiler.
In C# the expression
from element in list
group element by element.Value
is identical to
list.GroupBy(element => element.Value)
In VB however
From element In list
Group element By element.Value Into g = Group
is translated into
list.GroupBy(Function(element) element.Value,
Function(key, element) New With { Key .Value = key, Key .g = element })
I can't see any good reason why the VB compiler introduces an anonymous type here - but that's what it does.
I recommend you always translate C# LINQ query syntax (from a in list select a.Value) into a method chain (list.Select(a => a.Value)). That way the VB compiler can't interfere. It is forced to use the exact chain of methods.
However your translation of the original query isn't correct.
var q = from r in validationResults
from m in r.MemberNames
group r by m into g
select g;
is actually translated into
var q = validationResults.SelectMany(r => r.MemberNames, (r, m) => new { r, m })
.GroupBy(t => t.m, t => t.r);
which in VB becomes:
Dim q = validationResults.SelectMany(Function(r) r.MemberNames,
Function(r, m) New With { Key .r = r, Key .m = m }) _
.GroupBy(Function(t) t.m, Function(t) t.r)
what is wrong with this linq query that show me the error of Does not contain a definition for 'union'
(from rev in db.vM29s
where Years.Contains(rev.FinancialYear) && rev.Exclude=="No"
group rev by new { rev.RevenueCode, rev.FinancialYear } into g
select new
{
Revenuecode = g.Key.RevenueCode,
totalRevenue = g.Sum(x => x.RevenueAmount),
RevenueEnglish = (from a in db.RevenueCodes where a._RevenueCode == g.Key.RevenueCode select a.RevenueEng).FirstOrDefault(),
// MinorCode = (from a in db.MinorCodes where a._MinorCode == g.Key.MinorCode select a.MinorEng),
RevenueDari = (from a in db.RevenueCodes where a._RevenueCode == g.Key.RevenueCode select a.RevenueDari).FirstOrDefault(),
Yearss = g.Key.FinancialYear
}
)
.Union
(from u in db.rtastable
where Years.Contains(u.Year)
group u by new { u.objectcode, u.Year } into g
select new
{
Revenuecode = g.Key.objectcode,
totalRevenue = g.Sum(x => x.amount),
RevenueEnglish = (from a in db.RevenueCodes where a._RevenueCode == g.Key.objectcode select a.RevenueEng).FirstOrDefault(),
// MinorCode = (from a in db.MinorCodes where a._MinorCode == g.Key.MinorCode select a.MinorEng),
RevenueDari = (from a in db.RevenueCodes where a._RevenueCode == g.Key.objectcode select a.RevenueDari).FirstOrDefault(),
Yearss = g.Key.Year
}
)
.ToList();
If you included using System.Linq; and both Anonymous Types have exactly the same property names + property types, then what you did should work.
Yet it does not work. The solution is to check your Anonymous Types for subtle property name differences and/or subtle property type differences.
E.g. even an int vs a smallint or double or decimal will cause this build error:
'System.Collections.Generic.IEnumerable<AnonymousType#1>' does not contain a definition for 'Union' and the best extension method overload 'System.Linq.Queryable.Union(System.Linq.IQueryable, System.Collections.Generic.IEnumerable)' has some invalid arguments
Switching to .Concat() will not fix this: it has the same (obvious) restriction that the types on both sides must be compatible.
After you fix the naming or typing problem, I would recommend that you consider switching to .Concat(). The reason: .Union() will call .Equals() on all objects to eliminate duplicates, but that is pointless because no two Anonymous Objects that were created independently will ever be the same object (even if their contents would be the same).
If it was your specific intention to eliminate duplicates, then you need to create a class that holds your data and that implements .Equals() in a way that makes sense.
You should use Concat or using addRange if the data is allready in memory.
I have the following query:
var list = repositoy.Query<MyClass>.Select(domain => new MyDto()
{
Id = domain.Id,
StringComma = string.Join(",", domain.MyList.Select(y => y.Name))
});
That works great:
list.ToList();
But if I try to get the Count I got an exception:
list.Count();
Exception
NHibernate.Hql.Ast.ANTLR.QuerySyntaxException
A recognition error occurred. [.Count[MyDto](.Select[MyClass,MyDto](NHibernate.Linq.NhQueryable`1[MyClass], Quote((domain, ) => (new MyDto()domain.Iddomain.Name.Join(p1, .Select[MyListClass,System.String](domain.MyList, (y, ) => (y.Name), ), ))), ), )]
Any idea how to fix that without using ToList ?
The point is, that we should NOT call Count() over projection. So this will work
var query = repositoy.Query<MyClass>;
var list = query.Select(domain => new MyDto()
{
Id = domain.Id,
StringComma = string.Join(",", domain.MyList.Select(y => y.Name))
});
var count = query.Count();
When we use ICriteria query, the proper syntax would be
var criteria = ... // criteria, with WHERE, SELECT, ORDER BY...
// HERE cleaned up, just to contain WHERE clause
var totalCountCriteria = CriteriaTransformer.TransformToRowCount(criteria);
So, for Count - use the most simple query, i.e. containing the same JOINs and WHERE part
If you really don't need the results, but only the count, then you shouldn't even bother writing the .Select() clause. Radim's answer as posted is a good way to both get the results and the count, but if your database supports it, use future queries to execute both in the same roundtrip to the database:
var query = repository.Query<MyClass>;
var list = query.Select(domain => new MyDto()
{
Id = domain.Id,
StringComma = string.Join(",", domain.MyList.Select(y => y.Name))
}).ToFuture();
var countFuture = query.Count().ToFutureValue();
int actualCount = countFuture.Value; //queries are actually executed here
Note that there in NH prior to 3.3.3, this would still execute two round-trips (see https://nhibernate.jira.com/browse/NH-3184), but it would work, and if you ever upgrade NH, you get a (minor) performance boost.
I'm currently using Entity Framework for my db access but want to have a look at Dapper. I have classes like this:
public class Course{
public string Title{get;set;}
public IList<Location> Locations {get;set;}
...
}
public class Location{
public string Name {get;set;}
...
}
So one course can be taught at several locations. Entity Framework does the mapping for me so my Course object is populated with a list of locations. How would I go about this with Dapper, is it even possible or do I have to do it in several query steps?
Alternatively, you can use one query with a lookup:
var lookup = new Dictionary<int, Course>();
conn.Query<Course, Location, Course>(#"
SELECT c.*, l.*
FROM Course c
INNER JOIN Location l ON c.LocationId = l.Id
", (c, l) => {
Course course;
if (!lookup.TryGetValue(c.Id, out course))
lookup.Add(c.Id, course = c);
if (course.Locations == null)
course.Locations = new List<Location>();
course.Locations.Add(l); /* Add locations to course */
return course;
}).AsQueryable();
var resultList = lookup.Values;
See here https://www.tritac.com/blog/dappernet-by-example/
Dapper is not a full blown ORM it does not handle magic generation of queries and such.
For your particular example the following would probably work:
Grab the courses:
var courses = cnn.Query<Course>("select * from Courses where Category = 1 Order by CreationDate");
Grab the relevant mapping:
var mappings = cnn.Query<CourseLocation>(
"select * from CourseLocations where CourseId in #Ids",
new {Ids = courses.Select(c => c.Id).Distinct()});
Grab the relevant locations
var locations = cnn.Query<Location>(
"select * from Locations where Id in #Ids",
new {Ids = mappings.Select(m => m.LocationId).Distinct()}
);
Map it all up
Leaving this to the reader, you create a few maps and iterate through your courses populating with the locations.
Caveat the in trick will work if you have less than 2100 lookups (Sql Server), if you have more you probably want to amend the query to select * from CourseLocations where CourseId in (select Id from Courses ... ) if that is the case you may as well yank all the results in one go using QueryMultiple
No need for lookup Dictionary
var coursesWithLocations =
conn.Query<Course, Location, Course>(#"
SELECT c.*, l.*
FROM Course c
INNER JOIN Location l ON c.LocationId = l.Id
", (course, location) => {
course.Locations = course.Locations ?? new List<Location>();
course.Locations.Add(location);
return course;
}).AsQueryable();
I know I'm really late to this, but there is another option. You can use QueryMultiple here. Something like this:
var results = cnn.QueryMultiple(#"
SELECT *
FROM Courses
WHERE Category = 1
ORDER BY CreationDate
;
SELECT A.*
,B.CourseId
FROM Locations A
INNER JOIN CourseLocations B
ON A.LocationId = B.LocationId
INNER JOIN Course C
ON B.CourseId = B.CourseId
AND C.Category = 1
");
var courses = results.Read<Course>();
var locations = results.Read<Location>(); //(Location will have that extra CourseId on it for the next part)
foreach (var course in courses) {
course.Locations = locations.Where(a => a.CourseId == course.CourseId).ToList();
}
Sorry to be late to the party (like always). For me, it's easier to use a Dictionary, like Jeroen K did, in terms of performance and readability. Also, to avoid header multiplication across locations, I use Distinct() to remove potential dups:
string query = #"SELECT c.*, l.*
FROM Course c
INNER JOIN Location l ON c.LocationId = l.Id";
using (SqlConnection conn = DB.getConnection())
{
conn.Open();
var courseDictionary = new Dictionary<Guid, Course>();
var list = conn.Query<Course, Location, Course>(
query,
(course, location) =>
{
if (!courseDictionary.TryGetValue(course.Id, out Course courseEntry))
{
courseEntry = course;
courseEntry.Locations = courseEntry.Locations ?? new List<Location>();
courseDictionary.Add(courseEntry.Id, courseEntry);
}
courseEntry.Locations.Add(location);
return courseEntry;
},
splitOn: "Id")
.Distinct()
.ToList();
return list;
}
Something is missing. If you do not specify each field from Locations in the SQL query, the object Location cannot be filled. Take a look:
var lookup = new Dictionary<int, Course>()
conn.Query<Course, Location, Course>(#"
SELECT c.*, l.Name, l.otherField, l.secondField
FROM Course c
INNER JOIN Location l ON c.LocationId = l.Id
", (c, l) => {
Course course;
if (!lookup.TryGetValue(c.Id, out course)) {
lookup.Add(c.Id, course = c);
}
if (course.Locations == null)
course.Locations = new List<Location>();
course.Locations.Add(a);
return course;
},
).AsQueryable();
var resultList = lookup.Values;
Using l.* in the query, I had the list of locations but without data.
Not sure if anybody needs it, but I have dynamic version of it without Model for quick & flexible coding.
var lookup = new Dictionary<int, dynamic>();
conn.Query<dynamic, dynamic, dynamic>(#"
SELECT A.*, B.*
FROM Client A
INNER JOIN Instance B ON A.ClientID = B.ClientID
", (A, B) => {
// If dict has no key, allocate new obj
// with another level of array
if (!lookup.ContainsKey(A.ClientID)) {
lookup[A.ClientID] = new {
ClientID = A.ClientID,
ClientName = A.Name,
Instances = new List<dynamic>()
};
}
// Add each instance
lookup[A.ClientID].Instances.Add(new {
InstanceName = B.Name,
BaseURL = B.BaseURL,
WebAppPath = B.WebAppPath
});
return lookup[A.ClientID];
}, splitOn: "ClientID,InstanceID").AsQueryable();
var resultList = lookup.Values;
return resultList;
There is another approach using the JSON result. Even though the accepted answer and others are well explained, I just thought about an another approach to get the result.
Create a stored procedure or a select qry to return the result in json format. then Deserialize the the result object to required class format. please go through the sample code.
using (var db = connection.OpenConnection())
{
var results = await db.QueryAsync("your_sp_name",..);
var result = results.FirstOrDefault();
string Json = result?.your_result_json_row;
if (!string.IsNullOrEmpty(Json))
{
List<Course> Courses= JsonConvert.DeserializeObject<List<Course>>(Json);
}
//map to your custom class and dto then return the result
}
This is an another thought process. Please review the same.
I'm trying to get a paged query to work properly using LINQ and NHibernate. On a single table, this works perfect, but when more than one table is joined, it's causing me havoc. Here is what I have so far.
public virtual PagedList<Provider> GetPagedProviders(int startIndex, int count, System.Linq.Expressions.Expression<Func<Record, bool>> predicate) {
var firstResult = startIndex == 1 ? 0 : (startIndex - 1) * count;
var query = (from p in session.Query<Record>()
.Where(predicate)
select p.Provider);
var rowCount = query.Select(x => x.Id).Distinct().Count();
var pageOfItems = query.Distinct().Skip(firstResult).Take(count).ToList<Provider>();
return new PagedList<Provider>(pageOfItems, startIndex, count, rowCount);
}
There are a couple issues I'm having with this. First off, the rowCount variable is pulling back a COUNT(*) on a join that has one to many. So my count is way off of what is should be, I'm expecting 129, but getting back over 1K.
The second issue I'm happening is with the results. If I'm on the first page (firstResult = 0), then I'm getting correct results. However, if firstResult is something other than 0, it produces completely different SQL. Below is the SQL generated by both scenarios, I've trimmed out some of the fat to make it a little more readable.
--firstResult = 0
select distinct TOP (30) provider1_.Id as Id12_, provider1_.TaxID as TaxID12_,
provider1_.Facility_Name as Facility5_12_, provider1_.Last_name as Last6_12_,
provider1_.First_name as First7_12_, provider1_.MI as MI12_
from PPORecords record0_ left outer join PPOProviders provider1_ on record0_.Provider_id=provider1_.Id,
PPOProviders provider2_ where record0_.Provider_id=provider2_.Id
and provider2_.TaxID='000000000'
--firstResult = 30
SELECT TOP (30) Id12_, TaxID12_, Facility5_12_, Last6_12_, First7_12_, MI12_,
FROM (select distinct provider1_.Id as Id12_, provider1_.TaxID as TaxID12_,
provider1_.Facility_Name as Facility5_12_,
provider1_.Last_name as Last6_12_,
provider1_.First_name as First7_12_,
provider1_.MI as MI12_,
ROW_NUMBER() OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row
from PPORecords record0_ left outer join PPOProviders provider1_ on record0_.Provider_id=provider1_.Id,
PPOProviders provider2_
where record0_.Provider_id=provider2_.Id and provider2_.TaxID='000000000') as query
WHERE query.__hibernate_sort_row > 30
ORDER BY query.__hibernate_sort_row
The problem with the second query is the "distinct" keyword is not present on the outer query, only the inner query. Any ideas how to correct this?
Thanks for any suggestions!
[UPDATE]
This is the code that is building the the Linq query predicate.
private Expression<Func<Record, bool>> ParseQueryExpression(string Query) {
Expression<Func<Record, bool>> mExpression = x => true;
string[] splitQuery = Query.Split('|');
foreach (string query in splitQuery) {
if (string.IsNullOrEmpty(query))
continue;
int valStartIndex = query.IndexOf('(');
string variable = query.Substring(0, valStartIndex);
string value = query.Substring(valStartIndex + 1, query.IndexOf(')') - valStartIndex - 1);
switch (variable) {
case "tax":
mExpression = x => x.Provider.TaxID == value;
break;
case "net":
mExpression = Combine<Record>(mExpression, x => x.Network.Id == int.Parse(value));
break;
case "con":
mExpression = Combine<Record>(mExpression, x => x.Contract.Id == int.Parse(value));
break;
case "eff":
mExpression = Combine<Record>(mExpression, x => x.Effective_Date >= DateTime.Parse(value));
break;
case "trm":
mExpression = Combine<Record>(mExpression, x => x.Term_Date <= DateTime.Parse(value));
break;
case "pid":
mExpression = Combine<Record>(mExpression, x => x.Provider.Id == long.Parse(value));
break;
case "rid":
mExpression = Combine<Record>(mExpression, x => x.Rate.Id == int.Parse(value));
break;
}
}
return mExpression;
}
It seems as if the Linq provider has some "unexpected" behaviour there. I could reproduce the issue with the rowCount that you are describing and also encountered some issues when trying to fix it with a subquery.
If I understand your intentions correctly you basically want to have a paged list of Providers whose Records match certain criteria.
In that case I would suggest using a subquery. However, I tried implementing the subquery with the Query() method but it did not work. So I tried the QueryOver() method which worked flawlessly. In your case the desired queries would look like this:
Provider pAlias = null;
var query = session.QueryOver<Provider>(() => pAlias).WithSubquery
.WhereExists(QueryOver.Of<Record>()
.Where(predicate)
.And(r => r.Provider.Id == pAlias.Id)
.Select(r => r.Provider));
var rowCount = query.RowCount();
var pageOfItems = query.Skip(firstResult).Take(count).List<Provider>();
That way you do not have to struggle with the Distinct(). And as much as I personally like and use NHibernate.Linq, I suppose, if it does not work in a given situation, one should use something else that works.
Edit: Splitting the query into smaller units.
// the method ParseQueryExpression() does not need to be modified, the Expression should work like that
System.Linq.Expressions.Expression<Func<Record, bool>> predicate = x => x.Provider.TaxID == "000000000";
// Query to evaluate predicate
IQueryable<Record> queryId = session.Query<Record>().Where(predicate).Select(r => r.Provider);
// extracting the IDs to use in a subquery
List<int> idList = queryId.Select(p => p.Id).Distinct().ToList();
// total count of distinct Providers
int rowCount = idList.Count;
// new Query on Provider, using the distinct Ids
var query = session.Query<Provider>().Where(p => idList.Contains(p.Id));
// the List<Provider> to display on the page
var pageOfItems = query.Skip(firstResult).Take(count).ToList();
The problem here is that you have multiple queries and therefore multiple roundtrips to the db. Another problem apperas when you examine the generated sql. The subquery idList.Contains(p.Id) will generate an in-clause with a lot of parameters.
As you can see, these are NHibernate.Linq queries. That is because the new QueryOver feature is not a true Linq provider and does not work well with dynamic Linq expressions.
You could get around that limitation by using Detached QueryOvers. The drawback here is that you would have to modify your ParseQueryExpression() somehow, e.g. like that:
private QueryOver<Record> ParseQueryExpression(string Query)
{
Record rAlias = null;
var detachedQueryOver = QueryOver.Of<Record>(() => rAlias)
.JoinQueryOver(r => r.Provider)
.Where(() => rAlias.Provider.TaxID == "000000000")
.Select(x => x.Id);
// modify your method to match the return value
return detachedQueryOver;
}
// in your main method
var detachedQueryOver = QueryOver.Of<Record>()
.WithSubquery
.WhereProperty(r => r.Id
.In(this.ParseQueryExpression(...));
var queryOverList = session.QueryOver<Provider>()
.WithSubquery
.WhereProperty(x => x.Id)
.In(detachedQueryOver.Select(r => r.Provider.Id));
int rowCount = queryOverList.RowCount();
var pageOfItems = queryOverList.Skip(firstResult).Take(count).List();
Well, I hope you are not too confused by that.