LINQ group by multiple aggregates - SQL

I'm new to LINQ and I'm sure I have gone about this in a convoluted manner. I'm trying to do something like this SQL in LINQ:
SELECT DISTINCT
count(vendor) as vendorCount,
reqDate,
status,
openDate,
item,
poDate,
count(responseDate) as responseCount
FROM
myTable
GROUP BY
reqDate, status, openDate, item, poDate
HAVING
reqDate > openDate
Here is what I have so far.
var groupQuery = (from table in dt.AsEnumerable()
                  group table by new
                  {
                      vendor = table["vendor"],
                      reqdate = table.Field<DateTime>("ReqDate"),
                      status = table["status"],
                      openDate = table.Field<DateTime>("openDate"),
                      item = table["item"],
                      podate = table.Field<DateTime>("PODate"),
                      responsedate = table.Field<DateTime>("responseDate"),
                  }
                  into groupedTable
                  where Having(groupedTable.Key.reqdate, groupedTable.Key.openDate) == 1
                  select new
                  {
                      x = groupedTable.Key,
                      y = groupedTable.Count()
                  }).Distinct();
foreach (var req in groupQuery)
{
    Console.WriteLine("cols: {0} count: {1} ", req.x, req.y);
}
Having() is a function that takes two DateTime parameters and returns 1 if the reqDate is greater than the openDate. It compiles and runs, but it obviously does not give me the results I want. Is this possible using LINQ? I want to push this data to an Excel spreadsheet, so I'm hoping to create a DataTable from this LINQ query. Would I be better off just creating a DataView from my DataTable and not messing with LINQ?

The SQL code groups by only some of the fields, while your LINQ statement groups by all of the fields, so the only rows that end up in the same group are exact duplicates. If you group by only the fields that the SQL query groups by, you should get the correct answer. Your Having() method works fine, but it is not necessary and is less readable than a direct comparison.
var groupQuery = (from table in dt.AsEnumerable()
                  group table by new
                  {
                      reqdate = table.Field<DateTime>("ReqDate"),
                      status = table["status"],
                      // typed as DateTime so it can be compared with reqdate below
                      openDate = table.Field<DateTime>("openDate"),
                      item = table["item"],
                      podate = table.Field<DateTime>("PODate")
                  }
                  into groupedTable
                  where groupedTable.Key.reqdate > groupedTable.Key.openDate
                  select new
                  {
                      x = groupedTable.Key,
                      VendorCount = groupedTable.Select(t => t["vendor"])
                                                .Distinct()
                                                .Count(),
                      ResponseCount = groupedTable.Select(t => t.Field<DateTime>("responseDate"))
                                                  .Distinct()
                                                  .Count()
                  }).Distinct();
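For comparison, here is a minimal LINQ to Objects sketch of the same shape: group by a subset of columns, filter on the key (the HAVING part), and compute two aggregates per group. The `Request` and `RequestSummary` records and the `Summarize` method are hypothetical stand-ins for the DataTable rows, not part of the original code. One difference is deliberate: SQL `COUNT(column)` counts non-null values including duplicates, so this sketch counts non-null values; if you want `COUNT(DISTINCT column)` semantics instead, use `Select(...).Distinct().Count()` as in the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for one DataRow of the table.
public record Request(string Vendor, DateTime ReqDate, string Status,
                      DateTime OpenDate, string Item, DateTime PoDate,
                      DateTime? ResponseDate);

// Hypothetical result type: the grouping columns plus the two aggregates.
public record RequestSummary(DateTime ReqDate, string Status, DateTime OpenDate,
                             string Item, DateTime PoDate,
                             int VendorCount, int ResponseCount);

public static class RequestGrouping
{
    public static List<RequestSummary> Summarize(IEnumerable<Request> rows) =>
        rows
            // Group only by the columns listed in the SQL GROUP BY clause.
            .GroupBy(r => new { r.ReqDate, r.Status, r.OpenDate, r.Item, r.PoDate })
            // Filtering on the key after grouping plays the role of HAVING.
            .Where(g => g.Key.ReqDate > g.Key.OpenDate)
            .Select(g => new RequestSummary(
                g.Key.ReqDate, g.Key.Status, g.Key.OpenDate, g.Key.Item, g.Key.PoDate,
                // COUNT(column) semantics: non-null values, duplicates included.
                VendorCount: g.Count(r => r.Vendor != null),
                ResponseCount: g.Count(r => r.ResponseDate.HasValue)))
            .ToList();
}
```

Because the result is a plain list of records, it can be copied into a DataTable or written to a spreadsheet row by row.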

Related

How to write join query with multiple column - LINQ

I have a situation where two tables should be joined on multiple columns with an OR condition. Here is a sample of the SQL query, which I was not able to convert into a LINQ query:
select cm.* from Customer cm
inner join #temp tmp
on cm.CustomerCode = tmp.NewNLKNo or cm.OldAcNo = tmp.OldNLKNo
This is how I have written the LINQ query:
await (from cm in Context.CustomerMaster
join li in list.PortalCustomerDetailViewModel
on new { OldNLKNo = cm.OldAcNo, NewNLKNo = cm.CustomerCode } equals new { OldNLKNo = li.OldNLKNo, NewNLKNo = li.NewNLKNo }
select new CustomerInfoViewModel
{
CustomerId = cm.Id,
CustomerCode = cm.CustomerCode,
CustomerFullName = cm.CustomerFullName,
OldCustomerCode = cm.OldCustomerCode,
IsCorporateCustomer = cm.IsCorporateCustomer
}).ToListAsync();
But this query doesn't return the expected results. How do I convert this SQL query into LINQ?
Thank you
You didn't say whether list.PortalCustomerDetailViewModel is data in the database or in your local process. It seems to be in your local process, so your query will have to transfer it to the database (maybe that is why it is tmp in your SQL?).
Requirement: give me all properties of a CustomerMaster for all CustomerMasters where exists at least one PortalCustomerDetailViewModel where
customerMaster.CustomerCode == portalCustomerDetailViewModel.NewNLKNo
|| customerMaster.OldAcNo == portalCustomerDetailViewModel.OldNLKNo
You can't use a normal Join, because a Join combines its key comparisons with AND, and you want OR.
What you can do is select all CustomerMasters for which there is any PortalCustomerDetailViewModel that fulfills the provided OR.
I only transfer to the database those properties of list.PortalCustomerDetailViewModel that I need in the OR expression:
var checkProperties = list.PortalCustomerDetailViewModel
    .Select(portalCustomerDetail => new
    {
        NewNLKNo = portalCustomerDetail.NewNLKNo,
        OldNLKNo = portalCustomerDetail.OldNLKNo,
    });
var result = dbContext.CustomerMasters
    .Where(customerMaster => checkProperties
        .Where(checkProperty =>
            customerMaster.CustomerCode == checkProperty.NewNLKNo
            || customerMaster.OldAcNo == checkProperty.OldNLKNo)
        .Any())
    .Select(customerMaster => new CustomerInfoViewModel
    {
        Id = customerMaster.Id,
        Name = customerMaster.Name,
        ...
    });
In words: from each portalCustomerDetail in list.PortalCustomerDetailViewModel, extract the properties NewNLKNo and OldNLKNo.
Then from the table of CustomerMasters, keep only those customerMasters that have at least one portalCustomerDetail with the properties as described in the OR statement.
From every remaining CustomerMasters, create one new CustomerInfoViewModel containing properties ...
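The "any row matches on either key" idea can be tried out in memory. The following is only a LINQ to Objects sketch: the `Customer` and `PortalDetail` records and the `MatchEither` method are hypothetical names, and whether your database provider can translate the equivalent query over a local list is a separate question. Note one difference from the SQL inner join: `Any` returns each customer at most once, even when several detail rows match.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for CustomerMaster and the portal view model.
public record Customer(int Id, string CustomerCode, string OldAcNo);
public record PortalDetail(string NewNLKNo, string OldNLKNo);

public static class OrJoin
{
    // Keeps a customer if at least one detail matches on either key.
    public static List<Customer> MatchEither(
        IEnumerable<Customer> customers, IEnumerable<PortalDetail> details)
    {
        // Materialize once so the details are not re-enumerated per customer.
        var detailList = details.ToList();
        return customers
            .Where(c => detailList.Any(d =>
                c.CustomerCode == d.NewNLKNo || c.OldAcNo == d.OldNLKNo))
            .ToList();
    }
}
```

For large inputs, two hash sets (one per key column) would avoid scanning the detail list for every customer.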
select cm.* from Customer cm
inner join #temp tmp
on cm.CustomerCode = tmp.NewNLKNo or cm.OldAcNo = tmp.OldNLKNo
You don't have to use the join syntax. Adding the predicates in a where clause could get the same result. Try to use the following code:
await (from cm in Context.CustomerMaster
from li in list.PortalCustomerDetailViewModel
where cm.CustomerCode == li.NewNLKNo || cm.OldAcNo == li.OldNLKNo
select new CustomerInfoViewModel
{
CustomerId = cm.Id,
CustomerCode = cm.CustomerCode,
CustomerFullName = cm.CustomerFullName,
OldCustomerCode = cm.OldCustomerCode,
IsCorporateCustomer = cm.IsCorporateCustomer
}).ToListAsync();
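// _db.Temp is a hypothetical name standing in for the #temp table
// ("#temp" is not a valid C# identifier).
var result = _db.Customer
    .GroupJoin(_db.Temp, cm => cm.CustomerCode, t => t.NewNLKNo,
        (cm, byNew) => new { cm, byNew = byNew.FirstOrDefault() })
    .GroupJoin(_db.Temp, x => x.cm.OldAcNo, t => t.OldNLKNo,
        (x, byOld) => new { x.cm, x.byNew, byOld = byOld.FirstOrDefault() })
    // keep only customers that matched on at least one of the two keys
    .Where(x => x.byNew != null || x.byOld != null)
    .Select(x => new {
        // as you want
    }).Distinct().ToList();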

Need help in converting SQL query to LINQ

I am new to the world of LINQ and hence I am stuck at converting one sql query to LINQ.
My SQL query is:
select COUNT(DISTINCT PAYER) as count,
PPD_COL FROM BL_REV
where BL_NO_UID = 1084
GROUP BY PPD_COL
The desired output is:
Count PPD_COL
12 P
20 C
I have written something like below in LINQ:
var PayerCount = from a in LstBlRev where a.DelFlg == "N"
group a by new { a.PpdCol} into grouping
select new
{
Count = grouping.First().PayerCustCode.Distinct().Count(),
PPdCol = (grouping.Key.PpdCol == "P") ? "Prepaid" : "Collect"
};
But it is not giving me the desired output. The count is returned same for PPD_COL value P & C. What am I missing here?
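Change the group by as follows. In the group part, select only the property you need, and in the by part there is no need to create an anonymous object; use just the one property you are grouping by. In your version, grouping.First().PayerCustCode takes the payer code of only the first row in each group, so (assuming PayerCustCode is a string) Distinct().Count() counts the distinct characters of that one code rather than the distinct payers in the group.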
var PayerCount = from a in LstBlRev
where a.DelFlg == "N"
group a.PayerCustCode by a.PpdCol into grouping
select new
{
Count = grouping.Distinct().Count(),
PPdCol = grouping.Key == "P" ? "Prepaid" : "Collect"
};
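As a self-contained illustration of `COUNT(DISTINCT ...)` per group, here is a small LINQ to Objects sketch. The `BlRev` record and the `DistinctPayersByPpdCol` method are hypothetical; the property names are taken from the question and are assumed to be strings.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type with the three columns used in the question.
public record BlRev(string PayerCustCode, string PpdCol, string DelFlg);

public static class PayerCounts
{
    // Maps each PpdCol value to the number of distinct payer codes in that group.
    public static Dictionary<string, int> DistinctPayersByPpdCol(IEnumerable<BlRev> rows) =>
        rows.Where(r => r.DelFlg == "N")
            // key selector, then element selector: each group holds only payer codes
            .GroupBy(r => r.PpdCol, r => r.PayerCustCode)
            .ToDictionary(g => g.Key, g => g.Distinct().Count());
}
```

The two-argument `GroupBy` overload is the method-syntax equivalent of `group a.PayerCustCode by a.PpdCol`.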

Convert SQL to LINQ with group by

I'm stumped trying to convert the following sql to linq:
SELECT t.* FROM(SELECT mwfieldid,MAX([TimeStamp]) AS MaxValue, BatchDocumentID
FROM mw_BatchField
GROUP BY mwfieldid,BatchDocumentID) x
JOIN mw_BatchField t ON x.mwfieldid = t.mwfieldid
AND x.MaxValue = t.TimeStamp
and x.BatchDocumentID = t.BatchDocumentID
So far I had to convert it to a stored proc to get it to work. I'd rather know how to write this correctly in linq. I tried using a sql to linq converter (http://www.sqltolinq.com/) which produced this code that had errors in it: (Are these converters any good? It didn't seem to produce anything useful with a few tries.)
From x In (
(From mw_BatchFields In db.mw_BatchFields
Group mw_BatchFields By
mw_BatchFields.MWFieldID,
mw_BatchFields.BatchDocumentID
Into g = Group
Select
MWFieldID,
MaxValue = CType(g.Max(Function(p) p.TimeStamp),DateTime?),
BatchDocumentID)
)
Join t In db.mw_BatchFields
On New With { .MWFieldID = CInt(x.MWFieldID), .MaxValue = CDate(x.MaxValue), .BatchDocumentID = CInt(x.BatchDocumentID) }
Equals New With { .MWFieldID = t.MWFieldID, .MaxValue = t.TimeStamp, .BatchDocumentID = t.BatchDocumentID }
Select
BatchFieldID = t.BatchFieldID,
BatchDocumentID = t.BatchDocumentID,
MWFieldID = t.MWFieldID,
TimeStamp = t.TimeStamp,
value = t.value,
DictionaryValue = t.DictionaryValue,
AutoFilled = t.AutoFilled,
employeeID = t.employeeID
Seems like a lot of code for such a simple query, and it doesn't compile.
So for every combination of mwfieldid and BatchDocumentID you want all columns of the row with the highest TimeStamp? This is something which is much easier to express in LINQ than SQL so I'm not surprised that an automated converter is making a meal of it.
You should be able to do:
Mw_BatchFields.GroupBy(x => new { x.Mwfieldid, x.BatchDocumentId })
.SelectMany(x => x.Where(y => y.TimeStamp == x.Max(z => z.TimeStamp)))
This (like your SQL) will return multiple rows per grouping key if more than one row in the group shares the same maximum TimeStamp. If you only want one row per key, you could use:
Mw_BatchFields.GroupBy(x => new { x.Mwfieldid, x.BatchDocumentId })
.Select(x => x.OrderByDescending(y => y.TimeStamp).First())
Edit:
Sorry, just twigged that you're working in VB, not C#, so not quite what you were looking for, but if you can live with the lambda syntax style, I think the above can be translated as:
Mw_BatchFields.GroupBy(Function(x) New With {x.Mwfieldid, x.BatchDocumentId}).Select(Function(x) x.OrderByDescending(Function(y) y.TimeStamp).First())
and:
Mw_BatchFields.GroupBy(Function(x) New With {x.Mwfieldid, x.BatchDocumentId}).SelectMany(Function(x) x.Where(Function(y) y.TimeStamp = x.Max(Function(z) z.TimeStamp)))
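Both variants can be checked in memory with a small C# sketch. This is LINQ to Objects only: the `BatchField` record and the two method names are hypothetical, and the statement lambda used in `WithTies` would not be translatable by a database provider. It is there to compute the maximum once per group instead of once per row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type with only the columns the query needs.
public record BatchField(int BatchFieldId, int MwFieldId, int BatchDocumentId, DateTime TimeStamp);

public static class LatestPerGroup
{
    // Every row that carries the maximum TimeStamp of its
    // (MwFieldId, BatchDocumentId) group; ties are all returned.
    public static List<BatchField> WithTies(IEnumerable<BatchField> rows) =>
        rows.GroupBy(r => new { r.MwFieldId, r.BatchDocumentId })
            .SelectMany(g =>
            {
                var max = g.Max(r => r.TimeStamp); // computed once per group
                return g.Where(r => r.TimeStamp == max);
            })
            .ToList();

    // Exactly one row per group. OrderByDescending is a stable sort in
    // LINQ to Objects, so among ties the first row in source order wins.
    public static List<BatchField> OnePerGroup(IEnumerable<BatchField> rows) =>
        rows.GroupBy(r => new { r.MwFieldId, r.BatchDocumentId })
            .Select(g => g.OrderByDescending(r => r.TimeStamp).First())
            .ToList();
}
```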

NHibernate 3.0 - "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS."

We are trying to upgrade to NHibernate 3.0, and now I am having a problem with the following LINQ query. It returns the error "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS."
This is the linq query in the controller.
var list = (from item in ItemTasks.FindTabbedOrDefault(tab)
select new ItemSummary
{
Id = item.Id,
LastModifyDate = item.LastModifyDate,
Tags = (from tag in item.Tags
select new TagSummary
{
ItemsCount = tag.Items.Count,
Name = tag.Name
}).ToList(),
Title = item.Title
});
and the following is the sql generated for this query
select TOP ( 1 /* @p0 */ ) item0_.Id as col_0_0_,
item0_.LastModifyDate as col_1_0_,
(select (select cast(count(* ) as INT)
from dbo.ItemsToTags items3_,
dbo.Item item4_
where tag2_.Id = items3_.Tag_id
and items3_.Item_id = item4_.Id),
tag2_.Name
from dbo.ItemsToTags tags1_,
dbo.Tag tag2_
where item0_.Id = tags1_.Item_id
and tags1_.Tag_id = tag2_.Id) as col_2_0_,
item0_.Title as col_3_0_ from dbo.Item item0_ order by item0_.ItemPostDate desc
P.S. If I remove the Tags property from the LINQ query, it works fine.
Where is the problem in the query?
Thanks in advance.
I've got the same Generic ADO Exception error; I think it's actually a limitation of SQL Server.
Is it possible to somehow load an object graph with projections in collections?
If I try this one:
var cats = q.Select(t => new cat()
{
NickName = t.NickName,
Legs = t.Legs.Select(l => new Leg()
{
Color = l.Color,
Size = l.Size
}).ToList()
}).ToList();
That produces the same error.

Optimize Linq to SQL query, Group By multiple fields

My LINQ query contains the following Group By statement:
Group p By Key = New With { _
.Latitude = p.Address.GeoLocations.FirstOrDefault(Function(g) New String() {"ADDRESS", "POINT"}.Contains(g.Granularity)).Latitude, _
.Longitude = p.Address.GeoLocations.FirstOrDefault(Function(g) New String() {"ADDRESS", "POINT"}.Contains(g.Granularity)).Longitude}
The query works, but here is the SQL that the clause above produces
SELECT [t6].[Latitude]
FROM (
SELECT TOP (1) [t5].[Latitude]
FROM [dbo].[GeoLocations] AS [t5]
WHERE ([t5].[Granularity] IN (@p0, @p1)) AND ([t5].[AddressId] = [t2].[Addr_AddressId])
) AS [t6]
) AS [value], (
SELECT [t8].[Longitude]
FROM (
SELECT TOP (1) [t7].[Longitude]
FROM [dbo].[GeoLocations] AS [t7]
WHERE ([t7].[Granularity] IN (@p2, @p3)) AND ([t7].[AddressId] = [t2].[Addr_AddressId])
) AS [t8]
) AS [value2]
I am not a SQL expert, but it looks to me like this is a rather suboptimal translation. This should really be one query that selects Latitude and Longitude from the first record. Perhaps the SQL Server optimizer will take care of this. But is there a way to nudge LINQ to generate a leaner SQL statement to begin with?
I tried the following, too..
Group p By Key = p.Address.GeoLocations.Where(Function(g) New String() {"ADDRESS", "POINT"}.Contains(g.Granularity)). _
Select(Function(g) New With {.Latitude = g.Latitude, .Longitude = g.Longitude}).FirstOrDefault
but this produced an error: "A group by expression can only contain non-constant scalars that are comparable by the server."
Sorry to reply in C#...
Here's what you have, translated to C#:
// "params" is a reserved word in C#, so the list needs a different name
List<string> granularities = new List<string>()
    { "ADDRESS", "POINT" };
from p in people
group p by new {
    Latitude = p.Address.GeoLocations
        .FirstOrDefault(g => granularities.Contains(g.Granularity)).Latitude,
    Longitude = p.Address.GeoLocations
        .FirstOrDefault(g => granularities.Contains(g.Granularity)).Longitude
};
Here's a rewrite, using the let keyword, so that the location lookup is written only once:
from p in people
let loc = p.Address.GeoLocations
    .FirstOrDefault(g => granularities.Contains(g.Granularity))
group p by new
{
    Latitude = loc.Latitude,
    Longitude = loc.Longitude
};
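To see how let behaves, here is a small in-memory sketch. It only demonstrates the query shape with LINQ to Objects and says nothing about the SQL that LINQ to SQL would generate. The `GeoLocation` and `Person` records and the `GroupByLocation` method are hypothetical, the Address level is flattened away for brevity, and `loc?.` is used because in memory a person without a matching location would otherwise throw a NullReferenceException.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified types: each person carries its locations directly.
public record GeoLocation(string Granularity, double Latitude, double Longitude);
public record Person(string Name, List<GeoLocation> GeoLocations);

public static class GeoGrouping
{
    private static readonly string[] Granularities = { "ADDRESS", "POINT" };

    // Groups person names by the coordinates of their first matching location.
    public static Dictionary<(double? Lat, double? Lon), List<string>> GroupByLocation(
        IEnumerable<Person> people)
    {
        var query =
            from p in people
            // let evaluates the lookup once per person and names the result
            let loc = p.GeoLocations
                .FirstOrDefault(g => Granularities.Contains(g.Granularity))
            // people without a matching location land in the (null, null) group
            group p.Name by (Lat: loc?.Latitude, Lon: loc?.Longitude);

        return query.ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

The tuple key is an in-memory convenience; a query sent to a database would keep the anonymous type used above.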