Using Linq to select item with maximum value in a group - vb.net

I have a table like this:
dt = New DataTable()
dt.Columns.AddRange(New DataColumn() {
New DataColumn("time"),
New DataColumn("fname", Type.GetType("System.String")),
New DataColumn("note", Type.GetType("System.String")),
New DataColumn("du", Type.GetType("System.Int32")),
New DataColumn("site", Type.GetType("System.String"))})
dt.Rows.Add(New Object() {"2023-01-28 02:01", "aaa1", "xxx11xxxxxxx", 100, "a"})
dt.Rows.Add(New Object() {"2023-01-28 03:01", "bbb1", "xxxx22xxxxxx", 2, "b"})
dt.Rows.Add(New Object() {"2023-01-28 09:01", "ccc", "xxxx33xxxxxx", 3, "c"})
dt.Rows.Add(New Object() {"2023-01-28 02:01", "aaa2", "xxx44xxxxxxx", 3, "a"})
dt.Rows.Add(New Object() {"2023-01-28 03:01", "bbb2", "xxx55xxxxxxx", 53, "b"})
dt.Rows.Add(New Object() {"2023-01-28 03:01", "bbb3", "xxx66xxxxxxx", 89, "b"})
dt.Rows.Add(New Object() {"2023-01-28 01:01", "xxx", "xxx77xxxxxxx", 5, "x"})
I want to use LINQ to query the table above: group the rows by the two columns time and site, then get the fname that has the maximum du in each group. I use the following code:
Dim MYquery = (From p In dt.Select()
Group p By ID = New With _
{Key .time = p("time").ToString.Trim, _
Key .site = p("site")} _
Into Group Select Group(0)).ToArray.CopyToDataTable
The result was not what I wanted: Select Group(0) returns the first row of each group, not the row with the maximum du.
The desired table is:
time              fname  note          du   site
2023-01-28 02:01  aaa1   xxx11xxxxxxx  100  a
2023-01-28 03:01  bbb3   xxx55xxxxxxx  53   b
2023-01-28 09:01  ccc    xxxx33xxxxxx  3    c
2023-01-28 01:01  xxx    xxx77xxxxxxx  5    x
What should I do?

(I am a C# guy; I hope I converted this from C# correctly.)
BTW, your sample desired output does not match what you described. It should be the row with 89 for site "b", no?
Dim myQuery = (From r In dt.Select()
Group r By ID = New With {
Key .Time = r.Field(Of String)("Time"),
Key .Site = r.Field(Of String)("Site")
} Into Group
Let mx = Group.Max(Function(gg) gg.Field(Of Integer)("Du"))
From row In Group
Where row.Field(Of Integer)("Du") = mx
Select row).CopyToDataTable()
time              fname  note          du   site
2023-01-28 02:01  aaa1   xxx11xxxxxxx  100  a
2023-01-28 03:01  bbb3   xxx66xxxxxxx  89   b
2023-01-28 09:01  ccc    xxxx33xxxxxx  3    c
2023-01-28 01:01  xxx    xxx77xxxxxxx  5    x
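The query above returns every row that ties for the maximum du within a group. If exactly one row per (time, site) group is wanted, a possible variation is to sort each group by du in descending order and keep the first row. This is only a sketch: the module name and the reduced sample data are made up for illustration, and CopyToDataTable assumes a reference to System.Data.DataSetExtensions on .NET Framework.

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module MaxPerGroupSketch
    Sub Main()
        ' Reduced sample data (illustrative only), same columns as the question.
        Dim dt As New DataTable()
        dt.Columns.Add("time", GetType(String))
        dt.Columns.Add("fname", GetType(String))
        dt.Columns.Add("note", GetType(String))
        dt.Columns.Add("du", GetType(Integer))
        dt.Columns.Add("site", GetType(String))
        dt.Rows.Add("2023-01-28 03:01", "bbb1", "xxxx22xxxxxx", 2, "b")
        dt.Rows.Add("2023-01-28 03:01", "bbb2", "xxx55xxxxxxx", 53, "b")
        dt.Rows.Add("2023-01-28 03:01", "bbb3", "xxx66xxxxxxx", 89, "b")
        dt.Rows.Add("2023-01-28 09:01", "ccc", "xxxx33xxxxxx", 3, "c")

        ' One row per (time, site): sort each group by du descending, keep the first row.
        Dim result As DataTable =
            dt.AsEnumerable().
               GroupBy(Function(r) New With {
                           Key .Time = r.Field(Of String)("time"),
                           Key .Site = r.Field(Of String)("site")}).
               Select(Function(g) g.OrderByDescending(Function(r) r.Field(Of Integer)("du")).First()).
               CopyToDataTable()

        For Each row As DataRow In result.Rows
            Console.WriteLine("{0} {1} {2} {3}", row("time"), row("fname"), row("du"), row("site"))
        Next
    End Sub
End Module
```

With this sample data the loop prints the bbb3 row (du 89) for site b and the ccc row for site c.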

Related

Split a collection into n parts with Lambda (or LINQ), in VB.Net

The example code below works fine and does what it is supposed to do, but I'm not satisfied with it. I'm looking for a much smarter solution in VB.NET. The presentation of the results (the counts for each subgroup) is quite awkward. The content of the data, the list of records, etc. are not important. The counts should also be sorted in the order Less{0}, From{1}To{2}, MoreThan{3}... Thanks in advance.
Dim Age1 As Integer = 5
Dim Age2 As Integer = 9
Dim myList As New List(Of Integer) From {1, 3, 5, 7, 9, 2, 4, 6, 8, 10, 2, 4, 5, 7, 8, 9, 6, 7, 9, 11}
Dim Lambda = myList.GroupBy(Function(x) New With {Key .Age1 = (x) < Age1,Key .Age2 = (x) > Age1 - 1 And (x) <= Age2,Key .Age3 = (x) > Age2}).ToList()
Dim group1, group2, group3 As Integer
myList = myList.OrderBy(Function(x) x).ToList()
Console.WriteLine(String.Join(",", myList.Select(Function(s) s.ToString).ToArray))
For Each group In Lambda
If group.Key.Age1 Then group1 = group.Count()
If group.Key.Age2 Then group2 = group.Count()
If group.Key.Age3 Then group3 = group.Count()
Next
' Obviously If Stop Then Error condition
If group1 + group2 + group3 <> myList.Count Then Stop
Console.WriteLine(String.Format("Groups: Less{0},From{1}To{2},MoreThan{3}", Age1, Age1, Age2 - 1, Age2))
Console.WriteLine(String.Format(" Age: {0,4},{1,8},{2,8}", group1, group2, group3))
'1,2,2,3,4,4,5,5,6,6,7,7,7,8,8,9,9,9,10,11
'Groups: Less5,From5To8,MoreThan9
'Age: 6, 12, 2
Here's how I would try to improve it:
Dim Age1 As Integer = 5
Dim Age2 As Integer = 9
Dim myList As New List(Of Integer) From {1, 3, 5, 7, 9, 2, 4, 6, 8, 10, 2, 4, 5, 7, 8, 9, 6, 7, 9, 11}
Console.WriteLine(String.Join(", ", myList.OrderBy(Function(x) x)))
Console.WriteLine()
Dim ageBins = myList.GroupBy(Function(age) If(age < Age1, 1, If(age >= Age1 And age <= Age2, 2, 3))) _
.Select(Function(agebin) New With { agebin.Key, .AgeCount = agebin.Count() }) _
.OrderBy(Function(agebin) agebin.Key)
For Each bin In ageBins
Dim msg As String
Select Case bin.Key
Case 1
msg = $"Less {Age1}"
Case 2
msg = $"From {Age1} To {Age2}"
Case Else
msg = $"MoreThan {Age2}"
End Select
Console.WriteLine($"{msg,12}: {bin.AgeCount}")
Next
You could also change the code to handle any number of bins:
Dim agesForBins = {5, 10}
Dim ageBins = myList.GroupBy(Function(age) Enumerable.Range(0, agesForBins.Length).Where(Function(n) age < agesForBins(n)).DefaultIfEmpty(agesForBins.Length).First) _
.Select(Function(agebin) New With { agebin.Key, .AgeCount = agebin.Count() }) _
.OrderBy(Function(agebin) agebin.Key)
For Each bin In ageBins
Dim msg As String
If bin.Key = 0 Then
msg = $"Less {agesForBins(0)}"
ElseIf bin.Key = agesForBins.Length Then
msg = $"MoreThan {agesForBins(bin.Key-1)-1}"
Else
msg = $"From {agesForBins(bin.Key-1)} To {agesForBins(bin.Key)-1}"
End If
Console.WriteLine($"{msg,12}: {bin.AgeCount}")
Next
It seems to me that this is the simplest way to do it:
Dim Age1 As Integer = 5
Dim Age2 As Integer = 9
Dim myList As New List(Of Integer) From {1, 3, 5, 7, 9, 2, 4, 6, 8, 10, 2, 4, 5, 7, 8, 9, 6, 7, 9, 11}
Dim group1 As Integer = myList.Where(Function (x) x < Age1).Count()
Dim group2 As Integer = myList.Where(Function (x) x > Age1 - 1 And x <= Age2).Count()
Dim group3 As Integer = myList.Where(Function (x) x > Age2).Count()
If group1 + group2 + group3 <> myList.Count Then Stop
Console.WriteLine(String.Format("Groups: Less{0},From{1}To{2},MoreThan{3}", Age1, Age1, Age2 - 1, Age2))
Console.WriteLine(String.Format(" Age: {0,4},{1,8},{2,8}", group1, group2, group3))
If you want a funky LINQ-based method then try this:
Dim bands() As Func(Of Integer, Boolean) = _
{ _
Function (x) x < Age1, _
Function (x) x <= Age2, _
Function (x) True _
}
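' Order the groups by band index so that counts(0), counts(1), counts(2) line up with the bands
Dim counts = _
myList _
.GroupBy(Function (x) Enumerable.Range(0, bands.Count).Where(Function (n) bands(n)(x)).First()) _
.OrderBy(Function (g) g.Key) _
.Select(Function (g) g.Count()) _
.ToArray()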
Dim group1 As Integer = counts(0)
Dim group2 As Integer = counts(1)
Dim group3 As Integer = counts(2)
Here is the second-fastest and, I think, quite clean solution, based on @Enigmativity's concept posted a couple of hours ago. It takes care of his n-bands approach:
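Function simpleCSVsplit(ageForBins() As Integer, myList As List(Of Integer)) As List(Of Integer)
    Dim Bands As New List(Of Integer)
    ' Skip() below drops the first N items, so the list must be in ascending order
    myList = myList.OrderBy(Function(x) x).ToList()
    For indx As Integer = 0 To ageForBins.Count - 1
        Dim limit As Integer = ageForBins(indx)
        ' Count the remaining items below this limit, then drop them
        Bands.Add(myList.Where(Function(x) x < limit).Count())
        myList = myList.Skip(Bands(indx)).ToList()
    Next
    ' Whatever is left belongs to the last band
    Bands.Add(myList.Count)
    Return Bands
End Function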
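The GroupBy-based answers above omit a band entirely when no value falls into it. As a rough alternative sketch (not any poster's code), the counts can be accumulated into a fixed-size array so that empty bands still show up as 0. CountBins and its parameter names are hypothetical, and the limits are assumed to be in ascending order.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module BinCountSketch
    ' Hypothetical helper: returns one count per bin, i.e. values below limits(0),
    ' values between consecutive limits, and values at or above the last limit.
    ' Assumes limits is sorted in ascending order.
    Function CountBins(values As IEnumerable(Of Integer), limits As Integer()) As Integer()
        Dim counts(limits.Length) As Integer   ' limits.Length + 1 bins, all starting at 0
        For Each v In values
            Dim current = v
            ' The bin index is the number of limits that the value has reached.
            Dim bin = limits.Count(Function(limit) current >= limit)
            counts(bin) += 1
        Next
        Return counts
    End Function

    Sub Main()
        Dim myList As New List(Of Integer) From {1, 3, 5, 7, 9, 2, 4, 6, 8, 10, 2, 4, 5, 7, 8, 9, 6, 7, 9, 11}
        Dim counts = CountBins(myList, {5, 10})
        Console.WriteLine(String.Join(", ", counts.Select(Function(c) c.ToString())))
    End Sub
End Module
```

For the sample list and the limits {5, 10} this prints 6, 12, 2, matching the counts in the question.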

datatable sum column and concatenate rows using LINQ and group by on multiple columns

I have a DataTable with the following records:
ID  NAME  VALUE  CONTENT
1   AAA   10     SYS, LKE
2   BBB   20     NOM
1   AAA   15     BST
3   CCC   30     DSR
2   BBB   05     EFG
I want to write a VB.NET/LINQ query that produces output like the table below:
ID  NAME  SUM  CONTENT (as CSV)
1   AAA   25   SYS, LKE, BST
2   BBB   25   NOM, EFG
3   CCC   30   DSR
Please provide me LINQ query to get the desired result. Thanks.
I have tried concatenation using below query
Dim grouped = From row In dtTgt.AsEnumerable() _
Group row By New With {row.Field(Of Int16)("ID"), row.Field(Of String)("Name")} _
Into grp() _
Select ID, Name, CONTENT= String.Join(",", From i In grp Select i.Field(Of String)("CONTENT"))
This query will give you the expected output:
Dim result = From row In dt.AsEnumerable()
Group row By _group = New With {Key .Id = row.Field(Of Integer)("Id"),
Key .Name = row.Field(Of String)("Name")} Into g = Group
Select New With {Key .Id = _group.Id, Key .Name = _group.Name,
Key .Sum = g.Sum(Function(x) x.Field(Of Integer)("Value")),
Key .Content = String.Join(",", g.Select(Function(x) x.Field(Of String)("Content")))}
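The query above yields a sequence of anonymous objects, not a DataTable. If a DataTable is needed, one possible approach is to clone the source schema and add one row per group. The following is a self-contained sketch under that assumption; the module name and the range variable names (Id, Name, Total, Content) are illustrative.

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module SumAndJoinSketch
    Sub Main()
        ' Sample data from the question.
        Dim dt As New DataTable()
        dt.Columns.Add("ID", GetType(Integer))
        dt.Columns.Add("Name", GetType(String))
        dt.Columns.Add("Value", GetType(Integer))
        dt.Columns.Add("Content", GetType(String))
        dt.Rows.Add(1, "AAA", 10, "SYS, LKE")
        dt.Rows.Add(2, "BBB", 20, "NOM")
        dt.Rows.Add(1, "AAA", 15, "BST")
        dt.Rows.Add(3, "CCC", 30, "DSR")
        dt.Rows.Add(2, "BBB", 5, "EFG")

        ' Group by ID and Name, sum Value, and join Content into one string.
        Dim grouped = From row In dt.AsEnumerable()
                      Group row By Id = row.Field(Of Integer)("ID"),
                                   Name = row.Field(Of String)("Name") Into g = Group
                      Select Id, Name,
                             Total = g.Sum(Function(x) x.Field(Of Integer)("Value")),
                             Content = String.Join(", ", g.Select(Function(x) x.Field(Of String)("Content")))

        ' Copy the projected groups into a table with the same schema as the source.
        Dim result As DataTable = dt.Clone()
        For Each item In grouped
            result.Rows.Add(item.Id, item.Name, item.Total, item.Content)
        Next

        For Each r As DataRow In result.Rows
            Console.WriteLine("{0} {1} {2} {3}", r("ID"), r("Name"), r("Value"), r("Content"))
        Next
    End Sub
End Module
```

Here the Value column of the result holds the sum for each group, so the three output rows carry 25, 25 and 30.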
Thanks for your answers.
However, I have managed to get the desired result using simple code (without LINQ):
Dim dt2 As New DataTable
dt2 = dt.Clone()
For Each dRow As DataRow In dt.Rows
Dim iID As Integer = dRow("ID")
Dim sName As String = dRow("Name")
Dim sContt As String = dRow("Content")
Dim iValue As Integer = dRow("Value")
Dim rwTgt() As DataRow = dt2.Select("ID=" & iID)
If rwTgt.Length > 0 Then
rwTgt(0)("Value") += iValue
rwTgt(0)("Content") += ", " & sContt
Else
Dim rw As DataRow = dt2.NewRow()
rw("ID") = iID
rw("Name") = sName
rw("Value") = iValue
rw("Content") = sContt
dt2.Rows.Add(rw)
End If
Next

Multiple Column Group count linq dataset

I'm not a regular user of LINQ so this question might be as dumb as possible for an expert :)
I have a table in a dataset as below
Column1 OffLoc
01/02/2016 Johannesburg
01/02/2016 Johannesburg
02/02/2016 Moscow
02/02/2016 Johannesburg
02/02/2016 Johannesburg
03/02/2016 Johannesburg
03/02/2016 Moscow
03/02/2016 Moscow
03/02/2016 Bogota
04/02/2016 Barcelona
04/02/2016 Johannesburg
04/02/2016 Singapore
04/02/2016 Singapore
04/02/2016 Singapore
05/02/2016 Singapore
05/02/2016 Singapore
05/02/2016 Singapore
10/02/2016 Singapore
10/02/2016 Singapore
10/02/2016 Singapore
10/02/2016 Singapore
10/02/2016 Singapore
I would like to get a result like the one below:
Column1 Offloc Count
01/02/2016 - Johannesburg - 2
02/02/2016 - Moscow - 1
02/02/2016 - Johannesburg - 2
03/02/2016 - Johannesburg - 1
03/02/2016 - Moscow - 2
03/02/2016 - Bogota - 1
...
I tried to use this example Linq1, but I'm using a DataSet and would rather not copy the data to a list before performing the tasks. I also tried the following code:
Dim oQuery = _
From oRow In ds.Table("AuditLog").AsEnumerable() _
Group By _
Column1DT = oRow("Column1"), _
OffLocDT = oRow("OffLoc") _
Into Total
But I don't know how to proceed from here. Please help me. I'm using VB.NET because the online converters do not convert very well.
Thanks
You have already done the grouping correctly. As the next step, you can project the query into an anonymous type containing all the needed information, for example:
Dim oQuery = _
    From oRow In ds.Tables("AuditLog").AsEnumerable() _
    Group By _
        Column1DT = oRow("Column1"), _
        OffLocDT = oRow("OffLoc") _
    Into G = Group
    Select New With {
        Key .Column1 = Column1DT,
        Key .OffLoc = OffLocDT,
        .Count = G.Count()
    }
Full working demo example :
Dim dt As New DataTable()
dt.Columns.Add("Column1", GetType(DateTime))
dt.Columns.Add("OffLoc", GetType(String))
dt.Rows.Add(New DateTime(2016, 1, 2), "Johannesburg")
dt.Rows.Add(New DateTime(2016, 1, 2), "Johannesburg")
dt.Rows.Add(New DateTime(2016, 1, 3), "Jakarta")
dt.Rows.Add(New DateTime(2016, 1, 3), "Jakarta")
dt.Rows.Add(New DateTime(2016, 1, 3), "Jakarta")
Dim result = From oRow In dt.AsEnumerable()
Group By
Column1DT = oRow("Column1"),
OffLocDT = oRow("OffLoc")
Into G = Group
Select New With
{
Key .Column1 = Column1DT,
Key .OffLoc = OffLocDT,
.Count = G.Count()
}
For Each item As Object In result
Console.WriteLine(item)
Next
output :
{ Column1 = 1/2/2016 12:00:00 AM, OffLoc = Johannesburg, Count = 2 }
{ Column1 = 1/3/2016 12:00:00 AM, OffLoc = Jakarta, Count = 3 }
Here You Go.
Dim list = (From g In ds.Tables("AuditLog").AsEnumerable()
Group By
col1 = g("Column1"),
col2 = g("OffLoc")
Into X = Group
Select New With
{
Key .Column1 = col1,
Key .OffLoc = col2,
.Count = X.Count()
}).ToList()
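As a variation on the answers above, VB's Group By clause also accepts an aggregate such as Count() directly in the Into part, which avoids the separate projection. The sketch below assumes Column1 is a DateTime column and uses a made-up subset of the question's rows; LogDate, OffLoc and Total are illustrative names.

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module GroupCountSketch
    Sub Main()
        ' Subset of the question's data (illustrative only).
        Dim dt As New DataTable("AuditLog")
        dt.Columns.Add("Column1", GetType(DateTime))
        dt.Columns.Add("OffLoc", GetType(String))
        dt.Rows.Add(New DateTime(2016, 2, 1), "Johannesburg")
        dt.Rows.Add(New DateTime(2016, 2, 1), "Johannesburg")
        dt.Rows.Add(New DateTime(2016, 2, 2), "Moscow")
        dt.Rows.Add(New DateTime(2016, 2, 2), "Johannesburg")
        dt.Rows.Add(New DateTime(2016, 2, 2), "Johannesburg")

        ' Typed keys via Field(Of T), with the Count() aggregate in the Into clause.
        Dim counts = From oRow In dt.AsEnumerable()
                     Group By LogDate = oRow.Field(Of DateTime)("Column1"),
                              OffLoc = oRow.Field(Of String)("OffLoc")
                     Into Total = Count()

        ' Prints one "date - location - count" line per group.
        For Each item In counts
            Console.WriteLine("{0:dd/MM/yyyy} - {1} - {2}", item.LogDate, item.OffLoc, item.Total)
        Next
    End Sub
End Module
```

Using Field(Of T) instead of oRow("Column1") keeps the group keys strongly typed, which makes later formatting or sorting by date straightforward.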

grouping in LINQ query

Given a table of thousands of rows of data as shown in the sample below
Id Date SymbolId NumOccs HighProjection LowProjection ProjectionTypeId
1 2014-04-09 28 45 1.0765 1.0519 1
2 2014-04-10 5 44 60.23 58.03 1
3 2014-04-11 29 77 1.026 1.0153 1
and a Class defined as
Public Class ProjectionPerformance
Public symbolId As Integer
Public Name as String
Public Date as String
Public ActualRange as Decimal
End Class
I am trying to return the following for each symbolId:
The symbolId (from this table)
The symbol Name (from the symbols table)
The Actual Range (High Projection - Low Projection)
Can this be done in one query? I essentially need a Dictionary(Of Integer, List(Of ProjectionPerformance)), where the integer is the symbolId and the list is generated from the query.
Updated:
To be a little clearer, here is what I'm doing so far, but it contains two LINQ iterations:
Public Shared Function GetRangeProjectionPerformance(Optional daysToRetrieve As Integer = 100) As Dictionary(Of Integer, List(Of ProjectionPerformance))
Dim todaysDate As Date = DateTime.Now.Date
Dim lookbackDate As Date = todaysDate.AddDays(daysToRetrieve * -1)
Dim temp As New Dictionary(Of Integer, List(Of ProjectionPerformance))
Using ctx As New ProjectionsEntities()
Dim query = (From d In ctx.projections
Where d.SymbolId <= 42 AndAlso d.Date >= lookbackDate
Join t In ctx.symbols On d.SymbolId Equals t.Id
Let actualRange = d.HighProjection - d.LowProjection
Select New With {
d.Date,
d.SymbolId,
t.Name,
actualRange}).GroupBy(Function(o) o.SymbolId).ToDictionary(Function(p) p.Key)
For Each itm In query
Dim rpp As New ProjectionPerformance
Dim rppList As New List(Of ProjectionPerformance)
If itm.Value.Count > 0 Then
For x As Integer = 0 To itm.Value.Count - 1
Dim bb As Integer = Convert.ToInt32(itm.Value(x).SymbolId)
With rpp
.SymbolId = bb
.ProjectionDate = itm.Value(x).Date.ToString()
.Name = itm.Value(x).Name
.ProjectedRange = itm.Value(x).actualRange
End With
rppList.Add(rpp)
Next
End If
temp.Add(itm.Key, rppList)
Next
End Using
Return temp
End Function
I'm going to answer in C#, but I think you'll get the gist of it anyway. Basically, you can group by SymbolId, build your object graph, and then use ToDictionary with the key to create the dictionary.
var result = (from d in _projectionEntities.projections
              where d.SymbolId <= 42
              group d by d.SymbolId into g
              select new {
                  SymbolId = g.Key,
                  ProjectionPerformances =
                      g.Select(gg => new ProjectionPerformance {
                          SymbolId = gg.SymbolId,
                          Name = gg.Symbol.Name,
                          rpDate = gg.Date.ToString(),
                          ActualRange = gg.HighProjection - gg.LowProjection
                      })
              })
             .ToDictionary(x => x.SymbolId);
Try:
Dim Result = (From d In _projectionEntities.projections
              Join t In _projectionEntities.symbols On d.SymbolId Equals t.Id
              Where d.SymbolId <= 42
              Select New With {.SymbolID = d.SymbolId,
                               .Date = d.Date,
                               .Name = t.Name,
                               .ActualRange = d.HighProjection - d.LowProjection})
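To end up with the Dictionary(Of Integer, List(Of ProjectionPerformance)) the question asks for, one possible approach is to project straight into ProjectionPerformance and then call GroupBy followed by the two-argument ToDictionary. The sketch below runs against in-memory arrays that stand in for ctx.projections and ctx.symbols; the symbol names, the ProjDate and ProjectionDate members, and the date format are assumptions for illustration, and an Entity Framework query may need the formatting step moved after materialization.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Class ProjectionPerformance
    Public Property SymbolId As Integer
    Public Property Name As String
    Public Property ProjectionDate As String
    Public Property ActualRange As Decimal
End Class

Module GroupToDictionarySketch
    Sub Main()
        ' Hypothetical in-memory stand-ins for ctx.projections and ctx.symbols.
        Dim projections = {
            New With {.SymbolId = 28, .ProjDate = #4/9/2014#, .HighProjection = 1.0765D, .LowProjection = 1.0519D},
            New With {.SymbolId = 5, .ProjDate = #4/10/2014#, .HighProjection = 60.23D, .LowProjection = 58.03D},
            New With {.SymbolId = 28, .ProjDate = #4/11/2014#, .HighProjection = 1.026D, .LowProjection = 1.0153D}
        }
        Dim symbols = {
            New With {.Id = 5, .Name = "SYM5"},
            New With {.Id = 28, .Name = "SYM28"}
        }

        ' Join and project directly into the target class.
        Dim flat = From d In projections
                   Join t In symbols On d.SymbolId Equals t.Id
                   Select New ProjectionPerformance With {
                       .SymbolId = d.SymbolId,
                       .Name = t.Name,
                       .ProjectionDate = d.ProjDate.ToString("yyyy-MM-dd"),
                       .ActualRange = d.HighProjection - d.LowProjection}

        ' Group by symbol and turn each group into a list keyed by its SymbolId.
        Dim result As Dictionary(Of Integer, List(Of ProjectionPerformance)) =
            flat.GroupBy(Function(p) p.SymbolId).
                 ToDictionary(Function(g) g.Key, Function(g) g.ToList())

        For Each pair In result
            Console.WriteLine("{0}: {1} item(s)", pair.Key, pair.Value.Count)
        Next
    End Sub
End Module
```

Because every list element is created by its own New ProjectionPerformance expression, each entry is a distinct object, and no second loop over the grouped results is needed.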

How to remove all duplicates in a data table in vb.net?

Consider my data table
ID Name
1 AAA
2 BBB
3 CCC
1 AAA
4 DDD
Final Output is
2 BBB
3 CCC
4 DDD
How can i remove the rows in the data table using Vb.Net
Any help is appreciated.
The following works if you only want the rows that are not duplicated (it skips those that share the same ID and Name):
Dim distinctRows = From r In tbl
Group By Distinct = New With {Key .ID = CInt(r("ID")), Key .Name = CStr(r("Name"))} Into Group
Where Group.Count = 1
Select Distinct
' Create a new DataTable containing only the unique rows '
Dim tblDistinct = (From r In tbl
Join distinctRow In distinctRows
On distinctRow.ID Equals CInt(r("ID")) _
And distinctRow.Name Equals CStr(r("Name"))
Select r).CopyToDataTable
If you want to remove the dups from the original table:
Dim tblDups = From r In tbl
Group By Dups = New With {Key .ID = CInt(r("ID")), Key .Name = CStr(r("Name"))} Into Group
Where Group.Count > 1
Select Dups
Dim dupRowList = (From r In tbl
Join dupRow In tblDups
On dupRow.ID Equals CInt(r("ID")) _
And dupRow.Name Equals CStr(r("Name"))
Select r).ToList()
For Each dup In dupRowList
tbl.Rows.Remove(dup)
Next
Here is your sample-data:
Dim tbl As New DataTable
tbl.Columns.Add(New DataColumn("ID", GetType(Int32)))
tbl.Columns.Add(New DataColumn("Name", GetType(String)))
Dim row = tbl.NewRow
row("ID") = 1
row("Name") = "AAA"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 2
row("Name") = "BBB"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 3
row("Name") = "CCC"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 1
row("Name") = "AAA"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 4
row("Name") = "DDD"
tbl.Rows.Add(row)
You can use the DefaultView.ToTable method of a DataTable to do the filtering like this:
Public Sub RemoveDuplicateRows(ByRef rDataTable As DataTable)
Dim pNewDataTable As DataTable
Dim pCurrentRowCopy As DataRow
Dim pColumnList As New List(Of String)
Dim pColumn As DataColumn
'Build column list
For Each pColumn In rDataTable.Columns
pColumnList.Add(pColumn.ColumnName)
Next
'Filter by all columns
pNewDataTable = rDataTable.DefaultView.ToTable(True, pColumnList.ToArray)
rDataTable = rDataTable.Clone
'Import rows into original table structure
For Each pCurrentRowCopy In pNewDataTable.Rows
rDataTable.ImportRow(pCurrentRowCopy)
Next
End Sub
Assuming you want to check all the columns, this should remove the duplicates from the DataTable (DT):
DT = DT.DefaultView.ToTable(True, Array.ConvertAll((From v In DT.Columns Select v.ColumnName).ToArray(), Function(x) x.ToString()))
Unless I overlooked it, this doesn't seem to be in the documentation (DataView.ToTable Method), but this also appears to do the same thing:
DT = DT.DefaultView.ToTable(True)
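Note that DefaultView.ToTable(True, ...) keeps one copy of each duplicated row, whereas the question's expected output drops every row that has a duplicate. A compact sketch of that stricter behaviour, assuming duplicates are defined by the (ID, Name) pair and using an illustrative module name, could look like this:

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module RemoveAllDuplicatesSketch
    Sub Main()
        ' Sample data from the question.
        Dim tbl As New DataTable()
        tbl.Columns.Add("ID", GetType(Integer))
        tbl.Columns.Add("Name", GetType(String))
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(2, "BBB")
        tbl.Rows.Add(3, "CCC")
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(4, "DDD")

        ' Keep only rows whose (ID, Name) pair occurs exactly once.
        Dim singles = tbl.AsEnumerable().
            GroupBy(Function(r) New With {
                        Key .ID = r.Field(Of Integer)("ID"),
                        Key .Name = r.Field(Of String)("Name")}).
            Where(Function(g) g.Count() = 1).
            Select(Function(g) g.First()).
            ToList()

        ' CopyToDataTable throws on an empty sequence, so fall back to an empty clone.
        Dim result As DataTable = If(singles.Count > 0, singles.CopyToDataTable(), tbl.Clone())

        For Each row As DataRow In result.Rows
            Console.WriteLine("{0} {1}", row("ID"), row("Name"))
        Next
    End Sub
End Module
```

For the sample data this leaves the rows 2 BBB, 3 CCC and 4 DDD, which is the output the question asks for.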