VB.NET: sorting list by another list

How to sort somelist As List(of T) by the order set in another list sortorder As List(of Integer)? Both somelist and sortorder are of the same size and are indexed from 0 to n. Integers in the sortorder list determine the sort order: new index of item X in somelist = value of item X in sortorder.
Like this:
somelist = (itemA, itemB, itemC)
sortorder = (3, 1, 2)
somelist.sort()
somelist = (itemB, itemC, itemA)
I am trying to sort several equally sized lists using the predefined sort order.

You could use LINQ, although I hate the ugly method syntax in VB.NET:
somelist = somelist.
    Select(Function(t, index) New With {.Obj = t, .Index = index}).
    OrderBy(Function(x) sortorder(x.Index)).
    Select(Function(x) x.Obj).
    ToList()
This uses the overload of Enumerable.Select that also passes the index of each item to the projection. The object and its index are stored in an anonymous type, which is used for the ordering; finally I select the object again and call ToList to build the ordered list.
Another way is to use Enumerable.Zip to merge both into an anonymous type:
Dim q = From x In somelist.Zip(sortorder, Function(t, sort) New With {.Obj = t, .Sort = sort})
        Order By x.Sort Ascending
        Select x.Obj
somelist = q.ToList()
If you want to order it descending, so the highest values first, use OrderByDescending in the method syntax and Order By x.Sort Descending in the query.
But why do you store such closely related information in two different collections at all?
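The same idea can be sketched outside LINQ. The following is a minimal Python illustration, not a VB.NET replacement: it assumes, as in the question, that both lists have the same length and that each value in sortorder is the sort key of the item at the same position. The helper name sort_by_order and the second list prices are hypothetical. The permutation is computed from sortorder alone, so it can be applied to several equally sized lists.

```python
def sort_by_order(items, sortorder):
    """Return items rearranged so that ascending sortorder values decide the order."""
    if len(items) != len(sortorder):
        raise ValueError("items and sortorder must have the same length")
    # Source indices, ordered by the sort key stored at the same position.
    permutation = sorted(range(len(items)), key=lambda i: sortorder[i])
    return [items[i] for i in permutation]


somelist = ["itemA", "itemB", "itemC"]
prices = [10, 20, 30]          # hypothetical second list of the same size
sortorder = [3, 1, 2]

sorted_items = sort_by_order(somelist, sortorder)
sorted_prices = sort_by_order(prices, sortorder)
print(sorted_items)   # ['itemB', 'itemC', 'itemA']
print(sorted_prices)  # [20, 30, 10]
```

Python's sorted is stable, so items with equal keys keep their original relative order, which matches the behaviour of OrderBy.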

Related

Get source rows of LINQ query result?

In the following LINQ example, how can you get a list of index numbers from the original rows that the group is made up of? I would like to show the user where the data comes from.
Dim inputDt As New DataTable
inputDt.Columns.Add("Contractor")
inputDt.Columns.Add("Job_Type")
inputDt.Columns.Add("Cost")
inputDt.Rows.Add({"John Smith", "Roofing", "2408.68"})
inputDt.Rows.Add({"John Smith", "Electrical", "1123.08"})
inputDt.Rows.Add({"John Smith", "Framing", "900.99"})
inputDt.Rows.Add({"John Smith", "Electrical", "892.00"})
Dim results = From rows In inputDt Where rows!Contractor <> ""
              Group rows By rows!Job_Type
              Into cost_total = Sum(CDec(rows!Cost))
For Each r In results
    ' Show results.
    'r.Job_Type
    'r.cost_total
    ' Show line numbers of original rows... ?
Next
For the result (Job_Type="Electrical", cost_total=2015.08), the original index numbers are 1 and 3.
Thanks
First and perhaps foremost, set Option Strict On. This will not allow the old VB6-style rows!Cost notation, but that is for the better: that notation always returns Object, which the data rarely is. This is no loss at all, as .NET has better ways to type and convert variables.
Second, and somewhat related, is that all your DataTable columns are text even though one is clearly decimal. Next, your query relates to working with the data in the table but you want to also include a DataRow property which is a bit odd. Better would be to add an Id or Number to the data to act as the identifier. This will also help the results make sense if the View (order of rows) changes.
You did not clarify whether you wanted a CSV of indices (now IDs) or a collection of them. A CSV of them seems simpler, so that's what this does.
The code also uses more idiomatic names and demonstrates casting data to the needed type using other extension methods. It also uses the extension method approach. First the DataTable with non-string Data Types specified:
Dim inputDt As New DataTable
inputDt.Columns.Add("ID", GetType(Int32))
inputDt.Columns.Add("Contractor")
inputDt.Columns.Add("JobType")
inputDt.Columns.Add("Cost", GetType(Decimal))
inputDt.Rows.Add({1, "John Smith", "Roofing", "2408.68"})
inputDt.Rows.Add({5, "John Smith", "Electrical", "1123.08"})
inputDt.Rows.Add({9, "John Smith", "Framing", "900.99"})
inputDt.Rows.Add({17, "John Smith", "Electrical", "892.00"})
Then the query:
Dim summary = inputDt.AsEnumerable().GroupBy(Function(g) g.Field(Of String)("JobType"),
                  Function(k, v) New With {.Job = k,
                      .Cost = v.Sum(Function(q) q.Field(Of Decimal)("Cost")),
                      .Indices = String.Join(",", inputDt.AsEnumerable().
                          Where(Function(q) q.Field(Of String)("JobType") = k).
                          Select(Function(j) j.Field(Of Int32)("Id")))
                  }).
              OrderBy(Function(j) j.Cost).
              ToArray()
' Debug, test:
For Each item In summary
    Console.WriteLine("Job: {0}, Cost: {1}, Ids: {2}", item.Job, item.Cost, item.Indices)
Next
The excessive scroll is unfortunate, but I left it so the clauses align with the "level" they act on. As you can see, a separate query is run on the DataTable to get the matching indices.
It is a little more typical to write such a thing as
Dim foo = Something.GroupBy(...).Select(...)
But you can skip the Select by using this overload of GroupBy, as the above does:
Dim foo = Something.GroupBy(Function (g) ..., Function (k, v) ... )
Results:
Job: Framing, Cost: 900.99, Ids: 9
Job: Electrical, Cost: 2015.08, Ids: 5,17
Job: Roofing, Cost: 2408.68, Ids: 1
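As a rough cross-check of those numbers, here is a minimal Python sketch of the same grouping. It is an illustration under assumptions rather than a port of the LINQ query: the rows are plain tuples instead of DataRows, the function name summarize is hypothetical, and the IDs are collected in the same pass as the totals instead of through a second query per group.

```python
from collections import defaultdict
from decimal import Decimal

# Same sample data as the DataTable above: (ID, Contractor, JobType, Cost).
rows = [
    (1, "John Smith", "Roofing", Decimal("2408.68")),
    (5, "John Smith", "Electrical", Decimal("1123.08")),
    (9, "John Smith", "Framing", Decimal("900.99")),
    (17, "John Smith", "Electrical", Decimal("892.00")),
]


def summarize(rows):
    """Group by job type, summing the cost and remembering the source row IDs."""
    totals = defaultdict(Decimal)   # Decimal() starts each total at 0
    ids = defaultdict(list)
    for row_id, _contractor, job_type, cost in rows:
        totals[job_type] += cost
        ids[job_type].append(row_id)
    # One (job, total, ids) tuple per group, ordered by total cost.
    return sorted(((job, totals[job], ids[job]) for job in totals),
                  key=lambda entry: entry[1])


summary = summarize(rows)
for job, cost, row_ids in summary:
    print(f"Job: {job}, Cost: {cost}, Ids: {','.join(map(str, row_ids))}")
```

Collecting the IDs while summing avoids rescanning the whole table for every group, which may matter for larger inputs.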

convert Int64Index to Int

I'm iterating through a dataframe (called hdf) and applying changes on a row by row basis. hdf is sorted by group_id and assigned a 1 through n rank on some criteria.
# Groupby function creates subset dataframes (a dataframe per distinct group_id).
grouped = hdf.groupby('group_id')
# Iterate through each subdataframe.
for name, group in grouped:
    # This grabs the top index for each subdataframe
    index1 = group[group['group_rank']==1].index
    # If criteria1 == 0, flag all rows for removal
    if(max(group['criteria1']) == 0):
        for x in range(rank1, rank1 + max(group['group_rank'])):
            hdf.loc[x,'remove_row'] = 1
I'm getting the following error:
TypeError: int() argument must be a string or a number, not 'Int64Index'
I get the same error when I try to cast rank1 explicitly:
rank1 = int(group[group['auction_rank']==1].index)
Can someone explain what is happening and provide an alternative?
The answer to your specific question is that index1 is an Int64Index (basically a list), even if it has one element. To get that one element, you can use index1[0].
But there are better ways of accomplishing your goal. If you want to remove all of the rows in the "bad" groups, you can use filter:
hdf = hdf.groupby('group_id').filter(lambda group: group['criteria1'].max() != 0)
If you only want to remove certain rows within matching groups, you can write a function and then use apply:
def filter_group(group):
    if group['criteria1'].max() != 0:
        return group
    else:
        return group.loc[other criteria here]
hdf = hdf.groupby('group_id').apply(filter_group)
(If you really like your current way of doing things, you should know that loc will accept an index, not just an integer, so you could also do hdf.loc[group.index, 'remove_row'] = 1).
Call tolist() on the Int64Index object. The resulting list can then be iterated as int values.
Simply add [0] to ensure you get the first value from the index:
rank1 = int(group[group['auction_rank']==1].index[0])

Using method FindAll in two generic lists

I've got two generic lists in a vb.net program. I'd like to loop List_A and search List_A.ID in List_B, elements in common should be stored in a third list (LIST).
For Each n As BE_Busq In List_A
    LIST = List_B.FindAll(Function(x As BE_Busq) x.ID = n.ID)
    '' for each step, LIST should be incremented, not be replaced
Next
The FindAll method returns a generic list. How can I append to LIST instead of replacing it on each iteration of the loop?
Try this:
LIST.AddRange(List_B.FindAll(Function(x As BE_Busq) x.ID = n.ID))
You can use the AddRange method to add multiple items to a List.
For Each n As BE_Busq In List_A
    LIST.AddRange(List_B.FindAll(Function(x As BE_Busq) x.ID = n.ID))
Next
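To show what the loop accumulates, here is a minimal Python sketch of the same pattern. The Item class is a hypothetical stand-in for BE_Busq, and list.extend plays the role of AddRange. The second variant is an alternative technique, not what the answers above use: it builds a set of IDs first so List_B is scanned only once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for BE_Busq: only the ID matters for matching."""
    id: int
    name: str


list_a = [Item(1, "a"), Item(2, "b"), Item(3, "c")]
list_b = [Item(2, "x"), Item(3, "y"), Item(3, "z"), Item(4, "w")]

# Loop form, mirroring FindAll + AddRange: append the matches, never reassign.
common = []
for n in list_a:
    common.extend(x for x in list_b if x.id == n.id)

# Set-based alternative: collect the IDs of list_a once, then filter list_b.
ids_in_a = {n.id for n in list_a}
common_fast = [x for x in list_b if x.id in ids_in_a]

print([x.name for x in common])       # ['x', 'y', 'z']
print([x.name for x in common_fast])  # ['x', 'y', 'z']
```

The two variants agree on this sample but not in general: the loop follows the order of the first list and repeats matches when that list contains duplicate IDs, while the set-based form follows the order of the second list.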

LINQ (aggregate) query to add multiple arrays together based on criteria

I have a data table with time-series data. How do I write a query to add daily data together for selected series?
My table looks like this...
Day,Y,Series
1,1,A
1,2,A
1,3,A
2,2,A
2,3,B
2,5,C
3,4,A
3,1,B
3,4,C
etc.
I want to return an array (dY) based on a list e.g. {"A","C"}. e.g. giving the Y value (for A+C) for each day...
dY = {4,7,8}
I have managed to write the query in SQL
SELECT Sum(myTable.Y) AS [Total Of Y]
FROM AAAA
WHERE (((myTable.Series) In (1,3)))
GROUP BY myTable.X;
and I think it should be something like this in LINQ (VB.NET)
Dim mySeries = {1, 3}
Dim Ys = (From myrows In oSubData Where mySeries.Contains(myrows("Series")) Select mycol = Sum(Val(myrows("Y"))))

VB.net sort and retain keys

I have a situation where I need to sort arrays and preserve the current key - value pairs.
For example, this array:
(0) = 4
(1) = 3
(2) = 1
(3) = 2
Needs to sort like this
(2) = 1
(3) = 2
(1) = 3
(0) = 4
Retaining the original keys. Array.Sort(myArray) sorts into the right sequence but doesn't keep the indexes. I need a variant that does.
edit
Using the links, this seems close to what I want. Do I just need to remove the extra brackets to convert this to vb.net?
myList.Sort((firstPair, nextPair) =>
    {
        return firstPair.Value.CompareTo(nextPair.Value);
    }
);
(Also, would I integrate this as a function or something else?)
In an array, the order is determined by the indexes (what you call "keys"). Thus, there cannot be an array like this:
(2) = 1
(3) = 2
(1) = 3
(0) = 4
What you need is a data structure that has keys, values and an order (which is independent of the keys). You can use a List(Of KeyValuePair(Of Integer, Integer)) or (if you use .NET 4) a List(Of Tuple(Of Integer, Integer)) for this; a few examples are shown in the link provided by Ken in the comment (which I will repeat here for convenience):
How do you sort a C# dictionary by value?
EDIT: Another option would be to use LINQ to automatically create a sorted IEnumerable(Of Tuple(Of Integer, Integer)):
Dim a() As Integer = {4, 3, 1, 2} ' This is your array
Dim tuples = a.Select(Function(value, key) New Tuple(Of Integer, Integer)(key, value))
Dim sorted = tuples.OrderBy(Function(t) t.Item2)
(untested, don't have Visual Studio available right now)
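The pairing step is easy to check in isolation. Below is a minimal Python sketch of the same idea, offered as an illustration rather than a VB.NET answer: enumerate attaches the original index (the "key") to each value, and the pairs are then sorted by value only.

```python
values = [4, 3, 1, 2]  # the array from the question

# Pair every value with its original index, then order the pairs by value.
pairs = sorted(enumerate(values), key=lambda pair: pair[1])

for key, value in pairs:
    print(f"({key}) = {value}")
# (2) = 1
# (3) = 2
# (1) = 3
# (0) = 4
```

Because the keys travel with the values, duplicate values are not a problem here, unlike with a dictionary keyed by value.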
Since you are using .NET 2.0 (you said in one of the comments that you are using Visual Studio 2005), a SortedDictionary or SortedList might be an option for you, if every array value appears only once. (An OrderedDictionary only preserves insertion order; it does not sort.) Since these sorted collections are ordered by key, you could add your array entries to one of them, using
the array index as the dictionary value and
the array value as the dictionary key (which will be used to order the dictionary).
What you are looking for is to store the data as a Dictionary<int, int> and sort the dictionary by value.
I think the VB.NET syntax for that dictionary is Dictionary(Of Integer, Integer).