Are there better ways to split arrays into smaller arrays? - vb.net

I have a program that creates a batch of orders. However, the API has a limitation: I have to submit them 10 at a time.
If orderList.Count > 10 Then
    Dim firstTen = From n In orderList Take (10)
    Dim theRest = From n In orderList Skip (10)
    Dim result1 = Await internalActualcreateNormalLimitOrderMultiple(firstTen.ToArray)
    Dim result2 = Await internalActualcreateNormalLimitOrderMultiple(theRest.ToArray)
    Return result1 + result2 'no longer json but we don't really use the resulting json unless for debugging
End If
Basically I want to split {1,2,3,4,5,6,7,8,9,10,11,12,...} into {1,2,3},{4,5,6},{7,8,9},...
And I wonder if I can use LINQ instead of For Each.
So I use this recursive function: get the first 10 or 20, then recursively call the function on the rest, and so on.
I look at it and, while it's simple, it doesn't seem right. Obviously the number of orders won't be big, at most 15. But what if one day I have 100? I could get something like a stack overflow from the recursion.
If only there were a function that could split an array into arrays using LINQ (Where, Take, and Skip), that would be great.
Of course I can do For Each, but perhaps there is a more elegant way?
Then I wrote this code instead:
Public Shared Function splitArrayIntoSmallerArrays(Of someObject)(arrayOfSomeObject As someObject(), chunkSize As Integer) As List(Of someObject())
    Dim output = New List(Of someObject())
    Dim newestArray = New List(Of someObject)
    For i = 0 To arrayOfSomeObject.Length - 1
        newestArray.Add(arrayOfSomeObject(i))
        ' Flush the current chunk once it is full
        If newestArray.Count = chunkSize Then
            output.Add(newestArray.ToArray)
            newestArray = New List(Of someObject)
        End If
    Next
    ' Add the final partial chunk, if any (avoids a trailing empty array)
    If newestArray.Count > 0 Then
        output.Add(newestArray.ToArray)
    End If
    Return output
End Function
That'll do it in O(n)
But I think it can be done more simply by using LINQ with Skip and Take, but I don't know how. Or maybe GroupBy.
Any idea?

Your question was not very clear to me, but I believe you have an array with several "objects" inside it, correct? And then you want to divide this array into smaller arrays, correct?
So how about serializing the larger array to JSON, creating an object that will be filled in with the smaller sets of values, then deserializing this JSON back into objects of the type above and passing those objects to the smaller arrays?

If you have access to .NET 6 or later, perhaps you're looking for Enumerable.Chunk?
Splits the elements of a sequence into chunks of size at most size.
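A minimal sketch of both routes, assuming .NET 6 or later for the built-in call. SplitIntoChunks is a hypothetical helper name (not a framework method) that produces the same batches on older frameworks by grouping on index \ size; the 25-element test array is made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ChunkDemo
    ' Hypothetical fallback for frameworks without Enumerable.Chunk:
    ' pair each element with its index, then group on index \ size (integer division).
    Function SplitIntoChunks(Of T)(source As IEnumerable(Of T), size As Integer) As List(Of T())
        Return source.
            Select(Function(item, i) New With {.Value = item, .Index = i}).
            GroupBy(Function(pair) pair.Index \ size).
            Select(Function(g) g.Select(Function(pair) pair.Value).ToArray()).
            ToList()
    End Function

    Sub Main()
        Dim orders = Enumerable.Range(1, 25).ToArray()

        ' Built-in (.NET 6 or later): yields batches of 10, 10 and 5 elements.
        For Each batch In orders.Chunk(10)
            Console.WriteLine(String.Join(",", batch.Select(Function(n) n.ToString())))
        Next

        ' Fallback: yields the same batches in the same order.
        For Each batch In SplitIntoChunks(orders, 10)
            Console.WriteLine(String.Join(",", batch.Select(Function(n) n.ToString())))
        Next
    End Sub
End Module
```

Either way each batch can then be passed to the order-creation call in a plain For Each, with no recursion involved.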

Related

Import txt file data and store it in Multidimensional array in Vb.net

Sorry if the problem is so basic, I'm a bit used to python not VB.net
I'm trying to read text file data (numbers) and store it in array/list
# Sample of text
1.30e+03,1.30e+03,1.30e+03
5.4600e+02,2.7700e+02,2.8000e+02
# PS: I can control the output of the numbers to have delimiter = ',' or space between numbers, whatever is easier to import
I wrote the following code to read string data and store it. Yet I don't know how to get a multidimensional array (2D or 3D) instead of a 1D string array (e.g. for the text above, it would be a 2x3 array)
' Import Data
Comp_path = FinalPath & "Components_colors.txt"
reader = New StreamReader(Comp_path)
Dim W As String = ""
Dim wArray(10) As String
Dim i As Integer = 0
Do Until reader.Peek = -1
W = reader.ReadLine()
wArray(i) = W
i += 1
Loop
Moreover, I don't know the length of the text file, so I can't determine the length of the array like I did in the code above for the string wArray
For a file like this, you should turn to NuGet for a dedicated CSV parser. There are parsers built into .NET you could also use, but pulling one off of NuGet will also let you parse the values directly into something other than a string.
But if you really don't want to do that you can start with this (assuming Option Infer):
Public Function ImportData(filePath As String) As IEnumerable(Of Double())
Dim lines = File.ReadLines(filePath)
Return lines.Select(Function(line) line.Split(",").Select(AddressOf Double.Parse).ToArray())
End Function
And use it like this:
Comp_path = FinalPath & "Components_colors.txt"
Dim result = ImportData(Comp_path)
Note this code doesn't actually do any meaningful work yet. It doesn't even read the file. What it does is give you an object (result) that you can use with a For Each loop or linq operations. It will read the file in a just-in-time way, parsing out the data for each line as it goes. If you want an array (or List, which you should use in .Net more often), you can append a ToList() call to the end:
Comp_path = FinalPath & "Components_colors.txt"
Dim result = ImportData(Comp_path).ToList()
But you should try to avoid doing that. It's much less efficient in terms of memory use. The first sample will only ever need to keep one line of the file in memory at a time. Adding ToArray() or ToList() needs to load the entire file.
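For what it's worth, here is a self-contained sketch of the same idea, under a few assumptions of mine: it parses with CultureInfo.InvariantCulture so the "." decimal separator works regardless of machine settings, it skips blank lines, and it writes the two sample lines to a temp file just so it can run on its own. ParseLine is a hypothetical helper name.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO
Imports System.Linq

Module ImportDemo
    ' Hypothetical helper: turn one comma-separated line into an array of Doubles.
    Function ParseLine(line As String) As Double()
        Return line.Split(","c).
            Select(Function(s) Double.Parse(s, CultureInfo.InvariantCulture)).
            ToArray()
    End Function

    ' Lazily yields one Double() per non-blank line; nothing is read until enumeration.
    Function ImportData(filePath As String) As IEnumerable(Of Double())
        Return File.ReadLines(filePath).
            Where(Function(line) Not String.IsNullOrWhiteSpace(line)).
            Select(Function(line) ParseLine(line))
    End Function

    Sub Main()
        ' Stand-in for Components_colors.txt, using the sample data from the question.
        Dim tempFile = Path.GetTempFileName()
        File.WriteAllLines(tempFile, {"1.30e+03,1.30e+03,1.30e+03", "5.4600e+02,2.7700e+02,2.8000e+02"})

        ' Materialize as a jagged array only if random access is really needed.
        Dim rows As Double()() = ImportData(tempFile).ToArray()
        Console.WriteLine(rows.Length)      ' number of rows
        Console.WriteLine(rows(0).Length)   ' number of columns in the first row
        Console.WriteLine(rows(1)(2))       ' second row, third column

        File.Delete(tempFile)
    End Sub
End Module
```

A jagged array (Double()()) like this does not need the row count up front, which sidesteps the "I don't know the length of the file" problem.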
Some more notes:
Many newer dynamic platforms like Python don't actually use real arrays in the formal computer science sense (fixed block of contiguous memory). Rather, they use collections, and just call them arrays. .Net has collections, too, but when you declare an array, you get an array. This has nice benefits for performance, but if you don't know you want that or how to take advantage of it you're probably better off asking for a generic List most of the time instead.
Thanks to cultural/internationalization issues, converting numeric (or date) values to and from strings is much slower and more error-prone than you may have believed in the past, especially coming from a dynamic platform. It is slow on these other platforms, too, but they want you to pretend it isn't. The first introduction to a strongly-typed platform like .Net can feel stifling in this area, but once you understand the performance and reliability benefits, you won't want to go back.
In strongly-typed platforms it is very important to understand the data types you are working with at every level of an expression. Otherwise, building and reading statements like the Return line in my answer will be way more difficult and frustrating than it needs to be.

Create a List of elements from a DataTable LINQ Column

I would like to know how I can convert elements of a column of a DataTable to a list of type string, grouping the elements to avoid repetition.
For example my DataTable would look like this
[Image: DataTable]
and I want to make a list containing the elements of only "User" without repeating itself using LINQ.
The code I was trying to use is
InvoiceList = InvoiceDT.AsEnumerable().GroupBy(Function(r) r("User").ToString).ToList(Function(g) g.ToList())
But it doesn't work for me since I am new to LINQ and still have problems forming the structures.
I'd use this:
InvoiceList = InvoiceDT.AsEnumerable().Select(Function(r) r("User").ToString()).Distinct().ToList()
If you wanted a GroupBy solution it's
InvoiceList = InvoiceDT.AsEnumerable().GroupBy(Function(r) r("User").ToString()).Select(Function(g) g.Key).ToList()
Where your code went wrong was in trying to pass a delegate to ToList; it doesn't take one (and you wouldn't ToList the g either, as it's a list of data rows with all varying properties).
To reshape the IGroupings produced by the GroupBy (each is something like a list of objects that all share the same Key, which is a property of the list that the IGrouping represents) into a sequence of string Keys, we Select the Key, and then ToList that.
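To make the difference concrete, here is a small sketch with made-up data (the Amount column and the three rows are my own assumptions, not from the question). It shows Distinct for the plain list of users, and GroupBy where it earns its keep: when you want something per user in addition to the key. On .NET Framework, AsEnumerable and Field need a reference to System.Data.DataSetExtensions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module DistinctUsersDemo
    Sub Main()
        ' Hypothetical invoice table with a repeated user.
        Dim invoiceDT As New DataTable("Invoices")
        invoiceDT.Columns.Add("User", GetType(String))
        invoiceDT.Columns.Add("Amount", GetType(Decimal))
        invoiceDT.Rows.Add("Ann", 10D)
        invoiceDT.Rows.Add("Bob", 20D)
        invoiceDT.Rows.Add("Ann", 30D)

        ' Distinct: just the unique user names, in order of first appearance.
        Dim users As List(Of String) = invoiceDT.AsEnumerable().
            Select(Function(r) r.Field(Of String)("User")).
            Distinct().
            ToList()
        Console.WriteLine(String.Join(", ", users))

        ' GroupBy: useful when each key needs an aggregate alongside it.
        Dim totals = invoiceDT.AsEnumerable().
            GroupBy(Function(r) r.Field(Of String)("User")).
            Select(Function(g) New With {
                .User = g.Key,
                .Total = g.Sum(Function(r) r.Field(Of Decimal)("Amount"))
            })
        For Each t In totals
            Console.WriteLine(t.User & ": " & t.Total.ToString())
        Next
    End Sub
End Module
```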
There is a lot of back and forthing between developers over things like ToList vs ToArray - some people universally use ToList because, for collections of an unknown number of elements, both list and array will grow and resize repeatedly in the same way but using ToArray requires one additional resizing step at the end to trim off any unused slots. Mostly that's trivial in terms of an overall performance consideration and should be weighed against the benefit of releasing the memory with the trim. Getting into finer details is way beyond the scope of this answer but you can read some huge blog posts about it.
I personally think it's more important to generate sensible code by calling the method that results in the relevant type depending on what you plan to do with it; I ToList if I need List functionality (add/insert/remove). I prefer ToArray if an array suits the follow-on purposes (read/write/random access, no insert or delete), and if I'll only ever enumerate it I don't To... anything at all - I just ForEach the result of the query, which can give a bigger performance boost than anything else because it means I may not have to enumerate the entire set (if I stop early) or allocate memory all at once for doing so (if I'm writing to a socket or file).
On the use of ToString; it's worth avoiding if you think you'll fall into a pattern where you do it on every column just to get a string. If the column is already a string it's an acceptable way to get the object that DataRow.Item gives you, into a string. If the column is another type it's better to cast it:
DirectCast(r("Age"), Integer)
r.Field(Of Integer)("Age")
Thing is, it's verbose, and ugly, and intellisense doesn't help you out with writing Age or knowing it's an Int. LINQ in VB is bad enough for verbosity without pouring gas on that fire. If you're working with datatables of a known structure, it's a lot nicer if you make strongly typed ones:
Add a new file of type DataSet to your project
Open it so the design surface appears. In the properties grid call it something reasonable, such as AccountsDataSet
Right click, Add Table, call it Invoices
Right click the empty table, Add Column, call it User
Then use it like:
Dim dt as new AccountsDataSet.InvoicesDataTable
Populate it like:
dt.AddInvoicesRow("John Smith", ... other properties here)
Query it like:
dt.Select(Function(r) r.User).Distinct()
Much nicer than accessing column names by string, and having them be objects that need casting.
Consider the dataset generator as a way to quickly, visually, create POCO classes with named, typed properties.
Try this
dim list as List(of string) = InvoiceDT.Rows.
Cast(of DataRow)().
Select(Function(r) r("User").ToString()).
Distinct().
ToList()
Here you cast Row collection as IEnumerable(of DataRow), rest is trivial

Using Orderby on BatchedJoinBlock(Of T1, T2) - Dataflow (Task Parallel Library)

I'm just looking to be able to sort the results of a BatchedJoinBlock (http://msdn.microsoft.com/en-us/library/hh194683.aspx) so that the different results of the different targets stay together. I will explain! Example in some pseudo-code:
Dim batchedJoin = New BatchedJoinBlock(Of String, object)(4)
batchedJoin.Target1.Post("String1Target1")
batchedJoin.Target2.Post(CType(BuildIt, StringBuilder1))
batchedJoin.Target1.Post("String1Target2")
batchedJoin.Target2.Post(CType(BuildIt, StringBuilder2))
Dim results = batchedJoin.Receive()
'This sorts one result...
Dim SortByResult = results.Item1.OrderBy(Function(item) item.ToString, New NaturalStringComparer)
Basically I've got a string and an object, the SortByResult variable above sorts the strings exactly as I'd like them to sort. I'm looking for a way to get the objects that used to be at the same index number in target2 into the same order. e.g. if "String1Target1" changes order I'd like to somehow reliably refer to/pair it together with "StringBuilder1". The actual end result just needs to be that the objects (target2) are sorted in the order that is dictated by the strings being sorted (target1). Something like:
Dim EndResult = results.Item2.OrderBy(strings in target1)
but I'll gladly take an intermediate solution! I've also tried using a dictionary (results.Item2.ToDictionary) with the string as a key (which would also be a fine solution) but it's a bit beyond my ken using lambda expressions in the proper context. I can realistically do this in several steps with a list or something, but I'm trying to get something more efficient/learn something, and it seems like there's a lot of default options with the results of the join block that I'm just not experienced enough to use. Thanks in advance for any help you can provide!
To me, it looks like you don't actually want BatchedJoinBlock, because the two pieces of data always come together. A better option for that would be a BatchBlock of Tuple<string, object>. When you have that, you can then use LINQ directly to sort each batch:
results.OrderBy(Function(tuple) tuple.Item1)
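A rough sketch of what that could look like end to end, assuming the System.Threading.Tasks.Dataflow package is referenced. The batch size of 2, the sample keys and StringComparer.Ordinal are placeholders of mine; the NaturalStringComparer from the question would go where the comparer is passed.

```vbnet
Imports System
Imports System.Linq
Imports System.Text
Imports System.Threading.Tasks.Dataflow

Module BatchSortDemo
    Sub Main()
        ' Each item carries its string and its object together, so they cannot drift apart.
        Dim batchBlock = New BatchBlock(Of Tuple(Of String, Object))(2)
        batchBlock.Post(Tuple.Create(Of String, Object)("b", New StringBuilder("second")))
        batchBlock.Post(Tuple.Create(Of String, Object)("a", New StringBuilder("first")))

        ' Receive returns one full batch as an array of tuples.
        Dim results As Tuple(Of String, Object)() = batchBlock.Receive()

        ' Sort by the string; swap in your own comparer here.
        Dim sorted = results.OrderBy(Function(t) t.Item1, StringComparer.Ordinal).ToList()

        ' The objects, now in the order dictated by their strings.
        Dim objectsInOrder = sorted.Select(Function(t) t.Item2).ToList()
        For Each pair In sorted
            Console.WriteLine(pair.Item1 & " -> " & pair.Item2.ToString())
        Next
    End Sub
End Module
```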

How to get the minimum value from a typed list generated from Linq to SQL

I'm returning a list of database records;
Dim rsPublicChilds As System.Data.Linq.ISingleResult(Of spGetPublicObjectsResult) = Nothing
rsPublicChilds = dc.spGetPublicObject(slintLoginID, lintLanguageID, lintObjectID, lintObjectTypeID, lstrSEOURL, lstrValid)
I get an enumerable list of rsPublicChildObjects that I then convert to an array;
Dim larr_PublicChild As IEnumerable(Of spGetPublicObjectsResult) = rsPublicChilds.toArray()
That then gives me easy access to an array of the objects, so I can then do;
larr_publicchild(0).colMyValue
etc.etc
I'd like to get the minimum value of colMyValue (or any other property of the object that's been created for me) but I can't quite see how to get there.
thanks
Andy
You can write
someCollection.Min(Function(x) x.SomeProperty)
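Applied to something shaped like the question's data, that might look like the sketch below. PublicObject is a hypothetical stand-in for the generated spGetPublicObjectsResult class, and the Name property and sample values are made up. It also shows one way to get the whole object holding the minimum, not only the value.

```vbnet
Imports System
Imports System.Linq

Module MinDemo
    ' Hypothetical stand-in for the LINQ to SQL generated result class.
    Class PublicObject
        Public Property colMyValue As Integer
        Public Property Name As String
    End Class

    Sub Main()
        Dim items = {
            New PublicObject With {.colMyValue = 7, .Name = "a"},
            New PublicObject With {.colMyValue = 3, .Name = "b"},
            New PublicObject With {.colMyValue = 9, .Name = "c"}
        }

        ' Just the smallest value of the property.
        Dim smallest = items.Min(Function(x) x.colMyValue)
        Console.WriteLine(smallest)             ' prints 3

        ' The object that holds the smallest value.
        Dim rowWithSmallest = items.OrderBy(Function(x) x.colMyValue).First()
        Console.WriteLine(rowWithSmallest.Name) ' prints b
    End Sub
End Module
```

Note that Min throws on an empty sequence of non-nullable values, so check for results first if the stored procedure can return nothing.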

Do you choose Linq over Forloops?

Given a datatable containing two columns like this:
Private Function CreateDataTable() As DataTable
Dim customerTable As New DataTable("Customers")
customerTable.Columns.Add(New DataColumn("Id", GetType(System.Int32)))
customerTable.Columns.Add(New DataColumn("Name", GetType(System.String)))
Dim row1 = customerTable.NewRow()
row1.Item("Id") = 1
row1.Item("Name") = "Customer 1"
customerTable.Rows.Add(row1)
Dim row2 = customerTable.NewRow()
row2.Item("Id") = 2
row2.Item("Name") = "Customer 2"
customerTable.Rows.Add(row2)
Dim row3 = customerTable.NewRow()
row3.Item("Id") = 3
row3.Item("Name") = "Customer 3"
customerTable.Rows.Add(row3)
Return customerTable
End Function
Would you use this snippet to retrieve a List(Of Integer) containing all Id's:
Dim table = CreateDataTable()
Dim list1 As New List(Of Integer)
For i As Integer = 0 To table.Rows.Count - 1
list1.Add(CType(table.Rows(i)("Id"), Integer))
Next
Or rather this one:
Dim list2 = (From r In table.AsEnumerable _
Select r.Field(Of Integer)("Id")).ToList()
This is not a question about whether to type cast the Id column to Integer by using .Field(Of Integer), CType, CInt, DirectCast or whatever but generally about whether or not you choose Linq over forloops as the subject implies.
For those who are interested: I ran some iterations with both versions which resulted in the following performance graph:
graph http://dnlmpq.blu.livefilestore.com/y1pOeqhqQ5neNRMs8YpLRlb_l8IS_sQYswJkg17q8i1K3SjTjgsE4O97Re_idshf2BxhpGdgHTD2aWNKjyVKWrQmB0J1FffQoWh/analysis.png?psid=1
The vertical axis shows the milliseconds it took the code to convert the rows' ids into a generic list with the number of rows shown on the horizontal axis. The blue line resulted from the imperative approach (forloop), the red line from the declarative code (linq).
Whatever way you generally choose: Why do you go that way and not the other?
Whenever possible I favor the declarative way of programming instead of imperative. When you use a declarative approach the CLR can optimize the code based on the characteristics of the machine. For example if it has multiple cores it could parallelize the execution while if you use an imperative for loop you are basically locking this possibility. Today maybe there's no big difference but I think that in the future more and more extensions like PLINQ will appear allowing better optimization.
I avoid linq unless it helps readability a lot, because it completely destroys edit-and-continue.
When they fix that, I will probably start using it more, because I do like the syntax a lot for some things.
For almost everything I've done I've come to the conclusion that LINQ is optimized enough. If I handcrafted a for loop it would have better performance, but in the grand scheme of things we are usually talking milliseconds. Since I rarely have a situation where those milliseconds will make any kind of impact, I find it's much more important to have readable code with clear intentions. I would much rather have a call that is 50ms slower than have someone come along and break it altogether!
Resharper has a cool feature that will flag and convert loops into Linq expressions. I will flip it to the Linq version and see if that hurts or helps readability. If the Linq expression more clearly communicates the intent of the code, I will go with that. If the Linq expression is unreadable, I will flip back to the foreach version.
Most of the performance issues don't really compare with readability for me.
Clarity trumps cleverness.
In the above example, I would go with the Linq version since it clearly explains the intent and also locks out people accidentally adding side effects in the loop.
I recently found myself wondering whether I've been totally spoiled by LINQ. Yes, I now use it all the time to pick all sort of things out from all sort of collections.
I started to, but found out in some cases, I saved time by using this approach:
for (var i = 0, len = list.Count; i < len; i++) { .. }
Not necessarily in all cases, but some. Most extension methods use the foreach approach of querying.
I try to follow these rules:
Whenever I'm just querying (filtering, projecting, ...) collections, use LINQ.
As soon as I'm actually 'doing' something with the result (i.e, introduce side effects), I'll use a for loop.
So in this example, I'll use LINQ.
Also, I always try to split up the 'query definition' from the 'query evaluation':
Dim query = From r In table.AsEnumerable()
Select r.Field(Of Integer)("Id")
Dim result = query.ToList()
This makes it clear when that (in this case in-memory) query will be evaluated.
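A small sketch of why the split matters, using a plain list of made-up Ids instead of the DataTable: the query is only a definition until ToList() runs, so an element added after the definition but before the evaluation still shows up in the result.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DeferredDemo
    Sub Main()
        Dim ids As New List(Of Integer) From {1, 2, 3}

        ' Query definition: nothing is enumerated yet.
        Dim query = From id In ids
                    Where id > 1
                    Select id

        ' Changing the source before evaluation affects the result.
        ids.Add(4)

        ' Query evaluation: the list is walked here, once.
        Dim result = query.ToList()
        Console.WriteLine(String.Join(",", result.Select(Function(n) n.ToString()))) ' prints 2,3,4
    End Sub
End Module
```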