The fastest way to query a vb array for datetimes within two boundaries - vb.net

I am writing a function to increase the time scale of raw calculation data from a time density of about two minutes to five minutes (and other, larger scales afterwards). There are over 100k data points held in an array that isn't in chronological order. I am looking for the fastest way to query the array and find the data between two datetimes. As the code runs, every data point only needs to be used once, but it will have to be read several times because the data is not in order. I have several ideas of how to do this:
Just look at all of the time values in the array to check whether they are within the two datetimes given. This will force the code to run through the entire array for each new time point, ~50k times.
Create a Boolean alongside my time data in the array that becomes True once the value has been used. The code would check that Boolean before doing the datetime comparison, which should be faster.
Reorganize the array into chronological order. I am not sure how long sorting by datetime would take. It would greatly increase the time required to import the data in the first place, but it could make the scaling query much faster. Does anyone have a rough idea of the ratio between the time needed to reorder the array and the time needed to run the query on unordered data?
Any other suggestions are welcome.
I will add some code if people feel it is necessary. Thanks in advance.
EDIT: A few examples as requested.
Here are the definitions of the arrays:
Dim ScaleDate(0) As Date
Dim ScaleData(0) As Double
I use ReDim Preserve as the data is added to them from an SQL query.
Here is an example of a datetime point copied from the array.
(0) = #2/12/2012 12:01:36 AM#

First, as Tim Schmelter recommended, I would use a List(Of T) instead of an array. It will likely be more efficient and will definitely be easier to work with. Second, I would recommend defining your own type which stores all the data for a single item rather than storing each property for the item in a separate list. Doing so will make it easier to modify in the future, but it will also be more efficient because you'll only need to resize one list rather than two:
Public Class MyItem
Public Property ScaleDate() As Date
Get
Return _scaleDate
End Get
Set(ByVal value As Date)
_scaleDate = value
End Set
End Property
Private _scaleDate As Date
Public Property ScaleData() As Double
Get
Return _scaleData
End Get
Set(ByVal value As Double)
_scaleData = value
End Set
End Property
Private _scaleData As Double
End Class
Private _myItems As New List(Of MyItem)()
It's hard to say which will be faster, sorting the list or searching through it. It all depends how big it is, how often it's changed, and how often you search it. So, I would recommend trying both options and seeing for yourself which works better in your scenario.
For sorting, if you have your own type, you could simply make it implement IComparable(Of T) and then call the Sort method on the list:
Public Class MyItem
Implements IComparable(Of MyItem)
Public Property ScaleDate() As Date
Get
Return _scaleDate
End Get
Set(ByVal value As Date)
_scaleDate = value
End Set
End Property
Private _scaleDate As Date
Public Property ScaleData() As Double
Get
Return _scaleData
End Get
Set(ByVal value As Double)
_scaleData = value
End Set
End Property
Private _scaleData As Double
Public Function CompareTo(ByVal other As MyItem) As Integer Implements IComparable(Of MyItem).CompareTo
Return ScaleDate.CompareTo(other.ScaleDate)
End Function
End Class
Private _myItems As New List(Of MyItem)()
'To sort the list after it's been modified:
_myItems.Sort()
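If the two parallel arrays from the question are kept instead of a list, a minimal sketch of sorting them together could look like the following. It relies on the Array.Sort overload that takes a keys array and an items array; the sample values and the module name are made up for illustration.

```vbnet
Imports System

Module ParallelSortSketch
    Sub Main()
        ' Hypothetical sample data in the same shape as the question's arrays
        Dim scaleDate() As Date = {#2/12/2012 12:05:10 AM#, #2/12/2012 12:01:36 AM#, #2/12/2012 12:03:20 AM#}
        Dim scaleData() As Double = {5.0, 1.0, 3.0}

        ' Sorts scaleDate ascending and applies the same reordering to scaleData,
        ' so each value stays paired with its timestamp.
        Array.Sort(scaleDate, scaleData)

        For i As Integer = 0 To scaleDate.Length - 1
            Console.WriteLine(scaleDate(i).ToString("o") & " -> " & scaleData(i).ToString())
        Next
    End Sub
End Module
```

This keeps the import code unchanged and adds a single sort call once loading has finished.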
You'd want to sort the list only once each time it is modified. You wouldn't want to sort it every time you search through the list. Also, sorting it, in and of itself, doesn't make searching it front-to-back any faster, so you would want to implement a find method which quickly searches through a sorted list. For instance, something along these lines should work:
Private Function FindIndex(ByVal startDate As Date) As Integer
Return FindIndex(startDate, 0, _myItems.Count - 1)
End Function
'Binary search: returns the index of the first item whose ScaleDate is on or after startDate
'(or _myItems.Count if there is no such item)
Private Function FindIndex(ByVal startDate As Date, ByVal startIndex As Integer, ByVal endIndex As Integer) As Integer
If endIndex >= startIndex Then
Dim midIndex As Integer = ((endIndex - startIndex) \ 2) + startIndex
If _myItems(midIndex).ScaleDate < startDate Then
'Everything up to and including midIndex is too early, so search the upper half
Return FindIndex(startDate, midIndex + 1, endIndex)
Else
'midIndex is a candidate, but an earlier item could match as well
Return FindIndex(startDate, startIndex, midIndex - 1)
End If
Else
Return startIndex
End If
End Function
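Once the data is sorted, a range query only needs to locate the first item on or after the start date and then walk forward until it passes the end date. The following is a minimal sketch of that idea using a plain List(Of Date) rather than MyItem; the names LowerBound and DatesInRange and the sample dates are assumptions made for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module SortedRangeSketch
    ' Returns the index of the first date >= target in a sorted list,
    ' or dates.Count when every date is earlier than target.
    Function LowerBound(ByVal dates As List(Of Date), ByVal target As Date) As Integer
        Dim low As Integer = 0
        Dim high As Integer = dates.Count
        While low < high
            Dim middle As Integer = low + (high - low) \ 2
            If dates(middle) < target Then
                low = middle + 1
            Else
                high = middle
            End If
        End While
        Return low
    End Function

    ' Collects every date in the inclusive range [startDate, endDate] from a sorted list.
    Function DatesInRange(ByVal dates As List(Of Date), ByVal startDate As Date, ByVal endDate As Date) As List(Of Date)
        Dim matches As New List(Of Date)()
        Dim i As Integer = LowerBound(dates, startDate)
        While i < dates.Count AndAlso dates(i) <= endDate
            matches.Add(dates(i))
            i += 1
        End While
        Return matches
    End Function

    Sub Main()
        Dim dates As New List(Of Date) From {
            #2/12/2012 12:09:00 AM#,
            #2/12/2012 12:01:36 AM#,
            #2/12/2012 12:05:10 AM#,
            #2/12/2012 12:03:20 AM#
        }
        dates.Sort() ' sort once, after loading

        ' Five-minute window starting at midnight
        For Each d As Date In DatesInRange(dates, #2/12/2012 12:00:00 AM#, #2/12/2012 12:05:00 AM#)
            Console.WriteLine(d.ToString("o"))
        Next
    End Sub
End Module
```

Each query then costs one binary search plus the number of matches, instead of a pass over all 100k points.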
For searching through an unsorted list, I would simply loop through the whole list front-to-back and create a new list of all the matching items:
Dim matches As New List(Of MyItem)()
For Each item As MyItem In _myItems
If (item.ScaleDate >= startDate) AndAlso (item.ScaleDate <= endDate) Then
matches.Add(item)
End If
Next
Alternatively, if the dates on these items are mostly sequential without giant gaps between them, it may be worth using a Dictionary(Of Date, List(Of MyItem)) object to store your list of items. This would contain separate lists of items for each date, all stored in a hash table. So, to get or set a list of items for a particular day would be very fast, but to get a list of all the items in a date range, you'd have to loop through every day in the date range and get the list for that day from the dictionary and combine them into one list of matches:
Dim _days As New Dictionary(Of Date, List(Of MyItem))()
'You'd need to loop through and add each item with code like this:
Private Sub AddItem(ByVal item As MyItem)
'Key on the date part only, so that every item from the same day shares one list
Dim dayKey As Date = item.ScaleDate.Date
Dim dayItems As List(Of MyItem) = Nothing
_days.TryGetValue(dayKey, dayItems)
If dayItems Is Nothing Then
dayItems = New List(Of MyItem)()
_days(dayKey) = dayItems
End If
dayItems.Add(item)
End Sub
'And then to find all the items in a date range, you could do something like this:
'(this returns whole days, so filter on ScaleDate afterwards if the boundaries include a time of day)
Private Function FindItemsInRange(ByVal startDate As Date, ByVal endDate As Date) As List(Of MyItem)
Dim matches As New List(Of MyItem)()
Dim i As Date = startDate.Date
While i <= endDate.Date
Dim dayItems As List(Of MyItem) = Nothing
_days.TryGetValue(i, dayItems)
If dayItems IsNot Nothing Then
matches.AddRange(dayItems)
End If
i = i.AddDays(1)
End While
Return matches
End Function

Related

Combining two list, only records with 1 specific unique property

I'm combining two lists in Visual Basic. These lists are of a custom object. The only records I want to add are the ones with a property that doesn't match any other object in the list so far. I've got it running. However, the first list is just 1.247 records. The second list, however, is just short of 27.000.000 records. The last time I successfully merged the two lists with this restriction, it took over 5 hours.
Usually I code in C#. I had a similar problem there once, and solved it with the Any function. It worked perfectly and really fast. So, as you can see in the code, I tried that here too. However, it takes way too long.
Private Function combineLists(list As List(Of Record), childrenlist As List(Of Record)) As List(Of Record) 'list is about 1.250 entries, childrenlist about 27.000.000
For Each r As Record In childrenlist
Dim dublicate As Boolean = list.Any(Function(record) record.materiaalnummerInfo = r.materiaalnummerInfo)
If Not dublicate Then
list.Add(r)
End If
Next
Return list
End Function
The object Record looks like this ( I wasn't sure how to make a custom object in VB, and this looks bad, but it worked):
Public Class Record
Dim materiaalnummer As String
Dim type As String 'Config or prefered
Dim materiaalstatus As String
Dim children As New List(Of String)
Public Property materiaalnummerInfo()
Get
Return materiaalnummer
End Get
Set(value)
materiaalnummer = value
End Set
End Property
Public Property typeInfo()
Get
Return type
End Get
Set(value)
type = value
End Set
End Property
Public Property materiaalstatusInfo()
Get
Return materiaalstatus
End Get
Set(value)
materiaalstatus = value
End Set
End Property
Public Property childrenInfo()
Get
Return children
End Get
Set(value)
children = value
End Set
End Property
End Class
I was hoping that someone could point me in the right direction to shorten the time needed. Thank you in advance.
I'm not 100% sure what you want the output to be (all differences, or just the ones from the larger list, etc.), but I would definitely try to do it with LINQ, which is basically SQL for VB.NET data. It would be something similar to this:
'Note: Except compares Record objects by reference unless Record overrides Equals and GetHashCode
Dim differenceQuery = list.Except(childrenlist)
Console.WriteLine("The following records are in list but not childrenlist")
' Execute the query.
For Each r As Record In differenceQuery
Console.WriteLine(r.materiaalnummerInfo)
Next
Also, as a side note, I would suggest not calling one of the lists "list", as it is bad practice and the name is already in use in the framework (List(Of T)).
EDIT
Please try this then let me know what results come back.
Private Function combineLists(list As List(Of Record), childrenlist As List(Of Record)) As List(Of Record) 'list is about 1.250 entries, childrenlist about 27.000.000
list.AddRange(childrenlist) 'combines both lists
Dim result = (From v In list Select v.materiaalnummerInfo Distinct).ToList()
'result should now hold every distinct materiaalnummerInfo value (the numbers themselves, not the Record objects).
End Function
Or don't combine them if you don't want to.
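Another option worth measuring is a hash set of the material numbers seen so far, which replaces the linear Any scan per child with a near constant-time lookup. The sketch below is an assumption-laden illustration rather than the asker's code: Record is a simplified stand-in with a single typed property, and the names CombineLists and Materiaalnummer are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic

Module CombineSketch
    ' Simplified stand-in for the question's Record class (hypothetical shape)
    Class Record
        Public Property Materiaalnummer As String
    End Class

    Function CombineLists(ByVal first As List(Of Record), ByVal children As List(Of Record)) As List(Of Record)
        ' Remember every material number already present in the first list.
        Dim seen As New HashSet(Of String)()
        For Each r As Record In first
            seen.Add(r.Materiaalnummer)
        Next

        ' HashSet.Add returns False when the value is already in the set,
        ' so a child is appended only the first time its number appears.
        For Each r As Record In children
            If seen.Add(r.Materiaalnummer) Then
                first.Add(r)
            End If
        Next
        Return first
    End Function

    Sub Main()
        Dim first As New List(Of Record) From {
            New Record With {.Materiaalnummer = "A"},
            New Record With {.Materiaalnummer = "B"}
        }
        Dim children As New List(Of Record) From {
            New Record With {.Materiaalnummer = "B"},
            New Record With {.Materiaalnummer = "C"},
            New Record With {.Materiaalnummer = "C"}
        }
        Dim combined As List(Of Record) = CombineLists(first, children)
        For Each r As Record In combined
            Console.WriteLine(r.Materiaalnummer)
        Next
    End Sub
End Module
```

This keeps the same rule as the original loop (a record is added only if no record so far shares its number), but it does one set lookup per child instead of scanning the growing list.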

VB.NET copy a 2-dimenion List into a one dimension list

In my original code I have a list object containing 2 columns, Word and Percent. I sort the list but only want to return the list containing just the Word
Here is some example code broken down into something simple:
Public Function SortMyWords() as list(of string)
Dim Words As WordsToSort
Dim ListofWords As New List(Of WordsToSort)
Words.Word = "John"
Words.Percent = "10"
ListofWords.Add(Words)
Words.Word = "Robert"
Words.Percent = "1"
ListofWords.Add(Words)
ListofWords = ListofWords.OrderBy(Function(x) x.Percent).ToList()
End Function
Public Structure WordsToSort
Public Word As String
Public Percent As String
Public Sub New(ByVal _word As String, ByVal _percent As String)
Word = _word
Percent = _percent
End Sub
End Structure
At the end of the SortMyWords function, I want to return just the Word column as a list. I'm not sure whether I can do this directly, i.e.
Return Listofwords(column Word), or whether I need to copy my ListofWords into a new list containing just the Word column, something like this (which doesn't work):
Dim Newlist As New List(Of String)
Newlist.AddRange(ListofWords(Words.Word))
Return NewList
Any suggestions on whether I should do this completely differently (and better) would be really appreciated as I am trying to get my head around objects and although I use them all the time, I'm new to structures and list objects.
Thanks for any help, this has been driving me crazy for an hour now.
I think you're close. Try:
Return ListOfWords.
OrderBy(Function(x) x.Percent).
Select(Function(x) x.Word).
ToList()
If you prefer, you can also use the LINQ syntax:
Return (From w In ListOfWords
Order By w.Percent Ascending
Select w.Word).ToList()
Note that the return type is a List(Of String) and not a List(Of WordsToSort) anymore. So you cannot assign it back to the variable ListOfWords again like you do in your sample code.
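One detail to watch for: Percent is declared as a String in the question, so ordering by it compares text ("10" sorts before "2"). A minimal sketch of ordering numerically instead, assuming every Percent value parses as an integer (the third sample word is made up for illustration):

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SortWordsSketch
    Structure WordsToSort
        Public Word As String
        Public Percent As String
        Public Sub New(ByVal wordText As String, ByVal percentText As String)
            Word = wordText
            Percent = percentText
        End Sub
    End Structure

    Function SortMyWords() As List(Of String)
        Dim listOfWords As New List(Of WordsToSort) From {
            New WordsToSort("John", "10"),
            New WordsToSort("Robert", "1"),
            New WordsToSort("Alice", "2")
        }
        ' Parse Percent so that 2 sorts before 10, then keep only the words.
        Return listOfWords.
            OrderBy(Function(x) Integer.Parse(x.Percent)).
            Select(Function(x) x.Word).
            ToList()
    End Function

    Sub Main()
        ' Prints Robert, Alice, John (one per line)
        For Each w As String In SortMyWords()
            Console.WriteLine(w)
        Next
    End Sub
End Module
```

Declaring Percent as an Integer in the structure would avoid the parsing step altogether.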

How to find the largest integer(s) between 3 integers

I would like to find the largest integer(s) between 3 integers.
I could do this by nesting If statements. Since I have further code to write however this would be long and untidy.
I was wondering if there was an easier way to find the largest integer(s) (including if let's say A and B are equal but both higher than C).
P.S Can you do this with 2-D arrays?
Use LINQ to do this:
Dim numbers() As Integer = {1, 3, 5}
Dim max As Integer = numbers.Max()
Debug.Write("Max number in numbers() is " & max.ToString())
Output:
Max number in numbers() is 5
Edited as per conversation with OP on wanting to know which genre was ranked the best.
When asked How do you get the data? OP responds with:
I have a text file containing movie|genre on every line. I read this and count which genre (out of 3) is the highest.
I have drafted up some code which reads from a text file and populates a class.
First let me show you the code:
Dim myFilms As New Films
Using sr As New IO.StreamReader("C:\films.txt")
Do Until sr.Peek = -1
Dim columns As String() = sr.ReadLine().Split(New Char() {"|"c}, StringSplitOptions.RemoveEmptyEntries)
'columns(0) = film name
'columns(1) = genre
myFilms.Add(New Film(columns(0), columns(1)))
Loop
End Using
If myFilms.Count > 0 Then
Dim bestGenre = myFilms.GetBestGenre()
'Go off and read the genre file based on bestGenre
End If
From the above code you can see the class Films being populated with a new Film. I then call a method from the Films class, but only if there are films to choose from. Let me show you the class structure for both these:
Film:
Public Class Film
Public Key As String
Public Sub New(ByVal filmName As String,
ByVal genre As String)
_filmName = filmName
_genre = genre
End Sub
Private _filmName As String
Public ReadOnly Property FilmName As String
Get
Return _filmName
End Get
End Property
Private _genre As String
Public ReadOnly Property Genre As String
Get
Return _genre
End Get
End Property
End Class
Films:
Public Class Films
Inherits KeyedCollection(Of String, Film)
Protected Overrides Function GetKeyForItem(ByVal item As Film) As String
Return item.Key
End Function
Public Function GetBestGenre() As String
Return Me.GroupBy(Function(r) r.Genre).OrderByDescending(Function(g) g.Count()).First().Key
End Function
End Class
I must note that although this code does work it may come unstuck if you have 2 or more genres which are joint top. The code still works however it only returns one of the genres. You may want to expand on the code to suit your needs based on that scenario.
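For the joint-top case, one possible extension is to return every genre that shares the highest count. This is only a sketch working on a plain sequence of genre strings; the function name GetBestGenres and the sample data are assumptions, not part of the classes above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module BestGenresSketch
    ' Returns every genre that shares the highest count (empty list for no input).
    Function GetBestGenres(ByVal genres As IEnumerable(Of String)) As List(Of String)
        Dim groups As List(Of IGrouping(Of String, String)) =
            genres.GroupBy(Function(g) g).ToList()
        If groups.Count = 0 Then
            Return New List(Of String)()
        End If

        Dim best As Integer = groups.Max(Function(grp) grp.Count())
        Return groups.
            Where(Function(grp) grp.Count() = best).
            Select(Function(grp) grp.Key).
            ToList()
    End Function

    Sub Main()
        Dim genres As New List(Of String) From {"Action", "Comedy", "Action", "Comedy", "Drama"}
        ' Action and Comedy both appear twice, so both are returned.
        For Each g As String In GetBestGenres(genres)
            Console.WriteLine(g)
        Next
    End Sub
End Module
```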
Try something like this:
Dim max As Integer
max = integer1
If integer2 > max Then
max = integer2
End If
If integer3 > max Then
max = integer3
End If
Not many more ways that I can think of off the top of my head to do this.
Something along these lines will work for any number of integers.
Put the numbers into an array then use a For[...]Next statement to loop through the array comparing the current member with max. If max is lower, set it to the current member. When the loop terminates, max will contain the highest number:
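Dim nums() As Integer = {1, 2, 3}
'Start from the first element so that arrays containing only negative numbers work too
Dim max As Integer = nums(0)
For i As Integer = 1 To nums.Length - 1
If max < nums(i) Then
max = nums(i)
End If
Next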
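To also report which of the three integers hold the maximum (for example when A and B are equal and both higher than C), one possible sketch is to compute the maximum first and then compare each value against it. The labels "A", "B" and "C" and the function name are assumptions for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module LargestOfThreeSketch
    ' Returns the labels of all values that equal the maximum of the three.
    Function LargestNames(ByVal a As Integer, ByVal b As Integer, ByVal c As Integer) As List(Of String)
        Dim max As Integer = Math.Max(a, Math.Max(b, c))
        Dim names As New List(Of String)()
        If a = max Then names.Add("A")
        If b = max Then names.Add("B")
        If c = max Then names.Add("C")
        Return names
    End Function

    Sub Main()
        ' A and B tie for the largest value, so this prints "A, B"
        Console.WriteLine(String.Join(", ", LargestNames(7, 7, 3)))
    End Sub
End Module
```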

Visual Basic: loaded parallel list boxes with text file substrings, but now items other than lstBox(0) "out of bounds"

The text file contains lines with the year followed by population like:
2016, 322690000
2015, 320220000
etc.
I separated the lines substrings to get all the years in a list box, and all the population amounts in a separate listbox, using the following code:
Dim strYearPop As String
Dim intYear As Integer
Dim intPop As Integer
strYearPop = popFile.ReadLine()
intYear = CInt(strYearPop.Substring(0, 4))
intPop = CInt(strYearPop.Substring(5))
lstYear.Items.Add(intYear)
lstPop.Items.Add(intPop)
Now I want to add the population amounts together, using the .Items to act as an array.
Dim intPop1 As Integer
intPop1 = lstPop.Items(0) + lstPop.Items(1)
But I get an error on lstPop.Items(1) and any item other than lstPop.Items(0), due to out of range. I understand the concept of out of range, but I thought that I create an index of several items (about 117 lines in the file, so the items indices should go up to 116) when I populated the list box.
How do I populate the list box in a way that creates an index of list box items (similar to an array)?
[I will treat this as an XY problem - please consider reading that after reading this answer.]
What you are missing is the separation of the data from the presentation of the data.
It is not a good idea to use controls to store data: they are meant to show the underlying data.
You could use two arrays for the data, one for the year and one for the population count, or you could use a Class which has properties of the year and the count. The latter is more sensible, as it ties the year and count together in one entity. You can then have a List of that Class to make a collection of the data, like this:
Option Infer On
Option Strict On
Imports System.IO
Public Class Form1
Public Class PopulationDatum
Property Year As Integer
Property Count As Integer
End Class
Function GetData(srcFile As String) As List(Of PopulationDatum)
Dim data As New List(Of PopulationDatum)
Using sr As New StreamReader(srcFile)
While Not sr.EndOfStream
Dim thisLine = sr.ReadLine
Dim parts = thisLine.Split(","c)
If parts.Length = 2 Then
data.Add(New PopulationDatum With {.Year = CInt(parts(0).Trim()), .Count = CInt(parts(1).Trim())})
End If
End While
End Using
Return data
End Function
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
Dim srcFile = "C:\temp\PopulationData.txt"
Dim popData = GetData(srcFile)
Dim popTotal = 0
For Each p In popData
lstYear.Items.Add(p.Year)
lstPop.Items.Add(p.Count)
popTotal = popTotal + p.Count
Next
' popTotal now has the value of the sum of the populations
End Sub
End Class
If using a List(Of T) is too much, then just use the idea of separating the data from the user interface. It makes processing the data much simpler.

Windows forms CheckedListBox issue

I am working on a desktop application developed in vb.net. I am trying to select the items in a checkedlistbox depending on the values I get from database. Below is the code to populate the checkedlistboxes
Private Sub LoadDisapprovalList()
cblFedralReasons.Items.Clear()
cblStateReasons.Items.Clear()
cblFedralReasons.DataSource = Main.DataClient.DisapprovalReasonList_Get(FedralReason)
cblFedralReasons.DisplayMember = "DisapprovalReasonTypeDesc"
cblFedralReasons.ValueMember = "DisapprovalReasonTypeGenId"
cblStateReasons.DataSource = Main.DataClient.DisapprovalReasonList_Get(StateReason)
cblStateReasons.DisplayMember = "DisapprovalReasonTypeDesc"
cblStateReasons.ValueMember = "DisapprovalReasonTypeGenId"
End Sub
After that I am trying to select the items based on the values from database. Here is the code
Private Sub LoadApplicationDisapprovalReasons()
Dim lstApplicationDisapprovalReasons As New List(Of DataService.usp_ApplicationDisapprovalReason_Get_Result)
lstApplicationDisapprovalReasons = Main.DataClient.ApplicationDisapprovalReason_Get(_SeqID)
If lstApplicationDisapprovalReasons.Count > 0 Then
For Each item In lstApplicationDisapprovalReasons
Dim selectedDisapprovalId As Integer = item.DisapprovalReasonTypeGenId
Select Case item.DisapprovalReasonType
Case FedralReason
Dim selectedIndex = cblFedralReasons.Items.IndexOf(selectedDisapprovalId)
cblFedralReasons.SetItemCheckState(selectedIndex, CheckState.Checked)
Case StateReason
Dim selectedIndex = cblStateReasons.Items.IndexOf(selectedDisapprovalId)
cblStateReasons.SetItemCheckState(selectedIndex, CheckState.Checked)
End Select
Next
End If
End Sub
But the problem is cblFedralReasons.Items.IndexOf always returns -1. All the data from database is coming correctly but something weird happening with checkedlistbox which I couldn't understand.
EDIT:
Also when I try to get the text of an item by using the following code it returns me name of my collections instead of the text.
cblFedralReasons.items(1).tostring
It returns
DisapprovalReasonList
and not the text of that item!
I'll try to explain what I think is going on:
If cblFedralReasons has a List(Of DataService.usp_DisapprovalReasonList) as its DataSource and you search for selectedDisapprovalId via IndexOf by passing an Integer, then the returned value of -1 makes sense.
IndexOf on a collection internally performs an Equals comparison, so you are comparing different types: an Integer against a DataService.usp_DisapprovalReasonList.
There are many ways to get the correct object from the collection.
One idea would be to override Object.Equals in your class:
Public Overrides Function Equals(ByVal p_oAnotherObject As Object) As Boolean
If TypeOf p_oAnotherObject Is DataService.usp_DisapprovalReasonList AndAlso Me.GetType.Equals(p_oAnotherObject.GetType) Then
Return Me.DisapprovalReasonTypeGenId.Equals(DirectCast(p_oAnotherObject, DataService.usp_DisapprovalReasonList).DisapprovalReasonTypeGenId)
Else
Return False
End If
End Function
Assuming you have a constructor accepting an ID, you now can do this:
cblFedralReasons.Items.IndexOf(New DataService.usp_DisapprovalReasonList(selectedDisapprovalId))
and then you will get the correct index.
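If changing the class is not an option, another approach is to loop over the items and compare the ID property directly. The sketch below works on any IList (the Items collection of a list box implements IList); the Reason class is a hypothetical stand-in for the bound item type, and IndexOfId is an illustrative name.

```vbnet
Imports System
Imports System.Collections
Imports System.Collections.Generic

Module IndexByIdSketch
    ' Hypothetical stand-in for the bound item type
    Class Reason
        Public Property DisapprovalReasonTypeGenId As Integer
        Public Property DisapprovalReasonTypeDesc As String
    End Class

    ' Returns the index of the first item whose ID matches, or -1 if none does.
    Function IndexOfId(ByVal items As IList, ByVal id As Integer) As Integer
        For i As Integer = 0 To items.Count - 1
            Dim candidate As Reason = TryCast(items(i), Reason)
            If candidate IsNot Nothing AndAlso candidate.DisapprovalReasonTypeGenId = id Then
                Return i
            End If
        Next
        Return -1
    End Function

    Sub Main()
        Dim reasons As New List(Of Reason) From {
            New Reason With {.DisapprovalReasonTypeGenId = 10, .DisapprovalReasonTypeDesc = "First"},
            New Reason With {.DisapprovalReasonTypeGenId = 20, .DisapprovalReasonTypeDesc = "Second"}
        }
        Console.WriteLine(IndexOfId(reasons, 20)) ' prints 1
        Console.WriteLine(IndexOfId(reasons, 99)) ' prints -1
    End Sub
End Module
```

The returned index should be checked for -1 before it is passed to SetItemCheckState.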
Finally, with cblFedralReasons.items(1).tostring you are getting the default ToString result, which is the type name. To return the item text instead, override ToString in your class:
Public Overrides Function ToString() As String
Return DisapprovalReasonTypeDesc
End Function
I hope that explains it.