VB.Net - multi-column data variable object - vb.net

I want to create an in-memory object in VB.Net with multiple columns. What I am trying to do is create an index of some data. It will look like:
Row 1: 23 1
Row 2: 5 1
Row 3: 3 38
...
I know I can use a rectangular array to do this, but I want to be able to use IndexOf operations on this object. Is there any such structure in VB.Net?

Define a row class, and then create a list of rows, like so:
Class row
Inherits Collections.ArrayList
End Class
Dim cols As New List(Of row)
Now you can access your objects using an x/y notation:
cols(0)(1)
Note that this is just a simple example: your structure is uninitialized and untyped.
You can also Shadow the IndexOf function in your own class, for example finding the indexOf by an item's name:
Class col
Inherits Generic.List(Of Object)
' Hides List.IndexOf(Object) with a lookup by item name.
' myType stands for your own item class, which exposes a name member.
Shadows Function IndexOf(ByVal itemName As String) As Integer
For i As Integer = 0 To Me.Count - 1
If CType(Me(i), myType).name = itemName Then
Return i
End If
Next
Return -1 ' not found
End Function
End Class
You can then access it like so:
Private cols As New col
cols.IndexOf("lookingfor")

If the number of cells in each row is constant and you don't need to grow or shrink the structure, then a simple two-dimensional array is probably the best choice, because it offers the best possible memory locality. If it is not sorted, you can implement indexOf via a simple linear search.
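As a minimal sketch of that linear search, assuming two Integer columns as in the question's sample rows (the module name and `IndexOfRow` are made up for illustration):

```vbnet
Imports System

Module RectangularIndex
    ' Returns the index of the first row whose two cells match, or -1 if none does.
    Function IndexOfRow(grid As Integer(,), first As Integer, second As Integer) As Integer
        For r As Integer = 0 To grid.GetLength(0) - 1
            If grid(r, 0) = first AndAlso grid(r, 1) = second Then
                Return r
            End If
        Next
        Return -1
    End Function

    Sub Main()
        ' Sample rows taken from the question.
        Dim grid As Integer(,) = {{23, 1}, {5, 1}, {3, 38}}
        Console.WriteLine(IndexOfRow(grid, 3, 38)) ' prints 2
        Console.WriteLine(IndexOfRow(grid, 7, 7))  ' prints -1
    End Sub
End Module
```

Each lookup scans the rows from the top, so the cost grows linearly with the number of rows.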

You can do this with a Dictionary.
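One possible reading of this suggestion, sketched under the assumption that each row is stored as a `Tuple(Of Integer, Integer)` and that you want the position of a row from its contents. `BuildRowIndex` is a hypothetical helper name, not part of the framework:

```vbnet
Imports System
Imports System.Collections.Generic

Module DictionaryIndex
    ' Maps each (column 1, column 2) pair to the index of the first row holding it.
    Function BuildRowIndex(rows As List(Of Tuple(Of Integer, Integer))) As Dictionary(Of Tuple(Of Integer, Integer), Integer)
        Dim index As New Dictionary(Of Tuple(Of Integer, Integer), Integer)()
        For i As Integer = 0 To rows.Count - 1
            ' Keep the first occurrence so the result matches IndexOf semantics.
            If Not index.ContainsKey(rows(i)) Then
                index.Add(rows(i), i)
            End If
        Next
        Return index
    End Function

    Sub Main()
        Dim rows As New List(Of Tuple(Of Integer, Integer)) From {
            Tuple.Create(23, 1), Tuple.Create(5, 1), Tuple.Create(3, 38)
        }
        Dim index As Dictionary(Of Tuple(Of Integer, Integer), Integer) = BuildRowIndex(rows)
        Dim position As Integer
        If index.TryGetValue(Tuple.Create(3, 38), position) Then
            Console.WriteLine(position) ' prints 2
        End If
    End Sub
End Module
```

Tuples compare by value, so a freshly created tuple with the same contents finds the stored key. The dictionary has to be rebuilt or updated whenever rows are added or removed.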

Related

VBA dynamically populate nested data structure

I have a few SQL tables, some of which are linked, that I would like to query once and store locally in a single variable. I can't predict the length of the data ahead of time so I need a dynamic data structure.
Example data I'm querying:
Table 1
NameA
Red
Green
Blue
Table 2
NameA NameB
Red A
Red B
Red C
Blue D
Blue E
Green F
Table 3
NameA NameC
Red One
Blue Two
Blue Three
Blue Four
Blue Five
Green Six
Green Seven
I need to be able to filter and access NameB and NameC based on NameA values. I would prefer a nested dictionary structure where I could query like below:
Table1("0") 'will equal "Red"
Table2("Red")("0") 'will equal "A"
Table2("Blue")("1") 'will equal "E"
Table3("Green")("1") 'will equal "Seven"
'note: point here is data structure, not order of results
I have tried using VBA's nested dictionaries but have been unable to get around the lack of a "deep copy" function. One algorithm I wrote:
With SqlQueryResult
i = 0
Do Until .EOF
Call Table1.Add(CStr(i), .Fields(0).Value)
i = i + 1
.MoveNext
Loop
End With
For Each key In Table1.Keys
Set SqlQueryResult = GetResultsFromQuery("SELECT NameB FROM Table2 WHERE NameA = '" & Table1(key) & "'")
With SqlQueryResult
i = 0
Do Until .EOF
Call TempDict.Add(CStr(i), .Fields(0).Value)
i = i + 1
.MoveNext
Loop
End With
Set Table2(Table1(key)) = TempDict
TempDict.RemoveAll
Next key
Unfortunately assigning a Dict to another Dict only sets a reference and doesn't actually copy over data -- when I delete TempDict, the nested data from Table2 is also removed.
I also can't have a new dictionary per "branch" in the nest structure as I need this data to be available at a module-level scope, and therefore need to define these in the top of the module before program execution.
I've looked at multidimensional dynamic arrays, but these can't be assigned to a parent structure like a dictionary. I also can't predict the size of each of these tables, e.g. Table1 might be 5/20/100/etc. in size, Red may have 2/5/100/etc. results in Table 2, and Blue may have 1/20/etc. results in Table 2. ReDim Preserve can only resize the last dimension of an array.
I've had a brief look at Collections as well, and I am not sure these are viable.
I don't have much experience with classes and I would rather avoid a very involved process. I want it to be easy to add linked and unlinked data (i.e. data linked to Table 1, like Table 2 and 3, vs stand-alone data not related to any other table) to this program should I need to in the future. (My benchmark for "easy" is a pandas dataframe in Python.)
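Here is a simple wrapper class for Scripting.Dictionary that implements a Clone method. The clone is a shallow copy, so it works fine when the values are primitive data types. Early binding as shown requires a reference to Microsoft Scripting Runtime.
Option Explicit
Private Type State
Dict As Scripting.Dictionary
End Type
Private s As State
Private Sub Class_Initialize()
Set s.Dict = New Scripting.Dictionary
End Sub
' Returns a new dictionary holding a copy of every key/value pair.
Public Function Clone() As Scripting.Dictionary
Dim myClone As Scripting.Dictionary
Set myClone = New Scripting.Dictionary
Dim myKey As Variant
For Each myKey In s.Dict
myClone.Add myKey, s.Dict.Item(myKey)
Next
Set Clone = myClone
End Function
Public Property Get Item(ByVal Key As Variant) As Variant
' Objects (such as a nested dictionary) must be returned with Set.
If IsObject(s.Dict.Item(Key)) Then
Set Item = s.Dict.Item(Key)
Else
Item = s.Dict.Item(Key)
End If
End Property
' Let stores plain values; Set stores objects such as a cloned dictionary.
Public Property Let Item(ByVal Key As Variant, ByVal Value As Variant)
s.Dict.Item(Key) = Value
End Property
Public Property Set Item(ByVal Key As Variant, ByVal Value As Variant)
Set s.Dict.Item(Key) = Value
End Property
Public Sub Add(ByVal Key As Variant, ByVal Item As Variant)
s.Dict.Add Key, Item
End Sub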
Provided Table2 and TempDict are instances of this wrapper class, you will now be able to say
Set Table2.Item(Table1.Item(key)) = TempDict.Clone

How to get index with known value as string in arraylist?

I have an ArrayList holding multidimensional values, structured like this:
Structure
(0)=> Mstructure.ABCD
Area (these are value of ABCD)
Name
(1)=> Mstructure.EFGH
Area
Name
I want to know the index of string ABCD.
I tried
Dim myIndex as Integer = Structure.IndexOf("ABCD")
Dim myIndex = Array.IndexOf(Structure.ToArray(), myString)
This returned only -1. I want to get 0 if string is ABCD
EDITED
Structure is defined as ArrayList. I can iterate over it if I use loop e.g
Structure(i).GetType().Name = "ABCD"
I have checked if it exists in the ArrayList
Dim result = Structure.ToArray().Any(Function(x) x.ToString().Contains("ABCD"))
But I want to know the index within the multidimensional ArrayList without looping over it. I want to get the index of Mstructure.ABCD. Mstructure.ABCD has values inside it, but I want to get the index without knowing those values.
(0){Mstructure.ABCD}
(1){Mstructure.EFGH}
(2){Mstructure.IJKL}
I would use collections instead, or create an object that holds several collections, to make this easier. Have you thought of that?
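For context, `ArrayList.IndexOf("ABCD")` returns -1 because it compares each stored object against the String with `Equals`, and none of the elements is a String. A rough sketch of a lookup by type name follows; the `ABCD` and `EFGH` classes are hypothetical stand-ins for the asker's `Mstructure` types, and `IndexOfTypeName` is a made-up helper. It still iterates internally, it just hides the loop:

```vbnet
Imports System
Imports System.Collections
Imports System.Linq

Module TypeNameLookup
    ' Hypothetical stand-ins for the asker's Mstructure.ABCD and Mstructure.EFGH types.
    Class ABCD
        Public Area As Double
        Public Name As String
    End Class

    Class EFGH
        Public Area As Double
        Public Name As String
    End Class

    ' Returns the index of the first element whose type name matches, or -1.
    Function IndexOfTypeName(items As ArrayList, typeName As String) As Integer
        Return items.Cast(Of Object)().ToList().FindIndex(
            Function(x) x IsNot Nothing AndAlso x.GetType().Name = typeName)
    End Function

    Sub Main()
        Dim items As New ArrayList From {New ABCD(), New EFGH()}
        Console.WriteLine(IndexOfTypeName(items, "ABCD")) ' prints 0
        Console.WriteLine(IndexOfTypeName(items, "IJKL")) ' prints -1
    End Sub
End Module
```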

Why does ".Remove" affect all the items in a 2D structure assigned with a list?

I'm currently trying to create a Sudoku solver, and I'm on the step of assigning possible values to a box that is not already occupied. (Background for why I'm doing this: Sudoku is a number game based on a 9x9 grid, and its rules allow boxes that are not yet filled to hold several candidate values while the puzzle is being solved.)
To do this I created a structure, defined it as two dimensional, and populated it with a predefined list of integers using a for-loop.
Now when I tried to remove one integer from the list of a particular item in the two dimensional structure, I found out that all the lists of the items in the structure have had that integer removed. There's probably a simple solution to this, but I've been really struggling to find it. Hope the code below clarifies the somewhat confusing verbal explanation.
Structure Element
Dim PossibleValues As List(Of Integer)
Dim ElementValue As Integer
End Structure
Sub Main()
Dim List as New List(Of Integer)({1,2,3})
Dim TDP(8,8) as Element
For x as integer = 0 to 8
For y as integer = 0 to 8
TDP(x,y).PossibleValues = List
Next
Next
TDP(0,0).PossibleValues.Remove(1)
End Sub
I expected only TDP(0,0) to have a list of "2,3" when I print its list of integers, but when I check other items, e.g. TDP(1,0), their lists are also "2,3".
Look at the assignment here:
TDP(x,y).PossibleValues = List
List(Of T) is a reference type, so this assigns a reference to the same List object to each of the array elements.
If you want each item to have its own list of possible values, you need to either copy the list or create a new list:
Sub Main()
Dim TDP(8,8) as Element
For x as integer = 0 to 8
For y as integer = 0 to 8
TDP(x,y).PossibleValues = New List(Of Integer)({1,2,3})
Next
Next
TDP(0,0).PossibleValues.Remove(1)
End Sub
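The copying variant can be sketched as follows, assuming a template list that every cell should start from. `BuildGrid` is a hypothetical helper; the `Element` structure is the one from the question. `New List(Of Integer)(template)` copies the elements into a fresh list, which is sufficient here because Integer is a value type:

```vbnet
Imports System
Imports System.Collections.Generic

Module PerCellLists
    Structure Element
        Dim PossibleValues As List(Of Integer)
        Dim ElementValue As Integer
    End Structure

    ' Builds a 9x9 grid where every cell owns an independent copy of the template list.
    Function BuildGrid(template As List(Of Integer)) As Element(,)
        Dim grid(8, 8) As Element
        For x As Integer = 0 To 8
            For y As Integer = 0 To 8
                grid(x, y).PossibleValues = New List(Of Integer)(template)
            Next
        Next
        Return grid
    End Function

    Sub Main()
        Dim template As New List(Of Integer)({1, 2, 3})
        Dim grid As Element(,) = BuildGrid(template)
        grid(0, 0).PossibleValues.Remove(1)
        Console.WriteLine(grid(0, 0).PossibleValues.Count) ' prints 2
        Console.WriteLine(grid(1, 0).PossibleValues.Count) ' prints 3
    End Sub
End Module
```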

In VBA, how can I store an entire excel table into an easily accessible array that contains all the relevant child information?

I have a very large excel table that contains these elements:
Name Date StartWorkedAt FinishWorkAt HoursWorked
I am creating a button to manipulate this data, but am having trouble deciding what format to store all the data in. Each person in the list will have multiple dates that they all worked, and so I would like to be able to check the start and finish times for each person for various dates.
I wrote a short script to count how many unique names I have so that I could use a multidimensional array to access the data of each person like so:
workTable[0][0]
So this would ideally give me the start and/or finish time of the first person on the first date that he/she worked on.
The issue I was having was that the data is in various formats. The name is a String, the date is a Date, and hoursWorked is an Integer.
What is an easier way to store the data in VBA so that I can access each person individually and find out what date they worked and when they started and finished?
Use a Class module with properties for each of the columns you need and create a Collection of that class.
For example, create a class module (named say ExcelRow) with the following properties:
Private pName As String
Private pDate As Date
Private pStartWorkedAt As Date
Private pFinishWorkAt As Date
Private pHoursWorked As Integer
You'll need public properties for EACH of these private variables. Here's an example of setting up Get and Let for the pName property. The public vars can be differently named to the private vars:
Public Property Get Name() As String
Name = pName
End Property
Public Property Let Name(Value As String)
pName = Value
End Property
Then you can have a collection and add instances of each class module row to it:
Dim ExcelRows As Collection
Dim Row As ExcelRow
Set ExcelRows = New Collection
Set Row = New ExcelRow
Row.Name = "Joe"
Row.HoursWorked = 3
ExcelRows.Add Row
Set Row = New ExcelRow
Row.Name = "Sam"
Row.HoursWorked = 54
ExcelRows.Add Row
'Or you could use a For Loop for this
If your table has a unique identifier, consider using that one as the key for the collection item for easy access to the data rows later on. Ash's last line of code would change to:
ExcelRows.Add Row, Row.Name 'given that Name is unique across the table rows
Later you can access the data with:
x = ExcelRows("Joe").HoursWorked '= 3

Simplest/fastest way to check if value exists in DataTable in VB.net?

I have a DataTable (currently with multiple columns but I could just grab one column if it makes it easier). I want to check if a String value exists in a column of the DataTable. (I'm doing it many times so I want it to be reasonably fast.)
What is a good way to do this? Iterating through the DataTable rows each time seems like a bad way. Can I convert the column to a flat List/Array format, and use a built in function? Something like myStrList.Contains("value")?
You can use Select to find out whether that value exists. It returns the matching rows, or an empty array if there are none. Here is some sample code to help you.
Dim foundRow() As DataRow
foundRow = dt.Select("SalesCategory='HP'")
If the data in your DataTable doesn't change very often, and you search the DataTable multiple times, and your DataTable contains many rows, then it's likely going to be a lot faster to build your own index for the data.
The simplest way to do this is to sort the data by the key column so that you can then do a binary search on the sorted list. For instance, you can build an index like this:
Private Function BuildIndex(table As DataTable, keyColumnIndex As Integer) As List(Of String)
Dim index As New List(Of String)(table.Rows.Count)
For Each row As DataRow in table.Rows
index.Add(row(keyColumnIndex).ToString())
Next
index.Sort()
Return index
End Function
Then, you can check if a value exists in the index quickly with a binary search, like this:
Private Function ItemExists(index As List(Of String), key As String) As Boolean
' BinarySearch returns a negative number when the key is not in the sorted list.
Return index.BinarySearch(key) >= 0
End Function
You could also do the same thing with a simple string array. Or, you could use a Dictionary object (which is an implementation of a hash table) to build a hash index of your DataTable, for instance:
Private Function BuildIndex(table As DataTable, keyColumnIndex As Integer) As Dictionary(Of String, DataRow)
Dim index As New Dictionary(Of String, DataRow)(table.Rows.Count)
For Each row As DataRow in table.Rows
index(row(keyColumnIndex).ToString()) = row
Next
Return index
End Function
Then, you can get the matching DataRow for a given key, like this:
Dim index As Dictionary(Of String, DataRow) = BuildIndex(myDataTable, myKeyColumnIndex)
Dim row As DataRow = Nothing
If index.TryGetValue(myKey, row) Then
' row was found, can now use row variable to access all the data in that row
Else
' row with that key does not exist
End If
You may also want to look into using either the SortedList or SortedDictionary class. SortedDictionary is implemented as a binary search tree, while SortedList keeps its keys in a sorted array and finds them with a binary search. It's hard to say which of all of these options is going to be fastest in your particular scenario. It all depends on the type of data, how often the index needs to be re-built, how often you search it, how many rows are in the DataTable, and what you need to do with the found items. The best thing to do would be to try each one in a test case and see which one works best for what you need.
You should use row filter or DataTable.Rows.Find() instead of select (select does not use indexes). Depending on your table structure, specifically if your field in question is indexed (locally), performance of either way should be much faster than looping through all rows. In .NET, a set of fields needs to be a PrimaryKey to become indexed.
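A minimal sketch of the `Rows.Find` route, assuming the searched column can serve as the table's primary key (its values must be unique and non-null). The `Author` column name and the `KeyExists` helper are made up for illustration:

```vbnet
Imports System
Imports System.Data

Module PrimaryKeyLookup
    ' True when a row with the given key exists; relies on the table's PrimaryKey.
    Function KeyExists(table As DataTable, key As String) As Boolean
        Return table.Rows.Find(key) IsNot Nothing
    End Function

    Sub Main()
        ' "Author" is a hypothetical column name used only for this sketch.
        Dim table As New DataTable()
        Dim author As DataColumn = table.Columns.Add("Author", GetType(String))
        table.PrimaryKey = New DataColumn() {author}
        table.Rows.Add("Knuth")
        table.Rows.Add("Dijkstra")
        Console.WriteLine(KeyExists(table, "Knuth"))  ' prints True
        Console.WriteLine(KeyExists(table, "Nobody")) ' prints False
    End Sub
End Module
```

`Rows.Find` throws a `MissingPrimaryKeyException` if the table has no primary key, so this only applies when such a key can be defined.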
If your field is not indexed, I would avoid both select and row filter, because aside from overhead of class complexity, they don't offer compile time check for correctness of your condition. If it's a long one, you may end up spending lots of time debugging it once in a while.
It is always preferable to have your check strictly typed. Having first defined an underlying type, you can also define this helper method, which you can convert to extension method of DataTable class later:
Shared Function CheckValue(myTable As DataTable, columnName As String, searchValue As String) As Boolean
For Each row As DataRow In myTable.Rows
If row(columnName).ToString() = searchValue Then Return True
Next
Return False
End Function
or a more generic version of it:
Shared Function CheckValue(myTable As DataTable, checkFunc As Func(Of DataRow, Boolean)) As Boolean
For Each row As DataRow In myTable.Rows
If checkFunc(row) Then Return True
Next
Return False
End Function
and its usage:
CheckValue(myTable, Function(x) x("myColumn").ToString() = "123")
If your row class has MyColumn property of type String, it becomes:
CheckValue(myTable, Function(x) x.myColumn = "123")
One of the benefits of above approach is that you are able to feed calculated fields into your check condition, since myColumn here does not need to match a physical myColumn in the table/database.
Dim exists As Boolean = dt.AsEnumerable().Any(Function(c) c.Field(Of String)("Author") = "your lookup value")