Dynamicaly change the nr. of dimensions of a VBA array - vba

I was wondering if there was any way to change the number of dimensions of an array:
In VBA,
Depending on an integer max_dim_bound which indicates the the
desired nr. of dimensions.
Allowing for a starting index of the dimension: E.G. `array(4 to 5, 3 to 6) where the number of 3 to 6 are variable integers.
*In the code itself without extra tools
*Without exporting the code.
To be clear, the following change does not change the nr of dimensions of an array, (merely the starting end ending indices of the elements in each respective dimension):
my_arr(3 to 5, 6 to 10)
'changed to:
my_arr(4 to 8, 2 to 7)
The following example would be a successfull change of the nr. of dimensions in an array:
my_arr(3 to 5, 6 to 10)
'changed to:
my_arr(4 to 8, 2 to 7,42 to 29)
This would also be a change in the nr. of dimensions in an array:
my_arr(4 to 8, 2 to 7,42 to 29)
'changed to:
my_arr(3 to 5, 6 to 10)
So far my attempts have consisted of:
Sub test_if_dynamically_can_set_dimensions()
Dim changing_dimension() As Double
Dim dimension_string_attempt_0 As String
Dim dimension_string_attempt_1 As String
Dim max_dim_bound As String
Dim lower_element_boundary As Integer
Dim upper_element_boundary As Integer
upper_element_boundary = 2
max_dim_bound = 4
For dimen = 1 To max_dim_bound
If dimen < max_dim_bound Then
dimension_string_attempt_0 = dimension_string_attempt_0 & "1 To " & upper_element_boundary & ","
MsgBox (dimension_string_attempt_0)
Else
dimension_string_attempt_0 = dimension_string_attempt_0 & "1 To " & upper_element_boundary
End If
Next dimen
MsgBox (dimension_string_attempt_0)
'ReDim changing_dimension(dimension_string_attempt_0) 'does not work because the "To" as expected in the array dimension is not a string but reserved word that assists in the operation of setting an array's dimension(s)
'ReDim changing_dimension(1 & "To" & 3, 1 To 3, 1 To 3) 'does not work because the word "To" that is expected here in the array dimension is not a string but a reserved word that assists the operation of setting an array's dimension(s).
'ReDim changing_dimension(1 To 3, 1 To 3, 1 To 3, 1 To 3)
'attempt 1:
For dimen = 1 To max_dim_bound
If dimen < max_dim_bound Then
dimension_string_attempt_1 = dimension_string_attempt_1 & upper_element_boundary & ","
MsgBox (dimension_string_attempt_1)
Else
dimension_string_attempt_1 = dimension_string_attempt_1 & upper_element_boundary
End If
Next dimen
MsgBox (dimension_string_attempt_1)
ReDim changing_dimension(dimension_string_attempt_1) 'this does not change the nr of dimensions to 2, but just one dimension of "3" and "3" = "33" = 33 elements + the 0th element
'changing_dimension(2, 1, 2, 1) = 4.5
'MsgBox (changing_dimension(2, 1, 2, 1))
End Sub
*Otherwise a solution is to:
Export the whole code of a module, and at the line of the dimension substitute the static redimension of the array, with the quasi-dynamic string dimension_string.
Delete the current module
Import the new module with the quasi-dynamic string dimension_string as a refreshed static redimension in the code.
However, it seems convoluted and I am curious if someone knows a simpler solution.
Note that this is not a duplicate of: Dynamically Dimensioning A VBA Array? Even though the question seems to mean what I am asking here, the intention of the question seems to be to change the nr. of elements in a dimension, not the nr. of dimensions. (The difference is discussed in this article by Microsoft.)
In an attempt to apply the answer of Uri Goren, I analyzed every line and looked up what they did, and commented my understanding behind it, so that my understanding can be improved or corrected. Because I had difficulty not only running the code, but also understanding how this answers the question. This attempt consisted of the following steps:
Right click the code folder ->Insert ->Class Module Then clicked:
Tools>Options> "marked:Require variable declaration" as shown
here at 00:59.
Next I renamed the class module to
Next I wrote the following code in class module FlexibleArray:
Option Explicit
Dim A As New FlexibleArray
Private keys() As Integer
Private vals() As String
Private i As Integer
Public Sub Init(ByVal n As Integer)
ReDim keys(n) 'changes the starting element index of array keys to 0 and index of last element to n
ReDim vals(n) 'changes the starting element index of array keys to 0 and index of last element to n
For i = 1 To n
keys(i) = i 'fills the array keys as with integers from 1 to n
Next i
End Sub
Public Function GetByKey(ByVal key As Integer) As String
GetByKey = vals(Application.Match(key, keys, False))
' Application.Match("what you want to find as variant", "where you can find it as variant", defines the combination of match type required and accompanying output)
'Source: https://msdn.microsoft.com/en-us/vba/excel-vba/articles/worksheetfunction-match-method-excel
' If match_type is 1, MATCH finds the largest value that is less than or equal to lookup_value. Lookup_array must be placed in ascending order: ...-2, -1, 0, 1, 2, ..., A-Z, FALSE, TRUE.
' If match_type is 0, MATCH finds the first value that is exactly equal to lookup_value. Lookup_array can be in any order.
' If match_type is -1, MATCH finds the smallest value that is greater than or equal to lookup_value. Lookup_array must be placed in descending order: TRUE, FALSE, Z-A, ...2, 1, 0, -1, -2, ..., and so on.
'so with False as 3rd optional argument "-1" it finds the smallest value greater than or equal to the lookup variant, meaning:
'the lowest value of keys that equals or is greater than key is entered into vals,
'with keys as an array of 1 to n, it will return key, if n >= key. (if keys is initialized right before getbykey is called and is not changed inbetween.
'vals becomes the number inside a string. So vals becomes the number key if key >= n.
End Function
Public Sub SetByKey(ByVal key As Integer, ByVal val As String)
vals(Application.Match(key, keys, False)) = val
'here string array vals(element index: key) becomes string val if key >=n (meaning if the element exists)
End Sub
Public Sub RenameKey(ByVal oldName As Integer, ByVal newName As Integer)
keys(Application.Match(oldName, keys, False)) = newName
'here keys element oldname becomes new name if it exists in keys.
End Sub
And then I created a new module11 and copied the code below in it, including modifications to try and get the code working.
Option Explicit
Sub use_class_module()
Dim A As New FlexibleArray 'this dimensions object A but it is not set yet
A.Init (3) 'calls the public sub "Init" in class module FlexibleArray, and passes integer n = 3.
'A.SetByKey(1, "a") 'this means that Objecgt A. in class FlexibleArray function SetByKey sets the private string array vals(1) in class Flexible Array becomes "a"
'A.SetByKey(2, "b") 'this means that Objecgt A. in class FlexibleArray function SetByKey sets the private string array vals(2) in class Flexible Array becomes "b"
'A.SetByKey(3, "c") 'this means that Object A. in class FlexibleArray function SetByKey sets the private string array vals(3) in class Flexible Array becomes "c"
'A.RenameKey(3,5) 'This means that object A in class FlexibleArray keys element 3 becomes 5 so keys(3) = 5
' Would print the char "c"
'to try to use the functions:
'A.SetByKey(1, "a") = 4
'MsgBox (keys("a"))
'test = A.SetByKey(1, "a") 'this means that Objecgt A. in class FlexibleArray function SetByKey sets the private string array vals(1) in class Flexible Array becomes "a"
'MsgBox (test)
'test_rename = A.RenameKey(3, 5) 'This means that object A in class FlexibleArray keys element 3 becomes 5 so keys(3) = 5
'MsgBox (test_rename)
'Print A.GetByKey(5) 'Method not valid without suitable object
'current problem:
'the A.SetByKey expects a function or variable, even though it appears to be a function itself.
End Sub
What I currently expect that this code replaces the my_array(3 to 4,5 to 9..) to an array that exists in/as the class module FlexibleArray, that is called when it needs to be used in the module. But Any clearifications would be greatly appreciated! :)

If the goal of redimensioning arrays is limited to a non-absurd number of levels, a simple function might work for you, say for 1 to 4 dimensions?
You could pass the a string representing the lower and upper bounds of each dimension and that pass back the redimensioned array
Public Function FlexibleArray(strDimensions As String) As Variant
' strDimensions = numeric dimensions of new array
' eg. "1,5,3,6,2,10" creates ARRAY(1 To 5, 3 To 6, 2 To 10)
Dim arr() As Variant
Dim varDim As Variant
Dim intDim As Integer
varDim = Split(strDimensions, ",")
intDim = (UBound(varDim) + 1) / 2
Select Case intDim
Case 1
ReDim arr(varDim(0) To varDim(1))
Case 2
ReDim arr(varDim(0) To varDim(1), varDim(2) To varDim(3))
Case 3
ReDim arr(varDim(0) To varDim(1), varDim(2) To varDim(3), varDim(4) To varDim(5))
Case 4
ReDim arr(varDim(0) To varDim(1), varDim(2) To varDim(3), varDim(4) To varDim(5), varDim(6) To varDim(7))
End Select
' Return re-dimensioned array
FlexibleArray = arr
End Function
Test it by calling it with your array bounds
Public Sub redimarray()
Dim NewArray() As Variant
NewArray = FlexibleArray("1,2,3,8,2,9")
End Sub
Should come back with an array looking like this in Debug mode
EDIT - Added Example of truly dynamic array of variant arrays
Here's an example of a way to get a truly flexible redimensioned array, but I'm not sure it's what you're looking for as the firt index is used to access the other array elements.
Public Function FlexArray(strDimensions As String) As Variant
Dim arrTemp As Variant
Dim varTemp As Variant
Dim varDim As Variant
Dim intNumDim As Integer
Dim iDim As Integer
Dim iArr As Integer
varDim = Split(strDimensions, ",")
intNumDim = (UBound(varDim) + 1) / 2
' Setup redimensioned source array
ReDim arrTemp(intNumDim)
iArr = 0
For iDim = LBound(varDim) To UBound(varDim) Step 2
ReDim varTemp(varDim(iDim) To varDim(iDim + 1))
arrTemp(iArr) = varTemp
iArr = iArr + 1
Next iDim
FlexArray = arrTemp
End Function
And if you look at it in Debug, you'll note the redimensioned sub arrays that are now accessible from the first index of the returned array

Sounds like you are abusing arrays for something they weren't meant to do with a ton of memory copying.
What you want is to write your own Class (Right click the code folder ->Insert ->Class Module), let's call it FlexibleArray.
Your class code would be something like this:
Private keys() as Integer
Private vals() as String
Private i as Integer
Public Sub Init(ByVal n as Integer)
Redim keys(n)
Redim vals(n)
For i = 1 to n
keys(i) = i
Next i
End Sub
Public Function GetByKey(ByVal key As Integer) As String
GetByKey = vals(Application.Match(key, keys, False))
End Function
Public Sub SetByKey(ByVal key As Integer, ByVal val As String)
vals(Application.Match(key, keys, False)) = val
End Sub
Public Sub RenameKey(ByVal oldName As Integer, ByVal newName As Integer)
keys(Application.Match(oldName, keys, False))=newName
End Sub
Now you can rename whatever key you want:
Dim A as New FlexibleArray
A.Init(3)
A.SetByKey(1, "a")
A.SetByKey(2, "b")
A.SetByKey(3, "c")
A.RenameKey(3,5)
Print A.GetByKey(5)
' Would print the char "c"
Extending it to integer ranges (like your example) is pretty straight forward

Related

For Loop Overwriting Data within Array

So I have a data set that is made up of a variety tag numbers - I'm trying to develop a VBA "fucntion" to basically give recommendation on a tag number when inputting a new one. This would be easy but the current list gives gaps within tag numbers (eg) goes 4001 4002 4005 . This bit of code is taking that gap and storing "option" tags which I plan to display to the user (so 4003 and 4004). The problem is that these gaps are encountered more than once eg) 4001 4002 4005 4006 4007 4011 4012 and when it comes to the second gap (4008 4009 4010) it overwrites the existing array - how can I get it take each gap and then begin the array below that?
My code is as follows:
For j = 2 To UBound(strArrayNumber)
If strArrayNumber(j) <> strArrayNumber(j - 1) + 1 Then
Gap = strArrayNumber(j) - strArrayNumber(j - 1)
For b = 1 To Gap
ReDim TagOptions(1 To Gap) As Integer
TagOptions(b) = strArrayNumber(j - 1) + b
sh.Cells(b, 6) = TagOptions(b)
Next b
End If
At the prompting of #JohnSUN here is the bare bones of a solution using Scripting.Dictionary.
The OP desires to manage a data set. To simplify this management I have chosen to create a TagList object. The TagList object allows
Population of the tag list from two arrays
Updating the value associated with a tag item
getting back an array of tags or object
getting an array of the next free tags
testing if a tag exists
This is a bare bones example as it is designed to point the way rather than provide a complete solution. It has some obvious ommisions, i.e. there is no code for what happens if a the request for the next set of free tags uses a tag that is higher then the maximum tag number, there is no error code for what happens if we try to add an existing tag to the TagList etc.
The Class Taglist code compiles without error and shows no significant inospection results after a Rubberduck code inspection
Class TagList
Option Explicit
Private Type State
' Requires a reference to 'Microsoft Scripting Runtime'
TagList As Scripting.Dictionary
End Type
Private s As State
Private Sub Class_Initialize()
Set s.TagList = New Scripting.Dictionary
End Sub
Public Property Get Item(ByVal iptag As Long) As Variant
Item = s.TagList.Item(iptag)
End Property
Public Property Let Item(ByVal iptag As Long, ByVal ipitem As Variant)
s.TagList.Item(iptag) = ipitem
End Property
Public Function Tags() As Variant
Tags = s.TagList.keys
End Function
Public Function Items() As Variant
Items = s.TagList.Items
End Function
Public Sub Add(ByVal ipTags As Variant, Optional ByVal ipItems As Variant)
'ipTags and ipItems should be arrays of equal size
If IsMissing(ipItems) Then
ReDim ipItems(UBound(ipTags) - LBound(Tags) + 1)
End If
If (UBound(ipTags) - LBound(ipTags)) - (UBound(ipItems) - LBound(ipItems)) <> 0 Then
Err.Raise vbObjectError + 512, "Size Error", "Arrays are different sizes"
End If
Dim myItemsIndex As Long
myItemsIndex = LBound(ipItems)
Dim myTag As Variant
For Each myTag In ipTags
s.TagList.Add myTag, CVar(ipItems(myItemsIndex))
myItemsIndex = myItemsIndex + 1
Next
End Sub
Public Function Exists(ByVal iptag As Long) As Boolean
Exists = s.TagList.Exists(iptag)
End Function
Public Function NextFreeTags(ByVal iptag As Long) As Variant
If iptag < 1 Then
Err.Raise vbObjectError + 512, "Negative Tag", "Tag numbers must be positive"
End If
Dim myFreeTags As Scripting.Dictionary
Set myFreeTags = New Scripting.Dictionary
Do While s.TagList.Exists(iptag)
iptag = iptag + 1
Loop
Do Until s.TagList.Exists(iptag)
myFreeTags.Add iptag, iptag
iptag = iptag + 1
Loop
NextFreeTags = myFreeTags.keys
End Function
Thus we can now do the following
Dim myTagList as TagList
Set myTagList = New TagList
mytaglist.Add Array(4006, 4001, 4002, 4011, 4005, 4007)
' Note the above is the short form version we could equally say
' myTagList.add Array(4006, 4001, 4002, 4011, 4005, 4007), Array(Obj6, Obj1, Obj11, Obj5, Obj7)
'Oops, we forgot to add tag 4012
myTaglist.add array(4012)
' getting then next free tags
dim myTags as variant
myTags = myTaglist.NextFreeTags(4001)
etc
Let's not calculate the Gap. The parameters of the cycle by B can be any - we will use this. Try this:
Set oCell = sh.Range("F1")
TagOptions = Array()
For j = 2 To UBound(strArrayNumber)
If strArrayNumber(j) <> strArrayNumber(j - 1) + 1 Then
For b = strArrayNumber(j - 1) + 1 To strArrayNumber(j) - 1
uBoundTagOptions = UBound(TagOptions) + 1
ReDim Preserve TagOptions(1 To uBoundTagOptions)
TagOptions(uBoundTagOptions) = b
oCell.Value = b
Set oCell = oCell.Offset(1, 0)
Next b
End If
Next j
Don't forget to specify Option Base 1 at the beginning of the module, otherwise TagOptions = Array() will create an empty array [0..-1] , with LBound = 0, but you want from LBound = 1
UPDATE
By the way, there is one old trick using an auxiliary array. This is used to clear data from duplicates, for sorting and other things. The method is resource-intensive, but very fast. It looks something like this:
Function getGaps(aData As Variant) As Variant
Dim lB As Long, uB As Long, i As Long, lCount As Long
Dim aTemp() As Boolean, aResult() As Long
If IsArray(aData) Then
lB = Application.WorksheetFunction.Min(aData)
uB = Application.WorksheetFunction.Max(aData)
Rem Let's create an auxiliary array of boolean flags. It may be very large, but it won't be long.
ReDim aTemp(lB To uB)
Rem Let's mark in the array those values ??that are in the aData.
Rem In this case, we will skip duplicates, if any, and count the number of unique numbers.
lCount = 0
For i = LBound(aData) To UBound(aData)
If Not aTemp(aData(i)) Then
aTemp(aData(i)) = True
lCount = lCount + 1
End If
Next i
Rem Number of values in gaps:
uB = uB - lB - lCount + 1
If uB > 0 Then ' It may be that there were no gaps in the array
ReDim aResult(1 To uB)
lCount = 0
For i = LBound(aTemp) To UBound(aTemp)
If Not aTemp(i) Then
lCount = lCount + 1
aResult(lCount) = i
End If
Next i
Rem Here it is - the result of the function
getGaps = aResult
Rem The interpreter will destroy this variable immediately after exiting the function,
Rem but will free the memory a little later. This line should help free memory faster
ReDim aTemp(0)
End If
End If
Rem In all other cases, the function will return Empty
End Function
Despite the number of lines, it is very simple and very fast since all the work is done in RAM.
With this function, all the code you showed in your question becomes:
Set oCell = sh.Range("F1")
oCell.EntireColumn.ClearContents ' This is for the case where no gaps are found
TagOptions = getGaps(strArrayNumber)
If IsEmpty(TagOptions) Then
Debug.Print "No gaps in array"
Exit Sub
End If
oCell.Resize(UBound(TagOptions), 1).Value2 = Application.WorksheetFunction.Transpose(TagOptions)
Of course TRANSPOSE () might not work correctly for a very large array. But I hope that you are not so cruel to your users and the list of possible tags does not exceed a hundred or two.

VBA: adding random numbers to a grid that arent already in the grid

Sub FWP()
Dim i As Integer
Dim j As Integer
Dim n As Integer
n = Range("A1").Value
For i = 1 To n
For j = 1 To n
If Cells(i + 1, j) = 0 Then
Cells(i + 1, j).Value = Int(((n ^ 2) - 1 + 1) * Rnd + 1)
ElseIf Cells(i + 1, j) <> 0 Then
Cells(i + 1, j).Value = Cells(i + 1, j).Value
End If
Next j
Next i
I am trying to do a part of a homework question that asks to fill in missing spaces in a magic square in VBA. It is set up as a (n x n) matrix with n^2 numbers in; the spaces I need to fill are represented by zeros in the matrix. So far I have some code that goes through checking each individual cell value, and will leave the values alone if not 0, and if the value is 0, it replaces them with a random number between 1 and n^2. The issue is that obviously I'm getting some duplicate values, which isn't allowed, there must be only 1 of each number.
How do I code it so that there will be no duplicate numbers appearing in the grid?
I am attempting to put in a check function to see if they are already in the grid but am not sure how to do it
Thanks
There are a lot of approaches you can take, but #CMArg is right in saying that an array or dictionary is a good way of ensuring that you don't have duplicates.
What you want to avoid is a scenario where each cell takes progressively longer to populate. It isn't a problem for a very small square (e.g. 10x10), but very large squares can get ugly. (If your range is 1-100, and all numbers except 31 are already in the table, it's going to take a long time--100 guesses on average, right?--to pull the one unused number. If the range is 1-40000 (200x200), it will take 40000 guesses to fill the last cell.)
So instead of keeping a list of numbers that have already been used, think about how you can effectively go through and "cross-off" the already used numbers, so that each new cell takes exactly 1 "guess" to populate.
Here's one way you might implement it:
Class: SingleRandoms
Option Explicit
Private mUnusedValues As Scripting.Dictionary
Private mUsedValues As Scripting.Dictionary
Private Sub Class_Initialize()
Set mUnusedValues = New Scripting.Dictionary
Set mUsedValues = New Scripting.Dictionary
End Sub
Public Sub GenerateRange(minimumNumber As Long, maximumNumber As Long)
Dim i As Long
With mUnusedValues
.RemoveAll
For i = minimumNumber To maximumNumber
.Add i, i
Next
End With
End Sub
Public Function GetRandom() As Long
Dim i As Long, keyID As Long
Randomize timer
With mUnusedValues
i = .Count
keyID = Int(Rnd * i)
GetRandom = .Keys(keyID)
.Remove GetRandom
End With
mUsedValues.Add GetRandom, GetRandom
End Function
Public Property Get AvailableValues() As Scripting.Dictionary
Set AvailableValues = mUnusedValues
End Property
Public Property Get UsedValues() As Scripting.Dictionary
Set UsedValues = mUsedValues
End Property
Example of the class in action:
Public Sub getRandoms()
Dim r As SingleRandoms
Set r = New SingleRandoms
With r
.GenerateRange 1, 100
Do Until .AvailableValues.Count = 0
Debug.Print .GetRandom()
Loop
End With
End Sub
Using a collection would actually be more memory efficient and faster than using a dictionary, but the dictionary makes it easier to validate that it's doing what it's supposed to do (since you can use .Exists, etc.).
Nobody is going to do your homework for you. You would only be cheating yourself. Shame on them if they do.
I'm not sure how picky your teacher is, but there are many ways to solve this.
You can put the values of the matrix into an array.
Check if a zero value element exists, if not, break.
Then obtain your potential random number for insertion.
Iterate through the array with a for loop checking each element for this value. If it is not present, replace the zero element.

Function will not return array when range contains only one value

I have a function meant to return an array which is created out of a single-column list of data. I have been using this function's return value essentially as a pseudo-global variable (LINENAMES_ARRAY) which I pass to many functions. Those functions than do checks on it such as If Len(Join(LINENAMES_ARRAY)) = 0 Then or go through items with For Each statements. Here is the code:
Function LINENAMES_ARRAY() As Variant
'returns an array of all items in the main sheet linenames column
LINENAMES_ARRAY = Application.Transpose(MAIN.Range( _
MAIN.Cells(MAIN_HEAD_COUNT + 1, MAIN_LINENAMES_COLUMN), _
MAIN.Cells(LINENAMES_COUNT + 1, MAIN_LINENAMES_COLUMN)))
End Function
I recently stumbled on one of those you-don't-see-it-till-you-see-it problems while using this workbook for a new project, where if the array happens to be only 1 element, everything fails. Apparently in that case, this returns a single value so Join() will fail For Each __ in LINENAMES_ARRAY will too. Why won't it treat this as a 1x1 array rather than a free value? I have started to mitigate the problem by rewriting functions where this is called, to check whether it is an array, then do some other procedure. Things like:
For j = 1 To LINENAMES_COUNT
LINES_BOX.AddItem lineNames(j)
Next j
is changed to:
If Not IsArray(LINENAMES_ARRAY) Then
myListBox.AddItem CStr(LINENAMES_ARRAY)
Else
For j = 1 To LINENAMES_COUNT
LINES_BOX.AddItem LINENAMES_ARRAY(j)
Next j
End If
However this becomes messy and is adding many extra checks to my code that I would prefer to handle in the LINENAMES_ARRAY function. Is there a way to return a 1x1 array? Or any other workaround?
An array can have a single element if you create it as a single element array and populate it in an array manner.
Option Explicit
Dim MAIN_HEAD_COUNT As Long
Dim LINENAMES_COUNT As Long
Dim MAIN_LINENAMES_COLUMN As Long
Dim MAIN As Worksheet
Sub stuff()
Dim arr As Variant
Set MAIN = Worksheets("Sheet1")
MAIN_LINENAMES_COLUMN = 2
MAIN_HEAD_COUNT = 2
LINENAMES_COUNT = 2
arr = LINENAMES_ARRAY()
Debug.Print IsArray(arr)
Debug.Print LBound(arr) & ":" & UBound(arr)
End Sub
Function LINENAMES_ARRAY() As Variant
Dim a As Long, tmp() As Variant
ReDim tmp(0 To LINENAMES_COUNT - MAIN_HEAD_COUNT)
For a = 0 To LINENAMES_COUNT - MAIN_HEAD_COUNT
tmp(a) = MAIN.Range(MAIN.Cells(MAIN_HEAD_COUNT + 1, MAIN_LINENAMES_COLUMN), _
MAIN.Cells(LINENAMES_COUNT + 1, MAIN_LINENAMES_COLUMN)).Cells(a).Value2
Next a
'returns an array of all items in the main sheet linenames column
LINENAMES_ARRAY = tmp
End Function
Results from the VBE's Immediate window:
True
0:0

Multi-dimensional Dictionary class in VBA

This post is half to share a solution and half to ask if there's a better way to do it.
Problem: how to build a multi-dimensional dictionary in VBA.
It seems there are people out there looking for one, but there isn't an obvious neat solution around so I came up with some code, as follows.
Specific case: convert an ADO Recordset into a Dictionary, where several columns comprise the unique key for a row. Adding multiple records to the same Dictionary fails unless you come up with a key that concatenates all the columns that comprise the unique key.
General case: model a tree structure in an object hierarchy where there might not be the same number of branches across every node at the same level in the hierarchy.
The code below solves both problems. Performance untested but the VBA Scripting library's Dictionary class is apparently indexed with a hash table and I've seen very large systems built with it, so I doubt performance will be an issue. Maybe one of the giant brains out there will correct me on this.
Put this into a VBA class called multiDictionary:
Option Explicit
' generic multi-dimensional dictionary class
' each successive higher dimension dictionary is nested within a lower dimension dictionary
Private pDictionary As Dictionary
Private pDimensionKeys() As Variant
Private Const reservedItemName As String = "multiItem"
Public Function add(value As Variant, ParamArray keys() As Variant)
Dim searchDictionary As Dictionary
Dim newDictionary As Dictionary
Dim count As Long
If pDictionary Is Nothing Then Set pDictionary = New Dictionary
Set searchDictionary = pDictionary
For count = LBound(keys) To UBound(keys)
If keys(count) = reservedItemName Then Err.Raise -1, "multiDictionary.add", "'" & reservedItemName & "' is a reserved key and cannot be used"
If searchDictionary.Exists(keys(count)) Then
Set newDictionary = searchDictionary.item(keys(count))
Else
Set newDictionary = New Dictionary
searchDictionary.add key:=keys(count), item:=newDictionary
End If
Set searchDictionary = searchDictionary.item(keys(count))
Next
' each node can have only one item, otherwise it has dictionaries as children
searchDictionary.add item:=value, key:=reservedItemName
End Function
Public Function item(ParamArray keys() As Variant) As Variant
Dim count As Long
Dim searchDictionary As Dictionary
Set searchDictionary = pDictionary
For count = LBound(keys) To UBound(keys)
' un-nest iteratively
Set searchDictionary = searchDictionary.item(keys(count))
Next
' the item always has the key 'reservedItemName' (by construction)
If IsObject(searchDictionary.item(reservedItemName)) Then
Set item = searchDictionary.item(reservedItemName)
Else
item = searchDictionary.item(reservedItemName)
End If
End Function
And test it like this
Sub testMultiDictionary()
Dim MD As New multiDictionary
MD.add "Blah123", 1, 2, 3
MD.add "Blah124", 1, 2, 4
MD.add "Blah1234", 1, 2, 3, 4
MD.add "BlahXYZ", "X", "Y", "Z"
MD.add "BlahXY3", "X", "Y", 3
Debug.Print MD.item(1, 2, 3)
Debug.Print MD.item(1, 2, 4)
Debug.Print MD.item(1, 2, 3, 4)
Debug.Print MD.item("X", "Y", "Z")
Debug.Print MD.item("X", "Y", 3)
End Sub

N-gram function in vb.net -> create grams for words instead of characters

I recently found out about n-grams and the cool possibility to compare frequency of phrases in a text body with it. Now I'm trying to make an vb.net app that simply gets an text body and returns a list of the most frequently used phrases (where n >= 2).
I found an C# example of how to generate a n-gram from a text body so I started out with converting the code to VB. The problem is that this code does create one gram per character instead of one per word. The delimiters I want to use for the words is: VbCrLf (new line), vbTab (tabs) and the following characters: !##$%^&*()_+-={}|\:\"'?¿/.,<>’¡º×÷‘;«»[]
Does anyone have an idea how I can rewrite the following function for this purpose:
Friend Shared Function GenerateNGrams(ByVal text As String, ByVal gramLength As Integer) As String()
If text Is Nothing OrElse text.Length = 0 Then
Return Nothing
End If
Dim grams As New ArrayList()
Dim length As Integer = text.Length
If length < gramLength Then
Dim gram As String
For i As Integer = 1 To length
gram = text.Substring(0, (i) - (0))
If grams.IndexOf(gram) = -1 Then
grams.Add(gram)
End If
Next
gram = text.Substring(length - 1, (length) - (length - 1))
If grams.IndexOf(gram) = -1 Then
grams.Add(gram)
End If
Else
For i As Integer = 1 To gramLength - 1
Dim gram As String = text.Substring(0, (i) - (0))
If grams.IndexOf(gram) = -1 Then
grams.Add(gram)
End If
Next
For i As Integer = 0 To (length - gramLength)
Dim gram As String = text.Substring(i, (i + gramLength) - (i))
If grams.IndexOf(gram) = -1 Then
grams.Add(gram)
End If
Next
For i As Integer = (length - gramLength) + 1 To length - 1
Dim gram As String = text.Substring(i, (length) - (i))
If grams.IndexOf(gram) = -1 Then
grams.Add(gram)
End If
Next
End If
Return Tokeniser.ArrayListToArray(grams)
End Function
An n-gram for words is just a list of length n that stores these words. A list of n-grams is then simply a list of list of words. If you want to store frequency then you need a dictionary that is indexed by these n-grams. For your special case of 2-grams, you can imagine something like this:
Dim frequencies As New Dictionary(Of String(), Integer)(New ArrayComparer(Of String)())
Const separators as String = "!##$%^&*()_+-={}|\:""'?¿/.,<>’¡º×÷‘;«»[] " & _
ControlChars.CrLf & ControlChars.Tab
Dim words = text.Split(separators.ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
For i As Integer = 0 To words.Length - 2
Dim ngram = New String() { words(i), words(i + 1) }
Dim oldValue As Integer = 0
frequencies.TryGetValue(ngram, oldValue)
frequencies(ngram) = oldValue + 1
Next
frequencies should now contain a dictionary with all two consecutive word pairs contained in the text, and the frequency with which they appear (as a consecutive pair).
This code requires the ArrayComparer class:
Public Class ArrayComparer(Of T)
Implements IEqualityComparer(Of T())
Private ReadOnly comparer As IEqualityComparer(Of T)
Public Sub New()
Me.New(EqualityComparer(Of T).Default)
End Sub
Public Sub New(ByVal comparer As IEqualityComparer(Of T))
Me.comparer = comparer
End Sub
Public Overloads Function Equals(ByVal a As T(), ByVal b As T()) As Boolean _
Implements IEqualityComparer(Of T()).Equals
System.Diagnostics.Debug.Assert(a.Length = b.Length)
For i As Integer = 0 to a.Length - 1
If Not comparer.Equals(a(i), b(i)) Then Return False
Next
Return True
End Function
Public Overloads Function GetHashCode(ByVal arr As T()) As Integer _
Implements IEqualityComparer(Of T()).GetHashCode
Dim hashCode As Integer = 17
For Each obj As T In arr
hashCode = ((hashCode << 5) - 1) Xor comparer.GetHashCode(obj)
Next
Return hashCode
End Function
End Class
Unfortunately, this code doesn’t compile on Mono because the VB compiler has problems finding the generic EqualityComparer class. I’m therefore unable to test whether the GetHashCode implementationw works as expected but it should be fine.
Thank you Konrad very much for this beginning of an solution!
I tried your code and got the following result:
Text = "Hello I am a test Also I am a test"
(I also included whitespace as a separator)
frequencies now has 9 items:
---------------------
Keys: "Hello", "I"
Value: 1
---------------------
Keys: "I", "am"
Value: 1
---------------------
Keys: "am", "a"
Value: 1
---------------------
Keys: "a", "test"
Value: 1
---------------------
Keys: "test", "Also"
Value: 1
---------------------
Keys: "Also", "I"
Value: 1
---------------------
Keys: "I", "am"
Value: 1
---------------------
Keys: "am", "a"
Value: 1
---------------------
Keys: "a", "test"
Value: 1
---------------------
My first question: shouldn't the 3 last key pairs get a value of 2 as they're found two times in the text?
Second: The reason I got into the n-gram approach is that I don't want to limit the word count (n) to a specific length. Is there a way to make a dynamic approach that tries to find the longest phrase match first and then step down to the last wordcount of 2?
My goal result for the sample query above is:
---------------------
Match: "I am a test"
Frequency: 2
---------------------
Match: "I am a"
Frequency: 2
---------------------
Match: "am a test"
Frequency: 2
---------------------
Match: "I am"
Frequency: 2
---------------------
Match: "am a"
Frequency: 2
---------------------
Match: "a test"
Frequency: 2
---------------------
There is an similar C++ approach for this written by Hatem Mostafa over at codeproject.com: N-gram and Fast Pattern Extraction Algorithm
Sadly I'm no C++ expert and have no idea how to convert this bit of code as it includes a lot of memory handling that .Net doesn't. The only problem with this example is that you have to specify the minimum word pattern length and I want it to be dynamic from 2 to max found.