Using Dictionary As Item Within Another Dictionary to Count Unique Values - vba

Here is what I am trying to accomplish:
I have an ID in Column DA. I have a product in Column CB. I want to count the number of unique products for each ID (ID can have multiple lines). I then want to write the number of unique products to a new column (DB).
My logic was to write two dictionaries. The "main" dictionary would be keyed to the ID and the second dictionary would be keyed to my products. In the main dictionary, if the key exists I would retrieve the item into a dict object, if it doesn't exist then I'd re-write the main dictionary.
After this, I was thinking of using the dictionary's .Count property to tell me how many unique values exist for each ID.
I have been trying to adapt some code to fit my needs. Here is what I have currently:
Sub Find_Unique_Product_Number()
Dim LastRow As Long
LastRow = Worksheets("Orders").Range("A" & Rows.Count).End(xlUp).Row
adata = Worksheets("Orders").Range("A1:DB" & LastRow)
Set dicTwoProds = CreateObject("Scripting.Dictionary") 'Late binding creation of dictionary
For LastRow = LBound(adata, 1) To UBound(adata, 1)
sKey1 = adata(LastRow, 80)
If Not dicTwoProds.Exists(sKey) Then
Set dicItem = CreateObject("Scripting.Dictionary")
dicTwoProds.Add sKey1, aItem
sKey1 = Worksheets("Orders").Range("CB2:CB" & LastRow) 'Product
dicTwoProdsItem.Add sKey1, ""
Else
Set dicItem = dicTwoProd.Item(sKey)
sKey2 = Worksheets("Orders").Range("CB2:CB" & LastRow) 'Product
If Not dicItem.Exists(sKey2) Then
dicItem.Add sKey2, ""
dicTwoProds.Item(sKey) = aItem
End If
End If
Next
End Sub
Currently this code throws an "object required" error on this line: dicTwoProdsItem.Add sKey1, ""
I am guessing part of my issue is how I am using .range on the sKey lines? I am unsure of how that syntax should be.
I am not sure how writing the .count results to the sheet would work at all yet.
I do have an array formula that does what I need which is this:
'=SUM(IF(DA5=DA2:DA100,1/ (COUNTIFS(DA2:DA100,DA5,CB2:CB100,CB2:CB100)),0))
The formula is extremely slow which is part of the reason I wanted to use dictionaries. If the dictionary solution does not work, I would be interested in assistance with the syntax for using VBA to place the array formula into Column DB on my worksheet.
All data is on the same worksheet in the described scenario.
Cheers!

You can use a single Dictionary to count the number of unique products for each ID.
The trick is to concatenate the id with the product to create a unique key:
Dim dict As Object, lastRow As Long, r As Long, ids(), products(), dupIds()
Set dict = CreateObject("Scripting.Dictionary")
lastRow = Worksheets("Orders").Cells(Rows.Count, 1).End(xlUp).Row
ids = Worksheets("Orders").Range("DA2:DA" & lastRow).Value
products = Worksheets("Orders").Range("CB2:CB" & lastRow).Value
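' map the id to each unique id/product pair '
' (the "|" separator keeps e.g. id "1" + product "23" distinct from id "12" + product "3") '
For r = LBound(ids) To UBound(ids)
dict(ids(r, 1) & "|" & products(r, 1)) = CStr(ids(r, 1))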
Next
' map the count of products to each unique id '
dupIds = dict.Items
dict.RemoveAll
For r = LBound(dupIds) To UBound(dupIds)
dict(dupIds(r)) = dict(dupIds(r)) + 1
Next
' build the column holding the count of products '
For r = LBound(ids) To UBound(ids)
products(r, 1) = dict(CStr(ids(r, 1)))
Next
Worksheets("Orders").Range("DB2:DB" & lastRow).Value = products

If I'm understanding you correctly something like this should work.
(Simplified for a 2-column dataset)
Sub Find_Unique_Product_Number()
Dim dict As Object, i As Long, id, prod, adata, k, k2
Dim rngDest As Range
Set dict = CreateObject("Scripting.Dictionary")
With Worksheets("Orders")
adata = .Range("A1:B" & _
.Cells(.Rows.Count, 1).End(xlUp).Row).Value
End With
For i = LBound(adata, 1) To UBound(adata, 1)
id = adata(i, 1)
prod = adata(i, 2)
If Len(id) > 0 And Len(prod) > 0 Then
'New id? Add as key with empty dictionary
If Not dict.exists(id) Then dict.Add id, CreateObject("Scripting.Dictionary")
'increment the count for this combination
dict(id)(prod) = dict(id)(prod) + 1
End If
Next
'EDIT: output the counts
Set rngDest = Worksheets("Summary").Range("A2")
For Each k In dict.keys
For Each k2 In dict(k).keys
rngDest.Resize(1, 3).Value = Array(k, k2, dict(k)(k2))
Set rngDest = rngDest.Offset(1, 0)
Next k2
Next
End Sub
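Neither answer above writes the per-ID count back next to each row exactly as the question describes, so here is a minimal sketch of that last step using the nested-dictionary idea. It assumes the layout from the question (IDs in DA, products in CB, output to DB, data starting in row 2 on the "Orders" sheet) and that no cell in those columns holds an error value. The Sub name and variable names are my own.

```vba
Sub CountUniqueProductsPerId()
    Dim ws As Worksheet
    Dim lastRow As Long, r As Long
    Dim ids As Variant, prods As Variant, result() As Variant
    Dim dict As Object
    Dim id As String

    Set ws = Worksheets("Orders")
    lastRow = ws.Cells(ws.Rows.Count, "DA").End(xlUp).Row
    ' With fewer than two data rows, .Value returns a scalar instead of an array
    If lastRow < 3 Then Exit Sub

    ids = ws.Range("DA2:DA" & lastRow).Value
    prods = ws.Range("CB2:CB" & lastRow).Value
    Set dict = CreateObject("Scripting.Dictionary")

    ' Pass 1: one inner dictionary per ID, keyed by product
    For r = 1 To UBound(ids, 1)
        id = CStr(ids(r, 1))
        If Not dict.Exists(id) Then dict.Add id, CreateObject("Scripting.Dictionary")
        dict(id)(CStr(prods(r, 1))) = True   ' re-assigning an existing key adds nothing
    Next r

    ' Pass 2: the inner dictionary's Count is the number of unique products
    ReDim result(1 To UBound(ids, 1), 1 To 1)
    For r = 1 To UBound(ids, 1)
        result(r, 1) = dict(CStr(ids(r, 1))).Count
    Next r

    ws.Range("DB2:DB" & lastRow).Value = result
End Sub
```

Reading both columns into arrays and writing the result in one assignment keeps worksheet access to three calls, which is usually where most of the time goes.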

Related

Creating Dictionary Key and Item When Cell Contains Specific Value Only

I am trying to create a dictionary which only adds a key and item when "ABC" is found in Column N. The key is a unique ID (concatenated attributes) and the item is a number. If the key already exists in the dictionary, I want to sum the existing item with the new item (which has the same key / unique ID).
Dim x, x2, y, y2()
Dim i As Long
Dim dict As Object
Dim LastRowForDict As Long
Dim p As Long
dim ws as worksheet
dim LastRowResult as long
set ws = worksheets("DictionaryTest")
Set dict = CreateObject("Scripting.Dictionary")
With ws
LastRowForDict = .Range("B" & rows.Count).End(xlUp).Row
For p = 1 To LastRowForDict
If ws.Range("N" & p).Value = "ABC" Then 'only adds to dictionary if line is an "ABC" line
x = .Range("H2:H" & LastRowForDict).Value
x2 = .Range("AG2:AG" & LastRowForDict).Value
'Check if key exists and if yes add new value to existing item (SUM them)
''' For i = 1 To UBound(x, 1) should this be here?
If Not dict.Exists(x(p, 1)) Then
dict.Item(x(p, 1)) = x2(p, 1)
Else
dict.Item(x(p, 1)) = CDbl(dict.Item(x(p, 1))) + CDbl(x2(p, 1))
End If
'''next i should this be here?
End If
Next p
End With
'map the values
With ws
LastRowResult = .Range("B" & rows.Count).End(xlUp).Row
y = .Range("H2:H" & LastRowResult).Value 'looks up to this range
ReDim y2(1 To UBound(y, 1), 1 To 1) '<< size the output array
For i = 1 To UBound(y, 1)
If dict.Exists(y(i, 1)) Then
y2(i, 1) = dict(y(i, 1))
Else
y2(i, 1) = ""
End If
Next i
.Range("CK2:CK" & LastRowResult).Value = y2 '<< place the output on the sheet
End With
I am currently getting an error (Run-time error 9 - Subscript out of range) on the line If Not dict.Exists(x(p, 1)) Then, and it occurs on the last row of data on my worksheet (i.e. at LastRowForDict). I am thinking this is related to the UBound loop that I have commented out? I removed it because it causes the code to run from row 1 to UBound / LastRowForDict every time the "outer" If statement is met. By this I mean that for every "ABC" line, the code runs through all rows on the sheet and thus creates incorrect items.
Thank you in advance for any help you can offer!
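One likely cause, judging only from the code above: x and x2 are read starting at row 2 (H2, AG2), so they hold LastRowForDict - 1 elements, while p runs up to LastRowForDict and is used directly as the array index. On the last row, x(p, 1) is therefore one past the end of the array, and on every other row it points at the sheet row below the one being tested. A minimal sketch of one way to line the indexes up is to read column N into an array over the same rows and loop over array positions only. The Sub name and variable names are my own, and it assumes the numbers in AG are valid for CDbl and that column N holds no error values.

```vba
Sub SumAbcItemsByKey()
    Dim ws As Worksheet, dict As Object
    Dim keys As Variant, flags As Variant, amounts As Variant, out() As Variant
    Dim lastRow As Long, i As Long

    Set ws = Worksheets("DictionaryTest")
    Set dict = CreateObject("Scripting.Dictionary")
    lastRow = ws.Range("B" & ws.Rows.Count).End(xlUp).Row
    If lastRow < 3 Then Exit Sub   ' need at least two data rows to get arrays back

    ' All three arrays cover rows 2..lastRow, so index i always means sheet row i + 1
    keys = ws.Range("H2:H" & lastRow).Value
    flags = ws.Range("N2:N" & lastRow).Value
    amounts = ws.Range("AG2:AG" & lastRow).Value

    ' Sum the amounts per key, for "ABC" lines only
    For i = 1 To UBound(keys, 1)
        If flags(i, 1) = "ABC" Then
            ' Reading a missing key yields Empty, which counts as 0 in the addition
            dict(keys(i, 1)) = dict(keys(i, 1)) + CDbl(amounts(i, 1))
        End If
    Next i

    ' Map the totals back to every row
    ReDim out(1 To UBound(keys, 1), 1 To 1)
    For i = 1 To UBound(keys, 1)
        If dict.Exists(keys(i, 1)) Then out(i, 1) = dict(keys(i, 1)) Else out(i, 1) = ""
    Next i
    ws.Range("CK2:CK" & lastRow).Value = out
End Sub
```

This also moves the range reads out of the loop, so the sheet is read three times in total rather than twice per "ABC" line.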

VBA Delete Duplicates code faster

I am currently using this code, but on a huge data set it runs really slowly. I need to remove any duplicate information and keep the entry with the highest row number.
dim dup as variant, r as long, lncheckduplicatescolumn as long
With wb_DST.Sheets(sWs_DST)
lncheckduplicatescolumn = .Cells(.Rows.Count, "A").End(xlUp).row
for r = lncheckduplicatescolumn to 2 step -1
dup = application.match(.cells(r, "A").value, .columns(1), 0)
if dup < r then .rows(dup).delete
next r
end with
Data:
Column A Column B
A 1
B 2
C 3
A 3
Result should be:
B 2
C 3
A 3
The order of data in column A doesn't matter as long as it is unique and retains the information that is in the higher row number. While the code I shared works, it is too slow for a large data set.
Another fast method is to use the Dictionary object. You check whether each value in Column A already exists in the Dictionary. If it does (meaning it's a duplicate), don't delete the row right away, since deleting rows one at a time adds a lot to the run time. Instead, collect the duplicate rows in a Range object (DelRng below), using Union to merge them.
Later on, you can delete the entire duplicates range at once with DelRng.Delete.
Code
Option Explicit
Sub RemoveDuplicatesUsingDict()
Dim wb_DST As Workbook
Dim sWs_DST As String
' Dictionary variables
Dim Dict As Object
Dim DictIndex As Long, ExistIndex As Long
Dim DelRng As Range
Dim LastRow As Long, i As Long
' --- parameters for my internal testing ---
Set wb_DST = ThisWorkbook
sWs_DST = "Sheet1"
Application.ScreenUpdating = False
Set Dict = CreateObject("Scripting.Dictionary")
With wb_DST.Sheets(sWs_DST)
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row ' get last row with data in column "A"
For i = LastRow To 2 Step -1
If Not Dict.exists(.Range("A" & i).Value) Then ' value doesn't exists yet in Dictionary >> add this Key
Dict.Add .Range("A" & i).Value, .Range("A" & i).Value
Else ' value already exists in Dictionary >> add it to DelRng (later will delete the entire range)
If Not DelRng Is Nothing Then
Set DelRng = Application.Union(DelRng, .Rows(i)) ' add current row to existing DelRng
Else
Set DelRng = .Rows(i)
End If
End If
Next i
End With
' delete the entire range at 1-shot
If Not DelRng Is Nothing Then DelRng.Delete
Application.ScreenUpdating = True
End Sub
Fast use of data field array
Looping through a range isn't that fast. You can speed it up considerably if you read your search data into a data field array (array = needed range in column "A" - see 1) and loop through that instead. If your data set grows, this gets even faster in comparison to the dictionary approach shown above, though that remains a good and reliable method.
Search Method
Any array value is checked against a concatenated search string with already found unique values and added if not yet included - see 2)
The completed string is transformed to an array and written back to a given target column (e.g. "H") - see 3) and 4)
I even added a second column with the corresponding row numbers, so you should be in the position to use them for further action. You could write results to another sheet, too.
Code - method demo
Sub RemoveDuplicates()
Dim t As Double: t = Timer ' stop watch
Dim ws As Worksheet ' source sheet (object)
Dim i As Long ' row index
Dim a, arr, arr2 ' variant
Dim s As String, si As String
Const SEP = "|" ' delimiter
s = SEP: si = SEP
' 0) fully qualified range reference to source sheet
Set ws = ThisWorkbook.Worksheets("Sheet1")
' 1) write column A data to one based 2-dim data field array
a = ws.Range("A1:A" & ws.Cells(ws.Rows.Count, "A").End(xlUp).Row)
' 2) loop through data and check for duplicates in concatenated string s
For i = 2 To UBound(a)
If InStr(s, SEP & a(i, 1) & SEP) = 0 Then
If Len(a(i, 1)) > 0 Then
s = s & a(i, 1) & SEP
si = si & i & SEP
End If
End If
Next i
' 3) transform unique values to zero based 1-dim array
arr = Split(Mid(s, 2), SEP) ' unique values string to array
arr2 = Split(Mid(si, 2), SEP) ' found row numbers
' 4) write result to column H2:H... ' <<< change target to wanted column
ws.Range("H:H").ClearContents '
ws.Range("H2:H" & (2 + UBound(arr))).Value = Application.Transpose(arr)
ws.Range("I2:I" & (2 + UBound(arr2))).Value = Application.Transpose(arr2)
Debug.Print UBound(arr) + 0 & " unique items found", Format(Timer - t, "0.00 seconds needed")
End Sub
=================================================================
EDIT
Version 2 -- includes overwriting original data with unique values
Here you find a slightly modified version overwriting the original data in 35 columns (A2:AI..) with unique values.
Sub RemoveDuplicates2()
' Edit: overwrite original data A2:AI{..} with unique values
Dim t As Double: t = Timer ' stop watch
Dim ws As Worksheet ' source sheet (object)
Dim i As Long ' row index
Dim a, arr, arr2 ' variant
Dim s As String, si As String
Const SEP = "|" ' delimiter
Const MyLastColumn = "AI" ' letter of last column (no 35) = "AI"
s = SEP: si = SEP
' fully qualified range reference to source sheet
Set ws = ThisWorkbook.Worksheets("Sheet1")
' write column A data to one based 2-dim data field array
a = ws.Range("A1:A" & ws.Cells(ws.Rows.Count, "A").End(xlUp).Row)
' loop through data and check for duplicates in concatenated string s
For i = 2 To UBound(a) ' For i = UBound(a) To 2 Step -1
If InStr(s, SEP & a(i, 1) & SEP) = 0 Then
If Len(Trim(a(i, 1))) > 0 Then
s = s & a(i, 1) & SEP
si = si & i & SEP
End If
End If
Next i
' write unique values to zero based 1-dim array (starts with index 0; last delimiter removed in this version)
arr2 = Split(Mid(si, 2, Len(si) - 2), SEP) ' found row numbers
' overwrite original data
For i = LBound(arr2) To UBound(arr2) ' starts with index 0!
s = "A" & arr2(i) & ":" & MyLastColumn & arr2(i)
arr = ws.Range(s) ' create 1-based 1-line (2-dim) array
s = "A" & i + 2 & ":" & MyLastColumn & i + 2 ' 0 + 2 = +2 ... start in row 2
ws.Range(s) = arr ' write back unique row values
Next i
s = "A" & UBound(arr2) + 3 & ":" & MyLastColumn & UBound(a) + 1
ws.Range(s).ClearContents ' clear rest of original data
Debug.Print UBound(arr2) + 1 & " unique items found", Format(Timer - t, "0.00 seconds needed") ' result
End Sub

Put dictionary key and item in columns without transposing

I have a macro that creates a dictionary with over 30k entries. My problem now is that I would like these entries to show in columns in Excel, but Excel's maximum number of columns doesn't allow me to write my keys in a row and then transpose them. I would like all the keys to be in one column and all the values in another.
Question: Is there a way around this?
Sub HTH()
Dim vArray As Variant
Dim lLoop As Long
Dim rCell As Range
With CreateObject("Scripting.Dictionary")
.CompareMode = 1
For Each rCell In Range("B1:I3022")
vArray = Split(rCell.Value, " ")
For lLoop = LBound(vArray) To UBound(vArray)
If Not .Exists(vArray(lLoop)) Then
.Add vArray(lLoop), 1
Else
.Item(vArray(lLoop)) = .Item(vArray(lLoop)) + 1
End If
Next lLoop
Next rCell
MsgBox "There are " & .Count & " keys"
Dim keyArray, itemArray, resultArray
keyArray = .Keys
itemArray = .Items
ReDim resultArray(LBound(keyArray) To UBound(keyArray), 0 To 1)
For i = LBound(keyArray) To UBound(keyArray)
resultArray(i, 0) = keyArray(i)
resultArray(i, 1) = itemArray(i)
Next i
Range("L1").Resize(UBound(resultArray) + 1, 2) = resultArray
End With
End Sub

Sum Column B based on Column A using Excel VBA Macro

OK, I have a simple problem that I need help with in a VBA Macro. I have an excel sheet that looks like this...
Product # Count
101 1
102 1
101 2
102 2
107 7
101 4
101 4
189 9
I need a macro that adds up the "Count" column based on the Product Number column. I want it to look like this overall after I am done...
Product # Count
101 11
102 3
107 7
189 9
I am an amateur at VBA, so I would love any help I can get.
Assuming the data is in columns A and B, you can do it with a formula:
=SUMIF(A:A,101,B:B)
Or if you put 101 in C1:
=SUMIF(A:A,C1,B:B)
EDIT
However if you still require VBA, here is my (quick and dirty) proposal - I use a dictionary to keep track of the sum for each item.
Sub doIt()
Dim data As Variant
Dim i As Long
Dim countDict As Variant
Dim category As Variant
Dim value As Variant
Set countDict = CreateObject("Scripting.Dictionary")
data = ActiveSheet.UsedRange 'Assumes data is in columns A/B
'Populate the dictionary: key = category / Item = count
For i = LBound(data, 1) To UBound(data, 1)
category = data(i, 1)
value = data(i, 2)
If countDict.exists(category) Then
countDict(category) = countDict(category) + value 'if we have already seen that category, add to the total
Else
countDict(category) = value 'first time we find that category, create it
End If
Next i
'Copy dictionary into an array
ReDim data(1 To countDict.Count, 1 To 2) As Variant
Dim d As Variant
i = 1
For Each d In countDict
data(i, 1) = d
data(i, 2) = countDict(d)
i = i + 1
Next d
'Puts the result back in the sheet in column D/E, including headers
With ActiveSheet
.Range("D1").Resize(UBound(data, 1), UBound(data, 2)) = data
End With
End Sub
The easiest thing is to use a Pivot Table in this case as Tim suggested.
Here is a VBA solution that uses multidimensional arrays. I noticed you said you are a bit new to VBA so I tried to put some meaningful comments in there. One thing that might look strange is when I redimension the arrays. That's because when you have multidimensional arrays you can only ReDim the last dimension in the array when you use the Preserve keyword.
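As a small illustration of that ReDim Preserve rule, here is a sketch that is separate from the solution itself (the Sub name is my own, and it assumes the default Option Base 0):

```vba
Sub RedimPreserveDemo()
    Dim arr() As String

    ReDim arr(1, 0)             ' first dimension 0 To 1, last dimension 0 To 0
    arr(0, 0) = "101"

    ReDim Preserve arr(1, 1)    ' allowed: only the last dimension grows
    arr(0, 1) = "102"

    ' ReDim Preserve arr(2, 1)  ' would raise run-time error 9: the first
                                ' dimension cannot change when Preserve is used

    Debug.Print arr(0, 0), arr(0, 1)   ' existing values survive the resize
End Sub
```

That is why the code keeps the products along the second dimension (one column per product) rather than one row per product.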
Here is how my data looked:
Product Count
101 1
102 1
101 2
102 2
107 7
101 4
101 4
189 9
And here is the code. It has the same output as my last answer. Test this in a new workbook and put the test data in Sheet1 with headers.
Option Explicit
Sub testFunction()
Dim rng As Excel.Range
Dim arrProducts() As String
Dim i As Long
Set rng = Sheet1.Range("A2:A9")
arrProducts = getSumOfCountArray(rng)
Sheet2.Range("A1:B1").Value = Array("Product", "Sum of Count")
' go through array and output to Sheet2
For i = 0 To UBound(arrProducts, 2)
Sheet2.Cells(i + 2, "A").Value = arrProducts(0, i)
Sheet2.Cells(i + 2, "B").Value = arrProducts(1, i)
Next
End Sub
' Pass in the range of the products
Function getSumOfCountArray(ByRef rngProduct As Excel.Range) As String()
Dim arrProducts() As String
Dim i As Long, j As Long
Dim index As Long
ReDim arrProducts(1, 0)
For j = 1 To rngProduct.Rows.Count
index = getProductIndex(arrProducts, rngProduct.Cells(j, 1).Value)
If (index = -1) Then
' create value in array
ReDim Preserve arrProducts(1, i)
arrProducts(0, i) = rngProduct.Cells(j, 1).Value ' product name
arrProducts(1, i) = rngProduct.Cells(j, 2).Value ' count value
i = i + 1
Else
' value found, add to id
arrProducts(1, index) = arrProducts(1, index) + rngProduct.Cells(j, 2).Value
End If
Next
getSumOfCountArray = arrProducts
End Function
Function getProductIndex(ByRef arrProducts() As String, ByRef strSearch As String) As Long
' returns the index of the array if found
Dim i As Long
For i = 0 To UBound(arrProducts, 2)
If (arrProducts(0, i) = strSearch) Then
getProductIndex = i
Exit Function
End If
Next
' not found
getProductIndex = -1
End Function
Sub BestWaytoDoIt()
Dim i As Long ' Loop Counter
Dim int_DestRwCntr As Long ' Dest. sheet Counter (Long, so it cannot overflow on large sheets)
Dim dic_UniquePrd As Scripting.Dictionary
Set dic_UniquePrd = New Scripting.Dictionary
int_DestRwCntr = 2 ' first output row on Sheet2 (row 1 is left free for headers)
For i = 2 To Sheet1.Range("A" & Sheet1.Cells.Rows.Count - 1).End(xlUp).Row
If Not dic_UniquePrd.Exists(Sheet1.Range("A" & i).Value) Then
' first time this product is seen: remember its output row and copy product and count
dic_UniquePrd.Add Sheet1.Range("A" & i).Value, int_DestRwCntr
Sheet2.Range("A" & int_DestRwCntr).Value = Sheet1.Range("A" & i).Value
Sheet2.Range("B" & int_DestRwCntr).Value = Sheet1.Range("B" & i).Value
int_DestRwCntr = int_DestRwCntr + 1
Else
' product already listed: add this count to the total in its output row
Sheet2.Range("B" & dic_UniquePrd.Item(Sheet1.Range("A" & i).Value)).Value = Sheet2.Range("B" & dic_UniquePrd.Item(Sheet1.Range("A" & i).Value)).Value + Sheet1.Range("B" & i).Value
End If
Next
End Sub
This will serve the purpose.
The only thing to remember is to enable "Microsoft Scripting Runtime" under Tools > References.
Based on the code in Sub doIt(), is it possible in the For Each cycle to also retrieve the number of occurrences?
Example:
Product # 101 has 4 occurrences
Product # 102 has 2 occurrences
etc.
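One way to get the occurrences as well is to keep a second dictionary next to the one holding the sums. This is only a sketch along the lines of Sub doIt(), not a change to it. The Sub name and the "Sum" / "Occurrences" headers are my own, and it assumes the data sits in columns A and B with a header in row 1.

```vba
Sub SumAndCountByProduct()
    Dim data As Variant, out() As Variant
    Dim sums As Object, counts As Object
    Dim i As Long, k As Variant

    Set sums = CreateObject("Scripting.Dictionary")
    Set counts = CreateObject("Scripting.Dictionary")

    data = ActiveSheet.Range("A1").CurrentRegion.Value
    If Not IsArray(data) Then Exit Sub   ' a single cell returns a scalar, nothing to do

    ' Row 1 is the header, so start at row 2
    For i = 2 To UBound(data, 1)
        ' Reading a missing key yields Empty, which counts as 0 in the addition
        sums(data(i, 1)) = sums(data(i, 1)) + data(i, 2)
        counts(data(i, 1)) = counts(data(i, 1)) + 1
    Next i

    ' One output row per product, plus a header row
    ReDim out(1 To sums.Count + 1, 1 To 3)
    out(1, 1) = "Product #": out(1, 2) = "Sum": out(1, 3) = "Occurrences"
    i = 2
    For Each k In sums.Keys
        out(i, 1) = k
        out(i, 2) = sums(k)
        out(i, 3) = counts(k)
        i = i + 1
    Next k

    ActiveSheet.Range("D1").Resize(UBound(out, 1), 3).Value = out
End Sub
```

With the sample data from the question, product 101 ends up with 4 occurrences and product 102 with 2.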
I know it's late... but I was brought here by "Sum up column B based on colum C values", so I am posting a solution that uses the same "formula" approach I used there, adapted to this actual need.
Option Explicit
Sub main()
With ActiveSheet
With .Range("A:B").Resize(.cells(.Rows.Count, 1).End(xlUp).row) '<== here adjust "A:B" to whatever colums range you need
With .Offset(1).Resize(.Rows.Count - 1)
.Offset(, .Columns.Count).Resize(, 1).FormulaR1C1 = "=SUMIF(C1,RC1,C2)" ' "helper" column: it's the 1st column right of data columns (since ".Offset(, .Columns.Count)")
.Columns(2).Value = .Offset(, .Columns.Count).Resize(, 1).Value 'update "count" with sum-up from "helper" column
With .Offset(, .Columns.Count).Resize(, 1) ' reference to "helper" column
.FormulaR1C1 = "=IF(countIF(R1C1:RC1,RC1)=1,1,"""")" ' locate Product# repetition with blank cells
.Value = .Value 'fix values
.SpecialCells(xlCellTypeBlanks).EntireRow.Delete 'delete rows corresponding to blank cells
.ClearContents ' clear "helper" column
End With
End With
End With
End With
End Sub
It makes use of a "helper" column, which I assumed could be the one adjacent to the last data column (i.e. if the data columns are "A:B" then the helper column is "C").
Should a different "helper" column be needed, see the comments about how it's located and change the code accordingly.

How to remove duplicate values from 2 columns in excel using vba

I am new to Excel VBA programming. I have an Excel sheet with two columns, and each column has some email addresses separated by ##, like
ColumnA
aa#yahoo.com##bb#yahoo.com##cc#yahoo.com
x#.com##y#y.com
ColumnB
zz#yahoo.com##aa#yahoo.com
aa#yahoo.com
As you can see, both columns have two rows. I need a 3rd column that contains all the unique values, like
ColumnC
aa#yahoo.com##bb#yahoo.com##cc#yahoo.com##zz#yahoo.com
x#.com##y#y.com##aa#yahoo.com
Thanks
Something like this with variant arrays and a dictionary is an efficient process of getting your desired outcome
[updated to remove delimiter at front of string, code is flexible on delimiter length]
Sub GetUniques()
Dim strDelim As String
Dim X
Dim Y
Dim objDic As Object
Dim lngRow As Long
Dim lngRow2 As Long
strDelim = "##"
Set objDic = CreateObject("scripting.dictionary")
X = Range([a1], Cells(Rows.Count, "B").End(xlUp)).Value2
For lngRow = 1 To UBound(X, 1)
X(lngRow, 1) = X(lngRow, 1) & strDelim & X(lngRow, 2)
Y = Split(X(lngRow, 1), strDelim)
X(lngRow, 1) = vbNullString
For lngRow2 = 0 To UBound(Y, 1)
If Not objDic.exists(lngRow & Y(lngRow2)) Then
X(lngRow, 1) = X(lngRow, 1) & (strDelim & Y(lngRow2))
objDic.Add (lngRow & Y(lngRow2)), 1
End If
Next lngRow2
If Len(X(lngRow, 1)) > Len(strDelim) Then X(lngRow, 1) = Right(X(lngRow, 1), Len(X(lngRow, 1)) - Len(strDelim))
Next lngRow
[c1].Resize(UBound(X, 1), 1).Value2 = X
End Sub
Here's my take. How it works:
1. Dump column A and B into a variant array.
2. Combine each row, split into an array of emails, then weed out dupes with a dictionary.
3. Combine the unique list into a single string and store it in a new array.
4. Transpose the new array onto column C.
Sub JoinAndUnique()
Application.ScreenUpdating = False
Dim varray As Variant, newArray As Variant
Dim i As Long, lastRow As Long
Dim temp As Variant, email As Variant
Dim newString As String, seperator As String
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
seperator = "##"
lastRow = range("A" & Rows.count).End(xlUp).Row
varray = range("A1:B" & lastRow).Value
ReDim newArray(1 To UBound(varray, 1))
On Error Resume Next
For i = 1 To UBound(varray, 1)
temp = Split(varray(i, 1) & seperator & varray(i, 2), seperator)
For Each email In temp
If Not dict.exists(email) Then
dict.Add email, 1
newString = newString & (seperator & email)
End If
Next
newArray(i) = Mid$(newString, 3)
dict.RemoveAll
newString = vbNullString
Next
range("C1").Resize(UBound(newArray)).Value = Application.Transpose(newArray)
Application.ScreenUpdating = True
End Sub
Note:
It's fairly similar to brettdj's answer, but there are a few differences worth mentioning:
- I used more meaningful names for variables (for readability and to make it easier to edit)
- I clean up the "##" at the start of the string
- I use a new array rather than overwriting the values of an existing one
- I choose to clear the dictionary after each cell
- I choose to use "on error resume next" and just dump entries into the dictionary instead of checking if they exist or not (personal preference, makes no major difference)
The easiest way to do this would be to use the dictionary object, split function, and join function. Of course, you don't need to use those exact ones, but give it a try and see what you get.
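A minimal sketch of that combination, written as a function so it can be called per row. The name UniqueJoin and its parameters are my own, and it assumes the default, case-sensitive dictionary comparison is acceptable for the addresses.

```vba
Function UniqueJoin(ByVal textA As String, ByVal textB As String, _
                    Optional ByVal delim As String = "##") As String
    Dim dict As Object, part As Variant

    Set dict = CreateObject("Scripting.Dictionary")

    ' Split both cells in one go; assigning through the key keeps one entry per address
    For Each part In Split(textA & delim & textB, delim)
        If Len(part) > 0 Then dict(part) = True   ' skip empty pieces from blank cells
    Next part

    ' Keys come back in insertion order, so the original order is kept
    UniqueJoin = Join(dict.Keys, delim)
End Function
```

For the first row of the example, UniqueJoin("aa#yahoo.com##bb#yahoo.com##cc#yahoo.com", "zz#yahoo.com##aa#yahoo.com") returns aa#yahoo.com##bb#yahoo.com##cc#yahoo.com##zz#yahoo.com. Placed in a standard module, it can also be used on the sheet as =UniqueJoin(A1,B1).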