Fast merging duplicates in excel using VBA - vba

I created a macro in Excel to mergue duplicated rows:
The idea is that if 2 rows or more have the same A B C columns, I mergue their D columns removing ABC duplicates.
I need to do this, but checking more colums.
My macro looks like this:
processingRow = 2
Do Until Cells(processingRow, 1).Value = ""
i = processingRow + 1
Do Until Cells(i, 1).Value = ""
If Cells(processingRow, 8) = Cells(i, 8) And _
Cells(processingRow, 12) = Cells(i, 12) And _
Cells(processingRow, 7) = Cells(i, 7) And _
Cells(processingRow, 6) = Cells(i, 6) And _
Cells(processingRow, 5) = Cells(i, 5) And _
Cells(processingRow, 4) = Cells(i, 4) And _
Cells(processingRow, 3) = Cells(i, 3) And _
Cells(processingRow, 2) = Cells(i, 2) And _
Cells(processingRow, 1) = Cells(i, 1) Then
If Cells(i, 14) <> "" Then
Cells(processingRow, 14) = Cells(processingRow, 14) & "," & Cells(i, 14)
End If
Rows(i).Delete
End If
i = i + 1
Loop
processingRow = processingRow + 1
Loop
When running the macro with 500 rows, it takes a while, but its still reasonable. But I need to run this macro in a excel with more than 2500 rows, and it takes so much time that its not practical anymore.
This is my first macro in excel using VBA, so I was wondering if there is a faster way to process rows/cells, since accessing them individually seems extremelly slow.
Any ideas?

EDITED: I missed that you weren't checking every column to determine what was a duplicate. This should be closer now:
Sub Tester()
Dim rngCheck As Range, rw As Range
Dim dict As Object, k As String, rwDup As Range
Dim rngDel As Range, tmp
Set dict = CreateObject("scripting.dictionary")
With ActiveSheet
Set rngCheck = .Range(.Cells(2, 1), _
.Cells(Rows.Count, 1).End(xlUp)).Resize(, 14)
End With
For Each rw In rngCheck.Rows
k = RowKey(rw)
If dict.exists(k) Then
Set rwDup = dict(k)
tmp = rw.Cells(14).Value
If Len(tmp) > 0 Then
rwDup.Cells(14).Value = rwDup.Cells(14).Value & "," & tmp
End If
If rngDel Is Nothing Then
Set rngDel = rw
Else
Set rngDel = Application.Union(rngDel, rw)
End If
Else
dict.Add k, rw
End If
Next rw
If Not rngDel Is Nothing Then rngDel.EntireRow.Delete
End Sub
'create a "key" for the row by joining all columns to be checked
Function RowKey(rw As Range) As String
Dim arr, x As Long, sep As String, rv As String
arr = Array(1, 2, 3, 4, 5, 6, 7, 8, 12)
For x = LBound(arr) To UBound(arr)
rv = rv & sep & rw.Cells(arr(x)).Value
sep = Chr(0)
Next x
RowKey = rv
End Function

Related

Speed Up Matching program in Excel VBA

I am writing a VBA code on excel using loops to go through 10000+ lines.
Here is an example of the table
And here is the code I wrote :
Sub Find_Matches()
Dim wb As Workbook
Dim xrow As Long
Set wb = ActiveWorkbook
wb.Worksheets("Data").Activate
tCnt = Sheets("Data").UsedRange.Rows.Count
Dim e, f, a, j, h As Range
xrow = 2
Application.ScreenUpdating = False
Application.Calculation = xlManual
For xrow = 2 To tCnt Step 1
Set e = Range("E" & xrow)
Set f = e.Offset(0, 1)
Set a = e.Offset(0, -4)
Set j = e.Offset(0, 5)
Set h = e.Offset(0, 3)
For Each Cell In Range("E2:E" & tCnt)
If Cell.Value = e.Value Then
If Cell.Offset(0, 1).Value = f.Value Then
If Cell.Offset(0, -4).Value = a.Value Then
If Cell.Offset(0, 5).Value = j.Value Then
If Cell.Offset(0, 3).Value = h.Value Then
If (e.Offset(0, 7).Value) + (Cell.Offset(0, 7).Value) = 0 Then
Cell.EntireRow.Interior.Color = vbYellow
e.EntireRow.Interior.Color = vbYellow
End If
End If
End If
End If
End If
End If
Next
Next
End Sub
As you can imagine, this is taking a lot of time to go through 10000+ lines and I would like to find a faster solution. There must be a method I don't think to avoid the over looping
Here are the condition :
For each line, if another line anywhere in the file has the exact same
:
Buyer ID (col. E)
`# purchased (col. F)
Product ID (col.A)
Payment (col. J)
Date purchased (col. H)
Then, if the SUM of the Amount (col. L) the those two matching line is
0, then color both rows in yellow.
Note that extra columns are present and not being compared (eg- col. B) but are still important for the document and cannot be deleted to ease the process.
Running the previous code, in my example, row 2 & 5 get highlighted :
This is using nested dictionaries and arrays to check all conditions
Timer with my test data: Rows: 100,001; Dupes: 70,000 - Time: 14.217 sec
Option Explicit
Public Sub FindMatches()
Const E = 5, F = 6, A = 1, J = 10, H = 8, L = 12
Dim ur As Range, x As Variant, ub As Long, d As Object, found As Object
Set ur = ThisWorkbook.Worksheets("Data").UsedRange
x = ur
Set d = CreateObject("Scripting.Dictionary")
Set found = CreateObject("Scripting.Dictionary")
Dim r As Long, rId As String, itm As Variant, dupeRows As Object
For r = ur.Row To ur.Rows.Count
rId = x(r, E) & x(r, F) & x(r, A) & x(r, J) & x(r, H)
If Not d.Exists(rId) Then
Set dupeRows = CreateObject("Scripting.Dictionary")
dupeRows(r) = 0
Set d(rId) = dupeRows
Else
For Each itm In d(rId)
If x(r, L) + x(itm, L) = 0 Then
found(r) = 0
found(itm) = 0
End If
Next
End If
Next
Application.ScreenUpdating = False
For Each itm In found
ur.Range("A" & itm).EntireRow.Interior.Color = vbYellow
Next
Application.ScreenUpdating = True
End Sub
Before
After
I suggest a different approach altogether: add a temporary column to your data that contains a concatenation of each cell in the row. This way, you have:
A|B|C|D|E
1|Mr. Smith|500|A|1Mr. Smith500A
Then use Excel's conditional formatting on the temporary column, highlighting duplicate values. There you have your duplicated rows. Now it's only a matter of using a filter to check which ones have amounts equal to zero.
You can use the CONCATENATE function; it requires you to specify each cell separately and you can't use a range, but in your case (comparing only some of the columns) it seems like a good fit.
Maciej's answer is easy to implement (if you can add columns to your data without interrupting anything), and I would recommend it if possible.
However, for the sake of answering your question, I will contribute a VBA solution as well. I tested it on dataset that is a bit smaller than yours, but I think it will work for you. Note that you might have to tweak it a little (which row you start on, table name, etc) to fit your workbook.
Most notably, the segment commented with "Helper column" is something you most likely will have to adjust - currently, it compares every cell between A and H for the current row, which is something you may or may not want.
I've tried to include a little commentary in the code, but it's not much. The primary change is that I'm using in-memory processing of an array rather than iterating over a worksheet range (which for larger datasets should be exponentially faster).
Option Base 1
Option Explicit
' Uses ref Microsoft Scripting Runtime
Sub Find_Matches()
Dim wb As Workbook, ws As Worksheet
Dim xrow As Long, tCnt As Long
Dim e As Range, f As Range, a As Range, j As Range, h As Range
Dim sheetArr() As Variant, arr() As Variant
Dim colorTheseYellow As New Dictionary, colorResults() As String, dictItem As Variant
Dim arrSize As Long, i As Long, k As Long
Dim c As Variant
Set wb = ThisWorkbook
Set ws = wb.Worksheets("Data")
ws.Activate
tCnt = ws.UsedRange.Rows.Count
xrow = 2
Application.ScreenUpdating = False
Application.Calculation = xlManual
' Read range into an array so we process in-memory
sheetArr = ws.Range("A2:H" & tCnt)
arrSize = UBound(sheetArr, 1)
' Build new arr with "helper column"
ReDim arr(1 To arrSize, 1 To 9)
For i = 1 To arrSize
For k = 1 To 8
arr(i, k) = sheetArr(i, k)
arr(i, 9) = CStr(arr(i, 9)) & CStr(arr(i, k)) ' "Helper column"
Next k
Next i
' Iterate over array & build collection to indicate yellow lines
For i = LBound(arr, 1) To UBound(arr, 1)
If Not colorTheseYellow.Exists(i) Then colorResults = Split(ReturnLines(arr(i, 9), arr), ";")
For Each c In colorResults
If Not colorTheseYellow.Exists(CLng(c)) Then colorTheseYellow.Add CLng(c), CLng(c)
Next c
Next i
' Enact row colors
For Each dictItem In colorTheseYellow
'Debug.Print "dict: "; dictItem
If dictItem <> 0 Then ws.ListObjects(1).ListRows(CLng(dictItem)).Range.Interior.Color = vbYellow
Next dictItem
End Sub
Function ReturnLines(ByVal s As String, ByRef arr() As Variant) As String
' Returns a "Index;Index" string indicating the index/indices where the second, third, etc. instance(s) of s was found
' Returns "0;0" if 1 or fewer matches
Dim i As Long
Dim j As Long
Dim tmp As String
ReturnLines = 0
j = 0
tmp = "0"
'Debug.Print "arg: " & s
For i = LBound(arr, 1) To UBound(arr, 1)
If arr(i, 9) = s Then
j = j + 1
'Debug.Print "arr: " & arr(i, 9)
'Debug.Print "ReturnLine: " & i
tmp = tmp & ";" & CStr(i)
End If
Next i
'If Left(tmp, 1) = ";" Then tmp = Mid(tmp, 2, Len(tmp) - 1)
'Debug.Print "tmp: " & tmp
If j >= 2 Then
ReturnLines = tmp
Else
ReturnLines = "0;0"
End If
End Function
On my simple dataset, it yields this result (marked excellently with freehand-drawn color indicators):
Thanks everybody for your answers,
Paul Bica's solution actually worked and I am using a version of this code now.
But, just to animate the debate, I think I also found another way around my first code, inspired by Maciej's idea of concatenating the cells and using CStr to compare the values and, of course Vegard's in-memory processing by using arrays instead of going through the workbook :
Sub Find_MatchesStr()
Dim AmountArr(300) As Variant
Dim rowArr(300) As Variant
Dim ws As Worksheet
Dim wb As Workbook
Set ws = ThisWorkbook.Sheets("Data")
ws.Activate
Range("A1").Select
rCnt = ws.Cells.SpecialCells(xlCellTypeLastCell).Row
For i = 2 To rCnt
If i = rCnt Then
Exit For
Else
intCnt = 0
strA = ws.Cells(i, 1).Value
strE = ws.Cells(i, 5).Value
strF = ws.Cells(i, 6).Value
strH = ws.Cells(i, 8).Value
strL = ws.Cells(i, 10).Value
For j = i To rCnt - 1
strSearchA = ws.Cells(j, 1).Value
strSearchE = ws.Cells(j, 5).Value
strSearchF = ws.Cells(j, 6).Value
strSearchH = ws.Cells(j, 8).Value
strSearchL = ws.Cells(j, 10).Value
If CStr(strE) = CStr(strSearchE) And CStr(strA) = CStr(strSearchA) And CStr(strF) = CStr(strSearchF) And CStr(strH) = CStr(strSearchH) And CStr(strL) = CStr(strSearchL) Then
AmountArr(k) = ws.Cells(j, 12).Value
rowArr(k) = j
intCnt = intCnt + 1
k = k + 1
Else
Exit For
End If
Next
strSum = 0
For s = 0 To UBound(AmountArr)
If AmountArr(s) <> "" Then
strSum = strSum + AmountArr(s)
Else
Exit For
End If
Next
strAppenRow = ""
For b = 0 To UBound(rowArr)
If rowArr(b) <> "" Then
strAppenRow = strAppenRow & "" & rowArr(b) & "," & AmountArr(b) & ","
Else
Exit For
End If
Next
If intCnt = 1 Then
Else
If strSum = 0 Then
For rn = 0 To UBound(rowArr)
If rowArr(rn) <> "" Then
Let rRange = rowArr(rn) & ":" & rowArr(rn)
Rows(rRange).Select
Selection.Interior.Color = vbYellow
Else
Exit For
End If
Next
Else
strvar = ""
strvar = Split(strAppenRow, ",")
For ik = 1 To UBound(strvar)
If strvar(ik) <> "" Then
strVal = CDbl(strvar(ik))
For ik1 = ik To UBound(strvar)
If strvar(ik1) <> "" Then
strVal1 = CDbl(strvar(ik1))
If strVal1 + strVal = 0 Then
Let sRange1 = strvar(ik - 1) & ":" & strvar(ik - 1)
Rows(sRange1).Select
Selection.Interior.Color = vbYellow
Let sRange = strvar(ik1 - 1) & ":" & strvar(ik1 - 1)
Rows(sRange).Select
Selection.Interior.Color = vbYellow
End If
Else
Exit For
End If
ik1 = ik1 + 1
Next
Else
Exit For
End If
ik = ik + 1
Next
End If
End If
i = i + (intCnt - 1)
k = 0
Erase AmountArr
Erase rowArr
End If
Next
Range("A1").Select
End Sub
I still have some mistakes (rows not higlighted when they should be), the above code is not perfect, but I thought it'd be OK to give you an idea of where I was going before Paul Bica's solution came in.
Thanks again !
If your data is only till column L, then use below code, I found it is taking less time to run....
Sub Duplicates()
Application.ScreenUpdating = False
Dim i As Long, lrow As Long
lrow = Cells(Rows.Count, 1).End(xlUp).Row
Range("O2") = "=A2&E2&F2&J2&L2"
Range("P2") = "=COUNTIF(O:O,O2)"
Range("O2:P" & lrow).FillDown
Range("O2:O" & lrow).Copy
Range("O2:O" & lrow).PasteSpecial xlPasteValues
Application.CutCopyMode = False
For i = 1 To lrow
If Cells(i, 16) = 2 Then
Cells(i, 16).EntireRow.Interior.Color = vbYellow
End If
Next
Application.ScreenUpdating = True
Range("O:P").Delete
Range("A1").Select
MsgBox "Done"
End Sub

Sum multiple column with criteria in row

How to consolidate row H & L and then sum value in column N & Q?
Screenshot of my data:
From the data populate to another sheet, I call it sheet "X".
Expected Result
I got this code from my last question, it uses a dictionary. It can only take 1 key and 1 value and does not meet my expectation:
Sub testttt()
Dim countDict As Object, countDict2 As Object
Set countDict = CreateObject("Scripting.Dictionary")
Set countDict2 = CreateObject("Scripting.Dictionary")
Dim x() As Variant
x = Sheets("Data").Range("A2").CurrentRegion.Value2
Dim a As Long
For a = 2 To UBound(x, 1)
countDict(x(a, 8)) = countDict(x(a, 8)) + x(a, 14)
countDict2(x(a, 12)) = countDict(x(a, 8))
Next
With ThisWorkbook.Sheets("X").Range("B5").Resize(countDict.Count)
.Offset(, 1).Value = Application.Transpose(countDict.Keys)
.Offset(, 4).Value = Application.Transpose(countDict2.Keys)
.Offset(, 5).Value = Application.Transpose(countDict.Items)
.Offset(, 6).Value = Application.Transpose(countDict2.Items)
End With
End Sub
build a key as a combination of the wanted values, and then use as many dictionaries as you need sharing the same key and having each a single value as item
Sub testttt()
Dim dictH As Object, dictSumQ As Object, dictSumN As Object, dictA As Object, dictI As Object, dictL As Object, dictR As Object
Set dictA = CreateObject("Scripting.Dictionary")
Set dictH = CreateObject("Scripting.Dictionary")
Set dictI = CreateObject("Scripting.Dictionary")
Set dictL = CreateObject("Scripting.Dictionary")
Set dictR = CreateObject("Scripting.Dictionary")
Set dictSumN = CreateObject("Scripting.Dictionary")
Set dictSumQ = CreateObject("Scripting.Dictionary")
Dim x() As Variant
x = Sheets("Data").Range("A2").CurrentRegion.Value2
Dim a As Long
Dim key As Variant
For a = 2 To UBound(x, 1)
key = x(a, 8) & "|" & x(a, 12) & "|"
dictA(key) = x(a, 1)
dictH(key) = x(a, 8)
dictI(key) = x(a, 9)
dictL(key) = x(a, 12)
dictR(key) = x(a, 18)
dictSumN(key) = dictSumN(key) + x(a, 14)
dictSumQ(key) = dictSumQ(key) + x(a, 17)
Next
With ThisWorkbook.Sheets("X1").Range("A5").Resize(dictSumN.Count)
.Offset(, 1).Value = Application.Transpose(dictA.Items)
.Offset(, 2).Value = Application.Transpose(dictH.Items)
.Offset(, 3).Value = Application.Transpose(dictI.Items)
.Offset(, 4).Value = Application.Transpose(dictR.Items)
.Offset(, 5).Value = Application.Transpose(dictL.Items)
.Offset(, 6).Value = Application.Transpose(dictSumN.Items)
.Offset(, 7).Value = Application.Transpose(dictSumQ.Items)
End With
End Sub

Is there a way to make this less hacky / work properly?

I have a function that checks the rows underneath the current one depending on the unique ID. There can be up to 6 unique ideas under the current record (loop variable = i) that match the current record being checked in the loop. After this is done, the records underneath are checked for specific conditions (loop variable x). However, for some reason, I'm running into several issues. The first is that I had to set the range references inside of both loops, otherwise I got an error. The second is that, all of the stuff after the x loop seems to be outputting in the i loop that came before it. What am I doing wrong, and how can i make this function properly?
Please find my code below:
Function First_check()
dim i as long, x as long
Dim numComponents As Variant
Dim in1 As Range, in2 As Range, in3 As Range, in4 As Range, in5 As Range, _
in6 As Range, in7 As Range, in8 As Range, in9 As Range, in10 As Range, _
in11 As Range, in12 As Range, in13 As Range, in14 As Range, in15 As Range, _
in16 As Range, in17 As Range, in18 As Range, in19 As Range, in20 As Range
Dim out1 As Range, out2 As Range, out3 As Range, out4 As Range, out5 As Range, _
out6 As Range, out7 As Range, out8 As Range, out9 As Range, out10 As Range, _
out11 As Range, out12 As Range, out13 As Range, out14 As Range, out15 As Range, _
out16 As Range, out17 As Range, out18 As Range, out19 As Range, out20 As Range
Dim str, msg, oft, BTG, LOB, pdf, mht, emails, zip_rar, xls, doc, xls_doc, mrTT, lobVal, cmt1, giveURL, giveURLm As String
lastRow = Cells(Rows.Count, 1).End(xlUp).Row
lastCol = Cells(1, Columns.Count).End(xlToLeft).Column
For i = 2 To lastRow
If Cells(i, 5).Value2 = Cells(i + 6, 5).Value2 Then
numComponents = 6
ElseIf Cells(i, 5).Value2 = Cells(i + 5, 5).Value2 Then
numComponents = 5
ElseIf Cells(i, 5).Value2 = Cells(i + 4, 5).Value2 Then
numComponents = 4
ElseIf Cells(i, 5).Value2 = Cells(i + 3, 5).Value2 Then
numComponents = 3
ElseIf Cells(i, 5).Value2 = Cells(i + 2, 5).Value2 Then
numComponents = 2
ElseIf Cells(i, 5).Value2 = Cells(i + 1, 5).Value2 Then
numComponents = 1
Else
numComponents = 0
End If
For x = i + 1 To i + numComponents
Set in1 = Cells(i, 11) 'test
Set in2 = Cells(i, 12)
Set in3 = Cells(i, 13)
Set in4 = Cells(i, 16) 'e
Set in5 = Cells(i, 37) 'target date
Set in6 = Cells(i, 38) 'target date end
Set in7 = Cells(i, 35) 'target date actual
Set in8 = Cells(i, 37) 'target date start
Set in9 = Cells(i, 38) 'target date end
Set in10 = Cells(x, 50) ' date start
Set in11 = Cells(x, 51) ' date end
Set in12 = Cells(i, 42) 'pro
Set in13 = Cells(i, 43) 'reco
Set in14 = Cells(x, 62) 'cert
Set in15 = Cells(x, 63) 'com
Set in16 = Cells(x, 64) 'comp
Set in17 = Cells(x, 49) 'uniqueID
'outs
Set out1 = Cells(i, 72) 'test
Set out2 = Cells(i, 73) '
Set out3 = Cells(i, 74) '
Set out4 = Cells(i, 75) 'e
Set out5 = Cells(i, 76) 'tar
Set out6 = Cells(i, 77) 'comp
Set out7 = Cells(i, 78) 'pro
Set out8 = Cells(i, 75) 'empty
Set out9 = Cells(i, 80) 'cer
Set out10 = Cells(i, 81) 'comp
Set out11 = Cells(i, 85) 'pre
Set out12 = Cells(i, 88) 'missing
Set out13 = Cells(i, 89) 'missing2
Set out14 = Cells(i, 71) 'uniqueID
'------ATTACHMENT SET
str = Cells(i, 46).Value2
msg = UBound(Split(str, ".msg"))
oft = UBound(Split(str, ".oft"))
BTG = UBound(Split(str, "BTG"))
LOB = UBound(Split(str, "LOB"))
pdf = UBound(Split(str, ".pdf"))
mht = UBound(Split(str, ".mht"))
emails = msg + oft + pdf + mht
zip_rar = UBound(Split(str, ".zip"))
xls = UBound(Split(str, ".xls"))
doc = UBound(Split(str, ".doc"))
xls_doc = xls Or doc
If (in8.Value2 <> in10.Value2) Or (in9.Value <> in11.Value2) Then 'date
out6.Value2 = Cells(x, 49).Value2 & ", " & out6.Value2
End If
If IsBlank(in14.Value2) Then 'Check cer
out9.Value2 = Cells(x, 49).Value2 & ", " & out9.Value2
End If
If IsBlank(in15.Value2) Or IsBlank(in16.Value2) Then 'check loc
out10.Value2 = Cells(x, 49).Value2 & ", " & out10.Value2
End If
If Not IsBlank(in17.Value2) Then
out14.Value2 = in17.Value2 & ", " & out14.Value2
End If
Next x
If Not IsBlank(out6.Value2) Then 'date
out6.Value2 = "Wrong dates"
out6.Value2 = fixtrail(out6.Value2)
End If
If Not IsBlank(out9.Value2) Then 'cert
out9.Value2 = "Cert Issue"
out9.Value2 = fixtrail(out9.Value2)
End If
If Not IsBlank(out10.Value2) Then 'comp
out10.Value2 = "Comp not found"
out10.Value2 = fixtrail(out10.Value2)
End If
If IsBlank(in1.Value2) Then
out1.Value2 = "Missing type"
End If
'
'many more checks happening that i omittied for brevity
'
If numComponents = 0 Then
Cells(i, 70).Value2 = "0"
Else
Cells(i, 70).Value2 = numComponents
End If
i = i + numComponents
Next i
End Function
The first idea that came to mind is using an array of Range objects to clean up the variable declarations:
Dim inRange(20) As Range
Dim outRange(20) As Range
'...
For x = i + 1 To i + numComponents
Set inRange(1) = Cells(i, 11)
Set inRange(2) = Cells(i, 12)
'...
Next
This will work especially well if you can get a formula for the cell numbers that map to each array position.
Additionally, we can improve variables around how the two loops are nested. The outer loop uses the i variable, while the inner loop uses the x variable. Since these are both looking at rows, I would re-name them as r0 and r1 (or rBase and rNested, rParent and rChild, rMaster and rDetail, etc) to help you understand what you're looking at with each index. I also see that some of the Range objects depend on the current i value, while other depend on x. You should be able to assign the i ranges above the inner loop, and save some CPU/memory work that way:
For irParent = 2 To LastRow
'...
Set inRange(1) = Cells(irParent, 11) 'test
Set inRange(2) = Cells(irParent, 12)
Set inRange(3) = Cells(irParent, 13)
Set inRange(4) = Cells(irParent, 16) 'e
'...
'If numComponents is 0, there are no child rows and this loop is skipped
For rChild = rParent + 1 To rParent + numComponents
Set inRange(10) = Cells(irChild, 50) ' date start
Set inRange(11) = Cells(irChild, 51) ' date end
'...
str = Cells(irParent, 46).Value2
msg = UBound(Split(str, ".msg"))
oft = UBound(Split(str, ".oft"))
'...
Next
irParent = irParent + numComponents
Next
Another thing is this method runs kind of long. You may want to abstract out some of the checks to a separate method, or a few separate methods that depend on what type of parent record you're looking at. Create methods that just accept the values needed for checking a particular kind of row, and then returns a single result for the check. This adds names to the code that help you understand what you're doing, as well as shorting the parent code to make it easier to read and understand at a high level more quickly.
As you make those other changes, you may want to start thinking in terms of creating Range objects that represent an entire row (or section from a row), so you can pass them to methods. This is especially true, as it appears many Range objects are currently used to hold values from single Cells. You can build strings to define non-contiguous Ranges that have the values needed for each row (including the parent cells when working in a child row). This will make building functions much easier, if you can have them simply accept a single Range object that you know has the correct cells in it.
This is also helpful because it minimizes instances where you copy from Excel Cells to memory. Moving data between VBA and Excel is a costly operation. It's usually better for performance to copy to or from a set of Cells in bulk, rather than one Cell at a time. This often holds even when it means using some extra memory. It also often helps reduce or simplify the total amount of code needed. Unfortunately, I'm too far out of VBA to show you an example.
Finally, notice my indentation. Professionals will do that consistently... even religiously. "Hacky" code does not. It's extremely helpful for spotting mistakes.

VBA EXCEL Compare Columns and bring over the value

Image1
Hi, Referring to the image, I am trying to compare column G and Column K, if the value is the same then copy the value in column J to column F. However, my code doesn't copy the value from Column J to F.
Sub createarray1()
Dim i As Integer
Dim j As Integer
Dim masterarray As Range
Set masterarray = Range("D3:G12")
Dim sourcearray As Range
Set sourcearray = Range("H3:K26")
For i = 1 To 10
For j = 1 To 25
If masterarray(i, 4).Value = sourcearray(j, 4).Value Then
masterarray(i, 3) = sourcearray(j, 3).Value
Else
masterarray(i, 3).Value = ""
End If
Next
Next
End Sub
Function concatenate()
Dim nlastrow As Long
For i = 2 To Cells(Rows.Count, "D").End(xlUp).Row
Cells(i, "G").Value = Cells(i, "D").Value & "_" & Cells(i, "E").Value
Next i
Dim nnlastrow As Long
For i = 2 To Cells(Rows.Count, "H").End(xlUp).Row
Cells(i, "K").Value = Cells(i, "H").Value & "_" & Cells(i, "I").Value
Next i
End Function
Use variant arrays, that way you limit the number of calls to the sheet to only 3.
When your positive is found you need to exit the inner loop.
Sub createarray1()
Dim i As Long
Dim j As Long
Dim masterarray As Variant
Dim sourcearray As Variant
With ThisWorkbook.Worksheets("Sheet1") ' change to your sheet
masterarray = .Range("D3:G12")
sourcearray = .Range("H3:K26")
For i = LBound(masterarray, 1) To UBound(masterarray, 1)
masterarray(i, 3) = ""
For j = LBound(sourcearray, 1) To UBound(sourcearray, 1)
If masterarray(i, 4) = sourcearray(j, 4) Then
masterarray(i, 3) = sourcearray(j, 3)
Exit For
End If
Next j
Next i
.Range("D3:G12") = masterarray
End With
End Sub
But this can all be done with the following formula:
=INDEX(J:J,MATCH(G3,K:K,0))
Put it in F3 and copy/drag down.

MS Excel -Finding merged cells & combining info in them at corresponding row

There are some merged and some unmerged cells in column A, of different sizes, and column B is made up of all unmerged cells.
I am looking for a formula (if none exists, that could be written with VBA), that would determine whether a cell is merged or unmerged in A, and if it is merged, combine the components in column B (like the formula concatenate does) and write it to one of the rows of it, say the upper one, and if possible delete the row below.
Can I do this with a formula, can anyone help me with a given code, please?
Now I want to not lose the data of the given rows, but add the data in 3rd and 4th columns between themselves as shown in the figure. And make the stars disappear if possible.
To make it quick and simple: (put this in any module in your VBA window)
Option Explicit
Public Function merge_merged(rng As Range) As Variant
Dim i As Long, j As Long, output() As Variant
ReDim output(1 To UBound(rng.Value), 1 To 2)
For j = 1 To UBound(rng.Value)
If Len(rng(j, 1).Text) Then
i = i + 1
output(i, 1) = rng(j, 1).Text
output(i, 2) = rng(j, 2).Text
Else
output(i, 2) = output(i, 2) & ", " & rng(j, 2).Text
End If
Next
For i = i + 1 To j - 1
output(i, 1) = ""
output(i, 2) = ""
Next
merge_merged = output
End Function
then select the range D2:E13 and use the formula
=merge_merged(B2:C13)
This is an array formula and must be confirmed with Ctrl+Shift+Enter↵.
should do exactly what you asked for... If you still have any questions, just write a comment
Also works with strings for me:
EDIT:
You should not change the Question after you got the answer you desired, better ask a new one. Still, I will provide a solution this time:
Option Explicit
Public Function merge_merged(rngIn As Range) As Variant
Dim i As Long, j As Long, k As Long, output() As Variant, rng As Variant
rng = rngIn.Value
ReDim output(1 To UBound(rng), 1 To UBound(rng, 2))
For j = 1 To UBound(rng)
If Len(rng(j, 1)) Then
i = i + 1
For k = 1 To UBound(output, 2)
If IsNumeric(Replace(rng(j, k), "*", "")) Then
output(i, k) = Replace(rng(j, k), "*", "")
Else
output(i, k) = rng(j, k)
End If
Next
Else
For k = 1 To UBound(output, 2)
If Len(rng(j, k)) Then
If IsNumeric(output(i, k)) And IsNumeric(Replace(rng(j, k), "*", "")) Then
output(i, k) = 0 + output(i, k) + Replace(rng(j, k), "*", "")
Else
output(i, k) = output(i, k) & ", " & rng(j, k)
End If
End If
Next
End If
Next
For i = i + 1 To j - 1
For k = 1 To UBound(output, 2)
output(i, k) = ""
Next
Next
merge_merged = output
End Function
only the first column will be checked for collapsing
if column "2" to "end" contain numbers, they will be summed up
having mixed values (numerical and strings) may mess up
"A", "3", "5" will be "A, 3, 5"
"3", "A", "5" will be "3, A, 5"
but "3", "5", "A" will be "8, A"
* will be deleted if the string is numerical without it
it will pull all values of the first row (for each merged part)
if there is no "first" value, the first found will be shown as ", value"
if all cells are empty, the output will also be empty
empty cells will be ignored ("A", "", "C" will become "A, C")
pushed everything in a variable to be fast for bigger tables
Rather than dealing with the Range.MergeArea property, it is probably better to simply Range.UnMerge method the offending cells and deal with the resulting blanks differently than the one that remained populated.
Sub flatten_merge()
Dim rw As Long, v As Long, vVALs As Variant
With Worksheets("Sheet1")
.Columns(1).UnMerge
ReDim vVALs(1 To Application.Count(.Columns(1)), 1 To 2)
For rw = 1 To .Cells(Rows.Count, "B").End(xlUp).Row
If IsEmpty(.Cells(rw, 1)) Then
vVALs(v, 2) = vVALs(v, 2) & Chr(44) & .Cells(rw, 2).Value2
Else
v = v + 1
vVALs(v, 1) = .Cells(rw, 1).Value2
vVALs(v, 2) = .Cells(rw, 2).Value2
End If
Next rw
.Cells(1, 1).Resize(1, 2).EntireColumn.Clear
.Cells(1, 1).Resize(UBound(vVALs, 1), UBound(vVALs, 2)) = vVALs
End With
End Sub
If you have a need to retain the original(s), then a simple modification to copy the source to a new location would suffice.
Sample data and results:
                              Before                               After
additional variant to already posted:
Sub tets()
Dim Dic As Object: Set Dic = CreateObject("Scripting.Dictionary")
Dim cl As Range, Data As Range, k, s%
Dic.comparemode = vbTextCompare
Set Data = Range("A1:A" & [A:A].Find("*", , , , xlByRows, xlPrevious).Row)
For Each cl In Data
If cl.Value2 <> "" Then s = cl.Value2
If Not Dic.exists(s) Then
Dic.Add s, cl.Offset(, 1).Value2
Else
Dic(s) = Dic(s) & "," & cl.Offset(, 1).Value2
End If
Next cl
For Each k In Dic
Debug.Print k, Dic(k)
Next k
End Sub
test
I would like to "Unmerge" the cells first, then use collections to get the unique values and create a loop.
Sub uNMERGE()
Dim rng As Range, lstRw As Long, c As Range
Columns("A:A").MergeCells = 0
lstRw = Cells(Rows.Count, "A").End(xlUp).Row
Set rng = Range("A1:A" & lstRw)
For Each c In rng.Cells
If c = "" Then
c = c.Offset(-1)
End If
Next c
UsingColection
End Sub
Sub UsingColection()
Dim cUnique As Collection
Dim rng As Range, c As Range
Dim Cell As Range
Dim sh As Worksheet
Dim vNum As Variant
Dim rws As Long, s As String
Set sh = ThisWorkbook.Sheets("Sheet1")
rws = sh.Cells(sh.Rows.Count, "A").End(xlUp).Row
Set rng = sh.Range("A1:A" & rws)
Set cUnique = New Collection
On Error Resume Next
For Each Cell In rng.Cells
cUnique.Add Cell.Value, CStr(Cell.Value)
Next Cell
On Error GoTo 0
For Each vNum In cUnique
Cells(Rows.Count, "D").End(xlUp).Offset(1, 0) = vNum
For Each c In rng.Cells
If c = vNum Then
s = s & c.Offset(, 1) & ","
End If
Next c
Cells(Rows.Count, "D").End(xlUp).Offset(0, 1) = Mid(s, 1, Len(s) - 1)
s = ""
Next vNum
End Sub
Before
After
So what is wrong with this code? -Since it gives #VALUE! error at every cell selected.
Option Explicit
Public Function merge_merged(rng As Range) As Variant
Dim i As Long, j As Long, output() As Variant
ReDim output(1 To UBound(rng.Value), 1 To 4)
For j = 1 To UBound(rng.Value)
If Len(rng(j, 1).Text) Then
i = i + 1
output(i, 1) = rng(j, 1).Text
output(i, 2) = rng(j, 2).Text
output(i, 3) = rng(j, 3).Value
output(i, 4) = rng(j, 4).Value
output(i, 5) = rng(j, 5).Text
Else
output(i, 2) = output(i, 2) & ", " & rng(j, 2).Text
output(i, 3) = output(i, 3) + rng(j, 3).Value
output(i, 4) = output(i, 4) + rng(j, 4).Value
output(i, 5) = rng(j, 5).Text
End If
Next
For i = i To j - 1
output(i, 1) = ""
output(i, 2) = ""
output(i, 3) = ""
output(i, 4) = ""
output(i, 5) = ""
Next
merge_merged = output
End Function
Sub ece()
End Sub
And what else can I do to search for "star"s? And create a new column to refer if data in each cell (even if once merged) had "star"s?