Related
I need a loop that will match and select different columns (not in sequential order) and paste them to another sheet all whilst keeping the condition in check. It would also be ideal if when the values get pasted that the formatting for the cell is not carried over, just the value.
Below is the code I am currently using:
Sub Test()
Application.ScreenUpdating = False
Sheets("DATA").Select
lr = Range("B" & Rows.Count).End(xlUp).Row
Range("P3").Select
For i = 3 To lr
If Cells(i, 2) <> "" Then Range(Cells(i, 7), Cells(i, 16), Cells(i, 26)).Copy
Sheets("Sheet2").Range("A" & Rows.Count).End(xlUp).Offset(1, 0)
Next
Application.ScreenUpdating = True
End Sub
The problem is declaring the columns I want the loop to paste. I need the loop to run through the 16th column, check empty values, and then paste the index/matched value in the rows of columns 7,16,and 26 (so not in sequential order).. Any help would be appreciated.
The next code has to do what I understood you need. Please check it and confirm this aspect. It is very fast, working only in memory...
Sub PastingNextPage()
Dim sh As Worksheet, sh1 As Worksheet, arrIn As Variant, arrOut() As Variant
Dim lastRowIn As Long, lastRowOut As Long, nonEmpt As Long, rngP As Range, nrEl As Long
Dim i As Long, j As Long, P As Long
Set sh = Sheets("DATA"): lastRowIn = sh.Range("P" & sh.Rows.count).End(xlUp).Row
Set sh1 = Sheets("Sheet2"): lastRowOut = sh1.Range("A" & sh1.Rows.count).End(xlUp).Row + 1
arrIn = sh.Range("G2:Z" & lastRowIn).Value
nrEl = lastRowIn - Application.WorksheetFunction.CountIf(sh.Range("P2:P" & lastRowIn), "") - 2
P = 10 'column P:P number in the range starting with G:G column
ReDim arrOut(nrEl, 3) 'redim the array to keep the collected values
For i = 1 To lastRowIn - 1
If arrIn(i, P) <> "" Then
arrOut(j, 0) = arrIn(i, 1): arrOut(j, 1) = arrIn(i, P): arrOut(j, 2) = arrIn(i, 20)
j = j + 1
End If
Next i
sh1.Range(sh1.Cells(lastRowOut, "A"), sh1.Cells(lastRowOut + nrEl, "C")).Value = arrOut
End Sub
It does not select anything, you can run it activating any of the two involved sheets. I would recommend to be in "Sheet2" and see the result. If you want to repeat the test, its result will be added after the previous testing resulted rows...
If something unclear or not doing what you need, do not hesitate to ask for clarifications.
I am writing a VBA code on excel using loops to go through 10000+ lines.
Here is an example of the table
And here is the code I wrote :
Sub Find_Matches()
Dim wb As Workbook
Dim xrow As Long
Set wb = ActiveWorkbook
wb.Worksheets("Data").Activate
tCnt = Sheets("Data").UsedRange.Rows.Count
Dim e, f, a, j, h As Range
xrow = 2
Application.ScreenUpdating = False
Application.Calculation = xlManual
For xrow = 2 To tCnt Step 1
Set e = Range("E" & xrow)
Set f = e.Offset(0, 1)
Set a = e.Offset(0, -4)
Set j = e.Offset(0, 5)
Set h = e.Offset(0, 3)
For Each Cell In Range("E2:E" & tCnt)
If Cell.Value = e.Value Then
If Cell.Offset(0, 1).Value = f.Value Then
If Cell.Offset(0, -4).Value = a.Value Then
If Cell.Offset(0, 5).Value = j.Value Then
If Cell.Offset(0, 3).Value = h.Value Then
If (e.Offset(0, 7).Value) + (Cell.Offset(0, 7).Value) = 0 Then
Cell.EntireRow.Interior.Color = vbYellow
e.EntireRow.Interior.Color = vbYellow
End If
End If
End If
End If
End If
End If
Next
Next
End Sub
As you can imagine, this is taking a lot of time to go through 10000+ lines and I would like to find a faster solution. There must be a method I don't think to avoid the over looping
Here are the condition :
For each line, if another line anywhere in the file has the exact same
:
Buyer ID (col. E)
`# purchased (col. F)
Product ID (col.A)
Payment (col. J)
Date purchased (col. H)
Then, if the SUM of the Amount (col. L) the those two matching line is
0, then color both rows in yellow.
Note that extra columns are present and not being compared (eg- col. B) but are still important for the document and cannot be deleted to ease the process.
Running the previous code, in my example, row 2 & 5 get highlighted :
This is using nested dictionaries and arrays to check all conditions
Timer with my test data: Rows: 100,001; Dupes: 70,000 - Time: 14.217 sec
Option Explicit
Public Sub FindMatches()
Const E = 5, F = 6, A = 1, J = 10, H = 8, L = 12
Dim ur As Range, x As Variant, ub As Long, d As Object, found As Object
Set ur = ThisWorkbook.Worksheets("Data").UsedRange
x = ur
Set d = CreateObject("Scripting.Dictionary")
Set found = CreateObject("Scripting.Dictionary")
Dim r As Long, rId As String, itm As Variant, dupeRows As Object
For r = ur.Row To ur.Rows.Count
rId = x(r, E) & x(r, F) & x(r, A) & x(r, J) & x(r, H)
If Not d.Exists(rId) Then
Set dupeRows = CreateObject("Scripting.Dictionary")
dupeRows(r) = 0
Set d(rId) = dupeRows
Else
For Each itm In d(rId)
If x(r, L) + x(itm, L) = 0 Then
found(r) = 0
found(itm) = 0
End If
Next
End If
Next
Application.ScreenUpdating = False
For Each itm In found
ur.Range("A" & itm).EntireRow.Interior.Color = vbYellow
Next
Application.ScreenUpdating = True
End Sub
Before
After
I suggest a different approach altogether: add a temporary column to your data that contains a concatenation of each cell in the row. This way, you have:
A|B|C|D|E
1|Mr. Smith|500|A|1Mr. Smith500A
Then use Excel's conditional formatting on the temporary column, highlighting duplicate values. There you have your duplicated rows. Now it's only a matter of using a filter to check which ones have amounts equal to zero.
You can use the CONCATENATE function; it requires you to specify each cell separately and you can't use a range, but in your case (comparing only some of the columns) it seems like a good fit.
Maciej's answer is easy to implement (if you can add columns to your data without interrupting anything), and I would recommend it if possible.
However, for the sake of answering your question, I will contribute a VBA solution as well. I tested it on dataset that is a bit smaller than yours, but I think it will work for you. Note that you might have to tweak it a little (which row you start on, table name, etc) to fit your workbook.
Most notably, the segment commented with "Helper column" is something you most likely will have to adjust - currently, it compares every cell between A and H for the current row, which is something you may or may not want.
I've tried to include a little commentary in the code, but it's not much. The primary change is that I'm using in-memory processing of an array rather than iterating over a worksheet range (which for larger datasets should be exponentially faster).
Option Base 1
Option Explicit
' Uses ref Microsoft Scripting Runtime
Sub Find_Matches()
Dim wb As Workbook, ws As Worksheet
Dim xrow As Long, tCnt As Long
Dim e As Range, f As Range, a As Range, j As Range, h As Range
Dim sheetArr() As Variant, arr() As Variant
Dim colorTheseYellow As New Dictionary, colorResults() As String, dictItem As Variant
Dim arrSize As Long, i As Long, k As Long
Dim c As Variant
Set wb = ThisWorkbook
Set ws = wb.Worksheets("Data")
ws.Activate
tCnt = ws.UsedRange.Rows.Count
xrow = 2
Application.ScreenUpdating = False
Application.Calculation = xlManual
' Read range into an array so we process in-memory
sheetArr = ws.Range("A2:H" & tCnt)
arrSize = UBound(sheetArr, 1)
' Build new arr with "helper column"
ReDim arr(1 To arrSize, 1 To 9)
For i = 1 To arrSize
For k = 1 To 8
arr(i, k) = sheetArr(i, k)
arr(i, 9) = CStr(arr(i, 9)) & CStr(arr(i, k)) ' "Helper column"
Next k
Next i
' Iterate over array & build collection to indicate yellow lines
For i = LBound(arr, 1) To UBound(arr, 1)
If Not colorTheseYellow.Exists(i) Then colorResults = Split(ReturnLines(arr(i, 9), arr), ";")
For Each c In colorResults
If Not colorTheseYellow.Exists(CLng(c)) Then colorTheseYellow.Add CLng(c), CLng(c)
Next c
Next i
' Enact row colors
For Each dictItem In colorTheseYellow
'Debug.Print "dict: "; dictItem
If dictItem <> 0 Then ws.ListObjects(1).ListRows(CLng(dictItem)).Range.Interior.Color = vbYellow
Next dictItem
End Sub
Function ReturnLines(ByVal s As String, ByRef arr() As Variant) As String
' Returns a "Index;Index" string indicating the index/indices where the second, third, etc. instance(s) of s was found
' Returns "0;0" if 1 or fewer matches
Dim i As Long
Dim j As Long
Dim tmp As String
ReturnLines = 0
j = 0
tmp = "0"
'Debug.Print "arg: " & s
For i = LBound(arr, 1) To UBound(arr, 1)
If arr(i, 9) = s Then
j = j + 1
'Debug.Print "arr: " & arr(i, 9)
'Debug.Print "ReturnLine: " & i
tmp = tmp & ";" & CStr(i)
End If
Next i
'If Left(tmp, 1) = ";" Then tmp = Mid(tmp, 2, Len(tmp) - 1)
'Debug.Print "tmp: " & tmp
If j >= 2 Then
ReturnLines = tmp
Else
ReturnLines = "0;0"
End If
End Function
On my simple dataset, it yields this result (marked excellently with freehand-drawn color indicators):
Thanks everybody for your answers,
Paul Bica's solution actually worked and I am using a version of this code now.
But, just to animate the debate, I think I also found another way around my first code, inspired by Maciej's idea of concatenating the cells and using CStr to compare the values and, of course Vegard's in-memory processing by using arrays instead of going through the workbook :
Sub Find_MatchesStr()
Dim AmountArr(300) As Variant
Dim rowArr(300) As Variant
Dim ws As Worksheet
Dim wb As Workbook
Set ws = ThisWorkbook.Sheets("Data")
ws.Activate
Range("A1").Select
rCnt = ws.Cells.SpecialCells(xlCellTypeLastCell).Row
For i = 2 To rCnt
If i = rCnt Then
Exit For
Else
intCnt = 0
strA = ws.Cells(i, 1).Value
strE = ws.Cells(i, 5).Value
strF = ws.Cells(i, 6).Value
strH = ws.Cells(i, 8).Value
strL = ws.Cells(i, 10).Value
For j = i To rCnt - 1
strSearchA = ws.Cells(j, 1).Value
strSearchE = ws.Cells(j, 5).Value
strSearchF = ws.Cells(j, 6).Value
strSearchH = ws.Cells(j, 8).Value
strSearchL = ws.Cells(j, 10).Value
If CStr(strE) = CStr(strSearchE) And CStr(strA) = CStr(strSearchA) And CStr(strF) = CStr(strSearchF) And CStr(strH) = CStr(strSearchH) And CStr(strL) = CStr(strSearchL) Then
AmountArr(k) = ws.Cells(j, 12).Value
rowArr(k) = j
intCnt = intCnt + 1
k = k + 1
Else
Exit For
End If
Next
strSum = 0
For s = 0 To UBound(AmountArr)
If AmountArr(s) <> "" Then
strSum = strSum + AmountArr(s)
Else
Exit For
End If
Next
strAppenRow = ""
For b = 0 To UBound(rowArr)
If rowArr(b) <> "" Then
strAppenRow = strAppenRow & "" & rowArr(b) & "," & AmountArr(b) & ","
Else
Exit For
End If
Next
If intCnt = 1 Then
Else
If strSum = 0 Then
For rn = 0 To UBound(rowArr)
If rowArr(rn) <> "" Then
Let rRange = rowArr(rn) & ":" & rowArr(rn)
Rows(rRange).Select
Selection.Interior.Color = vbYellow
Else
Exit For
End If
Next
Else
strvar = ""
strvar = Split(strAppenRow, ",")
For ik = 1 To UBound(strvar)
If strvar(ik) <> "" Then
strVal = CDbl(strvar(ik))
For ik1 = ik To UBound(strvar)
If strvar(ik1) <> "" Then
strVal1 = CDbl(strvar(ik1))
If strVal1 + strVal = 0 Then
Let sRange1 = strvar(ik - 1) & ":" & strvar(ik - 1)
Rows(sRange1).Select
Selection.Interior.Color = vbYellow
Let sRange = strvar(ik1 - 1) & ":" & strvar(ik1 - 1)
Rows(sRange).Select
Selection.Interior.Color = vbYellow
End If
Else
Exit For
End If
ik1 = ik1 + 1
Next
Else
Exit For
End If
ik = ik + 1
Next
End If
End If
i = i + (intCnt - 1)
k = 0
Erase AmountArr
Erase rowArr
End If
Next
Range("A1").Select
End Sub
I still have some mistakes (rows not higlighted when they should be), the above code is not perfect, but I thought it'd be OK to give you an idea of where I was going before Paul Bica's solution came in.
Thanks again !
If your data is only till column L, then use below code, I found it is taking less time to run....
Sub Duplicates()
Application.ScreenUpdating = False
Dim i As Long, lrow As Long
lrow = Cells(Rows.Count, 1).End(xlUp).Row
Range("O2") = "=A2&E2&F2&J2&L2"
Range("P2") = "=COUNTIF(O:O,O2)"
Range("O2:P" & lrow).FillDown
Range("O2:O" & lrow).Copy
Range("O2:O" & lrow).PasteSpecial xlPasteValues
Application.CutCopyMode = False
For i = 1 To lrow
If Cells(i, 16) = 2 Then
Cells(i, 16).EntireRow.Interior.Color = vbYellow
End If
Next
Application.ScreenUpdating = True
Range("O:P").Delete
Range("A1").Select
MsgBox "Done"
End Sub
I'm writing a code to loop through an excel sheet and changing the text (in column B) to uppercase/lowercase, depending on the value of cell in column N on the same row.
Macros purpose:
loop through cells in column B starting at row 2 and changing the string from upper to lowercase or vice versa, depending on the value of the cell in column N (lowercase if value = 5, other cases text should be uppercase)
Code I've got so far:
Sub CAPS()
'
' CAPS Macro
'
Dim Rang As Integer
Dim j As Integer
j = 2
For Each N In Source.Range("N2:N10000") ' Do 10000 rows
Rang = Cells(j, 14)
If Rang = 5 Then
Cells(j, 2).Range("A1").Select
ActiveCell.Value = LCase$(ActiveCell.Text)
Else
ActiveCell.Value = UCase$(ActiveCell.Text)
j = j + 1
End If
Next N
End Sub
I'm a little bit stuck in the looping part, not really a clue how to fix the error(s) in the current code.
Thanks in advance :)
Sub CAPS()
'
' CAPS Macro
'
Dim N as long 'use long here as integer is limite to a 32b character
For N Is 2 to 10000 ' Do 10000 rows
If Cells(N, 14) = 5 Then
Cells(N, 2) = LCase(Cells(N,2)
Else
Cells(N, 2) = UCase(Cells(N,2)
EndIf
Next N
End Sub
This should do the trick, untested though.
You currently have a fixed number of rows you want to test. To optimize your code you could first check how many rows are filled with data. To do so you can use:
DIM lastrow as long
lastrow = Cells(Rows.Count, "B").End(xlUp).Row
And then make the loop with For N Is 2 to lastrow
Also it is good practice to explicitly reference your worksheets, as this prevents undesired results. For example you click on another worksheet whilst the code is running it will continue formatting on that sheet. To do so declare a variable as your worksheet:
DIM ws as worksheet
And set a value to your variable, in this case Sheet1.
Set ws as ThisWorkbook.Worksheets("Sheet1")
Now every time you reference a Cells(), you explicitly say on what sheet that has to be by adding ws. in front of it like such: ws.Cells()
To summarize all that into your code:
Sub CAPS()
'
' CAPS Macro
'
Dim N as long 'use long here as integer is limite to a 32b character
Dim lastrow as long
Dim ws as worksheet
Set ws = ThisWorkbook.Worksheets("Sheet1") 'Set the code to run on Sheet 1 of your current workbook.
lastrow = ws.Cells(Rows.Count, "B").End(xlUp).Row
For N Is 2 to lastrow ' Do all rows that have data in column B
If ws.Cells(N, 14) = 5 Then
ws.Cells(N, 2) = LCase(ws.Cells(N,2)
Else
ws.Cells(N, 2) = UCase(ws.Cells(N,2)
EndIf
Next N
End Sub
Try processing in an array,
Sub CAPS()
'
' CAPS Macro
'
Dim arr As variant, j As Integer
with worksheets("sheet1")
arr = .range(.cells(2, "B"), .cells(.rows.count, "B").end(xlup).offset(0, 12)).value2
for j= lbound(arr, 1) to ubound(arr, 1)
if arr(j, 13) = 5 then
arr(j, 1) = lcase(arr(j, 1))
else
arr(j, 1) = ucase(arr(j, 1))
end if
next j
redim preserve arr(lbound(arr, 1) to ubound(arr, 1), 1 to 1)
.cells(2, "B").resize(ubound(arr, 1), ubound(arr, 2)) = arr
end with
End Sub
You may try something like this...
Sub CAPS()
Dim ws As Worksheet
Dim lr As Long, i As Long
Application.ScreenUpdating = False
Set ws = Sheets("Sheet1") 'Sheet where you have to change the letter case
lr = ws.Cells(Rows.Count, "B").End(xlUp).Row
For i = 2 To lr
Select Case ws.Cells(i, "N")
Case 5
ws.Cells(i, "B") = LCase(ws.Cells(i, "B"))
Case Else
ws.Cells(i, "B") = UCase(ws.Cells(i, "B"))
End Select
Next i
Application.ScreenUpdating = True
End Sub
Another approach using for each loop with Range:
Sub UCaseLCase()
Dim rng, cell As Range
Dim Test As Integer
Test = 5
Set rng = Range(Cells(2, 14), Cells(10000, 14))
For Each cell In rng.Cells
If cell.Value = Test Then
cell.Offset(0, -12) = LCase(cell.Offset(0, -12))
Else
cell.Offset(0, -12) = UCase(cell.Offset(0, -12))
End If
Next cell
End Sub
I know you said in your question starting at row 2 but it's easier just going from last row until row 2.
Hope this can help or at least, learn something new about Loops :)
Sub CAPS()
Dim j As Integer
For j = Range("B2").End(xlDown).Row To 2 Step -1
If Range("N" & j).Value = 5 Then
'uppercase
Range("B" & j).Value = UCase(Range("B" & j).Value)
Else
'lowercase
Range("B" & j).Value = LCase(Range("B" & j).Value)
End If
Next j
End Sub
I've been working on a VBA script for a while now that goes through the values of a column and deletes all rows with values appearing only once (pretty much the inverse of deleting duplicates).
Column Headers to make explanation easier
There are numbers in the 'VTR' column that appear more than once. most appear just once.
I'd like the macro to delete all rows where the number in the 'VTR' column appears only once.(in the case of one of these numbers appearing more than once, the difference lies at the 'AFTARTKRZ' column where the value can either be (GAPNK or GAPN2) or RSLNV or RSVNK. (GAPNK or GAPN2 are the same thing)
i.e a row can appear either once with AFTARTKRZ,
(GAPNK or GAPN2)
-OR twice
either (GAPNKorGAPN2), RSLNV
or (GAPNKorGAPN2), RSVNK
OR thrice
(GAPNK or GAPN2), RSLNV, RSVNK.
I'd like to delete all those that appear only once (GAPNKorGAPN2)
Furthermore, I'd like to then add the values of the 'AFTARTKRZ' values of the duplicates to 2 extra columns at the end.
i.e, when a (GAPNK or GAPN2) appears two or threee other times, I'd like to input the 'AFTARTKRZ' column value in the 2 last columns at the end.
Something like this should be the final result
VTR|AFTARTKRZ | Add1 | Add2
11 |GAPNK |RSLNV | RSVNK| - VTR appeared thrice
12 |GAPN2 |RSLNV | | - Appeared twice as (GAPNKorGAPN2), RSLNV
13 |GAPNK |RSVNK | | - Appeared twice as (GAPNKorGAPN2), RSVNK
14 |GAPN2 | |
15 |GAPNK | |
16 |GAPN2 | |
The relevant part begins at '~~~~ Work on A
Sub Test()
Dim wb As Workbook
Dim ws1 As Worksheet, ws2 As Worksheet, ws3 As Worksheet
Dim RowsToTestC As Range, Delrange As Range
Dim i As Long, Lrow As Long, Lop As Long
Set ws1 = ThisWorkbook.Worksheets(1)
ThisWorkbook.ActiveSheet.Name = "A"
ws1.Copy ThisWorkbook.Sheets(Sheets.Count)
ThisWorkbook.ActiveSheet.Name = "B"
Set ws2 = ThisWorkbook.Worksheets(2)
ws1.Copy ThisWorkbook.Sheets(Sheets.Count)
ThisWorkbook.ActiveSheet.Name = "C"
Set ws2 = ThisWorkbook.Worksheets(3)
'~~~~ Work on C
Worksheets("C").Activate
With ActiveSheet
ActiveSheet.Range("A:AQ").RemoveDuplicates Columns:=6, Header:=xlNo
End With
Worksheets("C").Activate
Application.ScreenUpdating = False
'~~> Delete all but RSV
For Lrow = Range("D" & Rows.Count).End(xlUp).Row To 2 Step -1
If Range("D" & Lrow).Value = "GAPNK" Or Range("D" & Lrow) = "GAPN2" Then
Rows(Lrow).EntireRow.Delete
End If
Next Lrow
'~~~~ Work on B
Worksheets("B").Activate
With ActiveSheet
ActiveSheet.Range("A:AQ").RemoveDuplicates Columns:=6, Header:=xlNo
End With
Worksheets("B").Activate
Application.ScreenUpdating = False
'~~> Delete all but GAP
For Lrow = Range("D" & Rows.Count).End(xlUp).Row To 2 Step -1
If Range("D" & Lrow).Value = "RSVNK" Or Range("D" & Lrow) = "RSLNV" Then
Rows(Lrow).EntireRow.Delete
End If
Next Lrow
'~~~~ Work on A
Worksheets("A").Activate
Range("AR1").Select
ActiveCell.FormulaR1C1 = "RSVNK"
Range("AS1").Select
ActiveCell.FormulaR1C1 = "RSLNV"
With ws1
'~~> Get the last row which has data in Col A
Lop = .Range("A" & .Rows.Count).End(xlUp).Row
'~~> Loop through the rows
For i = 2 To Lop
'~~> For for multiple occurances
If .Cells(i, 6).Value <> "" And .Cells(i, 4).Value <> "" Then
If Application.WorksheetFunction.CountIf(.Columns(6), .Cells(i, 6)) = 1 And _
Application.WorksheetFunction.CountIf(.Columns(4), .Cells(i, 4)) = 1 Then
'~~> Store thee row in a temp range
If Delrange Is Nothing Then
Set Delrange = .Rows(i)
Else
Set Delrange = Union(Delrange, .Rows(i))
End If
End If
End If
Next
End With
End Sub
The logic of your code isn't valid.
The condition If Application.WorksheetFunction.CountIf(.Columns(4), .Cells(i, 4)) = 1 will always be False because this is the column containing your AFTARTKRZ keys. I don't know how many rows of data you have, but even in the 10 row sample you gave us the result is always greater than 1.
I do think you're making this unnecessarily complicated. Aren't you just trying to populate two lists: one of GAPs and the other of RSVs? You then want to create a third list where the GAP entries have corresponding RSV entries?
That could be done in a couple of short routines. You could do away with all of your sheet copying and row deleting and simply write your three lists directly to your sheets.
The code below shows you how this could be done. I've created 4 sheets, so you may need to add another to your Workbook: Sheet1 is your summary list (A), Sheet2 is your GAP list (B), Sheet3 is your RSV list (C), and Sheet4 holds the raw data.
Hopefully this code can get you started:
Option Explicit
Public Sub RunMe()
Const AFTARTKRZ_COL As Long = 4
Const VTR_COL As Long = 6
Dim data As Variant
Dim GAPs As Collection
Dim RSVs As Collection
Dim multis As Collection
Dim vtrKey As String
Dim multi(0 To 1) As Long
Dim i As Long, r As Long, c As Long
Dim v As Variant
Dim countRSV As Long
Dim output() As Variant
'Name your sheets.
'If you have fewer than 3 sheets or
'sheets already names A, B, C then this
'will throw an error.
Sheet1.Name = "A"
Sheet2.Name = "B"
Sheet3.Name = "C"
'Initialise the 3 collections
Set GAPs = New Collection
Set RSVs = New Collection
Set multis = New Collection
'Read the data - I've put my dummy data on Sheet4
data = Sheet4.UsedRange.Value2
'Iterate rows and place row in relevant collection
For r = 1 To UBound(data, 1)
vtrKey = CStr(data(r, VTR_COL))
On Error Resume Next 'removes duplicate entries
Select Case data(r, AFTARTKRZ_COL)
Case Is = "GAPNK", "GAPN2": GAPs.Add r, vtrKey
Case Is = "RSLNV": RSVs.Add r, vtrKey & "|RSLNV"
Case Is = "RSVNK": RSVs.Add r, vtrKey & "|RSVNK"
End Select
On Error GoTo 0
Next
'Check if each GAP also has RSVs
For Each v In GAPs
vtrKey = CStr(data(v, VTR_COL))
countRSV = 0
If Exists(RSVs, vtrKey & "|RSLNV") Then countRSV = countRSV + 1
If Exists(RSVs, vtrKey & "|RSVNK") Then countRSV = countRSV + 2
If countRSV > 0 Then
multi(0) = CLng(v)
multi(1) = countRSV
multis.Add multi, vtrKey
End If
Next
'Write your outputs
'Sheet C
ReDim output(1 To RSVs.Count + 1, 1 To UBound(data, 2))
For c = 1 To UBound(data, 2)
output(1, c) = data(1, c)
Next
i = 2
For Each v In RSVs
For c = 1 To UBound(data, 2)
output(i, c) = data(v, c)
Next
i = i + 1
Next
With Sheet3
.Cells.Clear
.Range("A1").Resize(UBound(output, 1), UBound(output, 2)).Value = output
.Columns.AutoFit
End With
'Sheet B
ReDim output(1 To GAPs.Count + 1, 1 To UBound(data, 2))
For c = 1 To UBound(data, 2)
output(1, c) = data(1, c)
Next
i = 2
For Each v In GAPs
For c = 1 To UBound(data, 2)
output(i, c) = data(v, c)
Next
i = i + 1
Next
With Sheet2
.Cells.Clear
.Range("A1").Resize(UBound(output, 1), UBound(output, 2)).Value = output
.Columns.AutoFit
End With
'Sheet A
ReDim output(1 To multis.Count + 1, 1 To 5)
output(1, 1) = "VTR"
output(1, 2) = "AFTARTKRZ"
output(1, 3) = "Add1"
output(1, 4) = "Add2"
i = 2
For Each v In multis
r = v(0)
output(i, 1) = data(r, VTR_COL)
output(i, 2) = data(r, AFTARTKRZ_COL)
output(i, 2) = data(r, AFTARTKRZ_COL)
Select Case v(1)
Case 1
output(i, 3) = "RSLNV"
output(i, 5) = "Appeared twice as (GAPNK or GAPN2), RSLNV"
Case 2
output(i, 3) = "RSVNK"
output(i, 5) = "Appeared twice as (GAPNK or GAPN2), RSVNK"
Case 3
output(i, 3) = "RSLNV"
output(i, 4) = "RSVNK"
output(i, 5) = "VTR appeared thrice"
End Select
i = i + 1
Next
With Sheet1
.Cells.Clear
.Range("A1").Resize(UBound(output, 1), UBound(output, 2)).Value = output
.Columns.AutoFit
End With
End Sub
Private Function Exists(col As Collection, key As String) As Boolean
Dim v As Variant
On Error Resume Next
v = col(key)
On Error GoTo 0
Exists = Not IsEmpty(v)
End Function
I have similar question to
[combine Rows with Duplicate Values][1]
Excel VBA - Combine rows with duplicate values in one cell and merge values in other cell
I have data in this format (rows are sorted)
Pub ID CH Ref
no 15 1 t2
no 15 1 t88
yes 15 2 t3
yes 15 2 t3
yes 15 2 t6
compare adjacent rows (say row 4 and 5) , if col 2 and 3 match then if col 4 different merge col4, delete row. if col 2,3,4 match then delete row, don't merge col 4
Desired Output
key ID CH Text
no 15 1 t2 t88
yes 15 2 t3 t6
This first code section doesn't work right
Sub mergeCategoryValues()
Dim lngRow As Long
With ActiveSheet
Dim columnToMatch1 As Integer: columnToMatch1 = 2
Dim columnToMatch2 As Integer: columnToMatch2 = 3
Dim columnToConcatenate As Integer: columnToConcatenate = 4
lngRow = .Cells(65536, columnToMatch1).End(xlUp).row
.Cells(columnToMatch1).CurrentRegion.Sort key1:=.Cells(columnToMatch1), Header:=xlYes
.Cells(columnToMatch2).CurrentRegion.Sort key1:=.Cells(columnToMatch2), Header:=xlYes
Do
If .Cells(lngRow, columnToMatch1) = .Cells(lngRow - 1, columnToMatch1) Then 'check col 2 row lngRow, lngRow-1
If .Cells(lngRow, columnToMatch2) = .Cells(lngRow - 1, columnToMatch2) Then 'check col 3 row lngRow, lngRow-1
If .Cells(lngRow - 1, columnToConcatenate) = .Cells(lngRow, columnToConcatenate) Then
Else
.Cells(lngRow - 1, columnToConcatenate) = .Cells(lngRow - 1, columnToConcatenate) & "; " & .Cells(lngRow, columnToConcatenate)
End If
.Rows(lngRow).Delete
End If
End If
lngRow = lngRow - 1
Loop Until lngRow = 1
End With
Actual Output incorrect because when cells merge t3 will not match t3;t6, my comparison on col 4 will only work in very simple case only.
Actual Output
key ID CH Text
no 15 1 t2; t88
yes 15 2 t3; t3; t6
Therefore, I had to add these two sections to split the Concatenate cells and then remove duplicates
'split cell in Col d to col e+ delimited by ;
With Range("D2:D6", Range("D" & Rows.Count).End(xlUp))
.Replace ";", " ", xlPart
.TextToColumns other:=True
End With
'remove duplicates in each row
Dim x, y(), i&, j&, k&, s$
With ActiveSheet.UsedRange
x = .Value: ReDim y(1 To UBound(x, 1), 1 To UBound(x, 2))
For i = 1 To UBound(x)
For j = 1 To UBound(x, 2)
If Len(x(i, j)) Then
If InStr(s & "|", "|" & x(i, j) & "|") = 0 Then _
s = s & "|" & x(i, j): k = k + 1: y(i, k) = x(i, j)
End If
Next j: s = vbNullString: k = 0
Next i
.Value = y()
End With
End Sub
With additional code output is
Pub ID CH Ref
no 15 1 t2 t88
yes 15 2 t3 t6
Question: There must be much easier way to do this right than use three different methods? How about inserting new columns 5+ if col 4 items don't match?
Note: Remove duplicates code was found from user nilem at excelforum.
Edit: Col 1 will always be same if Col 2 and 3 match. If solution is much easier we can assume Col 1 is blank and ignore data.
I have printed book lookup table and need to convert to a simple format that will be used in equipment that use a 1960's language which has very limited commands. I am trying to preformat this data so I only need to search for one row that has all info.
Col D final output can be in col D with delimiter or into col D-K (only 8 max Ref) because I will parse to use on other machine. Whatever method is easier.
The canonical practise for deleting rows is to start at the bottom and work toward the top. In this manner, rows are not skipped. The trick here is to find rows above the current position that match columns B and C and concatenate the strings from column D before removing the row. There are several good worksheet formulas that can acquire the row number of a two-column-match. Putting one of them into practise with application.Evaluate would seem to be the most expedient method of collecting the values from column D.
Sub dedupe_and_collect()
Dim rw As Long, mr As Long, wsn As String
With ActiveSheet '<- set this worksheet reference properly!
wsn = .Name
With .Cells(1, 1).CurrentRegion
.RemoveDuplicates Columns:=Array(2, 3, 4), Header:=xlYes
End With
With .Cells(1, 1).CurrentRegion 'redefinition after duplicate removal
For rw = .Rows.Count To 2 Step -1 'walk backwards when deleting rows
If Application.CountIfs(.Columns(2), .Cells(rw, 2).Value, .Columns(3), .Cells(rw, 3).Value) > 1 Then
mr = Application.Evaluate("MIN(INDEX(ROW(1:" & rw & ")+(('" & wsn & "'!B1:B" & rw & "<>'" & wsn & "'!B" & rw & ")+('" & wsn & "'!C1:C" & rw & "<>'" & wsn & "'!C" & rw & "))*1E+99, , ))")
'concatenate column D
'.Cells(mr, 4) = .Cells(mr, 4).Value & "; " & .Cells(rw, 4).Value
'next free column from column D
.Cells(mr, Columns.Count).End(xlToLeft).Offset(0, 1) = .Cells(rw, 4).Value
.Rows(rw).EntireRow.Delete
End If
Next rw
End With
End With
End Sub
The removal of records on a three-column-match is done with the VBA equivalent of the Date ► Data Tools ► Remove Duplicates command. This only considers columns B, C and D and deletes the lower duplicates (keeping the ones closest to row 1). If Column A is important in this respect, additional coding would have to be added.
It's unclear to me whether you wanted column D as delimited string or separate cells as an end result. Could you clarify?
As I wrote above, I would iterate through the data and collect things into the User Defined Object. There is no need for the data to be sorted in this method; and duplicate REF's will be omitted.
One advantage of a User Defined Object is that it makes debugging easier as you can see more clearly what you have done.
We combine every line where ID and CH are the same, by using the property of the Collection object to raise an error if identical keys are used.
So far as combining the Ref's in a single cell with a delimiter, vs individual cells in columns D:K, either can be done simply. I chose to separate into columns, but changing it to combine into a single column would be trivial.
After Inserting the Class Module, you must rename it: cID_CH
You will note I placed the results on a separate worksheets. You could overwrite the original data, but I would advise against that.
Class Module
Option Explicit
Private pID As Long
Private pCH As Long
Private pPUB As String
Private pREF As String
Private pcolREF As Collection
Public Property Get ID() As Long
ID = pID
End Property
Public Property Let ID(Value As Long)
pID = Value
End Property
Public Property Get CH() As Long
CH = pCH
End Property
Public Property Let CH(Value As Long)
pCH = Value
End Property
Public Property Get PUB() As String
PUB = pPUB
End Property
Public Property Let PUB(Value As String)
pPUB = Value
End Property
Public Property Get REF() As String
REF = pREF
End Property
Public Property Let REF(Value As String)
pREF = Value
End Property
Public Property Get colREF() As Collection
Set colREF = pcolREF
End Property
Public Sub ADD(refVAL As String)
On Error Resume Next
pcolREF.ADD refVAL, refVAL
On Error GoTo 0
End Sub
Private Sub Class_Initialize()
Set pcolREF = New Collection
End Sub
Regular Module
Option Explicit
Sub CombineDUPS()
Dim wsSRC As Worksheet, wsRES As Worksheet
Dim vSRC As Variant, vRES() As Variant, rRES As Range
Dim cI As cID_CH, colI As Collection
Dim I As Long, J As Long
Dim S As String
'Set source and results worksheets and results range
Set wsSRC = Worksheets("sheet1")
Set wsRES = Worksheets("sheet2")
Set rRES = wsRES.Cells(1, 1)
'Get Source data
With wsSRC
vSRC = .Range("A2", .Cells(.Rows.Count, "D").End(xlUp))
End With
'Collect and combine data
Set colI = New Collection
On Error Resume Next
For I = 1 To UBound(vSRC, 1)
Set cI = New cID_CH
With cI
.PUB = vSRC(I, 1)
.ID = vSRC(I, 2)
.CH = vSRC(I, 3)
.REF = vSRC(I, 4)
.ADD .REF
S = CStr(.ID & "|" & .CH)
colI.ADD cI, S
If Err.Number = 457 Then
Err.Clear
colI(S).ADD .REF
ElseIf Err.Number <> 0 Then
Debug.Print Err.Number, Err.Description
Stop
End If
End With
Next I
On Error GoTo 0
'Create and populate Results Array
ReDim vRES(0 To colI.Count, 1 To 11)
'Header row
vRES(0, 1) = "Pub"
vRES(0, 2) = "ID"
vRES(0, 3) = "CH"
vRES(0, 4) = "Ref"
'populate array
For I = 1 To colI.Count
With colI(I)
vRES(I, 1) = .PUB
vRES(I, 2) = .ID
vRES(I, 3) = .CH
For J = 1 To .colREF.Count
vRES(I, J + 3) = .colREF(J)
Next J
End With
Next I
'Write the results to the worksheet
Set rRES = rRES.Resize(UBound(vRES, 1) + 1, UBound(vRES, 2))
With rRES
.EntireColumn.Clear
.Value = vRES
With .Rows(1)
.Font.Bold = True
.HorizontalAlignment = xlCenter
Range(.Cells(4), .Cells(11)).HorizontalAlignment = xlCenterAcrossSelection
End With
.EntireColumn.AutoFit
End With
End Sub
Original
Processed Results
variant using dictionary below
Sub test()
Dim Dic As Object: Set Dic = CreateObject("Scripting.Dictionary")
Dic.Comparemode = vbTextCompare
Dim Cl As Range, x$, y$, i&, Key As Variant
For Each Cl In Range("B2:B" & Cells(Rows.Count, "B").End(xlUp).Row)
x = Cl.Value & "|" & Cl.Offset(, 1).Value
y = Cl.Offset(, 2).Value
If Not Dic.exists(x) Then
Dic.Add x, Cl.Offset(, -1).Value & "|" & y & "|"
ElseIf Dic.exists(x) And Not LCase(Dic(x)) Like "*|" & LCase(y) & "|*" Then
Dic(x) = Dic(x) & "|" & y & "|"
End If
Next Cl
Range("A2:D" & Cells(Rows.Count, "B").End(xlUp).Row).ClearContents
i = 2
For Each Key In Dic
Cells(i, "A") = Split(Dic(Key), "|")(0)
Range(Cells(i, "B"), Cells(i, "C")) = Split(Key, "|")
Cells(i, "D") = Replace(Split(Replace(Dic(Key), "||", ";"), "|")(1), ":", ";")
i = i + 1
Next Key
Set Dic = Nothing
End Sub
before
after