I am writing VBA code in Excel that uses loops to go through 10,000+ rows.
Here is an example of the table
And here is the code I wrote :
Sub Find_Matches()
Dim wb As Workbook
Dim xrow As Long
Set wb = ActiveWorkbook
wb.Worksheets("Data").Activate
tCnt = Sheets("Data").UsedRange.Rows.Count
Dim e, f, a, j, h As Range
xrow = 2
Application.ScreenUpdating = False
Application.Calculation = xlManual
For xrow = 2 To tCnt Step 1
Set e = Range("E" & xrow)
Set f = e.Offset(0, 1)
Set a = e.Offset(0, -4)
Set j = e.Offset(0, 5)
Set h = e.Offset(0, 3)
For Each Cell In Range("E2:E" & tCnt)
If Cell.Value = e.Value Then
If Cell.Offset(0, 1).Value = f.Value Then
If Cell.Offset(0, -4).Value = a.Value Then
If Cell.Offset(0, 5).Value = j.Value Then
If Cell.Offset(0, 3).Value = h.Value Then
If (e.Offset(0, 7).Value) + (Cell.Offset(0, 7).Value) = 0 Then
Cell.EntireRow.Interior.Color = vbYellow
e.EntireRow.Interior.Color = vbYellow
End If
End If
End If
End If
End If
End If
Next
Next
End Sub
As you can imagine, this takes a lot of time to go through 10,000+ rows, and I would like to find a faster solution. There must be a method I haven't thought of that avoids the nested looping.
Here are the conditions:
For each line, if another line anywhere in the file has the exact same:
Buyer ID (col. E)
# purchased (col. F)
Product ID (col. A)
Payment (col. J)
Date purchased (col. H)
Then, if the SUM of the Amount (col. L) of those two matching lines is 0, color both rows yellow.
Note that extra columns are present and not being compared (e.g. col. B), but they are still important for the document and cannot be deleted to ease the process.
Running the previous code, in my example, rows 2 and 5 get highlighted:
This is using nested dictionaries and arrays to check all conditions
Timer with my test data: Rows: 100,001; Dupes: 70,000 - Time: 14.217 sec
Option Explicit
Public Sub FindMatches()
Const E = 5, F = 6, A = 1, J = 10, H = 8, L = 12
Dim ur As Range, x As Variant, ub As Long, d As Object, found As Object
Set ur = ThisWorkbook.Worksheets("Data").UsedRange
x = ur
Set d = CreateObject("Scripting.Dictionary")
Set found = CreateObject("Scripting.Dictionary")
Dim r As Long, rId As String, itm As Variant, dupeRows As Object
For r = ur.Row To ur.Rows.Count
rId = x(r, E) & x(r, F) & x(r, A) & x(r, J) & x(r, H)
If Not d.Exists(rId) Then
Set dupeRows = CreateObject("Scripting.Dictionary")
dupeRows(r) = 0
Set d(rId) = dupeRows
Else
For Each itm In d(rId)
If x(r, L) + x(itm, L) = 0 Then
found(r) = 0
found(itm) = 0
End If
Next
d(rId)(r) = 0 ' remember this row too, so later matching rows are compared against it
End If
Next
Application.ScreenUpdating = False
For Each itm In found
ur.Range("A" & itm).EntireRow.Interior.Color = vbYellow
Next
Application.ScreenUpdating = True
End Sub
Before
After
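A side note on the row identifier built in the code above: joining the five values with no separator means that two different rows can, in principle, produce the same string (for example "1" & "23" and "12" & "3"). Below is a minimal sketch of a delimited key, assuming Chr(1) never appears in the data. MakeRowKey is a hypothetical helper, not part of the answer's code.

```vba
Option Explicit

' Hypothetical helper: joins any number of values into one key,
' separated by Chr(1) so that adjacent values cannot run together.
' Assumption: Chr(1) does not occur in the worksheet data.
Public Function MakeRowKey(ParamArray parts() As Variant) As String
    Dim i As Long, key As String
    For i = LBound(parts) To UBound(parts)
        If i > LBound(parts) Then key = key & Chr$(1)
        key = key & CStr(parts(i))
    Next i
    MakeRowKey = key
End Function
```

In the loop it could be used as rId = MakeRowKey(x(r, E), x(r, F), x(r, A), x(r, J), x(r, H)).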
I suggest a different approach altogether: add a temporary column to your data that contains a concatenation of each cell in the row. This way, you have:
A|B|C|D|E
1|Mr. Smith|500|A|1Mr. Smith500A
Then use Excel's conditional formatting on the temporary column, highlighting duplicate values. There you have your duplicated rows. Now it's only a matter of using a filter to check which ones have amounts equal to zero.
You can use the CONCATENATE function; it requires you to specify each cell separately and you can't use a range, but in your case (comparing only some of the columns) it seems like a good fit.
Maciej's answer is easy to implement (if you can add columns to your data without interrupting anything), and I would recommend it if possible.
However, for the sake of answering your question, I will contribute a VBA solution as well. I tested it on a dataset that is a bit smaller than yours, but I think it will work for you. Note that you might have to tweak it a little (which row you start on, table name, etc.) to fit your workbook.
Most notably, the segment commented with "Helper column" is something you will most likely have to adjust: currently, it compares every cell between A and H for the current row, which is something you may or may not want.
I've tried to include a little commentary in the code, but it's not much. The primary change is that I'm using in-memory processing of an array rather than iterating over a worksheet range (which for larger datasets should be considerably faster).
Option Base 1
Option Explicit
' Uses ref Microsoft Scripting Runtime
Sub Find_Matches()
Dim wb As Workbook, ws As Worksheet
Dim xrow As Long, tCnt As Long
Dim e As Range, f As Range, a As Range, j As Range, h As Range
Dim sheetArr() As Variant, arr() As Variant
Dim colorTheseYellow As New Dictionary, colorResults() As String, dictItem As Variant
Dim arrSize As Long, i As Long, k As Long
Dim c As Variant
Set wb = ThisWorkbook
Set ws = wb.Worksheets("Data")
ws.Activate
tCnt = ws.UsedRange.Rows.Count
xrow = 2
Application.ScreenUpdating = False
Application.Calculation = xlManual
' Read range into an array so we process in-memory
sheetArr = ws.Range("A2:H" & tCnt)
arrSize = UBound(sheetArr, 1)
' Build new arr with "helper column"
ReDim arr(1 To arrSize, 1 To 9)
For i = 1 To arrSize
For k = 1 To 8
arr(i, k) = sheetArr(i, k)
arr(i, 9) = CStr(arr(i, 9)) & CStr(arr(i, k)) ' "Helper column"
Next k
Next i
' Iterate over array & build collection to indicate yellow lines
For i = LBound(arr, 1) To UBound(arr, 1)
If Not colorTheseYellow.Exists(i) Then colorResults = Split(ReturnLines(arr(i, 9), arr), ";")
For Each c In colorResults
If Not colorTheseYellow.Exists(CLng(c)) Then colorTheseYellow.Add CLng(c), CLng(c)
Next c
Next i
' Enact row colors
For Each dictItem In colorTheseYellow
'Debug.Print "dict: "; dictItem
If dictItem <> 0 Then ws.ListObjects(1).ListRows(CLng(dictItem)).Range.Interior.Color = vbYellow
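Next dictItem
' Restore the application settings changed at the start
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub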
Function ReturnLines(ByVal s As String, ByRef arr() As Variant) As String
' Returns a "Index;Index" string indicating the index/indices where the second, third, etc. instance(s) of s was found
' Returns "0;0" if 1 or fewer matches
Dim i As Long
Dim j As Long
Dim tmp As String
ReturnLines = 0
j = 0
tmp = "0"
'Debug.Print "arg: " & s
For i = LBound(arr, 1) To UBound(arr, 1)
If arr(i, 9) = s Then
j = j + 1
'Debug.Print "arr: " & arr(i, 9)
'Debug.Print "ReturnLine: " & i
tmp = tmp & ";" & CStr(i)
End If
Next i
'If Left(tmp, 1) = ";" Then tmp = Mid(tmp, 2, Len(tmp) - 1)
'Debug.Print "tmp: " & tmp
If j >= 2 Then
ReturnLines = tmp
Else
ReturnLines = "0;0"
End If
End Function
On my simple dataset, it yields this result (marked excellently with freehand-drawn color indicators):
Thanks everybody for your answers,
Paul Bica's solution actually worked and I am using a version of this code now.
But, just to animate the debate, I think I also found another way around my first code, inspired by Maciej's idea of concatenating the cells and using CStr to compare the values and, of course, Vegard's in-memory processing using arrays instead of going through the workbook:
Sub Find_MatchesStr()
Dim AmountArr(300) As Variant
Dim rowArr(300) As Variant
Dim ws As Worksheet
Dim wb As Workbook
Set ws = ThisWorkbook.Sheets("Data")
ws.Activate
Range("A1").Select
rCnt = ws.Cells.SpecialCells(xlCellTypeLastCell).Row
For i = 2 To rCnt
If i = rCnt Then
Exit For
Else
intCnt = 0
strA = ws.Cells(i, 1).Value
strE = ws.Cells(i, 5).Value
strF = ws.Cells(i, 6).Value
strH = ws.Cells(i, 8).Value
strL = ws.Cells(i, 10).Value
For j = i To rCnt - 1
strSearchA = ws.Cells(j, 1).Value
strSearchE = ws.Cells(j, 5).Value
strSearchF = ws.Cells(j, 6).Value
strSearchH = ws.Cells(j, 8).Value
strSearchL = ws.Cells(j, 10).Value
If CStr(strE) = CStr(strSearchE) And CStr(strA) = CStr(strSearchA) And CStr(strF) = CStr(strSearchF) And CStr(strH) = CStr(strSearchH) And CStr(strL) = CStr(strSearchL) Then
AmountArr(k) = ws.Cells(j, 12).Value
rowArr(k) = j
intCnt = intCnt + 1
k = k + 1
Else
Exit For
End If
Next
strSum = 0
For s = 0 To UBound(AmountArr)
If AmountArr(s) <> "" Then
strSum = strSum + AmountArr(s)
Else
Exit For
End If
Next
strAppenRow = ""
For b = 0 To UBound(rowArr)
If rowArr(b) <> "" Then
strAppenRow = strAppenRow & "" & rowArr(b) & "," & AmountArr(b) & ","
Else
Exit For
End If
Next
If intCnt = 1 Then
Else
If strSum = 0 Then
For rn = 0 To UBound(rowArr)
If rowArr(rn) <> "" Then
Let rRange = rowArr(rn) & ":" & rowArr(rn)
Rows(rRange).Select
Selection.Interior.Color = vbYellow
Else
Exit For
End If
Next
Else
strvar = ""
strvar = Split(strAppenRow, ",")
For ik = 1 To UBound(strvar)
If strvar(ik) <> "" Then
strVal = CDbl(strvar(ik))
For ik1 = ik To UBound(strvar)
If strvar(ik1) <> "" Then
strVal1 = CDbl(strvar(ik1))
If strVal1 + strVal = 0 Then
Let sRange1 = strvar(ik - 1) & ":" & strvar(ik - 1)
Rows(sRange1).Select
Selection.Interior.Color = vbYellow
Let sRange = strvar(ik1 - 1) & ":" & strvar(ik1 - 1)
Rows(sRange).Select
Selection.Interior.Color = vbYellow
End If
Else
Exit For
End If
ik1 = ik1 + 1
Next
Else
Exit For
End If
ik = ik + 1
Next
End If
End If
i = i + (intCnt - 1)
k = 0
Erase AmountArr
Erase rowArr
End If
Next
Range("A1").Select
End Sub
I still have some mistakes (rows not highlighted when they should be) and the above code is not perfect, but I thought it'd be OK to give you an idea of where I was going before Paul Bica's solution came in.
Thanks again !
If your data only extends to column L, use the code below; I found it takes less time to run.
Sub Duplicates()
Application.ScreenUpdating = False
Dim i As Long, lrow As Long
lrow = Cells(Rows.Count, 1).End(xlUp).Row
Range("O2") = "=A2&E2&F2&J2&L2"
Range("P2") = "=COUNTIF(O:O,O2)"
Range("O2:P" & lrow).FillDown
Range("O2:O" & lrow).Copy
Range("O2:O" & lrow).PasteSpecial xlPasteValues
Application.CutCopyMode = False
For i = 1 To lrow
If Cells(i, 16) = 2 Then
Cells(i, 16).EntireRow.Interior.Color = vbYellow
End If
Next
Application.ScreenUpdating = True
Range("O:P").Delete
Range("A1").Select
MsgBox "Done"
End Sub
I am trying to exit the function if either the value is equal to the value I'm looking for, or if the row is equal to the row I'm looking for.
But every time I use Exit Function, it doesn't work. And if I replace it with End Function it tells me that I don't have an End to my If statement. And I'm getting lost.
Function recursion(whereItEnds As Integer, lookingFor As Variant, currentMarker As Range, I As Integer, wsEverything As Worksheet) As Integer
Dim col As Integer
Dim newMarker As String
newMarker = currentMarker.Value
Dim currentMarker1 As Range
recursion = 2
col = 2
If (StrComp(lookingFor, newMarker, vbTextCompare) = 0) Then
Exit Function
End If
While (IsEmpty(wsEverything.Cells(col, "B").Value) = False)
If (StrComp(wsEverything.Cells(col, "B").Value, newMarker, vbTextCompare) = 0) Then
wsEverything.Cells.Range("A" & col, "F" & col).Copy
Worksheets("Review").Cells.Range("A" & I).PasteSpecial
Worksheets("Review").Cells.Range("G" & I).Value = col
I = I + 1
Set currentMarker1 = wsEverything.Cells(col, "E")
If (col = whereItEnds) Then
Exit Function
End If
recursion = recursion(whereItEnds, lookingFor, currentMarker1, I, wsEverything)
End If
col = col + 1
Wend
End Function
I'm almost completely out of ideas as to why neither works.
EDIT: It hits the If statements and goes into those blocks. But when debugging, it reaches "Exit Function" and then just keeps going. I just want it to end the function. This pulls data off another long sheet and puts it on a second sheet. It is checking for child-parent circular errors, where a future parent is dependent on the child that originally depended on it.
Is this what you want?
Before (Sheet1):
After (Review Sheet):
Option Explicit
Public Sub TestRecursion()
Dim result As Variant, ws As Worksheet
Set ws = Sheet1
result = Recursion(ws.Cells(2, 8), ws.Cells(2, 2), ws.Cells(2, 5), 2, ws)
End Sub
Public Function Recursion(ByVal whereItEnds As Long, lookingFor As Variant, _
ByRef currentMarker As Range, ByVal i As Long, _
ByRef wsEverything As Worksheet) As Long
Dim col As Long, newMarker As String, currentMarker1 As Range
newMarker = currentMarker.Value
Recursion = 2
col = 2
If StrComp(lookingFor, newMarker, vbTextCompare) = 0 Then Exit Function
While Len(wsEverything.Cells(col, "B").Value2) > 0
If StrComp(wsEverything.Cells(col, "B").Value2, newMarker, vbTextCompare) = 0 Then
wsEverything.Cells.Range("A" & col, "F" & col).Copy
Worksheets("Review").Cells.Range("A" & i).PasteSpecial
Worksheets("Review").Cells.Range("G" & i).Value = col
i = i + 1
Set currentMarker1 = wsEverything.Cells(col, "E")
If col = whereItEnds Then Exit Function
Recursion = Recursion(whereItEnds, lookingFor, currentMarker1, i, wsEverything)
End If
col = col + 1
Wend
End Function
If so, you can provide more context explaining the logic for the expected result
Probably a less convoluted solution can be found for this
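For what it is worth, Exit Function only ends the call that is currently running; the caller one level up then resumes its own While loop, which is likely why the debugger appears to "keep going". Below is a minimal sketch of one way to stop the whole chain, assuming a Boolean return value is acceptable as a stop signal. Descend, stopLevel and visits are hypothetical names used only for illustration.

```vba
Option Explicit

' Hypothetical illustration: each call tries two branches and recurses.
' A True return value means "stop everything" and is passed up by every caller.
Public Function Descend(ByVal level As Long, ByVal stopLevel As Long, _
                        ByRef visits As Long) As Boolean
    Dim branch As Long
    visits = visits + 1
    If level >= stopLevel Then
        Descend = True              ' raise the stop signal
        Exit Function               ' leaves this call only
    End If
    For branch = 1 To 2
        If Descend(level + 1, stopLevel, visits) Then
            Descend = True          ' propagate the signal to our own caller
            Exit Function
        End If
    Next branch
End Function
```

Calling Descend(1, 3, visits) with visits starting at 0 leaves visits at 3. Without the If around the recursive call, the loops at levels 1 and 2 would carry on and 7 calls would be made in total.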
I'm an Excel VBA newbie and I'm trying to get duplicate rows appended to the first occurrence of that row.
For example, we have the table here
I would like to format the data as here
The logic goes like this: whenever we detect that the last name and the birth date are the same for the current and the following line, that means we have a dependant, and we need to append the dependant's data to the "Main" row.
I have started writing code but I'm not able to detect the dependants properly.
Below is what I have. Please consider that I'm a real noob and I'm trying hard.
Sub formatData()
Dim sh As Worksheet
Dim rw As Range
Dim RowCount As Integer
'This variable is checked to see if we have a first occurence of a line
Dim firstOccurence
'Initialise the variables for that will be used to match the data
Dim LocationName
Dim PlanCode
Dim LastName
Dim FirstName
Dim dependantFirstName
Dim dependantLastName
Dim dependantBirthdate
RowCount = 0
firstOccurence = True
'Check if the spreadsheet already exist if not create it.
For i = 1 To Worksheets.Count
If Worksheets(i).Name = "Benefits Census Formatted" Then
exists = True
End If
Next i
If Not exists Then
'Create a new spreadsheet to add the data to
Set ws = Sheets.Add
Sheets.Add.Name = "Benefits Census Formatted"
End If
'Set the ActiveSheet to the one containing the original data
Set sh = Sheets("BENEFIT Census")
With ActiveSheet
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
For Each rw In sh.Rows
'If the data of one cell is empty EXIT THE LOOP
If sh.Cells(rw.Row, 1).Value = "" Then
Exit For
End If
If rw.Row > 1 Then
'Afffecting the variables to the next loop so we can compare the values
nextLocationName = sh.Cells(rw.Row + 1, 1).Value
nextPlanCode = sh.Cells(rw.Row + 1, 2).Value
nextLastName = sh.Cells(rw.Row + 1, 3).Value
nextFirstName = sh.Cells(rw.Row + 1, 4).Value
nextEmploymentDate = sh.Cells(rw.Row + 1, 5).Value
nextBirthDate = sh.Cells(rw.Row + 1, 6).Value
nextDependantFirstName = sh.Cells(rw.Row + 1, 25).Value
nextDependantLastName = sh.Cells(rw.Row + 1, 26).Value
nextDependantBirthdate = sh.Cells(rw.Row + 1, 27).Value
Debug.Print LastName & " - " & FirstName & " ::: " & nextLastName & " - " & nextFirstName & " : " & rw.Row & " : " & firstOccurence
'First time you pass through the loop write the whole lane
If firstOccurence = True Then
'Affecting the variables to the current loops values
LocationName = sh.Cells(rw.Row, 1).Value
PlanCode = sh.Cells(rw.Row, 2).Value
LastName = sh.Cells(rw.Row, 3).Value
FirstName = sh.Cells(rw.Row, 4).Value
dependantFirstName = sh.Cells(rw.Row, 25).Value
dependantLastName = sh.Cells(rw.Row, 26).Value
dependantBirthdate = sh.Cells(rw.Row, 27).Value
'Write the current line
sh.Rows(rw.Row).Copy
'We copy the value into another sheet
Set ns = Sheets("Benefits Census Formatted")
LastRow = ns.Cells(ns.Rows.Count, "A").End(xlUp).Row + 1
ns.Rows(LastRow).PasteSpecial xlPasteValues
firstOccurence = False
Else
'We match the location with the plan code and the last name and first name of the user to find duplicates
If dependantFirstName <> nextDependantFirstName And PlanCode <> nextPlanCode And LastName <> nextLastName And FirstName <> nextFirstName Then
'We find a different dependant if the first name or the last name or the birthdate differs
'If Not (dependantFirstName <> nextDependantFirstName) Or Not (dependantLastName <> nextDependantLastName) Or Not (dependantBirthdate <> nextDependantBirthdate) Then
'We have a dependant Append it to the line
'append the user to the currentLine
'End If
Else
'If the dependantFirstName and the nextDependant First name doesn't match then on the next loop we print the full line
firstOccurence = True
End If
End If
RowCount = RowCount + 1
'End of if row > 2
End If
Next rw
End With
End Sub
This is the code I wrote for you. (Glad to see that so many others did, too. So you got a choice :-))
Sub TransscribeData()
' 25 Mar 2017
Dim WsS As Worksheet ' Source
Dim WsT As Worksheet ' Target
Dim TargetName As String
Dim LastRow As Long ' in WsS
Dim Rs As Long ' Source: row
Dim Rt As Long, Ct As Long ' Target: row / column
Dim Tmp As String
Dim Comp As String ' compare string
' Set Source sheet to the one containing the original data
Set WsS = Worksheets("BENEFIT Census")
LastRow = WsS.Cells(WsS.Rows.Count, NbcName).End(xlUp).Row
Application.ScreenUpdating = False
TargetName = "Benefits Census Formatted"
On Error Resume Next
Set WsT = Worksheets(TargetName) ' Set the Target sheet
If Err Then
' Create it if it doesn't exist
Set WsT = Worksheets.Add(After:=Worksheets(Worksheets.Count))
WsT.Name = TargetName
' insert the column captions here
End If
On Error GoTo 0
Rt = WsT.Cells(WsS.Rows.Count, NfdName).End(xlUp).Row
AddMain WsS, WsT, NbcFirstDataRow, Rt ' Rt is counting in the sub
For Rs = NbcFirstDataRow To LastRow - 1
With WsS.Rows(Rs)
Tmp = .Cells(NbcFname).Value & .Cells(NbcName).Value & .Cells(NbcDob).Value
End With
With WsS.Rows(Rs + 1)
Comp = .Cells(NbcFname).Value & .Cells(NbcName).Value & .Cells(NbcDob).Value
End With
If StrComp(Tmp, Comp, vbTextCompare) Then
AddMain WsS, WsT, Rs + 1, Rt
Else
Ct = WsT.Cells(Rt, WsT.Columns.Count).End(xlToLeft).Column
If Ct > NfdMain Then Ct = Ct + 1
With WsS.Rows(Rs + 1)
WsT.Cells(Rt, Ct + NfdRelate).Value = .Cells(NbcRelate).Value
WsT.Cells(Rt, Ct + NfdDepName).Value = .Cells(NbcDepName).Value
End With
End If
Next Rs
Application.ScreenUpdating = True
End Sub
The above code calls one Sub routine which you must add in the same code module which, by the way, should be a normal code module (by default "Module1" but you can rename it to whatever).
Private Sub AddMain(WsS As Worksheet, WsT As Worksheet, _
Rs As Long, Rt As Long)
' 25 Mar 2017
Rt = Rt + 1
With WsS.Rows(Rs)
WsT.Cells(Rt, NfdFname).Value = .Cells(NbcFname).Value
WsT.Cells(Rt, NfdName).Value = .Cells(NbcName).Value
WsT.Cells(Rt, NfdDob).Value = .Cells(NbcDob).Value
WsT.Cells(Rt, NfdMain).Value = "Main"
End With
End Sub
Observe that I inserted the word "Main" as hard text. You could also copy the content of the appropriate cell in the Source sheet. This procedure only writes the first entry. Dependents are written by other code.
The entire code is controlled by two "enums", enumerations, one for each of the worksheets. Enums are the quickest way to assign names to numbers. Please paste these two enums at the top of your code sheet, before either of the procedures.
Private Enum Nbc ' worksheet Benefit Census
NbcFirstDataRow = 2 ' Adjust as required
NbcFname = 1 ' columns:
NbcName
NbcDob
NbcRelate
NbcDepName
End Enum
Private Enum Nfd ' worksheet Formatted Data
NfdFirstDataRow = 2 ' Adjust as required
NfdName = 1 ' columns:
NfdFname
NfdDob
NfdMain
NfdRelate = 0 ' Offset from NfdMain
NfdDepName
End Enum
Note that the rule of enums is that you can assign any integer to them. If you don't assign any number the value will be one higher than the previous. So, NfdMain = 4, followed by NfdRelate which has an assigned value of 0, followed by NfdDepName which has a value of 0 + 1 = 1.
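The numbering rule can be checked with a small standalone sketch. DemoCols and its members are hypothetical names, not part of the code above; like the other enums, this one has to sit at the top of the module, before any procedure.

```vba
Option Explicit

' Hypothetical enum illustrating implicit numbering
Private Enum DemoCols
    DemoFirst = 1       ' explicitly assigned: 1
    DemoSecond          ' implicit: previous + 1 = 2
    DemoThird           ' implicit: 3
    DemoOffset = 0      ' explicit assignment restarts the sequence at 0
    DemoOffsetNext      ' implicit: 0 + 1 = 1
End Enum

Public Sub ShowDemoCols()
    ' Prints the five values to the Immediate window
    Debug.Print DemoFirst, DemoSecond, DemoThird, DemoOffset, DemoOffsetNext
End Sub
```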
The numbers in these enumerations are columns (and rows). You can control the entire output by adjusting these numbers. For example, "Main" is written into column NfdMain (=4 =D). Change the value to 5 and "Main" will appear in column 5 = E. No need to go rummaging in the code. Consider this a control panel.
In the formatted output I introduced a logic which is slightly different from yours. If you don't like it you can change it easily by modifying the enums. My logic has the family name as the main criterion in the first column (switched from the raw data). In column D I write "Main". But when there is a dependent I write the relationship in column D. Therefore only entries without any dependents will have "Main" in that column. For your first example, the formatted row will show Rasmond / Shawn / 01-01-1990 / Spouse / Jessica, Child 1 / Vanessa.
If you wish to keep the "Main" and place "Spouse" in the next column, just set the enum NfdRelate = 1. With the "control panel" it's that simple.
I would use an approach using Dictionaries to collect and organize the data, and then output it. Judging both by your comments and the code, there is a lot of stuff you haven't included. But the following code will take your original data and output a table close to what you show. Some of the result ordering is different, but it is standardized (i.e. there is a relation listed with every dependent name).
In the dictionary, we use Last Name and Birthdate as the "key" so as to combine what you stated were the duplicates.
We define two Class objects:
A Dependent object, which includes the Name and the Relation
A Family object, which includes the First and Last Names and Birthdate, as well as a collection (dictionary) of the Dependent objects.
Once we have it organized, it is relatively simple to output it as we want.
For a discussion of Classes, you can do an Internet search. I would recommend Chip Pearson's Introduction to Classes
Be sure to read the notes in the code about renaming the class modules, and also setting a reference to Microsoft Scripting Runtime
Class1
Option Explicit
'Rename this module: cDependents
'set reference to Microsoft Scripting Runtime
Private pRelation As String
Private pDepName As String
Public Property Get Relation() As String
Relation = pRelation
End Property
Public Property Let Relation(Value As String)
pRelation = Value
End Property
Public Property Get DepName() As String
DepName = pDepName
End Property
Public Property Let DepName(Value As String)
pDepName = Value
End Property
Class2
Option Explicit
'rename this module: cFamily
'set reference to Microsoft Scripting Runtime
Private pFirstName As String
Private pLastName As String
Private pBirthdate As Date
Private pDependents As Dictionary
Public Property Get FirstName() As String
FirstName = pFirstName
End Property
Public Property Let FirstName(Value As String)
pFirstName = Value
End Property
Public Property Get LastName() As String
LastName = pLastName
End Property
Public Property Let LastName(Value As String)
pLastName = Value
End Property
Public Property Get Birthdate() As Date
Birthdate = pBirthdate
End Property
Public Property Let Birthdate(Value As Date)
pBirthdate = Value
End Property
Public Function ADDDependents(Typ, Nme)
Dim cD As New cDependents
Dim sKey As String
With cD
.DepName = Nme
.Relation = Typ
sKey = .Relation & Chr(1) & .DepName
End With
If Not pDependents.Exists(sKey) Then
pDependents.Add Key:=sKey, Item:=cD
End If
End Function
Public Property Get Dependents() As Dictionary
Set Dependents = pDependents
End Property
Private Sub Class_Initialize()
Set pDependents = New Dictionary
End Sub
Regular Module
Option Explicit
'set reference to Microsoft Scripting Runtime
Sub Family()
Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
Dim vSrc As Variant, vRes As Variant
Dim dF As Dictionary, cF As cFamily
Dim I As Long, J As Long
Dim sKey As String
Dim V As Variant, W As Variant
'Set source and results worksheets and results range
Set wsSrc = Worksheets("sheet1")
Set wsRes = Worksheets("sheet2")
Set rRes = wsRes.Cells(1, 1)
'read source data into array
With wsSrc
vSrc = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=5)
End With
'Collect and organize the family and dependent objects
Set dF = New Dictionary
For I = 2 To UBound(vSrc, 1)
Set cF = New cFamily
With cF
.FirstName = vSrc(I, 1)
.LastName = vSrc(I, 2)
.Birthdate = vSrc(I, 3)
.ADDDependents vSrc(I, 4), vSrc(I, 5)
sKey = .LastName & Chr(1) & .Birthdate
If Not dF.Exists(sKey) Then
dF.Add Key:=sKey, Item:=cF
Else
dF(sKey).ADDDependents vSrc(I, 4), vSrc(I, 5)
End If
End With
Next I
'Results will have two columns for each relation, including Main
' + three columns at the beginning
'get number of extra columns
Dim ColCount As Long
For Each V In dF
I = dF(V).Dependents.Count
ColCount = IIf(I > ColCount, I, ColCount)
Next V
ColCount = ColCount * 2 + 3
ReDim vRes(0 To dF.Count, 1 To ColCount)
vRes(0, 1) = "First Name"
vRes(0, 2) = "Last Name"
vRes(0, 3) = "Birthdate"
vRes(0, 4) = "Dependant"
vRes(0, 5) = "Dependant Name"
For J = 6 To UBound(vRes, 2) Step 2
vRes(0, J) = "Relation " & J - 5
vRes(0, J + 1) = "Dependant Name"
Next J
I = 0
For Each V In dF
I = I + 1
With dF(V)
vRes(I, 1) = .FirstName
vRes(I, 2) = .LastName
vRes(I, 3) = .Birthdate
J = 2
For Each W In .Dependents
J = J + 2
With .Dependents(W)
vRes(I, J) = .Relation
vRes(I, J + 1) = .DepName
End With
Next W
End With
Next V
Set rRes = rRes.Resize(rowsize:=UBound(vRes, 1) + 1, columnsize:=UBound(vRes, 2))
With rRes
.EntireColumn.Clear
.Value = vRes
With .Rows(1)
.Font.Bold = True
.HorizontalAlignment = xlCenter
End With
.EntireColumn.AutoFit
End With
End Sub
Source Data
Results
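The keying idea can also be sketched without class modules. This is a simplified illustration, not the method used above. It assumes a 2-D array with the same five columns as the source data (first name, last name, birthdate, relation, dependent name), it uses late binding (CreateObject) instead of a reference to Microsoft Scripting Runtime, and GroupDependents is a hypothetical name.

```vba
Option Explicit

' Hypothetical helper: groups rows by Last Name + Birthdate.
' Returns a Scripting.Dictionary whose items are Collections of
' "relation: dependent name" strings.
Public Function GroupDependents(ByVal data As Variant) As Object
    Dim groups As Object, i As Long, key As String
    Set groups = CreateObject("Scripting.Dictionary")
    For i = LBound(data, 1) To UBound(data, 1)
        ' Chr(1) separates the two parts of the key, as in the code above
        key = data(i, 2) & Chr$(1) & data(i, 3)
        If Not groups.Exists(key) Then groups.Add key, New Collection
        groups(key).Add data(i, 4) & ": " & data(i, 5)
    Next i
    Set GroupDependents = groups
End Function
```

Each key then holds every dependent found for that family, ready to be written out on a single row.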
I have a string compressed into one cell. I need to separate each part of the string into their own cell, while copying the data from the same row.
Here is my example data:
A | B
Row1 ABC ABD ABE ABF | CODE1
Row2 BCA DBA EBA FBA | CODE2
Row3 TEA BEF | CODE3
The result would be:
A B
ABC CODE1
ABD CODE1
ABE CODE1
ABF CODE1
BCA CODE2
DBA CODE2
EBA CODE2
FBA CODE2
TEA CODE3
BEF CODE3
I have about 2000 rows and would literally take 30 years to use the text to column function for this. So I am trying to write a vba macro. I think I am making this harder than it needs to be. Any thoughts or pushes in the right direction would be appreciated. Thanks in advance for any help.
This will work, (but it's mighty inefficient unless you do it in an array... nevertheless for only 2000 rows, you won't even notice the lag)
Function SplitThis(Str as String, Delimiter as String, SerialNumber as Long) As String
SplitThis = Split(Str, Delimiter)(SerialNumber - 1)
End Function
Use it as
= SPLITTHIS("ABC EFG HIJ", " ", 2)
' The result will be ...
"EFG"
You will still need to put in a whole lot of extra error checking, etc. if you need to use it for a distributed application, as the users might put in values greater than the number of 'split elements' or get delimiters wrong, etc.
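As a rough idea of what that extra checking could look like, here is a minimal sketch that returns an empty string instead of raising an error when the requested element does not exist. SplitThisSafe and its parameter names are hypothetical, and returning an empty string is just one possible convention.

```vba
Option Explicit

' Hypothetical variant of SplitThis with basic bounds checking.
' Returns "" when the delimiter is empty, the position is below 1,
' or the position is larger than the number of split elements.
Public Function SplitThisSafe(ByVal sourceText As String, ByVal delim As String, _
                              ByVal position As Long) As String
    Dim parts() As String
    SplitThisSafe = vbNullString
    If Len(delim) = 0 Or position < 1 Then Exit Function
    parts = Split(sourceText, delim)
    If position - 1 > UBound(parts) Then Exit Function
    SplitThisSafe = parts(position - 1)
End Function
```

It is called the same way, e.g. =SplitThisSafe("ABC EFG HIJ", " ", 2), and a position of 4 on the same text gives an empty cell rather than #VALUE!.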
I like iterating over cells for problems like this post.
' code resides on input sheet
Sub ParseData()
Dim wksOut As Worksheet
Dim iRowOut As Long ' Long avoids overflow once the output passes 32,767 rows
Dim iRow As Long
Dim asData() As String
Dim i As Long
Dim s As String
Set wksOut = Worksheets("Sheet2")
iRowOut = 1
For iRow = 1 To UsedRange.Rows.Count
asData = Split(Trim(Cells(iRow, 1)), " ")
For i = 0 To UBound(asData)
s = Trim(asData(i))
If Len(s) > 0 Then
wksOut.Cells(iRowOut, 1) = Cells(iRow, 2)
wksOut.Cells(iRowOut, 2) = s
iRowOut = iRowOut + 1
End If
Next i
Next iRow
MsgBox "done"
End Sub
Assuming your data is on the first sheet, this populates the second sheet with the formatted data. I also assume that the data is uniform, meaning there is the same type of data on every row until the data ends. I did not attempt the header line.
Public Sub FixIt()
Dim fromSheet As Excel.Worksheet, toSheet As Excel.Worksheet ' each variable needs its own type
Dim fromRow As Long, toRow As Long, k As Long
Dim code As String
Set fromSheet = Me.Worksheets(1)
Set toSheet = Me.Worksheets(2)
' Ignore first row
fromRow = 2
toRow = 1
Dim outsideArr() As String
Dim insideArr() As String
Do While Trim(fromSheet.Cells(fromRow, 1)) <> ""
' Split on the pipe
outsideArr = Split(fromSheet.Cells(fromRow, 1), "|")
' Split left of pipe, trimmed, on space
insideArr = Split(Trim(outsideArr(0)), " ")
' Save the code
code = Trim(outsideArr(UBound(outsideArr)))
' Skip first element of inside array
For k = 1 To UBound(insideArr)
toSheet.Cells(toRow, 1).Value = insideArr(k)
toSheet.Cells(toRow, 2).Value = code
toRow = toRow + 1
Next k
fromRow = fromRow + 1
Loop
End Sub
Let me try as well using Dictionary :)
Sub Test()
Dim r As Range, c As Range
Dim ws As Worksheet
Dim k, lrow As Long, i As Long
Set ws = Sheet1 '~~> change to suit, everything else as is
Set r = ws.Range("B1", ws.Range("B" & ws.Rows.Count).End(xlUp))
With CreateObject("Scripting.Dictionary")
For Each c In r
If Not .Exists(c.Value) Then
.Add c.Value, Split(Trim(c.Offset(0, -1).Value))
End If
Next
ws.Range("A:B").ClearContents
For Each k In .Keys
lrow = ws.Range("A" & ws.Rows.Count).End(xlUp).Row
If lrow = 1 Then i = 0 Else i = 1
ws.Range("A" & lrow).Offset(i, 0) _
.Resize(UBound(.Item(k)) + 1).Value = Application.Transpose(.Item(k))
ws.Range("A" & lrow).Offset(i, 1).Resize(UBound(.Item(k)) + 1).Value = k
Next
End With
End Sub
The above code loads all items into a Dictionary and then returns them to the same Range. HTH.
Here is an approach using a User Defined Type, Collection and arrays. I've been using this lately and thought it might apply. It does make writing the code easier, once you get used to it.
The user defined type is set in a class module. I called the type "CodeData" and gave it two properties -- Code and Data
I assumed your data was in columns A & B starting with row 1; and I put the results on the same worksheet but in columns D & E. This can be easily changed, and put on a different worksheet if that's preferable.
First, enter the following code into a Class Module which you have renamed "CodeData"
Option Explicit
Private pData As String
Private pCode As String
Property Get Data() As String
Data = pData
End Property
Property Let Data(Value As String)
pData = Value
End Property
Property Get Code() As String
Code = pCode
End Property
Property Let Code(Value As String)
pCode = Value
End Property
Then put the following code into a Regular module:
Option Explicit
Sub ParseCodesAndData()
Dim cCodeData As CodeData
Dim colCodeData As Collection
Dim vSrc As Variant, vRes() As Variant
Dim V As Variant
Dim rRes As Range
Dim I As Long, J As Long
'Results start here. But could be on another sheet
Set rRes = Range("D1:E1")
'Get Source Data
vSrc = Range("A1", Cells(Rows.Count, "B").End(xlUp))
'Collect the data
Set colCodeData = New Collection
For I = 1 To UBound(vSrc, 1)
V = Split(vSrc(I, 1), " ")
For J = 0 To UBound(V)
Set cCodeData = New CodeData
cCodeData.Code = Trim(vSrc(I, 2))
cCodeData.Data = Trim(V(J))
colCodeData.Add cCodeData
Next J
Next I
'Write results to array
ReDim vRes(1 To colCodeData.Count, 1 To 2)
For I = 1 To UBound(vRes)
Set cCodeData = colCodeData(I)
vRes(I, 1) = cCodeData.Data
vRes(I, 2) = cCodeData.Code
Next I
'Write array to worksheet
Application.ScreenUpdating = False
rRes.EntireColumn.Clear
rRes.Resize(rowsize:=UBound(vRes, 1)) = vRes
Application.ScreenUpdating = True
End Sub
Here is the solution I devised with help from above. Thanks for the responses!
Sub Splt()
Dim LR As Long, i As Long
Dim X As Variant
Application.ScreenUpdating = False
LR = Range("A" & Rows.Count).End(xlUp).Row
Columns("A").Insert
For i = LR To 1 Step -1
With Range("B" & i)
If InStr(.Value, " ") = 0 Then
.Offset(, -1).Value = .Value
Else
X = Split(.Value, " ")
.Offset(1).Resize(UBound(X)).EntireRow.Insert
.Offset(, -1).Resize(UBound(X) - LBound(X) + 1).Value = Application.Transpose(X)
End If
End With
Next i
Columns("B").Delete
LR = Range("A" & Rows.Count).End(xlUp).Row
With Range("B1:C" & LR)
On Error Resume Next
.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=R[-1]C"
On Error GoTo 0
.Value = .Value
End With
Application.ScreenUpdating = True
End Sub
I need to sync up the values in the curly braces {} found in column C and put them against the user id in column F as seen below.
E.g. on the Emails sheet
becomes this on a new sheet
Sub CopyConditional()
Dim wshS As Worksheet
Dim WhichName As String
Set wshS = ActiveWorkbook.Sheets("Emails")
WhichName = "NewSheet"
Const NameCol = "C"
Const FirstRow = 1
Dim LastRow As Long
Dim SrcRow As Long
Dim TrgRow As Long
Dim wshT As Worksheet
Dim cpt As String
Dim user As String
Dim computers() As String
Dim computer As String
On Error Resume Next
Set wshT = Worksheets(WhichName)
If wshT Is Nothing Then
Set wshT = Worksheets.Add(After:=wshS)
wshT.Name = WhichName
End If
On Error GoTo 0
If wshT.Cells(1, NameCol).value = "" Then
TrgRow = 1
Else
TrgRow = wshT.Cells(wshT.Rows.Count, NameCol).End(xlUp).Row + 1
End If
LastRow = wshS.Cells(wshS.Rows.Count, NameCol).End(xlUp).Row
For SrcRow = FirstRow To LastRow
cpt = wshS.Range("C" & SrcRow).value
user = wshS.Range("F" & SrcRow).value
If InStr(cpt, ":") Then
cpt = Mid(cpt, InStr(1, cpt, ":") + 1, Len(cpt))
End If
If InStr(cpt, ";") Then
computers = Split(cpt, ";")
For i = 0 To UBound(computers)
If computers(i) <> "" Then
wshT.Range("A" & TrgRow).value = user
wshT.Range("B" & TrgRow).value = Mid(Left(computers(i), Len(computers(i)) - 1), 2)
TrgRow = TrgRow + 1
End If
Next
Else
computer = cpt
If computer <> "" Then
wshT.Range("A" & TrgRow).value = user
wshT.Range("B" & TrgRow).value = Mid(Left(computer, Len(computer) - 1), 2)
TrgRow = TrgRow + 1
End If
End If
Next SrcRow
End Sub
I managed to resolve it with the above code but there are 3 niggling issues:
1) The first curly brace is always copied. How do I omit it, so that something like {Computer1 ends up as Computer 1?
2) Where there are two computers in a row, then the output looks something like this:
when it should really be split into two different rows i.e.
User 1 | Computer 1
User 1 | Computer 2
3) If there is text after the last closing curly brace, e.g. {Computer1};{Computer2};Request submitted, then that text is added as a new row. I don't want this; I want it omitted, so the output
should just be:
User 1 | Computer 1
User 1 | Computer 2
How do I go about rectifying these issues?
Try this:
Sub Collapse()
Dim uRng As Range, cel As Range
Dim comps As Variant, comp As Variant, r As Variant, v As Variant
Dim d As Dictionary '~~> Early bind, for Late bind use commented line
'Dim d As Object
Dim a As String
With Sheet1 '~~> Sheet that contains your data
Set uRng = .Range("F1", .Range("F" & .Rows.Count).End(xlUp))
End With
Set d = CreateObject("Scripting.Dictionary")
With d
For Each cel In uRng
a = Replace(cel.Offset(0, -3), "{", "}")
comps = Split(a, "}")
Debug.Print UBound(comps)
For Each comp In comps
If InStr(comp, "Computer") <> 0 _
And Len(Trim(comp)) <= 10 Then '~~> I assumed max Comp# is 99
If Not .Exists(cel) Then
.Add cel, comp
Else
If IsArray(.Item(cel)) Then
r = .Item(cel)
ReDim Preserve r(UBound(r) + 1)
r(UBound(r)) = comp
.Item(cel) = r
Else
r = Array(.Item(cel), comp)
.Item(cel) = r
End If
End If
End If
Next
Next
End With
For Each v In d.Keys
With Sheet2 '~~> sheet you want to write your data to
If IsArray(d.Item(v)) Then
.Range("A" & .Rows.Count).End(xlUp).Offset(1, 0) _
.Resize(UBound(d.Item(v)) + 1) = v
.Range("B" & .Rows.Count).End(xlUp).Offset(1, 0) _
.Resize(UBound(d.Item(v)) + 1) = Application.Transpose(d.Item(v))
Else
.Range("A" & .Rows.Count).End(xlUp).Offset(1, 0) = v
.Range("B" & .Rows.Count).End(xlUp).Offset(1, 0) = d.Item(v)
End If
End With
Next
Set d = Nothing
End Sub
The code above uses the Replace and Split functions to turn your string into an array.
a = Replace(cel.Offset(0, -3), "{", "}") '~~> standardize delimiter
comps = Split(a, "}") '~~> split using standard delimiter
The information is then passed to a Dictionary object, using the user as the key and the computers as the items.
The items passed to the dictionary are filtered with the InStr and Len functions:
If InStr(comp, "Computer") <> 0 _
And Len(Trim(comp)) <= 10 Then
As noted in the code comment, I assumed your highest computer number is 99.
If it is larger, change 10 to whatever length you need to check.
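The Replace, Split and filter steps described above can be tried on their own, without the dictionary. The following is only a minimal sketch: the function name SplitBracesDemo and the "|" separator used to join the results are assumptions made for illustration, not part of the answer's code.

```vba
' Hypothetical helper: returns the pieces that pass the filter, joined by "|".
Function SplitBracesDemo(ByVal src As String) As String
    Dim a As String, comps As Variant, comp As Variant, joined As String

    a = Replace(src, "{", "}")      ' standardize delimiter: every brace becomes "}"
    comps = Split(a, "}")           ' split on the standardized delimiter

    For Each comp In comps
        ' Same filter as above: must contain "Computer" and be at most 10 characters.
        If InStr(comp, "Computer") <> 0 And Len(Trim(comp)) <= 10 Then
            If Len(joined) > 0 Then joined = joined & "|"
            joined = joined & comp
        End If
    Next comp

    SplitBracesDemo = joined
End Function
```

For the question's sample, Debug.Print SplitBracesDemo("{Computer1};{Computer2};Request submitted") prints Computer1|Computer2. The trailing "Request submitted" is dropped only because it does not contain "Computer", so the filter depends on the naming pattern of your data.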
Finally we return the dictionary information to the target worksheet.
Note: You need to add a reference to Microsoft Scripting Runtime if you prefer early binding.
Result: I tried it on a small data sample patterned on what I see in your screenshot.
So assuming you have this data in Sheet1:
Will output data in Sheet2 like this:
I use a custom parse function for this type of operation:
Sub CopyConditional()
' some detail left out
Dim iRow&, Usern$, Computer$, Computers$
For iRow = firstRow To lastRow ' firstRow and lastRow: set these to your data range (left out here)
Usern = Sheets("Emails").Cells(iRow, "F")
Computers = Sheets("Emails").Cells(iRow, "C")
Do
Computer = zParse(Computers) ' gets one computer
If Computer = "" Then Exit Do
' Store Computer and Usern
Loop
Next iRow
End Sub
Function zParse$(Haystack$) ' find all {..}
Static iPosL& '
Dim iPosR&
If iPosL = 0 Then iPosL = 1
iPosL = InStr(iPosL, Haystack, "{") ' Left
If iPosL = 0 Then Exit Function ' no more
iPosR = InStr(iPosL, Haystack, "}") ' Right
If iPosR = 0 Then MsgBox "No matching }": Stop
zParse = Mid$(Haystack, iPosL + 1, iPosR - iPosL - 1)
iPosL = iPosR
End Function
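zParse remembers its position in a Static variable, which is reset only once a string has been scanned to the end. If you would rather have the caller hold the position, one possible variant passes it ByRef. This is a sketch under that assumption; the name NextBraced and its signature are hypothetical and not part of the answer above.

```vba
' Returns the text of the next {...} group at or after pos, or "" when none is left.
' pos is advanced past the closing brace so the caller can simply loop.
Function NextBraced(ByVal haystack As String, ByRef pos As Long) As String
    Dim iL As Long, iR As Long

    If pos < 1 Then pos = 1
    iL = InStr(pos, haystack, "{")
    If iL = 0 Then Exit Function            ' no opening brace left
    iR = InStr(iL, haystack, "}")
    If iR = 0 Then Exit Function            ' unmatched "{": treated as end of input
    NextBraced = Mid$(haystack, iL + 1, iR - iL - 1)
    pos = iR + 1
End Function

' Usage sketch: print every braced group of one string.
Sub DemoNextBraced()
    Dim pos As Long, computer As String
    pos = 1
    Do
        computer = NextBraced("{Computer1};{Computer2};Request submitted", pos)
        If computer = "" Then Exit Do
        Debug.Print computer
    Loop
End Sub
```

As with zParse, an empty group {} returns "" and therefore ends the loop early; that is a limitation of using "" as the end marker.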
1) Use the Mid function to drop the first character:
txt = "{Computer1"
txt = Mid(txt, 2)
' txt is now "Computer1"
2) Use the Split function to separate the entries, combined with the Mid function above:
txt = "{Computer1}{Computer2}"
splt = Split(txt, "}")
For a = 0 To UBound(splt)
result = Mid(splt(a), 2)
Next a
3) Add a conditional statement to the above loop so that only pieces starting with "{" are kept:
txt = "{Computer1}{Computer2}"
splt = Split(txt, "}")
For a = 0 To UBound(splt)
If Left(splt(a), 1) = "{" Then result = Mid(splt(a), 2)
Next a
Use this loop and write each result to the desired cell inside the For...Next loop, and you should be good to go.
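The question's sample separates the groups with semicolons ({Computer1};{Computer2};Request submitted), so one way to apply the three steps above to that shape of data is to split on ";" and keep only the pieces wrapped in braces. This is offered as a sketch rather than a drop-in fix; the function name BracedParts is made up for this example.

```vba
' Hypothetical helper: returns the contents of every {…} piece in a ";"-separated string.
Function BracedParts(ByVal src As String) As Collection
    Dim parts() As String, p As String, i As Long
    Dim found As Collection

    Set found = New Collection
    parts = Split(src, ";")
    For i = 0 To UBound(parts)
        p = Trim$(parts(i))
        ' Keep only pieces of the form {something}; trailing notes are skipped.
        If Len(p) > 2 And Left$(p, 1) = "{" And Right$(p, 1) = "}" Then
            found.Add Mid$(p, 2, Len(p) - 2)   ' drop the surrounding braces
        End If
    Next i
    Set BracedParts = found
End Function
```

Each item of the returned collection can then be written to its own row next to the user, which covers the one-row-per-computer requirement and leaves out text such as "Request submitted".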