I am working with an excel workbook where I want to find all unique values in a column.
I have code that works by looping through all the rows and for each row looping through a collection of values seen so far and checking if I've seen it before.
It works like this.
Function getUnique(Optional col As Integer) As Collection
If col = 0 Then col = 2
Dim values As Collection
Dim value As Variant
Dim i As Integer
Dim toAdd As Boolean
i = 3 'first row with data
Set values = New Collection
Do While Cells(i, col) <> ""
toAdd = True
For Each value In values
If Cells(i, col).value = value Then toAdd = False
Next value
If toAdd Then values.Add (Cells(i, col).value)
i = i + 1
Loop
Set getUnique = values
End Function
However, Excel AutoFilter is able to find these values much faster. Is there a way to filter and then read the unique values?
I've tried using the AutoFilter.Filters object but all of the .ItemX.Criteria1 values have a "Application-defined or object-defined error" (found using a watch on ActiveSheet.AutoFilter.Filters).
This isn't quite doing what you describe, I think it's processing it less-efficiently because it's checking every cell against every value.
I think this is probably inefficient, because as the values collection grows in length, the second loop will take longer to process.
You could get some improvement if you exit your nested For early:
Do While Cells(i, col) <> ""
For Each value In values
If Cells(i, col).value = value Then
toAdd = False
Else:
values.Add (Cells(i, col).value)
Exit For '### If the value is found, there's no use in checking the rest of the values!
End If
Next value
i = i + 1
Loop
But I think a Dictionary may give you performance improvement. This way, we don't need to loop over the collection, we just make use of the dictionary's .Exists method. If it doesn't exist, we add to the collection, if it does, we don't. Then the function still returns the collection of uniques.
Function getUnique(Optional col As Integer) As Collection
If col = 0 Then col = 2
Dim values As Object
Dim value As Variant
Dim i As Integer
Dim toAdd As Boolean
Dim ret as New Collection
i = 3 'first row with data
Set values = CreateObject("Scripting.Dictionary")
With Cells(i, col)
Do While .Value <> ""
If Not values.Exists(.Value)
values(.Value) = 1
ret.Add(.Value) '## Add the item to your collection
Else
'## Count the occurences, in case you need to use this later
values(.Value) = values(.Value) + 1
End If
i = i + 1
Loop
Set getUnique = ret
End Function
The AdvancedFilter method may come in handy here and produce cleaner, easier to maintain code. This will work so long as you are calling this Function from another VBA module and not from a cell.
Function getUnique(Optional col As Integer) As Collection
If col = 0 Then col = 2
Dim values As Collection
Dim value As Variant
Dim i As Integer
i = 3 'first row with data
Range(Cells(i, col), Cells(Rows.Count, col).End(xlUp)).AdvancedFilter xlFilterCopy, CopyToRange:=Cells(1, Columns.Count)
Set values = New Collection
Dim cel As Range
For Each cel In Range(Cells(1, Columns.Count), Cells(1, Columns.Count).End(xlDown))
values.Add cel.value
Next
Range(Cells(2, Columns.Count), Cells(1, Columns.Count).End(xlDown)).Clear
Set getUnique = values
End Function
Tested with this sub:
Sub Test()
Dim c As Collection
Set c = getUnique(4)
For i = 1 To c.Count
Debug.Print c.Item(i)
Next
End Sub
Related
I have values in column B separated by commas. I need to split them into new rows and keep the other data the same.
I have a variable number of rows.
I don't know how many values will be in the cells in Column B, so I need to loop over the array dynamically.
Example:
ColA ColB ColC ColD
Monday A,B,C Red Email
Output:
ColA ColB ColC ColD
Monday A Red Email
Monday B Red Email
Monday C Red Email
Have tried something like:
colArray = Split(ws.Cells(i, 2).Value, ", ")
For i = LBound(colArray) To UBound(colArray)
Rows.Insert(i)
Next i
Try this, you can easily adjust it to your actual sheet name and column to split.
Sub splitByColB()
Dim r As Range, i As Long, ar
Set r = Worksheets("Sheet1").Range("B999999").End(xlUp)
Do While r.row > 1
ar = Split(r.value, ",")
If UBound(ar) >= 0 Then r.value = ar(0)
For i = UBound(ar) To 1 Step -1
r.EntireRow.Copy
r.Offset(1).EntireRow.Insert
r.Offset(1).value = ar(i)
Next
Set r = r.Offset(-1)
Loop
End Sub
You can also just do it in place by using a Do loop instead of a For loop. The only real trick is to just manually update your row counter every time you insert a new row. The "static" columns that get copied are just a simple matter of caching the values and then writing them to the inserted rows:
Dim workingRow As Long
workingRow = 2
With ActiveSheet
Do While Not IsEmpty(.Cells(workingRow, 2).Value)
Dim values() As String
values = Split(.Cells(workingRow, 2).Value, ",")
If UBound(values) > 0 Then
Dim colA As Variant, colC As Variant, colD As Variant
colA = .Cells(workingRow, 1).Value
colC = .Cells(workingRow, 3).Value
colD = .Cells(workingRow, 4).Value
For i = LBound(values) To UBound(values)
If i > 0 Then
.Rows(workingRow).Insert xlDown
End If
.Cells(workingRow, 1).Value = colA
.Cells(workingRow, 2).Value = values(i)
.Cells(workingRow, 3).Value = colC
.Cells(workingRow, 4).Value = colD
workingRow = workingRow + 1
Next
Else
workingRow = workingRow + 1
End If
Loop
End With
This will do what you want.
Option Explicit
Const ANALYSIS_ROW As String = "B"
Const DATA_START_ROW As Long = 1
Sub ReplicateData()
Dim iRow As Long
Dim lastrow As Long
Dim ws As Worksheet
Dim iSplit() As String
Dim iIndex As Long
Dim iSize As Long
'Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
With ThisWorkbook
.Worksheets("Sheet4").Copy After:=.Worksheets("Sheet4")
Set ws = ActiveSheet
End With
With ws
lastrow = .Cells(.Rows.Count, ANALYSIS_ROW).End(xlUp).Row
End With
For iRow = lastrow To DATA_START_ROW Step -1
iSplit = Split(ws.Cells(iRow, ANALYSIS_ROW).Value2, ",")
iSize = UBound(iSplit) - LBound(iSplit) + 1
If iSize = 1 Then GoTo Continue
ws.Rows(iRow).Copy
ws.Rows(iRow).Resize(iSize - 1).Insert
For iIndex = LBound(iSplit) To UBound(iSplit)
ws.Cells(iRow, ANALYSIS_ROW).Offset(iIndex).Value2 = iSplit(iIndex)
Next iIndex
Continue:
Next iRow
Application.CutCopyMode = False
Application.Calculation = xlCalculationAutomatic
'Application.ScreenUpdating = True
End Sub
A formula solution is close to your requirement.
Cell G1 is the delimiter. In this case a comma.
Helper E1:=SUM(E1,LEN(B1)-LEN(SUBSTITUTE(B1,$H$1,"")))+1
You must fill the above formula one row more.
A8:=a1
Fill this formula to the right.
A9:=LOOKUP(ROW(1:1),$E:$E,A:A)&""
Fill this formula to the right and then down.
B9:=MID($H$1&LOOKUP(ROW(A1),E:E,B:B)&$H$1,FIND("艹",SUBSTITUTE($H$1&LOOKUP(ROW(A1),E:E,B:B)&$H$1,$H$1,"艹",ROW(A2)-LOOKUP(ROW(A1),E:E)))+1,FIND("艹",SUBSTITUTE($H$1&LOOKUP(ROW(A1),E:E,B:B)&$H$1,$H$1,"艹",ROW(A2)-LOOKUP(ROW(A1),E:E)+1))-FIND("艹",SUBSTITUTE($H$1&LOOKUP(ROW(A1),E:E,B:B)&$H$1,$H$1,"艹",ROW(A2)-LOOKUP(ROW(A1),E:E)))-1)&""
Fill down.
Bug:
Numbers will be converted to Text. Of course you can remove the &"" at the end of the formula, but blank cells will be filled with 0.
Given #A.S.H.'s excellent and brief answer, the VBA function below might be a bit of an overkill, but it will hopefully be of some help to someone looking for a more "generic" solution. This method makes sure not to modify the cells to the left, to the right, or above the table of data, in case the table does not start in A1 or in case there is other data on the sheet besides the table. It also avoids copying and inserting entire rows, and it allows you to specify a separator other than a comma.
This function happens to have similarities to #ryguy72's procedure, but it does not rely on the clipboard.
Function SplitRows(ByRef dataRng As Range, ByVal splitCol As Long, ByVal splitSep As String, _
Optional ByVal idCol As Long = 0) As Boolean
SplitRows = True
Dim oldUpd As Variant: oldUpd = Application.ScreenUpdating
Dim oldCal As Variant: oldCal = Application.Calculation
On Error GoTo err_sub
'Modify application settings for the sake of speed
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
'Get the current number of data rows
Dim rowCount As Long: rowCount = dataRng.Rows.Count
'If an ID column is specified, use it to determine where the table ends by finding the first row
' with no data in that column
If idCol > 0 Then
With dataRng
rowCount = .Offset(, idCol - 1).Resize(, 1).End(xlDown).Row - .Row + 1
End With
End If
Dim splitArr() As String
Dim splitLb As Long, splitUb As Long, splitI As Long
Dim editedRowRng As Range
'Loop through the data rows to split them as needed
Dim r As Long: r = 0
Do While r < rowCount
r = r + 1
'Split the string in the specified column
splitArr = Split(dataRng.Cells(r, splitCol).Value & "", splitSep)
splitLb = LBound(splitArr)
splitUb = UBound(splitArr)
'If the string was not split into more than 1 item, skip this row
If splitUb <= splitLb Then GoTo splitRows_Continue
'Replace the unsplit string with the first item from the split
Set editedRowRng = dataRng.Resize(1).Offset(r - 1)
editedRowRng.Cells(1, splitCol).Value = splitArr(splitLb)
'Create the new rows
For splitI = splitLb + 1 To splitUb
editedRowRng.Offset(1).Insert 'Add a new blank row
Set editedRowRng = editedRowRng.Offset(1) 'Move down to the next row
editedRowRng.Offset(-1).Copy Destination:=editedRowRng 'Copy the preceding row to the new row
editedRowRng.Cells(1, splitCol).Value = splitArr(splitI) 'Place the next item from the split string
'Account for the new row in the counters
r = r + 1
rowCount = rowCount + 1
Next
splitRows_Continue:
Loop
exit_sub:
On Error Resume Next
'Resize the original data range to reflect the new, full data range
If rowCount <> dataRng.Rows.Count Then Set dataRng = dataRng.Resize(rowCount)
'Restore the application settings
If Application.ScreenUpdating <> oldUpd Then Application.ScreenUpdating = oldUpd
If Application.Calculation <> oldCal Then Application.Calculation = oldCal
Exit Function
err_sub:
SplitRows = False
Resume exit_sub
End Function
Function input and output
To use the above function, you would specify
the range containing the rows of data (excluding the header)
the (relative) number of the column within the range with the string to split
the separator in the string to split
the optional (relative) number of the "ID" column within the range (if a number >=1 is provided, the first row with no data in this column will be taken as the last row of data)
The range object passed in the first argument will be modified by the function to reflect the range of all the new data rows (including all inserted rows). The function returns True if no errors were encountered, and False otherwise.
Examples
For the range illustrated in the original question, the call would look like this:
SplitRows Range("A2:C2"), 2, ","
If the same table started in F5 instead of A1, and if the data in column G (i.e. the data that would fall in column B if the table started in A1) was separated by Alt-Enters instead of commas, the call would look like this:
SplitRows Range("F6:H6"), 2, vbLf
If the table contained the row header plus 10 rows of data (instead of 1), and if it started in F5 again, the call would look like this:
SplitRows Range("F6:H15"), 2, vbLf
If there was no certainty about the number of rows, but we knew that all the valid rows are contiguous and always have a value in column H (i.e. the 3rd column in the range), the call could look something like this:
SplitRows Range("F6:H1048576"), 2, vbLf, 3
In Excel 95 or lower, you would have to change "1048576" to "16384", and in Excel 97-2003, to "65536".
I'm trying to create an associative table on a sheet that is pulling in data from a different sheet. By associative I mean, if the data is changed in the source data sheet, it would be reflected on the new sheet. I also want to only have the new sheet's table to be contingent on having a certain unique value. In my case, I want to pull up information related to a part number. The original source data will have many rows that contain the same part number, but I only care to display one of them.
This is what I have so far:
Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean
IsInArray = (UBound(Filter(arr, stringToBeFound)) > -1)
End Function
Dim ref() As Variant
Dim row As Integer
row = 92
Worksheets("Part Tracking Scorecard").Activate
While Cells(row, 6).Value:
If IsInArray(Cells(row, 6).Value, ref) Then
row = row + 1
ElseIf Not IsInArray(Cells(row, 6).Value, ref) Then
ReDim Preserve ref(1 To UBound(ref) + 1) As Variant
ref(UBound(ref)) = Cells(row, 6).Value
Worksheets("Unique Parts").Activate
?????
row = row + 1
To satisfy my condition to only showcase the unique part numbers, I initialized an empty array called "ref". Then, as I iterate through the source sheet, I would check if the part number was in ref with the function "IsInArray". If it was in it, it would move onto the next row, if it wasn't add the part number into the empty array and move to the next row.
The portion with the "????" is where I'm having most of my issue trying to figure out. That part is supposed to be where I make the new table with the date from the unique part number. The very simple and tedious thing I could do is make some loop to run through the columns of the rows and put in a vlookup function. I was wondering if there may be a more robust or more elegant way in doing this.
You've had the right reflex tyring to define an array to stock your values. Here are a few tips of how I would get around to doing it (not perfect, but it should help you out):
Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean
IsInArray = (UBound(Filter(arr, stringToBeFound)) > -1)
End Function
Dim Source as Worksheets
Set Source = Worksheets("Part Tracking Scoreboard")
Dim ref1(), ref2() As Variant
Dim row, index, index2 As Integer
row = 92
ref1 = Source.Range(Worksheets(Source.Cells(row,1), Source.Cells(lastrow, last column))
'Start by placing your ENTIRE source sheet in ref1, if your source sheet is big, this will help you win A LOT of time during the looping phase. Notice how I start from row 92 seeing as this is where you started your loop
'lastrow and lastcolumn represent the position of the last cell in your source file
For index = row to lastrow
If Not IsInArray(ref1(row, 6).Value, ref2) Then
ref2(index) = ref1(index) 'copy the entire row from source to ref2
Next index
Dim NewFile as Worksheet
Set Newfile = Sheets("NewSheetName")
Dim ref2dimension_x, ref2dimension_y as Integer 'find dimensions of ref2 array
ref2dimension_x= UBound(ref2, 1) - LBound(ref2, 1) + 1
ref2dimension_y = UBound(ref2, 2) - LBound(ref2, 2) + 1
For index = 2 to ref2dimension_x 'go through entire new sheet and set values
For index2 = 1 to ref2dimension_y
NewFile.Cells(index, index2).Value = ref2(index - 1, index2)
Next index2
Next index
ref1() = nothing
ref2() = nothing 'free up the space occupied by these arrays
I was not sure about what you were trying to do exactly during the else loop. If you intention is to copy the entire row, this should work. If you want to copy only specific data from the source sheet, you will need to find the indexes of the corresponding columns (hardcode them if they are not going to budge, or use a loop to find them through string comparison otherwise).
This solution combines some macros that I use frequently (so even if you don't use them now, they might be helpful in the future). It won't work if the data in the unique table needs to be "live", but if it'd be sufficient for it to be updated whenever the workbook is opened/closed (or on demand), this is a lot less complicated than the array version.
Basically you just:
Copy the main/unduplicated table to a new sheet
Remove duplicates by part number
Remove unnecessary columns from unduplicated table (if applicable)
I'm assuming that your source data is in a formal Excel Table (ListObject). Just swap out "PartTable" for whatever your actual table is called.
Sub makeUniqueTable()
Application.ScreenUpdating = False
Dim MainWS As Worksheet
Set MainWS = ThisWorkbook.Sheets("Part Tracking Scorecard")
Dim UniqueWS As Worksheet
Set UniqueWS = ThisWorkbook.Sheets("Unique Parts")
UniqueWS.Cells.Clear
Call cloneTable(MainWS.ListObjects("PartTable"), "UniquePartTable", UniqueWS)
Dim UniquePartTable As ListObject
Set UniquePartTable = UniqueWS.ListObjects("UniquePartTable")
Call removeDuplicates(UniquePartTable, "Part Number")
'Optional: remove unnecessary columns by listing columns to be deleted...
'Call deleteColumns(UniquePartTable, Array("Unnecessary Column 1", "Unnecessary Column 2"))
'...or kept:
'Call deleteColumns(UniquePartTable, Array("Part Number", "Manufacturer", "Product Description"), True)
Application.ScreenUpdating = True
End Sub
Sub cloneTable(tbl As ListObject, newName As String, Optional newWS As Worksheet = Nothing)
'Copies a table (tbl) to a new worksheet (newWS) and gives it a name (newName)
'If there is any data in newWS, the new table will be added to the right of the used range
'If newWS is omitted, new table will be added to same worksheet as original table
Dim ws As Worksheet
Dim lastColumn As Long
Dim newRng As Range
Dim newTbl As ListObject
If newWS Is Nothing Then
Set ws = tbl.Parent
lastColumn = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
Set newRng = ws.Range(ws.Cells(1, lastColumn + 2), ws.Cells(1 + tbl.ListRows.Count, lastColumn + tbl.ListColumns.Count + 1))
Else
Set ws = newWS
If ws.ListObjects.Count > 0 Then
lastColumn = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
Set newRng = ws.Range(ws.Cells(1, lastColumn + 2), ws.Cells(1 + tbl.ListRows.Count, lastColumn + tbl.ListColumns.Count + 1))
Else
Set newRng = ws.Range(ws.Cells(1, 1), ws.Cells(1 + tbl.ListRows.Count, tbl.ListColumns.Count))
End If
End If
tbl.Range.Copy
newRng.PasteSpecial Paste:=xlPasteValuesAndNumberFormats
Application.CutCopyMode = False
Set newTbl = ws.ListObjects.Add(xlSrcRange, newRng, , xlYes)
newTbl.Name = newName
End Sub
Sub removeDuplicates(tbl As ListObject, Optional colName As Variant = "")
'Removes duplicates from a table (tbl) based on column header names (colName()) provided by user
'If no column names are provided, duplicates will be removed based on all columns in table
Dim i As Long
Dim j As Long
If Not IsArray(colName) Then
If colName = "" Then
ReDim colNumArr(0 To tbl.ListColumns.Count - 1) As Variant
For i = 0 To tbl.ListColumns.Count - 1
colNumArr(i) = tbl.ListColumns(i + 1).Range.Column
Next
Else
ReDim colNumArr(0 To 0) As Variant
colNumArr(0) = tbl.ListColumns(colName).Range.Column
End If
Else
ReDim colNumArr(0 To UBound(colName) - LBound(colName)) As Variant
j = 0
For i = LBound(colName) To UBound(colName)
colNumArr(j) = tbl.ListColumns(colName(i)).Range.Column
j = j + 1
Next
End If
tbl.Range.removeDuplicates Columns:=(colNumArr), Header:=xlYes
End Sub
Sub deleteColumns(tbl As ListObject, ByVal colName As Variant, Optional invert As Boolean = False, Optional sheetCol As Boolean = True)
'Deletes column(s) from sheet based on header names (colName) from a table (tbl)
'Will result in error if provided column contains multiple tables
'colName can be a String or an array of Strings
'Inverted mode deletes all columns *except* those in colName
Dim i As Long
Dim j As Long
Dim x As Boolean
If Not IsArray(colName) Then
tempStr = colName
ReDim colName(1 To 1) As String
colName(1) = tempStr
End If
If invert = False Then
For i = LBound(colName) To UBound(colName)
If sheetCol = True Then
tbl.Parent.Columns(tbl.ListColumns(colName(i)).Range.Column).Delete
Else
tbl.ListColumns(colName(i)).Delete
End If
Next
Else
For i = tbl.ListColumns.Count To 1 Step -1
x = False
For j = LBound(colName) To UBound(colName)
If tbl.HeaderRowRange(i).Value = colName(j) Then
x = True
Exit For
End If
Next
If x = False Then
If sheetCol = True Then
tbl.Parent.Columns(tbl.ListColumns(i).Range.Column).Delete
Else
tbl.ListColumns(i).Delete
End If
End If
Next
End If
End Sub
I have data in a single column.
Some of the cells starts with "index" like: (1), (2), etc.
I want to concatenate such cells in order, and place the result to the next column, and clear the original cells.
Could you please tell me how to do this in VBA? Thank you!
Please see the picture: col_A has the data, col_C and col_D are the desired result
You could do something like this. I can't vouch for it as I've only tested in on your sample case. So it won't work for non-contiguous numbered sub-entries - nor will it work for cases where the sub-entries are not in order. Both of the later could, of course, be incorporated into a more robust version that you've have to refactor on your own. In fact the regular expression already picks up the sub-entry # if you want to implement the latter.
Sub process()
Dim maxRow As Integer: maxRow = 100
Dim items As Collection
Dim regEx As Object
Dim matches As Object
Set items = New Collection
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.IgnoreCase = True
re.Pattern = "\((\d+)\).*"
Dim val As String
Dim row As Integer, rowPtr As Integer: row = 1
Dim matchTest As Boolean, preMatchTest As Boolean: preMatchTest = False
Do While row < maxRow:
val = Cells(row, "A").Value
matchTest = re.Test(val)
If Not preMatchTest And matchTest Then
rowPtr = row
Do While row < maxRow + 1:
val = Cells(row, "A").Value
matchTest = re.Test(val)
If matchTest Then
Set matches = re.Execute(val)
itemNum = matches(0).submatches(0)
items.Add val
Cells(row, "A") = ""
Else
For Each colVal In items:
Cells(rowPtr - 1, "B") = Cells(rowPtr - 1, "B") & colVal
Next
Set items = New Collection
Exit Do
End If
row = row + 1
preMatchTest = matchTest
Loop
End If
preMatchTest = False
row = row + 1
Loop
End Sub
The prematch/match if statement looks for the start of sub-entries and once found goes into the inner loop that adds them to the 'items' collection. After the last one is found the collection is concatenated and stored at the saved location ('rowPtr') of the main entry. Also note that column 'A' and the max # of rows looked at (maxRow) are hardcoded into the macro.
you could use AutoFilter() method and Areas property of Range object
Option Explicit
Sub main()
Dim area As Range
With Range("A1", Cells(Rows.Count, 1).End(xlUp)).Offset(, 2)
.Offset(, -2).Copy .Cells
.AutoFilter Field:=1, Criteria1:="(*"
If Application.WorksheetFunction.Subtotal(103, .Cells) > 1 Then
For Each area In .Resize(.Rows.Count - 1).Offset(1).SpecialCells(xlCellTypeVisible).Areas
area(1).Offset(-1, 1).Value = Join(Application.Transpose(area.Value), "")
area.ClearContents
Next
End If
.Parent.AutoFilterMode = False
End With
End Sub
What I am trying to achieve is to create a vba code to completely delete the rows if the value in column C (Id) is unique. So in example below the rows 6 and 7 would be deleted since the 111115 and 111116 are not showing up more than once in this column C. Any help is welcome! Thanks a lot.
Code so far: (but not working yet)
Sub delete_not_duplicates()
Dim i As Integer, j As Integer, toDel As Boolean, theNum As Integer
i = 2
Do While Cells(i, 3).Value <> ""
toDel = True
theNum = Cells(i, 3).Value
Do While Cells(j, 3).Value <> ""
If i <> j and Cells(j, 3) == theNum Then
toDel = False
Loop
If toDel == true Then
Rows(i).Delete
Else
i = i + 1
End If
Loop
End Sub
The general approach to do do this in a reasonable fast way is to
Get your data into a Variant Array
Loop the array, identifying unique values
Build a range reference to rows to be deleted, but defer the deletion
After the loop, delete all rows in one go
Sub demo()
Dim rDel As Range, rng As Range
Dim dat As Variant
Dim i As Long, cnt As Long
Dim TestCol As Long
' Avoid magic numbers
TestCol = 3 ' Column C
' Reference the correct sheet
With ActiveSheet
' Get data range
Set rng = .Range(.Cells(1, TestCol), .Cells(.Rows.Count, TestCol).End(xlUp))
' Get data as a Variant Array to speed things up
dat = rng.Value
' Loop the Variant Array
For i = 2 To UBound(dat, 1)
' Is value unique?
cnt = Application.CountIfs(rng, dat(i, 1))
If cnt = 1 Then
' If so, add to delete range
If rDel Is Nothing Then
Set rDel = .Cells(i, TestCol)
Else
Set rDel = Union(rDel, .Cells(i, TestCol))
End If
End If
Next
End With
' Do the delete
If Not rDel Is Nothing Then
rDel.EntireRow.Delete
End If
End Sub
I think the most efficient way would be:
Initialize an empty HashSet< Integer> (or whatever generic type you want) which will represent all the unique entries of C (id), let's name it uniqueIdSet
Iterate through the 2D array
if(uniqueIdSet.contains(id)){
//if the id was already seen before, it means it's not unique
uniqueIdSet.remove(id);
}
else{
//we haven't seen this id yet, add it to the unique set
uniqueIdSet.add(id);
}
Iterate through the original array again and do:
if(uniqueSet.contains(id)){
//if the id is unique, remove it from the array.
array.remove(currentRow);
}
Depending on your implementation, you might not be able to remove from the array as you iterate through it. A way around it is initializing a copy of the original array and remove the respective row from there.
I have a column of cells whose values are something like this:
a
a
b
b
c
c
c
c
d
e
f
f
etc.
I'm looking to take the non-duplicated values and paste them into a new column. My pseudocode for this is as follows:
ActiveSheet.Range("a1").End(xlDown).Select
aend = Selection.Row
for acol= 1 to aend
ActiveSheet.Range("b1").End(xlDown).Select
bend = Selection.Row
'if Cells(1,acol).Value <> any of the values in the range Cells(2,1).Value
'to Cells(2,bend).Value, then add the value of Cells(1,acol) to the end of
'column b.
Does my logic in this make sense? I'm not sure how to code the commented portion. If this isn't the most efficient way to do it, could someone suggest a better way? Thanks so much!
Depending on which version of Excel you are using, you can use some built-in Excel functionality to obtain what you want- the whole solution depends on your level of skill with VBA.
Excel 2003:
You can use the Advancedfilter method (documentation) of your range to obtain the unique values and copy them to your target area. Example:
With ActiveSheet
.Range("A1", .Range("A1").End(xlDown)).AdvancedFilter Action:=xlFilterCopy, CopyToRange:=.Range("B1"), Unique:=True
End With
Where B1 is the first cell of the column you wish to copy the unique values to. The only problem with this method is that the first row of the source column ("A1") will be copied to the target range even if it is duplicated. This is because the AdvancedFilter method assumes that the first row is a header.
Therefore, adding an additional code line we have:
With ActiveSheet
.Range("A1", .Range("A1").End(xlDown)).AdvancedFilter Action:=xlFilterCopy, CopyToRange:=.Range("B1"), Unique:=True
.Range("B1").Delete Shift:=xlShiftUp
End With
Excel 2007 / 2010:
You can use the same method as above, or use the RemoveDuplicates method (documentation). This is similar to the AdvancedFilter method, except that RemoveDuplicates works in-place, which means you need to make a duplicate of your source column and then perform the filtering, for example:
With ActiveSheet
.Range("A1", .Range("A1").End(xlDown)).Copy Destination:=.Range("B1")
.Range("B1", .Range("B1").End(xlDown)).RemoveDuplicates Columns:=1, Header:=xlNo
End With
The final parameter Header controls whether the first cell of the source data is copied to the destination (if it's set to true then the method similarly to the AdvancedFilter method).
If you're after a "purer" method, then you can use a VBA Collection or dictionary - I am sure that someone else will offer a solution with this.
I use a collection, which can't have duplicate keys, to get the unique items from a list. Try to add each item to a collection and ignore the errors when there's a duplicate key. Then you'll have a collection with a subset of unique values
Sub MakeUnique()
Dim vaData As Variant
Dim colUnique As Collection
Dim aOutput() As Variant
Dim i As Long
'Put the data in an array
vaData = Sheet1.Range("A1:A12").Value
'Create a new collection
Set colUnique = New Collection
'Loop through the data
For i = LBound(vaData, 1) To UBound(vaData, 1)
'Collections can't have duplicate keys, so try to
'add each item to the collection ignoring errors.
'Only unique items will be added
On Error Resume Next
colUnique.Add vaData(i, 1), CStr(vaData(i, 1))
On Error GoTo 0
Next i
'size an array to write out to the sheet
ReDim aOutput(1 To colUnique.Count, 1 To 1)
'Loop through the collection and fill the output array
For i = 1 To colUnique.Count
aOutput(i, 1) = colUnique.Item(i)
Next i
'Write the unique values to column B
Sheet1.Range("B1").Resize(UBound(aOutput, 1), UBound(aOutput, 2)).Value = aOutput
End Sub
For completeness, I'm posting the Scripting.Dictionary method: it's the commonest alternative to using a VBA.Collection and it avoids the need to rely on error-handling in normal operation.
A VBA Function using the Scripting.Dictionary Object to Return Unique Values from an Excel Range Containing Duplicates:
Option Explicit
' Author: Nigel Heffernan
' May 2012 http://excellerando.blogspot.com
' **** THIS CODE IS IN THE PUBLIC DOMAIN ****
'
' You are advised to segregate this code from
' any proprietary or commercially-confidential
' source code, and to label it clearly. If you
' fail do do so, there is a risk that you will
' impair your right to assert ownership of any
' intellectual property embedded in your work,
' or impair your employers or clients' ability
' to do so if the intellectual property rights
' in your work have been assigned to them.
'
Public Function UniqueValues(SourceData As Excel.Range, _
Optional Compare As VbCompareMethod = vbBinaryCompare _
) As Variant
Application.Volatile False
' Takes a range of values and returns a single-column array of unique items.
' The returned array is the expected data structure for Excel.Range.Value():
' a 1-based 2-Dimensional Array with dimensions 1 to RowCount, 1 to ColCount
' All values in the source are treated as text, and uniqueness is determined
' by case-sensitive comparison. To change this, set the Compare parameter to
' to 1, the value of the VbCompareMethod enumerated constant 'VbTextCompare'
' Error values in cells are returned as "#ERROR" with no further comparison.
' Empty or null cells are ignored: they do not appear in the returned array.
Dim i As Long, j As Long, k As Long
Dim oSubRange As Excel.Range
Dim arrSubRng As Variant
Dim arrOutput As Variant
Dim strKey As String
Dim arrKeys As Variant
Dim dicUnique As Object
' Note the late-binding as 'object' - best practice is to create a reference
' to the Windows Scripting Runtime: this allows you to declare dictUnique as
' Dim dictUnique As Scripting.Dictionary and instantiate it using the 'NEW'
' keyword instead of CreateObject, giving slightly better speed & stability.
If SourceData Is Nothing Then
Exit Function
End If
If IsEmpty(SourceData) Then
Exit Function
End If
Set dicUnique = CreateObject("Scripting.Dictionary")
dicUnique.CompareMode = Compare
For Each oSubRange In SourceData.Areas ' handles noncontiguous ranges
'Use Worksheetfunction.countA(oSubRange) > 0 to ignore empty ranges
If oSubRange.Cells.Count = 1 Then
ReDim arrSubRng(1 To 1, 1 To 1)
arrSubRng(1, 1) = oSubRange.Cells(1, 1).Value
Else
arrSubRng = oSubRange.Value
End If
For i = LBound(arrSubRng, 1) To UBound(arrSubRng, 1)
For j = LBound(arrSubRng, 2) To UBound(arrSubRng, 2)
If IsError(arrSubRng(i, j)) Then
dicUnique("#ERROR") = vbNullString
ElseIf IsEmpty(arrSubRng(i, j)) Then
' no action: empty cells are ignored
Else
' We use the error-tolerant behaviour of the Dictionary:
' If you query a key that doesn't exist, it adds the key
dicUnique(CStr(arrSubRng(i, j))) = vbNullString
End If
Next j
Next i
Erase arrSubRng
Next oSubRange
If dicUnique.Count = 0 Then
UniqueValues = Empty
Else
arrKeys = dicUnique.keys
dicUnique.RemoveAll
ReDim arrOutput(1 To UBound(arrKeys) + 1, 1 To 1)
For k = LBound(arrKeys) To UBound(arrKeys)
arrOutput(k + 1, 1) = arrKeys(k)
Next k
Erase arrKeys
UniqueValues = arrOutput
Erase arrOutput
End If
Set dicUnique = Nothing
End Function
A couple of notes:
This is code for any Excel range, not just the single-column range you asked for.This function tolerates cells with errors, which are difficult to handle in VBA.This isn't Reddit: you can read the comments, they are an aid to understanding and generally beneficial to your sanity.
I would use a simple array, go through all the letters and check if the letter you are on is in the array:
Sub unique_column()
Dim data() As Variant 'array that will store all of the unique letters
c = 1
Range("A1").Select
Do While ActiveCell.Value <> ""
ReDim Preserve data(1 To c) As Variant
If IsInArray(ActiveCell.Value, data()) = False Then 'we are on a new unique letter and will add it to the array
data(c) = ActiveCell.Value
c = c + 1
End If
ActiveCell.Offset(1, 0).Select
Loop
'now we can spit out the letters in the array into a new column
Range("B1").Value = "Unique letters:"
Dim x As Variant
Range("B2").Select
For Each x In data()
ActiveCell.Value = x
ActiveCell.Offset(1, 0).Select
Next x
Range("A1").Select
c = c - 1
killer = MsgBox("Processing complete!" & vbNewLine & c & "unique letters applied.", vbOKOnly)
End Sub
Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean
IsInArray = (UBound(Filter(arr, stringToBeFound)) > -1)
End Function