VBA Optimizing macro loop - vba

I have 40 sheets with up to ~5k rows each, so I'm dealing with a lot of data, and this macro runs extremely slowly. For example, the first sheet alone requires about 15,219,162 comparisons even though it has only about 380 rows. Is there a way to cut down the number of comparisons my macro has to make?
There are 39,326 unique Twitter names so far, which means 39,326 x 387 rows = 15,219,162 comparisons for the first sheet.
Sub CountInvestorsByTwitterName()
With Application
.Calculation = xlCalculationManual: .ScreenUpdating = False: .DisplayAlerts = False
End With
Dim row_total As Long
Dim Unique_Values_Sheet As Worksheet
Set Unique_Values_Sheet = Sheets(Sheets.Count)
Unique_Values_Sheet.Columns("B:XFD").EntireColumn.Delete
Dim Unique_Values_Sheet_row_total As Long
Unique_Values_Sheet_row_total = Unique_Values_Sheet.Cells(Rows.Count, "A").End(xlUp).Row
Dim Unqiue_Twitter_Names As Range
Set Unqiue_Twitter_Names = Unique_Values_Sheet.Range("A2:A" & Unique_Values_Sheet_row_total).Cells
For Each s In Sheets
If s.Name <> "UNIQUE_DATA" Then
row_total = s.Cells(Rows.Count, "B").End(xlUp).Row
For Each r In s.Range("B2:B" & row_total).Cells
Twitter_Name = r.Value
For Each c In Unqiue_Twitter_Names
If c.Value = Twitter_Name Then
With c
.Offset(0, 1).Value = CDbl(.Offset(0, 1).Value) + 1
.End(xlToRight).Offset(0, 1).Value = s.Name
End With
End If
Next
Next
End If
' Loop through first sheet
' Exit For
Next
With Application
.Calculation = xlCalculationAutomatic: .ScreenUpdating = True: .DisplayAlerts = True
End With
End Sub

Try this:
Option Explicit
Sub CountInvestorsByTwitterName2()
Dim row_total As Long
Dim Unqiue_Twitter_Names As Range
Dim found As Range
Dim sht As Worksheet
Dim r As Range, shtRng As Range
With Application
.Calculation = xlCalculationManual: .ScreenUpdating = False: .DisplayAlerts = False
End With
With Sheets("UNIQUE_DATA")
.Columns("B:XFD").EntireColumn.Delete
Set Unqiue_Twitter_Names = .Range("A2:A" & .Cells(.Rows.Count, "A").End(xlUp).Row).SpecialCells(xlCellTypeConstants, xlTextValues)
End With
For Each sht In Sheets
With sht
If .Name <> "UNIQUE_DATA" Then
Set shtRng = .Range("B2:B" & .Cells(.Rows.Count, "B").End(xlUp).Row).SpecialCells(xlCellTypeConstants, xlTextValues)
For Each r In shtRng
Set found = Unqiue_Twitter_Names.Find(What:=r.Value, LookIn:=xlValues, LookAt:=xlWhole)
If Not found Is Nothing Then
With found
.Offset(0, 1).Value = CDbl(.Offset(0, 1).Value) + 1
.End(xlToRight).Offset(0, 1).Value = sht.Name
End With
End If
Next
End If
End With
Next
With Application
.Calculation = xlCalculationAutomatic: .ScreenUpdating = True: .DisplayAlerts = True
End With
End Sub
If that is not fast enough, you could try an "array" approach: store the relevant sheet cell values in an array and search the array instead of the cells.
A Dictionary approach could also be worth examining.
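As a rough illustration of the Dictionary idea, here is a minimal sketch. It assumes the names sit in column B of every data sheet with a header in row 1, as in the question. TallyColumn and CountInvestorsWithDictionary are hypothetical names, matching is assumed to be case-insensitive, cells holding error values are not handled, and only the counts are written back (the per-sheet name columns from the original macro are left out).

```vba
Option Explicit

' Hypothetical helper: adds one to counts(name) for every non-blank entry in
' column 1 of a 2-D Variant array, starting at firstRow.
Sub TallyColumn(ByRef data As Variant, ByVal counts As Object, ByVal firstRow As Long)
    Dim i As Long
    Dim key As String
    For i = firstRow To UBound(data, 1)
        key = CStr(data(i, 1))
        ' Reading a missing key creates it as Empty, so Empty + 1 gives 1
        If Len(key) > 0 Then counts(key) = counts(key) + 1
    Next i
End Sub

Sub CountInvestorsWithDictionary()
    Dim counts As Object
    Dim sht As Worksheet
    Dim lastRow As Long, i As Long
    Dim names As Variant
    Dim result() As Variant

    Set counts = CreateObject("Scripting.Dictionary")
    counts.CompareMode = vbTextCompare ' assumption: "Name" and "name" are the same handle

    ' One pass over every data sheet: each row costs a single hash lookup
    For Each sht In ThisWorkbook.Worksheets
        If sht.Name <> "UNIQUE_DATA" Then
            lastRow = sht.Cells(sht.Rows.Count, "B").End(xlUp).Row
            If lastRow >= 2 Then
                ' Include the header row so .Value is always a 2-D array
                TallyColumn sht.Range("B1:B" & lastRow).Value, counts, 2
            End If
        End If
    Next sht

    ' Write all counts back to column B of UNIQUE_DATA in one assignment
    With ThisWorkbook.Worksheets("UNIQUE_DATA")
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        If lastRow < 2 Then Exit Sub
        names = .Range("A1:A" & lastRow).Value
        ReDim result(1 To lastRow - 1, 1 To 1)
        For i = 2 To lastRow
            If counts.Exists(CStr(names(i, 1))) Then
                result(i - 1, 1) = counts(CStr(names(i, 1)))
            Else
                result(i - 1, 1) = 0
            End If
        Next i
        .Range("B2:B" & lastRow).Value = result
    End With
End Sub
```

The point of the design is that every row on every sheet is touched once, instead of being compared against all 39,326 names.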

What I would do:
1) Clear the entire 'UNIQUE_DATA' sheet.
2) Loop through all worksheets and, if the sheet's name isn't 'UNIQUE_DATA', copy all rows with content to 'UNIQUE_DATA' (copy and paste the rows, after first working out which rows to copy and at which row to insert them).
3) Sort all rows in 'UNIQUE_DATA' on the column containing the Twitter handles. The macro code is easy to figure out if you record it once with the macro recorder.
4) Loop through all rows in 'UNIQUE_DATA' and compare each row's Twitter handle with the handle in the row below. If they match, delete the row below (and lower the upper bound of your loop counter).
You should end up with all unique Twitter handles.
I have to agree that the last step may take some time. But at least the comparison pass is O(n) rather than the O(n²) you currently have with two nested loops (the sort adds its own cost, typically on the order of n log n). Especially for high values of n, the time difference should be significant.
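The comparison in step 4 can be sketched on an in-memory array rather than on the sheet. This is only an illustration under assumptions: UniqueFromSorted is a hypothetical helper, the input is a 1-D array that is already sorted, and the comparison is assumed to be case-insensitive.

```vba
Option Explicit

' Hypothetical helper: returns the distinct values of an already sorted
' 1-D array by comparing each element with the one before it (one pass).
Function UniqueFromSorted(ByRef sorted As Variant) As Collection
    Dim result As New Collection
    Dim i As Long
    For i = LBound(sorted) To UBound(sorted)
        If i = LBound(sorted) Then
            result.Add sorted(i) ' the first element is always kept
        ElseIf StrComp(CStr(sorted(i)), CStr(sorted(i - 1)), vbTextCompare) <> 0 Then
            result.Add sorted(i) ' differs from its predecessor, so it is new
        End If
    Next i
    Set UniqueFromSorted = result
End Function
```

Because equal values are adjacent after sorting, each element needs only one comparison, which is where the O(n) pass comes from.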

Related

VBA Excel 2010 - For Loop Delete Row if Next Record is different from previous record based on column values

I have a list of rows with several columns, and I want to remove the rows that don't match a criterion based on the values of the previous rows.
Basically, I have a column with a bunch of IDs that repeat themselves, and another column with a date.
I've sorted the records in ascending order by those two columns:
Public Sub sbOrderRecords()
Application.Sheets("sheet1").Select
ActiveSheet.Sort.SortFields.Clear
ActiveSheet.Sort.SortFields.Add Range("A1"), xlSortOnValues, xlAscending
ActiveSheet.Sort.SortFields.Add Range("E1"), xlSortOnValues, xlAscending
With ActiveSheet.Sort
.SetRange Range("A1").CurrentRegion
.Header = xlYes
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
End Sub
So my goal is to delete the records where the ID is equal to the previous record but the date is older, leaving only one record per ID: the one with the newest date.
Public Sub sbDeleteByIMAndDate()
Dim currentIM As String
Dim MaxDateCurrentIM As Date
Dim dateRange As Range
Dim imRange As Range
With Sheets("sheet1")
Set imRange = .Range(.Range("A2"), .Range("A2").End(xlDown))
End With
Application.ScreenUpdating = False
For IM = 1 To imRange.Rows.Count
currentIM = Sheets("Sheet1").Cells(IM, 1).value
currentDate = Sheets("Sheet1").Cells(IM, 5).value
For J = Range(Range("E2"), Range("E2").End(xlDown)).Rows.Count + 1 To 2 Step -1
If currentIM = Sheets("Sheet1").Cells(J, 1).Value And currentDate > (Sheets("Sheet1").Cells(J, 5).Value) Then
Rows(J).EntireRow.Delete
End If
Next J
Next IM
Application.ScreenUpdating = True
End Sub
This seems to work, but it's very slow, and the sheet only has around 6000 records.
Any suggestion would be highly appreciated.
Okay, give this a try and tweak it accordingly if required.
Sub DuplicateRows()
Dim ws As Worksheet
Dim lr As Long, i As Long
Dim Rng As Range
With Application
.Calculation = xlCalculationManual
.EnableEvents = False
.ScreenUpdating = False
End With
Set ws = Sheets("Sheet1")
lr = ws.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
'Assuming Column A is ID column and column E is Date column
ws.Sort.SortFields.Clear
ws.Range("A1").CurrentRegion.Sort key1:=ws.Range("A2"), order1:=xlAscending, key2:=ws.Range("E2"), order2:=xlDescending, Header:=xlYes
For i = lr To 2 Step -1
'Comparing ID column A
If ws.Cells(i, 1) = ws.Cells(i - 1, 1) Then
If Rng Is Nothing Then
Set Rng = ws.Cells(i, 1)
Else
Set Rng = Union(Rng, ws.Cells(i, 1))
End If
End If
Next i
If Not Rng Is Nothing Then
Rng.EntireRow.Delete
End If
With Application
.Calculation = xlCalculationAutomatic
.EnableEvents = True
.ScreenUpdating = True
End With
End Sub
The fastest option would probably be to record a macro while running Remove Duplicates, then modify the recorded code to meet your needs.
NOTE: Remove Duplicates keeps the first entry it finds and deletes the rest, faster than anything I have ever written. Since you are already sorting, that works in your favor.
1) Change the column E sort to xlDescending so your newest records fall above your oldest.
2) Select all the cells and click Remove Duplicates on the Data tab.
3) In the dialog, unselect all columns and select only column A.
I think this should do what you want.
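For reference, the recorded macro usually boils down to something like the sketch below. It assumes the IDs are in column A and the dates in column E with a header row, as in the question; KeepNewestPerId is a hypothetical name and the worksheet is passed in by the caller.

```vba
Option Explicit

' Sketch: sort by ID, newest date first, then let Excel keep the first
' row it finds for each ID.
Sub KeepNewestPerId(ByVal ws As Worksheet)
    Dim data As Range
    Set data = ws.Range("A1").CurrentRegion

    ' Newest date first within each ID, so the row that survives is the newest
    data.Sort Key1:=ws.Range("A1"), Order1:=xlAscending, _
              Key2:=ws.Range("E1"), Order2:=xlDescending, Header:=xlYes

    ' Compare on column 1 (the ID) only; the first occurrence is kept
    data.RemoveDuplicates Columns:=1, Header:=xlYes
End Sub
```

It could be called as KeepNewestPerId Worksheets("Sheet1").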
Efficiency: you are hitting the sheet too hard. All those checks directly against cells, and the modifications to those cells, are killing you. Research the Variant array.
Dim arr() As Variant
arr = Sheets("WHATEVER").Range("A1:E100").Value
That is easy and fast. Now your data is in memory, not in Excel. A Variant array assigned like this starts at row 1, column 1 for the first element: arr(1, 1) is cell A1 and arr(1, 2) is B1.
For IM = 1 To UBound(arr, 1)
currentIM = arr(IM, 1) ' array elements are plain values, so there is no .Value
currentDate = arr(IM, 5)
Next IM
When you want to delete a row in your comparison, you can blank its elements, e.g. arr(1, 1) = "": arr(1, 2) = "". When you are finished, you can write the data back to the worksheet:
Sheets("WHATEVER").Range("A1:E100").Value = arr
You would need to sort afterwards, but this would be faster than your code and slower than Remove Duplicates.

Delete all but header and first visible row

I've been tasked with removing duplicates from a dataset, but in a specific way: I need to apply a filter with two criteria, then remove all visible rows except the first one, which I will be editing on the fly.
I'm sure the solution rests with a loop filtering on each criterion and deleting the relevant rows. However, I'm not sure how to go about it. Using Offset is no good: setting and offsetting a range from the used and visible cells doesn't seem to work, because it always offsets from row 1, not from the visible rows.
The range Dive is from the sheet WS, not the "Compilation Sheet" where the autofilter and duplication removal is taking place.
Sub Dupe_killer()
Dim List As Worksheet
Dim Dive As Range
Dim Hit As Range
Set List = Sheets.Add
Dim aRow As Range
Dim fRow As Range
Dim lRow As Range
Dim r As Range
Dim Rng As Range
Dim FilterRange As Range
With Application
.ScreenUpdating = False
.EnableEvents = False
.Calculation = xlCalculationManual
.DisplayAlerts = False
End With
Worksheets("Compilation Sheet").Activate
If ActiveSheet.FilterMode = False Then
ActiveSheet.Range("A1:bc1").AutoFilter
End If
ActiveWorkbook.Worksheets("Compilation Sheet").AutoFilter.Sort.SortFields.Clear
ActiveSheet.Range("$A$1:$BC$11188").AutoFilter Field:=2, Criteria1:=RGB(255 _
, 0, 255), Operator:=xlFilterCellColor
List.Range("A:A").Value = Worksheets("Compilation Sheet").Range("B:B").Value
List.Range("A:A").RemoveDuplicates Columns:=Array(1)
Set r = List.Range("A2")
Set Dive = Range(r, r.End(xlDown))
For Each Hit In Dive
With Worksheets("Compilation Sheet")
.Range("A1:BC1").AutoFilter Field:=2, Criteria1:=Hit
.Range("A1:BC1").AutoFilter Field:=10, Criteria1:="*", Criteria2:="*,*", Operator:=xlAnd
End With
Set FilterRange = ActiveSheet.UsedRange.Offset(2, 0) _
.SpecialCells(xlCellTypeVisible)
FilterRange.Select
Next Hit
With Application
.ScreenUpdating = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
.DisplayAlerts = True
End With
End Sub
You want to set the range in Dive to be only the visible rows.
Google the syntax for
.SpecialCells(xlCellTypeVisible)
I think I've cracked it. I found a nifty bit of code for selecting the first visible cell. I could then hide that row and delete all the remaining visible rows.
Range("A2", Cells(Rows.Count, "A").End(xlUp)).SpecialCells(xlCellTypeVisible).Cells(1, 1).Select
ActiveCell.EntireRow.Hidden = True
ActiveSheet.UsedRange.SpecialCells(xlCellTypeVisible).Delete
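A variation on the same idea that avoids Select and leaves the header row alone might look like the sketch below. It assumes the header is in row 1, the filter is already applied, and column A is filled for every data row; DeleteVisibleExceptFirst is a hypothetical name.

```vba
Option Explicit

' Sketch: delete every visible data row except the first visible one.
Sub DeleteVisibleExceptFirst(ByVal ws As Worksheet)
    Dim lastRow As Long
    Dim visibleCells As Range
    Dim c As Range
    Dim toDelete As Range
    Dim firstSeen As Boolean

    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    If lastRow < 2 Then Exit Sub

    ' SpecialCells raises an error when nothing is visible, so trap it
    On Error Resume Next
    Set visibleCells = ws.Range("A2:A" & lastRow).SpecialCells(xlCellTypeVisible)
    On Error GoTo 0
    If visibleCells Is Nothing Then Exit Sub

    For Each c In visibleCells.Cells
        If Not firstSeen Then
            firstSeen = True            ' keep the first visible row
        ElseIf toDelete Is Nothing Then
            Set toDelete = c
        Else
            Set toDelete = Union(toDelete, c)
        End If
    Next c

    ' Delete everything collected in a single operation
    If Not toDelete Is Nothing Then toDelete.EntireRow.Delete
End Sub
```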

Large range duplicate removal from another sheet

The objective is to remove all the rows in Sheet1 whose column A value exists in the list in Sheet2 column A.
Both columns contain only numbers.
Sheet1 column A may contain duplicates, which is fine as long as they are not on the list in Sheet2.
One option that I'm not familiar with, and might be missing out on, is AutoFilter.
The code works on a small data range of 100 to 1,000 rows, but I have many workbooks with over 1,000,000 records to clean up, and anything over 10,000 makes Excel stop responding and freeze indefinitely.
Sub remDupesfromTwoWs()
With Application
.EnableEvents = False
CalcMode = .Calculation
.Calculation = xlCalculationManual
.ScreenUpdating = False
End With
' set range to be searched
Dim masterRecordRange As Range ' a Range object, not an array
Set masterRecordRange = Range("Sheet1!A2:A316730") ' the cells to be searched
' store sheet2 column A as searchfor array
Dim unwantedRecords() As Variant ' declare an unallocated array.
unwantedRecords = Range("Sheet2!A1:A282393") ' unwantedRecords is now an allocated array
' foreach masterRecord loop to search masterRecordRange for match in unwantedRecords
Dim i As Double
Dim delRange As Range
Set delRange = Range("A" & ActiveSheet.Rows.Count)
'go through all rows starting at last row
For i = masterRecordRange.Rows.Count To 1 Step -1
' loop through unwantedRecords check each offset
For Each findMe In unwantedRecords
'If StrComp(cell, findMe, 1) = 0 Then not as fast
' unwantedRecord found
If Cells(i, 1).Value = findMe Then
Set delRange = Union(delRange, Range("A" & i))
'MsgBox i
Exit For
End If
Next findMe
Next i
'remove them all in one shot
delRange.EntireRow.Delete
With Application
.EnableEvents = True
.Calculation = CalcMode ' restore the calculation mode saved at the start
.ScreenUpdating = True
End With
'possibly count and display quantity found
MsgBox "finally done!"
End Sub
It is very slow to walk through a range one cell at a time because there is a large overhead on each call to Cells. So you should get both ranges into variant arrays, then compare them to build up another array of matches which you would then write back to the worksheet and use Autofilter to select the rows to delete.
Here is a blog post on various methods of comparing lists:
VBA Comparing lists shootout
The fastest method is to use either a Dictionary or a collection. You should be able to adapt the code to do what you want.
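A minimal sketch of the array-plus-Dictionary comparison described above, under stated assumptions: FlagUnwanted is a hypothetical helper, both inputs are 2-D Variant arrays as returned by Range.Value for a multi-cell range, and values are compared after converting them to strings.

```vba
Option Explicit

' Hypothetical helper: returns a one-column 2-D array of True/False flags,
' True where the value in master also occurs in unwanted.
Function FlagUnwanted(ByRef master As Variant, ByRef unwanted As Variant) As Variant
    Dim lookup As Object
    Dim flags() As Variant
    Dim key As String
    Dim i As Long

    ' Load the unwanted list once; duplicates simply overwrite the same key
    Set lookup = CreateObject("Scripting.Dictionary")
    For i = LBound(unwanted, 1) To UBound(unwanted, 1)
        key = CStr(unwanted(i, 1))
        If Len(key) > 0 Then lookup(key) = True
    Next i

    ' One hash lookup per master row instead of an inner loop
    ReDim flags(LBound(master, 1) To UBound(master, 1), 1 To 1)
    For i = LBound(master, 1) To UBound(master, 1)
        flags(i, 1) = lookup.Exists(CStr(master(i, 1)))
    Next i

    FlagUnwanted = flags
End Function
```

The returned flags can be written to a spare helper column in one assignment, after which AutoFilter on True selects the rows to delete in a single operation.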
Have you ever tried Range.Find:
Sub TestIt()
Dim ws1 As Worksheet, ws2 As Worksheet
Dim LastRow As Long, DestLast As Long, CurRow As Long
Set ws1 = Sheets("Sheet1")
Set ws2 = Sheets("Sheet2")
LastRow = ws1.Range("A" & Rows.Count).End(xlUp).Row
DestLast = ws2.Range("A" & Rows.Count).End(xlUp).Row
For CurRow = LastRow to 2 Step -1 'Must go backwards because you are deleting rows
If Not ws2.Range("A2:A" & DestLast).Find(ws1.Range("A" & CurRow).Value, LookIn:=xlValues, LookAt:=xlWhole) is Nothing Then
ws1.Range("A" & CurRow).EntireRow.Delete xlShiftUp
End If
Next CurRow
End Sub

Handling Merged Cells when Deleting Rows

I'm working on writing a module to remove unwanted text from a number of worksheets within a single workbook. I've pieced together enough to remove rows that have a specific font type, and rows that are empty; however, I've hit a snag.
The worksheets have a number of merged cells. I want to delete specific rows based on key phrases. For example, if "Comments" is found anywhere in column A, delete the row. However, if "Comments" is merged across A2:A4, the text in B3:B4 remains, leaving junk in the sheets that I don't want.
Is there a way to delete the merged cell, and everything to the right of it in the rows it spans, if the value in column A is one of the keywords I'm looking for?
Here's what I have so far...
Sub Delete_Rows_Courier()
Dim ws As Excel.Worksheet
Dim LastRow As Long
Dim i As Long ' Long avoids an overflow on sheets with more than 32,767 rows
For Each ws In Application.ThisWorkbook.Worksheets
LastRow = ws.Cells(Rows.Count, 1).End(xlUp).Row
i = 1
Do While i <= LastRow
If ws.Range("A" & i).Font.Name = "Courier New" Then
ws.Rows(i).Delete
i = i - 1
LastRow = LastRow - 1
End If
i = i + 1
Loop
Next
End Sub
Sub Delete_Empty_Rows()
Dim ws As Worksheet
Dim wb As Workbook
Dim i As Long
For Each ws In Application.ThisWorkbook.Worksheets
'Deletes the entire row within the selection if the ENTIRE row contains no data.
'We use Long in case they have over 32,767 rows selected.
'We turn off calculation and screenupdating to speed up the macro.
With Application
.Calculation = xlCalculationManual
.ScreenUpdating = False
'We work backwards because we are deleting rows.
For i = ws.UsedRange.Rows.Count To 1 Step -1
If WorksheetFunction.CountA(ws.Rows(i)) = 0 Then
ws.Rows(i).EntireRow.Delete
End If
Next i
.Calculation = xlCalculationAutomatic
.ScreenUpdating = True
End With
Next ws
End Sub
Sub RunMacros()
Delete_Empty_Rows
Delete_Rows_Courier
End Sub
Use the MergeArea of the cell to delete every row that the merged cell spans:
With ws.Range("A" & i).MergeArea
x = .Rows.Count 'if you need to know how many rows were deleted
.EntireRow.Delete 'delete merged rows
End With
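Put into a loop, the MergeArea idea might look like the sketch below. It assumes the keywords are in column A and that no cell there holds an error value; DeleteMergedKeywordRows is a hypothetical name and the keyword match is a case-insensitive substring test.

```vba
Option Explicit

' Sketch: walk column A bottom-up and delete every row spanned by a
' (possibly merged) cell whose text contains the keyword.
Sub DeleteMergedKeywordRows(ByVal ws As Worksheet, ByVal keyword As String)
    Dim i As Long
    Dim topRow As Long

    i = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    Do While i >= 1
        ' For an unmerged cell, MergeArea is just the cell itself
        With ws.Cells(i, 1).MergeArea
            topRow = .Row ' remember where the block starts before deleting
            If InStr(1, CStr(.Cells(1, 1).Value), keyword, vbTextCompare) > 0 Then
                .EntireRow.Delete
            End If
        End With
        i = topRow - 1 ' continue above the block just examined
    Loop
End Sub
```

Working upwards means a deletion never shifts rows that are still waiting to be checked.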

Use Cell Value Reference in VBA to determine range

I know this is a pretty basic question, but I'm still working on building my VBA skills. I am in a predicament: I have made a mapping system for various reports I receive that get placed in a compiled workbook. These reports have entirely different formats. I have a copy/paste macro that copies columns and places them in their correct positions in the compiled workbook.
However, I've run into situations where there are a lot of duplicate or empty rows that break my macro. I have used two VBA macros to solve this. One is a "delete row if reference column is blank" macro:
Sub DeleteBlankARows()
With Application
.Calculation = xlCalculationManual
.ScreenUpdating = False
Dim r As Long
For r = Cells(Rows.Count, 6).End(xlUp).Row To 1 Step -1
If Cells(r, 6) = "" Then Rows(r).Delete
Next r
.Calculation = xlCalculationAutomatic
.ScreenUpdating = True
End With
End Sub
This deletes rows where cells in column F are empty
I also use a copy/paste down macro:
Sub CopyUntilBlank()
Dim last_row As Integer
last_row = Range("f1").End(xlDown).Row
Dim rng As Range
Set rng = Range("d2:d" & last_row)
For Each cell In rng.Cells
cell.Activate
If ActiveCell.Value = "" Then
ActiveCell.Value = ActiveCell.Offset(-1, 0).Value
End If
Next cell
End Sub
This fills each blank cell in column D with the value from the cell above it, down to the last row of values in column F.
These macros work well for me, but because I have multiple sheets like this, I would like to use cell references to make the ranges dynamic. For instance, in the DeleteBlankARows macro, I would like the column reference in Cells(r, 6) to be determined by a cell value on Sheet1: if the value in cell A1 on Sheet1 is 2, the column reference would change to 2 (column B).
I would like the same to happen for the copy/paste-down macro. I'm pretty sure this is just some reference to A1.Value, but I don't know how to write it properly.
Thank you for your support, I've gone quite a long way with all the support of the community.
An example using your first sub:
Sub DeleteBlankARows()
Dim colIndex As Long
colIndex = Sheet1.Range("A1").Value
With Application
.Calculation = xlCalculationManual
.ScreenUpdating = False
Dim r As Long
For r = Cells(Rows.Count, colIndex).End(xlUp).Row To 1 Step -1
If Cells(r, colIndex) = "" Then Rows(r).Delete
Next r
.Calculation = xlCalculationAutomatic
.ScreenUpdating = True
End With
End Sub
It's not clear from your question which column in the second sub needs to be dynamic (or both of them?)
EDIT try this:
Sub CopyUntilBlank()
Dim last_row As Long, col1 as Long, col2 as Long
Dim rng as Range
col1 = Sheet1.Range("a2").value
col2 = Sheet1.Range("a3").value
last_row = Cells(1, col1).End(xlDown).Row
'This next line is better if there's any chance
' of blanks in this column
'last_row = Cells(Rows.Count, col1).End(xlUp).Row
With ActiveSheet
Set rng = .Range(.Cells(2, col2), .Cells(last_row, col2))
End With
For Each cell In rng.Cells
If cell.Value = "" Then
cell.Value = cell.Offset(-1, 0).Value
End If
Next cell
End Sub