Word VBA Progress Bar with Unknown Number of Steps - vba

I have a macro that loops through an unknown number of times. The number of times varies based on a total number of rows in multiple tables in a reference document, and that number of rows will vary across reference documents that may be used. The relevant snippet of code for the loop is below:
For Each oRow In oTbl.Rows
p = p + 1
Helper.ProgressIndicator_Code (p)
strPhrase = Split(Trim(oRow.Range.Cells(1).Range.Text), vbCr)(0)
strRule = Split(Trim(oRow.Cells(2).Range.Text), vbCr)(0)
If strPhrase <> "" Then
If Not strStartWord = vbNullString Then
'Process defined sections
arrEndWords = Split(strEndWord, "|")
For lngIndex = 0 To UBound(arrEndWords)
Set oRng = GetDocRange(strStartWord, arrEndWords(lngIndex))
If Not oRng Is Nothing Then Exit For
Next lngIndex
Else
'Process whole document
Set oRng = m_oDocCurrent.Range
End If
If Not oRng Is Nothing Then
Set oRngScope = oRng.Duplicate
With oRng.Find
.Text = strPhrase
Do While .Execute
If Not oRng.InRange(oRngScope) Then Exit For
oRng.HighlightColorIndex = wdTurquoise
If strRule <> "" Then
Set oComment = m_oDocCurrent.Comments.Add(Range:=oRng, Text:=strUsr & ": " & strRule)
oComment.Author = UCase("WordCheck")
oComment.Initial = UCase("WC")
End If
Loop
End With
End If
End If
Next oRow
The progress bar is a classic progress bar for which a label field width is updated using the below code based on a value of p as updated in the above code:
Sub progress(pctCompl As Integer)
ProgressIndicator.Text.Caption = pctCompl & "% Completed"
ProgressIndicator.Bar.Width = pctCompl * 2
DoEvents
End Sub
Here's my problem: The value of p varies based on which reference document is used, so my progress bar is never even approximately accurate with respect to the processing of the VBA macro. The progress bar doesn't have to be exact, merely close and to indicate that progress is being made and nothing has hung.
I'm not looking for written code, just would be very grateful for suggestions or advice as to approaches for making my progress bar more accurate so that I can learn (e.g., I just ran the macro for three different reference documents - one gave me 25%, one gave 44%, and one gave 82%; none showed even close to 100% when completed). Essentially I need to divide i by an unknown number to get my percentage, which is clearly impossible, so some function for a close approximation is needed.
Edit: New code based on #macropod suggestion.
Dim strCheckDoc As String, docRef As Document, projectPath As String, _
j As Integer, i As Integer, k As Integer, oNumRows as Long
j = 1
For i = 0 To UBound(strUsr)
strCheckDoc = [path to reference document unique to each strUsr]
Set docRef = Documents.Open(strCheckDoc, ReadOnly:=True, Visible:=False)
For k = 1 To docRef.Tables.Count
oNumRows = oNumRows + docRef.Tables(i).Rows.Count
Next k
Next i
Then the code to update the progress bar is:
Dim pctCompl As Single
pctCompl = Round((p / oNumRows) * 100)
ProgressIndicator.Text.Caption = pctCompl & "% Completed"
ProgressIndicator.Bar.Width = pctCompl * 2
DoEvents
The progress bar now gets to 64% when complete (i.e., it should be at 100%). I'm also working on a way to make oNumRows only count a row if the row has content in the first column.

Related

VBA Public Variable Not Working - MS Word - Counting Characters in a Table

The main VBA procedure counts characters in table cells in a Word document. Since it can count characters different ways:
Count the "Objective" text for the selected table
Count the "Accomplishment" text for the selected table
Count both the Obj and Acc texts in each table (loop), for all tables (another loop)
I created calling procedures for each option above that calls the main procedure. This way I pass variables from the calling Sub to the main Sub. These variables (1) tell the main Sub whether I want to count what is in row 3 (objective) or in row 5 (accomplishment) or both, and (2) feed the If/then lines in the main Sub to make sure the right row is counted. At the time, it seemed elegant, in hindsight - not so much.
Word template below:
There will be text in O1 and the VBA will count it (characters, spaces + paragraphs) and output it in C1, and the C1 fill changes red or green if over/under the character limit. The same for A1 and C2 and so on for any number of following tables.
PROBLEM DESCRIPTION
The VBA was working for the actions above when I had the row/columns hard coded into various places in the code. If rows/columns were ever added/deleted from the tables, they would have to updated in multiple spots. It would be simpler if the row/column numbers were in one place and referred back to as variables, so I changed the row/col #s to public variables. Then the problem began.
In the code, I track (debug.print) what becomes of oRow (output row) & chcct (character count col) and both are 0 as the main Sub runs, despite both being initialized as 3 in the public Sub Row_Col_Num() below.
My public variables are at the top of the module before the first Sub() and denoted as Public. Sub Row_Col_Num() which contains the variable assignments is also Public. All Subs are in the same standard module.
Option Explicit
Public oRow As Integer 'row with "Objectives" text
Public aRow As Integer 'row with "Accomplishments" text
Public cOnA As Integer 'column that both obj and accmp text are in
Public cChCt As Integer 'column that the char count is output to
Public Sub Row_Col_Num()
oRow = 3
aRow = 5
cOnA = 1
cChCt = 3
Debug.Print "cchct pub sub: " & cChCt
End Sub
ATTEMPTS TO FIX PROBLEM & RESULTS
I used the variable normally and left it Public as well as the Sub that assigns the variables (oRow =3) values.
Sub TableCharCount_Obj()
'Run character count for the "Objectives" in the SELECTED table
Debug.Print "orow = " & oRow
Call TableCharCount(oRow, oRow) 'provide it 2x to make IF and FOR loop
End Sub
I tried putting the Sub() name in front of the variable when it is used, e.g. Row_Col_Num.orow, in the Sub above.
Call TableCharCount(Row_Col_Num.oRow, Row_Col_Num.oRow)
I tried the module name in front of the variable as well, e.g. Module1.orow.
Call TableCharCount(Module1.oRow, Module1.oRow)
RESULTS
#1 & #3 resulted in the macro counting the wrong row and outputting to the wrong cell.
#2 resulted in error "Expected Function or variable" at line: Call TableCharCount(Row_Col_Num.oRow, Row_Col_Num.oRow)
All 3 cases orow and cchct both continued to be 0 throughout the run.
QUESTIONS / SOLUTIONS
a) Can a Public variable (oRow) be used as an argument passed from calling Sub to called Sub as ByVal a As Integer?
b) Does Public Sub Row_Col_Num(), which assigns values to the public variables, have to be explicitly run or called to populate the variables in the other Subs w/ the correct values?
c) Should I call Public Sub Row_Col_Num() in every calling Sub before calling the main Sub?
Sub TableCharCount_Obj()
Call Public Sub Row_Col_Num() '<<< add this call
Call TableCharCount(oRow, oRow) 'provide it 2x to make IF and FOR loop
End Sub
This option seems like a bad design.
If it's not obvious, there was some mission creep as I added more capability For now, if I could get the public variables to work, it would be done. Appreciate any suggestion to get these variables to work. For the purposes of this question, I only left the code for the variable Sub, the first calling Sub and the main Sub. VBA below:
'#0 -- This creates variables for column and row number used in all the macros. Only need to change row/col number here if row/col are added/deleted
Option Explicit
Public oRow As Integer 'row with "Objectives" text
Public aRow As Integer 'row with "Accomplishments" text
Public cOnA As Integer 'column that both obj and accmp text are in
Public cChCt As Integer 'column that the char count is output to
'This assigns row/column numbers to the variables
Public Sub Row_Col_Num()
oRow = 3
aRow = 5
cOnA = 1
cChCt = 3
Debug.Print "cchct pub sub: " & cChCt End Sub
'#2
Sub TableCharCount_Obj() 'Run character count for the "Objectives" in the SELECTED table
Debug.Print "orow = " & oRow
Call TableCharCount(oRow, oRow) 'provide it 2x to make IF and FOR loop
End Sub
'other calling procedures removed
'#5
Option Explicit
Sub TableCharCount(ByVal a As Integer, ByVal b As Integer)
'Counts total characters in a cell w/in a table and outputs the number to a different cell, and colors the cell red or green if over/under the maximum number of characters.
Dim charCount, charWSCount, paraCount, charTot As Double
Dim iRng, oRng, txtRng As Word.Range
Dim i, max, s, t, x As Integer
Dim tcount, tbl As Integer
Dim DocT As Table 'for active doc tables
Debug.Print "cchct1= " & cChCt 'Debug.Print vbCr & "-----START-------" & vbCr Application.ScreenUpdating = False
If a <> b Then
tcount = ActiveDocument.Tables.Count
tbl = 1 'used in FOR loop, start w/ table #1
s = b - a '"STEP" used in FOR loop = # of rows between objectives text and accomplishments text Else
On Error GoTo ErrMsg 'handles expected user error of not selecting a table to execute on
tbl = ActiveDocument.Range(0, Selection.Tables(1).Range.End).Tables.Count 'ID the table that is selected
tcount = tbl 'prevents FOR loop from trying to run again
s = 1 '"STEP" used in FOR loop = # of rows between objectives text and accomplishments text / do not set to zero = infinite loop End If
'Debug.Print "# of Tables: " & tcount
For t = tbl To tcount 'loops thru the tables
Set DocT = ActiveDocument.Tables(t)
For x = a To b Step s 'loops thru the applicable row(s) in the table
'Debug.Print "x # start = " & x
'Debug.Print "table " & t
iRng = DocT.Cell(x, cOnA)
iRng.Select
'Count used in output
Selection.MoveLeft wdCharacter, 1, wdExtend 'computerstats requires the text itself selected, characters.count can use the whole cell selected
charWSCount = Selection.Range.ComputeStatistics(Statistic:=wdStatisticCharactersWithSpaces) 'counts bullets & space after bullet / not line breaks (paragraphs)
'Debug.Print "Comp statchar# " & charWSCount
'---------
paraCount = Selection.Range.ComputeStatistics(Statistic:=wdStatisticParagraphs)
'Debug.Print "#paras = " & paraCount
'----------
charTot = charWSCount + paraCount
'Output to table cell
i = x - 1 'output cell is 1 row above cell that is counted
Set oRng = DocT.Cell(i, cChCt).Range 'Char count ouput row,column
Debug.Print "cchct2= " & cChCt
oRng.Text = charTot
Set txtRng = DocT.Cell(i, cChCt - 1).Range '"# Char:" location row,column
txtRng.Text = "# Char:"
'Maximum # of char allowed in a cell. Used to change cell fill red or green.
max = 2000 '"Accomplishment" row (row 5) has a max of 2000
If i = 2 Then max = 1500 '"Objective" row (row 3) has a max of 1500
'Change color of cell to indicate over/under max # of characters
If charCount < max Then
oRng.Shading.BackgroundPatternColor = wdColorBrightGreen
Else: oRng.Shading.BackgroundPatternColor = wdColorRed
End If
'Debug.Print "x # end = " & x
'Debug.Print "--------Next x--------------"
Next x
'Debug.Print "------Next Table------"
Next t
ActiveDocument.Tables(tbl).Select 'attempt to move to top of 1st table if using CharCount_AllTab() or just to the top of the selected table for the other macros
Selection.GoTo What:=wdGoToBookmark, Name:="\Page" Selection.StartOf
Application.ScreenUpdating = True
Exit Sub
ErrMsg: Msgbox "Select a table by placing the cursor anywhere in the table. Press OK and try the macro again numnuts!", _
vbOKOnly, "Table not selected"
End Sub

How can get a list of included records in a Word Mail Merge Document?

I have a Mail Merge Document which is controlled via VBA. After the user selects the records he wants to print, I want these to get a print date set in the database. For that I need a list of the records included in the mail merge.
I tried using the .Included property, setting ThisDocument.MailMerge.DataSource.ActiveRecord = wdFirstRecord (I later removed the wdFirstRecord in favour of 1 since it was giving me trouble)
and then checking for ThisDocument.MailMerge.DataSource.Included to be true but got a 5852 runtime-error "Object not available". EDIT: I used the following code to iterate over the records. When I remove either of the commented .Included statements, I get said error. (The execution does not feel as slow as yesterday anymore, though it is not especially fast.)
Function outputRecords(Optional limitRecords = -1)
With ThisDocument.MailMerge.DataSource
Dim str
For currentFieldNameIndex = 1 To .FieldNames.Count
str = str & .FieldNames(currentFieldNameIndex) & vbTab
Next
Debug.Print str
For currentRecordIndex = 1 To .RecordCount
If currentRecordIndex <= limitRecords Or limitRecords < 0 Then
.ActiveRecord = currentRecordIndex
str = ""
For currentDataFieldIndex = 1 To .DataFields.Count
str = str & .DataFields(currentDataFieldIndex) & vbTab
Next
'Debug.Print str
End If
'.Included = True
'Debug.Print .Included
Next
End With
End Function
Is there a solution to know which records are selected by the user?
About my document: After some general computation, the data source is linked to the document using
Dim sql As String
sql = "SELECT * FROM `Sheet0$` "
sql = sql & "WHERE ((`" & photoPathHeader & "` > '') AND (`" & photoLastEditHeader & "` >= #" & Format(printFromDate, "yyyy-mm-dd") & "#)) "
sql = sql & "ORDER BY `klasse#name` ASC"
ThisDocument.MailMerge.OpenDataSource _
name:=ThisDocument.Path & "\" & ThisDocument.Variables("masterDataFileName"), _
SQLStatement1:=sql, _
ReadOnly:=True, LinkToSource:=True
a dialog where the user can select individual records for printing is shown. I used this code for that:
Application.Dialogs(wdDialogMailMergeRecipients).Display
And finally, the Mail Merge is executed using
ThisDocument.MailMerge.Execute
Many thanks in advance!
This Answer is based on the comments so far.
I do not get the problem with .Included here unless ThisDocument is the wrong document. However, since your loop code does not appear to throw an error when accessing other members of ThisDocument.MailMerge.DataSource it is not obvious what is wrong. However, I don't think you need to use .Included.
I see an immediate slowdown as soon as I try to access any of the .Datasource.DataFields items. I do not know why that is. There is a delay for each access, not just the first access for each record. The one idea I had is that as soon as you change the active record, Word refreshes the values of the merge fields in the main document, and that might cause a problem, particularly if Word needed to access the printer driver for some reason. But nothing I tried (e.g. change the View to Draft) changes that.
You don't really need .Included because you can iterate the records like this:
With ActiveDocument.MailMerge.DataSource
previousRecord = 0
.ActiveRecord = wdFirstRecord
While .ActiveRecord <> previousRecord
previousRecord = .ActiveRecord
Debug.Print .ActiveRecord ' (just lists the record number)
previousRecord = .ActiveRecord
.ActiveRecord = wdNextRecord
DoEvents ' advisable if you need to stop the code
Wend
If you need to iterate all the records then you need to use wdFirstDataSourceRecord and wdNextDataSourceRecord instead.
The only way I was able to speed up access to the data was to modify the document to include all the fields I wanted to list, then get the values from the document rather than from .DataSource.Datafields. Personally I do not think modify the Mail Merge main document is ideal - in particular, something might stop you from inserting material at the beginning, as I have done. But my test code (needs more error trapping etc. is like this):
Sub checkincluded()
Dim b As Word.Bookmark
Dim bVMMFC As Boolean
Dim bSFC As Boolean
Dim i As Integer
Dim previousRecord As Long
Dim p As Word.Paragraph
Dim r As Word.Range
With ActiveWindow.View
bSFC = .ShowFieldCodes
.ShowFieldCodes = False
End With
With ActiveDocument.MailMerge
bVMMFC = .ViewMailMergeFieldCodes
.ViewMailMergeFieldCodes = False
With .DataSource
Set r = ActiveDocument.Range(0, 0)
Set p = ActiveDocument.Range(0, 0).Paragraphs.Add
For i = 1 To .FieldNames.Count
If i > 1 Then
r.Text = vbTab
r.Start = p.Range.End - 1
r.End = p.Range.End - 1
End If
r.Fields.Add r, WdFieldType.wdFieldMergeField, .FieldNames(i), False
r.Start = p.Range.End - 1
r.End = p.Range.End - 1
Next
r.Start = 0
r.End = p.Range.End - 1
Set b = r.Bookmarks.Add("recorddata")
Set r = Nothing
previousRecord = 0
.ActiveRecord = wdFirstRecord
While .ActiveRecord <> previousRecord
previousRecord = .ActiveRecord
Debug.Print .ActiveRecord, b.Range.Text
previousRecord = .ActiveRecord
.ActiveRecord = wdNextRecord
DoEvents
Wend
b.Delete
Set b = Nothing
p.Range.Text = ""
Set p = Nothing
End With
.ViewMailMergeFieldCodes = bVMMFC
End With
ActiveWindow.View.ShowFieldCodes = bSFC
End Sub
Finally, it is probably worth pointing out that there are other ways to exclude records, e.g. using NEXT, NEXTIF, SKIP and SKIPIF fields. But that is perhaps another question.
Thanks to #yokki's answer, I was able to combine that with what I know and ended up with the following
Sub listIncluded()
Dim LastRecord As Long
With ActiveDocument.MailMerge.DataSource
' Storing the index of the last record for later use
.ActiveRecord = wdLastRecord
LastRecord = .ActiveRecord
.ActiveRecord = wdFirstRecord
Do
Debug.Print .ActiveRecord
DoEvents
If .ActiveRecord = LastRecord Then Exit Do
.ActiveRecord = wdNextRecord
Loop Until .ActiveRecord > LastRecord ' Note this will never be satisfied and the loop will always be exited via the Exit Do statement
End With
End Sub
I find the WdMailMergeActiveRecord enumeration a bit confusing but this works.
The original answer threw a 5853 Invalid parameter error on the wdNextRecord if the last included record had already been reached and was not equal to the last record overall (i.e. the last record was not included). I circumvented this issue by storing the index of the last record into a local variable and using that to check if the last included record was reached.
Thank you very much to everyone who helped!
As briefly pointed out in a comment, I already have a list of all the
data in the data source as an array since I also use it in other parts
of the script. I used the following statement for that
numRows = wks.Range("A1").CurrentRegion.Rows.Count
masterData = wks.Range("A1").CurrentRegion.Offset(RowOffset:=1).Resize(RowSize:=numRows).Value
wks refers to an Excel worksheet object.

Progress bar in Status bar, a blank and a filled in char are not equal width

I am playing around creating code for a progress bar that runs in the Excel Status Bar. I want to replace my old dated userform with the 2 rectangles (which worked but I would sooner a less obtrusive method now).
Problem: The width of the chars I am using to signify "Filled In" and "Not Filled in" are slightly different, when using 100 of them you can see the percentage at the end appears to shift right as the progress increases.
Here is some working sample code to show you exactly what I mean:
Sub TestNewProgBar()
Dim X As Long
For X = 1 To 100000
Call NewProgressBar("Testing", X, 100000)
Next
End Sub
Sub NewProgressBar(MyMessage As String, CurrentVal As Long, MaxVal As Long)
Dim FilledIn As Long, NotFilledIn As Long
If CurrentVal >= MaxVal Then
Application.StatusBar = MyMessage & ": Complete"
Else
FilledIn = Round((CurrentVal / MaxVal) * 100, 0)
NotFilledIn = (100 - FilledIn)
Application.StatusBar = MyMessage & ": " & Application.WorksheetFunction.Rept(ChrW(9608), FilledIn) & Application.WorksheetFunction.Rept(ChrW(9620), NotFilledIn) & "| " & FilledIn & "%"
End If
End Sub
Run TestNewProgBar and look at the status bar.
Is this going to be a simple case of choosing a different Unicode symbol or are there forces beyond my control at work here?
There's a Unicode block from U+25A0 to U+25FF called Geometric Shapes. There are some matching pairs of black/ white shapes in there that will successfully work for your progress bar implementation.
In the test code below, some pairs work and some do not!. Personally I like the last example (pairing U+25AE and U+25AD).
Option Explicit
Sub TestNewProgBar()
Dim lngCounter As Long
Dim lngMax As Long
Dim strFilledChar As String
Dim strNotFilledChar As String
'iterations
lngMax = 100000
'small squares - works
strFilledChar = ChrW(&H25AA)
strNotFilledChar = ChrW(&H25AB)
For lngCounter = 1 To lngMax
Call NewProgressBar("Small squares", lngCounter, lngMax, strFilledChar, strNotFilledChar)
Next
'large squares - doesn't work
strFilledChar = ChrW(&H25A0)
strNotFilledChar = ChrW(&H25A1)
For lngCounter = 1 To lngMax
Call NewProgressBar("Large squares", lngCounter, lngMax, strFilledChar, strNotFilledChar)
Next
'large squares 2 - doesn't work (but opposite effect)
strFilledChar = ChrW(&H25A3)
strNotFilledChar = ChrW(&H25A1)
For lngCounter = 1 To lngMax
Call NewProgressBar("Large squares 2", lngCounter, lngMax, strFilledChar, strNotFilledChar)
Next
'mixed vertical/ horizontal rectangles - works!
strFilledChar = ChrW(&H25AE)
strNotFilledChar = ChrW(&H25AD)
For lngCounter = 1 To lngMax
Call NewProgressBar("Mixed rectangles", lngCounter, lngMax, strFilledChar, strNotFilledChar)
Next
End Sub
Sub NewProgressBar(strMyMessage As String, lngCurrentVal As Long, lngMaxVal As Long, strFilledChar As String, strNotFilledChar As String)
Dim lngFilledIn As Long
Dim lngNotFilledIn As Long
Dim strStatus As String
If lngCurrentVal >= lngMaxVal Then
Application.StatusBar = strMyMessage & ": Complete"
Else
lngFilledIn = Round((lngCurrentVal / lngMaxVal) * 100, 0)
lngNotFilledIn = (100 - lngFilledIn)
strStatus = strMyMessage & ": " & _
String(lngFilledIn, strFilledChar) & _
String(lngNotFilledIn, strNotFilledChar) & _
"| " & lngFilledIn & "%"
Application.StatusBar = strStatus
End If
End Sub
Edit:
To follow up on my 'aside' below, I did some experimenting, and Comintern was onto something when s/he provided a link to this issue. The problem described above is to do with ScreenUpdating. If Screenupdating is set to false when the status bar is changed, the character widths of ChrW(9608) and ChrW(9620) are the same.
I've no idea why, but it does the trick. So you'll want to do the following:
Application.Screenupdating = False
'code which changes the status bar
Application.Screenupdating = True
(my previous comment continues below)
I prefer this pairing:
strFilledChar = ChrW(&H2588) 'a black rectangle, "Full Block"
strUnfilledChar = ChrW(&H2584) 'an array of sparse dots, "Light Shade"
Or this one:
strFilledChar = ChrW(&H2588) 'a black rectangle, "Full Block"
strUnfilledChar = ChrW(&H2500) 'a horizontal line, "Block Drawings Light Horizontal"
(as an aside, I encountered the same problem as described in the question whereby ChrW(9608) and ChrW(9620) have different widths - but only in one of my workbooks. In another workbook, they have the same widths and so the progress bar displays properly. I have no idea why.)

Duplicate removal for VBA Word not working effectively

I have a program to remove duplicates and everything is working properly. It is just freezing with large data sets i.e. 1 to 2.5 million words.
What is wrong with my approach? Is there a better one?
Sub DeleteDuplicateParagraphs()
Dim p1 As Paragraph
Dim p2 As Paragraph
Dim DupCount As Long
Dim StartTime As Double
Dim SecondsElapsed As Double
StartTime = Timer
For Each p1 In ActiveDocument.Paragraphs
If p1.range.Text <> vbCr Then
For Each p2 In ActiveDocument.Paragraphs
If p1.range.Text = p2.range.Text Then
DupCount = DupCount + 1
If p1.range.Text = p2.range.Text And DupCount > 1 Then p2.range.Delete
End If
Next p2
End If
DupCount = 0
Next p1
SecondsElapsed = Round(Timer - StartTime, 2)
MsgBox "This code ran successfully in " & SecondsElapsed & " seconds", vbInformation
DupCount = 0
End Sub
Try this (first add a reference to the Microsoft Scripting Runtime to your VBA project):
Sub DeleteDuplicateParagraphs()
Dim p As Paragraph
Dim d As New Scripting.Dictionary
Dim t As Variant
Dim i As Integer
Dim StartTime As Single
StartTime = Timer
' collect duplicates
For Each p In ActiveDocument.Paragraphs
t = p.Range.Text
If t <> vbCr Then
If Not d.Exists(t) Then d.Add t, New Scripting.Dictionary
d(t).Add d(t).Count + 1, p
End If
Next
' eliminate duplicates
Application.ScreenUpdating = False
For Each t In d
For i = 2 To d(t).Count
d(t)(i).Range.Delete
Next
Next
Application.ScreenUpdating = True
MsgBox "This code ran successfully in " & Round(Timer - StartTime, 2) & " seconds", vbInformation
End Sub
This makes use of the fact that the Scripting.Dictionary is a hash table that is geared towards very quickly associating unique keys with values. It is therefore very good at spotting duplicate keys. Dictionary keys have to be strings, conveniently we can use the paragraph texts for that.
For values we use more dictionary objects, solely for the fact that they work a lot better than VBA's arrays. In them we collect the references to the actual paragraph instances with the same text.
Actually deleting duplicate paragraphs is a very simple matter afterwards.
Note: The duplicate detection part in the above code is very fast. However, if Word becomes unresponsive in large documents then it's in the duplicate removal part, namely because of Word's undo buffer.
The culprit is that the paragraph ranges are deleted one after another, causing Word to build a very large undo buffer. Unfortunately there is no way (that I know of) to either
delete multiple separate ranges in one step (which would result in only a single entry in the undo buffer), or
disable the undo buffer altogether from VBA
Calling UndoClear periodically in the "eliminate duplicates" loop might help, disabling ScreenUpdating is also not a bad idea:
' eliminate duplicates
Dim x As Integer
Application.ScreenUpdating = False
For Each t In d
x = x + 1
For i = 2 To d(t).Count
d(t)(i).Range.Delete
Next
If x Mod 50 = 0 Then ActiveDocument.UndoClear
Next
ActiveDocument.UndoClear
Application.ScreenUpdating = True
First of all, Just wanted to thank you so much for the time and effort you have put in to helping me.
Your idea behind the method is really impressive. I did change the code slightly and would like you to peruse it when you have the time, to see if it is of optimal standard. Again, I truly thank you, the code ran 20 splits faster than the previous and that is not even over a larger data set.
> Sub DeleteDuplicateParagraphs()
>
> Dim p As Paragraph
> Set d = CreateObject("Scripting.Dictionary")
> Dim t As Variant
> Dim i As Integer
> Dim StartTime As Single
>
> StartTime = Timer
>
> ' collect duplicates For Each p In ActiveDocument.Paragraphs
> t = p.range.Text
> If t <> vbCr Then
> If Not d.Exists(t) Then d.Add t, CreateObject("Scripting.Dictionary")
> d(t).Add d(t).Count + 1, p
> End If Next
>
> ' eliminate duplicates For Each t In d
> For i = 2 To d(t).Count
> d(t)(i).range.Delete
> Next Next
>
> MsgBox "This code ran successfully in " & Round(Timer - StartTime,
> 2) & " seconds", vbInformation
>
> End Sub

Unexplained Type Mismatch error at about every 10,000 iterations in Excel VBA

I have a VBA macro that uses Microsoft MapPoint to calculate the distance between two locations for each record in my spreadsheet. I have about 120,000 records to process. The program runs smoothly for about 10,000 iterations then returns a Type Mismatch error where I define the MapPoint locations in my error handler. At which point, I select 'Debug' and then resume execution without editing any code, and it will run successfully for another 10,000 or so records before the same thing happens again.
I've checked my data, and I can't see why there would be a type mismatch, or for that matter why the code would choke on a record one time, and then, without resetting anything, handle the same record upon resuming. Any idea why this would happen?
For reference,
- column M contains locations of the form "X County, ST"
- column AN contains a separate location as ZIP
- column G contains the same location data as AN but in the form "X County, ST"
Sub distance_from_res()
Dim oApp As MapPoint.Application
Dim k As Long
Dim count As Long
Dim errors As Long
k = 0
count = Sheets("i1_20041").Range("A2", Sheets("i1_20041").Range("A2").End(xlDown)).count
errors = 0
Set oApp = CreateObject("MapPoint.Application.NA.11")
oApp.Visible = False
Set objMap = oApp.NewMap
Dim objRes As MapPoint.Location
Dim objFish As MapPoint.Location
'Error executes code at 'LocError' and then returns to point of error.
On Error GoTo LocError
Do While k < count
If Sheets("i1_20041").Range("M2").Offset(k, 0) <> "" Then
'Sets MapPoint locations as [County],[State] from Excel sheet columns "INT_CNTY_ST" and "ZIP".
Set objRes = objMap.FindResults(Sheets("i1_20041").Range("AN2").Offset(k, 0)).Item(1)
Set objFish = objMap.FindResults(Sheets("i1_20041").Range("M2").Offset(k, 0)).Item(1)
'Calculates distance between two locations and prints it in appropriate cell in Column AO.
Sheets("i1_20041").Range("AO2").Offset(k, 0) = objRes.DistanceTo(objFish)
Else
errors = errors + 1
End If
k = k + 1
Loop
'Displays appropriate message at termination of program.
If errors = 0 Then
MsgBox ("All distance calculations were successful!")
Else
MsgBox ("Complete! Distance could not be calculated for " & errors & " of " & count & " records.")
End If
Exit Sub
LocError:
If Sheets("i1_20041").Range("G2").Offset(k, 0) = "" Then
errors = errors + 1
Else
'THIS IS WHERE THE ERROR OCCURS!
Set objRes = objMap.FindResults(Sheets("i1_20041").Range("G2").Offset(k, 0)).Item(1)
Set objFish = objMap.FindResults(Sheets("i1_20041").Range("M2").Offset(k, 0)).Item(1)
'Calculates distance between two locations and prints it in appropriate cell in Column AO.
Sheets("i1_20041").Range("AO2").Offset(k, 0) = objRes.DistanceTo(objFish)
End If
k = k + 1
Resume
End Sub
UPDATE:
I incorporated most of the suggestions from #winwaed and #Mike D, and my code is now more accurate and doesn't choke on errors. However, the old problem reared its head in a new form. Now, after around 10,000 iterations, the code continues but prints the distance of the ~10,000th record for every record afterwards. I can restart the code at the trouble point, and it will find the distances normally for those records. Why would this happen? I've posted my updated code below.
Sub distance_from_res()
Dim oApp As MapPoint.Application
Dim k As Long
Dim rc As Long
Dim errors As Long
Dim dist As Double
Dim zipRes As Range
Dim coRes As Range
Dim coInt As Range
Dim distR As Range
Set zipRes = Sheets("Sheet1").Range("C2")
Set coRes = Sheets("Sheet1").Range("B2")
Set coInt = Sheets("Sheet1").Range("E2")
Set distR = Sheets("Sheet1").Range("G2")
k = 0
rc = Sheets("Sheet1").Range("F2", Sheets("Sheet1").Range("F2").End(xlDown)).Count
errors = 0
'Start MapPoint application.
Set oApp = CreateObject("MapPoint.Application.NA.11")
oApp.Visible = False
Set objMap = oApp.NewMap
Dim objResultsRes As MapPoint.FindResults
Dim objResultsInt As MapPoint.FindResults
Dim objRes As MapPoint.Location
Dim objInt As MapPoint.Location
Do While k < rc
'Check results for Res Zip Code. If good, set first result to objRes. If not, check results for Res County,ST. If good, set first result to objRes. Else, set objRes to Nothing.
Set objResultsRes = objMap.FindResults(zipRes.Offset(k, 0))
If objResultsRes.ResultsQuality = geoFirstResultGood Then
Set objRes = objResultsRes.Item(1)
Else
Set objResultsRes = Nothing
Set objResultsRes = objMap.FindResults(coRes.Offset(k, 0))
If objResultsRes.ResultsQuality = geoFirstResultGood Then
Set objRes = objResultsRes.Item(1)
Else
If objResultsRes.ResultsQuality = geoAmbiguousResults Then
Set objRes = objResultsRes.Item(1)
Else
Set objRes = Nothing
End If
End If
End If
Set objResultsInt = objMap.FindResults(coInt.Offset(k, 0))
If objResultsInt.ResultsQuality = geoFirstResultGood Then
Set objInt = objResultsInt.Item(1)
Else
If objResultsInt.ResultsQuality = geoAmbiguousResults Then
Set objInt = objResultsInt.Item(1)
Else
Set objInt = Nothing
End If
End If
On Error GoTo ErrDist
distR.Offset(k, 0) = objRes.DistanceTo(objInt)
k = k + 1
Loop
Exit Sub
ErrDist:
errors = errors + 1
Resume Next
End Sub
You are constructing a somewhat complex range object (Range -> Offset -> Item). DIM temporary range objects and do it in steps so you can see where exactly the problem occurs
tmpR1 = Sheets("i1_20041").Range("G2")
tmpR2 = tmpR1.Offset(k,0)
then examine the .Count property of the .FindResult before you try accessing Item(1) .... maybe this item doesn't exist ?!?
Debug.Print objMap.FindResult(tmpR2).Count
Hint:
looking at your code, I observe that you use a variable "count". This variable name overlaps with the "Count" property in your second line of code - that's why the "Count" keyword at the end of the statement is printed all lowercase. It's not got anything to do with the errors (we pretend ;-) ), but bad style anyway.
MikeD is right with your dangerous FindResults() calls. However, there is a better way to check the results. The "FindResults collection" isn't a pure collection but includes an extra properties called "ResultsQuality". Docs are here:
http://msdn.microsoft.com/en-us/library/aa493061.aspx
Resultsquality returns a GeoFindResultsQuality enumeration. You want to check for the values geoAllResultsGood and geFirstResultGood. All other results should give an error of some result. Note that your existing code would work find with (for example) Ambiguous Results, even though it is unlikely the first result is the correct one. Also it might match on State or Zipcode (because that is the best it can find) whcih give you an erroneous result. Using ResultsQuality, you can detect this.
I would still check the value of Count as an additional check.
Note that your code is calculating straight line (Great Circle) distances. As such the bottleneck will be the geocoding (FindResults). If you are using the same locations a lot, then a caching mechanism could greatly speed things up.
If you want to calculate driving distances, then there are a number of products on the market for this (yes I wrote two of them!).