VBA Loop in Word to find wildcard strings misses first occurrence - vba

I have some VBA for Microsoft Word that is supposed to find some five digit numbers using wildcards in multiple files and then the sticks them and the path/file into an excel file. Unfortunately, it ALWAYS misses the first occurrence of the wildcard string. I cannot determine why!
I've tried reordering things to make sure that it's not being missed, however, I am unable to get it working properly. When I run the wildcard search myself by hand, it finds the first occurence. It doesn't do it in VBA, however.
Public Sub TestFindNumbers()
Dim i As Long
i = 2 ' Row in Excel to start
Dim ObjExcel As Object, ObjWorkBook As Object, ObjWorksheet As Object
Set ObjExcel = CreateObject("EXCEL.APPLICATION")
Set ObjWorkBook = ObjExcel.Workbooks.Add
Set ObjWorksheet = ObjWorkBook.Worksheets("Sheet1")
Set dlgFile = Application.FileDialog(msoFileDialogFilePicker)
With dlgFile
dlgFile.AllowMultiSelect = True
If .Show = -1 Then
For nDocx = 1 To dlgFile.SelectedItems.Count
Documents.Open dlgFile.SelectedItems(nDocx)
Set objDocx = ActiveDocument
With objDocx.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "[0-9]{5}"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
Do While .Find.Found
.Collapse wdCollapseEnd
.Find.Execute
If .Text <> "" Then
ObjWorksheet.Cells(i, 1) = Left(.Text, 8)
ObjWorksheet.Cells(i, 2) = dlgFile.SelectedItems(nDocx)
Else
i = i - 1
End If
i = i + 1
Loop
End With
objDocx.Close SaveChanges:=wdDoNotSaveChanges
Next nDocx
Else
MsgBox ("You need to select documents first!")
Exit Sub
End If
End With
ObjWorksheet.Cells(1, 1) = "Number"
ObjWorksheet.Cells(1, 2) = "Path & Filename"
ObjExcel.Visible = 1
Set objDocx = Nothing
Set ObjExcel = Nothing
Set ObjWorkBook = Nothing
Set ObjWorksheet = Nothing
End Sub
I created a single test file with the following:
1234 Shouldn’t be selected
12345 Select this one. First occurrence.
98765 Another good one
568 Nope
This one is 55555 in the middle
End
When I run my VBA code, I'm getting 98765 and 55555 as hits. Unfortunately, 12345 isn't being found.

The reason the code in the question is not finding the search terms as expected:
The Collapse, then Find.Execute methods are in the loop before the first result is picked up. Since .Execute is also in the With block preceding the loop, Find runs twice, thus masking the first occurrence of the search term.
In addition:
1) Preferably, a specific Range should be used for the search, rather than the entire document (objDocx.Range). This is due to the "collapsing" - it works more reliably when there's a specific Range object.
2) Do not use Find.Wrap = wdFindContinue as suggested in comments. wdFindStop (as in the code in the question) is correct when using Find in a loop. wdFindContinue will often lead to an "infinite loop" as Word will start at the beginning of the document again, and again...
3) It's possible (better) to set a Document object when a file is being opened (or created), rather than relying on ActiveDocument in a second step:
Set objDocx = Documents.Open dlgFile.SelectedItems(nDocx)
Here's the part of the code that has to do with the Find - I've left out the Excel parts to make it easier to read
Dim objDocx As Word.Document
Dim rngFind As Word.Range
Set objDocx = Documents.Open dlgFile.SelectedItems(nDocx)
Set rngFind = objDocx.content
With rngFind
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "[0-9]{5}"
.Replacement.Text = ""
.Forward = True
.wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
Do While .Find.Found
If .Text <> "" Then
ObjWorksheet.Cells(i, 1) = Left(.Text, 8)
ObjWorksheet.Cells(i, 2) = dlgFile.SelectedItems(nDocx)
Else
i = i - 1
End If
i = i + 1
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With

The problem is with your Do While loop. Change it to:
Do While .Find.Found
ObjWorksheet.Cells(i, 1) = Left(.Text, 8)
ObjWorksheet.Cells(i, 2) = objDocx.Name
i = i + 1
.Collapse wdCollapseEnd
.Find.Execute
Loop
Also, instead of:
Documents.Open dlgFile.SelectedItems(nDocx)
Set objDocx = ActiveDocument
use:
Set objDocx = Documents.Open(dlgFile.SelectedItems(nDocx))

Related

Find each table whose first cell has one of two words followed by three digits

I would like to fix the following code to make it find each table in the document where it has the pattern ARC or MEC words followed by the wildcard digits [1-4][1-9]{2} without any leading/trailing characters, digits, spaces, etc.
The chosen table should have a total of 11 rows.
If possible, I need another version of the code to search for the pattern in the table first cell .Cell(1,1) while making sure the table has a total of 11 rows.
Sub FindTables()
Dim wdDoc As Word.Document, t As Long
Set wdDoc = ThisDocument
With wdDoc
For t = 1 To .Tables.Count
With .Tables(t).Range.Find
.ClearFormatting
.Format = FALSE
.Text = "(ARC)|(MEC)[1-4][1-9]{2}"
.Forward = TRUE
.Wrap = wdFindStop
.MatchCase = TRUE
.MatchWildcards = TRUE
.Execute
If .Found = TRUE Then
' some operations on the table
wdDoc.Tables(t).AutoFitBehavior (wdAutoFitWindow)
wdDoc.Tables(t).Range.Collapse wdCollapseEnd
End If
End With
Next
End With
End Sub
For example:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Format = False
.Forward = True
.Wrap = wdFindStop
.MatchWildcards = True
.Text = "<[ACEMR]{3}[1-4][1-9]{2}>"
.Replacement.Text = ""
End With
Do While .Find.Execute = True
If .Information(wdWithInTable) = True Then
If .Tables(1).Rows.Count = 11 Then
'If .Cells(1).RowIndex = 1 And .Cells(1).ColumnIndex = 1 Then
If Split(.Cells(1).Range.Text, vbCr)(0) = .Text Then
Select Case Left(.Text, 3)
Case "ARC", "MEC": .Tables(1).AutoFitBehavior (wdAutoFitWindow)
End Select
End If
'End If
End If
.Start = .Tables(1).Range.End
End If
.Collapse wdCollapseEnd
Loop
End With
Application.ScreenUpdating = True
End Sub
To process only those tables where the found content is in the first cell, delete the tick marks from the two comment-out lines.
Pattern:
"(ARC[1-4][1-9]{2})|(MEC[1-4][1-9]{2})"
Tested successfully with Microsoft VbScript Regular Expressions 5.5. (set this Reference on VBE).
Code sample - adapt it to suit your needs (working with tables - I didn't reproduce your scenario):
Function fnFindPatterns()
Dim objRegExp As RegExp
Dim ObjMatch As Match
Dim colMatches As MatchCollection
Dim strText As String
Dim strResult As String
Set objRegExp = New RegExp
objRegExp.Pattern = "(ARC[1-4][1-9]{2})|(MEC[1-4][1-9]{2})"
objRegExp.IgnoreCase = True
objRegExp.Global = True
Selection.WholeStory
strText = Selection.Text
If objRegExp.Test(strText) = True Then 'we have something there...
Set colMatches = objRegExp.Execute(strText)
For Each ObjMatch In colMatches 'Iterate on the collection
strResult = strResult & ObjMatch.Value & vbCrLf
Next
Else
End If
MsgBox strResult
End Function
Edited 2022 07 11:
I realized that the "|" (OR) do not work in MSWord . It doesn't exist on the limited "Regular Expressions" set of tools within MsWord, compared to VbScript.RegExp. Wich, in turn, is also limited set of tools, if compared with other (powerfull) programming languages. But with some coding we "simulate" this OR, using "Choose", testing each partial set of patterns that way:
Sub FindTables()
Dim wdDoc As Word.Document, t As Long, intChoose As Integer
Set wdDoc = ThisDocument
With wdDoc
For intChoose = 1 To 2
For t = 1 To .Tables.Count
With .Tables(t).Range.Find
.ClearFormatting
.Format = False
.Text = VBA.Choose(intChoose, "<[ARC]{3}[1-4][1-9]{2}>", "<(MEC)[1-4][1-9]{2}>")
.Forward = True
.Wrap = wdFindStop
.MatchCase = True
.MatchWildcards = True
.Execute
If .Found = True Then
' some operations on the table
wdDoc.Tables(t).AutoFitBehavior (wdAutoFitWindow)
wdDoc.Tables(t).Range.Collapse wdCollapseEnd
End If
End With
Next
Next
End With
End Sub
To test this code I mounted a Word Doc with 7 tables (varying dimensions from 1 x 11 to 1 x 13). To ensure the correct dimension of each table insert the suggestion posted in Macropod's code.

Range.Find Word VBA: finding heading with specific heading number only works if heading style is specified

I'd like to find the location of a heading that has a specific heading number. E.g. "2.3."
For some reason, I can only find the location of the heading if i specify what Style that heading is going to be. If i don't specify the heading style then I don't get any matches (i.e. .Execute is never True).
How can I find the location of a heading without having to specify it's style?
Code that works:
Function FindHeadingPos(oRng As Word.Range) As Long
Dim rng As Word.Range
With oRng.Find
.ClearAllFuzzyOptions
.ClearHitHighlight
.ClearFormatting
.Text = ""
.Forward = True
.Wrap = 2
.Format = True
.Style = "Heading 2,H2 Numb"
Do While .Execute
'.Parent contains the found range
Set rng = .Parent
If rng.ListFormat.ListString = "2.3." Then
FindHeadingPos = rng.Start
Exit Do
End If
'below statement seems to prevent code hanging on some headings.
rng.Collapse Direction:=wdCollapseEnd
Loop
End With
End Function
Code that doesn't work:
Function FindHeadingPos(oRng As Word.Range) As Long
Dim rng As Word.Range
With oRng.Find
.ClearAllFuzzyOptions
.ClearHitHighlight
.ClearFormatting
.Text = ""
.Forward = True
.Wrap = 2
'.Format = True
'.Style = "Heading 2,H2 Numb"
Do While .Execute
'.Parent contains the found range
Set rng = .Parent
If rng.ListFormat.ListString = "2.3." Then
FindHeadingPos = rng.Start
Exit Do
End If
'below statement seems to prevent code hanging on some headings.
rng.Collapse Direction:=wdCollapseEnd
Loop
End With
End Function
Thanks #GSerg for suggesting the .ParagraphFormat.OutlineLevel property.
The code below seems to solve my problem in case it helps anyone else.
Function getParaOutlineLevel(headNumberRaw As String) As Integer
Dim numberOfDecimals As Integer
numberOfDecimals = Len(headNumberRaw) - Len(Replace(headNumberRaw, ".", ""))
If Not IsNumeric(Left(headNumberRaw, 1)) Then
getParaOutlineLevel = numberOfDecimals + 5
Else
getParaOutlineLevel = numberOfDecimals
End If
End Function
Function FindHeadingPos(oRng As Word.Range) As Long
Dim headNumber As String
Dim rng As Word.Range
headNumber = "2.3."
With oRng.Find
.ClearAllFuzzyOptions
.ClearHitHighlight
.ClearFormatting
.Text = ""
.Forward = True
.Wrap = 2
.Format = True
.ParagraphFormat.OutlineLevel = getParaOutlineLevel(headNumber)
Do While .Execute
'.Parent contains the found range
Set rng = .Parent
If rng.ListFormat.ListString = headNumber Then
FindHeadingPos = rng.Start
Exit Do
End If
'below statement seems to prevent it hanging on some headings.
rng.Collapse Direction:=wdCollapseEnd
Loop
End With
End Function

vba Word create a variable with value of text between 2 cursor points

We are trying to revise rtf docs that are created by Molecular Device software.
Here is an example of part of one of these documents:
Protocol 'C:\ALL USERS\Params\Current\2017 Opto Params\0 VoltageClampContinuous.pro' opened.
C:\ALL USERS\Alan\2018_07_11\2018_07_11_0000.abf started at 00:19:48 stopwatch time.
So for right now - all I am trying to do is automatically find the experiment date (in this case = "2018_07_11_")
My sub so far can find the correct cursor positions but how do I select the text between 2 cursor positions?
Below is what I have the CursorPosition statement is of course wrong - this is what I am looking to correct.
Sub FindfilenameDate()
txt_prior_to_expDate = "\"
txt_after_expDate = "0000"
With ActiveDocument.Content.Find
.Text = txt_after_expDate
.Forward = True
.Execute
If .Found = True Then
.Parent.Select
Set after_rng = Selection.Range
expDateEnd_cursorPos = after_rng.Start - 1
Selection.HomeKey Unit:=wdStory, Extend:=wdExtend
With Selection.Find
.Text = txt_prior_to_expDate
.Forward = False
.Execute
If .Found = True Then
.Parent.Select
Set charBefore_expDate = Selection.Range
expDateStart_cursorPos = charBefore_expDate.Start + 1
End If
End With
End If
End With
'expDate = CursorPosition(expDateStart_cursorPos, expDateEnd_cursorPos)
'MsgBox ("expDate = " & expDate) 'DELETEMSGBOX
End Sub
The trick to something like this is to work with multiple Range objects. My personal preference is to declare a Range for each separate thing to be worked with, rather than trying to figure out the minimum and re-use a Range - at least for the initial code and testing purposes.
For this task, then, I use four Ranges: 1) For the original search, 2) for the end of the "cursor position" that's wanted, 3) For the second search, 4) for the final "cursor position".
The other important concepts are how to "collapse" a Range and how to "copy" one.
Collapsing a Range is like pressing the right- or left-arrow key with a selection, so that it is a "point" and doesn't contain anything. A Range can be collapsed to its start or end position.
Copying a Range (setting one Range to another) needs to be done using the Duplicate property so that the copy is independent of the original. Otherwise, when one is changed the other changes, as well.
Sub FindfilenameDate()
Dim rngFind As Word.Range, rngBefore As Word.Range
Dim rngAfter As Word.Range, rngFound As Word.Range
txt_prior_to_expDate = "\"
txt_after_expDate = "0000"
Set rngFind = ActiveDocument.content
With rngFind.Find
.Text = txt_after_expDate
.Forward = True
.Execute
If .found = True Then
Set rngAfter = rngFind.Duplicate
rngAfter.Collapse wdCollapseStart
Set rngBefore = rngFind.Duplicate
rngBefore.Collapse wdCollapseStart
With rngBefore.Find
.Text = txt_prior_to_expDate
.Forward = False
.Execute
If .found = True Then
Set rngFound = rngBefore.Duplicate
rngFound.Collapse wdCollapseEnd
rngFound.End = rngAfter.Start
'rngFound.Select
End If
End With
End If
End With
'expDate = CursorPosition(expDateStart_cursorPos, expDateEnd_cursorPos)
MsgBox ("expDate = " & rngFound.Text) 'DELETEMSGBOX
End Sub
Though it's not apparent why you're after the date string ending in _0000 rather than the date that is the parent folder name, a much simpler approach for a single date would be:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "[0-9]{4}_[0-9]{2}_[0-9]{2}_0000"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
If .Find.Found = True Then MsgBox "expDate = " & Split(.Text, "_0000")(0)
End With
Application.ScreenUpdating = True
End Sub
And, for all such dates in a document:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "[0-9]{4}_[0-9]{2}_[0-9]{2}_0000"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
Do While .Find.Found = True
MsgBox "expDate = " & Split(.Text, "_0000")(0)
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
Application.ScreenUpdating = True
End Sub

VBA Word - Finding specific text in a document and replacing the content of the paragraph following that specific text

Here is a sample of text from my word document :
https://www.noelshack.com/2018-31-2-1533054408-word.png
I am new to VBA and I am trying to write a macro that looks for the specific text """"Eligible Currency"" means the Base Currency and each other currency specified here:" and replace the two following lines (filled with some dots, not necessarily in the same paragraph) with a list of text (for instance : Euro, Dollar).
So far I have been able to loop through the document, find the specific text and edit it, using the code :
Sub FindAndFormat()
Dim objWord As Object
Dim wdDoc As Object
Dim ParagraphRange As Object
Dim intParaCount
On Error Resume Next
Set objWord = GetObject(, "Word.Application")
If Err.Number <> 0 Then
Set objWord = CreateObject("Word.Application")
End If
On Error GoTo 0
Set wdDoc = objWord.Documents.Open("D:\Modele.docx")
objWord.Visible = True
Dim Paragraph As Word.Paragraph
For Each Paragraph In wdDoc.Paragraphs
Set ParagraphRange = Paragraph.Range
ParagraphRange.Find.Text = """Eligible Currency"" means the Base Currency and each other currency specified here:"
ParagraphRange.Find.Execute
If ParagraphRange.Find.Found Then
ParagraphRange.Text = """Eligible Currency"" means the Base Currency and each other currency specified here: Euro, Dollar"
End If
Next
End Sub
Note that the style of the whole line is getting bold and italic.
https://www.noelshack.com/2018-31-2-1533055581-word2.png
What I really would like to achieve is replacing the dotty lines :
https://www.noelshack.com/2018-31-2-1533055647-word3.png
Now there may be several other dotty lines in my document, and they may not always contain exactly the same amount of dots.
Thank you for reading.
Try something along the lines of:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = """Eligible Currency""[!:]#:[ ….^13^l^t]{2,}"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
Do While .Find.Found
.End = .End - 1
.Start = .Start + InStr(.Text, ":")
.Text = Chr(11) & vbTab
.Collapse wdCollapseEnd
.Text = "Euro, Dollar"
.Font.Bold = True
.Font.Italic = True
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
Application.ScreenUpdating = True
End Sub

To delete everything except for words between a start and end point

I happen to have problems trying to manipulate the below code to my liking.
First off, the code below deletes everything in between the start and end conditions I have stipulated in my program.
I would like to change this, to delete everything besides those stipulated between the start and end words.
Sub SomeSub()
Dim StartWord As String, EndWord As String
Dim Find1stRange As Range, FindEndRange As Range
Dim DelRange As Range, DelStartRange As Range, DelEndRange As Range
Application.ScreenUpdating = False
Application.DisplayAlerts = False
'Setting up the Ranges
Set Find1stRange = ActiveDocument.Range
Set FindEndRange = ActiveDocument.Range
Set DelRange = ActiveDocument.Range
'Set your Start and End Find words here to cleanup the script
StartWord = "From: Research.TA#traditionanalytics.com|Tradition Analytics Commentary| | |"
EndWord = "This message has been scanned for malware by Websense. www.websense.com"
'Starting the Find First Word
With Find1stRange.Find
.Text = StartWord
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
'Execute the Find
Do While .Execute
'If Found then do extra script
If .Found = True Then
'Setting the Found range to the DelStartRange
Set DelStartRange = Find1stRange
'Having these Selections during testing is benificial to test your script
DelStartRange.Select
'Setting the FindEndRange up for the remainder of the document form the end of the StartWord
FindEndRange.Start = DelStartRange.End
FindEndRange.End = ActiveDocument.Content.End
'Having these Selections during testing is benificial to test your script
FindEndRange.Select
'Setting the Find to look for the End Word
With FindEndRange.Find
.Text = EndWord
.Execute
'If Found then do extra script
If .Found = True Then
'Setting the Found range to the DelEndRange
Set DelEndRange = FindEndRange
'Having these Selections during testing is benificial to test your script
DelEndRange.Select
End If
End With
'Selecting the delete range
DelRange.Start = DelStartRange.Start
DelRange.End = DelEndRange.End
'Having these Selections during testing is benificial to test your script
DelRange.Select
'Remove comment to actually delete
DelRange.Delete
End If 'Ending the If Find1stRange .Found = True
Loop 'Ending the Do While .Execute Loop
End With 'Ending the Find1stRange.Find With Statement
End Sub
Hah! That's a new twist There's certainly more than one way to go about it; my inclination would be to work with (at least) three Ranges. Something like this:
Sub FindAndDeleteEverythingElse()
Dim strFind1 As String, strFind2 As String
Dim rngDoc As word.Range, rngFind1 As word.Range
Dim rngFind2 As word.Range
Dim bFound As Boolean
strFind1 = "You"
strFind2 = "directly."
Set rngDoc = ActiveDocument.content
Set rngFind1 = rngDoc.Duplicate
Set rngFind2 = rngDoc.Duplicate
With rngFind1.Find
.Text = strFind1
bFound = .Execute
End With
If bFound Then
With rngFind2.Find
.Text = strFind2
bFound = .Execute
End With
If bFound Then
rngDoc.End = rngFind1.Start
rngDoc.Delete
rngDoc.Start = rngFind2.End
rngDoc.End = ActiveDocument.content.End
rngDoc.Delete
End If
End If
End Sub
The "main" Range is that of the entire document: ActiveDocument.Content. The Range object is a bit different than other objects, if you set one Range to another it becomes that Range, not a copy. So you need the Duplicate method to make a copy of a Range. This lets you use Find independently for the various Ranges.
If the first Find is successful, the second one is executed. If that is also successful then the Document Range's end-point is set to the starting point of the successful Find and the content of the Range deleted. The Document Range is then re-defined to start at the end-point of the second found Range and end at the end of the Document, and deleted.
You will probably have to set more Range.Find properties than I did in this code snippet - I used the absolute minimum to make working with the Ranges clearer.
There maybe another way but till then you can do this.
try to add dummy character after and before your string like this
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchWildcards = False
.Text = "From: Research.TA#traditionanalytics.com|Tradition Analytics Commentary| | |"
.Replacement.Text = "From: Research.TA#traditionanalytics.com|Tradition Analytics Commentary| | |######"
.Execute Replace:=wdReplaceAll
.Text = "This message has been scanned for malware by Websense. www.websense.com"
.Replacement.Text = "######This message has been scanned for malware by Websense. www.websense.com"
.Execute Replace:=wdReplaceAll
End With
End With
End Sub
Then try to set range between ###### and ######
this is best answer to set range select a range of text from one Word document and copy into another Word document
Please note that in my word 2007 it is not possible to find within hyperlinks. Try to remove all hyperlink or within range before doing replacement.
another best answer for that: How do you remove hyperlinks from a Microsoft Word document?