Word document highlight issues - vba

Hi I am very new to VBA word programming. I am trying to use regex and word range to select a pattern of words in a word document containing 108 pages to highlight them in Yellow and Green. When I execute the VBA code the word document hangs for a minute of 2 until the code has processed the request. Please check the code below and suggest.
**extract of the word**
*QR233A(M/W)
if LRD233 xs , LRD237 xs
LRDE233 xs , LRDE237 xs
then #R233A(M/W)
\
.
*QZR233A(M/W) #R233A(M/W) .
*QAR233A(M/W)
if LRD233 xs , LRD237 xs
LRDE233 xs , LRDE237 xs
LARSUDKFTHJS s , LARSUDKFLMS s
then #R233A(M/W)
\
.
*R233A(M/W)
if R233A(M/W) a
P831 cnf , P833 cnf , L(PB-1)SETTINGAVAIL xs
LSGPBA xs
then R233A(M/W) s
P831 cn , P833 cn
**VBA Code**
Sub Reminder_Highlight()
Dim match As VBScript_RegExp_55.match
Dim matches As VBScript_RegExp_55.MatchCollection
Dim myrange As Range
Dim rng3 As Selection
Dim counter As Integer
Set myrange = ActiveDocument.Content
Set rng3 = Selection
Dim Panel_request As Boolean
Dim Reminder_latch As Boolean
With New VBScript_RegExp_55.RegExp
.Pattern = "(\*Q(A|R|RD)\S+|LRD\S+\s(xs\s+,|xs)|(\#R|\*R)\S+)"
.Global = True
Set matches = .Execute(rng3.Text)
End With
Debug.Print matches.Count
For Each match In matches
myrange.SetRange rng3.Characters(match.FirstIndex + 1).Start, rng3.Characters(match.FirstIndex + match.Length).End
If Left(match, 1) = "#" Or Mid(match, 1, 2) = "*R" Then myrange.HighlightColorIndex = wdBrightGreen Else myrange.HighlightColorIndex = wdYellow
Debug.Print matches.Item(counter) & " "; counter
counter = counter + 1
Next
Set matches = Nothing
Set rng2 = Nothing
Set rng1 = Nothing
Set rng3 = Nothing
End Sub

Try this.
Sub HighlightSpecificWords()
Dim sArr() As String
Dim rTmp As Range
Dim x As Long
sArr = Split("highlight specific words") ' your list
Options.DefaultHighlightColorIndex = wdYellow
For x = 0 To UBound(sArr)
Set rTmp = ActiveDocument.Range
With rTmp.Find
.Text = sArr(x)
.Replacement.Text = sArr(x)
.Replacement.Highlight = True
.Execute Replace:=wdReplaceAll
End With
Next
End Sub
Before:
After:

Related

Delete duplicate words from document more efficiently

I compare each word with other and check if that is the duplicate if yes then delete it. For 1 to 4 pages it takes at most 5 minutes.
For a document of 50 or 100 pages I need of modification or a new idea to compare and delete duplicates with less time.
Sub Delete_Duplicates()
'***********'
'By
'MBA
'***********'
Dim AD As Range
Dim F As Range
Dim i As Long
Set AD = ActiveDocument.Range
Z = AD.Words.Count
y = 1
For i = Z To 1 Step -1
y = y + 1
Set F = AD.Words(i)
On Error Resume Next
Set s = AD.Words(i - 1)
If Trim(AD.Words(i - 1)) = "," Then Set s = AD.Words(i - 2): Set c = AD.Words(i - 1)
If Err.Number > 0 Then Exit Sub
If Not F.Text = Chr(13) And UCase(Trim(F.Text)) = UCase(Trim(s.Text)) Then
F.Text = ""
If Not c Is Nothing Then c.Text = " ": Set c = Nothing
End If
If Not c Is Nothing Then Set c = Nothing
On Error Resume Next
Call ProgressBar.Progress(y / Z * 100, True) '<<-- Progress Bar
On Error GoTo 0
Next
Beep
End Sub
Before/After
Assuming that the entire document is plain text, we can assign the entire document's text and use Split to convert it into array of words.
Since it's in array, it will be faster to process through them all vs accessing the Words collection.
This is all I can think of but perhaps there's a better way to do this? Below example uses Regex to search through and replace all matched duplicate:
Option Explicit
Sub Delete_Duplicate()
Const maxWord As Long = 2 'Change this to increase the max amount of words should be used to match as a phrase.
Dim fullTxt As String
fullTxt = ActiveDocument.Range.Text
Dim txtArr() As String
txtArr = Split(fullTxt, " ")
Dim regex As RegExp
Set regex = New RegExp
regex.Global = True
regex.IgnoreCase = True
Dim outputTxt As String
outputTxt = fullTxt
Dim n As Long
Dim i As Long
For i = UBound(txtArr) To 0 Step -1
Dim matchWord As String
matchWord = vbNullString
For n = 0 To maxWord - 1
If (i - n) < 0 Then Exit For
matchWord = txtArr(i - n) & " " & matchWord
matchWord = Trim$(Replace(matchWord, vbCr, vbNullString))
regex.Pattern = matchWord & "[, ]{0,}" & matchWord
If regex.test(outputTxt) Then
outputTxt = regex.Replace(outputTxt, matchWord)
End If
Next n
Next i
Set regex = Nothing
Application.UndoRecord.StartCustomRecord "Delete Duplicates"
ActiveDocument.Range.Text = outputTxt
Application.UndoRecord.EndCustomRecord
End Sub
You might try something along the lines of:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Replacement.Text = "\1"
.Forward = True
.Wrap = wdFindContinue
.MatchWildcards = True
.Text = "([A-Za-z0-9'’]#)[, ]#\1"
.Execute
Do While .Found = True
.Execute Replace:=wdReplaceAll
Loop
.Text = "([A-Za-z0-9'’]#[, ]#[A-Za-z0-9'’]#)[, ]#\1"
.Execute
Do While .Found = True
.Execute Replace:=wdReplaceAll
Loop
End With
End With
Application.ScreenUpdating = True
End Sub
It is only conception but try to prepare list of all words in document and replace double or triple words if existing.
Private Sub DeleteDuplicate()
Dim wholeTxt As String
Dim w As Range
Dim col As New Collection
Dim c
For Each w In ActiveDocument.Words
AddUniqueItem col, Trim(w.Text)
Next w
wholeTxt = ActiveDocument.Range.Text
For Each c In col
'add case with ","
'maybe one letter word should be forbidden, or add extra boundary
If InStr(1, wholeTxt, c & " " & c, vbBinaryCompare) <> 0 Then
'start of doc
Selection.HomeKey wdStory
'here should be all stuff to prepare replacement
'(...)
Selection.Find.Execute Findtext:=c & " " & c, ReplaceWith:=c
wholeTxt = ActiveDocument.Range.Text
End If
Next c
Set col = Nothing
End Sub
Private Sub AddUniqueItem(ByRef col As Collection, ByVal itemValAndKey As String)
Dim s As String
On Error Resume Next
s = col(itemValAndKey)
If Err.Number <> 0 Then
col.Add itemValAndKey, itemValAndKey
Err.Clear
End If
On Error GoTo 0
End Sub

Remove highlighting in a specified range

I have highlighted paragraphs in a Word document, from which I have to remove highlighting from 3rd to 5th character of each paragraph.
By searching for highlighted ranges within Set r = ActiveDocument.Range in VBA the segments of text are perfectly found.
The error appears on the line r(Start:=r_start, End:=r_end).HighlightColorIndex = wdNoHighlight.
Compile error: Wrong number of arguments or invalid property assignment.
How danI correctly specify the subrange from 3rd to 5th character within the range r? Your help is appreciated.
Dim r as Range
Dim r_start As Integer
Dim r_end As Integer
r_start = 2
r_end = 5
Set r = ActiveDocument.Range
With r.Find
.Highlight = True
Do While .Execute(FindText:="", Forward:=True) = True
if r.Characters.Count > 7 Then
r(Start:=r_start, End:=r_end).HighlightColorIndex = wdNoHighlight
End If
r.Collapse 0
Loop
End With
The problem causing the error message is that only the Range method (as in Document.Range) takes arguments. The Range object, since it's not a method, can take no arguments. In order to set the Start and End of a Range object you need the properties of those names. So:
r.Start = r.Start + r_start
r.End = r.Start + r_end
Your code has a number of other problems which I encountered while testing. For example, if you set the Start position to r_start and the End position to r_end the Range r will be from the second to the fifth characters of the entire document, not the second to fifth characters of the Range r. That's why the two lines of code, above, have been changed from your original.
The next problem is that the code, as it stands, goes into an infinite loop since the search always begins from within the "found" highlighting. For that reason I've added a variable to capture the end point of the originally Found range and use that as the starting point for the Range to be searched in each loop. The end of the Range to search is set to the end of the document.
Here's my sample code:
Sub FindRemoveHighlighting()
Dim r As Range, rDoc As Range
Dim r_foundEnd As Long
Dim r_start As Long
Dim r_end As Long
r_start = 2
r_end = 5
Set rDoc = ActiveDocument.content
Set r = rDoc.Duplicate
With r.Find
.Highlight = True
.Text = ""
.Format = True
.Format = True
Do While .Execute() = True
If r.Characters.Count > 7 Then
rFoundEnd = r.End
r.Start = r.Start + r_start
r.End = r.Start + r_end
r.HighlightColorIndex = wdNoHighlight
End If
r.Start = rFoundEnd
r.End = rDoc.End
Loop
End With
End Sub
You were almost there, I modified the code and tested it and it worked perfectly on my end. It will find any highlighted range in the document and will remove the highlights from character 2 to character 5:
Sub GetHighlights()
Dim rng1 As Range
Dim rng2 As Range
Dim r_start As Integer
Dim r_end As Integer
r_start = 2
r_end = 5
Set rng1 = ActiveDocument.Range
With rng1.Find
.Highlight = True
Do While .Execute(FindText:="", Forward:=True) = True
If rng1.Characters.Count > 7 Then
Set rng2 = ActiveDocument.Range(Start:=rng1.Start + r_start, End:=rng1.Start + r_end)
rng2.HighlightColorIndex = wdNoHighlight
End If
rng1.Collapse 0
Loop
End With
End Sub

Applying style to particular words

I am using RegEx search, for find out the particular word in my MS-Word document, and the search result is stored into a variable. My problem is I want to apply a custom style only for the search result
Input:
worldwide[1,2]. Before, during or after the [1,3,4][1,2,4,5] [1,2,6,7,8] [1,2] [1,2]
I am using the following code
Sub RegexReplaces()
Set matches = New regExp
Dim Sure As Integer
Dim rng As Range
matches.Pattern = "([\[\(][0-9, -]*[\)\]])"
matches.Global = True
Dim mat As MatchCollection
Set mat = matches.Execute(ActiveDocument.Range)
For Each m In mat
Sure = MsgBox("Are you sure?" + m, vbOKCancel)
If Sure = 1 Then
m.Style = ActiveDocument.Styles("Heading 1") 'this is the error line
Else
MsgBox "not1111"
End If
Next m
End Sub
The For Each m In mat loop iterates over each item in the mat collection. M is not a range. You need to set a range starting at m.FirstIndex and ending at m.FirstIndex + m.Length. Then you'll need to select the range and use Selection.Style to style the range.
Sub RegexReplaces()
Set matches = New regExp
Dim Sure As Integer
Dim rng As Range
matches.Pattern = "([\[\(][0-9, -]*[\)\]])"
matches.Global = True
Dim mat As MatchCollection
Set mat = matches.Execute(ActiveDocument.Range)
For Each m In mat
Sure = MsgBox("Are you sure?" + m, vbOKCancel)
If Sure = 1 Then
Set rng = ActiveDocument.Range(Start:=m.FirstIndex, End:=m.Length + m.FirstIndex)
rng.Select
Selection.Style = ActiveDocument.Styles("Heading 1")
Else
MsgBox "not1111"
End If
Next m
End Sub

How to get all text between <strong> </strong> in MS word to turn Bold using VBA?

Basically I want to transform the text in between the tags into bold. This text will always be in the comments. The current code doesnt do anything.
I am not really sure if this code makes any sense at all, but I usually use VBA for Excel and word seems to be a bit trickier.
Sub Bold()
Dim eCom As Comment
Dim iFound As Integer
Dim rbold As Range
Dim iDot As Integer
Dim flag As Boolean
Dim aDoc As Document
Set aDoc = ActiveDocument
flag = True
Application.ScreenUpdating = False
For Each eCom In ActiveDocument.Comments
iFound = InStr(eCom.Range.Text, "<strong>")
iDot = 0
If iFound > 0 Then
iDot = InStrRev(eCom.Range, "</") - iFound + 1
Set rbold = aDoc.Range(Start:=eCom.Range.Start + iFound, End:=eCom.Range.Start + InStrRev(eCom.Range, "<"))
rbold.Select
Selection.Font.Bold = wdToggle
End If
Next eCom
Application.ScreenUpdating = True
End Sub
There are a few problems here. First, it appears that the Comment Ranges do not use the same numbering as the document ranges. So
Set rbold = aDoc.Range(Start:=eCom.Range.Start + iFound, End:=eCom.Range.Start + InStrRev(eCom.Range, "<"))
is not actually the range in the comments, it is instead a range in the document starting with the place in the comment that has the strong html tag.
Second, even if this was working, it would start the bolding in the wrong place, starting with "strong>"
Third, there's no reason to select the range, just set it to bold.
This code will do what you want (I commented out a line as I couldn't figure out what it was supposed to do):
Sub Bold()
Dim eCom As Comment
Dim iFound As Integer
Dim rbold As Range
Dim iDot As Integer
Dim flag As Boolean
Dim aDoc As Document
Dim newCom As Comment
Set aDoc = ActiveDocument
flag = True
Application.ScreenUpdating = False
For Each eCom In ActiveDocument.Comments
iFound = InStr(eCom.Range.Text, "<strong>")
iDot = 0
If iFound > 0 Then
'iDot = InStrRev(eCom.Range, "</") - iFound + 1
Set rbold = eCom.Range
rbold.MoveEnd Unit:=wdCharacter, Count:=-(Len(rbold) - InStrRev(rbold, "</") + 1)
rbold.MoveStart Unit:=wdCharacter, Count:=iFound + Len("<strong>") - 1
rbold.Bold = True
End If
Next eCom
Application.ScreenUpdating = True
End Sub

VBA Excel lookup

I need help in finding an exact match by using VBA in Excel. Here is my object 7 problem.
Objective - to batch process finding and replacing words.
This is a routine task which I'm trying to automate. The task involves finding terms and then replacing them with an alternate word. E.g if the term to be found is "microsoft", I want it to be replaced with say "Company".
While majority of the code is working the limitation is --> if there are two words to be found e.g. 1. Gold 2. Golden and then replace "gold" with "metal" and golden with " mineral here's what happens. If the code find Golden anywhere then the word gold is replaced first and the end product looks like this. Metalen. can someone please help?
Dim wksheet As Worksheet
Dim wkbook As Workbook
Dim fo_filesys As New Scripting.FileSystemObject
Dim RegExpObject As Object
Private Sub cmd_Start_Click()
Dim lsz_dest_path As String
Dim lsz_extn_to_use As String
Dim lsz_filename As String
Dim li_rowtoread As Integer
Application.DisplayAlerts = False
Application.ScreenUpdating = False
lsz_dest_path = VBA.Strings.Trim(Cells(1, 6))
lsz_extn_to_use = VBA.Strings.Trim(Cells(2, 6))
Set RegExpObject = CreateObject("VBScript.RegExp")
RegExpObject.IgnoreCase = True
RegExpObject.Global = True
lsz_filename = Dir(lsz_dest_path & "\" & lsz_extn_to_use)
Do While lsz_filename <> ""
Application.StatusBar = "Scrubbing " & lsz_filename
Set wkbook = Workbooks.Open(lsz_dest_path & "\" & lsz_filename)
For Each wksheet In wkbook.Worksheets
wksheet.Activate
li_rowtoread = 2
Do While Cells(li_rowtoread, 1) <> ""
user_process_file Cells(li_rowtoread, 1), Cells(li_rowtoread, 2), lsz_filename
li_rowtoread = li_rowtoread + 1
DoEvents
Loop
Next wksheet
wkbook.Close True
lsz_filename = Dir
Loop
Application.StatusBar = ""
End Sub
Sub user_process_file(lsz_searh_str As String, lsz_replace_str As String, filename As String)
Dim myRange As Range
Dim lo_tstream As TextStream
Dim lo_reader_tstream As TextStream
Dim lsz_file As String
Dim lb_replaced As Boolean
If fo_filesys.FileExists(filename & ".log") Then
Set lo_reader_tstream = fo_filesys.OpenTextFile(filename & ".log", ForReading)
lsz_file = lo_reader_tstream.ReadAll
lo_reader_tstream.Close
End If
If lsz_searh_str = "RRD" Then
' MsgBox "Here"
End If
Set myRange = wksheet.Cells
myRange.Cells.Find(What:="", After:=ActiveCell, lookat:=xlPart).Activate
'myRange.Replace What:=lsz_searh_str, Replacement:=lsz_replace_str, LookAt:=xlWorkbook, MatchCase:=False, searchorder:=xlByRows ', LookIn:=xlFormulas
With myRange
Set c = .Find(lsz_searh_str, LookIn:=xlValues, lookat:=xlPart)
If Not c Is Nothing Then
firstAddress = c.Address
Do
c.Value = CustomReplace(c.Value, lsz_searh_str, lsz_replace_str)
Set c = .FindNext(c)
Loop While Not c Is Nothing And c.Address <> firstAddress
End If
End With
Set lo_tstream = fo_filesys.OpenTextFile(filename & ".log", ForAppending, True)
lb_replaced = myRange.Replace(What:=lsz_searh_str, Replacement:=lsz_replace_str, lookat:=xlWhole, MatchCase:=True, searchorder:=xlByRows)
If lb_replaced = True Then
lo_tstream.WriteLine lsz_replace_str
lo_tstream.Close
End If
End Sub
Function user_eval(lookfor As String, loc_data As String) As Boolean
Dim lsz_val_at_loc As String
If InStr(1, loc_data, lookfor) = 1 Then
user_eval = True
Else
user_eval = False
End If
End Function
Function CustomReplace(OriginalString As String, FindString As String, ReplaceString As String)
RegExpObject.Pattern = "[^a-zA-Z0-9]*" & FindString & "[^a-zA-Z0-9]*"
CustomReplace = RegExpObject.Replace(OriginalString, ReplaceString)
End Function
I do not have permissions to add a comment, so answering the only way I can:
There is a problem with your regex find string [^a-zA-Z0-9]* and [^a-zA-Z0-9]*.
Try using \bgold\w+\b to match words starting with gold and \bgold\b to match gold exactly.
Although I'm answering late, it might help somebody who has a similar problem...