Identify Invalid Characters - vba

I need to loop through each character of a Word document and compare it against a list of valid characters. If the current character is not in the list, its font color should be changed.
I am a newbie to VBA and have written a small loop, but it takes very long to go through even a small Word file. The code is below:
Sub LoopThruFile()
Dim doc As Document
Dim CurrChar As String
Application.ScreenUpdating = False
Set doc = ActiveDocument
For i = 1 To doc.Range.Characters.Count
CurrChar = doc.Range.Characters(i)
If InStr("01234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,.-_()/#:&\%", CurrChar) = 0 Then
doc.Range.Characters(i).Font.ColorIndex = wdRed
End If
Next
Application.ScreenUpdating = True
End Sub
Is there a better and faster way to do this?

Looping over individual characters is slow. One thing that could improve performance in your example is to reduce the number of hierarchy levels by working against the Range directly:
Dim doc as Document
Dim docRange as Range
Dim CurrChar as String
Set doc = ActiveDocument
Set docRange = doc.Content
Note that Document.Range is actually a method that takes two optional parameters (Start and End); Document.Content is a property that returns the entire main text Range directly, so it is the more correct choice.
Also, each item in the Characters collection is actually a Range object. VBA has to resolve the default Text property in order to assign it to the String CurrChar. Stating the property explicitly probably doesn't make a lot of difference, but it is more correct and probably a little faster:
CurrChar = docRange.Characters(i).Text
Sometimes a loop can be faster if you run from the end of the document to the beginning:
For i = docRange.Characters.Count to 1 Step -1
CurrChar = docRange.Characters(i).Text
If InStr("01234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,.-_()/#:&\%", CurrChar) = 0 Then
docRange.Characters(i).Font.ColorIndex = wdRed
End If
Next
You could also try using For Each, which might be the fastest in this scenario since you save multiple calls of doc.Range.Characters(i), each of which consumes resources.
Dim CurrChar as Range
For Each CurrChar in docRange.Characters
If InStr("01234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,.-_()/#:&\%", CurrChar.Text) = 0 Then
CurrChar.Font.ColorIndex = wdRed
End If
Next
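Putting the suggestions above together, a complete procedure might look like the following. This is a minimal sketch, not a measured benchmark: the name MarkInvalidCharacters and the constant VALID_CHARS are made up for illustration, and the list of valid characters is copied from the question.

```vba
Sub MarkInvalidCharacters()
    ' Hypothetical name; the list of valid characters is taken from the question.
    Const VALID_CHARS As String = _
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,.-_()/#:&\%"
    Dim ch As Word.Range

    Application.ScreenUpdating = False
    ' Each item of the Characters collection is itself a Range
    For Each ch In ActiveDocument.Content.Characters
        ' Binary comparison keeps the test case-sensitive
        If InStr(1, VALID_CHARS, ch.Text, vbBinaryCompare) = 0 Then
            ch.Font.ColorIndex = wdRed
        End If
    Next ch
    Application.ScreenUpdating = True
End Sub
```

Spaces and paragraph marks are not in the list, so they are recolored as well, just as in the original loop; add them to VALID_CHARS if that is not wanted.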

The following worked for me. Try this:
With ActiveDocument.Content.Find
.ClearFormatting
.Replacement.ClearFormatting
.text = "[!abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0-9\,\.\-_\(\)\/\\\#\:\&\%]"
.Replacement.text = ""
.Replacement.Font.Color = wdColorRed
.MatchWildcards = True
.Execute Replace:=wdReplaceAll
End With

Related

How to print row of found string?

I'd like to find several strings within a Word document and, for each string found, print (Debug.Print, for example) the whole row where the string is found, not the paragraph.
How can I do this? Thanks
Sub FindStrings()
Dim StringsArr As Variant
Dim i As Long
StringsArr = Array("string1", "string2", "string3")
For i = LBound(StringsArr) To UBound(StringsArr)
With Selection.Find
.ClearFormatting
.Text = StringsArr(i)
Debug.Print CurrentRow 'here I need help
End With
Next
End Sub
The term Row in Word is used only in the context of a table. I assume the term you mean is Line, as in a line of text.
The Word object model has no concept of "line" (or "page") due to the dynamic layout algorithm: anything the user does, even changing the printer, could change where a line or a page breaks. Since these things are dynamic, there is no corresponding object.
The only context where "line" can be used is in connection with a Selection. For example, it's possible to extend a Selection to the start and/or end of a line. Incorporating this into the code in the question, it would look something like:
Sub FindStrings()
Dim StringsArr As Variant
Dim bFound As Boolean
Dim i As Long
Dim rng As Word.Range
Set rng = ActiveDocument.content
StringsArr = Array("string1", "string2", "string3")
For i = LBound(StringsArr) To UBound(StringsArr)
With rng.Find
.ClearFormatting
.Text = StringsArr(i)
.Wrap = wdFindStop
bFound = .Execute
'extend the selection to the start and end of the current line
Do While bFound
rng.Select
Selection.MoveStart wdLine, -1
Selection.MoveEnd wdLine, 1
Debug.Print Selection.Text
rng.Collapse wdCollapseEnd
bFound = .Execute
Loop
End With
Set rng = ActiveDocument.content
Next
End Sub
Notes
Since it's easier to control when having to loop numerous times, a Range object is used as the basic search object, rather than Selection. The found Range is only selected for the purpose of getting the entire line as these "Move" methods for lines only work on a Selection.
Before the loop can continue, the Range (or, if we were working with a selection, the selection) needs to be "collapsed" so that the code does not search and find the same instance of the search term, again. (This is also the reason for Wrap = wdFindStop).
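As an alternative way to get the line, the sketch below uses Word's predefined "\Line" bookmark on the Selection instead of the two Move calls. It handles a single search term, and the name PrintLinesContaining and its parameter are hypothetical.

```vba
Sub PrintLinesContaining(ByVal searchText As String)
    Dim rng As Word.Range
    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = searchText
        .Forward = True
        .Wrap = wdFindStop
        Do While .Execute
            ' The "\Line" bookmark needs a Selection, so select the hit first
            rng.Select
            Debug.Print Selection.Bookmarks("\Line").Range.Text
            ' Collapse so the next Execute starts after this hit
            rng.Collapse wdCollapseEnd
        Loop
    End With
End Sub
```

It could be called as PrintLinesContaining "string1", once per search term. A line that contains the term twice is printed twice.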

Iterate through paragraphs and trim spaces in MS Word

I need to create a macro which removes whitespace and indents before all paragraphs in the active MS Word document. I've tried the following:
For Each p In ActiveDocument.Paragraphs
p.Range.Text = Trim(p.range.Text)
Next p
which sends the macro into an endless loop. If I try to assign a string literal to the paragraphs, VBA always creates only 1 paragraph:
For Each p In ActiveDocument.Paragraphs
p.Range.Text = "test"
Next p
I think I have a general misconception about the Paragraph object. I would appreciate any enlightenment on the subject.
The reason the code in the question loops endlessly is that replacing a paragraph with the processed (trimmed) text changes the Paragraphs collection, so at some point the code keeps processing the same paragraph over and over.
This is normal behavior with objects that are getting deleted and recreated "behind the scenes". The way to work around it is to loop the collection from the end to the front:
For i = ActiveDocument.Paragraphs.Count To 1 Step -1
Set p = ActiveDocument.Paragraphs(i)
p.Range.Text = Trim(p.Range.Text)
Next
That said, if the paragraphs in the document contain any formatting this will be lost. String processing does not retain formatting.
An alternative would be to check the first character of each paragraph for the kinds of characters you consider to be "white space". If present, extend the range until no more of these characters are detected, and delete. That will leave the formatting intact. (Since this does not change the entire paragraph a "normal" loop works.)
Sub TestTrimParas()
Dim p As Word.Paragraph
Dim i As Long
Dim rng As Word.Range
For Each p In ActiveDocument.Paragraphs
Set rng = p.Range.Characters.First
'Test for a space or TAB character
If rng.Text = " " Or rng.Text = Chr(9) Then
i = rng.MoveEndWhile(" " + Chr(9))
Debug.Print i
rng.Delete
End If
Next p
End Sub
You could, of course, do this in a fraction of the time without a loop, using nothing fancier than Find/Replace. For example:
Find = ^p^w
Replace = ^p
and
Find = ^w^p
Replace = ^p
As a macro this becomes:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
.InsertBefore vbCr
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchWildcards = False
.Text = "^p^w"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
.Text = "^w^p"
.Execute Replace:=wdReplaceAll
End With
.Characters.First.Text = vbNullString
End With
Application.ScreenUpdating = True
End Sub
Note also that trimming text the way you're doing is liable to destroy all intra-paragraph formatting, cross-reference fields, and the like; it also won't change indents. Indents can be removed by selecting the entire document and changing the paragraph format; better still, modify the underlying Styles (assuming they've been used correctly).
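For the indents, changing the paragraph format of the whole document could be sketched as follows. The name RemoveParagraphIndents is hypothetical, and this applies direct formatting rather than modifying the underlying Styles.

```vba
Sub RemoveParagraphIndents()
    ' Hypothetical name; applies direct formatting to every paragraph in the main text
    With ActiveDocument.Content.ParagraphFormat
        .LeftIndent = 0
        .FirstLineIndent = 0
    End With
End Sub
```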
Entering an "eternal" loop is a bit unpleasant. Only Chuck Norris can exit one. Anyway, try checking before trimming and it will not enter one:
Sub TestMe()
Dim p As Paragraph
For Each p In ActiveDocument.Paragraphs
If p.Range <> Trim(p.Range) Then p.Range = Trim(p.Range)
Next p
End Sub
As @Cindy Meister said, I need to prevent the endless creation of new paragraphs when trimming them. Bearing in mind that a paragraph range contains at least 1 character (the paragraph mark), processing the range minus its last character is safe. The following has worked for me:
Sub ProcessParagraphs()
Set docContent = ActiveDocument.Content
' replace TAB characters throughout the document with a single space (Trim does not remove TAB)
docContent.Find.Execute FindText:=vbTab, ReplaceWith:=" ", Replace:=wdReplaceAll
For Each p In ActiveDocument.Paragraphs
' delete empty paragraph (the delete operation is safe; we cannot enter an endless loop here)
If Len(p.range.Text) = 1 Then
p.range.Delete
' remove whitespaces
Else
Set thisRg = p.range
' shrink range by 1 character
thisRg.MoveEnd wdCharacter, -1
thisRg.Text = Trim(thisRg.Text)
End If
p.LeftIndent = 0
p.FirstLineIndent = 0
p.Reset
p.range.Font.Reset
Next
With Selection
.ClearFormatting
End With
End Sub
I saw a number of solutions here; this is what worked for me. Note that I turn off Track Changes and then restore the document's original tracking status.
I hope this helps someone.
Option Explicit
Public Function TrimParagraphSpaces()
Dim TrackChangeStatus: TrackChangeStatus = ActiveDocument.TrackRevisions
ActiveDocument.TrackRevisions = False
Dim oPara As Paragraph
For Each oPara In ActiveDocument.StoryRanges(wdMainTextStory).Paragraphs
Dim oRange As Range: Set oRange = oPara.Range
Dim endRange As Range, startRange As Range
Set startRange = oRange.Characters.First
Do While (startRange = Space(1))
startRange.Delete 'Remove leading space in each paragraph
Set startRange = oRange.Characters.First
Loop
Set endRange = oRange
' NOTE: the end range must select the second-to-last character; endRange.Characters.Last returns the Chr(13) paragraph mark
endRange.SetRange Start:=oRange.End - 2, End:=oRange.End - 1
Do While (endRange = Space(1))
'endRange.Delete 'NOTE: Delete somehow does not work for the last paragraph
endRange.Text = "" 'Remove trailing space in each paragraph
Set endRange = oPara.Range
endRange.SetRange Start:=oRange.End - 1, End:=oRange.End
Loop
Next
ActiveDocument.TrackRevisions = TrackChangeStatus
End Function

VBA Word macro not working as expected with field results in document

I have a Word document (a report), and in that document I'm importing many text files with fields like this:
{INCLUDETEXT "C:\\PATH\\TOXMLFILES\\Request.xml" \*CHARFORMAT}
Also I'm updating all those fields with a macro on opening the document...
Sub AutoOpen()
With Options
.UpdateFieldsAtPrint = True
.UpdateLinksAtPrint = True
End With
ActiveDocument.Fields.Update
End Sub
Now I need to highlight the text of those imported XMLs (in the IncludeText fields) between <faultstring></faultstring> tags
Here is the code I got here on Stack Overflow for highlighting text (making it bold):
Sub BoldBetweenQuotes()
' base for a quotes finding macro
Dim blnSearchAgain As Boolean
Dim blnFindStart As Boolean
Dim blnFindEnd As Boolean
Dim rngFind As word.Range
Dim rngFindStart As word.Range
Dim rngFindEnd As word.Range
Set rngFind = ActiveDocument.content
Set rngFindStart = rngFind.Duplicate
Do
' set up find of first of quote pair
With rngFindStart.Find
.ClearFormatting
.Text = "<faultstring>"
.Replacement.Text = ""
.Forward = True
.wrap = wdFindStop
blnFindStart = .Execute
End With
If blnFindStart Then
rngFindStart.Collapse wdCollapseEnd
Set rngFindEnd = rngFindStart.Duplicate
rngFindEnd.Find.Text = "</faultstring>"
blnFindEnd = rngFindEnd.Find.Execute
If blnFindEnd Then
rngFindStart.End = rngFindEnd.Start
' make it bold
rngFindStart.Font.Bold = True
rngFindStart.Start = rngFindEnd.End
rngFindStart.End = rngFind.End
blnSearchAgain = True
Else
blnSearchAgain = False
End If
Else
blnSearchAgain = False
End If
Loop While blnSearchAgain = True
End Sub
The problem is that when I run the macro in my Word document (with the IncludeText fields), it keeps cycling and bolding just the first appearance of text between faultstring tags. When I run it in a new Word document with some random text and faultstring tags, it works well...
EDIT: It turns out the problem is due to the faultstring tags being inside the IncludeText fields. I need to turn the fields into static text after opening the document and updating the fields. How can I do that?
In order to convert dynamic field content to static text using Word's object model (such as VBA) the Fields.Unlink method is required. For the entire document:
ActiveDocument.Fields.Unlink
This is also possible for any given Range; to remove the fields in the last paragraph, for example:
ActiveDocument.Paragraphs.Last.Range.Fields.Unlink
In order to unlink only a certain type of field, loop the Fields collection, test the Field.Type and unlink accordingly. For example, for IncludeText:
Sub DeleteIncludeTextFields()
Dim doc As word.Document
Set doc = ActiveDocument
Debug.Print DeleteFieldType(wdFieldIncludeText, doc)
End Sub
Function DeleteFieldType(fldType As word.WdFieldType, doc As word.Document) _
As Long
Dim fld As word.Field
Dim counter As Long
counter = 0
For Each fld In doc.Fields
If fld.Type = fldType Then
fld.Unlink
counter = counter + 1
End If
Next
DeleteFieldType = counter
End Function
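Unlinking a field turns it into plain text, so it drops out of the Fields collection while the loop is running. As with any collection that shrinks during processing, a more defensive variant is to loop by index from the end. A minimal sketch under that assumption; the name UnlinkFieldsOfType is hypothetical:

```vba
Function UnlinkFieldsOfType(ByVal fldType As Word.WdFieldType, _
                            ByVal doc As Word.Document) As Long
    Dim i As Long
    Dim counter As Long
    ' Walk backwards so removed fields do not shift the indexes still to be visited
    For i = doc.Fields.Count To 1 Step -1
        If doc.Fields(i).Type = fldType Then
            doc.Fields(i).Unlink
            counter = counter + 1
        End If
    Next i
    ' Return how many fields were converted to static text
    UnlinkFieldsOfType = counter
End Function
```

It could be called as Debug.Print UnlinkFieldsOfType(wdFieldIncludeText, ActiveDocument).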
Assuming you want to do this for all the fields in your document, after updating it:
Sub AutoOpen()
With Options
.UpdateFieldsAtPrint = True
.UpdateLinksAtPrint = True
End With
ActiveDocument.Fields.Update
ActiveDocument.Fields.Unlink
End Sub

Word VBA: finding a set of words and inserting predefined comments

I need to automate the insertion of comments into a Word document: searching for a predefined set of words (sometimes multi-word strings, all non-case-sensitive), to each of which I add a predefined comment.
There are two word sets, with two goals:
Wordset 1: identical comment for each located word
Wordset 2: individual comments (I suggest new text based on the word identified)
I have been semi-automating this with code that finds all the listed words and highlights them, helping me through the process (but I still need to enter all the comments manually). I've also been able to enter comments, but only for one word at a time. As my VBA skills are limited, my attempts to compile a robust macro from bits of other code with similar purposes have unfortunately led me nowhere.
Below are the bits of code I've been using.
Sub HighlightWordList()
Dim range As range
Dim i As Long
Dim TargetList
TargetList = Array("word1", "word2", "word3")
For i = 0 To UBound(TargetList)
Set range = ActiveDocument.range
With range.Find
.Text = TargetList(i)
.Format = True
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
Do While .Execute(Forward:=True) = True
range.HighlightColorIndex = wdYellow
Loop
End With
Next
End Sub
The following code lets me insert comment bubbles directly:
Sub CommentBubble()
'
'
Dim range As range
Set range = ActiveDocument.Content
Do While range.Find.Execute("Word x") = True
ActiveDocument.Comments.Add range, "my comment to enter in the bubble"
Loop
End Sub
I've tried to have the process repeat itself by doing as shown below, but for reasons I'm certain are evident to many of you (and completely unknown to me) - this strategy has failed, working for "word x" but failing to function for all subsequent words:
Sub CommentBubble()
'
'
Dim range As range
Set range = ActiveDocument.Content
Do While range.Find.Execute("Word x") = True
ActiveDocument.Comments.Add range, "my 1st comment to enter in the bubble"
Loop
Do While range.Find.Execute("Word y") = True
ActiveDocument.Comments.Add range, "my 2nd comment to enter in the bubble"
Loop
End Sub
I've mixed and matched bits of these codes to no avail. Any ideas to help me with either wordset?
Thanks for everyone's help!
Best regards
Benoit, you're almost there! All you need to do is redefine the range object after your first loop (because it would have been exhausted at that point). Like so:
Sub CommentBubble()
Dim rng As range
Set rng = ActiveDocument.Content
Do While rng.Find.Execute("Word x") = True
ActiveDocument.Comments.Add rng, "my 1st comment to enter in the bubble"
Loop
Set rng = ActiveDocument.Content ' <---------------Add This.
Do While rng.Find.Execute("Word y") = True
ActiveDocument.Comments.Add rng, "my 2nd comment to enter in the bubble"
Loop
End Sub
That should do the trick for you (it works on my end). If not, let me know.
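The same idea extends to a whole word list by resetting the range once per word inside a loop. The sketch below assumes two parallel arrays, one of search terms and one of comment texts; the names AddCommentsForWordList, targets and notes are hypothetical. For wordset 1, the same comment text can simply be repeated for every entry.

```vba
Sub AddCommentsForWordList()
    Dim targets As Variant
    Dim notes As Variant
    Dim rng As Word.Range
    Dim i As Long

    ' Parallel arrays: notes(i) is the comment for targets(i)
    targets = Array("Word x", "Word y")
    notes = Array("my 1st comment to enter in the bubble", _
                  "my 2nd comment to enter in the bubble")

    For i = LBound(targets) To UBound(targets)
        ' Start each search from the full document again
        Set rng = ActiveDocument.Content
        With rng.Find
            .ClearFormatting
            .Text = targets(i)
            .MatchCase = False      ' the question asks for a non-case-sensitive search
            .Forward = True
            .Wrap = wdFindStop
            Do While .Execute
                ActiveDocument.Comments.Add rng, notes(i)
                ' Continue searching after the word just commented
                rng.Collapse wdCollapseEnd
            Loop
        End With
    Next i
End Sub
```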

Search sentence and replace for hyperlink in Word VB.NET

I am trying to search for a specific sentence in a Word document and add a hyperlink to it. I tried using this code, but it changes only the first match, not all matches in the document:
Dim r As Word.Range
r = Globals.ThisAddIn.Application.ActiveDocument.Content
With r.Find
.ClearFormatting()
.Text = ("MyWordA MyWordB")
.MatchWholeWord = True
.Forward = True
.Execute()
'If .Found = True Then r.Hyperlinks.Add(r, "http:\\www.whatever", , "Displayed text")
Do While .Execute(Forward:=True) = True
r.Hyperlinks.Add(r, "http:\\www.whatever", , "Displayed text")
'r.Font.ColorIndex = Word.WdColorIndex.wdBlue 'works for all(?)
Loop
End With
Even when I want to find only a single word in a For Each loop, the code finds only the first one:
doc = Globals.ThisAddIn.Application.ActiveDocument
Dim r As Word.Range = doc.Range
Dim ww As Word.Range
For Each ww In r.Words
If ww.Text = "MyWord" Then _
ww.Hyperlinks.Add(ww, "http:\\www.whatever", , "Displayed text")
Next
Could anyone tell me how I can search the whole text and add hyperlinks to every occurrence of the text I am looking for?
The problem is that you keep finding the same text over and over again. Within your loop, after adding the hyperlink, you need to move the range after the added hyperlink. The simplest way to do this is to collapse the range by calling
r.Collapse(WdCollapseDirection.wdCollapseEnd)
To troubleshoot issues like this it is helpful to select the current range so that you can see what is going on.
Do While .Execute(Forward:=True) = True
' select range for troubleshooting
r.Select()
r.Hyperlinks.Add(r, "http:\\www.whatever", , "Displayed text")
' move the range after the link
r.Collapse(WdCollapseDirection.wdCollapseEnd)
Loop