Find non-printable characters using MS Word VBA

I have written a macro that searches a Word document for a text string defined by a wildcard pattern and then finds which paragraph each match belongs to. Everything works fine, but now I have the following challenge: I need to find non-printable characters; specifically, I need to find the MS Word index references, which are shown in Word like {XE “text to index”}.
I ran into a discrepancy: when I open the MS Word Find dialog box (Ctrl+F) and enter the wildcard search pattern ‘XE “*”’, Word finds these entries. However, when I pass the same pattern to the VBA Find function, it does not find them, so the manually invoked Find and the one called from VBA behave differently.
Any idea how to find these non-printable characters using VBA?
Just for info, this is how I call the Find function:
With range1.Find
.Text = searchString
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
'some code goes here
End With

KazJaw, thank you for the link, very informative.
But I must say it did not help in my case. The problem remained that MS Word search produced different results depending on whether it was called manually (Ctrl+F) or from VBA (Range.Find). The difference is, once again, that the manually invoked search finds the non-printable index-related characters (provided they are shown), but the VBA-called function does not.
At one point I thought it was fixed by inserting a line to show non-printable characters programmatically (ActiveWindow.ActivePane.View.ShowAll = True). It mattered that this was done programmatically, as opposed to running the macro on a document where the option had been enabled manually. But even then the behaviour was very unstable: it would work only about 1 time out of 10 on exactly the same document. At that moment I thought I was going crazy because of this instability, but my colleague confirmed that he had independently stumbled across the same issue.
We therefore concluded that using Range.Find to search for non-printable characters does not produce a stable result in MS Word.
In our case we reached our goal by searching for the index entries (Field objects) directly, as opposed to searching for them as patterns of non-printable characters:
For Each aField In range1.Fields
If aField.Type = wdFieldIndexEntry Then 'wdFieldIndexEntry = 4, i.e. an XE field
aField.Select
pageOfFoundIndexEntry = Selection.Information(wdActiveEndPageNumber)
textOfFoundIndexEntry = Mid(aField.Code, 4)
...
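For readers who want a self-contained version of the field-based approach, here is a minimal sketch. It assumes the whole active document is searched (the original uses a range called range1), and the procedure name ListIndexEntries is hypothetical. It reads the page number from the field's code range instead of selecting the field, which is a variation on the snippet above and not necessarily what the author ended up with. Since XE fields are formatted as hidden text, the reported page may depend on view settings.

```vba
Sub ListIndexEntries()
    ' Sketch only: lists every XE (index entry) field in the active document.
    Dim aField As Field
    Dim entryText As String
    Dim entryPage As Variant

    For Each aField In ActiveDocument.Fields
        If aField.Type = wdFieldIndexEntry Then
            ' Field code looks like: XE "text to index"
            ' Skip the leading " XE" and trim the surrounding spaces.
            entryText = Trim$(Mid$(aField.Code.Text, 4))
            ' Page on which the field code ends.
            entryPage = aField.Code.Information(wdActiveEndPageNumber)
            Debug.Print entryPage, entryText
        End If
    Next aField
End Sub
```

The results are written to the Immediate window of the VBA editor (Ctrl+G).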

Related

Find selected words from Word document

I'm trying to find specific words in a Word document using Word VBA, e.g. "25to30", but I'm getting stuck. Any suggestions? Below is the code I used in Word VBA:
Selection.find.Execute FindText:="([0-9]{1,})([To])([0-9]{1,})", MatchWildcards:=False, Forward:=True
Set MatchWildcards:=True, then, depending on whether there is a space either side of 'To':
FindText:="(<[0-9]@)(To)([0-9]@>)"
or
FindText:="(<[0-9]@)( To )([0-9]@>)"
And, unless you're going to do something with the individual components of what you find, there's no point having the parentheses.
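As an illustration of the point about parentheses, here is a minimal sketch of a wildcard search without grouping. It assumes the whole document is searched and that each match should simply be printed; the procedure name FindNumberRanges is hypothetical. Note that wildcard searches in Word are case-sensitive, so this pattern matches "25To30" but not "25to30".

```vba
Sub FindNumberRanges()
    ' Sketch only: prints every "digits + To + digits" match in the document.
    Dim rng As Range
    Set rng = ActiveDocument.Content

    With rng.Find
        .ClearFormatting
        .Text = "<[0-9]@To[0-9]@>"   ' @ = one or more of the preceding item
        .MatchWildcards = True       ' required for < > [ ] @ to act as wildcards
        .Forward = True
        .Wrap = wdFindStop           ' stop at the end of the document
        Do While .Execute
            Debug.Print rng.Text     ' rng now covers the match
            rng.Collapse wdCollapseEnd
        Loop
    End With
End Sub
```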

Convert Bulletpoints into Listitems in Word

In my Word document there are automatically imported lists that look like this:
- Listitem one
- Listitem two
- Listitem three
- ...
They are only paragraphs starting with a dash '-'. So I'm trying to convert them into lists:
Selection.WholeStory
Selection.MoveLeft Unit:=wdCharacter, count:=1
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Style = ActiveDocument.Styles("Aufzählungszeichen")
With Selection.Find
.Text = "^p- "
.Replacement.Text = "^p "
.Forward = True
.Wrap = wdFindStop
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
However, this results in lists that are only styled like a list but don't behave like real lists in Word. How can I replace the paragraphs (starting with ^p-) with real list items through a macro?
Word actually has a built-in feature that can convert text lists into bullet lists. It has disappeared from the UI in recent years, but it is still available in the list of "Commands not in the Ribbon" and in the object model. It's called AutoFormat. The command can be added to the Ribbon or the QAT, or used from code.
For example, select the list, then:
Selection.Range.AutoFormat
or, to use the UI functionality (which will show an interactive dialog allowing finer control):
Application.CommandBars.ExecuteMso("AutoFormat")
There are also options settings for controlling what AutoFormat does
Application.CommandBars.ExecuteMso("AutoFormatOptions")
It's also possible to apply a style using Find/Replace, but it's important that the style is linked to a list. Create a new style if you don't have one already, then...
Go to the Multilevel list control on the Ribbon and choose Define new list style. Assign a name, click Format, choose Numbering and define the list properties (assign a bullet symbol from Number style for this level). Click the More button and from Link level to style choose the style name that should be used.
Now, when you run the Find/Replace code in the question, using the style name (not the list style name) it should apply the list as well as the style formatting.
If you run into issues with defining the style pair, best to ask in an end-user venue such as Super User.
You're on the right track with applying a style. If the style applied is a member of a list numbering style, the text will behave as a list. The trick is to apply one of the styles that are a member of the list style, since you can't apply the list style directly.
To clarify this difference, here is Shauna Kelly's article about creating numbering lists: How to create numbered headings or outline numbering in Word
In her article, "Headings" is the list style used to organize the sub-styles. If you created a list style using this article, you would apply Heading 1, Heading 2, etc. as a style.
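If neither AutoFormat nor a style linked to a list is an option, a different technique is to apply list formatting paragraph by paragraph. The sketch below is not taken from either answer: it assumes every paragraph that begins with "- " should become a bullet item, and it uses the first template of Word's built-in bullet gallery instead of a custom list style. The procedure name is hypothetical.

```vba
Sub ConvertDashParagraphsToBullets()
    ' Sketch only: turns paragraphs starting with "- " into real bullet items.
    Dim para As Paragraph
    Dim marker As Range

    For Each para In ActiveDocument.Paragraphs
        If Left$(para.Range.Text, 2) = "- " Then
            ' Remove the typed dash and the space after it.
            Set marker = para.Range
            marker.SetRange marker.Start, marker.Start + 2
            marker.Delete
            ' Apply a built-in bullet list template (an assumption; any
            ' ListTemplate object could be used here instead).
            para.Range.ListFormat.ApplyListTemplate _
                ListTemplate:=ListGalleries(wdBulletGallery).ListTemplates(1)
        End If
    Next para
End Sub
```

Because this changes the document directly, it is worth trying on a copy first.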

VBA-Word bug: Undo list get error if use UndoRecord to record "Replace All" while there is a "Apply Quick Style" in the undo list. How to avoid?

In the Word window, do something such as typing or formatting fonts and paragraphs to ensure the undo list is not empty, and then change the style of some text by clicking any Style on the Ribbon. An entry named "Apply Quick Style" appears in the undo list. Then run a macro like:
Sub SampleMacro()
Dim myUndoRecord As UndoRecord
Set myUndoRecord = Application.UndoRecord
myUndoRecord.StartCustomRecord ("VBA - Format Text")
'I do a lot of step here, but for this example, just simple like below
Selection.Characters(1).Bold = True 'just for example
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "Find Text"
.Replacement.Text = "Replace Text"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
'Word undo list get error after below step
Selection.Find.Execute Replace:=wdReplaceAll
'no crash, no error message, but the entry "Apply Quick Style"
'become to "Replace All", and Word can't go back before that entry
myUndoRecord.EndCustomRecord
End Sub
After this line of code:
Selection.Find.Execute Replace:=wdReplaceAll
the entry named "Apply Quick Style" in the undo list is renamed to "Replace All", and I can't undo (by pressing Ctrl+Z or clicking the arrow button on the Quick Access Toolbar) to any step before that "Replace All" entry. It always remains in the undo list and Word will not go back any further.
How can I avoid this bug?
I'm using Word 2016 Pro 64-bit.
Additional information: pasting with Ctrl+C, Ctrl+V (from another document into the current one) instead of "Apply Quick Style" also triggers the problem: the "Paste" entry in the undo list is also renamed to "Replace All". The difference is that I can still go back before that entry. There may be other actions that trigger the error when UndoRecord is used to record "Replace All".
Update 10/01
A working workaround
The problem is specific to wdReplaceAll. If that specific constant is omitted or replaced, "Apply Quick Style" will not be renamed and the undo stack remains accessible.
Luckily for us, Find.Execute returns a Boolean value (True for success). That means we can loop with wdReplaceOne to replace all matches and use .Execute = False as the exit condition.
Add the word Do above your With block and replace the line containing .Execute with the Loop While line:
Do
With Selection.Find
[....]
End With
Loop While Selection.Find.Execute(Replace:=wdReplaceOne)
A word of caution: some circumstances will create an infinite loop (such as replacing "A" with "A"). As such, you should consider using a second exit condition or replacing wdFindContinue with wdFindAsk or wdFindStop.
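To make the caution concrete, here is a minimal sketch of the one-at-a-time loop with two guards against running forever. It assumes a Range is used instead of the Selection, wdFindStop instead of wdFindContinue, and a hypothetical cap called MAX_REPLACEMENTS. It has not been verified against the undo-stack problem described in the question.

```vba
Sub ReplaceAllOneByOne()
    ' Sketch only: replaces matches one at a time instead of using wdReplaceAll.
    Const MAX_REPLACEMENTS As Long = 10000   ' hypothetical safety limit
    Dim replaced As Long
    Dim rng As Range
    Set rng = ActiveDocument.Content

    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "Find Text"
        .Replacement.Text = "Replace Text"
        .Forward = True
        .Wrap = wdFindStop           ' guard 1: never wrap back to the start
        .MatchWildcards = False
        Do While .Execute(Replace:=wdReplaceOne)
            replaced = replaced + 1
            If replaced >= MAX_REPLACEMENTS Then Exit Do   ' guard 2
            rng.Collapse wdCollapseEnd   ' continue after the replaced text
        Loop
    End With
End Sub
```

Collapsing the range after each replacement means the search always moves forward, so even replacing "A" with "A" terminates at the end of the document.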
Update 9/30
(Edit 10/01) Oops!! The Apply Quick Style entry was not renamed (good), but I wrongly assumed the undo stack could be reached when the same limitation still exists (bad). Also, testing today, I learned there is a third condition: .Execute must not find any matches (worse). Suffice it to say, this is an exemplary demonstration of how not to test a solution. I hope everyone learned their lesson!
When I create a new document and follow your steps, I can consistently reproduce the issue you describe. Thank you for making that easy!
Just as I am able to replicate the problem, I am also able to prevent it by meeting two conditions:
Ensure Replace All is listed in the Undo Stack above Apply Quick Style.
Replace All is applied to the entire document (the problem persists if Replace All is applied to a Selection within the document)
This method worked regardless of whether it was done manually with Ctrl+H or as part of a macro. Replacing a character with the same character was sufficient.
[Screenshot: the execution point after the problematic line. The Undo Stack contains an intentionally placed Replace All to preserve Apply Quick Style.]
This article is for Excel but it's relevant for Word.
https://excel.tips.net/T002060_Preserving_the_Undo_List.html
In short, you're on your own.
You have two options: revert to a previously saved version; or, write a macro that mimics Undo and ensure this macro is running before you start the one that messes with the Undo list

Using VBA in Word, how can I find a specific piece of text and continue numbering in an outline

Using VBA in Microsoft Word, how can I automatically search for a specific piece of text, remove that piece of text and make Word continue a previous list? I can record a macro of the action of clicking the list button in Word, and it gives me some code involving Selection.Range.ListFormat.ApplyListTemplateWithLevel. I would like to figure out how to find a piece of text and then automatically continue the previous list.
Here's what I have before the code starts:
First sentence
Second sentence
Third sentence
Fourth sentence
*/R*Fifth sentence
Here's what I want to have after the code finishes:
First sentence
Second sentence
Third sentence
Fourth sentence
Fifth sentence
Can you provide some more context to why you're looking to do this? This code will accomplish what you asked for (as long as the list already exists, and */R* is inside the list), but I imagine there's a better way.
With Selection.Find
.Text = "*/R*"
.Wrap = wdFindContinue
End With
If Selection.Find.Execute = True Then
Selection.TypeParagraph
End If

Macro to delete all repeated instances of text in word doc

I'm looking for a simple way to delete repeated text in a Word 2007 document. If there are some shortcuts with the Find/Replace commands, I'm not seeing it. Otherwise, can someone recommend how I might write a macro that works like the following:
1- Select a block of text (could be multiple paragraphs, have bullet points, etc).
2- Run the macro or do the command.
3- The macro or command deletes all instances of the selected text block.
Any insight here?
Selection.Text returns the current selection's text.
In principle, the syntax for your Replace command is:
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = Selection.Text
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
You replace by an empty string, thus delete every instance of the text to find.
But of course you have to decide how to handle the formatting of the selected text, and you may have to analyze your selected block first, because replacing might not work when Selection.Text leaves out control characters. This is just a start; you need to specify what you want and then ask again, yourself or us. Meanwhile, the macro recorder and the Word VBA reference are your friends.
For alphanumeric text (without newlines) you can use Find/Replace; however, as soon as you get into the realm of bullets, new paragraphs, etc., it will no longer do exact matching. If you often need to do that, I would suggest using a program like LaTeX to write your documents. LaTeX would allow you to do this style of exact matching of large blocks of text. If you're on a Windows machine, a great LaTeX package is proTeXt, which can be found at http://www.tug.org/protext/
I had exactly the same problem: a huge mailing list, and I just needed the emails. I solved it by copying the text into Excel, filtering sentence by sentence (or by bullets) for the paragraph I wanted to remove, and deleting all matching rows. Mine was more than 270 pages and it worked just fine (as long as the text is not too long, it was a lot faster than replacing sentence by sentence in Word). Or, if it is an option, just filter with a text filter such as begins with "to:", and you are done in 10 seconds. Hope it helps.
Just press Ctrl+H to open Find and Replace, type the text you want to remove, leave the "Replace with" field blank, and press Replace All.
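One practical detail is that the selected text has to be captured before the search starts, because Find on the Selection moves the selection. Here is a minimal sketch under a few assumptions: the search runs over the whole document through a Range, matching is case-sensitive, and the selection is at most 255 characters (the limit for Find text). The procedure name is hypothetical. Selections that contain paragraph marks or the ^ character may need translating into Find codes such as ^p first, which this sketch does not attempt.

```vba
Sub DeleteAllInstancesOfSelection()
    ' Sketch only: deletes every occurrence of the currently selected text.
    Dim target As String
    Dim rng As Range

    ' With no selection, Selection.Text returns the next character,
    ' so check for a bare insertion point explicitly.
    If Selection.Type = wdSelectionIP Then
        MsgBox "Select the text to delete first."
        Exit Sub
    End If

    target = Selection.Text          ' capture before searching
    If Len(target) > 255 Then
        MsgBox "The selection is too long for Find (255 characters maximum)."
        Exit Sub
    End If

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = target
        .Replacement.Text = ""       ' empty replacement deletes each match
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchCase = True
        .MatchWildcards = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```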