Regular expression not working in MS Word - vba

I have markups enclosing text in a Word document.
If the markups enclose valid text or any valid dynamic field, the regular expression works.
But the second line, which contains an invalid cross-reference, is not found by the search, whereas I actually need to find and delete it.
Edit on the 25/01/2018 :
Thanks for your answer.
I run the search from a macro, that is right. I have no error from the macro, the sequence [SP]invalid reference[\SP] is just not found
I actually want to select anything between the two markups (table, image, text, reference, fields ...) including the markups

It is impossible to find fields via wildcard search. I also cannot find an error field by searching for ^d without wildcard.
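Therefore, use two passes in your code. Before searching for [SP], iterate over all fields, identify those that show an error, and delete them (replacing each with a space):
Dim fld As Field, ran As Range
Dim i As Long
' Walk the collection backwards so that deleting a field does not skip the next one
For i = ActiveDocument.Fields.Count To 1 Step -1
    Set fld = ActiveDocument.Fields(i)
    ' InStr returns 0 when the text is not found, so the test must be > 0 (>= 0 is always true)
    If InStr(1, fld.Result.Text, "Erreur !") > 0 Then
        Set ran = fld.Result
        fld.Delete
        ran.InsertAfter " "
    End If
Next i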
After that, run your normal wildcard search/replace for [SP]...[/SP] and you should be fine.
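The second step could look roughly like the sketch below. It assumes the closing markup is literally [/SP] (the question also writes it as [\SP], so adjust as needed) and that the whole span, markups included, should be deleted. The macro name is made up for illustration. Square brackets are wildcard operators in Word, so they are escaped with a backslash.

```vba
Sub DeleteMarkedSpans()
    ' Hypothetical helper: deletes everything from [SP] to the next [/SP].
    ' Run it only after the error fields have been removed.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "\[SP\]*\[/SP\]"      ' \[ and \] match literal brackets
        .Replacement.Text = ""        ' replace the match with nothing
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Whether a span containing tables or images is matched in one piece is worth checking on a copy of the document first.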

Related

VBA Macro accepting spelling suggestions and removing repeated words

I'm trying to write a VBA macro that accepts all spelling suggestions. I've got these long automatic transcriptions full of things like repeated words. For some reason, the MS Word Editor does not have an "accept all" function. This is what I'm working with:
Sub AcceptSpellingSuggestions()
    Dim er As Range
    For Each er In ActiveDocument.SpellingErrors
        ' Take the first suggestion, if there is one
        If er.GetSpellingSuggestions.Count > 0 Then
            er.Text = er.GetSpellingSuggestions.Item(1).Name
        End If
    Next
End Sub
This works with suggestions that have several options (it selects the 1st), but the problem is that errors like repeated words do not have suggestions. The options there are just "Ignore once" and "Remove repeated words". How can I make the macro select that second option?
You can clean up repeated words with a wildcard Find/Replace, where:
Find = (<*>) \1
Replace = \1
No macro needed, but you can incorporate it into one. Do note that the cases of the repeated words must be the same. Note also that word repeats are sometimes intentional.
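If you do want it in a macro, a minimal sketch is shown below. It uses exactly the Find and Replace strings given above; the macro name is only illustrative.

```vba
Sub RemoveRepeatedWords()
    ' Runs the wildcard Find/Replace over the main document text.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "(<*>) \1"            ' a word, a space, the same word again
        .Replacement.Text = "\1"      ' keep a single copy
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Because some repeats are intentional, it may be safer to run this on a copy or with Track Changes on, so each replacement can be reviewed.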

How can I find a word with a new line in the VBA editor using find and replace?

I would like to go through and find all of the "End" statements in my code but skipping all of the "End x" statements like "End If", "End Sub", "End function", etc.--Just the pure "End". My thought was to use pattern matching, but I am unsure of how to do that.
I already tried using "End\n" and "End[\n]".
Does anyone know how to search for words that end in new lines?
The "find" function in the VBA editor does not support this kind of parameter/functionality.
You will have to manually step through the results and skip the ones you don't want, or temporarily modify the "End x" instances you don't want to catch, then search and replace, and finally restore those instances to what they were.
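If stepping through by hand is too tedious, one possible alternative (not part of the Find dialog) is to scan the code modules from a macro. The sketch below assumes "Trust access to the VBA project object model" is enabled in the host application's macro security settings. The procedure name is made up. It only reports lines that consist of nothing but End, so it will miss cases such as "If x Then End" or an End followed by a comment.

```vba
Sub ListBareEndStatements()
    ' Late-bound, so no reference to the VBIDE library is required.
    Dim comp As Object, cm As Object
    Dim i As Long, lineText As String

    For Each comp In Application.VBE.ActiveVBProject.VBComponents
        Set cm = comp.CodeModule
        For i = 1 To cm.CountOfLines
            lineText = Trim$(cm.Lines(i, 1))
            ' Case-insensitive comparison against the bare keyword
            If StrComp(lineText, "End", vbTextCompare) = 0 Then
                Debug.Print comp.Name & ", line " & i
            End If
        Next i
    Next comp
End Sub
```

The results appear in the Immediate window as module name and line number.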
Apologies for answering so long after the question was asked, but thought this information would help future readers as this question is still being actively found.
@TylerH is right that the specific search requested by the user cannot be performed in the VBE Find tool. For information, when "Use Pattern Matching" is selected the VBE Find tool supports use of:
? - single character
* - zero or more characters (on the same line)
# - single digit (0 to 9)
[charlist] - any single character in charlist
[!charlist] - any single character not in charlist
... where charlist can be a range of characters (eg [A-Z]) but must be in order (eg [Z-A] is not valid), it can also include multiple ranges of characters (eg [A-BD-E] matches A, B, D or E). Also to match any of ?, * or # then enclose them in square brackets (eg [*] matches an asterisk).
This means the VBE Find tool performs very similarly (perhaps identically, but I can't provide assurances, VB and VBA not being the same language) to the VB Like operator; see the documentation for that operator for details.
The alternative (which will perform the specific search in the question) is to use the 'Find Text' tool in the VBE add-in MZ-Tools, though note that MZ-Tools is a paid-for tool. Please note I am NOT in any way associated with MZ-Tools or its author. The search text to use in MZ-Tools for the specific search requested in the question is: end\r?$
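To get a feel for these pattern rules, the Like operator can be tried out directly. This is only a sketch of Like itself, under the default Option Compare Binary. Whether the Find tool matches it in every detail is, as said, not guaranteed. The comments give the Boolean result of each expression.

```vba
Sub LikeOperatorDemo()
    Debug.Print "End" Like "End"          ' True:  exact match
    Debug.Print "End If" Like "End"       ' False: the pattern must cover the whole string
    Debug.Print "End If" Like "End*"      ' True:  * matches zero or more characters
    Debug.Print "Item7" Like "Item#"      ' True:  # matches a single digit
    Debug.Print "B" Like "[A-BD-E]"       ' True:  B is in one of the two ranges
    Debug.Print "C" Like "[A-BD-E]"       ' False: C is in neither range
    Debug.Print "x" Like "[!A-Z]"         ' True:  x is not an uppercase letter
    Debug.Print "*" Like "[*]"            ' True:  bracketed * is a literal asterisk
End Sub
```

Note that Like always tests the whole string, while the Find tool looks for the pattern anywhere in a line, which is one reason the two may not behave identically.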

Why does Msgbox display question marks instead of spaces that appear in text body?

Dim oEmail As Outlook.MailItem
Dim textbody As String
textbody = oEmail.Body
MsgBox textbody
Some incoming mail (foreign and domestic) contents appear fine in Outlook, but when I run the above macro program, the message box (variable textbody) shows text with question marks between words, instead of spaces.
To illustrate with example,
Outlook Mail reads:
Hello there how are you doing?
Msgbox shows:
Hello?there?how?are?you?doing?
It seems that characters are not stored properly in the variable.
The following test code results in "0" for the first instr(), while the latter code part results in ">0". It seems the question marks in the text body prevent proper detection of consecutive matching words in the string.
If InStr(1, LCase$(textbody), "how are you") > 0 Then
    MsgBox "found 3 consecutive matching words in string"
End If
If InStr(1, LCase$(textbody), "how") > 0 Then
    MsgBox "found a word match in string"
End If
Without a sample I can't give a definitive answer, but most likely the question marks stand in for Unicode characters (in your case probably non-standard space characters rather than foreign letters) that the MsgBox cannot display, so it shows ? instead.
For example, an email that contains this:
Smiley Face [☺] Smile in Chinese [微笑]
...will render in the MsgBox as:
Smiley Face [?] Smile in Chinese [??]
The same goes if you try to display it in the Immediate Window with Debug.Print.
However, the correct characters are stored in the String. For example, if you were to programmatically put the value into an Excel cell, it would likely display properly.
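One way to check what is really stored is to print the character codes, and then replace any unusual space characters before calling InStr. The sketch below is built on an assumption: that the "spaces" in the mail are Unicode space variants such as the no-break space. The list of code points is only a guess at likely candidates, and both procedure names are made up for illustration.

```vba
Sub ShowCharCodes(ByVal s As String)
    ' Prints position and hexadecimal code point of each character,
    ' so that the characters behind the "?" can be identified.
    Dim i As Long
    For i = 1 To Len(s)
        ' AscW can return a negative Integer; the mask makes it unsigned
        Debug.Print i, Hex$(AscW(Mid$(s, i, 1)) And &HFFFF&)
    Next i
End Sub

Function NormalizeSpaces(ByVal s As String) As String
    ' Assumed candidates: no-break space, en space, em space,
    ' thin space, narrow no-break space, ideographic space.
    Dim codes As Variant, c As Variant
    codes = Array(160, 8194, 8195, 8201, 8239, 12288)
    For Each c In codes
        s = Replace(s, ChrW(c), " ")
    Next c
    NormalizeSpaces = s
End Function
```

With that in place, a test such as InStr(1, LCase$(NormalizeSpaces(textbody)), "how are you") > 0 should succeed if one of the listed characters was the cause. If ShowCharCodes reports a different code point, add it to the array.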
That being said, I'm sure that regional versions of Windows/Office can properly display Unicode characters, or else foreign symbols could never be displayed in message boxes.
A workaround may be to change the default message box font to one that supports Unicode.
This article may also be helpful:
VBA: Unicode Strings and the Windows API

Remove repeated adjacent words in a word document

A Word document may contain repeated adjacent words. Is there a VBA macro that keeps a single occurrence and deletes the repeat?
eg,
He is is doing well.
should change to
He is doing well.
Help would be much appreciated.
I can't try it because office.live.com doesn't seem to support wildcards, but you can try this:
In Find and Replace > Replace > check Use wildcards and in the Find what: enter "(<*>) <\1>" and click Find Next to see if that matches the two words. If it does, enter "\1" in Replace with: and click Replace All to see if everything works as expected. If it does, you can Record Macro of those steps and check the generated code.
The above expression should also find repeating numbers like a123 a123. If you don't want that, you can try this expression in the Find what:
(<[A-Za-z]{1,}>) \1[!A-Za-z]
from http://www.louiseharnbyproofreader.com/blog-the-proofreaders-parlour/proofreading-in-word-one-of-my-favourite-findreplace-strings

MS Word, how to change formatting of entire paragraphs automatically in whole document?

I have a 20-page word document punctuated with descriptive notes throughout, like this:
3 Input Data Requirements
Some requirement text.
NOTE: This is a descriptive note about the requirement, which is the paragraph that I would like to use find-and-replace or a VBA script to select automatically and change the formatting to italicized. The notes invariably end in a carriage-return: ¶.
If it was just a text document, not MS-Word, I would just use a regex in a code editor like sublime to wrap it with <I>...</I> or something along those lines.
Preferably, is there a way to do this in Word's "advanced" find-and-replace feature? Or if not, what's the best way to do it in VBA?
I've tried using a search string like this in find-and-replace: NOTE: *[a-z0-9,. A-Z)(-]{1,255}^l but the line-break part doesn't seem to work, and the 255 char max isn't enough for many of the paragraphs.
EDIT: Another slightly important detail: The doc is automatically generated from another piece of software as a .RTF, which I promptly converted to .docx.
Attempt #2: Use Notepad++ to find and replace using regex. Remove quotes.
Find: "( NOTE: .*?)\r"
Replace with: " \i \1 \i0 \r "
//OLD
Sure is. No VBA or fancy tricks needed.
CTRL + H to bring up the replace dialog.
Click "More".
Select "Font" in the drop down menu called "Format".
Click italics.
Enter the same text in both the Find and the Replace boxes. Make sure you set this up so that you don't accidentally replace substrings (e.g. if the goal is to replace all " test " with " nice ", "testing" must not become "niceing").
That should work. If you need to alter entire paragraphs consistently, then you probably should have used styles on those paragraphs to begin with. That way, you can change all of them at once by updating the style itself.
You can use Advanced Find, yes. Find Next and then Replace makes the selection italic.
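For the VBA route asked about in the question, a minimal sketch could loop over the paragraphs and italicize those that begin with "NOTE:". This assumes every note is a single paragraph that starts with exactly that text (leading spaces are ignored). The macro name is only illustrative.

```vba
Sub ItalicizeNoteParagraphs()
    Dim para As Paragraph
    For Each para In ActiveDocument.Paragraphs
        ' Compare the first five characters, ignoring leading spaces
        If Left$(LTrim$(para.Range.Text), 5) = "NOTE:" Then
            para.Range.Font.Italic = True
        End If
    Next para
End Sub
```

Working paragraph by paragraph avoids the 255-character limit of the wildcard search. Applying a dedicated paragraph style instead of direct italic formatting (para.Style) would make later changes easier, in line with the advice above.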