Remove repeated adjacent words in a Word document - VBA

A Word document may contain repeated adjacent words. Is there a VBA macro that keeps a single occurrence and deletes the repeat?
For example,
He is is doing well.
should change to
He is doing well.
Help would be much appreciated.

I can't try it because office.live.com doesn't seem to support wildcards, but you can try this:
In Find and Replace > Replace > check Use wildcards and in the Find what: enter "(<*>) <\1>" and click Find Next to see if that matches the two words. If it does, enter "\1" in Replace with: and click Replace All to see if everything works as expected. If it does, you can Record Macro of those steps and check the generated code.
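The above expression should also find repeated tokens that contain digits, like a123 a123. If you don't want that, you can try this expression in Find what:
(<[A-Za-z]{1,}>) \1[!A-Za-z]
Note that the trailing [!A-Za-z] is part of the match, so replacing with \1 alone would also remove that character. Wrapping it in a second group, (<[A-Za-z]{1,}>) \1([!A-Za-z]), and replacing with \1\2 should keep it.
Source: http://www.louiseharnbyproofreader.com/blog-the-proofreaders-parlour/proofreading-in-word-one-of-my-favourite-findreplace-strings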
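For reference, recording those steps should give something roughly like the macro below. This is a minimal sketch and, like the expression itself, not tested here; the sub name RemoveRepeatedWords is made up for illustration.

```vba
Sub RemoveRepeatedWords()
    ' Sketch only: runs the wildcard Find/Replace from the answer
    ' over the whole document body.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "(<*>) <\1>"        ' "Find what" pattern from the answer (untested)
        .Replacement.Text = "\1"    ' keep a single copy of the word
        .MatchWildcards = True      ' same as ticking "Use wildcards"
        .Format = False
        .Forward = True
        .Wrap = wdFindStop
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

A single Replace All pass may leave one repeat behind in a triple such as "is is is", so it may be worth running the macro twice and checking the result.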

Related

VBA Macro accepting spelling suggestions and removing repeated words

I'm trying to write a VBA macro that accepts all spelling suggestions. I've got these long automatic transcriptions full of things like repeated words. For some reason, the MS Word Editor does not have an accept-all function. This is what I'm working with:
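Sub AcceptSpellingSuggestions()
    Dim er As Range
    ' Walk every range that Word has flagged as a spelling error
    For Each er In ActiveDocument.SpellingErrors
        ' "Then" must be on the same line as "If" in a block If statement
        If er.GetSpellingSuggestions.Count > 0 Then
            ' Accept the first suggestion
            er.Text = er.GetSpellingSuggestions.Item(1).Name
        End If
    Next
End Sub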
This works with suggestions that have several options (it selects the 1st), but the problem is that errors like repeated words do not have suggestions. The options there are just "Ignore once" and "Remove repeated words". How can I make the macro select that second option?
You can clean up repeated words with a wildcard Find/Replace, where:
Find = (<*>) \1
Replace = \1
No macro needed, but you can incorporate it into one. Do note that the cases of the repeated words must be the same. Note also that word repeats are sometimes intentional.

How to find and replace mixing format text in MS-Word or using VBA

Here is an example of what I want. Suppose a text in which the bracketed numbers are superscript:
paper[1], some texts[2], paper[3]
Here is the expected result, with [1] and [3] in normal (non-superscript) font ==>
paper[1], some texts[2], paper[3]
That is, I want to replace every "paper[1]" that has a superscript "[1]" with "paper[1]" in normal font, and similarly for "paper[3]", but keep texts[2] unchanged.
I have noticed that Word cannot search for mixed-format text, e.g., I cannot find the text "paper[3]". So I may need VBA to achieve this. Any ideas? Thanks!
You don't even need VBA for that! A wildcard Find/Replace can be used, where:
Find = paper\[[13]\]
Replace = ^&
and the replacement font is set to 'not superscript'. If you really want a macro, record the above.
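If you do record it, the result should look something like the sketch below. The sub name is hypothetical, and it assumes the text to fix is in the main document body.

```vba
Sub UnSuperscriptPaperCitations()
    ' Sketch only: wildcard Find/Replace that re-inserts the matched text
    ' with superscript switched off.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "paper\[[13]\]"                 ' matches paper[1] and paper[3]
        .Replacement.Text = "^&"                ' ^& = the text that was found
        .Replacement.Font.Superscript = False   ' "not superscript"
        .MatchWildcards = True
        .Format = True                          ' apply the replacement formatting
        .Forward = True
        .Wrap = wdFindStop
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Because the pattern only allows 1 or 3 inside the brackets, texts[2] is never matched and keeps its formatting.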

MS Word, how to change formatting of entire paragraphs automatically in whole document?

I have a 20-page word document punctuated with descriptive notes throughout, like this:
3 Input Data Requirements
Some requirement text.
NOTE: This is a descriptive note about the requirement, which is the paragraph that I would like to use find-and-replace or a VBA script to select automatically and change the formatting to italicized. The notes invariably end in a carriage-return: ¶.
If it was just a text document, not MS-Word, I would just use a regex in a code editor like sublime to wrap it with <I>...</I> or something along those lines.
Preferably, is there a way to do this in Word's "advanced" find-and-replace feature? Or if not, what's the best way to do it in VBA?
I've tried using a search string like this in find-and-replace: NOTE: *[a-z0-9,. A-Z)(-]{1,255}^l but the line-break part doesn't seem to work, and the 255 char max isn't enough for many of the paragraphs.
EDIT: Another slightly important detail: The doc is automatically generated from another piece of software as a .RTF, which I promptly converted to .docx.
Attempt #2: Use Notepad++ to find and replace using regex. Remove quotes.
Find: "( NOTE: .*?)\r"
Replace with: " \i \1 \i0 \r "
//OLD
Sure is. No VBA or fancy tricks needed.
CTRL + H to bring up the replace dialog.
Click "More".
Select "Font" in the drop down menu called "Format".
Click italics.
Enter find and replace text as the same thing. Make sure you set this up right so that you don't accidentally replace substrings (e.g. goal to replace all " test " with " nice ", testing -> niceing).
Should work. If you need to alter entire paragraphs, consistently, then you probably should have used the styles on those paragraphs to begin with. That way, you can change all of them at once by updating the style itself.
You can use Advanced Find, yes. Find Next and then Replace makes the selection italic.
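On the VBA part of the question, here is a minimal sketch that loops over paragraphs instead of using a wildcard search, so the 255-character limit does not come into it. It assumes each note is its own paragraph and begins with "NOTE:" (possibly after leading spaces); the sub name is made up.

```vba
Sub ItalicizeNoteParagraphs()
    ' Sketch only: italicize every paragraph whose text starts with "NOTE:".
    Dim para As Paragraph
    For Each para In ActiveDocument.Paragraphs
        ' LTrim$ drops leading spaces before the prefix is compared
        If Left$(LTrim$(para.Range.Text), 5) = "NOTE:" Then
            para.Range.Font.Italic = True
        End If
    Next para
End Sub
```

If the notes can be given their own paragraph style instead, setting para.Style in the loop rather than the font would let you restyle them all later in one place.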

Check if string follows format

Basically I have created an acronym finding macro and it works well except it includes all of our reference numbers. Now unfortunately changing the search parameters won't work as many acronyms include both letters and numbers.
My idea was to compare the string, once found, and if it is in the reference number format e.g.
LetterNumberNumberLetterLetterNumberNumberNumberNumber
I will simply not include it.
I'm certain there must be a simple way of doing this and me not being able to locate it is a case of not knowing what to search for but anyway thank you in advance for the help.
Use Like:
'// LetterNumberNumberLetterLetterNumberNumberNumberNumber
If UCase$("A12BC3456") Like "[A-Z][0-9][0-9][A-Z][A-Z][0-9][0-9][0-9][0-9]" Then
    MsgBox "is ref no."
End If
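To plug this into the acronym macro, the check could be wrapped in a small helper, roughly as below. This is only a sketch: IsReferenceNumber and foundText are hypothetical names, since the acronym macro itself isn't shown.

```vba
Function IsReferenceNumber(ByVal candidate As String) As Boolean
    ' True when the whole string has the shape
    ' Letter Number Number Letter Letter Number Number Number Number.
    IsReferenceNumber = UCase$(candidate) Like _
        "[A-Z][0-9][0-9][A-Z][A-Z][0-9][0-9][0-9][0-9]"
End Function

Sub DemoReferenceCheck()
    Dim foundText As String
    foundText = "A12BC3456"   ' stand-in for a string found by the acronym macro
    If Not IsReferenceNumber(foundText) Then
        Debug.Print foundText & " is kept as an acronym"
    Else
        Debug.Print foundText & " is skipped as a reference number"
    End If
End Sub
```

Like matches the entire string, so a longer string that merely contains a reference number is not rejected by this check.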

VBA Macro to extract strings from word doc

I have a Word document containing several strings. The first part of these strings is always the same, for example ABC_001, ABC_002, ABC_003. I need to search for the "ABC_" substring in the document, extract all the occurrences ("ABC_001", "ABC_002", "ABC_003") and copy them into an Excel sheet.
Can anyone help?
Thanks in advance.
You can add a reference to the Microsoft VBScript Regular Expressions 5.5 library and use regular expressions to extract them.
Have a look at http://www.macrostash.com/2011/10/08/simple-regular-expression-tutorial-for-excel-vba/
and http://txt2re.com/
and the question "VBA multiple matches within one string using regular expressions execute method".
EDIT:
Actually, it is probably easier to go to the Data tab in Excel, use "Get External Data", choose a delimiter, and import, either manually or by recording a macro to get a feeling for the VBA structure.
This should put all the entries in separate cells; then go over them with MID to get the part you need.
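For the regular-expression route, a minimal sketch run from Word might look like this. It makes several assumptions: the part after "ABC_" is always digits, the results go into column A of a new workbook, and late binding (CreateObject) is used so no reference has to be set. The sub name is made up.

```vba
Sub ExtractAbcStringsToExcel()
    ' Sketch only: collect every ABC_<digits> string from the active
    ' Word document and write them to a new Excel workbook.
    Dim re As Object, matches As Object, m As Object
    Dim xl As Object, ws As Object
    Dim rowNum As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True            ' find all matches, not just the first
    re.Pattern = "ABC_\d+"      ' assumed format: prefix followed by digits

    Set matches = re.Execute(ActiveDocument.Content.Text)

    Set xl = CreateObject("Excel.Application")
    xl.Visible = True
    Set ws = xl.Workbooks.Add.Worksheets(1)

    rowNum = 1
    For Each m In matches
        ws.Cells(rowNum, 1).Value = m.Value   ' one match per row in column A
        rowNum = rowNum + 1
    Next m
End Sub
```

ActiveDocument.Content covers only the main body, so strings in headers, footers, or text boxes would need separate handling.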