Word VBA to copy certain highlighted colors and paste in new document with no formatting lost - vba

I have a word document that is 180 pages and uses all the highlight Colors randomly throughout the document. The document has several different formats on it from italics, bullets and underline and different size fonts.
What I am trying to do is filter through the document select all paragraphs that contain a certain color highlight then paste it in a new document keeping all formatting in place. It then loops through again and selects the next color highlight and pastes it in the same new document with a page break in between or just a new document all together. I have been trying to figure this out for 2 days.
I have tried the formulas from this Word VBA copy highlighted text to new document and preserve formatting and other ones on Stack Overflow but none of them preserve all the formatting or one that I found I could only get it to copy the whole document with formatting and paste in but not the selected highlights.
This one does the trick but it removes all formatting and can't figure out how to place page break in.
Sub ExtractHighlightedTextsInSameColor()
Dim objDoc As Document, objDocAdd As Document
Dim objRange As Range
Dim strFindColor As String
Dim highliteColor As Variant
highliteColor = Array(wdYellow, wdTeal)
Set objDoc = ActiveDocument
Set objDocAdd = Documents.Add
objDoc.Activate
For i = LBound(highliteColor) To UBound(highliteColor)
With Selection
.HomeKey Unit:=wdStory
With Selection.Find
.Highlight = True
Do While .Execute
If Selection.Range.HighlightColorIndex = highliteColor(i) Then
Set objRange = Selection.Range
objDocAdd.Range.InsertAfter objRange & vbCr
Selection.Collapse wdCollapseEnd
End If
Loop
End With
End With
Next
End Sub
'This one only copies all text in document and not just highliteColor asked for
Sub HighlightedColor()
Dim objDoc As Document, objDocAdd As Document
Dim objRange As Range
Dim highliteColor As Variant
highliteColor = Array(wdYellow, wdTeal, wdPink)
Set objDoc = ActiveDocument
Set objDocAdd = Documents.Add
objDoc.Activate
For i = LBound(highliteColor) To UBound(highliteColor)
With Selection
.HomeKey Unit:=wdStory
With Selection.Find
.Highlight = True
Do While .Execute
If Selection.Range.HighlightColorIndex = highliteColor(i) Then
Set objRange = Selection.Range.FormattedText
objRange.Collapse wdCollapseEnd
objDocAdd.Content.FormattedText = objRange
End If
Loop
End With
End With
Next
End Sub
I expect the output to copy all text that are a certain highlight color, paste them into a new document preserving all formatting and then page break it. Go back select the next highlight color and paste in document until all colors are gotten.

I've made adjustments to your code based on what I understand you want to do. In some cases I tried to make it a little more readable, for example I removed one of the With methods.
Look closely at the use of FormattedText and how it is transferred from one range to another. And also look at the end of the routine for how a page break is inserted.
Sub ExtractHighlightedTextsInSameColor()
Dim objDoc As Document, objDocAdd As Document
Dim objRange As Range
Dim strFindColor As String
Dim highliteColor As Variant
Dim i As Long
highliteColor = Array(wdYellow, wdTeal)
Set objDoc = ActiveDocument
Set objDocAdd = Documents.Add
Set objRange = objDocAdd.Content
For i = LBound(highliteColor) To UBound(highliteColor)
objDoc.Activate
Selection.HomeKey unit:=wdStory
objRange.Collapse wdCollapseEnd
With Selection.Find
.ClearFormatting
.Forward = True
.Format = True
.Highlight = True
.Wrap = wdFindStop
.Execute
Do While .found
If Selection.Range.HighlightColorIndex = highliteColor(i) Then
' the following copies only the highlighted text
' objRange.FormattedText = Selection.Range.FormattedText
'if you want the entire paragraph that contains a highlighted text item then use this
objRange.FormattedText = Selection.Range.Paragraphs(1).Range.FormattedText
Selection.Collapse wdCollapseEnd
objRange.InsertParagraphAfter
objRange.Collapse wdCollapseEnd
Else
objRange.Collapse wdCollapseEnd
End If
.Execute
Loop
End With
objRange.Collapse wdCollapseEnd
If i < UBound(highliteColor) Then
'added a conditional check so an extra page break is not inserted at end of document
objRange.InsertBreak Word.WdBreakType.wdPageBreak
End If
Next
End Sub

Related

Copy/paste highlighted text into new different documents

I use Word documents which are highlighted in a number of colors. What I need is for the VBA to find a highlighted section and paste that text in a new document, find the next highlighted section in a different color and copy that text in a new (different) document). My code does that for all 15 colors available for highlighting in Word. However, even if the color is not present in the text, it will create a new document for that color. So every time I run my code, I get 15 new documents.
I need the code to ignore a highlight color if that color is not present, while still creating new (and different) documents for the colors that are present in the document.
For example, I might get a document that only has highlighted text in blue and green, so I would need two new documents for that. Or I might get a document that has highlighted text in blue, green, yellow and red, so I would need four new focuments, one for each color.
Any ideas on what I need to change?
Sub ExtractHighlightedTextsInSameColor()
Dim objDoc As Document, objDocAdd As Document
Dim objRange As Range
Dim strFindColor As String
Dim highliteColor As Variant
Dim i As Long
highliteColor = Array(wdYellow, wdBlack, wdBlue, wdBrightGreen, wdDarkBlue, wdDarkRed, wdDarkYellow, wdGreen, wdPink, wdRed, wdTeal, wdTurquoise, wdViolet, wdWhite)
Set objDoc = ActiveDocument
For i = LBound(highliteColor) To UBound(highliteColor)
Set objDocAdd = Documents.Add
Set objRange = objDocAdd.Content
objRange.Collapse wdCollapseEnd
objDoc.Activate
Selection.HomeKey unit:=wdStory
With Selection.Find
.ClearFormatting
.Forward = True
.Format = True
.Highlight = True
.Wrap = wdFindStop
.Execute
Do While .Found
If Selection.Range.HighlightColorIndex = highliteColor(i) Then
' the following copies only the highlighted text
objRange.FormattedText = Selection.Range.FormattedText
'if you want the entire paragraph that contains a highlighted text item then use this
' objRange.FormattedText = Selection.Range.Paragraphs(1).Range.FormattedText
Selection.Collapse wdCollapseEndwdYellow
objRange.InsertParagraphAfter
objRange.Collapse wdCollapseEnd
Else
objRange.Collapse wdCollapseEnd
End If
.Execute
Loop
End With
objRange.Collapse wdCollapseEnd
If i < UBound(highliteColor) Then
'added a conditional check so an extra page break is not inserted at end of document
objRange.InsertBreak Word.WdBreakType.wdPageBreak
End If
Next
End Sub
I think this is easier to manage if you split your code up a bit: first collect all of the highlighted ranges, and then process them.
This should be pretty close.
Sub TestDoc()
Dim col As Collection, rng As Range, dict As Object, hc As String
Dim doc As Document, destRng As Range
Set dict = CreateObject("scripting.dictionary") 'for tracking documents vs highlight colors
Set col = HighlightedRanges(ActiveDocument) 'first get all highlighted ranges
For Each rng In col
hc = CStr(rng.HighlightColorInde 'get the highlight color
If Not dict.Exists(hc) Then 'need a new doc for this one?
dict.Add hc, Documents.Add 'add doc and key to highlight color
End If
Set doc = dict(hc) 'get doc for this color
Set destRng = doc.Content 'copy the content over...
destRng.Collapse Direction:=wdCollapseEnd
destRng.InsertParagraphAfter
destRng.Collapse Direction:=wdCollapseEnd
destRng.FormattedText = rng.FormattedText
Next rng
End Sub
'return a collection of all highlighted ranges in `doc`
Function HighlightedRanges(doc As Document) As Collection
Dim rng As Range, col As New Collection
Set rng = doc.Content
With rng.Find
.ClearFormatting
.Forward = True
.Format = True
.Highlight = True
.Wrap = wdFindStop
Do While .Execute
col.Add doc.Range(rng.Start, rng.End) 'clone the range
Loop
End With
Set HighlightedRanges = col 'return all the found ranges
End Function

Find method search text with special character in variable

I am trying to create a macro that finds all text formulas and replaces it with OMath formulas.
I create a text, where all formulas are surrounded with special tags (in my case its word "formula"). Then I find all cases with regex and create variable that contains found formula. Then I give this variable to find method to create range, which I then modify.
Some formulas contain special characters (in my case its caret (^), which is used to create power) and find method doesn't select them.
How can I ignore special characters in variable that I give to find method?
Macro that I created:
Dim regexObject As Object
Set regexObject = CreateObject("VBScript.RegExp")
Dim matches As Object
Dim objEq As OMath
Dim objRange As Range
Dim match As Object
regexObject.Pattern = "formula(.*?)formula"
regexObject.MultiLine = True
Selection.WholeStory
While regexObject.test(Selection.Text)
Set matches = regexObject.Execute(Selection.Text)
For Each match In matches
MsgBox match
With Selection.Find
.Text = match
.MatchWildcards = False
.Execute
Set objRange = Selection.Range
objRange.Text = Mid(match, 8, Len(match) - 14)
Set objRange = Selection.OMaths.Add(objRange)
Set objEq = objRange.OMaths(1)
objEq.BuildUp
End With
Selection.WholeStory
Next
Wend
Sample Text:
formulaΨ=1,67∙0,72∙0,9∙1=1,09.formula
formulac_c=0,9formula
formulaE_q=Ψ WV_n^2/2,formula
formulac_m, c_e, c_c, c_sformula
For example:
Sub FindFormulaeCreateOMath()
Dim objEq As OMath
Dim findRange As Range
Dim eqRange As Range
Set findRange = ActiveDocument.Content
With findRange
With .Find
.Text = "formula*formula"
.MatchWildcards = True
End With
Do While .Find.Execute
.Text = Mid(.Text, 8, Len(.Text) - 14)
Set eqRange = findRange.OMaths.Add(.Duplicate)
Set objEq = eqRange.OMaths(1)
objEq.BuildUp
.Collapse wdCollapseEnd
Loop
End With
End Sub

VBA - Find paragraph starting with numbers

I'm using a VBA script to try to find the starting number of a paragraph (they are list items not formatted as such - not trying to format, just find the numbers).
1. First Item
2. Second Item
No number - don't include despite 61.5 in paragraph.
25 elephants should not be included
12. Item Twelve, but don't duplicate because of Susie's 35 items
Is there any way to say in VBA "If start of paragraph has 1-2 numbers, return those numbers". In regex, what I'm looking for is ^(\d\+)\.
Here is a working bit of VBA code - haven't figured out how to CREATE the excel file yet, so if you go to test create a blank test.xslx in your temp folder. Of course this may be simple enough that testing isn't necessary.
Sub FindWordCopySentence()
On Error Resume Next
Dim appExcel As Object
Dim objSheet As Object
Dim aRange As Range
Dim intRowCount As Integer
intRowCount = 1
' Open Excel File
If objSheet Is Nothing Then
Set appExcel = CreateObject("Excel.Application")
'Change the file path to match the location of your test.xls
Set objSheet = appExcel.workbooks.Open("C:\temp\test.xlsx").Sheets("Sheet1")
intRowCount = 1
End If
' Word Document Find
Set aRange = ActiveDocument.Range
With aRange.Find
Do
.ClearFormatting
' Find 1-2 digit number
.Text = "[0-9]{1,2}"
.MatchWildcards = True
.Execute
If .Found Then
' Copy to Excel file
aRange.Expand Unit:=wdSentence
aRange.Copy
aRange.Collapse wdCollapseEnd
objSheet.Cells(intRowCount, 1).Select
objSheet.Paste
intRowCount = intRowCount + 1
End If
Loop While .Found
End With
Set aRange = Nothing
If Not objSheet Is Nothing Then
appExcel.workbooks(1).Close True
appExcel.Quit
Set objSheet = Nothing
Set appExcel = Nothing
End If
End Sub
Thanks!
I would go quite a bit simpler and just check the first few characters of the paragraph:
Option Explicit
Sub test()
Dim para As Paragraph
For Each para In ThisDocument.Paragraphs
With para.Range
If (.Characters(2) = ".") Or (.Characters(3) = ".") Then
If IsNumeric(para.Range.Words(1)) Then
Debug.Print "Do something with paragraph number " & _
para.Range.Words(1) & "."
End If
End If
End With
Next para
End Sub
A more efficient approach, which obviates the need to test every paragraph:
Sub Demo()
Application.ScreenUpdating = False
Dim StrOut As String
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^13[0-9.]{1,}" ' or: .Text = "^13[0-9]{1,}
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.MatchWildcards = True
.Execute
End With
Do While .Find.Found
StrOut = StrOut & .Text
' or: MsgBox Split(.Text, vbCr)(1)
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
Application.ScreenUpdating = True
MsgBox StrOut
End Sub
As coded, the macro returns the entire list strings where there may be multiple levels (e.g. 1.2). Comments show how to find just the first number where there may be multiple levels and how to extract that number for testing (the Find expression includes the preceding paragraph break).

Use VBA with Powerpoint to Search titles in a Word Doc and Copy Text into another Word Document

I'm working on a Powerpoint slide, where I few texts are listed. I have to search for these texts in a Word Document which has a lot of Headings and Texts. After I find the title text, I need to copy the text under the Heading and paste in a new document.
Basically, the VBA coding has to be done in the Powerpoint VBA, with two documents in the background for searching text and pasting it in another.
I've opened the word doc. But searching the text in it and selecting it for copying to another document is what I've not been able to do. Kindly help me.
I see. The following is not exactly elegant since it uses Selection which I always try to avoid but it is the only way I know to achieve such a thing.
Disclaimer 1: this is made in Word VBA, so you will need a slight adaption, like set a reference to Word, use a wrdApp = New Word.Application object and declare doc and newdoc explicitely as Word.Document.
Disclaimer 2: Since you search for text instead of the respective heading, beware that this will find the first occurence of that text so you better not have the same text in several chapters. ;-)
Disclaimer 3: I cannot paste anymore! :-( My clipboard is set, it pastes elsewhere but I just cannot paste in here.
Code follows with first edit, hopefully in a minute...
Edit: yepp, pasting works again. :-)
Sub FindChapter()
Dim doc As Document, newdoc As Document
Dim startrange As Long, endrange As Long
Dim HeadingToFind As String, ChapterToFind As String
ChapterToFind = "zgasfdiukzfdggsdaf" 'just for testing
Set doc = ActiveDocument
Set newdoc = Documents.Add
doc.Activate
Selection.HomeKey unit:=wdStory
With Selection
With .Find
.ClearFormatting
.Text = ChapterToFind
.MatchWildcards = False
.MatchCase = True
.Execute
End With
If .Find.Found Then
'**********
'Find preceding heading to know where chapter starts
'**********
.Collapse wdCollapseStart
With .Find
.Text = ""
.Style = "Heading 1"
.Forward = False
.Execute
If Not .Found Then
MsgBox "Could not find chapter heading"
Exit Sub
End If
End With
.MoveDown Count:=1
.HomeKey unit:=wdLine
startrange = .Start
'*********
'Find next heading to know where chapter ends
'*********
.Find.Forward = True
.Find.Execute
.Collapse wdCollapseStart
.MoveUp Count:=1
.EndKey unit:=wdLine
endrange = .End
doc.Range(startrange, endrange).Copy
newdoc.Content.Paste
newdoc.SaveAs2 doc.Path & "\" & HeadingToFind & ".docx", wdFormatFlatXML
Else
MsgBox "Chapter not found"
End If
End With
End Sub
Edit: If you need to search for a "feature" that will be in some table in column 1 with the description in column 2 and you need that description in a new doc, try this:
Sub FindFeature()
Dim doc As Document, newdoc As Document
Dim FeatureToFind As String
Dim ro As Long, tbl As Table
FeatureToFind = "zgasfdiukzfdggsdaf" 'just for testing
Set doc = ActiveDocument
Set newdoc = Documents.Add
doc.Activate
Selection.HomeKey unit:=wdStory
With Selection
With .Find
.ClearFormatting
.Text = FeatureToFind
.MatchWildcards = False
.MatchCase = True
.Execute
End With
If .Find.Found Then
Set tbl = Selection.Tables(1)
ro = Selection.Cells(1).RowIndex
tbl.Cell(ro, 2).Range.Copy
newdoc.Range.Paste
End If
End With
End Sub
Edit: Slight adaptation so you can paste without overwriting existing content in newdoc:
Instead of newdoc.Range.Paste just use something along the line of this:
Dim ran As Range
Set ran = newdoc.Range
ran.Start = ran.End
ran.Paste

VBA loop won't stop/doesn't find the "\EndofDoc" marker

I am writing a vba macro to search a word document line by line and trying to find certain names in the document. The looping works fine except for when it gets to the end of the document, it just continues from the top and starts over. Here is the code:
Application.ScreenUpdating = False
Dim i As Integer, Rng As Range
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "?"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
Do While .Find.found
i = i + 1
Set Rng = .Duplicate
Set Rng = Rng.GoTo(What:=wdGoToBookmark, Name:="\line")
MsgBox "Line " & i & vbTab & Rng.Text
If Rng.Bookmarks.Exists("\EndOfDoc") Then Exit Do
.start = Rng.End
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
Set Rng = Nothing
Application.ScreenUpdating = True
I have also tried this piece of code:
Dim appWD As Word.Application
Dim docWD As Word.Document
Dim rngWD As Word.Range
Dim strDoc As String
Dim intVal As Integer
Dim strLine As String
Dim bolEOF As Boolean
bolEOF = False
' Set strDoc here to include the full
' file path and file name
On Error Resume Next
Set appWD = GetObject(, "Word.Application")
If Err.Number <> 0 Then
Set appWD = CreateObject("Word.Application")
End If
Err.Clear
On Error GoTo 0
strDoc = "c:\KenGraves\Project2\output\master.doc"
Set docWD = appWD.Documents.Open(strDoc)
appWD.Visible = True
docWD.Characters(1).Select
Do
appWD.Selection.MoveEnd Unit:=wdLine, Count:=1
strLine = appWD.Selection.Text
Debug.Print strLine
intVal = LineContainsDescendant(strLine)
If intVal = 1 Then
MsgBox strLine
End If
appWD.Selection.Collapse wdCollapseEnd
If appWD.Selection.Bookmarks.Exists("\EndOfDoc") Then bolEOF = True
Loop Until bolEOF = True
Neither seem to recognize the bookmark ("\EndOfDoc"). It doesn't matter which one gets working. Is it possible that my document does not contain this bookmark?
Not terribly elegant, but this change to one line of your first procedure seems to stop it at the appropriate time. I believe you actually have to insert bookmarks into your document if you want to reference them. They aren't automatically generated.
If i >= ActiveDocument.BuiltInProperties("NUMBER OF LINES") Then Exit Do
Cheers, LC
Unless you have a corrupted document, all Word documents should have the \EndOfDoc bookmark. You can check using simply ActiveDocument.Range.Bookmarks("\EndOfDoc").Exists. If it doesn't then you'll need to supply more details on the version of Word and if possible supply a sample document via Dropbox or the like.
I'm not sure why you're looping to the start of the Word document, when I run the code it works fine. However, if I put a footnote at the end of the document it runs into an endless loop, depending on your documents you may run into additional situations like this where your code fails to handle the document setup.
I would suggest modifying slightly how you check for the end of the document to make your code a bit more robust. I'd still use the bookmark "\EndOfDoc", however I'd check the limits of the range against your current search range.
So at the top of your code declare a range variable and set it to range of the end of the document eg:
Dim rEnd As Range
Set rEnd = ActiveDocument.Bookmarks("\EndOfDoc").Range
and then in your loop, instead of this line:
If Rng.Bookmarks.Exists("\EndOfDoc") Then Exit Do
use this line:
If Rng.End >= rEnd.End Then Exit Do