VBA - Find paragraph starting with numbers - vba

I'm using a VBA script to try to find the starting number of a paragraph (they are list items not formatted as such - not trying to format, just find the numbers).
1. First Item
2. Second Item
No number - don't include despite 61.5 in paragraph.
25 elephants should not be included
12. Item Twelve, but don't duplicate because of Susie's 35 items
Is there any way to say in VBA "If start of paragraph has 1-2 numbers, return those numbers". In regex, what I'm looking for is ^(\d\+)\.
Here is a working bit of VBA code - haven't figured out how to CREATE the excel file yet, so if you go to test create a blank test.xslx in your temp folder. Of course this may be simple enough that testing isn't necessary.
Sub FindWordCopySentence()
On Error Resume Next
Dim appExcel As Object
Dim objSheet As Object
Dim aRange As Range
Dim intRowCount As Integer
intRowCount = 1
' Open Excel File
If objSheet Is Nothing Then
Set appExcel = CreateObject("Excel.Application")
'Change the file path to match the location of your test.xls
Set objSheet = appExcel.workbooks.Open("C:\temp\test.xlsx").Sheets("Sheet1")
intRowCount = 1
End If
' Word Document Find
Set aRange = ActiveDocument.Range
With aRange.Find
Do
.ClearFormatting
' Find 1-2 digit number
.Text = "[0-9]{1,2}"
.MatchWildcards = True
.Execute
If .Found Then
' Copy to Excel file
aRange.Expand Unit:=wdSentence
aRange.Copy
aRange.Collapse wdCollapseEnd
objSheet.Cells(intRowCount, 1).Select
objSheet.Paste
intRowCount = intRowCount + 1
End If
Loop While .Found
End With
Set aRange = Nothing
If Not objSheet Is Nothing Then
appExcel.workbooks(1).Close True
appExcel.Quit
Set objSheet = Nothing
Set appExcel = Nothing
End If
End Sub
Thanks!

I would go quite a bit simpler and just check the first few characters of the paragraph:
Option Explicit
Sub test()
Dim para As Paragraph
For Each para In ThisDocument.Paragraphs
With para.Range
If (.Characters(2) = ".") Or (.Characters(3) = ".") Then
If IsNumeric(para.Range.Words(1)) Then
Debug.Print "Do something with paragraph number " & _
para.Range.Words(1) & "."
End If
End If
End With
Next para
End Sub

A more efficient approach, which obviates the need to test every paragraph:
Sub Demo()
Application.ScreenUpdating = False
Dim StrOut As String
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^13[0-9.]{1,}" ' or: .Text = "^13[0-9]{1,}
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.MatchWildcards = True
.Execute
End With
Do While .Find.Found
StrOut = StrOut & .Text
' or: MsgBox Split(.Text, vbCr)(1)
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
Application.ScreenUpdating = True
MsgBox StrOut
End Sub
As coded, the macro returns the entire list strings where there may be multiple levels (e.g. 1.2). Comments show how to find just the first number where there may be multiple levels and how to extract that number for testing (the Find expression includes the preceding paragraph break).

Related

Update a FORMTEXT to a MERGEFIELD VBA

I have a project that I will likely be doing pretty regularly so I think a macro or vba module would be worth looking into.
The document has several [FORMTEXT] [FORMCHECKBOX] and such and I would like to automate replacing the [FORMTEXT] with {MERGEFIELD formtextname}. I've sone similar with {command} from Crystal to {mergefield } but that was just word replacement not field type. I found things about wdFormtextfield and wdMergeField just not sure how to .find type wdformtext. I assume if I can .find the type I can then .Replace with wdMergeField. I will be looping through the document. Any thoughts?
I might be going the wrong way but this is what I am thinking
Sub Change_FormTextToMergeField()
Application.ScreenUpdating = False
Dim StrFld As String 'change to a wd type for formtext
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "\{*\}"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
End With
Do While .Find.Execute
i = 1
'add .Replace code to replace wdFieldFormTextInput with wdFieldMergeField
MsgBox .Words.Parent
i = i + 1
Loop
End With
Application.ScreenUpdating = True
End Sub
Sample Doc
The following procedure converts all FormFields to MergeFields perserving bookmarks as the Mergefield name
Sub FormFieldToMergeField() '
' A scratch Word macro coded by Charles Kenyon - based on Greg Maxey format
' https://stackoverflow.com/questions/75393869/update-a-formtext-to-a-mergefield-vba
' 2023-02-10
' started with Doug Robbins macro https://answers.microsoft.com/en-us/msoffice/forum/all/convert-formtext-fields-into-bookmarked-content/1f2d7aa2-a335-4667-9955-998d0525ba09
' Converts FormFields to Mail Merge Fields - the Bookmark name becomes the mergefield name
' If no FormField bookmark, then mergefield says Unnamed
'
' DECLARE VARIABLES / CONSTANTS
Dim oRng As range
Dim oFld As Field
Dim strName As String
Dim i As Long
'================================================
' ACTIONS
Application.ScreenUpdating = False
On Error GoTo lbl_Exit
With ActiveDocument
For i = .FormFields.Count To 1 Step -1
Set oRng = .FormFields(i).range
strName = .FormFields(i).Name
If strName = "" Then strName = "Unnamed Field " & i
.FormFields(i).Delete
oRng.Select
Set oFld = .Fields.Add(range:=oRng, Type:=wdFieldMergeField, Text:=strName)
Selection.Collapse wdCollapseStart
Next i
End With
' EXIT PROCEDURE
lbl_Exit:
' CLEAR ERROR HANDLER AND OBJECTS
Application.ScreenUpdating = True
Application.ScreenRefresh
On Error GoTo -1
Set oRng = Nothing
Set oFld = Nothing
Exit Sub
End Sub
If there is no bookmark, the mergefield will be to Unnamed.
Instructions for Installing Macros from Forums or Websites by Word MVP Graham Mayor

How to print row of found string?

I'd like to find several strings within Word document and for each string found, I like to print (debug.print for example) the whole row content where the string is found, not the paragraph.
How can I do this? Thanks
Sub FindStrings
Dim StringsArr (1 to 3)
StringsArr = Array("string1","string2","string3")
For i=1 to 3
With
Selection.Find
.ClearFormatting
.Text = Strings(i)
Debug.Print CurrentRow 'here I need help
End With
Next
End Sub
The term Row in Word is used only in the context of a table. I assume the term you mean is Line, as in a line of text.
The Word object model has no concept of "line" (or "page") due to the dynamic layout algorithm: anything the user does, even changing the printer, could change where a line or a page breaks over. Since these things are dynamic, there's no object.
The only context where "line" can be used is in connection with a Selection. For example, it's possible to extend a Selection to the start and/or end of a line. Incorporating this into the code in the question it would look something like:
Sub FindStrings()
Dim StringsArr As Variant
Dim bFound As Boolean
Dim rng As Word.Range
Set rng = ActiveDocument.content
StringsArr = Array("string1", "string2", "string3")
For i = LBound(StringsArr) To UBound(StringsArr)
With rng.Find
.ClearFormatting
.Text = StringsArr(i)
.Wrap = wdFindStop
bFound = .Execute
'extend the selection to the start and end of the current line
Do While bFound
rng.Select
Selection.MoveStart wdLine, -1
Selection.MoveEnd wdLine, 1
Debug.Print Selection.Text
rng.Collapse wdCollapseEnd
bFound = .Execute
Loop
End With
Set rng = ActiveDocument.content
Next
End Sub
Notes
Since it's easier to control when having to loop numerous times, a Range object is used as the basic search object, rather than Selection. The found Range is only selected for the purpose of getting the entire line as these "Move" methods for lines only work on a Selection.
Before the loop can continue, the Range (or, if we were working with a selection, the selection) needs to be "collapsed" so that the code does not search and find the same instance of the search term, again. (This is also the reason for Wrap = wdFindStop).

Iterate through paragraphs and trim spaces in MS Word

I need to create a macros which removes whitespaces and indent before all paragraphs in the active MS Word document. I've tried following:
For Each p In ActiveDocument.Paragraphs
p.Range.Text = Trim(p.range.Text)
Next p
which sets macros into eternal loop. If I try to assign string literal to the paragraphs, vba always creates only 1 paragraph:
For Each p In ActiveDocument.Paragraphs
p.Range.Text = "test"
Next p
I think I have a general misconception about paragraph object. I would appreciate any enlightment on the subject.
The reason the code in the question is looping is because replacing one paragraph with the processed (trimmed) text is changing the paragraphs collection. So the code will continually process the same paragraph at some point.
This is normal behavior with objects that are getting deleted and recreated "behind the scenes". The way to work around it is to loop the collection from the end to the front:
For i = ActiveDocument.Paragraphs.Count To 1 Step -1
Set p = ActiveDocument.Paragraphs(i)
p.Range.Text = Trim(p.Range.Text)
Next
That said, if the paragraphs in the document contain any formatting this will be lost. String processing does not retain formatting.
An alternative would be to check the first character of each paragraph for the kinds of characters you consider to be "white space". If present, extend the range until no more of these characters are detected, and delete. That will leave the formatting intact. (Since this does not change the entire paragraph a "normal" loop works.)
Sub TestTrimParas()
Dim p As Word.Paragraph
Dim i As Long
Dim rng As Word.Range
For Each p In ActiveDocument.Paragraphs
Set rng = p.Range.Characters.First
'Test for a space or TAB character
If rng.Text = " " Or rng.Text = Chr(9) Then
i = rng.MoveEndWhile(" " + Chr(9))
Debug.Print i
rng.Delete
End If
Next p
End Sub
You could, of course, do this in a fraction of the time without a loop, using nothing fancier than Find/Replace. For example:
Find = ^p^w
Replace = ^p
and
Find = ^w^p
Replace = ^p
As a macro this becomes:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
.InsertBefore vbCr
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchWildcards = False
.Text = "^p^w"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
.Text = "^w^p"
.Execute Replace:=wdReplaceAll
End With
.Characters.First.Text = vbNullString
End With
Application.ScreenUpdating = True
End Sub
Note also that trimming text the way you're doing is liable to destroy all intra-paragraph formatting, cross-reference fields, and the like; it also won't change indents. Indents can be removed by selecting the entire document and changing the paragraph format; better still, modify the underlying Styles (assuming they've been used correctly).
Entering "eternal" loop is a bit unpleasant. Only Chuck Norris can exit one. Anyway, try to make a check before trimming and it will not enter:
Sub TestMe()
Dim p As Paragraph
For Each p In ThisDocument.Paragraphs
If p.Range <> Trim(p.Range) Then p.Range = Trim(p.Range)
Next p
End Sub
As has been said by #Cindy Meister, I need to prevent endless creation of another paragraphs by trimming them. I bear in mind that paragraph range contains at least 1 character, so processing range - 1 character would be safe. Following has worked for me
Sub ProcessParagraphs()
Set docContent = ActiveDocument.Content
' replace TAB symbols throughout the document to single space (trim does not remove TAB)
docContent.Find.Execute FindText:=vbTab, ReplaceWith:=" ", Replace:=wdReplaceAll
For Each p In ActiveDocument.Paragraphs
' delete empty paragraph (delete operation is safe, we cannot enter enternal loop here)
If Len(p.range.Text) = 1 Then
p.range.Delete
' remove whitespaces
Else
Set thisRg = p.range
' shrink range by 1 character
thisRg.MoveEnd wdCharacter, -1
thisRg.Text = Trim(thisRg.Text)
End If
p.LeftIndent = 0
p.FirstLineIndent = 0
p.Reset
p.range.Font.Reset
Next
With Selection
.ClearFormatting
End With
End Sub
I saw a number of solutions here are what worked for me. Note I turn off track changes and then revert back to original document tracking status.
I hope this helps some.
Option Explicit
Public Function TrimParagraphSpaces()
Dim TrackChangeStatus: TrackChangeStatus = ActiveDocument.TrackRevisions
ActiveDocument.TrackRevisions = False
Dim oPara As Paragraph
For Each oPara In ActiveDocument.StoryRanges(wdMainTextStory).Paragraphs
Dim oRange As Range: Set oRange = oPara.Range
Dim endRange, startRange As Range
Set startRange = oRange.Characters.First
Do While (startRange = Space(1))
startRange.Delete 'Remove last space in each paragraphs
Set startRange = oRange.Characters.First
Loop
Set endRange = oRange
' NOTE: for end range must select the before last characted. endRange.characters.Last returns the chr(13) return
endRange.SetRange Start:=oRange.End - 2, End:=oRange.End - 1
Do While (endRange = Space(1))
'endRange.Delete 'NOTE delete somehow does not work for the last paragraph
endRange.Text = "" 'Remove last space in each paragraphs
Set endRange = oPara.Range
endRange.SetRange Start:=oRange.End - 1, End:=oRange.End
Loop
Next
ActiveDocument.TrackRevisions = TrackChangeStatus
End Function

Use VBA with Powerpoint to Search titles in a Word Doc and Copy Text into another Word Document

I'm working on a Powerpoint slide, where I few texts are listed. I have to search for these texts in a Word Document which has a lot of Headings and Texts. After I find the title text, I need to copy the text under the Heading and paste in a new document.
Basically, the VBA coding has to be done in the Powerpoint VBA, with two documents in the background for searching text and pasting it in another.
I've opened the word doc. But searching the text in it and selecting it for copying to another document is what I've not been able to do. Kindly help me.
I see. The following is not exactly elegant since it uses Selection which I always try to avoid but it is the only way I know to achieve such a thing.
Disclaimer 1: this is made in Word VBA, so you will need a slight adaption, like set a reference to Word, use a wrdApp = New Word.Application object and declare doc and newdoc explicitely as Word.Document.
Disclaimer 2: Since you search for text instead of the respective heading, beware that this will find the first occurence of that text so you better not have the same text in several chapters. ;-)
Disclaimer 3: I cannot paste anymore! :-( My clipboard is set, it pastes elsewhere but I just cannot paste in here.
Code follows with first edit, hopefully in a minute...
Edit: yepp, pasting works again. :-)
Sub FindChapter()
Dim doc As Document, newdoc As Document
Dim startrange As Long, endrange As Long
Dim HeadingToFind As String, ChapterToFind As String
ChapterToFind = "zgasfdiukzfdggsdaf" 'just for testing
Set doc = ActiveDocument
Set newdoc = Documents.Add
doc.Activate
Selection.HomeKey unit:=wdStory
With Selection
With .Find
.ClearFormatting
.Text = ChapterToFind
.MatchWildcards = False
.MatchCase = True
.Execute
End With
If .Find.Found Then
'**********
'Find preceding heading to know where chapter starts
'**********
.Collapse wdCollapseStart
With .Find
.Text = ""
.Style = "Heading 1"
.Forward = False
.Execute
If Not .Found Then
MsgBox "Could not find chapter heading"
Exit Sub
End If
End With
.MoveDown Count:=1
.HomeKey unit:=wdLine
startrange = .Start
'*********
'Find next heading to know where chapter ends
'*********
.Find.Forward = True
.Find.Execute
.Collapse wdCollapseStart
.MoveUp Count:=1
.EndKey unit:=wdLine
endrange = .End
doc.Range(startrange, endrange).Copy
newdoc.Content.Paste
newdoc.SaveAs2 doc.Path & "\" & HeadingToFind & ".docx", wdFormatFlatXML
Else
MsgBox "Chapter not found"
End If
End With
End Sub
Edit: If you need to search for a "feature" that will be in some table in column 1 with the description in column 2 and you need that description in a new doc, try this:
Sub FindFeature()
Dim doc As Document, newdoc As Document
Dim FeatureToFind As String
Dim ro As Long, tbl As Table
FeatureToFind = "zgasfdiukzfdggsdaf" 'just for testing
Set doc = ActiveDocument
Set newdoc = Documents.Add
doc.Activate
Selection.HomeKey unit:=wdStory
With Selection
With .Find
.ClearFormatting
.Text = FeatureToFind
.MatchWildcards = False
.MatchCase = True
.Execute
End With
If .Find.Found Then
Set tbl = Selection.Tables(1)
ro = Selection.Cells(1).RowIndex
tbl.Cell(ro, 2).Range.Copy
newdoc.Range.Paste
End If
End With
End Sub
Edit: Slight adaptation so you can paste without overwriting existing content in newdoc:
Instead of newdoc.Range.Paste just use something along the line of this:
Dim ran As Range
Set ran = newdoc.Range
ran.Start = ran.End
ran.Paste

VBA loop won't stop/doesn't find the "\EndofDoc" marker

I am writing a vba macro to search a word document line by line and trying to find certain names in the document. The looping works fine except for when it gets to the end of the document, it just continues from the top and starts over. Here is the code:
Application.ScreenUpdating = False
Dim i As Integer, Rng As Range
With ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "?"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
Do While .Find.found
i = i + 1
Set Rng = .Duplicate
Set Rng = Rng.GoTo(What:=wdGoToBookmark, Name:="\line")
MsgBox "Line " & i & vbTab & Rng.Text
If Rng.Bookmarks.Exists("\EndOfDoc") Then Exit Do
.start = Rng.End
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
Set Rng = Nothing
Application.ScreenUpdating = True
I have also tried this piece of code:
Dim appWD As Word.Application
Dim docWD As Word.Document
Dim rngWD As Word.Range
Dim strDoc As String
Dim intVal As Integer
Dim strLine As String
Dim bolEOF As Boolean
bolEOF = False
' Set strDoc here to include the full
' file path and file name
On Error Resume Next
Set appWD = GetObject(, "Word.Application")
If Err.Number <> 0 Then
Set appWD = CreateObject("Word.Application")
End If
Err.Clear
On Error GoTo 0
strDoc = "c:\KenGraves\Project2\output\master.doc"
Set docWD = appWD.Documents.Open(strDoc)
appWD.Visible = True
docWD.Characters(1).Select
Do
appWD.Selection.MoveEnd Unit:=wdLine, Count:=1
strLine = appWD.Selection.Text
Debug.Print strLine
intVal = LineContainsDescendant(strLine)
If intVal = 1 Then
MsgBox strLine
End If
appWD.Selection.Collapse wdCollapseEnd
If appWD.Selection.Bookmarks.Exists("\EndOfDoc") Then bolEOF = True
Loop Until bolEOF = True
Neither seem to recognize the bookmark ("\EndOfDoc"). It doesn't matter which one gets working. Is it possible that my document does not contain this bookmark?
Not terribly elegant, but this change to one line of your first procedure seems to stop it at the appropriate time. I believe you actually have to insert bookmarks into your document if you want to reference them. They aren't automatically generated.
If i >= ActiveDocument.BuiltInProperties("NUMBER OF LINES") Then Exit Do
Cheers, LC
Unless you have a corrupted document, all Word documents should have the \EndOfDoc bookmark. You can check using simply ActiveDocument.Range.Bookmarks("\EndOfDoc").Exists. If it doesn't then you'll need to supply more details on the version of Word and if possible supply a sample document via Dropbox or the like.
I'm not sure why you're looping to the start of the Word document, when I run the code it works fine. However, if I put a footnote at the end of the document it runs into an endless loop, depending on your documents you may run into additional situations like this where your code fails to handle the document setup.
I would suggest modifying slightly how you check for the end of the document to make your code a bit more robust. I'd still use the bookmark "\EndOfDoc", however I'd check the limits of the range against your current search range.
So at the top of your code declare a range variable and set it to range of the end of the document eg:
Dim rEnd As Range
Set rEnd = ActiveDocument.Bookmarks("\EndOfDoc").Range
and then in your loop, instead of this line:
If Rng.Bookmarks.Exists("\EndOfDoc") Then Exit Do
use this line:
If Rng.End >= rEnd.End Then Exit Do