Write and Style In Loop - vba

Say that I have the following code to write headings from an array to a Word document and to apply defined styles:
With wdDoc
    Set wrdRange = .Range(0, 0) ' Set initial Range.
    i = 2
    Do Until i > 6
        ' Debug.Print wrdRange.Start, wrdRange.End
        wrdRange.text = totalArray(i, colIndex(3)) & Chr(11)
        Set wrdRange = .Paragraphs(i - 1).Range
        wrdRange.Style = totalArray(i, colIndex(2))
        wrdRange.Collapse 0
        i = i + 1
    Loop
End With
One would expect the following to occur:
The word range moves programmatically as I move through the document.
The word style is updated for the new range (defined by the set statement)
The Range collapses to the end (0 = wdCollapseEnd) and the loop continues until the initial conditions are satisfied.
What I can't seem to fix is the styles being applied to ALL existing paragraphs in the document. The Debug.Print statement should show the range being updated as expected, despite the fact that the style applies to all existing paragraphs.
As you can tell, I've toyed around with this quite a bit, to no avail. Any help would be appreciated in this matter.
Thanks.

In the following line of code:
wrdRange.text = totalArray(i, colIndex(3)) & Chr(11)
Use Chr(13) instead of Chr(11). The latter is simply a line break, not a new paragraph. So applying a style to any part of the Range is actually applying it to all the text your code is generating because it's a single paragraph.
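As a rough sketch of the same idea (the sample texts and styles are placeholder data standing in for totalArray, and WriteStyledHeadings is a made-up name), each heading can be written, styled, and then closed with a real paragraph mark via InsertParagraphAfter, so that every style lands on its own paragraph:

```vba
' Sketch only: the arrays below are placeholder data, not the question's totalArray.
' Intended to be run from Word against an empty document.
Sub WriteStyledHeadings()
    Dim headingTexts As Variant, headingStyles As Variant
    headingTexts = Array("Introduction", "Background", "Conclusion")
    headingStyles = Array(wdStyleHeading1, wdStyleHeading2, wdStyleHeading2)

    Dim rng As Range
    Set rng = ActiveDocument.Range(0, 0)    ' start at the top of the document

    Dim i As Long
    For i = LBound(headingTexts) To UBound(headingTexts)
        rng.Text = headingTexts(i)          ' the range now spans the inserted text
        rng.Style = headingStyles(i)        ' style only the paragraph holding that text
        rng.InsertParagraphAfter            ' real paragraph mark, not a Chr(11) line break
        rng.Collapse wdCollapseEnd          ' continue after the new paragraph mark
    Next i
End Sub
```

The built-in style constants are used here instead of style names so the sketch does not depend on the language of the Word installation.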

Related

Finding and Replacing with VBA for Word overwrites previous style

I'm writing a VBA script to generate word documents from an already defined template. In it, I need to be able to write headings along with a body for each heading. As a small example, I have a word document that contains only <PLACEHOLDER>. For each heading and body I need to write, I use the find-and-replace feature in VBA to find <PLACEHOLDER> and replace it with the heading name, a newline, and then <PLACEHOLDER> again. This is repeated until each heading name and body is written and then the final <PLACEHOLDER> is replaced with a newline.
The text replacing works fine, but the style I specify gets overwritten by the next call to the replacement. This results in everything I just replaced having the style of whatever my last call to my replacement function is.
VBA code (run main)
Option Explicit
Sub replace_stuff(search_string As String, replace_string As String, style As Integer)
    With ActiveDocument.Range.Find
        .Text = search_string
        .Replacement.Text = replace_string
        .Replacement.style = style
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchWholeWord = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
Sub main()
    Dim section_names(2) As String
    section_names(0) = "Introduction"
    section_names(1) = "Background"
    section_names(2) = "Conclusion"
    Dim section_bodies(2) As String
    section_bodies(0) = "This is the body text for the introduction! Fetched from some file."
    section_bodies(1) = "And Background... I have no issue fetching data from the files."
    section_bodies(2) = "And for the conclusion... But I want the styles to 'stick'!"
    Dim i As Integer
    For i = 0 To 2
        ' Writes each section name as wdStyleHeading2, and then the section body as wdStyleNormal
        Call replace_stuff("<PLACEHOLDER>", section_names(i) & Chr(11) & "<PLACEHOLDER>", wdStyleHeading2)
        Call replace_stuff("<PLACEHOLDER>", section_bodies(i) & Chr(11) & "<PLACEHOLDER>", wdStyleNormal)
    Next i
    Call replace_stuff("<PLACEHOLDER>", Chr(11), wdStyleNormal)
End Sub
Input document: A word document with only <PLACEHOLDER> in it.
<PLACEHOLDER>
Expected Output:
I expect each heading to be displayed in the style I specified and to be visible in the navigation pane.
Actual Output: What I actually get is everything in the wdStyleNormal style.
I think the problem can be solved by inserting a paragraph break at every style transition, but when I try vbCrLf, Chr(10) & Chr(13), or vbNewLine instead of the Chr(11) I am using now, each line begins with a boxed question mark.
Update from discussion in comments on another answer. The problem described below applies to Word 2016 and earlier. Starting in Office 365 (and probably Word 2019, but that's not been confirmed) the Replace behavior has been changed to "convert" ANSI 13 to a "real" paragraph mark, so the problem in the question would not occur.
Answer
The reason for the odd formatting behavior is the use of Chr(11), which inserts a new line (Shift + Enter) instead of a new paragraph. So a paragraph style applied to any part of this text formats the entire text with the same style.
In this particular case (working with Replace), vbCr or the equivalent Chr(13) does not work either, because neither is Word's native paragraph mark. A paragraph mark is much more than ANSI code 13: it carries the paragraph formatting information. So, while the code is running, Word does not recognize these characters as true paragraph marks, and the paragraph style assignment is applied to "everything".
What does work is to use the string ^p, which in Word's Find/Replace is the "alias" for a complete paragraph mark. So, for example:
replace_stuff "<PLACEHOLDER>", section_names(i) & "^p" & "<PLACEHOLDER>", wdStyleHeading2
replace_stuff "<PLACEHOLDER>", section_bodies(i) & "^p" & "<PLACEHOLDER>", wdStyleNormal
There is, however, a more efficient way to build a document than inserting a placeholder for each new item and using Find/Replace to replace the placeholder with the document content. The more conventional approach is to work with a Range object (think of it like an invisible selection)...
Assign content to the Range, format it, collapse (like pressing right-arrow for a selection) and repeat. Here's an example that returns the same result as the (corrected) code in the question:
Sub main()
    Dim rng As Range
    Set rng = ActiveDocument.Content
    Dim section_names(2) As String
    section_names(0) = "Introduction"
    section_names(1) = "Background"
    section_names(2) = "Conclusion"
    Dim section_bodies(2) As String
    section_bodies(0) = "This is the body text for the introduction! Fetched from some file."
    section_bodies(1) = "And Background... I have no issue fetching data from the files."
    section_bodies(2) = "And for the conclusion... But I want the styles to 'stick'!"
    Dim i As Integer
    For i = 0 To 2
        BuildParagraph section_names(i), wdStyleHeading2, rng
        BuildParagraph section_bodies(i), wdStyleNormal, rng
    Next i
End Sub
Sub BuildParagraph(para_text As String, para_style As Long, rng As Range)
    rng.Text = para_text
    rng.Style = para_style
    rng.InsertParagraphAfter
    rng.Collapse wdCollapseEnd
End Sub
The problem is caused by your use of Chr(11) which is a manual line break. This results in all of the text being in a single paragraph. When the paragraph style is applied it applies to the entire paragraph.
Replace Chr(11) with vbCr to ensure that each piece of text is in a separate paragraph.

What does a hyperlink range.start and range.end refer to?

I'm trying to manipulate some text from a MS Word document that includes hyperlinks. However, I'm tripping up at understanding exactly what Range.Start and Range.End are returning.
I banged a few random words into an empty document, and added some hyperlinks. Then wrote the following macro...
Sub ExtractHyperlinks()
    Dim rHyperlink As Range
    Dim rEverything As Range
    Dim wdHyperlink As Hyperlink
    For Each wdHyperlink In ActiveDocument.Hyperlinks
        Set rHyperlink = wdHyperlink.Range
        Set rEverything = ActiveDocument.Range
        rEverything.TextRetrievalMode.IncludeFieldCodes = True
        Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start) & "#" & vbCrLf
    Next
End Sub
However, the output between the #s does not quite match up with the hyperlinks, and is more than a character or two out. So if the .Start and .End do not return char positions, what do they return?
This is a bit of a simplification but it's because rEverything counts everything before the hyperlink, then all the characters in the hyperlink field code (including 1 character for each of the opening and closing field code braces), then all the characters in the hyperlink field result, then all the characters after the field.
However, the character count in the range (e.g. rEverything.Characters.Count or Len(rEverything)) only includes the field result if TextRetrievalMode.IncludeFieldCodes is set to False and only includes the field code if TextRetrievalMode.IncludeFieldCodes is set to True.
So whenever the range contains a field, the character count is smaller than range.End - range.Start.
In this case if you change your Debug expression to something like
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start - (rEverything.End - rEverything.Start - 1 - Len(rEverything))) & "#" & vbCrLf
you may see results more along the lines you expect.
Another way to visualise what is going on is as follows:
Create a very short document with a piece of text followed by a short hyperlink field with short result, followed by a piece of text. Put the following code in a module:
Sub Select1()
    Dim i As Long
    With ActiveDocument
        For i = .Range.Start To .Range.End
            .Range(i, i).Select
        Next
    End With
End Sub
Insert a breakpoint on the "Next" line.
Then run the code once with the field codes displayed and once with the field results displayed. You should see the progress of the selection "pause" either at the beginning or the end of the field, as the Select keeps "selecting" something that you cannot actually see.
Range.Start returns the character position from the beginning of the document to the start of the range; Range.End to the end of the range.
But the visible characters are not the only things that get counted, and therein lies the problem.
Examples of "hidden" things that are counted, but not visible:
"control characters" associated with content controls
"control characters" associated with fields (which also means hyperlinks), which can be seen if field result is toggled to field code display using Alt+F9
table structures (ANSI 07 and ANSI 13)
text with the font formatting "hidden"
For this reason, using Range.Start and Range.End to get a "real" position in the document is neither reliable nor recommended. The properties are useful, for example, to set the position of one range relative to the position of another.
You can get a somewhat more accurate result using the Range.TextRetrievalMode boolean properties IncludeHiddenText and IncludeFieldCodes. But these don't affect the structural elements involved with content controls and tables.
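To see the size of this discrepancy in a given document, one option is to compare the span of a range with the length of the text it returns. The following is only a sketch: HiddenPositionCount and ReportHiddenPositions are made-up names, and the result depends on which fields, hidden text, tables, and content controls the document contains.

```vba
' Sketch: counts the positions in rng that do not show up in its visible text.
' HiddenPositionCount is a hypothetical helper name, not part of the Word object model.
Function HiddenPositionCount(ByVal rng As Range) As Long
    Dim probe As Range
    Set probe = rng.Duplicate                          ' work on a copy of the range
    probe.TextRetrievalMode.IncludeFieldCodes = False  ' return field results, not field codes
    probe.TextRetrievalMode.IncludeHiddenText = False  ' leave out text formatted as hidden
    HiddenPositionCount = (probe.End - probe.Start) - Len(probe.Text)
End Function

Sub ReportHiddenPositions()
    Debug.Print "Positions not in the visible text: " & _
                HiddenPositionCount(ActiveDocument.Content)
End Sub
```

A non-zero result means Start and End cannot be used directly as indexes into the Text string for that range.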
Thank you both so much for pointing out this approach was doomed but that I could still use .Start/.End for relative positions. What I was ultimately trying to do was turn a passed paragraph into HTML, with the hyperlinks.
I'll post what worked here in case anyone else has a use for it.
Function ExtractHyperlinks(rParagraph As Range) As String
    Dim rHyperlink As Range
    Dim wdHyperlink As Hyperlink
    Dim iCaretHold As Integer, iCaretMove As Integer, rCaret As Range
    Dim s As String
    iCaretHold = 1
    iCaretMove = 1
    For Each wdHyperlink In rParagraph.Hyperlinks
        Set rHyperlink = wdHyperlink.Range
        Do
            Set rCaret = ActiveDocument.Range(rParagraph.Characters(iCaretMove).Start, rParagraph.Characters(iCaretMove).End)
            If RangeContains(rHyperlink, rCaret) Then
                s = s & Mid(rParagraph.Text, iCaretHold, iCaretMove - iCaretHold) & "" & IIf(wdHyperlink.TextToDisplay <> "", wdHyperlink.TextToDisplay, wdHyperlink.Address) & ""
                iCaretHold = iCaretMove + Len(wdHyperlink.TextToDisplay)
                iCaretMove = iCaretHold
                Exit Do
            Else
                iCaretMove = iCaretMove + 1
            End If
        Loop Until iCaretMove > Len(rParagraph.Text)
    Next
    If iCaretMove < Len(rParagraph.Text) Then
        s = s & Mid(rParagraph.Text, iCaretMove)
    End If
    ExtractHyperlinks = "<p>" & s & "</p>"
End Function
Function RangeContains(rParent As Range, rChild As Range) As Boolean
    If rChild.Start >= rParent.Start And rChild.End <= rParent.End Then
        RangeContains = True
    Else
        RangeContains = False
    End If
End Function

Check if a Range of text fits onto a single line

I'm programmatically filling in a regulated form template where lines are predefined (as table cells):
(Using plain text Content Controls as placeholders but this isn't relevant to the current question.)
So, I have to break long text into lines manually (auto-adding rows or something is not an option because page breaks are also predefined).
Now, since characters have different width, I cannot just set some hardcoded character limit to break at (or rather, I can, and that's what I'm doing now, but this has proven to be inefficient and unreliable, as expected). So:
How do I check if a Range of text fits on a single line -- and if it doesn't, how much of it fits?
I've checked out Range Members (Word) but can't see anything relevant.
The only way is to .Select that text, then manipulate the selection. Selection is the only object for which you can use wdLine as a boundary. Nothing else in the Word object model works with automatic line breaks.
Sub GetFirstLineOfRange(RangeToCheck As Range, FirstLineRange As Range)
    'Otherwise, Word doesn't always insert automatic line breaks
    'and all the text will programmatically look like it's on a single line
    If Not Application.Visible Or Not Application.ScreenUpdating Then
        Application.ScreenRefresh
    End If
    Dim SelectionRange As Range
    Set SelectionRange = Selection.Range
    Set FirstLineRange = RangeToCheck
    FirstLineRange.Select
    Selection.Collapse Direction:=wdCollapseStart
    Selection.EndOf Unit:=wdLine, Extend:=wdExtend
    Set FirstLineRange = Selection.Range
    If FirstLineRange.End > RangeToCheck.End Then
        FirstLineRange.End = RangeToCheck.End
    End If
    SelectionRange.Select
End Sub
Function IsRangeOnOneLine(RangeToCheck As Range) As Boolean
    Dim FirstLineRange As Range
    GetFirstLineOfRange RangeToCheck, FirstLineRange
    IsRangeOnOneLine = FirstLineRange.End >= RangeToCheck.End
End Function
The subroutine GetFirstLineOfRange takes a RangeToCheck and sets FirstLineRange to the first text line in the given range.
The function IsRangeOnOneLine takes a RangeToCheck and returns True if the range fits on one line of text, and False otherwise. The function works by getting the first text line in the given range and checking whether it contains the range or not.
The manipulation of the Selection in GetFirstLineOfRange is necessary because the subroutine wants to move the end of the range to the end of the line, and the movement unit wdLine is available only with Selection. The subroutine saves and restores the current Selection; if this is not necessary then the temporary variable SelectionRange and the associated statements can be deleted.
Note:
There is no need to scroll anything - which in any event is not reliable. Try something based on:
With Selection
    If .Characters.First.Information(wdVerticalPositionRelativeToPage) = _
       .Characters.Last.Information(wdVerticalPositionRelativeToPage) Then
        MsgBox .Text & vbCr & vbCr & "Spans one line or less."
    Else
        MsgBox .Text & vbCr & vbCr & "Spans more than one line."
    End If
End With
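The same comparison should also work on a Range rather than the Selection, and it can be extended to estimate how much of the text fits on the first line. This is only a sketch under some assumptions: IsOnOneLine and CharsOnFirstLine are made-up helper names, the range is non-empty and sits on a single page, and the layout is up to date when the code runs.

```vba
' Sketch: both helpers are hypothetical names built on Range.Information.
Function IsOnOneLine(ByVal rng As Range) As Boolean
    ' True when the first and last characters share the same vertical position.
    IsOnOneLine = (rng.Characters.First.Information(wdVerticalPositionRelativeToPage) = _
                   rng.Characters.Last.Information(wdVerticalPositionRelativeToPage))
End Function

' Counts the leading characters of rng that sit on the same line as its first character.
Function CharsOnFirstLine(ByVal rng As Range) As Long
    Dim firstTop As Variant
    Dim i As Long
    firstTop = rng.Characters(1).Information(wdVerticalPositionRelativeToPage)
    For i = 1 To rng.Characters.Count
        If rng.Characters(i).Information(wdVerticalPositionRelativeToPage) <> firstTop Then
            Exit For    ' character i is the first one on a later line
        End If
    Next i
    CharsOnFirstLine = i - 1
End Function
```

Checking one character at a time is slow for long ranges, so this is best suited to short form fields such as the table cells in the question.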

Word VBA match paragraph indent to heading text

How can I align a paragraph with just the text portion of a numbered heading? e.g:
1.1.2 This Is A Numbered Heading
This is the aligned text I'm trying to achieve
This is aligned to the numbers not the text
2.4 This Is Another Example
This is where the text should be
I'm aware of the CharacterUnitLeftIndent, CharacterUnitFirstLineIndent, FirstLineIndent etc properties but after a few hours experimentation & searching online can't figure out how to achieve this programmatically. I know how to test for the heading style and how to refer to the following paragraph so just need to know how to get the indent right.
To use a macro to accomplish this, you have to check each paragraph in your document to see if it has a "Heading" style. If so, pick off the value of the first tab stop to set as the indent for the subsequent paragraphs.
UPDATE1: the earlier version of the code below set the paragraphs to the Document level first tab stop, and did not accurately grab the tab stop set for the Heading styles. The code update below accurately determines each Heading indent tab stop.
UPDATE2: the original sample text I used is shown in this first document:
The code that automatically performs a first line indent to the tab level of the preceding heading is the original Sub from the first example:
Option Explicit
Sub SetParaIndents1()
    Dim myDoc As Document
    Set myDoc = ActiveDocument
    Dim para As Paragraph
    Dim firstIndent As Double 'value in "points"
    For Each para In myDoc.Paragraphs
        If para.Style Like "Heading*" Then
            firstIndent = myDoc.Styles(para.Style).ParagraphFormat.LeftIndent
            Debug.Print para.Style & " first tab stop at " & _
                        firstIndent & " points"
        Else
            Debug.Print "paragraph first line indent set from " & _
                        para.FirstLineIndent & " to " & _
                        firstIndent
            para.FirstLineIndent = firstIndent
        End If
    Next para
    '--- needed to show the changes just made
    Application.ScreenRefresh
End Sub
And the result looks like this (red lines added manually to show alignment):
If you want the entire paragraph indented in alignment with the heading style, the code is modified to this:
Option Explicit
Sub SetParaIndents2()
    Dim myDoc As Document
    Set myDoc = ActiveDocument
    Dim para As Paragraph
    For Each para In myDoc.Paragraphs
        If para.Style Like "Heading*" Then
            '--- do nothing
        Else
            para.Indent
        End If
    Next para
    '--- needed to show the changes just made
    Application.ScreenRefresh
End Sub
And the resulting text looks like this:

Range variable vs Paragraph variable different behaviour with selection

I thought the two following programs would be identical. Why aren't they?
This code works:
For i = 1 To n
    Set r = Selection.Range.Paragraphs(i).Range
    r.Collapse
    r.Text = " "
    r.ContentControls.Add (wdContentControlCheckBox)
Next i
This code doesn't:
For i = 1 To n
    Set r = Selection.Range.Paragraphs(i).Range
    Set p = r.Paragraphs(1)
    p.Range.Text = " " + p.Range.Text
    r.Collapse
    r.ContentControls.Add (wdContentControlCheckBox)
Next i
As far as I can tell, the only difference is instead of concatenating the old text behind a space then placing the cursor at the start of the para, I just place the cursor at the start of the para and input a space.
Tl;dr: I don't understand why the two programs above aren't equivalent
I lack the general knowledge to google the reason. My attempts pulled up general purpose guides. I tried stepping through the debugger to get a grasp of the control flow, but that didn't help either.
Try:
Dim sel As Range
Set sel = Selection.Range
For i = 1 To n
    Set r = Selection.Range.Paragraphs(i).Range
    Set p = r.Paragraphs(1)
    p.Range.Text = " " + p.Range.Text
    r.Collapse
    r.ContentControls.Add (wdContentControlCheckBox)
    sel.Select
Next i
The problem is that:
p.Range.Text = " " + p.Range.Text
is changing the selection....
Edited to include a better explanation:
When you use r.Collapse - you are setting the range r to have equal start and end positions.
For example if you have a paragraph like so:
"This is my first paragraph"
when you set r, it has a start of 0 and an end of 27. After you run r.Collapse the start and end both become 0 (assuming the para is at the start of the document).
You then insert a space (under your first method) at position 0 and then add your content control. Word can cope with this whilst the selection is selected.
Under the second method, you are changing the text of a paragraph directly. You are collapsing r later, but that will not change p, which starts out as the range (0, 27). Word cannot change the range (0, 27) to (0, 28) by adding the space without selecting it.
In short, the difference is that collapsing first lets Word insert the space before what is (to Word) an empty range at that point.
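The effect of Collapse on a range's positions can be checked directly in the Immediate window. This is a small sketch (ShowCollapse is a made-up name) that assumes the active document has at least one paragraph:

```vba
' Sketch: prints a range's Start and End before and after collapsing it.
Sub ShowCollapse()
    Dim r As Range
    Set r = ActiveDocument.Paragraphs(1).Range
    Debug.Print "Before: ", r.Start, r.End
    r.Collapse wdCollapseStart               ' same direction as calling Collapse with no argument
    Debug.Print "After:  ", r.Start, r.End   ' Start and End are now equal
End Sub
```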