I am currently trying to parse through a Word document which is full of bullet lists. I am able to iterate over and count all bullets in the document (see below), but I can't seem to find any way to determine what bullet style is being used for each bullet. All these bullets exist on the same level, so the list level doesn't help. I need to be able to determine "Is this bullet an Arrow icon or a Black dot icon?" However, since the styles are the defaults baked into Word I can see a way to SET the style for that list, but I can't find a way to GET a value of the current style.
Thanks in advance for your help!
Dim oPara As Word.Paragraph
Dim count As Integer
count = 0
'Select Entire document
Selection.WholeStory
With Selection
For Each oPara In .Paragraphs
If oPara.Range.ListFormat.ListType = WdListType.wdListBullet Then
MsgBox "debug: " & oPara.Range.ListFormat.ListType & "//" & oPara.Range.Text
count = count + 1
End If
Next
End With
'Gives the count of bullets in a document
MsgBox count
Related
I'm trying to manipulate some text from a MS Word document that includes hyperlinks. However, I'm tripping up at understanding exactly what Range.Start and Range.End are returning.
I banged a few random words into an empty document, and added some hyperlinks. Then wrote the following macro...
Sub ExtractHyperlinks()
Dim rHyperlink As Range
Dim rEverything As Range
Dim wdHyperlink As Hyperlink
For Each wdHyperlink In ActiveDocument.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Set rEverything = ActiveDocument.Range
rEverything.TextRetrievalMode.IncludeFieldCodes = True
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start) & "#" & vbCrLf
Next
End Sub
However, the output between the #s does not quite match up with the hyperlinks, and is more than a character or two out. So if the .Start and .End do not return char positions, what do they return?
This is a bit of a simplification but it's because rEverything counts everything before the hyperlink, then all the characters in the hyperlink field code (including 1 character for each of the opening and closing field code braces), then all the characters in the hyperlink field result, then all the characters after the field.
However, the character count in the range (e.g. rEverything.Characters.Count or len(rEverything)) only includes the field result if TextRetrievalMode.IncludeFieldCodes is set to False and only includes the field code if TextRetrievalMode.IncludeFieldCodes is set to True.
So the character count is always smaller than the range.End-range.Start.
In this case if you change your Debug expression to something like
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start - (rEverything.End - rEverything.Start - 1 - Len(rEverything))) & "#" & vbCrLf
you may see results more along the lines you expect.
Another way to visualise what is going on is as follows:
Create a very short document with a piece of text followed by a short hyperlink field with short result, followed by a piece of text. Put the following code in a module:
Sub Select1()
Dim i as long
With ActiveDocument
For i = .Range.Start to .Range.End
.Range(i,i).Select
Next
End With
End Sub
Insert a breakpoint on the "Next" line.
Then run the code once with the field codes displayed and once with the field results displayed. You should see the progress of the selection "pause" either at the beginning or the end of the field, as the Select keeps "selecting" something that you cannot actually see.
Range.Start returns the character position from the beginning of the document to the start of the range; Range.End to the end of the range.
BUT everything visible as characters are not the only things that get counted, and therein lies the problem.
Examples of "hidden" things that are counted, but not visible:
"control characters" associated with content controls
"control characters" associated with fields (which also means hyperlinks), which can be seen if field result is toggled to field code display using Alt+F9
table structures (ANSI 07 and ANSI 13)
text with the font formatting "hidden"
For this reason, using Range.Start and Range.End to get a "real" position in the document is neither reliable nor recommended. The properties are useful, for example, to set the position of one range relative to the position of another.
You can get a somewhat more accurate result using the Range.TextRetrievalMode boolean properties IncludeHiddenText and IncludeFieldCodes. But these don't affect the structural elements involved with content controls and tables.
Thank you both so much for pointing out this approach was doomed but that I could still use .Start/.End for relative positions. What I was ultimately trying to do was turn a passed paragraph into HTML, with the hyperlinks.
I'll post what worked here in case anyone else has a use for it.
Function ExtractHyperlinks(rParagraph As Range) As String
Dim rHyperlink As Range
Dim wdHyperlink As Hyperlink
Dim iCaretHold As Integer, iCaretMove As Integer, rCaret As Range
Dim s As String
iCaretHold = 1
iCaretMove = 1
For Each wdHyperlink In rParagraph.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Do
Set rCaret = ActiveDocument.Range(rParagraph.Characters(iCaretMove).Start, rParagraph.Characters(iCaretMove).End)
If RangeContains(rHyperlink, rCaret) Then
s = s & Mid(rParagraph.Text, iCaretHold, iCaretMove - iCaretHold) & "" & IIf(wdHyperlink.TextToDisplay <> "", wdHyperlink.TextToDisplay, wdHyperlink.Address) & ""
iCaretHold = iCaretMove + Len(wdHyperlink.TextToDisplay)
iCaretMove = iCaretHold
Exit Do
Else
iCaretMove = iCaretMove + 1
End If
Loop Until iCaretMove > Len(rParagraph.Text)
Next
If iCaretMove < Len(rParagraph.Text) Then
s = s & Mid(rParagraph.Text, iCaretMove)
End If
ExtractHyperlinks = "<p>" & s & "</p>"
End Function
Function RangeContains(rParent As Range, rChild As Range) As Boolean
If rChild.Start >= rParent.Start And rChild.End <= rParent.End Then
RangeContains = True
Else
RangeContains = False
End If
End Function
Say that I have the following code to write headings from an array to a word document and to apply defined styles:
With wdDoc
Set wrdRange = .Range(0, 0) ' Set initial Range.
i = 2
Do Until i > 6
' Debug.Print wrdRange.Start, wrdRange.End
wrdRange.text = totalArray(i, colIndex(3)) & Chr(11)
Set wrdRange = .Paragraphs(i - 1).Range
wrdRange.Style = totalArray(i, colIndex(2))
wrdRange.Collapse 0
i = i + 1
Loop
End With
One would expect the following to occur:
The word range moves programmatically as I move through the document.
The word style is updated for the new range (defined by the set statement)
The Range collapses to the end (0 = wdCollapseEnd) and the loop continues until the initial conditions are satisfied.
What I can't seem to fix is the styles being applied to ALL existing paragraphs in the document. The Debug.Print statement should show the range being updated as expected, despite the fact that the style applies to all existing paragraphs.
As you can tell, I've toyed around with this quite a bit, to no avail. Any help would be appreciated in this matter.
Thanks.
In the following line of code:
wrdRange.text = totalArray(i, colIndex(3)) & Chr(11)
Use Chr(13) instead of Chr(11). The latter is simply a line break, not a new paragraph. So applying a style to any part of the Range is actually applying it to all the text your code is generating because it's a single paragraph.
I have a 400+ page coding manual I use, and unfortunately turned off the green for all the comments in the manual. I can't undo it, as I hadn't noticed it until it was too late. Its ruined years of work.
How would I write VBA to parse the document finding sentences starting with // and ending in a Paragraph mark and change the color of them? Or assign a style to them?
Here is a start that I have cobbled together, I admire people who can write code without intellisence, its like trying to find your way blindfolded
Dim oPara As Word.Paragraph
Dim rng As Range
Dim text As String
For Each oPara In ActiveDocument.Paragraphs
If Len(oPara.Range.text) > 1 Then
Set rng = ActiveDocument.Range(oPara.Range.Start,oPara.Range.End)
With rng.Font
.Font.Color = wdColorBlue
End With
End If
Next
End Sub
The following seems to work:
Dim oPara As Word.Paragraph
Dim text As String
For Each oPara In ActiveDocument.Paragraphs
text = oPara.Range.text
'Check the left 2 characters for //
If Left(oPara.Range.text, 2) = "//" Then
oPara.Range.text = "'" & text
End If
Next
I assume you are using VBA so by placing a ' in front of // it will turn the line green. You could modify the code to replace // with ' if desired. The opera.range.text should grab the entire paragraph.
How can I align a paragraph with just the text portion of a numbered heading? e.g:
1.1.2 This Is A Numbered Heading
This is the aligned text I'm trying to achieve
This is aligned to the numbers not the text
2.4 This Is Another Example
This is where the text should be
I'm aware of the CharacterUnitLeftIndent, CharacterUnitFirstLineIndent, FirstLineIndent etc properties but after a few hours experimentation & searching online can't figure out how to achieve this programmatically. I know how to test for the heading style and how to refer to the following paragraph so just need to know how to get the indent right.
To use a macro to accomplish this, you have to check each paragraph in your document and check to see if it is a "Header" style. If so, then pick off the value of the first tab stop to set as the indent for the subsequent paragraphs.
UPDATE1: the earlier version of the code below set the paragraphs to the Document level first tab stop, and did not accurately grab the tabstop set for the Heading styles. The code update below accurately determines each Heading indent tab stop.
UPDATE2: the sample text original I used in shown in this first document:
The code that automatically performs a first line indent to the tab level of the preceding heading is the original Sub from the first example:
Option Explicit
Sub SetParaIndents1()
Dim myDoc As Document
Set myDoc = ActiveDocument
Dim para As Paragraph
Dim firstIndent As Double 'value in "points"
For Each para In myDoc.Paragraphs
If para.Style Like "Heading*" Then
firstIndent = myDoc.Styles(para.Style).ParagraphFormat.LeftIndent
Debug.Print para.Style & " first tab stop at " & _
firstIndent & " points"
Else
Debug.Print "paragraph first line indent set from " & _
para.FirstLineIndent & " to " & _
firstIndent
para.FirstLineIndent = firstIndent
End If
Next para
'--- needed to show the changes just made
Application.ScreenRefresh
End Sub
And the results looks like this (red lines added manually to show alignment):
If you want the entire paragraph indented in alignment with the heading style, the code is modified to this:
Option Explicit
Sub SetParaIndents2()
Dim myDoc As Document
Set myDoc = ActiveDocument
Dim para As Paragraph
For Each para In myDoc.Paragraphs
If para.Style Like "Heading*" Then
'--- do nothing
Else
para.Indent
End If
Next para
'--- needed to show the changes just made
Application.ScreenRefresh
End Sub
And the resulting text looks like this:
I am using this code to count number of bullets in word documents. But its always returning zero.
Sub FindBullet()
Dim oPara As Word.Paragraph
Dim count As Integer
count = 0
'Select Entire document
Selection.WholeStory
With Selection
For Each oPara In .Paragraphs
If oPara.Range.ListFormat.ListType = WdListType.wdListBullet Then
count = count + 1
End If
Next
End With
'Gives the count of bullets in a document
MsgBox count
End Sub
Well, I just opened up a blank Word doc, popped the code in and ran it, and it is returning however many bullets I put in. I did not modify your code...
Could your bullets really be something else, like a numbered list or some other symbol?