Distinguishing Table of Contents in Word document - vba

Does anyone know how when programmatically iterating through a word document, you can tell if a paragraph forms part of a table of contents (or indeed, anything else that forms part of a field).
My reason for asking is that I have a VB program that is supposed to extract the first couple of paragraphs of substantive text from a document - it's doing so by iterating through the Word.Paragraphs collection. I don't want the results to include tables of contents or other fields, I only want stuff that a human being would recognize as a header, title or a normal text paragraph. However it turns out that if there's a table of contents, then not only the table of contents itself but EVERY line in the table of contents appears as a separate item in Word.Paragraphs. I don't want these but haven't been able to find any property on the Paragraph object that would allow me to distinguish and so ignore them (I'm guessing I need the solution to apply to other field types too, like table of figures and table of authorities, which I haven't yet actually encountered but I guess potentially would cause the same problem)

Because of the limitations in the Word object model I think the best way to achieve this would be to temporarily remove the TOC field code, iterate through the Word document, and then re-insert the TOC. In VBA, it would look like this:
Dim doc As Document
Dim fld As Field
Dim rng As Range
Set doc = ActiveDocument
For Each fld In doc.Fields
If fld.Type = wdFieldTOC Then
fld.Select
Selection.Collapse
Set rng = Selection.Range 'capture place to re-insert TOC later
fld.Cut
End If
Next
Iterate through the code to extract paragraphs and then
Selection.Range = rng
Selection.Paste
If you are coding in .NET this should translate pretty closely. Also, this should work for Word 2003 and earlier as is, but for Word 2007/2010 the TOC, depending on how it is created, sometimes has a Content Control-like region surrounding it that may require you to write additional detect and remove code.

This is not guaranteed, but if the standard Word styles are being used for the TOC (highly likely), and if no one has added their own style prefixed with "TOC", then it is OK. This is a crude approach, but workable.
Dim parCurrentParagraph As Paragraph
If Left(parCurrentParagraph.Format.Style.NameLocal, 3) = "TOC" Then
' Do something
End If

What you could do is create a custom style for each section of your document.
Custom styles in Word 2003 (not sure which version of Word you're using)
Then, when iterating through your paragraph collection you can check the .Style property and safely ignore it if it equals your TOCStyle.
I believe the same technique would work fine for Tables as well.

The following Function will return a Range object that begins after any Table of Contents or Table of Figures. You can then use the Paragraphs property of the returned Range:
Private Function GetMainTextRange() As Range
Dim toc As TableOfContents
Dim tof As TableOfFigures
Dim mainTextStart As Long
mainTextStart = 1
For Each toc In ActiveDocument.TablesOfContents
If toc.Range.End > mainTextStart Then
mainTextStart = toc.Range.End + 1
End If
Next
For Each tof In ActiveDocument.TablesOfFigures
If tof.Range.End > mainTextStart Then
mainTextStart = tof.Range.End + 1
End If
Next
Set GetMainTextRange = ActiveDocument.Range(mainTextStart, ActiveDocument.Range.End)
End Function

Related

Batch add formatted autocorrects with VBA in Word

I use a long Excel spreadsheet containing incorrect and correct terms to check consistency between documents (e.g. anti-citrullinated is always hyphenated). I've added quite a few of these as autocorrect entries via the AutoCorrect Options feature in Word but it's time-consuming .
I came across the following code that will add long lists of autocorrects.
Sub BatchAddAutoCorrectEntries()
Dim objTable As Table
Dim objOriginalWord As Cell
Dim objOriginalWordRange As Range
Dim objReplaceWordRange As Range
Dim nRowNumber As Integer
Set objTable = ActiveDocument.Tables(1)
nRowNumber = 1
For Each objOriginalWord In objTable.Columns(1).Cells
Set objOriginalWordRange = objOriginalWord.Range
objOriginalWordRange.MoveEnd Unit:=wdCharacter, Count:=-1
Set objReplaceWordRange = objTable.Cell(nRowNumber, 2).Range
objReplaceWordRange.MoveEnd Unit:=wdCharacter, Count:=-1
AutoCorrect.Entries.Add Name:=objOriginalWordRange.Text, Value:=objReplaceWordRange.Text
nRowNumber = nRowNumber + 1
Next objOriginalWord
MsgBox ("All autocorrect items in the table1 are added.")
End Sub
It doesn't preserve any formatting: super- or subscripts, etc. Formatting autocorrect entries are stored in the Normal.dotm file and not in the regular .acl file so I haven't been able to figure out a way around this.
In a similar post, someone suggested a Find and Replace macro but Find and Replace doesn't allow me to replace with super- or subscripts.
There are two methods of adding Auto Correct Entries, Add and AddRichText. It is this second one that you use for formatted entries.
When faced with an issue like this my first resort is to check the Object Brower in the VBA editor (press F2 to display) to see what methods and properties may be available. My next step is to look them up in the VBA technical reference, aka Help, to check the usage.
If the problem is just sub/superscribt, then you could use uni-codes. Those are also available in autocorrect. Fx writing the unicodes ₁₂₃₄₅₆₇₈₉ instead of using formating on a normal 2. Most (but not all) characters exist in super and sub unicode.
The program is not working. It is giving an error message
Compile Error Expected Function or Variable
It is showing the following line as error
Autocorrect.Entries.Add Name:=objOriginalWordRange.Text, Value:=objReplaceWordRange.Text

How to bold only a portion of Word paragraph in Visual Studio (VB)

I am trying to export a Word document from a Visual Basic program. Different parts of the document will need different formatting.
I have several paragraphs, and I need to bold only portions of each of those paragraphs. I am trying to set the range within each paragraph that needs to be bolded, but no matter what I do, it only seems to want to to bold the entire paragraph.
I want to do something like this:
Dim Para1 As Word.Paragraph
Para1 = WordDoc.Content.Paragraphs.Add
Para1.Range.Start = 1
Para1.Range.End = 14
Para1.Range.Font.Bold = True
Para1.Range.Text = "Job number is: " + myJobID
... so that it bolds from the 'J' to the ':' (in Para1.Range.Text) but does not bold the myJobID (which is a variable I'm getting from the user). However, no matter what I do, it bolds the entire paragraph, including the myJobID.
I've also tried creating a Range variable that sets a range based on the entire document, but the problem with that is, the lengths of several variables I'm outputting on the Word document are going to be varying sizes, and thus there's no way to know where the start of the next section I want to bold will start at. So basically, I have to work within the Paragraph object rather than iterating through all of the characters in the entire document.
Hope that made sense. Any ideas?
In order to format individual text runs it's necessary to break the text down into individual runs when inserting. Also, it's best to work with an independent Range object. Between formatting commands the Range needs to be "collapsed" - think of it like pressing the right (or left) arrow of a selection to make it a blinking cursor. Something along these lines
Dim Para1 As Word.Paragraph
Dim rng as Word.Range
Para1 = WordDoc.Content.Paragraphs.Add
rng = Para1.Range
rng.Text = "Job number is: "
rng.Font.Bold = True
rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd)
rng.Text = myJobID
rng.Font.Bold = False
rng.Collapse Word.WdCollapseDirection.wdCollapseEnd
If it's really necessary to insert the full text in one go, then Find/Replace to locate the text that should be formatted differently is one way to format after-the-fact, although less efficient.
Another possibility is to use string manipulation functions, such as Instr (or Contains), Left, Mid etc. to determine where in a longer string the substring is located. Then Range.Start and Range.End can work with those values. But generally it's better to not rely on the start and end values since Word can insert non-visible characters that can throw this numbering off.
Create another Range object that only covers the characters that you want to bold.
The code below is not tested (don't have full VS set up on this machine), but should give you an idea:
Dim para1 As Word.Paragraph
Dim textToBeBolded As Word.Range
para1 = WordDoc.Content.Paragraphs.Add 'ThisDocument.Paragraphs.Add in VBA
para1.Range.Text = "Job number is: " + myJobID
para1.Range.SetRange 1, 14
textToBeBolded = para1.Range
textToBeBolded.SetRange 1, 14
textToBeBolded.Font.Bold = True

Converting Headings to Bookmarks

I'm using Office 2016. I'd like to make a macro that loops through each heading in a document, and then creates a bookmark at the heading's location using the heading text (modified as necessary) as the bookmark name. Most of the headings are in X.X.X.X format, such as "3.3.4.1. sometexthere".
I'm still a beginner using VBA, but after a lot of googling I managed to adapt some Frankenstein code that almost works:
Sub HeadingsToBookmarks()
Dim heading As Range
Set heading = ActiveDocument.Range(Start:=0, End:=0)
Do
Dim current As Long
current = heading.Start
Set heading = heading.GoTo(What:=wdGoToHeading, Which:=wdGoToNext)
If heading.Start = current Then
Exit Do
End If
ActiveDocument.Bookmarks.Add MakeValidBMName(heading.Paragraphs(1).Range.Text), Range:=heading.Paragraphs(1).Range
Loop
End Sub
Function MakeValidBMName(strIn As String)
Dim pFirstChr As String
Dim i As Long
Dim tempStr As String
strIn = Trim(strIn)
pFirstChr = Left(strIn, 1)
If Not pFirstChr Like "[A-Za-z]" Then
strIn = "Section_" & strIn
End If
For i = 1 To Len(strIn)
Select Case Asc(Mid$(strIn, i, 1))
Case 49 To 58, 65 To 90, 97 To 122
tempStr = tempStr & Mid$(strIn, i, 1)
Case Else
tempStr = tempStr & "_"
End Select
Next i
tempStr = Replace(tempStr, " ", " ")
tempStr = Replace(tempStr, ":", "")
If Right(tempStr, 1) = "_" Then
tempStr = Left(tempStr, Len(tempStr) - 1)
End If
MakeValidBMName = tempStr
End Function
This code almost works, and makes appropriate bookmarks at some of the headings, but not all. Can anyone help me figure out what I need to fix here, or have other recommendations on how else I can clean up this code?
Edit: More information: The code above converts the first 5 or so headings in the document I've been testing it on, along with a few others scattered around. The second half of the code, which does the actual conversion, seems to work fine- the problem is located in the section that loops through each heading. The second half converts unusable characters to those that work with the requirements for bookmark names, and adds "Section_" to the beginning of bookmarks / headings that start with numbers (as bookmarks aren't allowed to start with numbers).
My goal is to be able to hyperlink to all sections within the document that has headings from a different word document. The standard Table of Contents creator allows only for links to be built within the same document, as far as I can tell. I'm aware that when word saves to PDF, it can convert headings to bookmarks; I would like to be able to do the same thing but retain the document in word format.
I unfortunately can't use the built in numbering. I'm working with documents that are already created and have a set and specific format.
You haven't described why you want bookmarks, or how a future user of the document would use/access the bookmarks.
MS Word has a number of built in features that act as bookmarks. The best way to do this is to use Styles. The built-in heading styles allow for some native navigation functionality (Word's own hidden bookmarks). Also, don't re-invent the wheel - use built in numbering.
This requires some document discipline. Use headings only for headings, and body text for the non-heading text.
The benefits make the discipline worth it. You can easily create tables of contents that use the headings (or even some of your custom styles), and headings show up in the navigation pane. When you save to PDF, you can use the headings as bookmarks in the PDF (show up on the Reader navigation bar).
Note that what I have described doesn't even touch VBA.
If you use set styles for your headings and you want to do a little more than what you can do natively, then you can simply:
Loop through all paragraphs in the document
See if that paragraph is set to your heading style
Place a bookmark (valid bookmark name!) over that paragraph
I have left the actual coding to you, but I think you will find it easy to do based on the pseudo code above. My pseudo code loop is not the only way to find the paragraphs, but it is the easiest to visualise.
Once you use the simplified method above and built-in numbering, you should find that you can modify your ValidBMName function - simplifying it. But, as noted and depending on why you want bookmarks, you may be able to avoid VBA altogether.
This code works for me:
Sub HeadingsToBookmarks()
Dim heading As Range
Set heading = ActiveDocument.Range(Start:=0, End:=0)
Do
Dim current As Long
current = heading.Start
Set heading = heading.GoTo(What:=wdGoToHeading, Which:=wdGoToNext)
If heading.Start = current Then
Exit Do
End If
'This is the part I changed: ListFormat.ListString
ActiveDocument.Bookmarks.Add MakeValidBMName(heading.Paragraphs(1).Range.ListFormat.ListString), Range:=heading.Paragraphs(1).Range
Loop
End Sub

Insert text after numbers and before words in a Word hierarchical heading

I am working my way through two books (Roman's Writing Word Macros, Mansfield's Mastering VBA for MS Office). In my work environment, I use both Word 2007 and Word 2010.
My issue is that I want to use VBA to insert a very brief amount of standardized text before the English-language string in my numbered hierarchical headings. For instance, I have:
1.1.1 The Quick Brown Fox.
What I want is:
1.1.1 (XXxx) The Quick Brown Fox.
I guess my most basic issue is that I don't know how to approach the situation. I have hierarchical headings yet I don't know how to say, in effect, "Go to each hierarchical heading regardless of level. Insert yourself in front of the first English language word of the heading. Paste the text "XXxx" in front of the first word in the heading. Go on to the next heading and all remaining headings and do the same. My document is over 700 pages and has hundreds of hierarchical headings.
I see that paragraphs are objects and that hierarchical headings are paragraphs. However, I can't see any way to make VBA recognize what I am talking about. I haven't been able to use Selection approaches successfully. I've tried using the Range approach but just have not been able to phrase the VBA code intelligently. I haven't been able to specify a range that includes all and only the hierarchical headings and, especially, I don't understand how to get the insertion to happen in front of the first English-language word of the heading.
I have just begun to look at using Bookmarks. However, don't bookmarks require me to go to every heading and enter them? I may as well just paste my content if that is the case. I'm stumped. It is interesting that in no way, as might have been expected, does this appear to be a simple matter
Assuming you are using Word's outline levels (I think this is what you mean by hierarchical headings), you can check a paragraph for this state. For example, assuming I have a paragraph in my document that has the Heading 1 style applied to it:
Sub PrintHeadings()
Dim objDoc as Document
Dim objPara as Paragraph
Set objDoc = ActiveDocument
For each objPara in objDoc.Content.Paragraphs
If objPara.OutlineLevel <> wdOutlineLevelBodyText then
Debug.Print objPara.Range.Text
End If
Next objPara
End Sub
This code would print the contents of any paragraph that has an outline level above body text to the VBA Immediate Window. There are other approaches as well; you could use Find and Replace to search for each of the Outline Levels. This gives you a bit less control; you'd want your change to be something that could be encapsulated in a Word Find and Replace. But, it would be faster if you have a long document and not too many heading levels. A basic example:
Sub UnderlineHeadings()
Dim objDoc as Document
Set objDoc = ActiveDocument
With objDoc.Content.Find
.ClearFormatting
.ParagraphFormat.OutlineLevel = wdOutlineLevel1
With .Replacement
.ClearFormatting
.Font.Underline = wdUnderlineSingle
End With
.Execute Forward:=True, Wrap:=wdFindContinue, Format:=True, Replace:=wdReplaceAll
End With
End Sub
That would underline all of your text of Outline Level 1.
Perhaps that will get you started.
I asked this question some months ago: "My issue is that I want to use VBA to insert a very brief amount of standardized text before the English-language string in my numbered hierarchical headings." By "numbered hierarchical headings" I meant Word multilevel lists. The answers I received were appreciated but did not respond effectively to my question or guide me to a resolution. I pass this along in the hope it may be of use to others.
First, the "number" part of the Word heading is irrelevant. In writing your code, there is NO need to think of a "number" portion and a "text" portion of the heading. I was afraid that any text I was trying to insert would be inserted BEFORE the multilevel numbering rather than BEFORE the English language text. The multilevel numbering is apparently automatically ignored. Below are two solutions that worked.
This first macro succeeded in producing the desired result: 1.1.1 (FOUO). I used this macro to create individual macros for each order of heading. I haven't learned how to combine them all into one macro. But they work individually (but not without the flaw of taking too much time ~5 to 10 minutes for a complex, paragraph-heavy 670 page document).
Public Sub InsertFOUOH1()
Dim doc As Document
Dim para As Paragraph
Dim paraNext As Paragraph
Dim MyText As String
Dim H1 As HeadingStyle
Set doc = ActiveDocument
Set para = doc.Paragraphs.First
Do While Not para Is Nothing
Set paraNext = para.Next
MyText = "(U//FOUO) "
If para.Style = doc.Styles(wdStyleHeading1) Then
para.Range.InsertBefore (MyText)
End If
Set para = paraNext
Loop
End Sub
THIS WORKS ON ALL FIRST ORDER HEADINGS (1, 2, 3 ETC.)
I used the macro below to add my security marking all body paragraphs:
Public Sub InsertFOUObody()
'Inserts U//FOUO before all body paragraphs
Dim doc As Document
Dim para As Paragraph
Dim paraNext As Paragraph
Dim MyText As String
Set doc = ActiveDocument
Set para = doc.Paragraphs.First
Do While Not para Is Nothing
Set paraNext = para.Next
MyText = "(U//FOUO) "
If para.Style = doc.Styles(wdStyleBodyText) Then
para.Range.InsertBefore (MyText)
End If
Set para = paraNext
Loop
End Sub
These macros are running slowly and, at the end, generating Error 28 Out of stack space errors. However the error is displayed at the end of running the macros and after the macros have successfully performed their work.

Get all cross references in word with VBA

I have quite a large word document (> 400 pages) with lots of cross references to headings. So far, I have always referred to the title of the heading, but now I would like to change that and refer to the page the heading resides on.
I didn't find a solution to this via the GUI (except manual treatment, of course), so I was looking into writing some VBA. Unfortunately, I have only found a way to list all targets that can be cross referenced (via GetCrossReferenceItems), but I need a way to access the actual cross reference field.
Can you help me with that? Is a cross reference field the same as a hyperlink?
Cross-references are fields in a Word document, and can be accessed via the Fields collection (ActiveDocument.Fields). You can loop through them like any other collection and check their types to see if it's one you want to work on. It looks like cross-references to text are type 3 (wdFieldRef) and cross-references to page numbers are type 37 (wdFieldPageRef). Changing fields can be a little tricky; the following should get you started:
Sub ChangeFields()
Dim objDoc As Document
Dim objFld As Field
Dim sFldStr As String
Dim i As Long, lFldStart As Long
Set objDoc = ActiveDocument
' Loop through fields in the ActiveDocument
For Each objFld In objDoc.Fields
' If the field is a cross-ref, do something to it.
If objFld.Type = wdFieldRef Then
'Make sure the code of the field is visible. You could also just toggle this manually before running the macro.
objFld.ShowCodes = True
'I hate using Selection here, but it's probably the most straightforward way to do this. Select the field, find its start, and then move the cursor over so that it sits right before the 'R' in REF.
objFld.Select
Selection.Collapse wdCollapseStart
Selection.MoveStartUntil "R"
'Type 'PAGE' to turn 'REF' into 'PAGEREF'. This turns a text reference into a page number reference.
Selection.TypeText "PAGE"
'Update the field so the change is reflected in the document.
objFld.Update
objFld.ShowCodes = True
End If
Next objFld
End Sub