Converting Headings to Bookmarks - vba

I'm using Office 2016. I'd like to make a macro that loops through each heading in a document, and then creates a bookmark at the heading's location using the heading text (modified as necessary) as the bookmark name. Most of the headings are in X.X.X.X format, such as "3.3.4.1. sometexthere".
I'm still a beginner using VBA, but after a lot of googling I managed to adapt some Frankenstein code that almost works:
Sub HeadingsToBookmarks()
Dim heading As Range
Set heading = ActiveDocument.Range(Start:=0, End:=0)
Do
Dim current As Long
current = heading.Start
Set heading = heading.GoTo(What:=wdGoToHeading, Which:=wdGoToNext)
If heading.Start = current Then
Exit Do
End If
ActiveDocument.Bookmarks.Add MakeValidBMName(heading.Paragraphs(1).Range.Text), Range:=heading.Paragraphs(1).Range
Loop
End Sub
Function MakeValidBMName(strIn As String)
Dim pFirstChr As String
Dim i As Long
Dim tempStr As String
strIn = Trim(strIn)
pFirstChr = Left(strIn, 1)
If Not pFirstChr Like "[A-Za-z]" Then
strIn = "Section_" & strIn
End If
For i = 1 To Len(strIn)
Select Case Asc(Mid$(strIn, i, 1))
Case 49 To 58, 65 To 90, 97 To 122
tempStr = tempStr & Mid$(strIn, i, 1)
Case Else
tempStr = tempStr & "_"
End Select
Next i
tempStr = Replace(tempStr, " ", " ")
tempStr = Replace(tempStr, ":", "")
If Right(tempStr, 1) = "_" Then
tempStr = Left(tempStr, Len(tempStr) - 1)
End If
MakeValidBMName = tempStr
End Function
This code almost works, and makes appropriate bookmarks at some of the headings, but not all. Can anyone help me figure out what I need to fix here, or have other recommendations on how else I can clean up this code?
Edit: More information: The code above converts the first 5 or so headings in the document I've been testing it on, along with a few others scattered around. The second half of the code, which does the actual conversion, seems to work fine- the problem is located in the section that loops through each heading. The second half converts unusable characters to those that work with the requirements for bookmark names, and adds "Section_" to the beginning of bookmarks / headings that start with numbers (as bookmarks aren't allowed to start with numbers).
My goal is to be able to hyperlink to all sections within the document that has headings from a different word document. The standard Table of Contents creator allows only for links to be built within the same document, as far as I can tell. I'm aware that when word saves to PDF, it can convert headings to bookmarks; I would like to be able to do the same thing but retain the document in word format.
I unfortunately can't use the built in numbering. I'm working with documents that are already created and have a set and specific format.

You haven't described why you want bookmarks, or how a future user of the document would use/access the bookmarks.
MS Word has a number of built in features that act as bookmarks. The best way to do this is to use Styles. The built-in heading styles allow for some native navigation functionality (Word's own hidden bookmarks). Also, don't re-invent the wheel - use built in numbering.
This requires some document discipline. Use headings only for headings, and body text for the non-heading text.
The benefits make the discipline worth it. You can easily create tables of contents that use the headings (or even some of your custom styles), and headings show up in the navigation pane. When you save to PDF, you can use the headings as bookmarks in the PDF (show up on the Reader navigation bar).
Note that what I have described doesn't even touch VBA.
If you use set styles for your headings and you want to do a little more than what you can do natively, then you can simply:
Loop through all paragraphs in the document
See if that paragraph is set to your heading style
Place a bookmark (valid bookmark name!) over that paragraph
I have left the actual coding to you, but I think you will find it easy to do based on the pseudo code above. My pseudo code loop is not the only way to find the paragraphs, but it is the easiest to visualise.
Once you use the simplified method above and built-in numbering, you should find that you can modify your ValidBMName function - simplifying it. But, as noted and depending on why you want bookmarks, you may be able to avoid VBA altogether.

This code works for me:
Sub HeadingsToBookmarks()
Dim heading As Range
Set heading = ActiveDocument.Range(Start:=0, End:=0)
Do
Dim current As Long
current = heading.Start
Set heading = heading.GoTo(What:=wdGoToHeading, Which:=wdGoToNext)
If heading.Start = current Then
Exit Do
End If
'This is the part I changed: ListFormat.ListString
ActiveDocument.Bookmarks.Add MakeValidBMName(heading.Paragraphs(1).Range.ListFormat.ListString), Range:=heading.Paragraphs(1).Range
Loop
End Sub

Related

How to bold only a portion of Word paragraph in Visual Studio (VB)

I am trying to export a Word document from a Visual Basic program. Different parts of the document will need different formatting.
I have several paragraphs, and I need to bold only portions of each of those paragraphs. I am trying to set the range within each paragraph that needs to be bolded, but no matter what I do, it only seems to want to to bold the entire paragraph.
I want to do something like this:
Dim Para1 As Word.Paragraph
Para1 = WordDoc.Content.Paragraphs.Add
Para1.Range.Start = 1
Para1.Range.End = 14
Para1.Range.Font.Bold = True
Para1.Range.Text = "Job number is: " + myJobID
... so that it bolds from the 'J' to the ':' (in Para1.Range.Text) but does not bold the myJobID (which is a variable I'm getting from the user). However, no matter what I do, it bolds the entire paragraph, including the myJobID.
I've also tried creating a Range variable that sets a range based on the entire document, but the problem with that is, the lengths of several variables I'm outputting on the Word document are going to be varying sizes, and thus there's no way to know where the start of the next section I want to bold will start at. So basically, I have to work within the Paragraph object rather than iterating through all of the characters in the entire document.
Hope that made sense. Any ideas?
In order to format individual text runs it's necessary to break the text down into individual runs when inserting. Also, it's best to work with an independent Range object. Between formatting commands the Range needs to be "collapsed" - think of it like pressing the right (or left) arrow of a selection to make it a blinking cursor. Something along these lines
Dim Para1 As Word.Paragraph
Dim rng as Word.Range
Para1 = WordDoc.Content.Paragraphs.Add
rng = Para1.Range
rng.Text = "Job number is: "
rng.Font.Bold = True
rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd)
rng.Text = myJobID
rng.Font.Bold = False
rng.Collapse Word.WdCollapseDirection.wdCollapseEnd
If it's really necessary to insert the full text in one go, then Find/Replace to locate the text that should be formatted differently is one way to format after-the-fact, although less efficient.
Another possibility is to use string manipulation functions, such as Instr (or Contains), Left, Mid etc. to determine where in a longer string the substring is located. Then Range.Start and Range.End can work with those values. But generally it's better to not rely on the start and end values since Word can insert non-visible characters that can throw this numbering off.
Create another Range object that only covers the characters that you want to bold.
The code below is not tested (don't have full VS set up on this machine), but should give you an idea:
Dim para1 As Word.Paragraph
Dim textToBeBolded As Word.Range
para1 = WordDoc.Content.Paragraphs.Add 'ThisDocument.Paragraphs.Add in VBA
para1.Range.Text = "Job number is: " + myJobID
para1.Range.SetRange 1, 14
textToBeBolded = para1.Range
textToBeBolded.SetRange 1, 14
textToBeBolded.Font.Bold = True

How to change filename using macro?

Now I have a document and i want to save it but the filename should be iterative
eg. 1.docx then if i run the macro it should save as 2.docx and so on.
can I take a word from the document as the variable for the filename
eg. I want to use the 45th word in my document as the filename
So how can i do it?
I don't have word open at the moment, so cannot code specific examples. What follows is hints on how to find your answer.
In word, you can select a range of text. Within these ranges are a number of built-in collections. "Paragraphs" and "Words" are some of the examples.
The easiest way to address your question is to select the Document Range, and then iterate through the Words collection until you get to the 45th word.
However, MSDN notes that some non-words are also kept in this collection, so you may want to filter as you iterate. (as an aside, be careful with their example, deleting items in a for-each loop has consequences)
Assuming "rngWords" is your selected document range (words collection), you could would be similar to (not true code):
Counter = 0
selectedWord = ""
for iterator = 1 to rngWords.Count
loopWord = rngWords(iterator)
if trueword then Counter = Counter + 1
if counter = 45 then ' magic number = 45 based on your post.
selectedword = loopWord
exit loop
end if
next iterator
You can experiment (e.g. debug.print loopWord) to work out how to determine a true word. You will also have to address what happens if you don't have 45 words in your document - e.g. you could take the last word. Probably the best way is to check the count first (if rngWords.Count < 45 then .... else (loop))
To retrieve the no. from from document, use the below link
Getting char from string at specified index in the visual basic
and for saving the file
Sub ExampleToSaveWorkbook()
Workbooks.Add
'Saving the Workbook
ActiveWorkbook.SaveAs "C:\WorkbookName.xls"
'OR
'ActiveWorkbook.SaveAs Filename:="C:\WorkbookName1.xls"
End Sub

Automating Mail Merge

I need to dynamically generate word documents using text from an Access database table. The caveat here is that some of the text coming from the database will need to be modified into MergeFields. I am currently using Interop.Word (VBA Macro) and VB.NET to generate the document.
My steps so far look like this:
Pull standard .docx Template
Fill In template with pre-defined filler text from table
Add MergeFields by replacing pieces of the filler text with actual mergefields
Attach Datasource and execute Mail Merge
After testing, I noticed that I cannot simply store MergeFields into the access database as a string, they do not feed over into the actual document. What I am thinking then is creating a table of known strings that should be replaced with MergeFields by the coding.
For Example:
Step 2 will insert "After #INACTIVE_DATE# your account will no longer be active." which will be in the database text.
Step 3 will find/replace #INACTIVE_DATE# with a MergeField «INACTIVE_DATE». I am using Word Interop to do this, so theoretically I can loop through the document.
I wasnt able to do a "Find And Replace" from text to MergeField, so how should I go about implementing this?
Tagging VBA additionally as I am seeking a "VBA" style answer (Word Interop).
You've left out quite a lot of detail, so I'll go about answering this in somewhat general terms.
What you want to do is definitely achievable. Two possible solutions immediately come to mind:
Replacing ranges using Find
Inserting tokens using TypeText
Replacing ranges using Find
Assuming the text has already been inserted, you can search the document for the given pattern and replace it with a merge field. For instance:
Sub FindInsertMerge()
Dim rng As Range
Set rng = ActiveDocument.Range
With rng.Find
.Text = "(\#*\#)"
.MatchWildcards = True
.Execute
If .Found Then
ActiveDocument.MailMerge.Fields.Add rng, Mid(rng.Text, 2, Len(rng.Text) - 2)
End If
End With
End Sub
Will find the first occurence of text starting with #, matches any string and ends with #. The contents of the found range will then be replaced by a merge field. The code above can easily be extended to loop for all fields.
Inserting tokens using TypeText
While I would normally advice against using Selection to insert data, this solution makes things simple. Say you have a target range, rng, you tokenize the database text to be inserted, select the range, start typing and whenever a designated mail merge field is found, insert a field instead of the text.
For instance:
Private Sub InsertMergeText(rng As Range, txt As String)
Dim i As Integer
Dim t As String
Dim tokens() As String
tokens = Split(txt, " ")
rng.Select
For i = 0 To UBound(tokens)
t = tokens(i)
If Left(t, 1) = "#" And Right(t, 1) = "#" Then
'Insert field if it's a mail merge label.
ActiveDocument.MailMerge.Fields.Add Selection.Range, Mid(t, 2, Len(t) - 2)
Else
'Simply insert text.
Selection.TypeText t
End If
'Insert the whitespace we replaced earlier.
If i < UBound(tokens) Then Selection.TypeText " "
Next
End Sub
Call example:
InsertMergeText Selection.Range, "After #INACTIVE_DATE# your account will no longer be active which will be in the database text."

Insert text after numbers and before words in a Word hierarchical heading

I am working my way through two books (Roman's Writing Word Macros, Mansfield's Mastering VBA for MS Office). In my work environment, I use both Word 2007 and Word 2010.
My issue is that I want to use VBA to insert a very brief amount of standardized text before the English-language string in my numbered hierarchical headings. For instance, I have:
1.1.1 The Quick Brown Fox.
What I want is:
1.1.1 (XXxx) The Quick Brown Fox.
I guess my most basic issue is that I don't know how to approach the situation. I have hierarchical headings yet I don't know how to say, in effect, "Go to each hierarchical heading regardless of level. Insert yourself in front of the first English language word of the heading. Paste the text "XXxx" in front of the first word in the heading. Go on to the next heading and all remaining headings and do the same. My document is over 700 pages and has hundreds of hierarchical headings.
I see that paragraphs are objects and that hierarchical headings are paragraphs. However, I can't see any way to make VBA recognize what I am talking about. I haven't been able to use Selection approaches successfully. I've tried using the Range approach but just have not been able to phrase the VBA code intelligently. I haven't been able to specify a range that includes all and only the hierarchical headings and, especially, I don't understand how to get the insertion to happen in front of the first English-language word of the heading.
I have just begun to look at using Bookmarks. However, don't bookmarks require me to go to every heading and enter them? I may as well just paste my content if that is the case. I'm stumped. It is interesting that in no way, as might have been expected, does this appear to be a simple matter
Assuming you are using Word's outline levels (I think this is what you mean by hierarchical headings), you can check a paragraph for this state. For example, assuming I have a paragraph in my document that has the Heading 1 style applied to it:
Sub PrintHeadings()
Dim objDoc as Document
Dim objPara as Paragraph
Set objDoc = ActiveDocument
For each objPara in objDoc.Content.Paragraphs
If objPara.OutlineLevel <> wdOutlineLevelBodyText then
Debug.Print objPara.Range.Text
End If
Next objPara
End Sub
This code would print the contents of any paragraph that has an outline level above body text to the VBA Immediate Window. There are other approaches as well; you could use Find and Replace to search for each of the Outline Levels. This gives you a bit less control; you'd want your change to be something that could be encapsulated in a Word Find and Replace. But, it would be faster if you have a long document and not too many heading levels. A basic example:
Sub UnderlineHeadings()
Dim objDoc as Document
Set objDoc = ActiveDocument
With objDoc.Content.Find
.ClearFormatting
.ParagraphFormat.OutlineLevel = wdOutlineLevel1
With .Replacement
.ClearFormatting
.Font.Underline = wdUnderlineSingle
End With
.Execute Forward:=True, Wrap:=wdFindContinue, Format:=True, Replace:=wdReplaceAll
End With
End Sub
That would underline all of your text of Outline Level 1.
Perhaps that will get you started.
I asked this question some months ago: "My issue is that I want to use VBA to insert a very brief amount of standardized text before the English-language string in my numbered hierarchical headings." By "numbered hierarchical headings" I meant Word multilevel lists. The answers I received were appreciated but did not respond effectively to my question or guide me to a resolution. I pass this along in the hope it may be of use to others.
First, the "number" part of the Word heading is irrelevant. In writing your code, there is NO need to think of a "number" portion and a "text" portion of the heading. I was afraid that any text I was trying to insert would be inserted BEFORE the multilevel numbering rather than BEFORE the English language text. The multilevel numbering is apparently automatically ignored. Below are two solutions that worked.
This first macro succeeded in producing the desired result: 1.1.1 (FOUO). I used this macro to create individual macros for each order of heading. I haven't learned how to combine them all into one macro. But they work individually (but not without the flaw of taking too much time ~5 to 10 minutes for a complex, paragraph-heavy 670 page document).
Public Sub InsertFOUOH1()
Dim doc As Document
Dim para As Paragraph
Dim paraNext As Paragraph
Dim MyText As String
Dim H1 As HeadingStyle
Set doc = ActiveDocument
Set para = doc.Paragraphs.First
Do While Not para Is Nothing
Set paraNext = para.Next
MyText = "(U//FOUO) "
If para.Style = doc.Styles(wdStyleHeading1) Then
para.Range.InsertBefore (MyText)
End If
Set para = paraNext
Loop
End Sub
THIS WORKS ON ALL FIRST ORDER HEADINGS (1, 2, 3 ETC.)
I used the macro below to add my security marking all body paragraphs:
Public Sub InsertFOUObody()
'Inserts U//FOUO before all body paragraphs
Dim doc As Document
Dim para As Paragraph
Dim paraNext As Paragraph
Dim MyText As String
Set doc = ActiveDocument
Set para = doc.Paragraphs.First
Do While Not para Is Nothing
Set paraNext = para.Next
MyText = "(U//FOUO) "
If para.Style = doc.Styles(wdStyleBodyText) Then
para.Range.InsertBefore (MyText)
End If
Set para = paraNext
Loop
End Sub
These macros are running slowly and, at the end, generating Error 28 Out of stack space errors. However the error is displayed at the end of running the macros and after the macros have successfully performed their work.

Distinguishing Table of Contents in Word document

Does anyone know how when programmatically iterating through a word document, you can tell if a paragraph forms part of a table of contents (or indeed, anything else that forms part of a field).
My reason for asking is that I have a VB program that is supposed to extract the first couple of paragraphs of substantive text from a document - it's doing so by iterating through the Word.Paragraphs collection. I don't want the results to include tables of contents or other fields, I only want stuff that a human being would recognize as a header, title or a normal text paragraph. However it turns out that if there's a table of contents, then not only the table of contents itself but EVERY line in the table of contents appears as a separate item in Word.Paragraphs. I don't want these but haven't been able to find any property on the Paragraph object that would allow me to distinguish and so ignore them (I'm guessing I need the solution to apply to other field types too, like table of figures and table of authorities, which I haven't yet actually encountered but I guess potentially would cause the same problem)
Because of the limitations in the Word object model I think the best way to achieve this would be to temporarily remove the TOC field code, iterate through the Word document, and then re-insert the TOC. In VBA, it would look like this:
Dim doc As Document
Dim fld As Field
Dim rng As Range
Set doc = ActiveDocument
For Each fld In doc.Fields
If fld.Type = wdFieldTOC Then
fld.Select
Selection.Collapse
Set rng = Selection.Range 'capture place to re-insert TOC later
fld.Cut
End If
Next
Iterate through the code to extract paragraphs and then
Selection.Range = rng
Selection.Paste
If you are coding in .NET this should translate pretty closely. Also, this should work for Word 2003 and earlier as is, but for Word 2007/2010 the TOC, depending on how it is created, sometimes has a Content Control-like region surrounding it that may require you to write additional detect and remove code.
This is not guaranteed, but if the standard Word styles are being used for the TOC (highly likely), and if no one has added their own style prefixed with "TOC", then it is OK. This is a crude approach, but workable.
Dim parCurrentParagraph As Paragraph
If Left(parCurrentParagraph.Format.Style.NameLocal, 3) = "TOC" Then
' Do something
End If
What you could do is create a custom style for each section of your document.
Custom styles in Word 2003 (not sure which version of Word you're using)
Then, when iterating through your paragraph collection you can check the .Style property and safely ignore it if it equals your TOCStyle.
I believe the same technique would work fine for Tables as well.
The following Function will return a Range object that begins after any Table of Contents or Table of Figures. You can then use the Paragraphs property of the returned Range:
Private Function GetMainTextRange() As Range
Dim toc As TableOfContents
Dim tof As TableOfFigures
Dim mainTextStart As Long
mainTextStart = 1
For Each toc In ActiveDocument.TablesOfContents
If toc.Range.End > mainTextStart Then
mainTextStart = toc.Range.End + 1
End If
Next
For Each tof In ActiveDocument.TablesOfFigures
If tof.Range.End > mainTextStart Then
mainTextStart = tof.Range.End + 1
End If
Next
Set GetMainTextRange = ActiveDocument.Range(mainTextStart, ActiveDocument.Range.End)
End Function