Counting number of words on each page in Word Document - vba

I'm attempting to write a macro in Word that inserts a marker at the beginning of each page in tiny font so it doesn't bump any text forward to the next page. (I then output the document to HTML and do some additional processing of the markers there in Javascript.)
I have basically figured out the script to do that:
Sub Test()
Dim PageCounter As Long
Dim FirstPageInput As String
Dim PageFirstWordRange As Range
Dim InsertWord As String
FirstPageInput = InputBox("What is the initial page number for this article?", "First Page Number?")
For PageCounter = 1 To ActiveDocument.BuiltInDocumentProperties("Number of Pages")
Set PageFirstWordRange = ActiveDocument.Range.GoTo(wdGoToPage, wdGoToAbsolute, PageCounter)
'If this is the first page, add the initial page number. Otherwise, just insert a marker.'
If PageCounter = 1 Then
InsertWord = "!#" + FirstPageInput + "#! "
PageFirstWordRange.InsertBefore (InsertWord)
Else
'PageFirstWordRange.MoveEnd wdCharacter, -1'
InsertWord = " !##! "
PageFirstWordRange.InsertAfter (InsertWord)
End If
'Select the word we just inserted, and change its size to 1'
With PageFirstWordRange.Font
.Size = 1
End With
Next
End Sub
The issue is that there are sometimes pages and pages filled up with footnotes. Because the script tries to find the first word on each page, it ends up skipping these footnote pages (not writing a marker for them) and I need the script to do that.
The solution I've decided is to count the words on each page as I iterate. If there are any pages with 0 text (not including the footnotes) then I will add an additional marker for each blank page.
I have no idea how to do this - I'm so bad with VBA. Does anyone have a solution for me? This feels like an easy thing to do, but I'm pulling my hair out!

Related

How to get a page / the beginning of a Section?

Good Day! I have a Word Document and a macro that takes content of that document and inserts it into another Document(with all boundaries, frames etc.), Basically unorganized into organized document.
I would like to check the first page of a section (and any section) and see if any table is present if so, exit the loop, and just readjust the page accordingly with our country document standards.
I have done mostly without checking a table on the first page of a new section.
If objSection.Index = 1 Then 'the first section
'I need to check if there is a table on the first page of a section
Checking page size and orientation for the first Section.'(this is done)
ElseIf objSection.Index > 1 Then 'other sections
'practically the same thing, checking if a table exists on the first page of a section
Checking page size and orientation for all other Sections.'(this is also done)
End If
Something like the following should get you there.
Function TableInFirstPageOfSection(sec As Section) As Boolean
Dim secRange As Range, secStart As Long
Dim tblRange As Range, tblStart As Long
TableInFirstPageOfSection = False
Set secRange = sec.Range
secRange.Collapse wdCollapseStart
secStart = secRange.Information(wdActiveEndPageNumber)
If sec.Range.Tables.Count > 0 Then
Set tblRange = sec.Range.Tables(1).Range
tblRange.Collapse wdCollapseStart
tblStart = tblRange.Information(wdActiveEndPageNumber)
If tblStart = secStart Then TableInFirstPageOfSection = True
End If
End Function

How can i change every occurence of a specific font ind a Word document?

i have following problem. Im currently creating a Macro that gets every font thats been used in a Word document. Afterwards it checks, if this font is even installed and changes the font into predefined fonts. (As the Microsoft auto-font-change in Word is pretty bad and changes my fonts into Comic Sans (no joke ...).
Everything works as intended except for one thing.
This here is the code i am using to exchange every occurence of the found
font in the document:
For i = 0 To UBound(missingFont)
For Each oCharacter In ActiveDocument.Range.Characters
If oCharacter.Font.name = missingFont(i) Then
oCharacter.Font.name = fontToUse
If InStr(missingFont(i), "bold") Then
oCharacter.Font.Bold = True
End If
If InStr(missingFont(i), "italic") Then
oCharacter.Font.Italic = True
End If
End If
Next oCharacter
Next i
So basically im checking every Character in my document and change it if needed. Now this only works for Characters that are not inside of textfields, the header or footer. How can i check every, EVERY, character inside of the Document?
First i've tried to use ActiveDocument.Range.Paragraphs instead of ActiveDocument.Range.Characters. I've also tried using the macro given here: http://www.vbaexpress.com/forum/showthread.php?55726-find-replace-fonts-macro but couldnt get this to work at all.
It's not clear what is meant by "textfield" as that could be any of five or six different things in Word...
But there is a way to access almost everything (excluding ActiveX controls) in a Word document by looping all StoryRanges. A StoryRange includes the main body of the document, headers, footers, footnotes, text ranges in Shapes, etc.
The following code sample demonstrates how to loop all the "Stories" in a document. I've put the code provided in the question in a separate procedure that's called from the "Stories" loop. (Note that I am not able to test, not having access to either the documents or relevant portions of code used in the question.)
Sub ProcessAllStories()
Dim doc as Word.Document
Dim missingFont as Variant
Dim myStoryRange as Word.StoryRange
'Define missingFont
Set doc = ActiveDocument
For Each myStoryRange In doc.StoryRanges
CheckFonts myStoryRange, missingFont
Do While Not (myStoryRange.NextStoryRange Is Nothing)
Set myStoryRange = myStoryRange.NextStoryRange
CheckFonts myStoryRange, missingFont
Loop
Next myStoryRange
End Sub
Sub CheckFonts(rng as Word.Range, missingFont as Variant)
Dim oCharacter as Word.Range
For i = 0 To UBound(missingFont)
For Each oCharacter In rng.Characters
If oCharacter.Font.name = missingFont(i) Then
oCharacter.Font.name = fontToUse
If InStr(missingFont(i), "bold") Then
oCharacter.Font.Bold = True
End If
If InStr(missingFont(i), "italic") Then
oCharacter.Font.Italic = True
End If
End If
Next oCharacter
Next i
End Sub

Is there a way to list broken internal hyperlinks with VBA in MS Word? (Hyperlink Subaddress)

In MS Word, you can create hyperlinks to a "Place in this document" so that a link takes you someplace else in the same Word file. However, if you change headers or move things around these links will sometimes break. I want to write some VBA to check for broken links.
With VBA, you can list each hyperlink subaddress using the code below:
Sub CheckLinks()
Set doc = ActiveDocument
Dim i
For i = 1 To doc.Hyperlinks.Count
Debug.Print doc.Hyperlinks(i).SubAddress
Next
End Sub
The output from the code above also matches what is shown in the field codes for the hyperlink.
However, I'm not really clear on how to verify if the SubAddress is correct. For example, an excerpt from the program output shows this:
_Find_a_Staff_1
_Edit_Organization_Settings_2
_Set_the_Staff
_Find_a_Staff_1
But there's no obvious way to tell what the "correct" suffix should be for a given heading. Any thoughts on how to check if these are valid?
Is there a way to get the list of all valid subaddresses for the headings in the document?
The code below will list the hyperlinks where the corresponding bookmark does not exist in the document. (Note that it only detects missing links, not links that go to the wrong place.)
Sub CheckLinks()
Dim doc As Document
Set doc = ActiveDocument
Dim i, j
Dim found As Boolean
For i = 1 To doc.Hyperlinks.Count
found = False
For j = 1 To doc.Bookmarks.Count
If doc.Range.Bookmarks(j).Name = doc.Hyperlinks(i).SubAddress Then
found = True
End If
Next
If found = False Then
Debug.Print doc.Hyperlinks(i).SubAddress
End If
Next
End Sub

Running headers for multiple levels of headings

I know, there is a simple way to create running headers using STYLEREF field. Just put this:
{ STYLEREF "Heading 1" }
in the header of your document, and it works fine.
However, the problem arises when I want to match multiple heading levels. For example, on 1st page of my document I have a Heading 1 style with text Foo. And on the 2nd page of document I have Heading 2 style with text Bar.
When I'm on 1st page of document, I want to see the "Foo" in the page header. When I'm on the 2nd page, I want to see "Bar" in the page header.
It is very simple in LibreOffice, but I haven't find any "proper" way to achieve it in MS Word.
Sidenote: Well, there exists a workaround: create a character style "My headings" and apply it on the paragraph styles "Heading 1" and "Heading 2", and then use it in STYLEREF field:
{ STYLEREF "My headings" }
But it is not convenient.
I post it as StackOverflow question, because I believe, this probably could be fixed with macro.
I checked again and tried to write a macro. The macro would be possible but the difficulties starts when it comes to insert the current header style text into the page header. Since in Microsoft Word the page headers display always the same content on every page, you would need to introduce Section break on every single page. This would then allow you to have different page header content on every page. Also it's necessary to ensure that the header option connect with previous is unticked so this would work at all.
Given that it would be possible to insert the style header text of each page into the page headers of each page. It would be a really "hacky" solution and your document would be full of sections due to the section breaks. I wouldn't want to work with such a document though but that's up to you.
Here is the NOT WORKING macro I've came up with until I realized the section issue:
Sub RunningHeader()
' THIS MACRO DOES NOT WORK!
' Date: 2017.08.08
' Running header macro
' Assumes every page ends with a section break
' Supports to set running header up to level 3
Dim mPageCount As Integer
Dim mCurrentPage As Integer
Dim mPageRange As Range
Dim mSection As Section
Dim mRunningHeader As String
mPageCount = ActiveDocument.ComputeStatistics(wdStatisticPages)
' ToDo
' Ensure each page of the document ends with a "section break"
' Loop through each page.
' Idea looping through pages from:
' https://support.microsoft.com/en-us/help/269565/how-to-automate-word-to-set-and-retrieve-section-header-and-footer-inf
For mCurrentPage = 1 To mPageCount
'
If intpage <> 1 Then
' Goes to the top of the next page.
Selection.GoToNext What:=wdGoToPage
Else
' Goes to the top of the first page.
Selection.HomeKey Unit:=wdStory
End If
' Selects the content of the current page
ActiveDocument.Bookmarks("\page").Range.Select
' Get text of highest header style on current page
mRunningHeader = GetHighestHeader
' Get section of current page
Set mPageRange = ActiveDocument.GoTo(What:=wdGoToPage, Which:=wdGoToAbsolute, Count:=mCurrentPage)
Set mSection = mPageRange.Sections(1)
' Disconnect page header from previous
' ToDo
' Set text into page header
' ToDo
Next
End Sub
Private Function GetHighestHeader() As String
Dim mParagraph As Paragraph
Dim mHighestHeaderText As String
Dim mHighestHeaderNumber As Integer
mHighestHeaderText = ""
For Each mParagraph In Selection.Paragraphs
If mParagraph.Style = ActiveDocument.Styles(wdStyleHeading1) Then
mHighestHeaderText = mParagraph.Range.Text
Exit For
End If
If mParagraph.Style = ActiveDocument.Styles(wdStyleHeading2) Then
If mHighestHeaderNumber < 2 Then
mHighestHeaderText = mParagraph.Range.Text
mHighestHeaderNumber = 2
End If
End If
If mParagraph.Style = ActiveDocument.Styles(wdStyleHeading3) Then
If mHighestHeaderNumber < 3 Then
mHighestHeaderText = mParagraph.Range.Text
mHighestHeaderNumber = 3
End If
End If
Next mParagraph
GetHighestHeader = mHighestHeaderText
End Function

VBS exit loop where no more track changes exist in word document

I have the following code that builds a string of MS word pages that contain track changes (markup). At the end of the loop I exit where track changes are contained on the last page.
However where there are no track changes on the last page I am stuck in a loop (no pun intended). During the run I get the confirmation box 'do you want to start the the beginning of the document' and it loops
How can I add an 'exit do', so where the last track changes are found (not on the last page) the do statement is exited?
Sub finaltest()
Dim CurPage As Integer
Dim totalPages As Integer
Dim Pages As String
'declare variables
Pages = ""
'total page count
ActiveDocument.Repaginate
totalPages = ActiveDocument.BuiltInDocumentProperties(wdPropertyPages)
'get us home
Selection.HomeKey Unit:=wdStory
Do
'find next change
WordBasic.NextChangeOrComment
'get current page
CurPage = Selection.Information(wdActiveEndAdjustedPageNumber)
Pages = Pages & ", " & CurPage
'<exit loop if there is no more track changes and not at last page>
Loop Until CurPage = totalPages
Pages = Right(Pages, Len(Pages) - 2)
MsgBox Pages
End Sub
It is a bit unclear which revision elements you really need to track as far as you use WordBasic.NextChangeOrComment instruction.
However, assuming that you need to track revisions but not comments you could insert the code below right after the comment you put to your Sub:
'<exit loop if there is no more track changes and not at last page>
With ActiveDocument.Content.Revisions
If Selection.Range = .Item(.Count).Range Then
'MsgBox "Last one" - for test only
Exit Do
End If
End With
If you need to trace comments to you need to create similar solution as for revision items (Comments collection doesn't include Revisions collection and vice-versa).