How to caculate number of words in a page - vba

How to caculate number of words in a page of word document
I need the VB code for this action that will allow me to automate calculation of word count that will be place at the footer of each page
I tries searching for solution but no luck

Unless you're prepared to insert 'Next Page' Section breaks between your pages, you cannot have the individual page counts in the page headers or footers. The following code does the required calculations and inserts the page counts as hidden text at the top of each page's body text. Content in page headers, footers, footnotes, endnotes & textboxes is ignored.
Sub Demo()
Application.ScreenUpdating = False
Dim p As Long, w As Long, Rng As Range
With ActiveDocument
For p = .ComputeStatistics(wdStatisticPages) To 1 Step -1
Set Rng = .GoTo(What:=wdGoToPage, Count:=p).GoTo(What:=wdGoToBookmark, Name:="\Page")
With Rng
w = .ComputeStatistics(wdStatisticWords)
.Collapse wdCollapseStart
.Text = "[" & w & "]"
.Font.Hidden = True
End With
Next
End With
Application.ScreenUpdating = True
End Sub
Do note that I have not used the .Words.Count property as that is unreliable.

Related

Iterate through paragraphs and trim spaces in MS Word

I need to create a macros which removes whitespaces and indent before all paragraphs in the active MS Word document. I've tried following:
For Each p In ActiveDocument.Paragraphs
p.Range.Text = Trim(p.range.Text)
Next p
which sets macros into eternal loop. If I try to assign string literal to the paragraphs, vba always creates only 1 paragraph:
For Each p In ActiveDocument.Paragraphs
p.Range.Text = "test"
Next p
I think I have a general misconception about paragraph object. I would appreciate any enlightment on the subject.
The reason the code in the question is looping is because replacing one paragraph with the processed (trimmed) text is changing the paragraphs collection. So the code will continually process the same paragraph at some point.
This is normal behavior with objects that are getting deleted and recreated "behind the scenes". The way to work around it is to loop the collection from the end to the front:
For i = ActiveDocument.Paragraphs.Count To 1 Step -1
Set p = ActiveDocument.Paragraphs(i)
p.Range.Text = Trim(p.Range.Text)
Next
That said, if the paragraphs in the document contain any formatting this will be lost. String processing does not retain formatting.
An alternative would be to check the first character of each paragraph for the kinds of characters you consider to be "white space". If present, extend the range until no more of these characters are detected, and delete. That will leave the formatting intact. (Since this does not change the entire paragraph a "normal" loop works.)
Sub TestTrimParas()
Dim p As Word.Paragraph
Dim i As Long
Dim rng As Word.Range
For Each p In ActiveDocument.Paragraphs
Set rng = p.Range.Characters.First
'Test for a space or TAB character
If rng.Text = " " Or rng.Text = Chr(9) Then
i = rng.MoveEndWhile(" " + Chr(9))
Debug.Print i
rng.Delete
End If
Next p
End Sub
You could, of course, do this in a fraction of the time without a loop, using nothing fancier than Find/Replace. For example:
Find = ^p^w
Replace = ^p
and
Find = ^w^p
Replace = ^p
As a macro this becomes:
Sub Demo()
Application.ScreenUpdating = False
With ActiveDocument.Range
.InsertBefore vbCr
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchWildcards = False
.Text = "^p^w"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
.Text = "^w^p"
.Execute Replace:=wdReplaceAll
End With
.Characters.First.Text = vbNullString
End With
Application.ScreenUpdating = True
End Sub
Note also that trimming text the way you're doing is liable to destroy all intra-paragraph formatting, cross-reference fields, and the like; it also won't change indents. Indents can be removed by selecting the entire document and changing the paragraph format; better still, modify the underlying Styles (assuming they've been used correctly).
Entering "eternal" loop is a bit unpleasant. Only Chuck Norris can exit one. Anyway, try to make a check before trimming and it will not enter:
Sub TestMe()
Dim p As Paragraph
For Each p In ThisDocument.Paragraphs
If p.Range <> Trim(p.Range) Then p.Range = Trim(p.Range)
Next p
End Sub
As has been said by #Cindy Meister, I need to prevent endless creation of another paragraphs by trimming them. I bear in mind that paragraph range contains at least 1 character, so processing range - 1 character would be safe. Following has worked for me
Sub ProcessParagraphs()
Set docContent = ActiveDocument.Content
' replace TAB symbols throughout the document to single space (trim does not remove TAB)
docContent.Find.Execute FindText:=vbTab, ReplaceWith:=" ", Replace:=wdReplaceAll
For Each p In ActiveDocument.Paragraphs
' delete empty paragraph (delete operation is safe, we cannot enter enternal loop here)
If Len(p.range.Text) = 1 Then
p.range.Delete
' remove whitespaces
Else
Set thisRg = p.range
' shrink range by 1 character
thisRg.MoveEnd wdCharacter, -1
thisRg.Text = Trim(thisRg.Text)
End If
p.LeftIndent = 0
p.FirstLineIndent = 0
p.Reset
p.range.Font.Reset
Next
With Selection
.ClearFormatting
End With
End Sub
I saw a number of solutions here are what worked for me. Note I turn off track changes and then revert back to original document tracking status.
I hope this helps some.
Option Explicit
Public Function TrimParagraphSpaces()
Dim TrackChangeStatus: TrackChangeStatus = ActiveDocument.TrackRevisions
ActiveDocument.TrackRevisions = False
Dim oPara As Paragraph
For Each oPara In ActiveDocument.StoryRanges(wdMainTextStory).Paragraphs
Dim oRange As Range: Set oRange = oPara.Range
Dim endRange, startRange As Range
Set startRange = oRange.Characters.First
Do While (startRange = Space(1))
startRange.Delete 'Remove last space in each paragraphs
Set startRange = oRange.Characters.First
Loop
Set endRange = oRange
' NOTE: for end range must select the before last characted. endRange.characters.Last returns the chr(13) return
endRange.SetRange Start:=oRange.End - 2, End:=oRange.End - 1
Do While (endRange = Space(1))
'endRange.Delete 'NOTE delete somehow does not work for the last paragraph
endRange.Text = "" 'Remove last space in each paragraphs
Set endRange = oPara.Range
endRange.SetRange Start:=oRange.End - 1, End:=oRange.End
Loop
Next
ActiveDocument.TrackRevisions = TrackChangeStatus
End Function

Finding a "Heading" Style in a Word Document

I have a Word macro that allows to put his/her cursor anywhere in a Word document and it finds and saves the Heading 1, Heading 2 and Heading 3 text that is above the text selected by the user in order capture the chapter, section and sub-section that is associated with any sentence in the document.
I am currently using the code below which moves up the document line-by-line until it finds a style that contains "Heading x". When I have completed this task I move down the number of lines that I moved up to get to Heading 1, which may be many pages.
As you can imagine this is awkward, takes a long time (sometimes 60+ seconds) and is visually disturbing.
The code below is that subroutine that identifies the heading.
Dim str_heading_txt, hdgn_STYLE As String
Dim SELECTION_PG_NO as Integer
hdng_STYLE = Selection.Style
Do Until Left(hdng_STYLE, 7) = "Heading"
LINESUP = LINESUP + 1
Selection.MoveUp Unit:=wdLine, COUNT:=1
Selection.HomeKey Unit:=wdLine
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
hdng_STYLE = Selection.Style
'reached first page without finding heading
SELECTION_PG_NO = Selection.Information(wdActiveEndPageNumber)
If SELECTION_PG_NO = 1 Then 'exit if on first page
a_stop = True
Exit Sub
End If
Loop
str_heading_txt = Selection.Sentences(1)
I tried another approach below in order to eliminate the scrolling and performance issues using the Range.Find command below.
I am having trouble getting the selection range to move to the text with the "Heading 1" style. The code selects the sentence at the initial selection, not the text with the "Heading 1" style.
Ideally the Find command would take me to any style that contained "Heading" but, if required, I can code separately for "Heading 1", "Heading 2" and "Heading 3".
What changes to the code are required so that "Heading 1" is selected or, alternatively, that "Heading" is selected?
Dim str_heading_txt, hdgn_STYLE As String
Dim Rng As Range
Dim Fnd As Boolean
Set Rng = Selection.Range
With Rng.Find
.ClearFormatting
.Style = "Heading 1"
.Forward = False
.Execute
Fnd = .Found
End With
If Fnd = True Then
With Rng
hdng_STYLE = Selection.Style
str_heading_txt = Selection.Sentences(1)
End With
End If
Any assistance is sincerely appreciated.
You can use the range.GoTo() method.
Dim rngHead As Range, str_heading_txt As String, hdgn_STYLE As String
Set rngHead = Selection.GoTo(wdGoToHeading, wdGoToPrevious)
'Grab the entire text - headers are considered a paragraph
rngHead.Expand wdParagraph
' Read the text of your heading
str_heading_txt = rngHead.Text
' Read the style (name) of your heading
hdgn_STYLE = rngHead.Style
I noticed that you used Selection.Sentences(1) to grab the text, but headings are already essentially a paragraph by itself - so you can just use the range.Expand() method and expand using wdParagraph
Also, a bit of advice:
When declaring variables such as:
Dim str_heading_txt, hdgn_STYLE As String
Your intent was good, but str_heading_txt was actually declared as type Variant. Unfortunately with VBA, if you want your variables to have a specific data type, you much declare so individually:
Dim str_heading_txt As String, hdgn_STYLE As String
Or some data types even have "Shorthand" methods known as Type Characters:
Dim str_heading_txt$, hdgn_STYLE$
Notice how the $ was appended to the end of your variable? This just declared it as a String without requiring the As String.
Some Common Type-Characters:
$ String
& Long
% Integer
! Single
# Double
You can even append these to the actual value:
Dim a
a = 5
Debug.Print TypeName(a) 'Prints Integer (default)
a = 5!
Debug.Print TypeName(a) 'Prints Single
Try something based on:
Sub Demo()
Dim Rng As Range, StrHd As String, s As Long
s = 10
With Selection
Set Rng = .Range
Set Rng = Rng.GoTo(What:=wdGoToBookmark, Name:="\HeadingLevel")
StrHd = Rng.Paragraphs.First.Range.Text
Do While Right(Rng.Paragraphs.First.Style, 1) > 1
Rng.End = Rng.Start - 1
Set Rng = Rng.GoTo(What:=wdGoToBookmark, Name:="\HeadingLevel")
With Rng.Paragraphs.First
If Right(.Style, 1) < s Then
s = Right(.Style, 1)
StrHd = .Range.Text & StrHd
End If
End With
Loop
MsgBox StrHd
End With
End Sub

Macro to insert comments on keywords in selected text in a Word doc?

I'm new to VBA and would greatly appreciate some help on a problem.
I have long Word documents where I need to apply standard comments to the same set of keywords, but only in selected sections of the document. The following macro worked to find a keyword and apply a comment (from question here https://superuser.com/questions/547710/macro-to-insert-comment-bubbles-in-microsoft-word):
Sub label_items()
'
' label_items Macro
'
'
Do While Selection.Find.Execute("keyword1") = True
ActiveDocument.Comments.Add range:=Selection.range, Text:="comment for keyword 1"
Loop
End Sub
The two modifications are:
1) only apply the comments to user selected text, not the whole document. I tried a "With Selection.Range.Find" approach but I don't think comments can be added this way (??)
2) repeat this for 20+ keywords in the selected text. The keywords aren't totally standard and have names like P_1HAI10, P_1HAI20, P_2HAI60, P_HFS10, etc.
EDIT: I have tried to combine code from similar questions ( Word VBA: finding a set of words and inserting predefined comments and Word macro, storing the current selection (VBA)) but my current attempt (below) only runs for the first keyword and comment and runs over the entire document, not just the text I have highlighted/selected.
Sub label_items()
'
' label_items Macro
'
Dim selbkup As range
Set selbkup = ActiveDocument.range(Selection.range.Start, Selection.range.End)
Set range = selbkup
Do While range.Find.Execute("keyword 1") = True
ActiveDocument.Comments.Add range, "comment for keyword 1"
Loop
Set range = selbkup
Do While range.Find.Execute("keyword 2") = True
ActiveDocument.Comments.Add range, "comment for keyword 2"
Loop
'I would repeat this process for all of my keywords
End Sub
I've combed through previous questions and the Office Dev Center and am stuck. Any help/guidance is greatly appreciated!
It's a matter of adding a loop and a means of Finding the next keyword you're looking for. There are a few suggestions in the code example below, so please adjust it as necessary to fit your requirements.
Option Explicit
Sub label_items()
Dim myDoc As Document
Dim targetRange As Range
Set myDoc = ActiveDocument
Set targetRange = Selection.Range
'--- drop a bookmark to return the cursor to it's original location
Const RETURN_BM = "OrigCursorLoc"
myDoc.Bookmarks.Add Name:=RETURN_BM, Range:=Selection.Range
'--- if nothing is selected, then search the whole document
If Selection.Start = Selection.End Then
Selection.Start = 0
targetRange.Start = 0
targetRange.End = myDoc.Range.End
End If
'--- build list of keywords to search
Dim keywords() As String
keywords = Split("SMS,HTTP,SMTP", ",", , vbTextCompare)
'--- search for all keywords within the user selected range
Dim i As Long
For i = 0 To UBound(keywords)
'--- set the cursor back to the beginning of the
' originally selected range
Selection.GoTo What:=wdGoToBookmark, Name:=RETURN_BM
Do
With Selection.Find
.Forward = True
.Wrap = wdFindStop
.Text = keywords(i)
.Execute
If .Found Then
If (Selection.Start < targetRange.End) Then
Selection.Comments.Add Selection.Range, _
Text:="Found the " & keywords(i) & " keyword"
Else
Exit Do
End If
Else
Exit Do
End If
End With
Loop
Next i
'--- set the cursor back to the beginning of the
' originally selected range
Selection.GoTo What:=wdGoToBookmark, Name:=RETURN_BM
End Sub

Index has incorrect page numbers

I'm building an index via a macro, and after a little bit, the page numbers start to get wonky. At first, they are correct, but as we go deeper in the document, they start getting offset.
I have a hunch it's because the code I'm using uses a range (.Index.MarkEntry Range:=theRange ...), and the page of the end of the range is where the page number comes from.
How can I make sure that the page number the index uses, is the page that has the first character in the range (does that make sense? Whatever page the entry starts on, is the page I want to use).
Here's my (truncated for relevance) code:
Sub Find_Definitions()
Dim myDoc As Word.Document
Dim oRng As Word.Range, rng As Word.Range, rngXE As Word.Range, tempHold As Word.Range
Dim addDefinition$, findText$, editedDefinition$
Dim meanTypes() As Variant
Dim rngEdited
Dim y&
Dim bFound As Boolean
meanTypes = Array(Chr(150) & " means", Chr(151) & " means", "- means", Chr(150) & " meaning", Chr(151) & " meaning", "- meaning")
Set myDoc = ActiveDocument
bFound = True
Call Clear_Index
For y = LBound(meanTypes) To UBound(meanTypes)
'Loop through the document
Set oRng = myDoc.Content
Set rngXE = oRng.Duplicate
With oRng.Find
.ClearFormatting
.ClearAllFuzzyOptions
'.Text = findText
.Text = meanTypes(y)
.MatchCase = False
.Wrap = wdFindStop
End With 'orng.find
Do While bFound
bFound = oRng.Find.Execute
If bFound Then
Set rngXE = oRng.Paragraphs(1).Range.Duplicate
rngXE.Select
' Here's where I could check the text, and see if it starts with Roman numerals.
editedDefinition = Check_For_Roman_Numerals(rngXE, findText)
If editedDefinition <> "" Then 'If editedDefinition is empty, that means there's no definition to add to the index
Set rngEdited = rngXE.Duplicate
With rngEdited
.moveStart unit:=wdCharacter, Count:=x
.Select
‘ This next line is my idea that the range’s page number is being used, so I just wanted to print it to see.
Debug.Print rngEdited.Information(wdActiveEndPageNumber)
End With 'rngEdited
myDoc.Indexes.MarkEntry Range:=rngEdited, entry:=editedDefinition, entryautotext:=editedDefinition
End If ''editedDefinition <> ""
oRng.Collapse wdCollapseEnd
oRng.Start = oRng.Paragraphs(1).Range.End
oRng.End = myDoc.Content.End
rngEdited.Collapse wdCollapseEnd
rngEdited.End = myDoc.Content.End
' Set rngXE = Nothing
End If 'bFound
Loop
bFound = True
Next y
TheEnd:
Set rng = Nothing
myDoc.Indexes(1).Update
MsgBox ("Added all definitions.")
End Sub
I'm thinking what I'll need to do is to "tighten up" the editedRange, so it ends on the same page? But if a definition spans a page break, I want to use the smaller of the page numbers that it appears on (the first one).
Thanks for any ideas/tips/thoughts.
Generally, when the page numbers in an Index don't match with what you expect it's because the document is displaying content that won't be in the printed result. This affects the pagination on-screen, "pushing" content "down" in the document. Most often, the reason is field codes, which can be suppressed by pressing Alt+F9 until the field results display.
This approach does not work for XE (index markers) and some other field types, as well as hidden text, however. They display whenever the display of "Hidden" text is allowed. Depending on the settings in File/Options/Display/"Always show these formatting marks on the screen" clicking the "backwards P" button in the Ribbon's Home tab may or may not turn them off. If it does not, then you have to go into options to togge the display, or create a macro to do this and run it as required.
The other possible reason is that the programmatically generated XE field was inserted at the end of a long range of text that broke to another page, instead of being on the page where the text starts. In order to ensure the field is the start, rather than the end of a Range, collapse the Range to its starting point:
rngEdited.Collapse wdCollapseStart

How do I strip all formatting out of this Word VBA output and use the "Normal" quickstyle?

I am using the following VBA macro to add page numbers after all bookmark hyperlinks in my document:
Sub InsertPageRefs()
Application.ScreenUpdating = False
Dim hLnk As Hyperlink, Rng As Range
For Each hLnk In ActiveDocument.Hyperlinks
With hLnk
If InStr(.SubAddress, "_Toc") = 0 And .Address = "" Then
Set Rng = .Range
With Rng
.Collapse Direction:=wdCollapseEnd
.InsertAfter Text:=" (See page #)"
.Font.Underline = wdUnderlineNone
End With
ActiveDocument.Fields.Add Range:=Rng.Characters(InStr(Rng, "#")), Text:="PAGEREF " & .SubAddress
End If
End With
Next
Set Rng = Nothing
Application.ScreenUpdating = True
Application.ScreenRefresh
MsgBox ActiveDocument.Hyperlinks.Count & " page numbers have been added.", vbOKOnly
End Sub
However, it's having undesirable results.
The blue color of the hyperlinks is partially spilling over into the added text.
It's creating a bunch of crazy span tags when I save the resulting file to HTML. I don't want this because I am going to convert the HTML to .mobi for Kindle and all the span tags are going to create chaos in my .mobi.
How do I strip out all the formatting and insert the page numbers in the "Normal" word style?
I suspect the real answer for this would be to use a good e-book editor that will keep track of this for you.
That said, the problem is likely that you are working on the Hyperlink's range, so all you should have to do is duplicate it. This allows the formatting of your range separate itself from whatever formatting is attached to the hyperlink. The other benefit of using a duplicate of a Hyperlink's range is that you can operate on the text of the range directly without destroying the link, which is also an easy way to preserve the target formatting:
Sub InsertPageRefs()
Dim hLnk As Hyperlink
Dim Rng As Range
For Each hLnk In ActiveDocument.Hyperlinks
If InStr(hLnk.SubAddress, "_Toc") = 0 And hLnk.Address = vbNullString Then
Set Rng = hLnk.Range.Duplicate
Rng.Start = Rng.End
Rng.Text = " (See page #)"
Rng.Font.Underline = wdUnderlineNone
ActiveDocument.Fields.Add Range:=Rng.Characters(InStr(Rng, "#")), _
Text:="PAGEREF " & hLnk.SubAddress
End If
Next
MsgBox ActiveDocument.Hyperlinks.Count & " page numbers have been added.", vbOKOnly
End Sub
Note that I pulled out the With blocks to make this more readable. Nested Withs make it a lot more difficult to tell at a glance what object you're operating on.