Why is my VBA macro only splitting the 1st and 3rd parts of a Word document? - vba

I have a macro which takes one Word document, copies the data inside my parameters then pastes it multiple separate documents (in this case three).
This is the first time using VBA, so please go easy.
The original document is a long document, which has multiple repeating sections. By filling in the original document, the user can save time completing one rather than three near identical documents. I have split the original into three sections. My code takes the data from the first declared section and pastes it into a new document. It also works for the third. The second, however does not work.
The
With R.Find
.Text = "START OF FORM*^12"
.MatchWildcards = True
section looks for the text 'Start of Form' and takes that and the rest of the contents up until '^12' (which I believe refers to a page break).
The document is set out so that each section of the document starts with that text and finishes with page break.
Sub DocSplit()
' Declares variable (in this case R).
Dim R As Range
' Sets R to the active document, being a number of ranges (will be defined later).
Set R = ActiveDocument.Range.Duplicate
' You won't be able to see what the macro is doing, but will run quicker.
Application.ScreenUpdating = False
' For R, find text with whatever is in the " marks.
With R.Find
.Text = "START OF FORM*^12"
.MatchWildcards = True
' Runs a series of statements as long as given conditions are true. While it's doing this,
While .Execute
' Copy and saves contents of R.
CopyAndSave R
' While ends.
Wend
'With ends.
End With
' Collapses range to the ending point.
R.Collapse wdCollapseEnd
' Returns or sets the ending character position of a range.
R.End = R.Parent.Range.End
CopyAndSave R
End Sub
Static Sub CopyAndSave(R As Range)
' Declares D as document.
Dim D As Document
' Represents the number of words in the collection.
' Long is a datatype for values too large for "integer".
Dim Count As Long
Count = Count + 1
' Copies R from previous Sub to a new document.
R.Copy
Set D = Documents.Add
' Pastes range, preserving original formatting.
D.Range.PasteAndFormat wdFormatOriginalFormatting
D.SaveAs R.Parent.Path & Application.PathSeparator & _
"F00" & Count, wdFormatDocument
D.Close
End Sub
I did expect three documents, F001, F002 and F003 to be created. I get two files, one containing the first section (as intended) and one file containing the last two.

I took a quick look at your code and I found these errors:
If you want the counter to increment each time the function is called, you must declare it in the main function, otherwise it will lose memory each time it's called.
R.Find needs an argument. If you want more details, look at here
R.End needs an argument, here you'll find some hints, depending on what you need to do.
I've updated some parts of your code to help you:
Sub DocSplit()
' Declares variable (in this case R).
Dim R As Range
' Represents the number of words in the collection.
' Long is a datatype for values too large for "integer".
Dim count As Long
count = 0
' Sets R to the active document, being a number of ranges (will be defined later).
Set R = ActiveDocument.Range.Duplicate
' You won't be able to see what the macro is doing, but will run quicker.
Application.ScreenUpdating = False
' For R, find text with whatever is in the " marks.
With R.Find("Text your're searching")
.Text = "START OF FORM*^12"
.MatchWildcards = True
' Runs a series of statements as long as given conditions are true. While it's doing this,
While .Execute
' Copy and saves contents of R.
Call CopyAndSave(R, count)
' While ends.
Wend
'With ends.
End With
' Collapses range to the ending point.
R.Collapse wdCollapseEnd
' Returns or sets the ending character position of a range.
R.End = R.Parent.Range.End
Call CopyAndSave(R)
End Sub
Static Sub CopyAndSave(R As Range, count As Long)
' Declares D as document.
Dim D As Document
count = count + 1
' Copies R from previous Sub to a new document.
R.Copy
Set D = Documents.Add
' Pastes range, preserving original formatting.
D.Range.PasteAndFormat wdFormatOriginalFormatting
D.SaveAs R.Parent.Path & Application.PathSeparator & _
"F00" & count, wdFormatDocument
D.Close
End Sub
If you have any doubts, don't hesitate to ask.

Related

What does a hyperlink range.start and range.end refer to?

I'm trying to manipulate some text from a MS Word document that includes hyperlinks. However, I'm tripping up at understanding exactly what Range.Start and Range.End are returning.
I banged a few random words into an empty document, and added some hyperlinks. Then wrote the following macro...
Sub ExtractHyperlinks()
Dim rHyperlink As Range
Dim rEverything As Range
Dim wdHyperlink As Hyperlink
For Each wdHyperlink In ActiveDocument.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Set rEverything = ActiveDocument.Range
rEverything.TextRetrievalMode.IncludeFieldCodes = True
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start) & "#" & vbCrLf
Next
End Sub
However, the output between the #s does not quite match up with the hyperlinks, and is more than a character or two out. So if the .Start and .End do not return char positions, what do they return?
This is a bit of a simplification but it's because rEverything counts everything before the hyperlink, then all the characters in the hyperlink field code (including 1 character for each of the opening and closing field code braces), then all the characters in the hyperlink field result, then all the characters after the field.
However, the character count in the range (e.g. rEverything.Characters.Count or len(rEverything)) only includes the field result if TextRetrievalMode.IncludeFieldCodes is set to False and only includes the field code if TextRetrievalMode.IncludeFieldCodes is set to True.
So the character count is always smaller than the range.End-range.Start.
In this case if you change your Debug expression to something like
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start - (rEverything.End - rEverything.Start - 1 - Len(rEverything))) & "#" & vbCrLf
you may see results more along the lines you expect.
Another way to visualise what is going on is as follows:
Create a very short document with a piece of text followed by a short hyperlink field with short result, followed by a piece of text. Put the following code in a module:
Sub Select1()
Dim i as long
With ActiveDocument
For i = .Range.Start to .Range.End
.Range(i,i).Select
Next
End With
End Sub
Insert a breakpoint on the "Next" line.
Then run the code once with the field codes displayed and once with the field results displayed. You should see the progress of the selection "pause" either at the beginning or the end of the field, as the Select keeps "selecting" something that you cannot actually see.
Range.Start returns the character position from the beginning of the document to the start of the range; Range.End to the end of the range.
BUT everything visible as characters are not the only things that get counted, and therein lies the problem.
Examples of "hidden" things that are counted, but not visible:
"control characters" associated with content controls
"control characters" associated with fields (which also means hyperlinks), which can be seen if field result is toggled to field code display using Alt+F9
table structures (ANSI 07 and ANSI 13)
text with the font formatting "hidden"
For this reason, using Range.Start and Range.End to get a "real" position in the document is neither reliable nor recommended. The properties are useful, for example, to set the position of one range relative to the position of another.
You can get a somewhat more accurate result using the Range.TextRetrievalMode boolean properties IncludeHiddenText and IncludeFieldCodes. But these don't affect the structural elements involved with content controls and tables.
Thank you both so much for pointing out this approach was doomed but that I could still use .Start/.End for relative positions. What I was ultimately trying to do was turn a passed paragraph into HTML, with the hyperlinks.
I'll post what worked here in case anyone else has a use for it.
Function ExtractHyperlinks(rParagraph As Range) As String
Dim rHyperlink As Range
Dim wdHyperlink As Hyperlink
Dim iCaretHold As Integer, iCaretMove As Integer, rCaret As Range
Dim s As String
iCaretHold = 1
iCaretMove = 1
For Each wdHyperlink In rParagraph.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Do
Set rCaret = ActiveDocument.Range(rParagraph.Characters(iCaretMove).Start, rParagraph.Characters(iCaretMove).End)
If RangeContains(rHyperlink, rCaret) Then
s = s & Mid(rParagraph.Text, iCaretHold, iCaretMove - iCaretHold) & "" & IIf(wdHyperlink.TextToDisplay <> "", wdHyperlink.TextToDisplay, wdHyperlink.Address) & ""
iCaretHold = iCaretMove + Len(wdHyperlink.TextToDisplay)
iCaretMove = iCaretHold
Exit Do
Else
iCaretMove = iCaretMove + 1
End If
Loop Until iCaretMove > Len(rParagraph.Text)
Next
If iCaretMove < Len(rParagraph.Text) Then
s = s & Mid(rParagraph.Text, iCaretMove)
End If
ExtractHyperlinks = "<p>" & s & "</p>"
End Function
Function RangeContains(rParent As Range, rChild As Range) As Boolean
If rChild.Start >= rParent.Start And rChild.End <= rParent.End Then
RangeContains = True
Else
RangeContains = False
End If
End Function

Macro for Adding Text To Begining of Every Paragraph

I am trying to create a Word macro that will go through a large document that I have and add the text "SAMPLE" to the beginning of every paragraph.
The document contains a Title page, Table of Contents and Headings throughout and I would prefer none of these have the "SAMPLE" text on them, just the paragraphs.
Below is some macro code I have found on various sites and kind of pieced together to do somewhat of what I want. It does place the "SAMPLE" text at the beginning of some paragraphs but not all, usually only the first paragraph of a new section within my document. And it also places it at the end of the Table of Contents and Beginning of the Title page.
I am brand new to macros in Word so any help is appreciated or if there is a better way of doing this perhaps? There might even be some unnecessary bits in this code since it is pieced together from other samples.
Sub SAMPLE()
Application.ScreenUpdating = False
Dim Par As Paragraph, Rng As Range
For Each Par In ActiveDocument.Paragraphs
If Par.Style = "Normal" Then
If Rng Is Nothing Then
Set Rng = Par.Range
Else
Rng.End = Par.Range.End
End If
Else
Call RngFmt(Rng)
End If
If Par.Range.End = ActiveDocument.Range.End Then
Call RngFmt(Rng)
End If
Next
Application.ScreenUpdating = True
End Sub
Sub RngFmt(Rng As Range)
If Not Rng Is Nothing Then
With Rng
.End = .End - 1
.InsertBefore "SAMPLE"
End With
Set Rng = Nothing
End If
End Sub
Provided your Title, Table of Contents and Headings etc. don't use the Normal Style - as they shouldn't - you really don't need a macro for this - all you need is a wildcard Find/Replace where:
Find = [!^13]*^13
Replace = SAMPLE: ^&
and you specify the Normal Style as a Find formatting parameter. You could, of course, record the above as a macro, but that seems overkill unless you're doing this often.

Check if a Range of text fits onto a single line

I'm programmatically filling in a regulated form template where lines are predefined (as table cells):
(Using plain text Content Controls as placeholders but this isn't relevant to the current question.)
So, I have to break long text into lines manually (auto-adding rows or something is not an option because page breaks are also predefined).
Now, since characters have different width, I cannot just set some hardcoded character limit to break at (or rather, I can, and that's what I'm doing now, but this has proven to be inefficient and unreliable, as expected). So:
How do I check if a Range of text fits on a single line -- and if it doesn't, how much of it fits?
I've checked out Range Members (Word) but can't see anything relevant.
The only way is to .Select that text, them manipulate the selection. Selection in the only object for which you can use wdLine as a boundary. Nothing else in the Word object model works with automatic line breaks.
Sub GetFirstLineOfRange(RangeToCheck As Range, FirstLineRange As Range)
'Otherwise, Word doesn't always insert automatic line breaks
'and all the text will programmatically look like it's on a single line
If Not Application.Visible Or Not Application.ScreenUpdating Then
Application.ScreenRefresh
End If
Dim SelectionRange As Range
Set SelectionRange = Selection.Range
Set FirstLineRange = RangeToCheck
FirstLineRange.Select
Selection.Collapse Direction:=wdCollapseStart
Selection.EndOf Unit:=wdLine, Extend:=wdExtend
Set FirstLineRange = Selection.Range
If FirstLineRange.End > RangeToCheck.End Then
FirstLineRange.End = RangeToCheck.End
End If
SelectionRange.Select
End Sub
Function IsRangeOnOneLine(RangeToCheck As Range) As Boolean
Dim FirstLineRange As Range
GetFirstLineOfRange RangeToCheck, FirstLineRange
IsRangeOnOneLine = FirstLineRange.End >= RangeToCheck.End
End Function
The subroutine GetFirstLineOfRange takes a RangeToCheck and sets FirstLineRange to the first text line in the given range.
The function IsRangeOnOneLine takes a RangeToCheck and returns True if the range fits on one line of text, and False otherwise. The function works by getting the first text line in the given range and checking whether it contains the range or not.
The manipulation of the Selection in GetFirstLineOfRange is necessary because the subroutine wants to move the end of the range to the end of the line, and the movement unit wdLine is available only with Selection. The subroutine saves and restores the current Selection; if this is not necessary then the temporary variable SelectionRange and the associated statements can be deleted.
Note:
There is no need to scroll anything - which in any event is not reliable. Try something based on:
With Selection
If .Characters.First.Information(wdVerticalPositionRelativeToPage) = _
.Characters.Last.Information(wdVerticalPositionRelativeToPage) Then
MsgBox .Text & vbCr & vbCr & "Spans one line or less."
Else
MsgBox .Text & vbCr & vbCr & "Spans more than one line."
End If
End With

Retrieve info from Word tables

I've got a Word document with a section surrounded by hidden text tags < Answers > ...some tables... < /Answers >. A Word macro can return the range of the text between these tags (used to be bookmarks but they had to go).
What I want to do from Excel is open the Word document, get the range between the tags, iterate the tables in that block and retrieve some cells from each row. Those cell data is then written in some rows on a new Excel sheet.
I saw many Word/Excel automation but none that inspired me to retrieve that range between two pieces of text. Best would be to be able to run the Word macro RetrieveRange(strTagName, rngTextBlock) in Word to return the range in rngTextBlock for "Answers" but this seems impossible.
As background: the .docm file is an exam paper with answers and maximum points that I 'd like to transfer into Excel to contain gradings for each student.
Browsing though some more sites, I encountered a C# example that partly did what I needed: rather than using Word's SELECTION stick to ranges to find something. I now can find the text block between the two tags, but still fail on traversing its tables and table rows. No compiler error (and working in Word itself) but I must be missing an external link...
Function CreateSEWorksheet() As Boolean
' Find <ANSWERS> in Word Document, and traverse all tables and write them as rows in worksheet
Dim wdrngStart As Word.Range
Dim wdrngEnd As Word.Range
Dim wdrngAnswers As Word.Range
Dim wdTable As Word.Table
Dim wdRow As Word.Row
Dim strStr As String
Dim bGoOn As Boolean
' Following set elsewhere:
' Set WDApp = GetObject(class:="Application.Word")
' Set WDDoc = WDApp.Documents.Open(filename:="filespec", visible:=True)
Set wdrngStart = WDDoc.Range ' select entire document - will shrink later
Set wdrngEnd = WDDoc.Range
Set wdrngAnswers = WDDoc.Range
' don't use Word SELECT/SELECTION but use ranges instead when finding tags.
If wdrngStart.Find.Execute(findText:="<ANSWERS>", MatchCase:=False) Then
' found!
wdrngAnswers.Start = wdrngStart.End
If wdrngEnd.Find.Execute(findText:="</ANSWERS>", MatchCase:=False) Then
wdrngAnswers.End = wdrngEnd.Start
bGoOn = True
Else
' no closing tag found
bGoOn = False
End If
Else
'no opening tag found
bGoOn = False
End If
If bGoOn Then
For Each wdTable In wdrngAnswers.Tables
' ** below doesn't work anymore: object doesn't support this method **
For Each wdRow In wdTable
' as example, take column 4 of each row
strStr = wdRow.Cells(4).Range.Text
strStr = Left(strStr, Len(strStr) - 2) ' remove end of cell markers
Debug.Print strStr
Next
Next
CreateSEWorksheet = True
Else
CreateSEWorksheet = False
End If
End Function

Loop through pages OR page breaks?

I'm basically trying to create a cumulative word count for documents that will put the number of words on each page into its footer and add it to the total words each page. After a lot of looking around, I found that Word doesn't really handle pages the same for everybody and so doesn't have any interface to access the individual pages through.
Now I'm trying to separate each page with page breaks so there's a clear delimiter between pages, but I still can't find how to loop through these. Any clues?
I'm going to post the code I have, but it's only for getting the word count currently. No proper attempts at cycling through page breaks because I don't know how.
Sub getPageWordCount()
Dim iPgNum As Integer
Dim sPgNum As String
Dim ascChar As Integer
Dim rngPage As Range
Dim iBeginPage As Integer
Dim iEndPage As Integer
' Go to start of document and make sure its paginated correctly.
Selection.HomeKey Unit:=wdStory, Extend:=wdMove
ActiveDocument.Repaginate
' Loop through the number of pages in the document.
For iPgNum = 2 To Selection.Information(wdNumberOfPagesInDocument)
sPgNum = CStr(iPgNum)
iBeginPage = Selection.Start
' Go to next page
Selection.GoTo wdGoToPage, wdGoToAbsolute, sPgNum
' and to the last character of the previous page...
Selection.MoveLeft wdCharacter, 1, wdMove
iEndPage = Selection.Start
' Retrieve the character code at insertion point.
Set rngPage = ActiveDocument.Range(iBeginPage, iEndPage)
MsgBox rngPage.ComputeStatistics(wdStatisticWords)
'rngPage.Footers(wdHeaderFooterPrimary).Range.Text = rngPage.ComputeStatistics(wdStatisticWords)
'ActiveDocument.Sections(2).Footers
' Check the character code for hard page break or text.
Next
' ActiveDocument.Sections(2).Footers(wdHeaderFooterPrimary).Range.Text = "bob" 'Testing
End Sub
Finally got it, managed to guess my way through it a bit, taking assorted bits from dark corners of the internet:
Sub getPageWordCount()
'Replace all page breaks with section breaks
Dim myrange1 As Range, myrangedup As Range
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
Do While .Execute(findText:="^m", Forward:=True, _
MatchWildcards:=False, Wrap:=wdFindStop) = True
Set myrange = Selection.Range
Set myrangedup = Selection.Range.Duplicate
myrange.Collapse wdCollapseEnd
myrange.InsertBreak wdSectionBreakNextPage
myrangedup.Delete
Loop
End With
'Unlink all footers and insert word count for each section
Dim sectionCount, sectionNumber, i, sectionWordCount, cumulativeWordCount As Integer
sectionCount = ActiveDocument.Sections.Count
For sectionNumber = 1 To sectionCount
With ActiveDocument.Sections(sectionNumber)
sectionWordCount = .Range.ComputeStatistics(wdStatisticWords)
cumulativeWordCount = cumulativeWordCount + sectionWordCount
With .Footers.Item(1)
.LinkToPrevious = False
.Range.Text = "This page's word count: " + CStr(sectionWordCount) + " | Cumulative word count: " + CStr(cumulativeWordCount)
.Range.ParagraphFormat.Alignment = wdAlignParagraphCenter
End With
End With
Next
End Sub
And now I've just discovered that if I want to port this macro to an add-in for ease of use for non-techy users I have to write it in VB 2010 in Visual Studio where the API is different. Good luck me!
It sounds as if you have what you need, but I was working on an alternative that I may as well post because it does not require you to add page breaks or section breaks. But you would have to add the same nested field in each footer that appears in the document (I haven't done that part here, but it's not completely trivial because there may be multiple sections and multiple footers per section).
The field code you need to add (in addition to your 'This page's word count: ' text) is
{ DOCVARIABLE "s{ SECTION }p{ PAGE \*arabic }" }
As written, the method may break in some circumstances, e.g. if there are continuous section breaks. I haven't checked.
Sub createWordCounts()
Dim i As Integer
Dim rng As Word.Range
With ActiveDocument
For i = 1 To .Range.Information(wdActiveEndPageNumber)
Set rng = .GoTo(wdGoToPage, wdGoToAbsolute, i).Bookmarks("\page").Range
.Variables("s" & CStr(rng.Information(wdActiveEndSectionNumber)) & "p" & CStr(rng.Information(wdActiveEndAdjustedPageNumber))).Value = rng.ComputeStatistics(wdStatisticWords)
Set rng = Nothing
Next
End With
End Sub