What do a hyperlink's Range.Start and Range.End refer to?

I'm trying to manipulate some text from a MS Word document that includes hyperlinks. However, I'm tripping up at understanding exactly what Range.Start and Range.End are returning.
I banged a few random words into an empty document, and added some hyperlinks. Then wrote the following macro...
Sub ExtractHyperlinks()
Dim rHyperlink As Range
Dim rEverything As Range
Dim wdHyperlink As Hyperlink
For Each wdHyperlink In ActiveDocument.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Set rEverything = ActiveDocument.Range
rEverything.TextRetrievalMode.IncludeFieldCodes = True
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start) & "#" & vbCrLf
Next
End Sub
However, the output between the #s does not quite match up with the hyperlinks, and is more than a character or two out. So if the .Start and .End do not return char positions, what do they return?

This is a bit of a simplification but it's because rEverything counts everything before the hyperlink, then all the characters in the hyperlink field code (including 1 character for each of the opening and closing field code braces), then all the characters in the hyperlink field result, then all the characters after the field.
However, the character count in the range (e.g. rEverything.Characters.Count or Len(rEverything)) only includes the field result if TextRetrievalMode.IncludeFieldCodes is set to False and only includes the field code if TextRetrievalMode.IncludeFieldCodes is set to True.
So whenever the range contains a field, the character count is smaller than Range.End - Range.Start.
In this case if you change your Debug expression to something like
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start - (rEverything.End - rEverything.Start - 1 - Len(rEverything))) & "#" & vbCrLf
you may see results more along the lines you expect.
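To see the mismatch described above on your own document, a minimal sketch like the following prints the three numbers side by side. The procedure name CompareLengths is made up for illustration; it only uses the Range and TextRetrievalMode properties already mentioned in this answer.

```vba
' Illustrative sketch: CompareLengths is a hypothetical name.
Sub CompareLengths()
    Dim r As Range
    Set r = ActiveDocument.Range

    ' Length of the text when only field results are retrieved
    r.TextRetrievalMode.IncludeFieldCodes = False
    Debug.Print "Len with field results: " & Len(r.Text)

    ' Length of the text when field codes are retrieved instead
    r.TextRetrievalMode.IncludeFieldCodes = True
    Debug.Print "Len with field codes:   " & Len(r.Text)

    ' Span of the range in document positions
    Debug.Print "End - Start:            " & (r.End - r.Start)
End Sub
```

On a document that contains at least one hyperlink, the last number should come out larger than either of the first two.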
Another way to visualise what is going on is as follows:
Create a very short document with a piece of text followed by a short hyperlink field with short result, followed by a piece of text. Put the following code in a module:
Sub Select1()
Dim i As Long
With ActiveDocument
For i = .Range.Start To .Range.End
.Range(i, i).Select
Next
End With
End Sub
Insert a breakpoint on the "Next" line.
Then run the code once with the field codes displayed and once with the field results displayed. You should see the progress of the selection "pause" either at the beginning or the end of the field, as the Select keeps "selecting" something that you cannot actually see.

Range.Start returns the character position from the beginning of the document to the start of the range; Range.End to the end of the range.
BUT visible characters are not the only things that get counted, and therein lies the problem.
Examples of "hidden" things that are counted, but not visible:
"control characters" associated with content controls
"control characters" associated with fields (which also means hyperlinks), which can be seen if field result is toggled to field code display using Alt+F9
table structures (ANSI 07 and ANSI 13)
text with the font formatting "hidden"
For this reason, using Range.Start and Range.End to get a "real" position in the document is neither reliable nor recommended. The properties are useful, for example, to set the position of one range relative to the position of another.
You can get a somewhat more accurate result using the Range.TextRetrievalMode boolean properties IncludeHiddenText and IncludeFieldCodes. But these don't affect the structural elements involved with content controls and tables.
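Given the above, one way to avoid position arithmetic altogether is to ask each hyperlink for its own text. The following is only a sketch: ListHyperlinks is a made-up name, and it relies on the Hyperlink.TextToDisplay and Hyperlink.Address properties, using Start and End purely for relative information.

```vba
' Illustrative sketch: ListHyperlinks is a hypothetical name.
Sub ListHyperlinks()
    Dim wdHyperlink As Hyperlink
    For Each wdHyperlink In ActiveDocument.Hyperlinks
        ' Read the text and target from the hyperlink itself instead of
        ' slicing the document text by character position.
        Debug.Print "#" & wdHyperlink.TextToDisplay & "# -> " & wdHyperlink.Address
        ' Start and End are still fine for comparing one range with another.
        Debug.Print "  range positions " & wdHyperlink.Range.Start & _
                    " to " & wdHyperlink.Range.End
    Next wdHyperlink
End Sub
```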

Thank you both so much for pointing out this approach was doomed but that I could still use .Start/.End for relative positions. What I was ultimately trying to do was turn a passed paragraph into HTML, with the hyperlinks.
I'll post what worked here in case anyone else has a use for it.
Function ExtractHyperlinks(rParagraph As Range) As String
Dim rHyperlink As Range
Dim wdHyperlink As Hyperlink
Dim iCaretHold As Integer, iCaretMove As Integer, rCaret As Range
Dim s As String
iCaretHold = 1
iCaretMove = 1
For Each wdHyperlink In rParagraph.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Do
Set rCaret = ActiveDocument.Range(rParagraph.Characters(iCaretMove).Start, rParagraph.Characters(iCaretMove).End)
If RangeContains(rHyperlink, rCaret) Then
s = s & Mid(rParagraph.Text, iCaretHold, iCaretMove - iCaretHold) & "" & IIf(wdHyperlink.TextToDisplay <> "", wdHyperlink.TextToDisplay, wdHyperlink.Address) & ""
iCaretHold = iCaretMove + Len(wdHyperlink.TextToDisplay)
iCaretMove = iCaretHold
Exit Do
Else
iCaretMove = iCaretMove + 1
End If
Loop Until iCaretMove > Len(rParagraph.Text)
Next
If iCaretMove < Len(rParagraph.Text) Then
s = s & Mid(rParagraph.Text, iCaretMove)
End If
ExtractHyperlinks = "<p>" & s & "</p>"
End Function
Function RangeContains(rParent As Range, rChild As Range) As Boolean
If rChild.Start >= rParent.Start And rChild.End <= rParent.End Then
RangeContains = True
Else
RangeContains = False
End If
End Function
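The string built inside ExtractHyperlinks wraps each hyperlink's display text (or its address, when there is no display text). Assuming the goal is an ordinary HTML anchor element, that wrapping could be factored into a small helper along these lines. HtmlAnchor is a hypothetical name and the exact markup is an assumption, not something taken from the post; it does no HTML escaping.

```vba
' Hypothetical helper: the markup produced here is an assumption.
Function HtmlAnchor(ByVal sAddress As String, ByVal sText As String) As String
    ' Fall back to the address when there is no display text
    If sText = "" Then sText = sAddress
    ' Doubled quotes inside a VBA string literal produce one literal quote
    HtmlAnchor = "<a href=""" & sAddress & """>" & sText & "</a>"
End Function
```

It would be called with wdHyperlink.Address and wdHyperlink.TextToDisplay as the two arguments.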

Related

Finding occurrences of a string in a Word document - problem if string is found in a table

Would appreciate some help with this problem.
I need to find all occurrences of a string in a Word document. When the string is found some complicated editing is performed on it. Sometimes no editing is needed and the string is left untouched. When all that is taken care of, I continue looking for the next occurrence of the string. Until the end of the document.
I wrote a routine to do that :
It starts by defining a Range (myRange) that covers the whole document.
Then a Find.Execute is performed.
When an occurrence is found I do the editing work.
Meanwhile myRange has been automatically redefined to cover only the found region (this is well documented in the VBA WORD documentation > FIND Object).
Then I redefine myRange to cover the portion of the text from the end of the previous found region down to the end of the text.
I iterate this until the end of the document.
This routine works well EXCEPT when an occurrence of the string is found in a TABLE. Then it is impossible to redefine myRange to cover the region from the end of the previous found region down to the end of the text. In the redefinition VBA insists on including the previous found region (actually the whole TABLE). So when I iterate, it keeps finding the same occurrence again and again, looping forever.
What follows is a simplified version of my routine. It does nothing useful; it is just there to illustrate the problem. If you run it on a document where the string "abc" appears you will see it running happily to completion. But if your document has an occurrence of "abc" in a TABLE the routine loops forever.
Sub moreTests()
Dim myRange As Range
Dim lastCharPos As Long 'Range.End is a Long; an Integer overflows past 32767 characters
Set myRange = ActiveDocument.Range
lastCharPos = myRange.End
myRange.Find.ClearFormatting
With myRange.Find
.Text = "abc"
End With
While myRange.Find.Execute = True
'An occurrence of "abc" has been found
MsgBox (myRange.Text)
MsgBox ("Range starts at : " & myRange.Start & "; Range ends at : " & myRange.End)
'myRange has been redefined to encompass only the found region (the "abc" string)
'Perform whatever editing work is needed on the string myRange.Text ("abc")
'Now redefine myRange to cover the remainder of the document
myRange.Start = myRange.End
myRange.End = lastCharPos
MsgBox ("Range starts at : " & myRange.Start & "; Range ends at : " & myRange.End)
Wend
End Sub 'moreTests
I have several ways in mind to circumvent this problem. But none of them is simple, let alone 'elegant'. Does someone know if there is a 'standard' / 'proven' way of avoiding this problem ?
Many many thanks in advance.

How to get Selection.Text to read displaytext of Macro field code

I'm having some trouble figuring this out, and would really appreciate some help. I'm trying to write a macro that uses the selection.text property as a Case text-expression. When the macro is clicked in Microsoft Word, the selected text is automatically set to the DisplayText. This method worked great for the formatting via Selection.Font.Color for a quick and dirty formatting toggling macro, but it doesn't work for the actual text.
When debugging with MsgBox, it is showing a box (Eg: □ ) as the value.
For example,
Word Field Code:
{ MACROBUTTON Macro_name DisplayText }
VBA Code run when highlighting "DisplayText" in Word:
Sub Macro_name()
Dim Str As String
Str = Selection.Text
MsgBox Str
Select Case Str
Case "DisplayText"
MsgBox "A was selected"
Case "B"
MsgBox "B was selected"
End Select
End Sub
What is output is a Message Box that only shows □
When I run this macro with some regular text selected, it works just fine.
My question is this: Is there a way to have the macro read the displaytext part of the field code for use in the macro?

You can read the field code directly instead of the selection (or the Field.Result, which also doesn't give the text).
It's not quite clear how this macro is to be used throughout the document, so the code sample below provides two variations.
Both check whether the selection contains fields and if so, whether the (first) field is a MacroButton field. The field code is then tested.
In the variation that's commented out (the simpler one) the code then simply checks whether the MacroButton display text is present in the field code. If it is, that text is assigned to the string variable being tested by the Select statement.
If this is insufficient because the display text is "unknown" (more than one MacroButton field, perhaps) then it's necessary to locate the part of the field code that contains the display text. In this case, the function InstrRev locates the end point of the combined field name and macro name, plus the intervening spaces, in the entire field code, searching from the end of the string. After that, the Mid function extracts the display text and assigns it to the string variable tested by the Select statement.
In both variations, if the selection does not contain a MacroButton field then the selected text is assigned to the string variable for the Select statement.
(Note that for my tests I needed to use Case Else in the Select statement. You probably want to change that back to Case "B"...)
Sub Display_Field_DisplayText()
Dim Str As String, strDisplayText As String
Dim textLoc As Long
Dim strFieldText As String, strMacroName As String
Dim strFieldName As String, strFieldCode As String
strDisplayText = "text to display"
If Selection.Fields.Count > 0 Then
If Selection.Fields(1).Type = wdFieldMacroButton Then
strFieldName = "MacroButton "
strMacroName = "Display_Field_DisplayText "
strFieldCode = strFieldName & strMacroName
Str = Selection.Fields(1).Code.Text
textLoc = InStrRev(Str, strFieldCode)
strFieldText = Mid(Str, textLoc + Len(strFieldCode))
MsgBox strFieldText
Str = strFieldText
'If InStr(Selection.Fields(1).Code.Text, strDisplayText) > 0 Then
' Str = strDisplayText
'End If
End If
Else
Str = Selection.Text
End If
Select Case Str
Case strDisplayText
MsgBox "A was selected"
Case Else
MsgBox "B was selected"
End Select
End Sub
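The extraction step described above (find the field name plus macro name, then take everything after it) can also be isolated in a function that works on the field code string alone. This is a sketch of a variation, not the code from the answer: MacroButtonDisplayText is a hypothetical name, and it uses InStr with vbTextCompare rather than InStrRev so that the casing of MACROBUTTON in the field code does not matter.

```vba
' Hypothetical helper: returns the display text of a MACROBUTTON field code,
' or an empty string when the expected prefix is not found.
Function MacroButtonDisplayText(ByVal sFieldCode As String, _
                                ByVal sMacroName As String) As String
    Dim sPrefix As String
    Dim pos As Long

    sPrefix = "MACROBUTTON " & sMacroName & " "
    ' Case-insensitive search for "MACROBUTTON <macro name> "
    pos = InStr(1, sFieldCode, sPrefix, vbTextCompare)
    If pos = 0 Then Exit Function

    ' Everything after the prefix is the display text; drop padding spaces
    MacroButtonDisplayText = Trim$(Mid$(sFieldCode, pos + Len(sPrefix)))
End Function
```

It would be called with Selection.Fields(1).Code.Text as the first argument.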

Check if a Range of text fits onto a single line

I'm programmatically filling in a regulated form template where lines are predefined (as table cells):
(Using plain text Content Controls as placeholders but this isn't relevant to the current question.)
So, I have to break long text into lines manually (auto-adding rows or something is not an option because page breaks are also predefined).
Now, since characters have different width, I cannot just set some hardcoded character limit to break at (or rather, I can, and that's what I'm doing now, but this has proven to be inefficient and unreliable, as expected). So:
How do I check if a Range of text fits on a single line -- and if it doesn't, how much of it fits?
I've checked out Range Members (Word) but can't see anything relevant.

The only way is to .Select that text, then manipulate the selection. Selection is the only object for which you can use wdLine as a boundary. Nothing else in the Word object model works with automatic line breaks.
Sub GetFirstLineOfRange(RangeToCheck As Range, FirstLineRange As Range)
'Otherwise, Word doesn't always insert automatic line breaks
'and all the text will programmatically look like it's on a single line
If Not Application.Visible Or Not Application.ScreenUpdating Then
Application.ScreenRefresh
End If
Dim SelectionRange As Range
Set SelectionRange = Selection.Range
Set FirstLineRange = RangeToCheck
FirstLineRange.Select
Selection.Collapse Direction:=wdCollapseStart
Selection.EndOf Unit:=wdLine, Extend:=wdExtend
Set FirstLineRange = Selection.Range
If FirstLineRange.End > RangeToCheck.End Then
FirstLineRange.End = RangeToCheck.End
End If
SelectionRange.Select
End Sub
Function IsRangeOnOneLine(RangeToCheck As Range) As Boolean
Dim FirstLineRange As Range
GetFirstLineOfRange RangeToCheck, FirstLineRange
IsRangeOnOneLine = FirstLineRange.End >= RangeToCheck.End
End Function
The subroutine GetFirstLineOfRange takes a RangeToCheck and sets FirstLineRange to the first text line in the given range.
The function IsRangeOnOneLine takes a RangeToCheck and returns True if the range fits on one line of text, and False otherwise. The function works by getting the first text line in the given range and checking whether it contains the range or not.
The manipulation of the Selection in GetFirstLineOfRange is necessary because the subroutine wants to move the end of the range to the end of the line, and the movement unit wdLine is available only with Selection. The subroutine saves and restores the current Selection; if this is not necessary then the temporary variable SelectionRange and the associated statements can be deleted.
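As a usage sketch for the two routines above, the following checks the text of one table cell and reports how much of it fits on the first line. ShowWhatFits is a made-up name, and the choice of the first cell of the first table is purely an assumption for the example.

```vba
' Illustrative sketch: ShowWhatFits and the cell being checked are assumptions.
Sub ShowWhatFits()
    Dim rCell As Range
    Dim rFirst As Range

    Set rCell = ActiveDocument.Tables(1).Cell(1, 1).Range
    ' A cell's range ends with the end-of-cell marker; leave it out
    rCell.MoveEnd Unit:=wdCharacter, Count:=-1

    If IsRangeOnOneLine(rCell) Then
        Debug.Print "Everything fits on one line."
    Else
        GetFirstLineOfRange rCell, rFirst
        Debug.Print "First line holds " & Len(rFirst.Text) & _
                    " characters: " & rFirst.Text
    End If
End Sub
```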
Note:
There is no need to scroll anything - which in any event is not reliable. Try something based on:
With Selection
If .Characters.First.Information(wdVerticalPositionRelativeToPage) = _
.Characters.Last.Information(wdVerticalPositionRelativeToPage) Then
MsgBox .Text & vbCr & vbCr & "Spans one line or less."
Else
MsgBox .Text & vbCr & vbCr & "Spans more than one line."
End If
End With

Reading date from table in Word without additional characters

So I started with VBA yesterday and keep running into walls. In the long run, I'm trying to create a Word template that checks if it's still up to date or if it's time for revision.
Right now I want to store a date from the document in a variable. I couldn't find a method to directly read out something in date format so now I'm using Selection.Text and CDate but that gives me an error (incompatible types) because my selection seems to contain another character or marker ([]). I'm guessing it has something to do with the fact that the bookmark is on a cell of a table within my Word document because it works fine in the running text.
I'm doing this in a table because this way I can be sure where the date in question is in the document and because I'm not sure how to reset the bookmark after the date has been changed.
I tried to limit the selection to the date by using
Selection.SetRange Start:=0, End:=8 (and a few variations) but that selects only a space and the ominous marker (or another cell entirely).
I have also looked into Ranges but as far as I can tell it doesn't solve my problem and I can't really use them yet, so for now I'm sticking to selection.
This is my code:
Sub ChangeNextRev()
Dim nextRevision As Date
Dim RevisionDate As Date
Dim temp As String
'Selection.GoTo what:=wdGoToBookmark, Name:="lastRevision"
'Selection.SetRange Start:=0, End:=8
'Selection.GoTo what:=wdGoToBookmark, Name:="lastRevision"
Selection.GoTo what:=wdGoToBookmark, Name:="runningText"
temp = Selection.Text
RevisionDate = CDate(temp)
Debug.Print (RevisionDate)
nextRevision = RevisionDate + 14
With Selection
.GoTo what:=wdGoToBookmark, Name:="nextRevision"
.TypeText Text:=Format$(nextRevision, "DD.MM.YY")
End With
End Sub
Can someone point me in the right direction? How can I only select the date I need? Is there an easier way besides a table to control where the date is entered or to find it afterwards?
Any help on where I'm going wrong would be greatly appreciated :)

Your guess about the table cell is correct, but you can work around that by trimming off the extraneous character(s). End-of-cell is a Chr(13) + Chr(7) (Word paragraph mark plus cell structure marker).
There are various ways to code this, but I have the following function at hand:
'Your code, relevant lines, slightly altered:
Selection.GoTo what:=wdGoToBookmark, Name:="runningText"
temp = TrimCellText(Selection.Text)
RevisionDate = CDate(temp)
Debug.Print (RevisionDate)
'Function to return string without end-of-cell characters
Function TrimCellText(ByVal s As String) As String 'ByVal so the caller's string is not modified
Do While Len(s) > 0 And (Right(s, 1) = Chr(13) Or Right(s, 1) = Chr(7))
s = Left(s, Len(s) - 1)
Loop
TrimCellText = s
End Function
If the date is the only content in the cell you could use:
Dim Dt As Date
Dt = CDate(Replace(Split(ActiveDocument.Bookmarks("runningText").Range.Text, vbCr)(0), ".", "/"))
You could try something along these lines
Sub test()
Dim d As Date
d = CDate(Replace(ThisDocument.GoTo(wdGoToBookmark, , , "TEST_BM").Text, ".", "/"))
Debug.Print d
End Sub

Using Cross Reference Field Code to move selection to Target of Field Code

OP Update:
Thanks for the code KazJaw, it prompted me to change the approach I am trying to tackle the problem with. This is my current code:
Sub Method3()
Dim intFieldCount As Integer
Dim i As Integer
Dim vSt1 As String
intFieldCount = ActiveDocument.Fields.Count
For i = 1 To intFieldCount
ActiveDocument.Fields(i).Select 'selects the first field in the doc
Selection.Expand
vSt1 = Selection.Fields(1).Code
'MsgBox vSt1
vSt1 = Split(vSt1, " ")(2) 'Find out what the (2) does
MsgBox vSt1
ActiveDocument.Bookmarks(vSt1).Select 'Selects the current crossreference in the ref list
Next i
End Sub
OK, so the code currently finds the first field in the document, reads its field code and then jumps to the location in the document to mimic a CTRL+Click.
However, it does this for all types of fields: bookmarks, endnotes, figures, tables etc. I only want to find Reference fields. I thought I could deduce this from the field code but it turns out figures and bookmarks use the same field code layout, i.e.
A Reference/Bookmark has a field code {REF_REF4123123214\h}
A Figure cross ref has the field code {REF_REF407133655\h}
Is there an effective way to get VBA to distinguish between the two? I was thinking as reference fields in the document are written as (Reference 1) I could find the field and then string compare the word on the left to see if it says "Reference".
I was thinking of using the MoveLeft Method to do this
Selection.MoveLeft
But I can't work out how to move left 1 word from the current selection and select that word instead to do the strcomp
Or perhaps I can check the field type? with...
If Selection.Type = wdFieldRef Then
Do Something
End If
But I am not sure which "Type" I should be looking for.
Any advice is appreciated
All REF fields "reference" bookmarks. Word sets bookmarks on all objects that get a reference for a REF field: figures, headings, etc. There's no way to distinguish from the content of the field what's at the other end. You need to "inspect" that target, which you can do without actually selecting it. For example, you could check whether the first six letters are "Figure".
The code you have is inefficient - there's no need to use the Selection object to get the field code. The following is more efficient:
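Sub Method3()
    Dim fld As Word.Field
    Dim rng As Word.Range
    Dim vSt1 As String
    For Each fld In ActiveDocument.Fields
        'Only REF fields point at a bookmark, so skip every other field type
        If fld.Type = wdFieldRef Then
            vSt1 = fld.Code.Text
            'MsgBox vSt1
            'The code text starts with a space, so splitting on " " gives an
            'empty string at index 0, "REF" at index 1 and the bookmark name at index 2
            vSt1 = Split(vSt1, " ")(2)
            MsgBox vSt1
            Set rng = ActiveDocument.Bookmarks(vSt1).Range
            If Left(rng.Text, 6) <> "Figure" Then
                rng.Select
            End If
        End If
    Next fld
End Sub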
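Pulling the bookmark name out of the field code can also be done in a small function that does not depend on the exact amount of padding around the code text. This is a sketch under assumptions: RefTargetName is a hypothetical name, and it expects the parts of the field code to be separated by single spaces.

```vba
' Hypothetical helper: returns the bookmark name from a REF field code,
' or an empty string when the code is not a REF field.
Function RefTargetName(ByVal sFieldCode As String) As String
    Dim parts() As String

    ' Trim first so the field name is always at index 0
    parts = Split(Trim$(sFieldCode), " ")
    If UBound(parts) >= 1 Then
        If StrComp(parts(0), "REF", vbTextCompare) = 0 Then
            RefTargetName = parts(1)
        End If
    End If
End Function
```

It would be called with fld.Code.Text, and the result passed to ActiveDocument.Bookmarks when it is not empty.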