Getting text from header columns - vba

I have an MS Word document with a three-column header like in the screenshot below.
Please advise a VBA macro to get the text in each of the columns (in my case "foo", "bar" and "baz").
So far I have tried
Sub Test()
MsgBox Documents(1).Sections(1).Headers(wdHeaderFooterPrimary).Range.Text
End Sub
enter code here
, but it returns text with zeros ("foo0bar0baz"), which seems not to be suitable to break this text in general case, when the column texts themselves can contain zeros (e.g. "foo0", "0bar00" and "0baz").

You use the Split function to create an array of the text. You will need to know what character has been used to separate the columns. It will probably be either a normal tab or an alignment tab.
For a normal tab:
Sub SplitHeader()
Dim colHeads As Variant
colHeads = Split(ActiveDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range.Text, vbTab)
Debug.Print colHeads(0)
Debug.Print colHeads(1)
Debug.Print colHeads(2)
End Sub
For an alignment tab:
Sub SplitHeader()
Dim colHeads As Variant
colHeads = Split(ActiveDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range.Text, Chr(48))
End Sub

Related

VBA store formatted text in clipboard

I need to copy/store a string of text into the clipboard but need that text to be formatted (font type, color, weight, etc.)
Private Sub copyToCB(varText As String)
Dim x As Variant
x = varText
CreateObject("htmlfile").parentWindow.clipboardData.setData "text", x
End Sub
The above does the job of storing the referred text into the clipboard but it's stored as plain text. I'd like it to be e.g. bold and red.
I've been scouring the Internet literally for hours, to no avail. You'd think this would be something straightforward but I'm at a total loss!
If you use the clipboard classes from #GMCB found at https://stackoverflow.com/a/63735992/478884
You can do this:
Sub TestCopying()
CopyWithSomeFormatting "This should paste as red/bold"
End Sub
Sub CopyWithSomeFormatting(txt As String)
Dim myClipboard As New vbaClipboard 'Instantiate a vbaClipboard object
myClipboard.SetClipboardText _
"<span style='color:#F00;font-weight:bold'>" & txt & "</span>", "HTML Format"
End Sub
Works for me at least when pasting to Word/Excel

Sub to find text in a Word document by specified font and font size

Goal: Find headings in a document by their font and font size and put them into a spreadsheet.
All headings in my doc are formatted as Ariel, size 16. I want to do a find of the Word doc, select the matching range of text to the end of the line, then assign it to a variable so I can put it in a spreadsheet. I can do an advanced find and search for the font/size successfully, but can't get it to select the range of text or assign it to a variable.
Tried modifying the below from http://www.vbaexpress.com/forum/showthread.php?55726-find-replace-fonts-macro but couldn't figure out how to select and assign the found text to a variable. If I can get it assigned to the variable then I can take care of the rest to get it into a spreadsheet.
'A basic Word macro coded by Greg Maxey
Sub FindFont
Dim strHeading as string
Dim oChr As Range
For Each oChr In ActiveDocument.Range.Characters
If oChr.Font.Name = "Ariel" And oChr.Font.Size = "16" Then
strHeading = .selected
Next
lbl_Exit:
Exit Sub
End Sub
To get the current code working, you just need to amend strHeading = .selected to something like strHeading = strHeading & oChr & vbNewLine. You'll also need to add an End If statement after that line and probably amend "Ariel" to "Arial".
I think a better way to do this would be to use Word's Find method. Depending on how you are going to be inserting the data into the spreadsheet, you may also prefer to put each header that you find in a collection instead of a string, although you could easily delimit the string and then split it before transferring the data into the spreadsheet.
Just to give you some more ideas, I've put some sample code below.
Sub Demo()
Dim Find As Find
Dim Result As Collection
Set Find = ActiveDocument.Range.Find
With Find
.Font.Name = "Arial"
.Font.Size = 16
End With
Set Result = Execute(Find)
If Result.Count = 0 Then
MsgBox "No match found"
Exit Sub
Else
TransferToExcel Result
End If
End Sub
Function Execute(Find As Find) As Collection
Set Execute = New Collection
Do While Find.Execute
Execute.Add Find.Parent.Text
Loop
End Function
Sub TransferToExcel(Data As Collection)
Dim i As Long
With CreateObject("Excel.Application")
With .Workbooks.Add
With .Sheets(1)
For i = 1 To Data.Count
.Cells(i, 1) = Data(i)
Next
End With
End With
.Visible = True
End With
End Sub

What does a hyperlink range.start and range.end refer to?

I'm trying to manipulate some text from a MS Word document that includes hyperlinks. However, I'm tripping up at understanding exactly what Range.Start and Range.End are returning.
I banged a few random words into an empty document, and added some hyperlinks. Then wrote the following macro...
Sub ExtractHyperlinks()
Dim rHyperlink As Range
Dim rEverything As Range
Dim wdHyperlink As Hyperlink
For Each wdHyperlink In ActiveDocument.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Set rEverything = ActiveDocument.Range
rEverything.TextRetrievalMode.IncludeFieldCodes = True
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start) & "#" & vbCrLf
Next
End Sub
However, the output between the #s does not quite match up with the hyperlinks, and is more than a character or two out. So if the .Start and .End do not return char positions, what do they return?
This is a bit of a simplification but it's because rEverything counts everything before the hyperlink, then all the characters in the hyperlink field code (including 1 character for each of the opening and closing field code braces), then all the characters in the hyperlink field result, then all the characters after the field.
However, the character count in the range (e.g. rEverything.Characters.Count or len(rEverything)) only includes the field result if TextRetrievalMode.IncludeFieldCodes is set to False and only includes the field code if TextRetrievalMode.IncludeFieldCodes is set to True.
So the character count is always smaller than the range.End-range.Start.
In this case if you change your Debug expression to something like
Debug.Print "#" & Mid(rEverything.Text, rHyperlink.Start, rHyperlink.End - rHyperlink.Start - (rEverything.End - rEverything.Start - 1 - Len(rEverything))) & "#" & vbCrLf
you may see results more along the lines you expect.
Another way to visualise what is going on is as follows:
Create a very short document with a piece of text followed by a short hyperlink field with short result, followed by a piece of text. Put the following code in a module:
Sub Select1()
Dim i as long
With ActiveDocument
For i = .Range.Start to .Range.End
.Range(i,i).Select
Next
End With
End Sub
Insert a breakpoint on the "Next" line.
Then run the code once with the field codes displayed and once with the field results displayed. You should see the progress of the selection "pause" either at the beginning or the end of the field, as the Select keeps "selecting" something that you cannot actually see.
Range.Start returns the character position from the beginning of the document to the start of the range; Range.End to the end of the range.
BUT everything visible as characters are not the only things that get counted, and therein lies the problem.
Examples of "hidden" things that are counted, but not visible:
"control characters" associated with content controls
"control characters" associated with fields (which also means hyperlinks), which can be seen if field result is toggled to field code display using Alt+F9
table structures (ANSI 07 and ANSI 13)
text with the font formatting "hidden"
For this reason, using Range.Start and Range.End to get a "real" position in the document is neither reliable nor recommended. The properties are useful, for example, to set the position of one range relative to the position of another.
You can get a somewhat more accurate result using the Range.TextRetrievalMode boolean properties IncludeHiddenText and IncludeFieldCodes. But these don't affect the structural elements involved with content controls and tables.
Thank you both so much for pointing out this approach was doomed but that I could still use .Start/.End for relative positions. What I was ultimately trying to do was turn a passed paragraph into HTML, with the hyperlinks.
I'll post what worked here in case anyone else has a use for it.
Function ExtractHyperlinks(rParagraph As Range) As String
Dim rHyperlink As Range
Dim wdHyperlink As Hyperlink
Dim iCaretHold As Integer, iCaretMove As Integer, rCaret As Range
Dim s As String
iCaretHold = 1
iCaretMove = 1
For Each wdHyperlink In rParagraph.Hyperlinks
Set rHyperlink = wdHyperlink.Range
Do
Set rCaret = ActiveDocument.Range(rParagraph.Characters(iCaretMove).Start, rParagraph.Characters(iCaretMove).End)
If RangeContains(rHyperlink, rCaret) Then
s = s & Mid(rParagraph.Text, iCaretHold, iCaretMove - iCaretHold) & "" & IIf(wdHyperlink.TextToDisplay <> "", wdHyperlink.TextToDisplay, wdHyperlink.Address) & ""
iCaretHold = iCaretMove + Len(wdHyperlink.TextToDisplay)
iCaretMove = iCaretHold
Exit Do
Else
iCaretMove = iCaretMove + 1
End If
Loop Until iCaretMove > Len(rParagraph.Text)
Next
If iCaretMove < Len(rParagraph.Text) Then
s = s & Mid(rParagraph.Text, iCaretMove)
End If
ExtractHyperlinks = "<p>" & s & "</p>"
End Function
Function RangeContains(rParent As Range, rChild As Range) As Boolean
If rChild.Start >= rParent.Start And rChild.End <= rParent.End Then
RangeContains = True
Else
RangeContains = False
End If
End Function

How to get Selection.Text to read displaytext of Macro field code

I'm having some trouble figuring this out, and would really appreciate some help. I'm trying to write a macro that uses the selection.text property as a Case text-expression. When the macro is clicked in Microsoft Word, the selected text is automatically set to the DisplayText. This method worked great for the formatting via Selection.Font.Color for a quick and dirty formatting toggling macro, but it doesn't work for the actual text.
When debugging with MsgBox, it is showing a box (Eg: □ ) as the value.
For example,
Word Field Code:
{ MACROBUTTON Macro_name DisplayText }
VBA Code run when highlighting "DisplayText" in Word:
Sub Macro_name()
Dim Str As String
Str = Selection.Text
MsgBox Str
Select Case Str
Case "DisplayText"
MsgBox "A was selected"
Case "B"
MsgBox "B was selected"
End Select
End Sub
What is output is a Message Box that only shows □
When I run this macro with some regular text selected, it works just fine.
My question is this: Is there a way to have the macro read the displaytext part of the field code for use in the macro?
You can read the field code, directly, instead of the selection (or the Field.Result which also doesn't give the text).
It's not quite clear how this macro is to be used throughout the document, so the code sample below provides two variations.
Both check whether the selection contains fields and if so, whether the (first) field is a MacroButton field. The field code is then tested.
In the variation that's commented out (the simpler one) the code then simply checks whether the MacroButton display text is present in the field code. If it is, that text is assigned to the string variable being tested by the Select statement.
If this is insufficient because the display text is "unknown" (more than one MacroButton field, perhaps) then it's necessary to locate the part of the field code that contains the display text. In this case, the function InstrRev locates the end point of the combined field name and macro name, plus the intervening spaces, in the entire field code, searching from the end of the string. After that, the Mid function extracts the display text and assigns it to the string variable tested by the Select statement.
In both variations, if the selection does not contain a MacroButton field then the selected test is assigned to the string variable for the Select statement.
(Note that for my tests I needed to use Case Else in the Select statement. You probably want to change that back to Case "B"...)
Sub Display_Field_DisplayText()
Dim Str As String, strDisplayText As String
Dim textLoc As Long
Dim strFieldText As String, strMacroName As String
Dim strFieldName As String, strFieldCode As String
strDisplayText = "text to display"
If Selection.Fields.Count > 0 Then
If Selection.Fields(1).Type = wdFieldMacroButton Then
strFieldName = "MacroButton "
strMacroName = "Display_Field_DisplayText "
strFieldCode = strFieldName & strMacroName
Str = Selection.Fields(1).code.text
textLoc = InStrRev(Str, strFieldCode)
strFieldText = Mid(Str, textLoc + Len(strFieldCode))
MsgBox strFieldText
Str = strFieldText
'If InStr(Selection.Fields(1).code.text, strDisplayText) > 0 Then
' Str = strDisplayText
'End If
End If
Else
Str = Selection.text
End If
Select Case Str
Case strDisplayText
MsgBox "A was selected"
Case Else
MsgBox "B was selected"
End Select
End Sub

Using Cross Reference Field Code to move selection to Target of Field Code

OP Update:
Thanks for the code KazJaw, it prompted me to change the approach I am trying to tackle the problem with. This is my current code:
Sub Method3()
Dim intFieldCount As Integer
Dim i As Integer
Dim vSt1 As String
intFieldCount = ActiveDocument.Fields.Count
For i = 1 To intFieldCount
ActiveDocument.Fields(i).Select 'selects the first field in the doc
Selection.Expand
vSt1 = Selection.Fields(1).Code
'MsgBox vSt1
vSt1 = Split(vSt1, " ")(2) 'Find out what the (2) does
MsgBox vSt1
ActiveDocument.Bookmarks(vSt1).Select 'Selects the current crossreference in the ref list
Next i
End Sub
Ok the so the Code currently finds the first field in the document, reads its field code and then jumps to the location in the document to mimic a CTRL+Click.
However, It does this for all types of fields Bookmarks, endnotes, figures, tables etc. I only want to find Reference fields. I thought I could deduce this from the field code but it turns out figures and bookmarks use the same field code layout ie.
A Reference/Boookmark has a field code {REF_REF4123123214\h}
A Figure cross ref has the field code {REF_REF407133655\h}
Is there an effective way to get VBA to distinguish between the two? I was thinking as reference fields in the document are written as (Reference 1) I could find the field and then string compare the word on the left to see if it says "Reference".
I was thinking of using the MoveLeft Method to do this
Selection.MoveLeft
But I can't work out how to move left 1 word from the current selection and select that word instead to do the strcomp
Or perhaps I can check the field type? with...
If Selection.Type = wdFieldRef Then
Do Something
End If
But I am not sure which "Type" i should be looking for.
Any advice is appreciated
All REF fields "reference" bookmarks. Word sets bookmarks on all objects that get a reference for a REF field: figures, headings, etc. There's no way to distinguish from the content of the field what's at the other end. You need to "inspect" that target, which you can do without actually selecting it. For example, you could check whether the first six letters are "Figure".
The code you have is inefficient - there's no need to use the Selection object to get the field code. The following is more efficient:
Sub Method3()
Dim fld As Word.Field
Dim rng as Word.Range
Dim vSt1 As String
ForEach fld in ActiveDocument.Fields
vSt1 = fld.Code
'MsgBox vSt1
vSt1 = Split(vSt1, " ")(2) 'Find out what the (2) does
MsgBox vSt1
Set rng = ActiveDocument.Bookmarks(vSt1).Range
If Left(rng.Text, 6) <> "Figure" Then
rng.Select
End If
Next
End Sub