VB.net: Getting underlined words between two strings in Word

I am trying to select the underlined words between two sections of a Word document and save them to a collection variable. I am building this in Blue Prism, which uses VB.net.
I have the following code, which runs, but it seems to select all underlined words in the document (and also some blank lines), instead of only the underlined words between the two specific sections.
The document looks like this (Word Document Example):
I would want to select "Example1" and "Example3" in the document and save them to a variable, since those are between the two sections and underlined. The two section names will always be the same.
Here's the code I currently have:
Dim w As Object = doc.Application
Dim s As Object = w.Selection
Dim Para as Microsoft.Office.Interop.Word.Paragraph
Dim blnStart as Boolean
blnStart = false
Dim table As New System.Data.DataTable()
table.Columns.Add("Underlined_Text", GetType(String))
For Each Para In doc.Paragraphs
If Para.Range.Text.ToLower.Contains(strStartText) Then
blnStart = true
End If
If Para.Range.Font.Underline = 1 and blnStart Then
With s.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Font.Underline = 1
.Text = ""
.Replacement.Text = ""
.Format = True
.Forward = True
.Wrap = 0
.Execute
End With
.Select
table.Rows.Add(s.Range.Text)
End With
End If
If Para.Range.Text.ToLower.Contains(strEndText) Then
exit for
End If
Next Para
Underlined_Text = table
doc = Nothing
The variables 'strStartText' and 'strEndText' would be equal to the two section names.
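One possible way to restrict the search is sketched below in VBA rather than as a Blue Prism VB.net stage, and it has not been tested against the asker's document. It locates the two section names first and then runs the underline search only on the range between them. The function name `UnderlinedTextBetween` and its parameters are hypothetical. It assumes the code runs inside Word, that both section names occur in order, and that single underline (`wdUnderlineSingle`, the value 1 used above) is the formatting of interest.

```vba
' Hypothetical helper: collects underlined runs found between two marker strings.
Function UnderlinedTextBetween(doc As Document, ByVal startText As String, _
                               ByVal endText As String) As Collection
    Dim results As New Collection
    Dim marker As Range
    Dim rng As Range
    Dim endPos As Long

    Set UnderlinedTextBetween = results

    ' Find the first section name; the search area starts right after it.
    Set marker = doc.Content
    If Not marker.Find.Execute(FindText:=startText, MatchCase:=False, _
                               Forward:=True, Wrap:=wdFindStop) Then Exit Function
    Set rng = doc.Range(marker.End, doc.Content.End)

    ' Find the second section name after the first; the search area ends there.
    Set marker = rng.Duplicate
    If Not marker.Find.Execute(FindText:=endText, MatchCase:=False, _
                               Forward:=True, Wrap:=wdFindStop) Then Exit Function
    endPos = marker.Start
    rng.End = endPos

    With rng.Find
        .ClearFormatting
        .Font.Underline = wdUnderlineSingle
        .Text = ""
        .Format = True
        .Forward = True
        .Wrap = wdFindStop
    End With

    Do While rng.Find.Execute
        ' Each hit redefines rng, so the original end has to be re-checked.
        If rng.Start >= endPos Then Exit Do
        ' Skip hits that contain only spaces or paragraph marks.
        If Len(Trim$(Replace(rng.Text, vbCr, ""))) > 0 Then results.Add rng.Text
        rng.Collapse wdCollapseEnd
    Loop
End Function
```

Calling `UnderlinedTextBetween(ActiveDocument, strStartText, strEndText)` would return one item per underlined run. A run may hold several adjacent underlined words, so further splitting may be needed before the values are written to the `Underlined_Text` table.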

Related

Find/Replace an Inserted Check Box Symbol with a Check Box Content Control

I would like to find/replace all inserted check box symbols with checkbox content controls. The symbol's font is Wingdings (either 111 or 168). Below is the code I started with, but I hit a wall when I realized that Word find doesn't recognize the symbol. I appreciate any help or guidance. Thank you.
Sub ReplaceUnicode168()
Dim objContentControl As ContentControl
With ActiveDocument
Set objContentControl = ActiveDocument.ContentControls.Add(wdContentControlCheckBox)
objContentControl.Cut
With Selection.Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.MatchCase = False
.MatchWholeWord = True
.MatchWildcards = False
.Text = Chr(168)
.Replacement.Text = "^c"
.Execute Replace:=wdReplaceAll
End With
End With
End Sub
I suggest that you try to find/replace these two particular characters using
.Text = ChrW(61551)
for the "111" WingDings Character and
.Text = ChrW(61608)
for the "168" WingDings character.
Be aware that the way Word encodes these characters is not very helpful. As far as Find/Replace is concerned, you have to use these Unicode Private Use Area encodings.
If you actually select the character and use VBA to discover its code using e.g.
Debug.Print AscW(Selection)
the answer is always 40 (and the font of the character will probably be the same as the surrounding font), which is not much use. In older versions of Word you could search for character 40 and find these characters, but I don't think that's possible now. But if you select the character and use
Sub SymInfo()
With Dialogs(wdDialogInsertSymbol)
' You won't see .Font and .CharNum listed under the
' properties of a Word.Dialog - some older Dialogs add
' per-Dialog properties at runtime.
Debug.Print .Font
Debug.Print .CharNum
End With
End Sub
Then you get the font name (Wingdings in this case) and the private use area character number, except it's expressed as a negative number (-3928 for Wingdings 168). The character to use in the Find/Replace is 65536-3928 = 61608.
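The arithmetic above can be wrapped in a small helper. This is only a sketch: `PuaCodeFromCharNum` is a made-up name, and it assumes the negative `.CharNum` value is the signed 16-bit form of the private use area code, as in the Wingdings 168 example.

```vba
' Hypothetical helper: converts the (possibly negative) .CharNum reported by
' the Insert Symbol dialog into the code that ChrW and Find expect.
Function PuaCodeFromCharNum(ByVal charNum As Long) As Long
    If charNum < 0 Then
        PuaCodeFromCharNum = charNum + 65536   ' e.g. -3928 + 65536 = 61608
    Else
        PuaCodeFromCharNum = charNum
    End If
End Function

Sub DemoPuaCode()
    Debug.Print PuaCodeFromCharNum(-3928)        ' 61608
    Debug.Print Hex$(PuaCodeFromCharNum(-3928))  ' F0A8
End Sub
```

The result can then be passed straight to the search, for example `.Text = ChrW(PuaCodeFromCharNum(-3928))`.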
Alternatively, you can find the private use area code by selecting the character, getting its WordOpenXML code, then finding the XML element that gives the code (and the font). Ideally use MSXML to look for the element but the following gives the general idea.
Sub getSymElement()
Dim finish As Long
Dim start As Long
Dim x As String
x = Selection.WordOpenXML
start = InStr(1, x, "<w:sym")
' Nothing to print if the selection contains no symbol element.
If start = 0 Then Exit Sub
finish = InStr(start, x, ">")
Debug.Print Mid(x, start, finish + 1 - start)
End Sub
and for the 168 character you should see something like
<w:sym w:font="Wingdings" w:char="F0A8"/>
(Hex F0A8 is 61608)
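For the MSXML route mentioned above, a minimal late-bound sketch might look like the following. The function name `SymCharHex` is hypothetical. It assumes MSXML 6.0 is installed and that the symbol is stored as a `w:sym` element in the standard WordprocessingML namespace.

```vba
' Hypothetical helper: returns the w:char attribute of the first w:sym element
' in a WordOpenXML string, or "" if no such element is found.
Function SymCharHex(ByVal wordXml As String) As String
    Dim xmlDoc As Object
    Dim symNode As Object
    Dim charValue As Variant

    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False
    ' Register the "w" prefix so it can be used in the XPath query.
    xmlDoc.setProperty "SelectionNamespaces", _
        "xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'"

    If Not xmlDoc.LoadXML(wordXml) Then Exit Function

    Set symNode = xmlDoc.SelectSingleNode("//w:sym")
    If symNode Is Nothing Then Exit Function

    charValue = symNode.getAttribute("w:char")
    If Not IsNull(charValue) Then SymCharHex = charValue
End Function
```

With the character selected, `Debug.Print SymCharHex(Selection.WordOpenXML)` should print the hex code, which would be F0A8 for the Wingdings 168 character.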
There may be a problem where Word could potentially map more than one font/code to the same Unicode private use area code point. There is some further code by Dave Rado here but I do not think you will need it for this particular problem.
After some follow-up, the following seems to work reasonably well here:
Sub replaceWingdingsWithCCs()
Dim cc As Word.ContentControl
Dim charcode As Variant
Dim ccchecked As Variant
Dim i As Integer
Dim r As Word.Range
' Make sure the selection point is not in the way.
' (If the selection contains one of the characters you are trying to
' replace, Word will raise an error about the selection being in a
' plain text content control.)
' If the first item in the document is not a CC,
' it's enough to do this:
ActiveDocument.Select
Selection.Collapse WdCollapseDirection.wdCollapseStart
' Put the character codes you need to look for here. Maybe you have some checked boxes too?
charcode = Array(61551, 61608)
' For each code, say whether you want a checked box (True) or an unchecked one.
ccchecked = Array(False, False)
For i = LBound(charcode) To UBound(charcode)
Set r = ActiveDocument.Range
With r.Find
.ClearFormatting
With .Replacement
.ClearFormatting
.Text = ""
End With
.Forward = True
.Wrap = wdFindStop
.MatchCase = False
.MatchWholeWord = True
.MatchWildcards = False
.Text = ChrW(charcode(i))
Do While .Execute(Replace:=True)
Set cc = r.ContentControls.Add(WdContentControlType.wdContentControlCheckBox)
cc.Checked = ccchecked(i)
r.End = r.Document.Range.End
r.Start = cc.Range.End + 1
Set cc = Nothing
Loop
End With
Next
Set r = Nothing
End Sub

MS Word VBA Find Variable-Length Pattern String

Question: Is there a way to specify a repeating pattern of variable but bounded length in the Find.Text argument?
Background:
I have a collection of Word documents, each containing several hundred pages of numbered text blocks. I want to copy each block of text into its own cell in a spreadsheet, but the text blocks aren't in Ordered or Multi-Level Lists and each block of text may contain multiple paragraphs, so I can't simply select and copy each paragraph in the document. To work around this, I've tried to use the Range.Find method to locate two adjacent number headings and copy all the characters between them. For testing purposes, I'm using the following sample document:
The paragraph header numbers can be 2-5 levels deep, with 1-2 digits in each level (i.e. "x.x." through "xx.xx.xx.xx.xx."). I'm using a wildcard search of the form "xx.xx.", relying on the placement of the decimal points to identify the headers. Here's my code:
'Open the Word document
Doc = CStr(folderPath & objFile.Name)
Set wDoc = wApp.Documents.Open(Doc)
Set wRange = wDoc.Range
RngEnd = wRange.End
'Search for text block
With wRange
Do While i < 7 And subRngStart2 < RngEnd
With .Find 'Search for starting keyword
.ClearFormatting
.Text = "[0-9]{1,2}.[0-9]{1,2}."
.Forward = True
'.Format = True
.MatchWildcards = True
.MatchCase = False
.Execute
End With
If .Find.Found = True Then
subRngStart1 = wRange.Start 'Mark starting position
wRange.SetRange Start:=subRngStart1 + 6, End:=RngEnd 'Reset range starting at end of keyword
contentFlag = True
Else
contentFlag = False
End If
With .Find 'Search for ending keyword
.ClearFormatting
.Text = "[0-9]{1,2}.[0-9]{1,2}."
.Forward = True
.MatchWildcards = True
.MatchCase = False
.Execute
End With
If .Find.Found = True Then
subRngStart2 = wRange.Start 'Mark ending position
Else
subRngStart2 = RngEnd
End If
wRange.SetRange Start:=subRngStart1, End:=subRngStart2 'Set range between first and second keywords
'Copy text in range to Excel
If contentFlag = True Then
Cells(i + 1, 1) = wRange.Text
End If
wRange.SetRange Start:=subRngStart2 - 3, End:=RngEnd 'Reset range starting at last keyword
i = i + 1
Loop
End With
This works fine for headers up to 3 levels but breaks down beyond that: the "Long Headers" example gets split in half because the search thinks the first two levels in the string form a complete text block (Row 7 in the output sample below).
I could just increase the starting offset (first IF statement, second line) from 6 to 10 to "skip over" long number strings, but this can cause problems with very short headers. I think the proper way to fix this is to search for a pattern of the form "xx.xx." which may repeat up to 4 consecutive times. I've tried a couple of variations on the wildcard string to achieve this, including:
.Text = "[0-9]{1,2}.*[0-9]{1,2}."
.Text = "[0-9]{1,2}.[0-9]{0,2}[.]{0,1}[0-9]{1,2}."
But these either don't do what I want or fail to compile (I'm guessing a min length of zero isn't allowed in wildcard charlists). Is it possible to specify variable-length patterns in Find.Text, or do I need to take a completely different approach?
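As far as I know, Word's wildcard syntax cannot mark a group as optional or repeated, so one different approach is to test the start of each paragraph with a regular expression instead of `Find`. The sketch below swaps in the VBScript `RegExp` object, which is not what the code above uses. `HeaderNumberLength` is a hypothetical name, and the pattern assumes header numbers sit at the very start of a paragraph, with 2 to 5 levels of 1 or 2 digits, each followed by a period.

```vba
' Hypothetical helper: returns the length of a leading header number such as
' "1.2." or "10.11.12.13.14.", or 0 if the text does not start with one.
Function HeaderNumberLength(ByVal paragraphText As String) As Long
    Dim re As Object
    Dim matches As Object

    Set re = CreateObject("VBScript.RegExp")
    ' 1-2 digits plus a period, repeated 2 to 5 times, anchored at the start.
    re.Pattern = "^(?:\d{1,2}\.){2,5}"
    re.Global = False

    Set matches = re.Execute(paragraphText)
    If matches.Count > 0 Then HeaderNumberLength = matches(0).Length
End Function
```

A loop over `wDoc.Paragraphs` could then treat every paragraph for which `HeaderNumberLength(para.Range.Text) > 0` as the start of a new text block and append the paragraphs that follow to the same spreadsheet cell until the next header is reached.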

Adjusting the width of columns of all tables in a Word document

In my Word document, I have over 300 tables and I want to change the table style and adjust the columns' widths. I am using the following code in my VBA macro. It's working for a style but not for column width. Please help me find where the problem is.
Sub Makro1()
'
' Makro1 Makro
'
'
Selection.Find.ClearFormatting
With Selection.Find
.Text = "Variable"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute
Selection.MoveRight Unit:=wdCharacter, Count:=1
Selection.MoveRight Unit:=wdCharacter, Count:=4, Extend:=wdExtend
Selection.Tables(1).Style = "eo_tabelle_2"
With Tables(1).Spacing
.Item(1) = 5.5 'adjusts width of text box 1 in cm
.Item(2) = 8.5 'adjusts width of text box 2 in cm
.Item(3) = 7.5 'adjusts width of text box 3 in cm
.Item(4) = 1.1 'adjusts width of text box 4 in cm
End With
End Sub
I'm going to interpret your question literally: that you merely want to process all the tables in the document and that your code is using Find only in order to locate a table...
The following example shows how you can work with the underlying objects in Word directly, rather than relying on the current Selection, which is what the macro recorder gives you.
So, at the beginning we declare object variables for the Document and a Table. The current document with the focus is assigned to the first. Then, with For Each...Next we can loop through each Table object in that document and perform the same actions on each one.
In this case, the style is specified and the column widths set. Note that in order to give a column width in centimeters it's necessary to use a built-in conversion function CentimetersToPoints since Word measures column width in Points.
Sub FormatTables()
Dim doc As Document
Dim tbl As Table
Set doc = ActiveDocument
For Each tbl In doc.Tables
tbl.Style = "eo_tabelle_2"
' Column widths are given in centimeters and converted to points
tbl.Columns(1).Width = CentimetersToPoints(5.5)
tbl.Columns(2).Width = CentimetersToPoints(8.5)
tbl.Columns(3).Width = CentimetersToPoints(7.5)
tbl.Columns(4).Width = CentimetersToPoints(1.1)
Next
End Sub
As far as I can recall, all the tables in a Word file are part of the Tables collection, and an individual table can be accessed by index. Assuming that you won't know the number of tables, here's the code that works for me.
For Each tbl In Doc.Tables
tbl.Columns(3).Width = 40
Next

How can I replace multiple tables and text style within a range/selection in Word-VBA?

I am working with VBA on a Word template which, for every item (requirements in this case), contains a table with different specifications (all the tables are in the same format) and some other information. Below each table there is text showing the status of the item, such as Status: Approved, Work, or Rejected. I have been asked to delete all the other statuses in the template, keep only the "Rejected" status, and format the table and all the information that has this status in light grey. Does anybody have any idea how to navigate to all the tables and information, and how to specify the section I need to format? I am very new to this and completely stuck! Here's some code I wrote:
Sub DeleteWorkflow()
Selection.Find.ClearFormatting
Selection.Find.Style = ActiveDocument.Styles("Normal")
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Italic = False
With Selection.Find.Replacement.ParagraphFormat
.SpaceBefore = 0
.SpaceBeforeAuto = False
.SpaceAfter = 0
.SpaceAfterAuto = False
End With
With Selection.Find
.Text = "Status: Approved"
.Text = "Status: Work"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.Execute
'Finds status "Rejected" and changes the font color
Selection.Find.ClearFormatting
With Selection.Find
.Text = "Status: Rejected"
.Forward = True
.Wrap = Word.WdFindWrap.wdFindContinue
.Font.ColorIndex = wdGray50
Selection.Find.Execute
End With
The code to find the rejected status and change its color is not working, and I don't understand why. Any ideas?
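A likely reason, offered as an assumption rather than a tested diagnosis: setting `.Font.ColorIndex` on `Find` itself makes the color part of the search criteria, so the search looks for text that is already grey instead of recoloring anything. A minimal sketch that applies the color through `Replacement` follows. The procedure name `GrayOutRejectedStatus` is hypothetical, and the sketch only recolors the status text, not the table above it.

```vba
' Hypothetical sketch: recolors every "Status: Rejected" occurrence.
Sub GrayOutRejectedStatus()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "Status: Rejected"
        .Replacement.Text = "^&"                  ' keep the found text as it is
        .Replacement.Font.ColorIndex = wdGray50   ' formatting to apply
        .Forward = True
        .Wrap = wdFindStop
        .Format = True
        .MatchCase = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Working on `ActiveDocument.Content` instead of `Selection` also keeps the cursor position out of the search.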
Basis of the idea
The idea is to look through the sentences of the Word document. Sentences comprise regular text and also text contained within tables.
Once all the sentences are loaded into a single object in VBA, you can go through the document sentence by sentence and perform an action on each one.
We can also apply that type of search to the tables within the document, if the text they contain matches the characters you want.
The code
For sentences
Sub SENTENCE_CHANGE_COLOR()
Dim i As Long
Dim oSentences As Sentences
' Here we set oSentences to the collection of all sentences in the document.
' ThisDocument is the document that contains this macro; use ActiveDocument
' if the macro is stored in a different document or template.
Set oSentences = ThisDocument.Sentences
' We loop through every sentence of the document
For i = 1 To oSentences.Count
' The .Text property contains the text of the item
' Then we just have to look for the text within the string of characters
If InStr(oSentences.Item(i).Text, "Status: Rejected") > 0 Then
' Do some stuff, like changing the color
oSentences.Item(i).Font.ColorIndex = wdGray50
Else
' Do some other things, like changing to a different color
oSentences.Item(i).Font.ColorIndex = wdGray25
End If
Next i
End Sub
For tables
Sub TABLE_CHANGE_COLOR()
Dim i As Long
Dim oTables As Tables
' Here we set oTables to the collection of all tables in the document
Set oTables = ThisDocument.Tables
' We loop through every table of the document
For i = 1 To oTables.Count
' Finding the occurrence of the text in the table
If Not InStr(oTables.Item(i).Range.Text, "Status: Rejected") = 0 Then
'Do some stuff, like changing the color
oTables.Item(i).Range.Font.ColorIndex = wdGray50
End If
Next i
End Sub
Combination of the above methods
Once we have found an occurrence of "Status: Rejected" in the document, we can select the table right before it by comparing the table's end to the start of the occurrence.
Beware: the following code modifies whichever table comes before "Status: Rejected". So if "Status: Rejected" has been entered in the wrong location, the code will modify the previous table, wherever that table is in the document.
Sub REJECTED_TABLE_CHANGE_COLOR()
Dim i As Long, j As Long
Dim oSentences As Sentences
Dim oTables As Tables
'Here we instantiate the variable oSentences to store all the values of the current opened document
Set oSentences = ThisDocument.Sentences
'Here we instantiate the variable oTables to store all the tables of the current opened document
Set oTables = ThisDocument.Tables
' We loop through every sentence of the document
For i = 1 To oSentences.Count
' The .Text property contains the text of the item
' Then we just have to look for the text within the string of characters
If InStr(oSentences.Item(i).Text, "Status: Rejected") Then
' When we have found the correct text, we try to find the table just above it
' We start from the last table
' This condition ensures we do not look before the first table
If oTables.Item(1).Range.End < oSentences.Item(i).Start Then
j = oTables.Count
While oTables.Item(j).Range.End > oSentences.Item(i).Start
j = j - 1
Wend
oTables.Item(j).Range.Font.ColorIndex = wdGray50
End If
End If
Next i
End Sub
This solution gives you a basis for editing the document when the matching criterion is found within an item.

Microsoft Word VBA Macro - One Paragraph Find-Replace Styles

I am executing a style search in Microsoft Word using a VBA Macro.
My goal is to perform certain actions once for every style found in the document.
The macro works correctly on documents that have at least two paragraphs, but the macro does not alert the style correctly in a document that has exactly one paragraph in it. It seems strange that when I enter a new paragraph mark, the styles are found, even though I did not add any new text or styles to the document, just an extra blank paragraph mark. Does anyone know what is wrong with my macro and how I can fix this? Thanks for taking a look.
Sub AlertAllStylesInDoc()
Dim Ind As Integer
Dim numberOfDocumentStyles As Integer
Dim styl As String
Dim StyleFound As Boolean
numberOfDocumentStyles = ActiveDocument.styles.count
For Ind = 1 To numberOfDocumentStyles
styl = ActiveDocument.styles(Ind).NameLocal
With ActiveDocument.Content.Find
.ClearFormatting
.text = ""
.Forward = True
.Format = True
.Style = styl
Do
StyleFound = .Execute
If StyleFound = True Then
' actual code does more than alert, but keeping it simple here'
MsgBox styl
GoTo NextStyle
Else
Exit Do
End If
Loop
End With
NextStyle:
Next
End Sub
I don't understand why ActiveDocument.Content is not working, but replacing it with ActiveDocument.Range(0,0) appears to resolve the issue (tested in Word 2016).
With ActiveDocument.Range(0, 0).Find