Selecting text from a specific column in Word with VBA - vba

I have a document which is separated by section breaks.
Within each section I may have zero or one column breaks.
I want to extract the text from the first column of each section that contains 2 columns, like so:
For Each oSec In ActiveDocument.Sections
iSectionStart = oSec.Range.Start
iSectionEnd = oSec.Range.End
i = oSec.PageSetup.TextColumns.Count
If (2 = i) Then
' Update the range to only contain the text in textcolumn 1
' then select and copy it to a destination string
End If
Next oSec
However, the TextColumns object does not seem to have a method for returning the column contents.

TextColums.Count is actually not specified by the number of Column Breaks. You can have 2 columns (i.e. TextColumns.Count = 2) without a single Column Break.
If you for instance create a new document, fill it with random text by typing
=Rand(100)
and hit enter and select Two Columns from the Layout Tab. You will notice that you get two columns over 8 pages or so where none of the pages have Column Breaks.
The Office Object Model does not provide with an option to automatically select a specific column on a specifice page within a section. If the document actually has Column Breaks you can use the Find option to find the Column Break and from there select the Range from the start of the page to the start of the Column Break character that you just found using the Find option. Not a trivial thing to do as you can see.

Since the column break marker is represented by the ASCII value 14, all I had to do was look at each word in the section until I found the expected marker
Sub ExtractColumnText()
'
' On pages with no columns, the text is copied to both output files
' On pages with two columns, the column1 text is copied to "C:\DocTemp\Italian.doc"
' and column2 text is copied to "C:\DocTemp\English.doc"
'
Dim DestFileNum1 As Long
Dim DestFileNum2 As Long
Dim strDestFile1 As String
Dim strDestFile2 As String
Dim strCol1 As String
Dim strCol2 As String
Dim i As Integer
Dim oSec As Section
Dim oRngCol1 As Range
Dim oRngCol2 As Range
Dim oRngWord As Range
strDestFile1 = "C:\DocTemp\Italian.doc" 'Location of external file
DestFileNum1 = FreeFile()
strDestFile2 = "C:\DocTemp\English.doc" 'Location of external file
DestFileNum2 = DestFileNum1 + 1
Open strDestFile1 For Output As DestFileNum1
Open strDestFile2 For Output As DestFileNum2
For Each oSec In ActiveDocument.Sections
Set rngWorking = oSec.Range.Duplicate
Set oRngCol1 = rngWorking.Duplicate
oRngCol1.End = rngWorking.End - 1 ' exclude the page break
Set oRngCol2 = oRngCol1.Duplicate
If 2 <= oSec.PageSetup.TextColumns.Count Then
'examine each word in the section until we switch columns
For Each rngWord In rngWorking.Words
' 14 = column break marker
If 14 = AscW(rngWord.Text) Then
oRngCol1.End = rngWord.Start
oRngCol2.Start = rngWord.End
GoTo Xloop
End If
Next rngWord
End If
Xloop:
oRngCol1.Select
Print #DestFileNum1, oRngCol1.Text
oRngCol2.Select
Print #DestFileNum2, oRngCol2.Text
Next oSec
Close #DestFileNum1
Close #DestFileNum2
MsgBox "Done!"
End Sub

Related

Looping through columns to get column numbers based on headers

I have a template with a set number of columns (170) and title headers (row 1 cell name's). This is always the same, until users add columns in between (they're instructed not to change headers). The idea is to make it tamperproof as far as the adding of columns is involved.
I'd like to make variables to hold some of the headers (with the capacity to hold all) and check these with the template to find out the column number (in a loop I reckon). It's probably wisest to make a function to call upon it?
Dim ColHeader1Str as string 'literal row 1, column 1 value (which is always
'the same string and position in the template)
Dim iColHeader1 as integer 'holds the (to be set) value of the column number
Set ColHeader1Str = "ColHeader1"
Now I'd like a loop where it loops trough all the columns (last column = 200) and checks to see what the column number is that matches the ColHeader1Str and store this in the iColHeader1
So something like:
Function find_columnNmbr
Dim i As Integer
For i = 1 To 200 Step 1
If 'ColHeader1Str matches actual column header name
'set found integer as iColHeader1 and so forth
Exit For
End If
Next
End Function`
I know I'm missing a few steps and I'm hoping you guys can help me out.
Update: The template has set column headers. When users interact with it a result could be that columns shift position, or they add more. I have a workbook that needs to load data out of the user's altered template.
I.E. The template has columns 1, 2, 3, 4 and the names are column1, column 2 etc. A user ads a random column so now there are 5. The loop needs to loop through the names of the column headers and identify the column number of the original template columns 1, 2 etc based on a string variable with the original names, which I've hard coded beforehand. These are public constants.
What function LookForHeaders do: input a string, then search for the string in usersheet.range(1:1). If it is found, return the column number of that cell, otherwise it returns 0.
Private Function LookForHeaders(ByVal headerName As String) As Long
Dim rng As Range
Dim userSheet As WorkSheet
Set userSheet = 'userSheet here'
On Error GoTo NotFound
LookForHeaders = userSheet.Range("1:1").Find(headerName).Column
Exit Function
NotFound:
LookForHeaders = 0
End Function
Private Sub Test()
Dim rng As Range
Dim template As WorkSheet
Set template = 'template here'
For Each rng In template.Range(Cells(1,1), Cells(1,200))
iColHeader1 = LookForHeaders(rng.Value)
'Do something with iColHeader1
Next rng
End Sub
Not sure what your looking for but here is example
Option Explicit
Public Sub Example()
Dim LastCol As Long
Dim i As Long
LastCol = ActiveSheet.UsedRange.Columns(ActiveSheet.UsedRange.Columns.Count).Column
For i = 1 To LastCol
If Cells(i) = "Name" Then
Debug.Print Cells(i).Address
End If
Next
End Sub

How to get multiple results with one vlookup in VBA, Where vlookup is the part of the whole string (vlookup value)

I have 3 sheets, in sheet one I have a column "Register Codes" and I have extracted the unique codes in next column. Please check the below image.
Based on these unique codes, sub-codes are allocated in sheet 2. please check the below image.
Now what I am trying here is that in sheet 3 I need every "Register code" with the relevant "sub-code" which is allocated in sheet2 based on the "unique ID" given in Sheet1. please check the below image for expected output.
I have been using various combinations of formulas but could not get a proper solution. What is the best way to do it in VBA as I just started learning in this field.
Subject to a few conditions the following code will do what you want. Install it in a standard code module (by default "Module1", but you can name it as you like) in the workbook where you have your data.
Option Explicit
Enum Nws ' Worksheet navigation
NwsFirstDataRow = 2 ' presumed the same for all worksheets
NwsCode = 1 ' 1 = column A (change as required)
NwsSubCode ' No value means previous + 1
NwsNumer
End Enum
Sub NumerList()
' 05 Apr 2017
Dim Wb As Workbook ' all sheets are in the same workbook
Dim WsCodes As Worksheet ' Register codes
Dim WsNum As Worksheet ' Sub-code values
Dim WsOut As Worksheet ' Output worksheet
Dim RegName As String, RegCode As String
Dim Sp() As String
Dim Rs As Long ' Source row in WsNum
Dim Rt As Long ' Target row in WsOut
Dim R As Long, Rl As Long ' rows / Last row in WsCodes
Set Wb = ActiveWorkbook ' Make sure it is active!
Set WsCodes = Wb.Worksheets("Reg Codes") ' Change name to your liking
Set WsNum = Wb.Worksheets("Code Values") ' Change name to your liking
On Error Resume Next
Set WsOut = Wb.Worksheets("Output") ' Change name to your liking
If Err Then
Set WsOut = Wb.Worksheets.Add(After:=WsNum)
WsOut.Name = "Output" ' create the worksheet if it doesn't exist
End If
On Error GoTo 0
Rt = NwsFirstDataRow
With WsCodes
Rl = .Cells(.Rows.Count, NwsCode).End(xlUp).Row
For R = NwsFirstDataRow To Rl
RegName = .Cells(R, NwsCode).Value
Sp = Split(RegName, "-")
If UBound(Sp) > 1 Then ' must find at least 2 dashes
RegCode = Trim(Sp(1))
Else
RegCode = ""
End If
If Len(RegCode) Then
On Error Resume Next
Rs = WorksheetFunction.Match(RegCode, WsNum.Columns(NwsCode), 0)
If Err Then Rs = 0
On Error GoTo 0
If Rs Then
Do
WsOut.Cells(Rt, NwsCode).Value = RegName
WsOut.Cells(Rt, NwsSubCode).Value = WsNum.Cells(Rs, NwsSubCode).Value
WsOut.Cells(Rt, NwsNumer).Value = WsNum.Cells(Rs, NwsNumer).Value
Rt = Rt + 1
Rs = Rs + 1
Loop While WsNum.Cells(Rs, NwsCode).Value = RegCode
Else
RegCode = ""
End If
End If
If Len(RegCode) = 0 Then
WsOut.Cells(Rt, NwsCode).Value = RegName
WsOut.Cells(Rt, NwsSubCode).Value = "No sub-code found"
Rt = Rt + 1
End If
Next R
End With
End Sub
And here are the conditions.
All 3 sheets must be in the same workbook. If you have them in different workbooks the code must be adapted to handle more than one workbook.
The two worksheets with data must exist. They must be named as the code prescribes or the code must be modified to match the names they have. The same goes for the Output worksheet, but that sheet will be created by the code if it doesn't exist. You can change its name in the code.
The enumeration at the top of the code presumes that all 3 sheets are identically formatted with no data in row 1 (captions) and data in columns A, B and C. Changes aren't difficult but must be made if you want a different input or output. You can change the columns in the existing code by assigning other values to the columns in the enum, but the code requires the same arrangement in all sheets.
The extracted codes in the Codes sheet aren't used. The code does its own extraction. It will mark an error in the output list if a code can't be extracted or if it isn't found in the Sub-code list.
The sub-codes in the Numer sheet must be sorted like the picture you posted. The code will look for the first occurrence of "image" and find the subcodes in the following rows while the code is "image" in column A. It will not find further occurrences of "image" that might follow after an intermission.
The code doesn't do any colouring. Adding it wouldn't be difficult, but you would have to specify some rules, such as "20 different colours for the first 20 codes and then repeat the same sequence".
Other cell formatting could be added without much effort because each cell is already individually named. More properties can be added easily.

How to reference MS Word table of contents page numbers with Excel VBA?

I am trying to use the page numbers to the right-hand side of a table of contents object in Word in some VBA code. I can access the array storing the text associated with these page numbers using GetCrossReferenceItems(wdRefTypeHeading) but cannot seem to get at the page numbers themselves. None of the GetCrossReferenceItems constants listed here seem relevant.
Is there a way to reference these page numbers? Thanks!
I'm not a "Worder" so here is what I came up to:
Function GetPagesNumber(doc As Document) As Long()
Dim i As Long
Dim myRng As Range
Dim myHeadings As Variant
With doc
Set myRng = .Content
myRng.Collapse Direction:=wdCollapseEnd
myHeadings = .GetCrossReferenceItems(wdRefTypeHeading)
ReDim pages(1 To UBound(myHeadings)) As Long
For i = 1 To UBound(myHeadings)
myRng.InsertCrossReference ReferenceType:=wdRefTypeHeading, ReferenceKind:=wdPageNumber, ReferenceItem:=i
With .Paragraphs(ActiveDocument.Paragraphs.count).Range
myRng.SetRange Start:=.Start, End:=.End - 1
End With
pages(i) = CLng(myRng.Text)
Next i
End With
myRng.Delete
GetPagesNumber = pages
End Function
to be used like follows:
Option Explicit
Sub main()
Dim myPagesNumber() As Long
myPagesNumber = GetPagesNumber(ActiveDocument) '<-- store index pages numbers in myPagesNumber
End Sub
Instead of using;
myHeadings = .GetCrossReferenceItems(wdRefTypeHeading)
You can also use;
Dim myField As Field
Set myField = ActiveDocument.TablesOfContents(1).Range.Fields(1)
myHeadings = Split(myField.Result.Text, Chr(13))
This will return an array of strings, with within the last characters of the array the page number of the heading. Use pgnr = CInt(Right(myHeadings (i), Len(myHeadings (i)) - InStrRev(myHeadings (i), Chr(9)))) to get the pagenumber.
What would be better is to first split myHeadings into rows with Chr(13) as delimiter and then split into columns with Chr(9) as delimiter.
So the whole table of contents in an array.

Read a table in outlook mail using macro

I'm writing a macro to read the below Email:
Start Date: July-07-2016
Name Accept Approved
John Yes No
Peter No No
I'm good with search the word "Start date" and get the next 13 character to copy and paste that in a text file. But my problem is the next part is in a Table format. So when I'm searching for the name "John" and trying to copy the next 10 Characters. It doesn't work.
Is there a way to search for the word "Accept" and get the First Row data(Which will be No) and then Second Row data(Which will be No)? Is that possible?
This EMail's table will have only 2 Rows. So, I don't need any dynamic way to get the data. Can someone guide me?
I've tried searching the internet first, but the solutions are too huge for me to understand. Is there any simple way?
I have even tried the solution give here: How to read table pasted in outlook message body using vba? but that method works when the body has ONLY TABLE. But my EMail will have text as well as table.
I've never actually programmed in vba, but I think I can help (a bit) nevertheless.
In the answer on the post you linked to, there is the line
Set msg = ActiveExplorer.Selection.item(1)
I think you can change this to something like
Set msg = Right(ActiveExplorer.Selection.item(1), 25)
to get rid of the text before the table (I got the Right part from here: http://www.exceltrick.com/formulas_macros/vba-substring-function/, but it should also work in Outlook).
This way, you run the code on the table itself instead of on the whole message.If there is also text after the table, it might be more difficult, but you might get that done by searching for the table ending.
I hope this helps!
Attempt 2
After some searching and thinking, I came up with the idea to get the html of the message and use that to parse the table (Ok, not really, I got it from the comments here: http://www.codeproject.com/Questions/567073/Howplustoplusrecognizeplusandplusreadplustableplus). Based on that and other sources, it is possible to write a code that gets the table from an email.
I've written some code that might work, but I couldn't test it as I do not have Outlook. Also, this is my first time writing vba, so there may be a lot of syntax errors (and the code is ugly).
Sub GetTable()
Dim msg As Outlook.mailItem
Dim html As String
Dim tableBegin As String
Dim tableEnd As String
Dim posTableBegin As Long
Dim posTableEnd As Long
Dim table As String
Dim rowBegin As String
Dim rowEnd As String
Dim rowCount As Long
Dim columnBegin As String
Dim columnBeginLen As Long
Dim columnEnd As String
Dim posRowBegin As Long
Dim posRowEnd As Long
Dim values As String(0, 3)
Dim beginValue0 As Long
Dim beginValue1 As Long
Dim beginValue2 As Long
Dim EndValue0 As Long
Dim EndValue1 As Long
Dim EndValue2 As Long
' Get the message and the html
Set msg = ActiveExplorer.Selection.item(1)
html = msg.HTMLbody
' Get the begin and end positions of the table (within the html)
tableBegin = "<table>"
tableEnd = "</table>"
posTableBegin = InStr(1, html, tableBegin)
posTableEnd = InStr(posTableBegin, html, tableEnd)
' Get the html table
table = Mid(html, posTableBegin + Len(tableBegin), posTableEnd - posTableBegin - Len(tableBegin))
' Set the variables for the loop
rowBegin = "<tr>"
rowEnd = "</tr>"
rowCount = 0
columnBegin = "<td>"
columnBeginLen = Len(columnBegin)
columnEnd = "</td>"
' Loop trough all rows
posRowBegin = InStr(lastPos, table, rowBegin)
Do While posRowBegin != 0
' Get the end from the current row
posRowEnd = InStr(posRowBegin, table, rowEnd)
rowCount = rowCount + 1
' Make the array larger
ReDim Preserve values(rowCount + 1, 3)
' Get the contents from that row
row = Mid(table, posRowBegin + Len(rowBegin), posRowEnd - posRowBegin - Len(rowBegin))
' Get the three values from that row (name, Accept, Approved) and put it in the array
beginValue0 = InStr(1, row, columnBegin) + columnBeginLen
endValue0 = InStr(beginValue0, row, columnEnd)
beginValue1 = InStr(endValue0, row, columnBegin) + columnBeginLen
endValue1 = InStr(beginValue1, row, columnEnd)
beginValue2 = InStr(endValue1, row, columnBegin) + columnBeginLen
endValue2 = InStr(beginValue2, row, columnEnd)
values(rowCount, 0) = Mid(row, beginValue0, endValue0)
values(rowCount, 1) = Mid(row, beginValue1, endValue1)
values(rowCount, 2) = Mid(row, beginValue2, endValue2)
' Get the beginning of the next row
posRowBegin = InStr(lastPos, table, rowBegin)
Loop
' The values are now in the (double) array 'values'.
' values(0, [1-3]) contains the headers.
End Sub
As said before, the original idea came from http://www.codeproject.com/Questions/567073/Howplustoplusrecognizeplusandplusreadplustableplus. Additionally, I used Word VBA how to select text between two substrings and assign to variable? and the Microsoft documentation to write this.
While it is likely that the code does not work out of the box, I think it still gets the general idea (and some specifics) across, so that it can be used as a guide. I hope this is the solution you need!
You can actually use the Word Object Model to parse out the text from the table - assuming that the email is in HTML format.
Get a Word.Document object from the Inspector.WordEditor property and use Word objects and methods to get the text, like the following below example from MSDN. Just replace ActiveDocument with the variable you declare and set from WordEditor.
Sub ReturnCellContentsToArray()
Dim intCells As Integer
Dim celTable As Cell
Dim strCells() As String
Dim intCount As Integer
Dim rngText As Range
If ActiveDocument.Tables.Count >= 1 Then
With ActiveDocument.Tables(1).Range
intCells = .Cells.Count
ReDim strCells(intCells)
intCount = 1
For Each celTable In .Cells
Set rngText = celTable.Range
rngText.MoveEnd Unit:=wdCharacter, Count:=-1
strCells(intCount) = rngText
intCount = intCount + 1
Next celTable
End With
End If
End Sub

VBA UBound returns a negative value

I would like to know what I'm doing wrong...
I have a word document open (in word 2010) with three tables in it. I wanted to test basic table extraction in VBA and followed the instructions http://msdn.microsoft.com/en-us/library/office/aa537149(v=office.11).aspx.
Sub ExtractTableData()
Dim doc As Word.Document
Dim tbl As Word.Table
Dim rng As Word.Range
Dim sData As String
Dim aData1() As String
Dim aData2() As String
Dim aDataAll() As String
Dim nrRecs As Long
Dim nrFields As Long
Dim lRecs As Long
Dim lFields As Long
Set doc = ActiveDocument
Set tbl = doc.Tables(1)
Set rng = tbl.ConvertToText(Separator:=vbTab, _
NestedTables:=False)
' Pick up the delimited text into and put it into a string variable.
sData = rng.Text
' Restore the original table.
doc.Undo
' Strip off last paragraph mark.
sData = Mid(sData, 1, Len(sData) - 1)
' Break up each table row into an array element.
aData1() = Split(sData, vbCr)
nrRecs = UBound(aData1())
' The messagebox below is for debugging purposes and tells you
' how many rows are in the table. It is commented out but can
' be used simply by uncommenting it.
'MsgBox "The table contained " & nrRecs + 1 & " rows"
'Process each row to break down the field information
'into another array.
For lRecs = LBound(aData1()) To nrRecs
aData2() = Split(aData1(lRecs), vbTab)
' We need to do this only once!
If lRecs = LBound(aData1()) Then
nrFields = UBound(aData2())
ReDim Preserve aDataAll(nrRecs, nrFields)
End If
' Now bring the row and field information together
' in a single, two-dimensional array.
For lFields = LBound(aData2()) To nrFields
aDataAll(lRecs, lFields) = aData2(j)
Next
Next
End Sub
I'm getting an error at this line: ReDim Preserve aDataAll(nrRecs, nrFields), which is due to "nrFields" being set to a negative value (-1)...
No idea how the upper bound of the array is a negative value... Any help on this would be much appreciated.
I figured it out - I was trying to extract a nested table. I had to cycle through all sub-tables and extract individually. Also, I had to search for and remove ^p before extraction to retain table structure.
After I had figured it out, I noticed that the MS code sample had an error: aData2(j) should actually be aData2(lFields).
Hope this helps some other newbie!
If UBound is -1 and LBound = 0, the array is empty. You can generate an empty array as follows:
Dim EmptyArray() As String
Dim s As String
EmptyArray = Split("")
Debug.Print (UBound(EmptyArray)) ' displays -1
Debug.Print (LBound(EmptyArray)) ' displays 0
In your case I suspect you need to skip the processing if the array is empty:
aData1 = Split(...)
If (UBound(aData1) < LBound(aData1) Then
' UBound is -1 and LBound is 0, array is empty, nothing to do
Else
' Array is non-empty, do your stuff
End If
Although quite bizarre, it is possible for VARIANT SAFEARRAY to have negative lower and upper bound values for any of the dimensions. The array extent is LBound(,dimension) to UBound(,dimension).
What must be true is UBound >= LBound.
To get the array size, use UBound - LBound + 1.
It used to be convention to set the lower bound using an Option Base statement at the top of VBA code although, of course, that didn't affect arrays being returned by 3rd party libraries. Most folk used to use 1 as the lower bound.