Transforming Word tables into Excel array - vba

I am trying to transfer Word tables to Excel - this has already been done here - and in addition, during the transfer I'd like to keep only rows that contain certain content, and would like to reshape the table before pasting it into Excel. I thought this could be done by converting each table first into an Excel array and then modifying the array as needed before pasting it to a specified range. Yet, I am not so familiar with Word VBA and I am finding this task pretty hard. I am starting from this code here, which I found at the post referenced above.
Option Explicit
Sub ImportWordTable()
Dim wdDoc As Object
Dim wdFileName As Variant
Dim tableNo As Integer 'table number in Word
Dim iRow As Long 'row index in Excel
Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
Dim tableStart As Integer
Dim tableTot As Integer
On Error Resume Next
ActiveSheet.Range("A:AZ").ClearContents
wdFileName = Application.GetOpenFilename("Word files (*.docx),*.doc", , _
"Browse for file containing table to be imported")
If wdFileName = False Then Exit Sub '(user cancelled import file browser)
Set wdDoc = GetObject(wdFileName) 'open Word file
resultRow = 1 'start at row 1; Cells(0, ...) would fail silently under On Error Resume Next
With wdDoc
tableTot = wdDoc.tables.Count
If tableTot = 0 Then
MsgBox "This document contains no tables", _
vbExclamation, "Import Word Table"
End If
For tableStart = 1 To tableTot
With .tables(tableStart)
'copy cell contents from Word table cells to Excel cells
For iRow = 1 To .Rows.Count
For iCol = 1 To .Columns.Count
Cells(resultRow, iCol) = WorksheetFunction.Clean(.cell(iRow, iCol).Range.Text)
Next iCol
resultRow = resultRow + 1
Next iRow
End With
resultRow = resultRow + 1
Next tableStart
End With
End Sub
I think I should change this chunk to obtain what I am looking for.
For tableStart = 1 To tableTot
With .tables(tableStart)
'copy cell contents from Word table cells to Excel cells
For iRow = 1 To .Rows.Count
For iCol = 1 To .Columns.Count
Cells(resultRow, iCol) = WorksheetFunction.Clean(.cell(iRow, iCol).Range.Text)
Next iCol
resultRow = resultRow + 1
Next iRow
End With
resultRow = resultRow + 1
Next tableStart
End With
Can someone help me with this? I can provide more details if needed. Many thanks!
Riccardo

If you want to copy only certain rows:
For tableStart = 1 To tableTot
With .tables(tableStart)
For iRow = 1 To .Rows.Count
v = WorksheetFunction.Clean(.cell(iRow, 1).Range.Text)
If v = "A" Or v = "B" Or v = "C" Then
For iCol = 1 To .Columns.Count
Cells(resultRow, iCol) = WorksheetFunction.Clean( _
.cell(iRow, iCol).Range.Text)
Next iCol
resultRow = resultRow + 1
End If
Next iRow
End With
resultRow = resultRow + 1
Next tableStart

With the help of Tim, this is the code that does what I was looking for.
Sub ImportWordTable()
Dim wdDoc As Object
Dim wdFileName, v, cont As Variant
Dim tableNo As Integer 'table number in Word
Dim iRow As Long 'row index in Excel
Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
Dim tableStart As Integer
Dim tableTot As Integer
Dim rtemp, i As Integer
Dim categ(4), content(4) As Variant
Dim found, temprange As Range
Worksheets.Add.Name = "tempsht"
Worksheets.Add.Name = "final"
With Sheets("final")
.Cells(1, 1) = "Author"
.Cells(1, 2) = "Title"
.Cells(1, 3) = "Date"
.Cells(1, 4) = "Publication name"
.Cells(1, 5) = "Word count"
End With
categ(0) = "BY"
categ(1) = "HD"
categ(2) = "PD"
categ(3) = "SN"
categ(4) = "WC"
resultRow = 2
wdFileName = Application.GetOpenFilename("Word files (*.rtf),*.rtf", , "Browse for file containing table to be imported")
If wdFileName = False Then Exit Sub '(user cancelled import file browser)
Set wdDoc = GetObject(wdFileName) 'open Word file
With wdDoc
tableTot = wdDoc.tables.Count
If tableTot = 0 Then
MsgBox "This document contains no tables", _
vbExclamation, "Import Word Table"
End If
For tableStart = 1 To tableTot - 1
With .tables(tableStart) 'subset the table and copy it to a tempsheet
rtemp = 1
For iRow = 1 To .Rows.Count
v = WorksheetFunction.Clean(.cell(iRow, 1).Range.Text)
If v = " HD" Or v = " BY" Or v = " WC" Or v = " PD" Or v = " SN" Or v = "HD" Or v = "BY" Or v = "WC" Or v = "PD" Or v = "SN" Then
For iCol = 1 To .Columns.Count
Sheets("tempsht").Cells(rtemp, iCol) = Trim(WorksheetFunction.Clean(.cell(iRow, iCol).Range.Text))
Next iCol
rtemp = rtemp + 1
End If
Next iRow
Set temprange = Sheets("tempsht").Range("A1:A5")
With temprange
For i = 0 To 4
Set found = .find(What:=categ(i))
If found Is Nothing Then
content(i) = ""
Else
content(i) = Sheets("tempsht").Cells(found.Row, 2).Value
End If
Next i
End With
Sheets("final").Cells(resultRow, 1).Resize(1, 5).Value = content 'qualified, so it does not depend on the active sheet
Sheets("tempsht").Range("A1:B5").ClearContents 'remove content from tempsheet
End With
resultRow = resultRow + 1
Next tableStart
Application.DisplayAlerts = False 'suppress the confirmation prompt while deleting the temporary sheet
Sheets("tempsht").Delete
Application.DisplayAlerts = True
End With
End Sub
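The question originally asked about reshaping the data in an array rather than on a worksheet. The following is only a sketch of that idea, not the code used above: `PickByLabel` is a hypothetical helper that assumes the filtered table rows have already been read into a 2-D Variant array with the label in the first column and the value in the second.

```vba
Option Explicit

' Hypothetical helper (not part of the original answer).
' tableRows: 2-D Variant array, first column = label, second column = value.
' labels:    1-D array of labels, e.g. Array("BY", "HD", "PD", "SN", "WC").
' Returns a 1-D array with one value per label, "" where a label is missing.
Public Function PickByLabel(ByVal tableRows As Variant, ByVal labels As Variant) As Variant
    Dim result() As Variant
    Dim i As Long, r As Long
    Dim labelCol As Long

    labelCol = LBound(tableRows, 2)
    ReDim result(LBound(labels) To UBound(labels))

    For i = LBound(labels) To UBound(labels)
        result(i) = ""
        For r = LBound(tableRows, 1) To UBound(tableRows, 1)
            ' Trim$ handles labels such as " HD" that carry a leading space
            If Trim$(CStr(tableRows(r, labelCol))) = labels(i) Then
                result(i) = tableRows(r, labelCol + 1)
                Exit For
            End If
        Next r
    Next i

    PickByLabel = result
End Function
```

The returned array could then be written in one assignment, for example `Sheets("final").Cells(resultRow, 1).Resize(1, 5).Value = PickByLabel(rowsArray, categ)`, where `rowsArray` is an assumed name for the array filled from the Word table.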

Related

Copy paste tables from word with multi-line cells

I have a Word document that contains a number of tables. I've written a script to retrieve the tables starting at a specified table number, i.e. table 1, 2, 3, 4 and so on (the user selects). The script then pulls the tables into an Excel workbook. The problem is that all of the tables have 4 columns, and the 3rd column contains multi-line content, so when it is pasted into Excel it looks bad. I understand that if you copy the 3rd column of any table, double-click in a cell in Excel and paste, the line breaks are pasted too, so it looks OK. I was wondering if there is a way to do this in VBA.
Here is my table that I want to copy into Excel:
Here is what it looks like it when the script pastes it in:
Here is what I needed it to look like:
Here's what I have so far:
Option Explicit
Sub Macro1()
Dim wdDoc As Object
Dim wdFileName As Variant
Dim tableNo As Integer 'table number in Word
Dim iRow As Long 'row index in Excel
Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
Dim tableStart As Integer
Dim tableTot As Integer
On Error Resume Next
ActiveSheet.Range("A:AZ").ClearContents
wdFileName = Application.GetOpenFilename("Word files (*.docx),*.docx", , _
"Browse for file containing table(s) to be imported")
If wdFileName = False Then Exit Sub '(user cancelled import file browser)
Set wdDoc = GetObject(wdFileName) 'open Word file
With wdDoc
tableNo = wdDoc.Tables.Count
tableTot = wdDoc.Tables.Count
If tableNo = 0 Then
MsgBox "This document contains no tables", _
vbExclamation, "Import Word Table"
ElseIf tableNo > 1 Then
tableNo = InputBox("This Word document contains " & tableNo & " tables." & vbCrLf & _
"Enter the table to start from", "Import Word Table", "1")
End If
resultRow = 1
For tableStart = 1 To tableTot
With .Tables(tableStart)
'copy cell contents from Word table cells to Excel cells
For iRow = 1 To .Rows.Count
For iCol = 1 To .Columns.Count
Cells(resultRow, iCol) = WorksheetFunction.Clean(.Cell(iRow, iCol).Range.Text)
Next iCol
resultRow = resultRow + 1
Next iRow
End With
resultRow = resultRow + 1
Next tableStart
End With
End Sub
I've found this solution, which still requires cell-wise iteration (unfortunately, the line breaks are treated as cell delimiters when pasting directly to Excel using Paste, PasteSpecial or several of the CommandBars.ExecuteMso options).
Try replacing the ASCII 13 character with a vbCrLf (carriage return + line feed) and also replacing ASCII 7 with an empty string:
Dim thisText As String, newText As String
For tableStart = 1 To tableTot
With .Tables(tableStart)
'copy cell contents from Word table cells to Excel cells
For iRow = 1 To .Rows.Count
For iCol = 1 To .Columns.Count
thisText = .Cell(iRow, iCol).Range.Text
newText = Replace(thisText, Chr(13), vbCrLf)
newText = Replace(newText, Chr(7), vbNullString)
Cells(resultRow, iCol) = newText 'no Clean here: it strips characters 0-31, including the line breaks just inserted
Next iCol
resultRow = resultRow + 1
Next iRow
End With
resultRow = resultRow + 1
Next tableStart
There may be a more elegant way to do this without looping row/column, but for now this should work.
Actual code that I tested
Sub foo2()
Dim wdApp As Object, wdDoc As Object, wdTable As Object
Set wdApp = GetObject(, "Word.Application")
Set wdDoc = wdApp.Documents(1)
Set wdTable = wdDoc.Tables(1)
Dim iRow As Long, iCol As Long, resultRow As Long
Dim thisText As String, newText As String
resultRow = 1
With wdTable
For iRow = 1 To .Rows.Count
For iCol = 1 To .Columns.Count
thisText = .Cell(iRow, iCol).Range.Text
newText = Replace(thisText, Chr(13), vbCrLf)
newText = Replace(newText, Chr(7), vbNullString)
Cells(resultRow, iCol) = newText
Next iCol
resultRow = resultRow + 1
Next iRow
End With
End Sub
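One detail the loop above leaves open: the text of a Word table cell ends with an end-of-cell marker (Chr(13) followed by Chr(7)), so replacing every Chr(13) also turns that final one into a trailing line break. A possible refinement, offered as a sketch only, is to strip the marker first. `CleanWordCellText` is a hypothetical helper name, and using vbLf (the character Excel itself uses for in-cell line breaks) instead of vbCrLf is my assumption, not part of the tested code.

```vba
Option Explicit

' Hypothetical helper (not part of the original answer).
' Converts the raw text of a Word table cell into text suitable for an Excel cell.
Public Function CleanWordCellText(ByVal raw As String) As String
    Dim s As String
    s = raw

    ' Drop the end-of-cell marker: Chr(13) followed by Chr(7)
    If Right$(s, 2) = Chr$(13) & Chr$(7) Then
        s = Left$(s, Len(s) - 2)
    End If

    ' Remove any stray Chr(7) that is left
    s = Replace(s, Chr$(7), vbNullString)

    ' Manual line breaks (Chr(11)) and paragraph marks (Chr(13))
    ' both become in-cell line breaks
    s = Replace(s, Chr$(11), vbLf)
    s = Replace(s, Chr$(13), vbLf)

    CleanWordCellText = s
End Function
```

Inside the loop it would be used as `Cells(resultRow, iCol) = CleanWordCellText(.Cell(iRow, iCol).Range.Text)`.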

How to loop through sub-folders?

I have VBA code that returns external file details like path, type, last modified date, last created date etc. for files in a folder.
I want to return the details for files in the sub-folders of that folder.
Dim IRow
Sub ListFiles()
IRow = 11 'where you want your first row of data
Call ListMyFiles(Range("B5"), False) 'Where B5 is your filepath (eg, C:\)
End Sub
Sub ListMyFiles(MySourcePath, includesubfolders)
Dim xSubFolder As Object
Set MyObject = New FileSystemObject
Set mysource = MyObject.GetFolder(MySourcePath)
On Error Resume Next
For Each myfile In mysource.Files
icol = 1
Cells(IRow, icol).Value = myfile.Path
icol = icol + 1
Cells(IRow, icol).Value = myfile.Name
icol = icol + 1
Cells(IRow, icol).Value = myfile.Type
icol = icol + 1
Cells(IRow, icol).Value = myfile.DateLastModified
icol = icol + 1
Cells(IRow, icol).Value = myfile.DateCreated
icol = icol + 1
IRow = IRow + 1
Next
If xIsSubfolders Then
For Each xSubFolder In xFolder.subfolders
ListFilesInFolder xSubFolder.Path, True
Next xSubFolder
End If
Set xFile = Nothing
Set xFolder = Nothing
Set xFileSystemObject = Nothing
End Sub
I haven't tested this, but basically what you want is recursion: a subroutine or function that calls itself. Here, the subroutine calls itself every time it finds a subfolder.
Something like:
Dim IRow
Sub ListFiles()
IRow = 11 'where you want your first row of data
Call ListMyFiles(Range("B5").Value, True) 'Where B5 is your filepath (eg, C:\); True includes sub-folders
End Sub
Sub ListMyFiles(MySourcePath As String, includesubfolders As Boolean)
Set MyObject = New FileSystemObject
Set mysource = MyObject.GetFolder(MySourcePath)
On Error Resume Next
For Each myfile In mysource.Files
icol = 1
Cells(IRow, icol).Value = myfile.Path
icol = icol + 1
Cells(IRow, icol).Value = myfile.Name
icol = icol + 1
Cells(IRow, icol).Value = myfile.Type
icol = icol + 1
Cells(IRow, icol).Value = myfile.DateLastModified
icol = icol + 1
Cells(IRow, icol).Value = myfile.DateCreated
icol = icol + 1
IRow = IRow + 1
Next
'Check if the subroutine was called to include subfolders
If includesubfolders Then
'Loop through all of the subfolders in the FSO folder
For Each SubFolder In mysource.SubFolders
'Call this same subroutine (no parentheses around the arguments of a Sub call)
ListMyFiles SubFolder.Path, True
Next SubFolder
End If
End Sub
A lot of things can be simplified here. More importantly, the folder traversal needs to be a recursive call in order to go through all subfolder levels.
Here is sample code that does this. The first argument to ListMyFiles is the cell where the path is stored, and the second argument is the cell where you want the file list to start.
Sub ListFiles()
Call ListMyFiles(Sheet1.Range("B5"), Sheet1.Range("B11"), True)
End Sub
Sub ListMyFiles(ByVal r_SourcePath As Range, ByRef r_List As Range, Optional includesubfolders As Boolean = False)
Dim path As String, ff As Folder
path = r_SourcePath.Text
Dim fso As New FileSystemObject
Set ff = fso.GetFolder(path)
Call ListFileInFolder(r_List, ff, includesubfolders)
End Sub
Public Sub ListFileInFolder(ByRef r_List As Range, ByRef ff As Folder, Optional inclSubFolders As Boolean = False, Optional ByRef rowsWritten As Long)
On Error Resume Next
Dim index As Long, n As Long
index = 0
Dim f As File, sf As Folder
For Each f In ff.Files
r_List.Offset(index, 0).Resize(1, 5).Value2 = _
Array(f.path, f.Name, f.Type, f.DateLastModified, f.DateCreated)
index = index + 1
Next
If inclSubFolders Then
For Each sf In ff.SubFolders
'n receives the rows written for sf and everything below it,
'so files in deeper levels are not overwritten by the next sibling folder
n = 0
Call ListFileInFolder(r_List.Offset(index, 0), sf, True, n)
index = index + n
Next
End If
rowsWritten = index 'report the total number of rows back to the caller
On Error GoTo 0
End Sub
Of note here is the writing of a single row of data of 5 columns using a single line and the Array() function.
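To isolate the recursion from the worksheet output, here is a minimal sketch that only collects file paths. `CollectFiles` is a hypothetical name, and late binding through CreateObject is my choice so that no reference to Microsoft Scripting Runtime is needed; the answers above use early binding instead.

```vba
Option Explicit

' Hypothetical helper (not part of the original answers).
' Adds the full path of every file in folderPath to the found collection,
' descending into sub-folders when includeSubfolders is True.
Public Sub CollectFiles(ByVal folderPath As String, _
                        ByVal includeSubfolders As Boolean, _
                        ByVal found As Collection)
    Dim fso As Object, fld As Object, f As Object, sf As Object

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fld = fso.GetFolder(folderPath)

    ' Files directly inside this folder
    For Each f In fld.Files
        found.Add f.Path
    Next f

    ' The recursive step: the Sub calls itself once per sub-folder
    If includeSubfolders Then
        For Each sf In fld.SubFolders
            CollectFiles sf.Path, True, found
        Next sf
    End If
End Sub
```

A caller would create the collection first (`Dim c As New Collection`), call `CollectFiles Range("B5").Value, True, c`, and then loop over `c` to write the rows.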

Run-time Error '1004' while pasting

I've been looking around to find out why this gives me run-time error '1004': "You can't paste this here because the Copy area and paste area aren't the same size. Select just one cell in the paste area or an area that's the same size, and try pasting again." I've had no luck with the solutions I've found.
The problem here happens only on this line: ws.Columns(1).Copy Destination:=ws2.Columns(1).Rows(5)
The paste content should be just a few cells vertically i.e. "B1:B5". I can't do Range("B1:B5") because I need to have it constantly update depending on a couple of things.
Any ideas as to why I'm getting the error?
Option Explicit
Sub chkPercent()
Dim wb As Workbook
Dim ws As Worksheet
Dim ws2 As Worksheet
Dim rng As Range
Set wb = ActiveWorkbook
Set ws = wb.Worksheets(1)
Set rng = ws.Range("A1")
Dim iq_Array As Variant
Dim colNumb As Long
Dim rowNumb As Long
Application.ScreenUpdating = False
colNumb = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
rowNumb = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
Dim iQRef() As String
Dim iCol As Long
Dim pptText As String
ReDim iQRef(colNumb)
' capture IQ refs locally
For iCol = 2 To colNumb
iQRef(iCol) = ws.Cells(1, iCol).Value
Next iCol
Worksheets.Add After:=ws
Set ws2 = wb.Worksheets(2)
pptText = "iq_1,2,3,4"
'Identify if within text there is "iq_"
'If InStr(1, pptText, "iq_") <= 0 Then GoTo nextShpe
'set iq_Array as an array of the split iq's
iq_Array = Split(pptText, ",")
Dim hasIQs As Boolean
Dim checkStr As String
Dim pCol As Long
Dim checkOne
Dim arrayLoop As Long
checkOne = iq_Array(0)
hasIQs = Left(checkOne, 3) = "iq_"
If hasIQs Then
' paste inital column into temporary worksheet
ws.Columns(1).Copy Destination:=ws2.Columns(1)
End If
' loop for each iq_ in the array
For arrayLoop = LBound(iq_Array) To UBound(iq_Array)
' Take copy of potential ref and adjust to standard if required
checkStr = iq_Array(arrayLoop)
If hasIQs And Left(checkStr, 3) <> "iq_" Then checkStr = "iq_" & checkStr
Dim iQRefArray As Variant
Dim iQRefString As String
Dim checkRefStr As String
Dim nCol As Long
Dim doUntilCheck As String
Dim rowCount As Long
Dim copy1
Dim paste1
doUntilCheck = 99
' Look for existence of corresponding column in local copy array
pCol = 0
For iCol = 2 To colNumb
iQRefString = Left(iQRef(iCol), Len(iQRef(iCol)) - 1)
iQRefArray = Replace(iQRefString, "__", "_")
iQRefArray = Split(iQRefArray, "_")
checkRefStr = "iq_" & iQRefArray(1)
If checkStr = checkRefStr Then
pCol = iCol
Exit For
End If
Next iCol
If pCol > 0 Then
' Paste the corresponding column into the forming table
ws.Columns(pCol).Copy Destination:=ws2.Columns(2)
If iQRefArray(2) = "00" Then GoTo nxtArrayLoop
nCol = 0
rowCount = 1
Do Until doUntilCheck = "00"
Do Until doUntilCheck = "01"
nCol = nCol + 1
rowCount = rowCount + rowNumb
iQRefString = Left(iQRef(iCol + nCol), Len(iQRef(iCol + nCol)) - 1)
iQRefArray = Replace(iQRefString, "__", "_")
iQRefArray = Split(iQRefArray, "_")
doUntilCheck = iQRefArray(2)
If doUntilCheck = "00" Then GoTo nxtArrayLoop
If doUntilCheck = "01" Then GoTo nxtArrayLoop
ws.Columns(1).Copy Destination:=ws2.Columns(1).Rows(rowCount)
ws.Columns(pCol + nCol).Copy Destination:=ws2.Columns(2).Rows(rowCount)
Loop
Loop
End If
nxtArrayLoop:
Next arrayLoop
Application.ScreenUpdating = True
End Sub
The error is just as it says: the copy and paste ranges are two different sizes.
The code tries to copy an entire column (ws.Columns(1).Copy) into a defined range (Destination:=ws2.Columns(1).Rows(5), which I think is one cell, but I have never used that type of syntax before). An entire column cannot fit when the paste starts below row 1.
If you need the copy range to be dynamic, then do this:
ws.Range(ws.Cells(ws.Rows.Count,1).End(Xlup),ws.Cells(1,1)).Copy Destination:=ws2.Cells(1,1)
This assumes contiguous cells starting in row 1 of column A in ws.
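The same idea can be wrapped in a small routine so that it works for any column and any destination row. This is a sketch under the same assumption (data starts in row 1 of the source column); `CopyUsedColumn` and its parameter names are hypothetical.

```vba
Option Explicit

' Hypothetical helper (not part of the original answer).
' Copies only the used part of one column, so the paste never needs
' more rows than the destination sheet has left.
Public Sub CopyUsedColumn(ByVal src As Worksheet, _
                          ByVal srcCol As Long, _
                          ByVal dstTopLeft As Range)
    Dim lastRow As Long

    ' Last non-empty row in the source column
    lastRow = src.Cells(src.Rows.Count, srcCol).End(xlUp).Row

    ' Destination is a single cell; Excel sizes the paste area from the source
    src.Range(src.Cells(1, srcCol), src.Cells(lastRow, srcCol)).Copy _
        Destination:=dstTopLeft
End Sub
```

In the question's loop, the failing line could then become something like `CopyUsedColumn ws, 1, ws2.Cells(rowCount, 1)`.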

Excel, comparing two specific rows in two sheets and highlighting the differences

Before anyone comments that there are similar questions: I've tried them, but unfortunately they do not work.
Hi, this is the first time I am on S.O, rest assured, I have spent hours looking for a solution for this. I have a status column which shows statuses such as, deleted, new, changed.
When the status is "changed", I would like to compare that particular row from column E to the last possible column in Excel (XFD) in Sheet3 to columns A to the last possible column in Excel (XFD) in Sheet1 and highlight the cells which are different.
I have found this solution:-
Dim diffB As Boolean
Dim r As Long, c As Integer, m As Integer
Dim lr1 As Long, lr2 As Long, lc1 As Integer, lc2 As Integer
Dim maxR As Long, maxC As Integer, cf1 As String, cf2 As String
Dim rptWB As Workbook, DiffCount As Long
Application.ScreenUpdating = False
Application.StatusBar = "Creating the report..."
Application.DisplayAlerts = True
With Sheet1.UsedRange
lr1 = .Rows.Count
lc1 = .Columns.Count
End With
With Sheet3.UsedRange
lr2 = .Rows.Count
lc2 = .Columns.Count
End With
maxR = lr1
maxC = lc1
If maxR < lr2 Then maxR = lr2
If maxC < lc2 Then maxC = lc2
DiffCount = 0
For c = 1 To maxC
For i = 2 To lr1
diffB = True
Application.StatusBar = "Comparing cells " & Format(i / maxR, "0 %") & "..."
For r = 2 To lr2
cf1 = ""
cf2 = ""
On Error Resume Next
cf1 = Sheet1.Cells(i, c).FormulaLocal
cf2 = Sheet3.Cells(r, c).FormulaLocal
On Error GoTo 0
If cf1 = cf2 Then
diffB = False
Sheet1.Cells(i, c).Interior.ColorIndex = 19
Sheet1.Cells(i, c).Select
Selection.Font.Bold = True
Exit For
End If
Next r
If diffB Then
DiffCount = DiffCount + 1
Sheet1.Cells(i, c).Interior.ColorIndex = 0
Sheet1.Cells(i, c).Select
Selection.Font.Bold = False
End If
Next i
Next c
Application.StatusBar = "Formatting the report..."
'Columns("A:IV").ColumnWidth = 10
m = maxR - DiffCount - 1
Application.StatusBar = False
Application.ScreenUpdating = True
MsgBox m & " cells contain same values!", vbInformation, _
"Compare " & Sheet1.Name & " with " & Sheet3.Name
However, this compares columns and I do not know how to limit the comparison to column E-XFD in sheet1 to column A-XFD in sheet2.
There are also several sheets in this workbook but I only want to compare sheet1 and sheet2.
It will be much appreciated if you guys can help me out :)
Thanks!
Dim lrOne As Long 'Long, because row numbers can exceed the Integer limit of 32,767
Dim lcOne As Long
Dim lrTwo As Long
Dim lcTwo As Long
Dim cellA As Variant
Dim cellB As Variant
Dim cellCnt As Long
Dim lookupRange As Range
Dim lookinRange As Range
lrOne = Sheet1.Cells(Rows.Count, 5).End(xlUp).Row
lrTwo = Sheet3.Cells(Rows.Count, 1).End(xlUp).Row
lcOne = Sheet1.Cells(1, Columns.Count).End(xlToLeft).Column
lcTwo = Sheet3.Cells(1, Columns.Count).End(xlToLeft).Column
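'Qualify Cells with its sheet; unqualified, it refers to the active sheet
Set lookupRange = Sheet1.Range(Sheet1.Cells(1, 5), Sheet1.Cells(lrOne, lcOne))
Set lookinRange = Sheet3.Range(Sheet3.Cells(1, 1), Sheet3.Cells(lrTwo, lcTwo))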
For Each cellA In lookupRange
For Each cellB in lookinRange
If cellA.Value = cellB.Value And cellA.Value <> "" Then
cellB.Interior.ColorIndex = 3
cellCnt = cellCnt + 1
End If
Next cellB
Next cellA
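Note that the loop above marks cells whose values match somewhere in the other range. If the goal is instead to highlight the differences within one pair of rows, as the question describes, a row-wise comparison might look like the sketch below. `HighlightRowDifferences` is a hypothetical name, and it assumes you already know which row in each sheet to compare and that the cells contain no error values (CStr would fail on those).

```vba
Option Explicit

' Hypothetical helper (not part of the original answer).
' Compares nCols cells to the right of firstCellA with the cells at the
' same offsets from firstCellB, colours the differing cells in row A red,
' and returns how many cells differ.
Public Function HighlightRowDifferences(ByVal firstCellA As Range, _
                                        ByVal firstCellB As Range, _
                                        ByVal nCols As Long) As Long
    Dim k As Long, diffs As Long

    For k = 0 To nCols - 1
        ' Compare as text so that empty cells and "" are treated alike
        If CStr(firstCellA.Offset(0, k).Value) <> CStr(firstCellB.Offset(0, k).Value) Then
            firstCellA.Offset(0, k).Interior.ColorIndex = 3
            diffs = diffs + 1
        End If
    Next k

    HighlightRowDifferences = diffs
End Function
```

For a "changed" row r, a call such as `HighlightRowDifferences Sheet3.Cells(r, 5), Sheet1.Cells(r, 1), lcTwo` would compare Sheet3 from column E onward with Sheet1 from column A onward; that both sheets use the same row number r is an assumption.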

add the word "somme" in the end of the table

this is my code:
xlWorkSheet = CType(xlWorkBook.Sheets(ComboBox1.Text), Excel.Worksheet)
xlWorkSheet.Activate()
xlApp.Visible = True
Dim j As Integer
Dim lastrow As Integer
Dim lastcol As Integer
With xlWorkSheet
.Select()
j = xlApp.Cells.SpecialCells(2).Column
lastrow=xlApp.Cells.SpecialCells(2).Rows.End(XlDirection.xlDown).Row
lastcol = xlApp.Cells.SpecialCells(2).Columns.End(XlDirection.xlToRight).Column
For thiscol = j To lastcol
.Cells(lastrow + 1, thiscol).Value = _
xlApp.Sum(.Range(.Cells(1, thiscol), .Cells(lastrow, thiscol)))
Next
End With
I want to add the word "sum" at the end of the table:
when I find the end of the table, I insert the word "sum".
After Next, and before End With, add a line like:
.Cells(lastrow + 2, 1).Value = "Somme"
The sums occupy row lastrow + 1, so this puts the word "Somme" in the first column of the row below them.