Convert multiple Word documents to HTML files using VBA - vba

How can I convert multiple MS Word documents in a certain folder to HTML using VBA?
I have used Powershell to do this but unfortunately the access is blocked to run the scripts.

I wrote a Word VBA program to do this. The program is open-source, under the MIT License.
The program is in a Word-doc on Github at:
https://github.com/jimyuill/word-web-nav/blob/main/tools/generate_word_html.docm
The Word-doc's text describes how to use the program.
The code is commented to explain how it works (alt+F11 opens the IDE).
The program optionally adds a table-of-contents to the beginning of the document. That part of the code can be ignored, for the OP.
As mentioned by #Cindy Meister, the OP is "much too broad". The program is too big to post all of the code here. In summary:
Prompt for input Word-docs:
There is code to prompt the user for the Word-docs to be converted, from a particular directory. Most of that code is adapted from the example program here:
https://learn.microsoft.com/en-us/office/vba/api/office.filedialog.initialview
One of two APIs is used to get the list of Word-docs. Each API presents a GUI for file-system browsing.
One API allows the user to select particular files in a directory. The API is: Application.FileDialog(msoFileDialogFilePicker)
https://learn.microsoft.com/en-us/office/vba/api/office.filedialog
The other API allows the user to just select a directory. All Word-docs in it are converted. The API is: Application.FileDialog(msoFileDialogFolderPicker)
https://learn.microsoft.com/en-us/office/vba/api/office.msofiledialogtype
The program checks if there are two input Word-docs with the same root-name, and the extensions .doc, .docx, or .docm, e.g., "foo.doc" and "foo.docx". This isn't allowed for input Word-docs because, for each Word-doc, a Word HTML-file is created using the Word-doc's root-name:
<root-name>.html
Save each Word-doc in HTML format:
Within the chosen directory, the program creates a subdirectory that will be used to hold the Word-docs' HTML files and directories.
The program loops to process each input Word-doc. Below are highlights of the code for each doc (some code is omitted). The code is based on the answer by #ASH
' Open the Word doc
Set wordDocObj = Application.Documents.Open(fileObj.Path)
' Change the current directory to the output-directory
ChangeFileOpenDirectory outputFolderObj
' Save the Word-doc in the format "filtered" HTML
fileBaseName = fileSystemObj.GetBaseName(fileName)
outputFileName = fileBaseName & ".html"
ActiveDocument.SaveAs fileName:=outputFileName, FileFormat:=wdFormatFilteredHTML
' Close the Word doc
wordDocObj.Close
' Change the current directory to the input-directory
ChangeFileOpenDirectory sourceFolderObj

Code:
Option Explicit
Sub ChangeDocsToTxtOrRTFOrHTML()
'with export to PDF in Word 2007
Dim fs As Object
Dim oFolder As Object
Dim tFolder As Object
Dim oFile As Object
Dim strDocName As String
Dim intPos As Integer
Dim locFolder As String
Dim fileType As String
On Error Resume Next
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\myDocs")
Select Case Application.Version
Case Is < 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
Case Is >= 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
End Select
Application.ScreenUpdating = False
Set fs = CreateObject("Scripting.FileSystemObject")
Set oFolder = fs.GetFolder(locFolder)
Set tFolder = fs.CreateFolder(locFolder & "Converted")
Set tFolder = fs.GetFolder(locFolder & "Converted")
For Each oFile In oFolder.Files
Dim d As Document
Set d = Application.Documents.Open(oFile.Path)
strDocName = ActiveDocument.Name
intPos = InStrRev(strDocName, ".")
strDocName = Left(strDocName, intPos - 1)
ChangeFileOpenDirectory tFolder
Select Case fileType
Case Is = "TXT"
strDocName = strDocName & ".txt"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
Case Is = "RTF"
strDocName = strDocName & ".rtf"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
Case Is = "HTML"
strDocName = strDocName & ".html"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
Case Is = "PDF"
strDocName = strDocName & ".pdf"
' *** Word 2007 users - remove the apostrophe at the start of the next line ***
'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
End Select
d.Close
ChangeFileOpenDirectory oFolder
Next oFile
Application.ScreenUpdating = True
End Sub
Option Explicit
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub ConvertDocs()
Dim fs As Object
Dim oFolder As Object
Dim tFolder As Object
Dim oFile As Object
Dim strDocName As String
Dim intPos As Integer
Dim locFolder As String
Dim fileType As String
Dim office2007 As Boolean
Dim lf As LinkFormat
Dim oField As Field
Dim oIShape As InlineShape
Dim oShape As Shape
On Error Resume Next
locFolder = InputBox("Enter the path to the folder with the documents to be converted", "File Conversion", "C:\myDocs")
If Application.Version >= 12 Then
office2007 = True
Do
fileType = UCase(InputBox("Enter one of the following formats (to convert to): TXT, RTF, HTML, DOC, DOCX or PDF", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF" Or fileType = "DOC" Or fileType = "DOCX")
Else
office2007 = False
Do
fileType = UCase(InputBox("Enter one of the following formats (to convert to): TXT, RTF, HTML or DOC", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "DOC")
End Select
Application.ScreenUpdating = False
Set fs = CreateObject("Scripting.FileSystemObject")
Set oFolder = fs.GetFolder(locFolder)
Set tFolder = fs.CreateFolder(locFolder & "Converted")
Set tFolder = fs.GetFolder(locFolder & "Converted")
For Each oFile In oFolder.Files
Dim d As Document
Set d = Application.Documents.Open(oFile.Path)
' put the document into print view
If fileType = "RTF" Or fileType = "DOC" Or fileType = "DOCX" Then
With ActiveWindow.View
.ReadingLayout = False
.Type = wdPrintView
End With
End If
' try to embed linked images from fields, shapes and inline shapes into the document
' (for some reason this does not work for all images in all HTML files I've tested)
If Not fileType = "HTML" Then
For Each oField In d.Fields
Set lf = oField.LinkFormat
If oField.Type = wdFieldIncludePicture And Not lf Is Nothing And Not lf.SavePictureWithDocument Then
lf.SavePictureWithDocument = True
Sleep (2000)
lf.BreakLink()
d.UndoClear()
End If
Next
For Each oShape In d.Shapes
Set lf = oShape.LinkFormat
If Not lf Is Nothing And Not lf.SavePictureWithDocument Then
lf.SavePictureWithDocument = True
Sleep (2000)
lf.BreakLink()
d.UndoClear()
End If
Next
For Each oIShape In d.InlineShapes
Set lf = oIShape.LinkFormat
If Not lf Is Nothing And Not lf.SavePictureWithDocument Then
lf.SavePictureWithDocument = True
Sleep (2000)
lf.BreakLink()
d.UndoClear()
End If
Next
End If
strDocName = d.Name
intPos = InStrRev(strDocName, ".")
strDocName = Left(strDocName, intPos - 1)
ChangeFileOpenDirectory(tFolder)
' Check out these links for a comprehensive list of supported file formats and format constants:
' http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.wdsaveformat.aspx
' http://msdn.microsoft.com/en-us/library/office/bb238158.aspx
' (In the latter list you can see the values that the constants are associated with.
' Office 2003 only supported values up to wdFormatXML(=11). Values from wdFormatXMLDocument(=12)
' til wdFormatDocumentDefault(=16) were added in Office 2007, and wdFormatPDF(=17) and wdFormatXPS(=18)
' were added in Office 2007 SP2. Office 2010 added the various wdFormatFlatXML* formats and wdFormatOpenDocumentText.)
If Not office2007 And fileType = "DOCX" Then
fileType = "DOC"
End If
Select Case fileType
Case Is = "TXT"
strDocName = strDocName & ".txt"
d.SaveAs(FileName := strDocName, FileFormat := wdFormatText)
Case Is = "RTF"
strDocName = strDocName & ".rtf"
d.SaveAs(FileName := strDocName, FileFormat := wdFormatRTF)
Case Is = "HTML"
strDocName = strDocName & ".html"
d.SaveAs(FileName := strDocName, FileFormat := wdFormatFilteredHTML)
Case Is = "DOC"
strDocName = strDocName & ".doc"
d.SaveAs(FileName := strDocName, FileFormat := wdFormatDocument)
Case Is = "DOCX"
strDocName = strDocName & ".docx"
' *** Word 2007+ users - remove the apostrophe at the start of the next line ***
'd.SaveAs(FileName := strDocName, FileFormat := wdFormatDocumentDefault)
Case Is = "PDF"
strDocName = strDocName & ".pdf"
' *** Word 2007 SP2+ users - remove the apostrophe at the start of the next line ***
'd.ExportAsFixedFormat(OutputFileName := strDocName, ExportFormat := wdExportFormatPDF)
End Select
d.Close
ChangeFileOpenDirectory(oFolder)
Next oFile
Application.ScreenUpdating = True
End Sub
Also, see this...
https://www.youtube.com/watch?v=4vFQV6RtYMM

Related

Rename text file based on first line of text

I have a list of text files (all different names.txt, but small files).
I want to rename them based on the first line of text in the file.
Some files have a few enters before the text. So the code returns with a (blank).txt.
Sub RenameTextFile()
Const SpecialCharacters As String = "\,/,:,*,?,<,>,|,""," ' Modify this as neccesary
Const ForReading = 1, ForWriting = 2, ForAppending = 8
Dim fso, MyFile, FileName, TextLine
Dim char As Variant
Set fso = CreateObject("Scripting.FileSystemObject")
Set fol = fso.GetFolder("C:\Research syntheses - Meta analysis\Txt files ECS\out\")
For Each fil In fol.Files
FileName = fil
Set MyFile = fso.OpenTextFile(FileName, ForReading)
Do While MyFile.AtEndOfStream <> True
Do
Dim tmpLine As String
TextLine = MyFile.ReadLine
tmpLine = RemoveWhiteSpace(TextLine)
If Len(tmpLine) = 0 Then
TextLine = tmpLine
End If
Loop Until Len(TextLine) > 0
MyFile.Close
For Each char In Split(SpecialCharacters, ",")
TextLine = Replace(TextLine, char, "")
Next
fil.Name = TextLine & ".txt"
Exit Do
Loop
MyFile.Close
Next fil
End Sub
You could add another loop to your code like that
Sub RenameTextFile()
Const ForReading = 1, ForWriting = 2, ForAppending = 8
Dim fso, MyFile, FileName, TextLine
Set fso = CreateObject("Scripting.FileSystemObject")
Set fol = fso.GetFolder("d:\tmp\")
For Each fil In fol.Files
FileName = fil
Set MyFile = fso.OpenTextFile(FileName, ForReading)
Do While MyFile.AtEndOfStream <> True
Do
Dim tmpLine As String
TextLine = MyFile.ReadLine
tmpLine = removeWhiteSpace(TextLine)
If Len(tmpLine) = 0 Then
TextLine = tmpLine
End If
Loop Until Len(TextLine) > 0
MyFile.Close
If isValidFilename(Trim(TextLine)) Then
On Error Resume Next
fil.Name = Trim(TextLine) & ".txt"
On Error Goto 0
Else
MsgBox "Renaming: " & fil.Name & " -to- " & Trim(TextLine) & " failed", vbCritical + vbOKOnly, "Invalid Filename"
End If
Exit Do
Loop
MyFile.Close
Next fil
End Sub
On the long run you need to think about kind of error handling as a file with the same name might already exist. And my extra loop will also fail in case one of the files only contains empty lines.
Update By the OP's comment I strongly guess that some of the files contain white spaces at the beginning and the OP is not aware of that. One can use the following function (taken from here) to remove these
'Add a reference to Microsoft VBScript Regular Expressions 5.5
Public Function RemoveWhiteSpace(ByVal target As String) As String
With New RegExp
.Pattern = "\s"
.MultiLine = True
.Global = True
RemoveWhiteSpace = .Replace(target, vbNullString)
End With
End Function
This will remove also white spaces from textline even if you want to keep them as the empty spaces in the line of your picture.
Update 2: In order to avoid invalid filename add the following function
Function isValidFilename(ByVal FileName As String) As Boolean
'PURPOSE: Determine If A Given Excel File Name Is Valid
'SOURCE: www.TheSpreadsheetGuru.com/the-code-vault
With New RegExp
.Pattern = "[\\/:\*\?""<>\|\[\]]"
ValidFileName = Not .Test(FileName)
End With
End Function
I modified the main procedure.

Outlook - VBA set signature to new Email ... so the signature can be changed via menu

I wrote a script where I add a signature from an htm file in the appData ... signature folder to a newly opened email.
My question is - how do i modify this VBA script to add that signature in a way so Outlook knows its a signature and the signature might be changed by a user via gui.
I assume it may have something to do with setting a "_MailAutoSig" bookmark, is that right?
Script looks like this and works so far:
Dim WithEvents m_objMail As Outlook.MailItem
Dim LODGIT_SUBJECT_IDENTIFIERS() As String
Private Sub Application_ItemLoad(ByVal Item As Object)
'MsgBox "Application_ItemLoad"
Select Case Item.Class
Case olMail
Set m_objMail = Item
End Select
End Sub
Private Sub m_objMail_Open(Cancel As Boolean)
'string array containing lodgit email subject identifiers (beginning string!!! of email subject)
LODGIT_SUBJECT_IDENTIFIERS = Split("Angebot von Bödele Alpenhotel,Angebot von,bestätigt Ihre Reservierung,Rechnung Nr.,Stornogutschrift für die Rechnung,Ausstehende Zahlung", ",")
Dim Application As Object
Dim oOutApp As Object, oOutMail As Object
Dim strbody As String, FixedHtmlBody As String
Dim Ret
Set Application = CreateObject("Outlook.Application")
'Change only Mysig.htm to the name of your signature
' C:\Users\nicole\AppData\Roaming\Microsoft\Signatures
Ret = Environ("appdata") & _
"\Microsoft\Signatures\AH Andrea kurz.htm"
If Ret = False Then Exit Sub
'~~> Use the function to fix image paths in the htm file
FixedHtmlBody = FixHtmlBody(Ret)
'CHECK FOR LODGIT IDENTIFIER
If myInStr(m_objMail.Subject, LODGIT_SUBJECT_IDENTIFIERS()) Then
Debug.Print "E-Mail as from Lodgit identified"
Dim str As String
Dim a As Object
str = Replace(m_objMail.Body, vbCrLf, "<br>")
str = Replace(str, vbNewLine, "<br>")
m_objMail.HTMLBody = "<html><body><span style='font-size:11.0pt;font-family:""Times New Roman"" '>" & str & "</span>" & FixedHtmlBody & "</body></html>"
End If
End Sub
'~~> Function to fix image paths in Signature .htm Files
Function FixHtmlBody(r As Variant) As String
Dim FullPath As String, filename As String
Dim FilenameWithoutExtn As String
Dim foldername As String
Dim MyData As String
'~~> Read the html file as text file in a string variable
Open r For Binary As #1
MyData = Space$(LOF(1))
Get #1, , MyData
Close #1
'~~> Get File Name from path
filename = GetFilenameFromPath(r)
'~~> Get File Name without extension
FilenameWithoutExtn = Left(filename, (InStrRev(filename, ".", -1, vbTextCompare) - 1))
'~~> Get the foldername where the images are stored
foldername = FilenameWithoutExtn & "-Dateien"
'~~> Full Path of Folder
FullPath = Left(r, InStrRev(r, "\")) & foldername
'~~> To cater for spaces in signature file name
'FullPath = Replace(FullPath, " ", "%20")
'~~> Replace incomplete path with full Path
FixHtmlBody = Replace(MyData, "AH%20Andrea%20kurz-Dateien", FullPath)
'FixHtmlBody = Replace(MyData, foldername, FullPath)
End Function
'~~> Gets File Name from path
Public Function GetFilenameFromPath(ByVal strPath As String) As String
If Right$(strPath, 1) <> "\" And Len(strPath) > 0 Then _
GetFilenameFromPath = GetFilenameFromPath(Left$(strPath, Len(strPath) - 1)) + Right$(strPath, 1)
End Function
'check if str contains on of the elements of a str array
Public Function myInStr(myString As String, a() As String) As Boolean
For Each elem In a
If InStr(1, myString, elem, vbTextCompare) <> 0 Then
myInStr = True
Exit Function
End If
Next
myInStr = False
End Function
Outlook looks for the "_MailAutoSig" bookmark. This needs to be done with Word Object Model, not by setting the HTMLBody property. Something along the lines:
wdStory = 6
wdMove = 0
Set objBkm = Nothing
Set objDoc = Inspector.WordEditor
Set objSel = objDoc.Application.Selection
'remember the cursor position
set cursorRange = objDoc.Range
cursorRange.Start = objSel.Start
cursorRange.End = objSel.End
If objDoc.Bookmarks.Exists("_MailAutoSig") Then
'replace old signature
Debug.Print "old signature found"
set objBkm = objDoc.Bookmarks("_MailAutoSig")
objBkm.Select
objDoc.Windows(1).Selection.Delete
ElseIf objDoc.Bookmarks.Exists("_MailOriginal") Then
' is there the original email? (_MailOriginal)
set objBkm = objDoc.Bookmarks("_MailOriginal")
objSel.Start = objBkm.Start-2 'give room for the line break before. It includes the line
objSel.End = objBkm.Start-2
Else
'insert at the end of the email
objSel.EndOf wdStory, wdMove
End If
'start bookmark
set bkmStart = objDoc.Bookmarks.Add("_tempStart", objSel.Range)
'end bookmark
set bkmEnd = objDoc.Bookmarks.Add("_tempEnd", objSel.Range)
bkmEnd.End = bkmEnd.End + 1
bkmEnd.Start = bkmEnd.Start + 1
objSel.Text = " "
set objBkm = objDoc.Bookmarks.Add("_MailAutoSig", bkmStart.Range)
objBkm.Range.insertFile "c:\Users\<user>\AppData\Roaming\Microsoft\Signatures\test.htm", , false, false, false
objBkm.Range.InsertParagraphBefore
objBkm.End = bkmEnd.Start - 1 'since we added 1 above for bkmEnd
objSel.Start = cursorRange.Start
objSel.End = cursorRange.End
bkmStart.Delete
bkmEnd.Delete

VBA to read input from a file

I am trying to modify the following code, it will merge the Word Documents fine, but I have text file with every line being "*Name*.docx" "*Name2*.docx", etc, I would like the VBA macro to read the text file line by line and merge all the documents that match the pattern, should be 27 documents when done and save each one preferably with the a title that includes the "*Name" tag so I can know which is which. Any help would be greatly appreciated
Sub MergeDocs()
Dim rng As Range
Dim MainDoc As Document
Dim strFile As String
Const strFolder = "C:\test\"
Set MainDoc = Documents.Add
strFile = Dir$(strFolder & "*Name*.docx")
Do Until strFile = ""
Set rng = MainDoc.Range
rng.Collapse wdCollapseEnd
rng.InsertFile strFolder & strFile
strFile = Dir$()
Loop
MsgBox ("Files are merged")
End Sub
I think it's just a matter of adding an extra loop that reads the input file line by line and then uses your loop above.
This example uses the scripting filesystemobject to open the file and read it.
I assume what you've said above is what you actually mean - and the file spec is in the text file. Change the constants to fit your needs
Sub MergeDocs()
Const FOLDER_START As String = "C:\test\" ' Location of inout word files and text file
Const FOLDER_OUTPUT As String = "C:\test\output\" ' send resulting word files here
Const TEST_FILE As String = "doc-list.txt"
Dim rng As Range
Dim MainDoc As Document
Dim strFile As String
Dim strFileSpec As String
Dim strWordFile As String
Dim objFSO As Object ' FileSystemObject
Dim objTS As Object ' TextStream
Set objFSO = CreateObject("Scripting.FileSystemObject")
strFile = FOLDER_START & TEST_FILE
If Not objFSO.FileExists(strFile) Then
MsgBox "File Doesn't Exist: " & strFile
Exit Sub
End If
Set objTS = objFSO.OpenTextFile(strFile, 1, False) 'The one was ForReading but for me it threw an error
While Not objTS.AtEndOfStream
Set MainDoc = Documents.Add
' Read file spec from each line in file
strFileSpec = objTS.ReadLine ' get file seacrh spec from input file
'strFileSpec = "*NAME2*"
strFile = Dir$(FOLDER_START & strFileSpec & ".docx") ' changed strFolder to FOLDER_START
Do Until strFile = ""
Set rng = MainDoc.Range
rng.Collapse wdCollapseEnd
rng.InsertFile FOLDER_START & strFile ' changed strFolder again
strFile = Dir$() ' Get next file in search
Loop
strWordFile = Replace(strFileSpec, "*", "") ' Remove wildcards for saving filename
strWordFile = FOLDER_OUTPUT & strWordFile & ".docx"
MainDoc.SaveAs2 strWordFile
MainDoc.Close False
Set MainDoc = Nothing
Wend
objTS.Close
Set objTS = Nothing
Set objFSO = Nothing
MsgBox "Files are merged"
End Sub

import folder of .txt files into word document

I've found a lot on importing folder of .txt files into excel, but not many on importing .txt files into word. I'm trying to get my macro to open all .txt files in a specific folder and import them into a single word document, with each .txt file having its own page. This is the code I have so far (that I found online):
Sub AllFilesInFolder()
Dim myFolder As String, myFile As String
myFolder = Application.FileDialog(msoFileDialogFolderPicker)
With Application.FileDialog(msoFileDialogFolderPicker)
.AllowMultiSelect = False
.Show
If .SelectedItems.Count > 0 Then
myFolder = .SelectedItems(1)
End If
End With
myFile = Dir(myFolder & "\*.txt") '
Do While myFile <> ""
Open myFolder & "\" & myFile For Input As #1
'Copy & Paste Macro?
myFile = Dir
Loop
End Sub
here is something to get you started
Word 2010
Edit this should allow you to open all txt files in one document and save it
Option Explicit
Sub AllFilesInFolder()
Dim myFolder As String
Dim myFile As String
Dim wdDoc As Document
Dim txtFiles As Document
Application.ScreenUpdating = False
myFolder = openFolder
If myFolder = "" Then Exit Sub
myFile = Dir(myFolder & "\*.txt", vbNormal)
Set wdDoc = ActiveDocument
While myFile <> ""
Set txtFiles = Documents.Open(FileName:=myFolder & "\" & myFile, AddToRecentFiles:=False, Visible:=False, ConfirmConversions:=False)
wdDoc.Range.InsertAfter txtFiles.Range.Text & vbCr
txtFiles.Close SaveChanges:=True
myFile = Dir()
Wend
Set txtFiles = Nothing
Set wdDoc = Nothing
Application.ScreenUpdating = True
End Sub
Function openFolder() As String
Dim oFolder As Object
openFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then openFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function
Copy all the text files into a single file using the Command Prompt (cmd.exe) and the following command:
copy *.txt NewFile.txt
Then open this file with word and modify the way you want to see the text.

How do you convert a large number of files to docx?

I had a large number of doc files that I wanted to convert to docx files.
I discovered that there was not a really good way to automatically do this conversion.
I have submitted the method I used to do this but perhaps there are now other ways.
I found a few thing that might help:
Microsoft Bulk Converter
Simple Microsoft Word macro
However I was not satisfied with macro provided. I needed something recursive to also convert nested files. So I expanded it to do so.
Sub SaveAllAsDOCX()
'Search #EXT to change the extensions to save to docx
Dim strDocName As String
Dim strPath As String
Dim oDoc As Document
Dim fDialog As FileDialog
Dim intPos As Integer
'Create a folder dialog
Set fDialog = Application.FileDialog(msoFileDialogFolderPicker)
With fDialog
.Title = "Select root folder and click OK"
.AllowMultiSelect = False
.InitialView = msoFileDialogViewList
If .Show <> -1 Then
MsgBox "Cancelled By User", , "List Folder Contents"
Exit Sub
End If
'Select root folder
strPath = fDialog.SelectedItems.Item(1)
'Ensure the Folder Name ends with a "\"
If Right(strPath, 1) <> "\" Then strPath = strPath + "\"
End With
'Close any open documents
If Documents.Count > 0 Then
Documents.Close SaveChanges:=wdPromptToSaveChanges
End If
'remove any quotes from the folder string
If Left(strPath, 1) = Chr(34) Then
strPath = Mid(strPath, 2, Len(strPath) - 2)
End If
'begin recusion
recurse (strPath)
End Sub
'This method controls the recusion
Function recurse(folder As String)
'save all the files in the current folder
SaveFilesInFolder (folder)
'get all the subfolders of the current folder
Dim folderArray
folderArray = GetSubFolders(folder)
'Loop through all the non-empty elements for folders
For j = 1 To UBound(folderArray)
If folderArray(j) <> "" Then
'begin recusion on subfolder
recurse (folder & folderArray(j) & "\")
End If
Next
End Function
'Saves all files with listed extensions
Function SaveFilesInFolder(folder As String)
'List of extensions to look for #EXT
Dim strFilename As String
extsArray = Array("*.rtf", "*.doc")
'Loop through extensions
For i = 0 To (UBound(extsArray))
'select the 1st file with the current extension
strFilename = Dir(folder & extsArray(i), vbNormal)
'double check the current extension (don't to resave docx files)
Dim ext As String
ext = ""
On Error Resume Next
ext = Right(strFilename, 5)
If ext = ".docx" Or ext = "" Then
'Don't need to resave files in docx format
Else
'Save the current file in docx format
While Len(strFilename) <> 0
Set oDoc = Documents.Open(folder & strFilename)
strDocName = ActiveDocument.FullName
intPos = InStrRev(strDocName, ".")
strDocName = Left(strDocName, intPos - 1)
strDocName = strDocName & ".docx"
oDoc.SaveAs FileName:=strDocName, _
FileFormat:=wdFormatDocumentDefault
oDoc.Close SaveChanges:=wdDoNotSaveChanges
strFilename = Dir
Wend
End If
Next
strFilename = ""
End Function
'List all the subfolders in the current folder
Function GetSubFolders(RootPath As String)
Dim FS As New FileSystemObject
Dim FSfolder As folder
Dim subfolder As Variant
Set FSfolder = FS.GetFolder(RootPath)
'subfolders is variable length
Dim subfolders() As String
ReDim subfolders(1 To 10)
Dim i As Integer
i = LBound(subfolders)
For Each subfolder In FSfolder.subfolders
subfolders(i) = subfolder.Name
'increase the size of subfolders if it's needed
i = i + 1
If (i >= UBound(subfolders)) Then
ReDim subfolders(1 To (i + 10))
End If
Next subfolder
Set FSfolder = Nothing
GetSubFolders = subfolders
End Function
Yeah I know it's a lot of code. :)